Data-Science-For-Beginners/1-Introduction/04-stats-and-probability/notebook.ipynb


{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Introduction to Probability and Statistics\r\n",
"\r\n",
"In this notebook, we will play around with some of the concepts we have previously discussed. Many concepts from probability and statistics are well-represented in major libraries for data processing in Python, such as `numpy` and `pandas`."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 6,
"source": [
"import numpy as np\r\n",
"import pandas as pd\r\n",
"import random\r\n",
"import matplotlib.pyplot as plt"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"\r\n",
"## Random Variables and Distributions\r\n",
"\r\n",
"Let's start by drawing a sample of 30 values from a discrete uniform distribution over the integers 0 to 10 (note that `random.randint(0,10)` includes both endpoints). We will also compute the mean and variance of the sample."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 17,
"source": [
"sample = [ random.randint(0,10) for _ in range(30) ]\r\n",
"print(f\"Sample: {sample}\")\r\n",
"print(f\"Mean = {np.mean(sample)}\")\r\n",
"print(f\"Variance = {np.var(sample)}\")"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Sample: [4, 6, 3, 0, 3, 4, 7, 7, 9, 6, 8, 2, 0, 3, 10, 7, 2, 0, 2, 1, 1, 6, 5, 0, 9, 0, 1, 8, 2, 9]\n",
"Mean = 4.166666666666667\n",
"Variance = 10.272222222222222\n"
]
}
],
"metadata": {}
},
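{
"cell_type": "markdown",
"source": [
"As a rough cross-check (a sketch, not part of the original analysis): for integers drawn uniformly from `a` to `b`, the theoretical mean is $(a+b)/2$ and the variance is $(n^2-1)/12$, where $n=b-a+1$. Also note that `np.var` divides by $N$ by default (`ddof=0`), while passing `ddof=1` gives the unbiased sample variance. The seed, the name `big_sample` and the larger sample size below are assumptions chosen only for illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"import random\r\n",
"import numpy as np\r\n",
"\r\n",
"random.seed(0)  # assumed seed, only so this sketch is repeatable\r\n",
"big_sample = [random.randint(0, 10) for _ in range(10000)]  # hypothetical larger sample\r\n",
"\r\n",
"# Theoretical values for a discrete uniform distribution on the integers a..b\r\n",
"a, b = 0, 10\r\n",
"n = b - a + 1\r\n",
"print(f\"Theoretical mean = {(a + b) / 2}, variance = {(n**2 - 1) / 12}\")\r\n",
"\r\n",
"# Empirical values; the two variance estimates differ only in the divisor (N vs N-1)\r\n",
"print(f\"Empirical mean = {np.mean(big_sample):.3f}\")\r\n",
"print(f\"Variance with ddof=0 = {np.var(big_sample):.3f}\")\r\n",
"print(f\"Variance with ddof=1 = {np.var(big_sample, ddof=1):.3f}\")"
],
"outputs": [],
"metadata": {}
},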
{
"cell_type": "markdown",
"source": [
"To see how often each value occurs in the sample, we can plot a **histogram**:"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 18,
"source": [
"plt.hist(sample)\r\n",
"plt.show()"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {}
}
],
"metadata": {}
},
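{
"cell_type": "markdown",
"source": [
"With the default of 10 equal-width bins over the range 0 to 10, `plt.hist` puts the values 9 and 10 into the same final bin. One possible way to give each integer its own bar is to count occurrences directly. The sketch below copies the sample printed above so that it runs on its own, and it assumes the values are integers from 0 to 10; the names `values` and `counts` are only illustrative."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"# Sample copied from the output printed earlier in this notebook\r\n",
"values = [4, 6, 3, 0, 3, 4, 7, 7, 9, 6, 8, 2, 0, 3, 10, 7, 2, 0, 2, 1, 1, 6, 5, 0, 9, 0, 1, 8, 2, 9]\r\n",
"\r\n",
"# counts[k] is the number of times the integer k occurs (k = 0..10)\r\n",
"counts = np.bincount(values, minlength=11)\r\n",
"\r\n",
"plt.bar(range(11), counts)\r\n",
"plt.xticks(range(11))\r\n",
"plt.xlabel('Value')\r\n",
"plt.ylabel('Count')\r\n",
"plt.show()"
],
"outputs": [],
"metadata": {}
},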
{
"cell_type": "markdown",
"source": [
"## Analyzing Real Data\r\n",
"\r\n",
"Mean and variance are very important when analyzing real-world data. Let's load the data about baseball players from [SOCR MLB Height/Weight Data](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_MLB_HeightsWeights):"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 26,
"source": [
"df = pd.read_csv(\"../../data/SOCR_MLB.tsv\",sep='\\t',header=None,names=['Name','Team','Role','Height','Weight','Age'])\r\n",
"df"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Name Team Role Height Weight Age\n",
"0 Adam_Donachie BAL Catcher 74 180.0 22.99\n",
"1 Paul_Bako BAL Catcher 74 215.0 34.69\n",
"2 Ramon_Hernandez BAL Catcher 72 210.0 30.78\n",
"3 Kevin_Millar BAL First_Baseman 72 210.0 35.43\n",
"4 Chris_Gomez BAL First_Baseman 73 188.0 35.71\n",
"... ... ... ... ... ... ...\n",
"1029 Brad_Thompson STL Relief_Pitcher 73 190.0 25.08\n",
"1030 Tyler_Johnson STL Relief_Pitcher 74 180.0 25.73\n",
"1031 Chris_Narveson STL Relief_Pitcher 75 205.0 25.19\n",
"1032 Randy_Keisler STL Relief_Pitcher 75 190.0 31.01\n",
"1033 Josh_Kinney STL Relief_Pitcher 73 195.0 27.92\n",
"\n",
"[1034 rows x 6 columns]"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>Team</th>\n",
" <th>Role</th>\n",
" <th>Height</th>\n",
" <th>Weight</th>\n",
" <th>Age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Adam_Donachie</td>\n",
" <td>BAL</td>\n",
" <td>Catcher</td>\n",
" <td>74</td>\n",
" <td>180.0</td>\n",
" <td>22.99</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Paul_Bako</td>\n",
" <td>BAL</td>\n",
" <td>Catcher</td>\n",
" <td>74</td>\n",
" <td>215.0</td>\n",
" <td>34.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Ramon_Hernandez</td>\n",
" <td>BAL</td>\n",
" <td>Catcher</td>\n",
" <td>72</td>\n",
" <td>210.0</td>\n",
" <td>30.78</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Kevin_Millar</td>\n",
" <td>BAL</td>\n",
" <td>First_Baseman</td>\n",
" <td>72</td>\n",
" <td>210.0</td>\n",
" <td>35.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Chris_Gomez</td>\n",
" <td>BAL</td>\n",
" <td>First_Baseman</td>\n",
" <td>73</td>\n",
" <td>188.0</td>\n",
" <td>35.71</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1029</th>\n",
" <td>Brad_Thompson</td>\n",
" <td>STL</td>\n",
" <td>Relief_Pitcher</td>\n",
" <td>73</td>\n",
" <td>190.0</td>\n",
" <td>25.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1030</th>\n",
" <td>Tyler_Johnson</td>\n",
" <td>STL</td>\n",
" <td>Relief_Pitcher</td>\n",
" <td>74</td>\n",
" <td>180.0</td>\n",
" <td>25.73</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1031</th>\n",
" <td>Chris_Narveson</td>\n",
" <td>STL</td>\n",
" <td>Relief_Pitcher</td>\n",
" <td>75</td>\n",
" <td>205.0</td>\n",
" <td>25.19</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1032</th>\n",
" <td>Randy_Keisler</td>\n",
" <td>STL</td>\n",
" <td>Relief_Pitcher</td>\n",
" <td>75</td>\n",
" <td>190.0</td>\n",
" <td>31.01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1033</th>\n",
" <td>Josh_Kinney</td>\n",
" <td>STL</td>\n",
" <td>Relief_Pitcher</td>\n",
" <td>73</td>\n",
" <td>195.0</td>\n",
" <td>27.92</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1034 rows × 6 columns</p>\n",
"</div>"
]
},
"metadata": {},
"execution_count": 26
}
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"> We are using a package called **Pandas** here for data analysis. We will talk more about Pandas and working with data in Python later in this course.\r\n",
"\r\n",
"Let's compute average values for age, height and weight:"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 30,
"source": [
"df[['Age','Height','Weight']].mean()"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Age 28.736712\n",
"Height 73.697292\n",
"Weight 201.689255\n",
"dtype: float64"
]
},
"metadata": {},
"execution_count": 30
}
],
"metadata": {}
},
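{
"cell_type": "markdown",
"source": [
"The mean alone says nothing about how spread out the values are. As a small sketch, the same pattern extends to the standard deviation and variance. The frame below is built from the first five rows of the table shown above so that the cell runs on its own (the name `first_rows` is hypothetical); on the full data the same calls would be applied to `df` instead. Note that pandas uses `ddof=1` by default, unlike `np.var`."
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"import pandas as pd\r\n",
"\r\n",
"# First five rows copied from the table displayed above (illustration only)\r\n",
"first_rows = pd.DataFrame({\r\n",
"    'Age': [22.99, 34.69, 30.78, 35.43, 35.71],\r\n",
"    'Height': [74, 74, 72, 72, 73],\r\n",
"    'Weight': [180.0, 215.0, 210.0, 210.0, 188.0],\r\n",
"})\r\n",
"\r\n",
"print(first_rows.mean())        # column means\r\n",
"print(first_rows.std())         # sample standard deviation (ddof=1 by default)\r\n",
"print(first_rows.var(ddof=0))   # population variance, the convention np.var uses"
],
"outputs": [],
"metadata": {}
},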
{
"cell_type": "markdown",
"source": [
"Age, height and weight are all continuous random variables. What do you think their distributions look like? A good way to find out is to plot a histogram of the values:"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 40,
"source": [
"df['Weight'].hist(bins=15)\r\n",
"plt.suptitle('Weight distribution of MLB Players')\r\n",
"plt.xlabel('Weight')\r\n",
"plt.ylabel('Count')\r\n",
"plt.show()"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {}
}
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 44,
"source": [
"print(list(df['Weight'])[:20])"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[180.0, 215.0, 210.0, 210.0, 188.0, 176.0, 209.0, 200.0, 231.0, 180.0, 188.0, 180.0, 185.0, 160.0, 180.0, 185.0, 197.0, 189.0, 185.0, 219.0]\n"
]
}
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
}
],
"metadata": {
"orig_nbformat": 4,
"language_info": {
"name": "python",
"version": "3.8.8",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3.8.8 64-bit (conda)"
},
"interpreter": {
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}