Data-Science-For-Beginners/2-Working-With-Data/07-python/notebook.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Basic Pandas Examples\n",
"\n",
"This notebook walks you through some basic Pandas concepts. We start by importing the typical data science libraries:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Series\n",
"\n",
"A Series is like a list or 1D array, but with an index. All operations are index-aligned: values are matched by index label rather than by position."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 5\n",
"5 6\n",
"6 7\n",
"7 8\n",
"8 9\n",
"dtype: int64 0 I\n",
"1 like\n",
"2 to\n",
"3 use\n",
"4 Python\n",
"5 and\n",
"6 Pandas\n",
"7 very\n",
"8 much\n",
"dtype: object\n"
]
}
],
"source": [
"a = pd.Series(range(1,10))\n",
"b = pd.Series([\"I\",\"like\",\"to\",\"use\",\"Python\",\"and\",\"Pandas\",\"very\",\"much\"],index=range(0,9))\n",
"print(a,b)"
]
},
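{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of what index alignment means (the series `x` and `y` and their labels are made-up illustration data, not part of the original notebook), adding two series whose indexes only partly overlap might look like this:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data: the two indexes share only the labels 'b' and 'c'\n",
"x = pd.Series([1, 2, 3], index=['a', 'b', 'c'])\n",
"y = pd.Series([10, 20, 30], index=['b', 'c', 'd'])\n",
"\n",
"# Values are paired by index label, not by position;\n",
"# labels present in only one series produce NaN\n",
"z = x + y\n",
"print(z)\n",
"```\n",
"\n",
"Here only `b` and `c` get numeric sums, while `a` and `d` come out as `NaN`."
]
},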
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One frequent use of series is **time series**. In a time series, the index has a special structure, typically a range of dates or datetimes. We can create such an index with `pd.date_range`.\n",
"\n",
"Suppose we have a series that shows the amount of product bought every day, and we know that every Sunday we also need to take 10 items for ourselves. Here is how to model that using series:"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Length of index is 366\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1000x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"start_date = \"Jan 1, 2020\"\n",
"end_date = \"Dec 31, 2020\"\n",
"idx = pd.date_range(start_date,end_date)\n",
"print(f\"Length of index is {len(idx)}\")\n",
"items_sold = pd.Series(np.random.randint(25,50,size=len(idx)),index=idx)\n",
"items_sold.plot(figsize=(10,3))\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Additional items (10 item each week):\n",
"2020-01-05 10\n",
"2020-01-12 10\n",
"2020-01-19 10\n",
"2020-01-26 10\n",
"2020-02-02 10\n",
"2020-02-09 10\n",
"2020-02-16 10\n",
"2020-02-23 10\n",
"2020-03-01 10\n",
"2020-03-08 10\n",
"2020-03-15 10\n",
"2020-03-22 10\n",
"2020-03-29 10\n",
"2020-04-05 10\n",
"2020-04-12 10\n",
"2020-04-19 10\n",
"2020-04-26 10\n",
"2020-05-03 10\n",
"2020-05-10 10\n",
"2020-05-17 10\n",
"2020-05-24 10\n",
"2020-05-31 10\n",
"2020-06-07 10\n",
"2020-06-14 10\n",
"2020-06-21 10\n",
"2020-06-28 10\n",
"2020-07-05 10\n",
"2020-07-12 10\n",
"2020-07-19 10\n",
"2020-07-26 10\n",
"2020-08-02 10\n",
"2020-08-09 10\n",
"2020-08-16 10\n",
"2020-08-23 10\n",
"2020-08-30 10\n",
"2020-09-06 10\n",
"2020-09-13 10\n",
"2020-09-20 10\n",
"2020-09-27 10\n",
"2020-10-04 10\n",
"2020-10-11 10\n",
"2020-10-18 10\n",
"2020-10-25 10\n",
"2020-11-01 10\n",
"2020-11-08 10\n",
"2020-11-15 10\n",
"2020-11-22 10\n",
"2020-11-29 10\n",
"2020-12-06 10\n",
"2020-12-13 10\n",
"2020-12-20 10\n",
"2020-12-27 10\n",
"Freq: W-SUN, dtype: int64\n",
"Total items (sum of two series):\n",
"2020-01-01 NaN\n",
"2020-01-02 NaN\n",
"2020-01-03 NaN\n",
"2020-01-04 NaN\n",
"2020-01-05 54.0\n",
" ... \n",
"2020-12-27 43.0\n",
"2020-12-28 NaN\n",
"2020-12-29 NaN\n",
"2020-12-30 NaN\n",
"2020-12-31 NaN\n",
"Length: 366, dtype: float64\n"
]
}
],
"source": [
"additional_items = pd.Series(10,index=pd.date_range(start_date,end_date,freq=\"W\"))\n",
"print(f\"Additional items (10 item each week):\\n{additional_items}\")\n",
"total_items = items_sold+additional_items\n",
"print(f\"Total items (sum of two series):\\n{total_items}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, there is a problem here: in the weekly series, days that are not listed are treated as missing (`NaN`), and adding `NaN` to a number gives `NaN`. To get the correct result, we need to specify `fill_value` when adding the series:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2020-01-01 26.0\n",
"2020-01-02 25.0\n",
"2020-01-03 37.0\n",
"2020-01-04 30.0\n",
"2020-01-05 54.0\n",
" ... \n",
"2020-12-27 43.0\n",
"2020-12-28 44.0\n",
"2020-12-29 36.0\n",
"2020-12-30 38.0\n",
"2020-12-31 34.0\n",
"Length: 366, dtype: float64\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 1000x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"total_items = items_sold.add(additional_items,fill_value=0)\n",
"print(total_items)\n",
"total_items.plot(figsize=(10,3))\n",
"plt.show()"
]
},
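{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the sales data in this notebook is random, it may help to see the same effect on a small deterministic case. The following is only a sketch with made-up numbers (the names `daily`, `weekly`, `plain` and `filled` are hypothetical), comparing the `+` operator with `add(..., fill_value=0)` over a single week:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical one-week example: seven daily values, Jan 1 to Jan 7, 2020\n",
"days = pd.date_range('2020-01-01', '2020-01-07')\n",
"daily = pd.Series([5, 6, 7, 8, 9, 10, 11], index=days)\n",
"\n",
"# Weekly series: with freq='W' only the Sunday (Jan 5) falls in this range\n",
"weekly = pd.Series(10, index=pd.date_range('2020-01-01', '2020-01-07', freq='W'))\n",
"\n",
"# Plain addition: NaN on every day that is missing from the weekly series\n",
"plain = daily + weekly\n",
"\n",
"# add() with fill_value=0: missing entries are treated as 0 before adding\n",
"filled = daily.add(weekly, fill_value=0)\n",
"\n",
"print(plain)\n",
"print(filled)\n",
"```\n",
"\n",
"With `fill_value=0`, every day keeps its daily value and only the Sunday is increased by 10."
]
},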
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"monthly = total_items.resample(\"1M\").mean()\n",
"ax = monthly.plot(kind='bar',figsize=(10,3))\n",
"ax.set_xticklabels([x.strftime(\"%b-%Y\") for x in monthly.index], rotation=45)\n",
"plt.show()"
]
},
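{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above groups the daily values into monthly bins with `resample` and averages each bin. As a rough sketch of the same idea on made-up data (the names `values` and `binned` are hypothetical, and a fixed 7-day bin is used here instead of a month, partly because the spelling of the monthly alias differs between pandas versions):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data: 14 consecutive days with the values 0..13\n",
"idx = pd.date_range('2020-01-01', periods=14)\n",
"values = pd.Series(range(14), index=idx)\n",
"\n",
"# Group the daily values into 7-day bins and take the mean of each bin\n",
"binned = values.resample('7D').mean()\n",
"print(binned)\n",
"```\n",
"\n",
"Each resulting row is labelled with the start of its bin and holds the mean of the seven daily values that fall into it."
]
},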
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## DataFrame\n",
"\n",
"A DataFrame is essentially a collection of series with the same index. We can combine several series into a DataFrame. Given the `a` and `b` series defined above:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" <th>3</th>\n",
" <th>4</th>\n",
" <th>5</th>\n",
" <th>6</th>\n",
" <th>7</th>\n",
" <th>8</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>7</td>\n",
" <td>8</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>I</td>\n",
" <td>like</td>\n",
" <td>to</td>\n",
" <td>use</td>\n",
" <td>Python</td>\n",
" <td>and</td>\n",
" <td>Pandas</td>\n",
" <td>very</td>\n",
" <td>much</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1 2 3 4 5 6 7 8\n",
"0 1 2 3 4 5 6 7 8 9\n",
"1 I like to use Python and Pandas very much"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame([a,b])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also use Series as columns and specify the column names using a dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>and</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Pandas</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>very</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>much</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B\n",
"0 1 I\n",
"1 2 like\n",
"2 3 to\n",
"3 4 use\n",
"4 5 Python\n",
"5 6 and\n",
"6 7 Pandas\n",
"7 8 very\n",
"8 9 much"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame({ 'A' : a, 'B' : b })\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same result can be achieved by transposing the row-wise DataFrame and then renaming its columns to match the previous example:"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>and</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Pandas</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>very</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>much</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B\n",
"0 1 I\n",
"1 2 like\n",
"2 3 to\n",
"3 4 use\n",
"4 5 Python\n",
"5 6 and\n",
"6 7 Pandas\n",
"7 8 very\n",
"8 9 much"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame([a,b]).T.rename(columns={ 0 : 'A', 1 : 'B' })"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Selecting columns** from a DataFrame: a single column name returns a Series, while a list of column names returns a DataFrame:"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Column A (series):\n",
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 5\n",
"5 6\n",
"6 7\n",
"7 8\n",
"8 9\n",
"Name: A, dtype: int64\n",
"Columns B and A (DataFrame):\n",
" B A\n",
"0 I 1\n",
"1 like 2\n",
"2 to 3\n",
"3 use 4\n",
"4 Python 5\n",
"5 and 6\n",
"6 Pandas 7\n",
"7 very 8\n",
"8 much 9\n"
]
}
],
"source": [
"print(f\"Column A (series):\\n{df['A']}\")\n",
"print(f\"Columns B and A (DataFrame):\\n{df[['B','A']]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Selecting rows** based on a filter expression:"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B\n",
"0 1 I\n",
"1 2 like\n",
"2 3 to\n",
"3 4 use"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df['A']<5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This works because the expression `df['A']<5` returns a boolean Series that indicates whether the condition is `True` or `False` for each element of the original Series. When a boolean Series is used as an index, it returns the subset of rows for which it is `True`. For this reason you cannot combine conditions with ordinary Python boolean operators: writing `df[df['A']>5 and df['A']<7]` raises an error, because `and` tries to reduce each Series to a single truth value. Instead, use the element-wise `&` operator on boolean Series, wrapping each condition in parentheses:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>and</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B\n",
"5 6 and"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[(df['A']>5) & (df['A']<7)]"
]
},
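{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch (not part of the original walkthrough), the same element-wise idea extends to `|` (or) and `~` (not). `DataFrame.query` is an alternative that accepts `and`/`or` inside its expression string. The thresholds and variable names below are arbitrary illustrations, and `df` is assumed to hold the columns `A` and `B` built above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Assumes df from the cells above, with columns A and B; thresholds are arbitrary\n",
"either_end = df[(df['A'] < 3) | (df['A'] > 7)]  # element-wise OR\n",
"not_small = df[~(df['A'] < 5)]                  # element-wise NOT\n",
"via_query = df.query('A > 5 and A < 7')         # query() parses and/or itself\n",
"print(either_end)\n",
"print(not_small)\n",
"print(via_query)"
]
},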
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Creating new computed columns**. We can add computed columns to a DataFrame using ordinary arithmetic expressions. The code below calculates the deviation of `A` from its mean value."
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>DivA</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" <td>-4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" <td>-3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" <td>-2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" <td>-1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>and</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Pandas</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>very</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>much</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B DivA\n",
"0 1 I -4.0\n",
"1 2 like -3.0\n",
"2 3 to -2.0\n",
"3 4 use -1.0\n",
"4 5 Python 0.0\n",
"5 6 and 1.0\n",
"6 7 Pandas 2.0\n",
"7 8 very 3.0\n",
"8 9 much 4.0"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['DivA'] = df['A']-df['A'].mean()\n",
"df"
]
},
{
"cell_type": "markdown",
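"metadata": {},
"source": [
"What actually happens is that we compute a Series and then assign it to the left-hand side, creating a new column. The right-hand side therefore has to be an expression that works on a whole Series; ordinary Python constructs that expect a single value do not behave as intended, as the next cell shows:"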
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
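"# WRONG (raises ValueError: a Series has no single truth value):\n",
"# df['ADescr'] = \"Low\" if df['A'] < 5 else \"Hi\"\n",
"# Runs, but wrong result: every row gets 9, the length of the whole column\n",
"df['LenB'] = len(df['B'])"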
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>DivA</th>\n",
" <th>LenB</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" <td>-4.0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" <td>-3.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" <td>-2.0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" <td>-1.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" <td>0.0</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>and</td>\n",
" <td>1.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>Pandas</td>\n",
" <td>2.0</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>very</td>\n",
" <td>3.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>much</td>\n",
" <td>4.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B DivA LenB\n",
"0 1 I -4.0 1\n",
"1 2 like -3.0 4\n",
"2 3 to -2.0 2\n",
"3 4 use -1.0 3\n",
"4 5 Python 0.0 6\n",
"5 6 and 1.0 3\n",
"6 7 Pandas 2.0 6\n",
"7 8 very 3.0 4\n",
"8 9 much 4.0 4"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
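"source": [
"# apply() calls the function on each element, giving per-row string lengths\n",
"df['LenB'] = df['B'].apply(lambda x: len(x))\n",
"# or, equivalently, pass the function itself\n",
"df['LenB'] = df['B'].apply(len)\n",
"df"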
]
},
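{
"cell_type": "markdown",
"metadata": {},
"source": [
"A hedged sketch of two vectorized alternatives, assuming `df` still holds the columns `A`, `B` and `LenB` from the cells above: the `.str.len()` accessor computes per-element string lengths without `apply`, and `np.where` builds the Low/Hi labels that the plain conditional expression could not. The names `len_b` and `a_descr` are illustrative only and are deliberately not added to `df`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Per-element string length via the .str accessor\n",
"len_b = df['B'].str.len()\n",
"print((len_b == df['LenB']).all())  # compare with the apply-based column\n",
"\n",
"# Element-wise conditional: 'Low' where A < 5, otherwise 'Hi'\n",
"a_descr = pd.Series(np.where(df['A'] < 5, 'Low', 'Hi'), index=df.index)\n",
"print(a_descr)"
]
},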
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Selecting rows by position** can be done using the `iloc` indexer. For example, to select the first 5 rows of the DataFrame:"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>DivA</th>\n",
" <th>LenB</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" <td>-4.0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" <td>-3.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" <td>-2.0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" <td>-1.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" <td>0.0</td>\n",
" <td>6</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B DivA LenB\n",
"0 1 I -4.0 1\n",
"1 2 like -3.0 4\n",
"2 3 to -2.0 2\n",
"3 4 use -1.0 3\n",
"4 5 Python 0.0 6"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.iloc[:5]"
]
},
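{
"cell_type": "markdown",
"metadata": {},
"source": [
"For contrast, a small sketch assuming `df` keeps its default integer index from the cells above: `iloc` slices by position and excludes the end, whereas `loc` slices by index label and includes the end label, so the two can return different numbers of rows. The variable names are illustrative:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"by_position = df.iloc[:5]   # positions 0, 1, 2, 3, 4\n",
"by_label = df.loc[:5]       # labels 0 through 5, end label included\n",
"print(len(by_position), len(by_label))\n",
"\n",
"# iloc also accepts a list of row positions and a column slice\n",
"df.iloc[[0, -1], 0:2]"
]
},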
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Grouping** is often used to get a result similar to *pivot tables* in Excel. Suppose we want to compute the mean value of column `A` for each distinct value of `LenB`. We can group the DataFrame by `LenB` and call `mean`:"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>DivA</th>\n",
" </tr>\n",
" <tr>\n",
" <th>LenB</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1.000000</td>\n",
" <td>-4.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3.000000</td>\n",
" <td>-2.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>5.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>6.333333</td>\n",
" <td>1.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>6.000000</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A DivA\n",
"LenB \n",
"1 1.000000 -4.000000\n",
"2 3.000000 -2.000000\n",
"3 5.000000 0.000000\n",
"4 6.333333 1.333333\n",
"6 6.000000 1.000000"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
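"source": [
"# Select the numeric columns explicitly: recent pandas versions no longer silently drop the string column B\n",
"df.groupby(by='LenB')[['A','DivA']].mean()"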
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we need both the mean and the number of elements in each group, we can use the more general `aggregate` function:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Count</th>\n",
" <th>Mean</th>\n",
" </tr>\n",
" <tr>\n",
" <th>LenB</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>1.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>3.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2</td>\n",
" <td>5.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3</td>\n",
" <td>6.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>2</td>\n",
" <td>6.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Count Mean\n",
"LenB \n",
"1 1 1.000000\n",
"2 1 3.000000\n",
"3 2 5.000000\n",
"4 3 6.333333\n",
"6 2 6.000000"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.groupby(by='LenB') \\\n",
" .aggregate({ 'DivA' : len, 'A' : lambda x: x.mean() }) \\\n",
" .rename(columns={ 'DivA' : 'Count', 'A' : 'Mean'})"
]
},
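{
"cell_type": "markdown",
"metadata": {},
"source": [
"An alternative sketch, assuming a pandas version that supports named aggregation (0.25 or later): `agg` can name the output columns directly, which avoids the separate `rename` step. `Count` and `Mean` mirror the column names used above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Each keyword is an output column: (input column, aggregation)\n",
"df.groupby('LenB').agg(Count=('A', 'size'), Mean=('A', 'mean'))"
]
},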
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Printing and Plotting\n",
"\n",
"Data scientists often have to explore data, so it is important to be able to visualize it. When a DataFrame is large, we often just want to check that we are doing everything correctly by printing the first few rows. This can be done by calling `df.head()`. If you run it in a Jupyter Notebook, the DataFrame is displayed in a nice tabular form."
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>DivA</th>\n",
" <th>LenB</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>I</td>\n",
" <td>-4.0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>like</td>\n",
" <td>-3.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>to</td>\n",
" <td>-2.0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>use</td>\n",
" <td>-1.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>Python</td>\n",
" <td>0.0</td>\n",
" <td>6</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" A B DivA LenB\n",
"0 1 I -4.0 1\n",
"1 2 like -3.0 4\n",
"2 3 to -2.0 2\n",
"3 4 use -1.0 3\n",
"4 5 Python 0.0 6"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"We have also seen the `plot` function used to visualize some columns. While `plot` is useful for many tasks and supports many different graph types via the `kind=` parameter, you can always use the raw `matplotlib` library to plot something more complex. We will cover data visualization in detail in separate course lessons.\n"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df['A'].plot()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df['A'].plot(kind='bar')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"This overview covers the most important concepts of Pandas. However, the library is very rich, and there is no limit to what you can do with it! Let's now apply this knowledge to solving a specific problem."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"interpreter": {
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"kernelspec": {
"display_name": "Python 3.8.8 64-bit (conda)",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}