diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md index 10aed096..c7204c9c 100644 --- a/2-Regression/3-Linear/README.md +++ b/2-Regression/3-Linear/README.md @@ -1,48 +1,150 @@ -# Introduction to Machine Learning +# Build a Regression Model using Scikit-Learn: Regression Two Ways +## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/9/) +### Introduction -Add a sketchnote if possible/appropriate +So far you have explored what regression is with sample data gathered from the pumpkin pricing dataset that we will use throughout this unit. You have also visualized it using Matplotlib. Now you are ready to dive deeper into regression for ML. In this lesson, you will learn more about two types of regression: simple regression and polynomial regression, along with some of the math underlying these techniques. -![Embed a video here if available](video-url) +> Throughout this curriculum, we assume minimal knowledge of math, and seek to make it very accessible for students coming from other fields, so watch for notes, callouts, diagrams, and other learning tools to aid in comprehension. +### Prerequisite -## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/9/) +You should be familiar by now with the structure of the pumpkin data that we are examining. You can find it preloaded and pre-cleaned in this lesson's notebook.ipynb files, with the pumpkin price displayed per bushel in a new dataframe. Make sure you can run these notebooks in kernels in VS Code. +### Preparation -Describe what we will learn +As a reminder, you are loading this data so as to ask questions of it. When is the best time to buy pumpkins? What price can I expect of a miniature pumpkin? Should I buy them in half-bushel baskets or by the 1 1/9 bushel box? Let's keep digging into this data. 
+ +In the previous lesson, you created a Pandas dataframe and populated it with part of the original dataset, standardizing the pricing by the bushel. By doing that, however, you were only able to gather about 400 datapoints and only for the fall months. Take a look at the data that we preloaded in this lesson's accompanying notebook. Maybe we can get a little more detail about the nature of the data by cleaning it more. +## A Linear Regression Line + +As you learned in Lesson 1, the goal of a linear regression exercise is to be able to plot a line to show the relationship between variables and make accurate predictions on where a new datapoint would fall in relationship to that line. + +> **🧮 Show me the math** +> +> This line has an equation: `Y = a + bX`. It is typical of **Least-Squares Regression** to draw this type of line. +> +> `X` is the 'explanatory variable'. `Y` is the 'dependent variable'. The slope of the line is `b` and `a` is the intercept, which refers to the value of `Y` when `X = 0`. +> +> In other words, and referring to our pumpkin data's original question: "predict the price of a pumpkin per bushel by month", `X` would refer to the month of sale and `Y` would refer to the price. The math that calculates the line must determine both the slope `b` and the intercept `a`, or where `Y` is situated when `X = 0`. +> +> You can observe the method of calculation for these values on the [Math is Fun](https://www.mathsisfun.com/data/least-squares-regression.html) web site. +> +> A common method of regression is **Least-Squares Regression**, which means that the vertical distances between the datapoints and the regression line are squared and then added up. Ideally, that final sum is as small as possible, because we want the total error to be as low as possible, or `least-squares`. +> +> One more term to understand is the **Correlation Coefficient** between given X and Y variables. 
For a scatterplot, you can quickly visualize this coefficient: a plot with datapoints scattered in a neat line has high correlation, but a plot with datapoints scattered everywhere between X and Y has low correlation. +> +> A good regression model will be one that has a high Correlation Coefficient (close to 1 or -1, rather than 0) using the Least-Squares Regression method with a line of regression. + +✅ Run the notebook accompanying this lesson. Does the data associating City to Price for pumpkin sales seem to have high or low correlation, according to your visual interpretation of the scatterplot? +## Create a Regression Model correlating Pumpkin Datapoints + +Now that you have an understanding of the math behind this exercise, create a Regression model to see if you can predict which type of pumpkins will have the best pumpkin prices. Someone buying pumpkins for a holiday pumpkin patch might want this information to be able to pre-order the best-priced pumpkins for the patch (normally there is a mix of miniature and large pumpkins in a patch). + +Since you'll use Scikit-Learn, there's no reason to do this by hand (although you could!). In the main data-processing block of your lesson notebook, add a library from Scikit-Learn to automatically convert all string data to numbers: + +```python +from sklearn.preprocessing import LabelEncoder +... +new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit_transform) +``` -### Introduction +If you look at the new_pumpkins dataframe now, you see that all the strings are now numeric. This makes it harder to read but much more intelligible to Scikit-Learn! -Describe what will be covered +Now, you can make more educated decisions (not just based on eyeballing a scatterplot) about the data that is best suited to regression. -> Notes +Try to find a good correlation between two columns of your data. 
As it turns out, there's only weak correlation between the City and Price: -### Prerequisite +```python +print(new_pumpkins['City'].corr(new_pumpkins['Price'])) +# 0.3236397181608923 +``` +However, there's a stronger correlation between the Variety and its Price (makes sense, right? Think about miniature pumpkin prices vs. the big pumpkins you might buy for Halloween. The little ones are more expensive, volume-wise, than the big ones). -What steps should have been covered before this lesson? +```python +print(new_pumpkins['Variety'].corr(new_pumpkins['Price'])) +# -0.8634790400214403 +``` +This is a negative correlation, meaning the slope heads downhill, but it's still useful. So, a question to ask of this data will be: 'What price can I expect of a given type of pumpkin?' -### Preparation +Let's build this regression model. +## Building the model -Preparatory steps to start this lesson +Before building your model, create a fresh dataframe with only the data you intend to query. Drop any null data and see what the data looks like. ---- +```python +new_pumpkins.dropna(inplace=True) +new_pumpkins.info() +``` -[Step through content in blocks] +Then, create a new dataframe from this minimal set: -## [Topic 1] +```python +new_columns = ['Variety', 'Price'] +ml_pumpkins = new_pumpkins.drop([c for c in new_pumpkins.columns if c not in new_columns], axis='columns') -### Task: +ml_pumpkins -Work together to progressively enhance your codebase to build the project with shared code: +``` + +Now you can assign your X and y coordinate data: -```html -code blocks +```python +X = ml_pumpkins.values[:, :1] +y = ml_pumpkins.values[:, 1:2] ``` +> What's going on here? You're using [Python slice notation](https://stackoverflow.com/questions/509211/understanding-slice-notation/509295#509295) to create arrays to populate `X` and `y`: `[:, :1]` selects every row of the first column (Variety) and `[:, 1:2]` selects every row of the second column (Price), each kept as a 2-D array. 
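
Before handing `X` and `y` to Scikit-Learn, it may help to see what a least-squares fit and a correlation coefficient actually compute. The following is a minimal sketch in plain Python, not the lesson's notebook code: the toy data and the function names `least_squares_line` and `correlation` are illustrative assumptions, chosen so that the line slopes downhill like the Variety/Price relationship.

```python
# Hypothetical toy data (not from the pumpkin dataset): y falls as x rises.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 8.0, 6.0, 4.0, 2.0]


def least_squares_line(xs, ys):
    """Return (a, b) for the line Y = a + bX that minimizes the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: value of Y when X = 0
    return a, b


def correlation(xs, ys):
    """Return the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


a, b = least_squares_line(xs, ys)
r = correlation(xs, ys)
print(f"intercept a={a}, slope b={b}, correlation r={r}")
```

On this perfectly linear toy data the slope is negative and the correlation is -1; real data such as the pumpkin prices will land somewhere between -1 and 1.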
+ +Next, start the regression model-building routines: -✅ Knowledge Check - use this moment to stretch students' knowledge with open questions +```python +from sklearn.linear_model import LinearRegression +from sklearn.metrics import r2_score, mean_squared_error +from sklearn.model_selection import train_test_split -## [Topic 2] -## [Topic 3] +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) +lin_reg = LinearRegression() +lin_reg.fit(X_train,y_train) + +pred = lin_reg.predict(X_test) + +# score() returns the coefficient of determination (R^2) on the data passed in, here the training set +accuracy_score = lin_reg.score(X_train,y_train) +print('Model Accuracy: ', accuracy_score) + +# The coefficients +print('Coefficients: ', lin_reg.coef_) +# The mean squared error +print('Mean squared error: ', + mean_squared_error(y_test, pred)) +# The coefficient of determination: 1 is perfect prediction +print('Coefficient of determination: ', + r2_score(y_test, pred)) +``` +Because there's a reasonably high correlation between the two variables, the accuracy of this model isn't bad! + +``` +Model Accuracy: 0.7327987875929955 +Coefficients: [[-8.54296764]] +Mean squared error: 23.443815358076087 +Coefficient of determination: 0.7802537224707632 +``` + +You can visualize the line that's drawn in the process: + +```python +plt.scatter(X_test, y_test, color='black') +plt.plot(X_test, pred, color='blue', linewidth=3) + +plt.xticks(()) +plt.yticks(()) + +plt.show() + +``` +## Polynomial Regression + + -🚀 Challenge: Add a challenge for students to work on collaboratively in class to enhance the project +🚀 Challenge: Test several different variables in this notebook to see how correlation corresponds to model accuracy. 
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/10/) diff --git a/2-Regression/3-Linear/notebook.ipynb b/2-Regression/3-Linear/notebook.ipynb index e69de29b..efebc6db 100644 --- a/2-Regression/3-Linear/notebook.ipynb +++ b/2-Regression/3-Linear/notebook.ipynb @@ -0,0 +1,187 @@ +{ + "metadata": { + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.3-final" + }, + "orig_nbformat": 2, + "kernelspec": { + "name": "python3", + "display_name": "Python 3", + "language": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 2, + "cells": [ + { + "source": [ + "## Pumpkin Pricing Per Bushel, by City\n", + "\n", + "Load up required libraries and dataset. Convert the data to a dataframe containing a subset of the data: \n", + "\n", + "- Only get pumpkins priced by the bushel\n", + "- Convert the date to a month\n", + "- Calculate the price to be an average of high and low prices\n", + "- Convert the price to reflect the pricing by bushel quantity" + ], + "cell_type": "markdown", + "metadata": {} + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " City Name Type Package Variety Sub Variety Grade Date \\\n", + "0 BALTIMORE NaN 24 inch bins NaN NaN NaN 4/29/17 \n", + "1 BALTIMORE NaN 24 inch bins NaN NaN NaN 5/6/17 \n", + "2 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 9/24/16 \n", + "3 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 9/24/16 \n", + "4 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 11/5/16 \n", + "\n", + " Low Price High Price Mostly Low ... Unit of Sale Quality Condition \\\n", + "0 270.0 280.0 270.0 ... NaN NaN NaN \n", + "1 270.0 280.0 270.0 ... NaN NaN NaN \n", + "2 160.0 160.0 160.0 ... NaN NaN NaN \n", + "3 160.0 160.0 160.0 ... 
NaN NaN NaN \n", + "4 90.0 100.0 90.0 ... NaN NaN NaN \n", + "\n", + " Appearance Storage Crop Repack Trans Mode Unnamed: 24 Unnamed: 25 \n", + "0 NaN NaN NaN E NaN NaN NaN \n", + "1 NaN NaN NaN E NaN NaN NaN \n", + "2 NaN NaN NaN N NaN NaN NaN \n", + "3 NaN NaN NaN N NaN NaN NaN \n", + "4 NaN NaN NaN N NaN NaN NaN \n", + "\n", + "[5 rows x 26 columns]" + ], + "text/html": "
" + }, + "metadata": {}, + "execution_count": 2 + } + ], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "pumpkins = pd.read_csv('../data/US-pumpkins.csv')\n", + "\n", + "pumpkins.head()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " Month Variety City Package Low Price High Price \\\n", + "70 9 PIE TYPE BALTIMORE 1 1/9 bushel cartons 15.0 15.0 \n", + "71 9 PIE TYPE BALTIMORE 1 1/9 bushel cartons 18.0 18.0 \n", + "72 10 PIE TYPE BALTIMORE 1 1/9 bushel cartons 18.0 18.0 \n", + "73 10 PIE TYPE BALTIMORE 1 1/9 bushel cartons 17.0 17.0 \n", + "74 10 PIE TYPE BALTIMORE 1 1/9 bushel cartons 15.0 15.0 \n", + "\n", + " Price \n", + "70 13.636364 \n", + "71 16.363636 \n", + "72 16.363636 \n", + "73 15.454545 \n", + "74 13.636364 " + ], + "text/html": "
" + }, + "metadata": {}, + "execution_count": 3 + } + ], + "source": [ + "\n", + "pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]\n", + "\n", + "new_columns = ['Package', 'Variety', 'City Name', 'Month', 'Low Price', 'High Price', 'Date', 'City Num', 'Variety Num']\n", + "\n", + "\n", + "pumpkins = pumpkins.drop([c for c in pumpkins.columns if c not in new_columns], axis=1)\n", + "\n", + "price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2\n", + "\n", + "month = pd.DatetimeIndex(pumpkins['Date']).month\n", + "\n", + "\n", + "new_pumpkins = pd.DataFrame({'Month': month, 'Variety': pumpkins['Variety'], 'City': pumpkins['City Name'], 'Package': pumpkins['Package'], 'Low Price': pumpkins['Low Price'],'High Price': pumpkins['High Price'], 'Price': price})\n", + "\n", + "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/1.1\n", + "\n", + "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2\n", + "\n", + "new_pumpkins.head()\n" + ] + }, + { + "source": [ + "A basic scatterplot reminds us that we only have month data from August through December. We probably need more data to be able to draw conclusions in a linear fashion." + ], + "cell_type": "markdown", + "metadata": {} + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], )" + ] + }, + "metadata": {}, + "execution_count": 4 + }, + { + "output_type": "display_data", + "data": { + "text/plain": "
", + "image/svg+xml": "\n\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n 
\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXUAAAEvCAYAAAC66FFZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO2deZxU1bH4v8UAMgKKKCAiBEUCLiiYUTD4EkURwQ3Nc4sLSVSMWzQaFCPGJahEDC6JS9yiRlyyGPQpkeD2XjSKDqJiojziHjRAYnhRf2hwqN8fdZq50/Qw3TP39r3TXd/Ppz/d9/SdPjV3qVunTlUdUVUcx3GcyqBD2gI4juM48eFK3XEcp4Jwpe44jlNBuFJ3HMepIFypO47jVBCu1B3HcSqIjuXsbIstttCBAweWs0vHcZx2z8KFC/+uqr2K2besSn3gwIHU19eXs0vHcZx2j4i8U+y+7n5xHMepIFypO47jVBCu1B3HcSoIV+qO4zgVhCt1x3GcCqKs0S+O41Q+cxYtY+a8Jby/ajVb9ahlyrghTBzRL22xqgZX6o7jxMacRcs4/4HFrF7TAMCyVas5/4HFAK7Yy4S7XxzHiY2Z85asU+g5Vq9pYOa8JSlJVH24UnccJzbeX7W6pHYnflypO44TG1v1qC2p3YkfV+qO48TGlHFD6NRBmrR16iBMGTckJYmqj6ImSkXkbeAjoAH4XFXrRKQncD8wEHgbOEJV/5mMmI7jtBukhW0nUUqx1PdW1eGqWhe2pwKPq+pg4PGw7ThOFTNz3hLWNDRdzH5Ng/pEaRlpi/vlEODO8PlOYGLbxXEcpz3jE6XpU6xSV+D3IrJQRCaHtj6q+gFAeO+dhICO47QffKI0fYpV6qNVdVdgPHCaiHyl2A5EZLKI1ItI/cqVK1slpOM47YMp44ZQ26mmSVttpxqfKC0jRSl1VX0/vK8AfgvsDiwXkb4A4X1FM397s6rWqWpdr15FLdzhOE47ZeKIflxx2DD69ahFgH49arnisGGeTVpGWox+EZGuQAdV/Sh83g+4FHgImATMCO8PJimo4zjtg4kj+rkST5FiQhr7AL8Vkdz+96jqoyLyAvBLETkBeBc4PDkxHcdxnGJoUamr6pvALgXa/wHsk4RQjuM4TuvwjFLHcZwKwpW64zhOBeH11B2njUybs5h7F7xHgyo1Ihw9sj/TJw5LW6zU8EUy0sWVuuO0gWlzFnP3c++u225QXbddjYrdF8lIH3e/OE4buHfBeyW1Vzq+SEb6uFJ3nDbQoFpSe6XjtV/Sx5W647SBGilcV7a59krHa7+kjyt1x2kDR4/sX1J7peOLZKSPT5Q6ThvITYZ69EsEXyQjVUTL6Purq6vT+
vr6svXnOE55GT3jCZYV8J/361HLM1PHpCBRZSAiCyMLFG0Qd784jhMbhRT6htqd+HGl7jiOU0G4Unccx6kgXKk7juNUEK7UHcdxKghX6o7jOBVE0UpdRGpEZJGIPBy2LxaRZSLyUnhNSE5Mx3HaA55hmz6lWOpnAq/ltV2tqsPDa26McjmO0w4Zte1mJbU78VOUUheRrYEDgFuTFcdxnPbM2/8oHI/eXLsTP8Va6tcA5wJr89pPF5FXROR2ESn4KBaRySJSLyL1K1eubIusjuNkHK/SmD4tKnURORBYoaoL8766ERgEDAc+AH5c6O9V9WZVrVPVul69erVVXsdxMoxXaUyfYiz10cDBIvI2cB8wRkTuVtXlqtqgqmuBW4DdE5TTcZx2wJRxQ6jtVNOkrbZTjVdpLCMtKnVVPV9Vt1bVgcBRwBOqeqyI9I3sdijwakIyOo7TTpg4oh9XHDaMfj1qEayQ1xWHDfOl7MpIW0rvXikiwwEF3gZOjkUix3HaNRNH9HMlniIlKXVVfQp4Knw+LgF5HMdxnDbgGaWO4zgVhCt1x3GcCsKVuuM4TgXhSt1xHKeCcKXuOI5TQbhSdxzHqSDaEqfuOKkyZ9EyZs5bwvurVrNVj1qmjBvi8dEZYODUR9Zre3vGASlIUp24pe60S+YsWsb5Dyxm2arVKLZa/fkPLGbOomVpi1bVFFLoG2p34seVutMumTlvCavXNDRpW72mgZnzlqQkkeNkA1fqTrvES7w6TmFcqTvtEi/x6jiFcaXutEu8xKvjFMaVutMu8RKv2aS5KBePfikfoqpl66yurk7r6+vL1p/jOE4lICILVbWumH3dUnccx6kgilbqIlIjIotE5OGw3VNE5ovI0vBecOFpx3Ecp3yUklF6JvAasEnYngo8rqozRGRq2D4vZvkcJ/Mcc8uzPPPGh+u2Rw/qyeyT9khRonTxjNJ0KcpSF5GtgQOAWyPNhwB3hs93AhPjFc1xsk++Qgd45o0POeaWZ1OSKF08ozR9inW/XAOcC6yNtPVR1Q8AwnvvmGVznMyTr9BbanecpGlRqYvIgcAKVV3Ymg5EZLKI1ItI/cqVK1vzE47jOE6RFGOpjwYOFpG3gfuAMSJyN7BcRPoChPcVhf5YVW9W1TpVrevVq1dMYjuO4ziFaFGpq+r5qrq1qg4EjgKeUNVjgYeASWG3ScCDiUnpOBll9KCeJbU7TtK0JU59BjBWRJYCY8O241QVs0/aYz0FXs3RL55Rmj6eUeo4jpNxPKPUcRynSnGl7jiOU0H4GqVOqxh6wVw+bWh03XWpEV6/bEKKEqXHtDmLuXfBezSoUiPC0SP7M33isLTFqnqqdQ1bt9SdkslX6ACfNihDL5ibkkTpMW3OYu5+7l0awtxUgyp3P/cu0+YsTlmy6qaa17B1pe6UTL5Cb6m9krl3wXsltTvloZrXsHWl7jhtoKGZ6LHm2p3yUM1r2LpSd5w2UCNSUrtTHqp5DVtX6k7JdKkprLCaa69kjh7Zv6R2pzxU8xq2rtSdknn9sgnrKfBqjX6ZPnEYx44asM4yrxHh2FEDPPolZap5DVvPKHUcJ1aqNZQwSUrJKPU4dcdxYiMXSpiLPMmFEgKu2MtE5pW6J7lkk50vepR/fdYYMrbJRjW8csn+ZZUhK8umjbxsPss/+ve67T7dO7PggrFllyMLbCiUsNxKvVpHDJn2qXuSSzbJV+gA//qsgZ0verRsMmRl2bR8hQ6w/KN/M/Ky+WWVIytkJZTQk48yiie5ZJN8hd5SeyWTr9Bbaq90shJK6MlHjuM4MZCVUMKsjBjSoJg1SruIyPMi8rKI/ElELgntF4vIMhF5Kbzc0e04VU5WQgmzMmJIg2ImSj8DxqjqxyLSCXhaRH4XvrtaVa9KSrguNVLQ1VKNSS5ZYpONagq6WjbZqKbA3pVNn+6dC7pa+nTvnII02WDiiH6pT0hOGTeEs3/5Emsj6qOD4MlHAGp8HDY7hVdZnNqe5
JJNsuBTz8qyaQsuGLueAq/m6JesUP/Oh00UOsBatfZKp6iQRhGpARYC2wHXq+oCERkPnC4ixwP1wDmq+s+4BXQF7jRHVta9dAWePTZUPbPSs32LmihV1QZVHQ5sDewuIjsBNwKDgOHAB8CPC/2tiEwWkXoRqV+5cmVMYjuO4zRPNVfPLCn6RVVXAU8B+6vq8qDs1wK3ALs38zc3q2qdqtb16tWrzQI7juO0RDVXz2zR/SIivYA1qrpKRGqBfYEfiUhfVf0g7HYo8GoSAnq2ntMcWchqBV/OLoscPbI/dz/3bsH2SqcYS70v8KSIvAK8AMxX1YeBK0VkcWjfG/hu3MJ5tp7THFnIagVfzi6rvLXy45LaK4kWLXVVfQUYUaD9uEQkiuDZek5zZCECB6p7Qi7LPPNG4SiX5torCc8odZw2UM0Tck42caXuOG2gmifknGyS6dK7nq23PlkpN5s2WclqreYJuSwzelDPgq6W0YN6piBNecm0pb7634X9o821VzpZKTebhWzOVy7Zfz0Fnkb0y4I3/1FSu1MetunVraT2SiLTlnpWJsOc9cnC6CCN8MV8lq74pKR2pzxU8wR2pi11x3Gc1lDNE9iu1B3HqTiqeQI700q9uUmvaizx6mSTwb27ltTulIfmJqqrYQI700r90mZ8X821VzpZmKB0mnLa3oPJt/0ktDvpUfeFnnTIOzEdxNornUxPlDa3nmAaK5NnBVfg2WLmvCXrLS6gVPc1mgVmzltSsJ56GudlzqJlzJy3hPdXrWarHrVMGTckURkyrdSreZ1Bp33g12g2ycp5mbNoGec/sHjdItjLVq3m/AesLlBSij3T7pdqXmfQaR/4NZpNsnJeZs5bsk6h51i9pqFZL0QcZFqpZ2Vl8iwx9IK5DJz6yLrX0Avmpi1SVTNl3BBq8py3NR2kqq/RLDBl3BA65Z2XTimclzRGDJlW6llZmTwrDL1g7noLcX/aoK7YU6T+nQ9pyHPeNqzVqlgLM/MUmsEuM2mMGDLtU4dsrEyeFfIVekvtTvJUc+Zilpk5bwlr8u6LNQ1a9onSgZvXsqyAVT5w8+SUeqYtdcfJOtWcuZhlsjJR+tyb/yypPQ5aVOoi0kVEnheRl0XkTyJySWjvKSLzRWRpeN8sMSkdJ6NUc+ZilsnKRGkaD/1i3C+fAWNU9WMR6QQ8LSK/Aw4DHlfVGSIyFZgKnBe3gF5qtpEuNVLQ1dKlpjoVSBauDS+9m02mjBvClF+/3MQF06mm/BOlNSIFFXiSD/0WLXU1cgv7dQovBQ4B7gztdwIT4xYuK6Vms8Lrl01YT4F3qRFev2xCShKlR1aujekTh3HsqAHrbtIaEY4dNcD96VmgUFZYmcmqpY6I1AALge2A61V1gYj0UdUPAFT1AxHpnZiUzjqqUYFnnekTh7kSzxgz5y1hTV5U0pq15Z8oTYOiJkpVtUFVhwNbA7uLyE7FdiAik0WkXkTqV65c2Vo5HcdxiiYrE6VpUFL0i6quAp4C9geWi0hfgPC+opm/uVlV61S1rlevXm0U13Ecp2WyMlGaBsVEv/QSkR7hcy2wL/A68BAwKew2CXgwKSEdx3FKoZqz0Yux1PsCT4rIK8ALwHxVfRiYAYwVkaXA2LAdK15q1mkOvzacDZGVbPQ0rlPRMiZJ1NXVaX19fdn6cxzHSZNpcxYXDHktNUJKRBaqal0x+3pGqeM4TkJsqIxEUmS+9ovjNEcWko8cZ0OkEafulrrTLslK8pHjbIg0yki4Unccx0mINBbAdqXuOI6TENMnDmNw765N2gb37ppoBrIrdcdxnISYNmcxS1d80qRt6YpPmDZncWJ9ulJ3HMdJiDSiX1ypO+0STz5y2gOZrdLoOFnEFbiTdTJZT91xHMdpHWlEv7il7jiOkxC5KJd7F7xHgyo1Ihw9sn+i0S9e+8VxHCfjlFL7xS11x3GcBJk2Z3FZLXVX6o7jOAmRX6WxQXXddlKK3SdKHcdxEiKTceoi0
l9EnhSR10TkTyJyZmi/WESWichL4eUrIjuO40TIapz658A5qvqiiHQHForI/PDd1ap6VWLSOY7jtGMyGaeuqh+o6ovh80fAa0B514RyHMdph4zadrOS2uOgJJ+6iAwERgALQtPpIvKKiNwuIslJ6TiO0w55+x+rS2qPg6KVuoh0A34DnKWq/wJuBAYBw4EPgB8383eTRaReROpXrlwZg8iO4zjtg/dXFVbezbXHQVFKXUQ6YQp9tqo+AKCqy1W1QVXXArcAuxf6W1W9WVXrVLWuV69eccntOI6TebbqUVtSexwUE/0iwG3Aa6o6K9LeN7LbocCr8YvnOI7TfpkybkhJ7XFQjKU+GjgOGJMXvniliCwWkVeAvYHvJial4zhOO+T6J5eW1B4HLYY0qurTQKH4m7nxi+M4jlM55K961FJ7HHhGqeM4TgXhSt1xHKeCcKXuOI6TEIN7dy2pPQ5cqTuO4yTE/LP3Wk+BD+7dlfln75VYn15613EcJ0GSVOCFcEvdcRyngnBLvZ1R7lVUHMdpX7hSb0eksYqK4zjtC3e/tCPSWEXFcZz2hSv1dkQaq6g4jtO+cKXejmhutZQkV1FxHKd94Uq9HXH0yP4ltTuOU334RGk7IjcZ6tEvjuM0h2gZ/bF1dXVaX19ftv4cx3EqARFZqKp1xezrlrrTKraZ+ghRc0CAt2YckJY4juMEiln5qL+IPCkir4nIn0TkzNDeU0Tmi8jS8O4LT1cJ+QodQEO74zjpUoyl/jlwjqq+KCLdgYUiMh/4BvC4qs4QkanAVOC85ER1skJzDrtyB1bOWbSMmfOW8P6q1WzVo5Yp44YwcUS/MkuRHTzb2IEiLHVV/UBVXwyfPwJeA/oBhwB3ht3uBCYmJaTj5DNn0TLOf2Axy1atRoFlq1Zz/gOLmbNoWdqipUIu2ziXs5DLNp42Z3HKkjnlpqSQRhEZCIwAFgB9VPUDMMUP9I5bOMdpjpnzlrB6TUOTttVrGpg5b0lKEqWLZxs7OYpW6iLSDfgNcJaq/quEv5ssIvUiUr9y5crWyOhkjOZSncqZAvX+qtUltVc6nm3s5ChKqYtIJ0yhz1bVB0LzchHpG77vC6wo9LeqerOq1qlqXa9eveKQ2UmZt2YcsJ4CL3f0y1Y9aktqr3Q829jJ0eJEqYgIcBvwmqrOinz1EDAJmBHeH0xEQieTpB2+OGXcEM66/6WC7eVm7KynmqwOn/TKNoU4emT/JhU8o+1OdVGMpT4aOA4YIyIvhdcETJmPFZGlwNiw7ThlYeqvXy6pPSnyFTrA0hWfMHbWU2WVY/rEYRw7asA6y7xGhGNHDfDolyqkRUtdVZ+meXfpPvGK4zjF8WlDYV9xc+1Jka/QW2pPkukTh7kSd7ygl+M4TiXhZQKcVnHMLc/yzBsfrtsePagns0/aI0WJHD8nTanW5DS31J2SyVceAM+88SHH3PJs2WTo2IxDsLn2pBjcu2tJ7UmRhXOSJao5Oc2VulMy+cqjpfYk6LNp4dDF5tqTYv7Ze62nwNOIfsnCOckS1Zyc5u4Xp12SpeSjcitwp2WydH2UG7fUnXaJJx85G6Karw9X6k7JbLJRTUntSTBl3BBqOzXtr7ZTTSrJR1lg9KCeJbVXOnsPLZy93lx7JeHuF6dk/vVZQ0ntSZCLYqjG6IZCzD5pj8xEv4y8bD7LP/r3uu0+3Tuz4IKxZZXhydcL15lqrr2ScKXutFsmjuhXtUq8EFkIX8xX6ADLP/o3Iy+bX1bF7j51x3GcGMhX6C21J0U1+9Qzb6lnYSjnNGX0oJ4FQ+Wq1X+780WPNnE9bbJRDa9csn+KEjlTxg1hyq9eZs3axrIRnTpIVcy5ZNpS39BQzkmPv6z4uKT2SiZfoYPNLex80aMpSeSso1B96Cog00o9K0M5pyl+XhrJwqRxjjmLljF6xhNsM/URRs94IpXsyT7dO5fUnhQz5y1hTV5xtzUNWhXJR5lW6lkiCzeM4zRHVtLiF1wwd
j0FnobLtJonSjPvU88CuRsml3acu2EAj75wMsGG0uLLfY1mYc5rqx61LCugwKthotQt9SKo5joSTvugmi3TQlRzclqLSl1EbheRFSLyaqTtYhFZlrcSUsXiN4yTdao5hK8QE0f044rDhtGvRy0C9OtRyxWHDauKkXUx7pc7gJ8Cd+W1X62qV8UuUQap5qGc0z6YMm5IExchVI9l2hzVmpzWoqWuqv8DVGf9zkA1D+Wc9kE1W6ZOU9oyUXq6iBwP1APnqOo/Y5JpHQIUWnGy3OGmXmfEaY4uNVJwXdQuNeUPiq5Wy9Rpiqi2vFCviAwEHlbVncJ2H+DvmM79IdBXVb/VzN9OBiYDDBgw4EvvvPNO0cINnPpIs9+9PeOAon/HiZ9C56Zaz8nQC+Y2UexdaoTXL6voaSanzIjIQlWtK2bfVlnqqro80tktwMMb2Pdm4GaAurq68i71XoFkRZlWqwIvhCtwJ0u0KqRRRPpGNg8FXm1uXyc+mhu5bGhE4zhOddGipS4i9wJ7AVuIyF+Bi4C9RGQ45n55Gzg5QRkdxymCOYuW+byP07JSV9WjCzTfloAs65GViVKAsbOeYumKT9Ztp7G4sOM0h2c9OzkynVHanAO+3I75fIUOsHTFJ4yd9VSZJXGcwnjWs5Mj00o9K+Qr9JbaHafceNazk8OVejuiuYgTj0RxvEyAkyPTVRr7dO9csEZ3uWszZwlX4E4hvEyAkyPTlnpWajMP7t21pHbHKTdeJsDJUVRGaVzU1dVpfX192fqLE49+cRwnLRLPKK1GXIE7jtMecKXuOE5FUq3JWK7UHcepOKo5GSvTE6WO4zitoZqTsVypO45TcVRzMpa7XxynQjjmlmd55o3GRcpGD+rJ7JP2SFGi9KjmJSjdUnecCiBfoQM888aHHHPLsylJlC4DNy+svJtrryRcqTtOBZCv0Ftqr3See7Pw6prNtVcSrtQdx6k4GppJqmyuvZJoUamLyO0iskJEXo209RSR+SKyNLxvlqyYjuM4xVMjhVddaK69kijGUr8D2D+vbSrwuKoOBh4P247jpMToQT1Laq90jh7Zv6T2SqJFpa6q/wPkO+YOAe4Mn+8EJsYsl+M4JTD7pD3WU+DVHP0yfeIwjh01YJ1lXiPCsaMGMH3isJQlS56iCnqJyEDgYVXdKWyvUtUeke//qaotumDac0Evx3GctCiloFfiE6UiMllE6kWkfuXKlUl35ziOU9W0VqkvF5G+AOF9RXM7qurNqlqnqnW9evVqZXeO4zhOMbRWqT8ETAqfJwEPxiOO4ziO0xaKCWm8F3gWGCIifxWRE4AZwFgRWQqMDduO4zhOyrRY+0VVj27mq31ilsVxHMdpI2Vdzk5EVgLvtPLPtwD+HqM4rcXlaEoW5MiCDOBy5ONyNKUtcnxBVYualCyrUm8LIlJfbEiPy1FdcmRBBpfD5ciKHF77xXEcp4Jwpe44jlNBtCelfnPaAgRcjqZkQY4syAAuRz4uR1PKIke78ak7juM4LdOeLHXHcRynBVypO47jVBBVodRFpPIXJtwAIlIV57m9IiL7ishZKfU9VES2TqPvOBCpvFUvRGQfETm+tX9f8Te7iOwDfE9EupTzAsjCxSYiY0RkF1VdmzXFnoXjAyAiNSn2nTsGI4HPUuh/AjZ5t1G5+24LIrKJiHQD0BQmBZO8l0RkPHAdUNNaYzRTN3rciMh+wE+Ap1T1UyBxRSIiPaDxYktZee0M1IvIsDQVu4jsLyKXi8htInKUiAxRVU1Rnt1F5HAAVW1IQ4bQd04hdQPKukSRiIwDrgImq+obWXnItkR4ED0IzBGRX4vIJBHZvEx9D899DNuxXr8isitWR+sbqvpzVV3dmt+pWKUuIocAlwLfUtU/iEg/4AgR6ZJgn6OBuSJyqoh0F5GaNJWXql4DTAGeSstiF5EDgWuBhcAyYBhwh4jsGuQp6+hJRDYC7gHuE5EHwlB3m9z3ZZRlmIjMCZsfAhuXse/xw
K3AUGBVaE5txFIswUj7KfAj4Gjgf4A64BwRSfShKCJ9gP8Rkf8CpgXDZG3k+ziunY2A/1bVF0RkUxH5lojcH9aJHl1sHxWr1IGzgVpVfU5EtgDmhu1PE+xzM8w6/i5wIfAjEdkkevKTRkT2E5GLRWRPEZGg2M/DFPvO5VTsIrIp8D3gRFX9jar+ALgM+BVwnYhsV87hsxqfAWcBs4AlwEHAz4Plvm71rqQUfOR33w2bvwA+AV6L7NMxvHdOoP/hwHRgX+AbwJ/CA//zNF1RGyI8jDsBBwLfV9VHVXWlql6HWe2bYP9Pkg/mT4GngP8HvAf8XkSOE5EvQ9vcQCKyi9jqciuA7UXku8AfgDHAG5hr7mygOHeMqlbsC6gHHgWexBRL9LtNY+znP4A9w+ezgZOBPbGb5y/AaUBdGf7frsDtQAOwCJgPnAFsG2T4ANgu7NuhDPL0Ap4HBuS1dweuBA4o47WwReTzrtiDZXDYvgxYDVwDXJ2wHJ3zztcdwFpMWcwGngaeAX4N3Ah0jLHvocBFWHGoXNuZwD+AncN2TbnOSSvkvxo4KXzeKNJ+LvCrMvT/FeDFcF3vCPwSexhfio1ASz5XmGF9ObZGRXdgHHAxMA3oH9nvcWCHYn6zoix1ERmeewGoFc/pAvRT1Vsj+x0JXBCHJRSGhLcCa0LTGmBvVX0aU7BbAdtgPsDT29rfhlDVTzA/6SzgbqwO/sfAI0BnoDfw5/yhY9wEqwpVXQm8Anwx8p2o6kdAJ8pUvjm4Vy4ILjlU9UXsZrxYRA7CFk4/DTtmO4tIIkvOi8hY4F4R+YGITAzn6wzgJsxyPxv4FmYMXAVco6qfx9T3xsB+2AN+y1y7ql6LKaUnw9xLQ5YsdhEZEfz/AP+HKVZU9bOIK/UeoHPccovIl0XknkjT88DvsXu6AzAKuAQbnZ+AjRhK+f3hgAJXAIuxSevFqnqxqk5X1ffCfkeF3y5uPdC0n74xPkX3B14H7gMeAw6LfLeQ8CQHDgNeAraPoc9xmD/04Lz2OZiltTT3HTAC+GJC//tQ4KvhvTswEJtwuQgr99kH2AmzdBYAQxI8D2OA7wMHhu3pwP3AVnn7nQmcX6ZrY0vgB8DMiFxdgXvDjXJQma7PBZgS/yFwC42jph5BlrsSluELmGvwFmCvvO9Ox0YMO5bjnBQp7wTs4TslyF4L/Am4IW+/ycA8oEvM/XcE3gTujLR9N7S9nbtuglwljfyxh8IvME+CYJPll2PW/xfCPoOwUf/LwE5F/3baJy6mgz8uKOqdML/oiZj10SGyzwuYNfRHihzGFNHnn4PyvgnYMvLdWMwXNj5sd25rfxuQ48Dw8Pg1NoR7AfhSUGSzwnHYIbL/xgnKMgFzeR0FjIi03wH8BrOItwOODzdnogoEGyIPxiYhu2NzCz+OnJfvA3Mj+8fukgo37HZBYR4Q2gYAdwG7RfbrghkDd8fc/3DgUGAvYGtshHQ2ZhXulbfvyST4wC9R7q8A/0twa0batwzXzmxM2Z8Trv+ilV4Rfe9JMDiwCeRXgHsi3/8KuDH3fRv66Q7cBjxAo2Kfjin2rYD+4XotSV+lfvJiOgk3AK9GtocHJbczEV82ZjHuEkN/QzDf539gFt+PwkXWK3w/ICjXnJUuCf3fu4cL/8the2PsgfYh5jceEGS7Ms6LvhlZ9gg318i89pHh/QzMQnwCs6qGJSzPeGxI+1vMAuoI9A2K/dqgNKuCd9wAAB5iSURBVDqGG/bohGSQyOfbgjxdw/bvwrGYhU3c9gA2B/rG2P9B2JzOzzEXxd8wg2OToAxvAb6S5Hlo7THDRg7fCZ9rwnun8L4RZjFfirku2myk5cmwI/A+cG7Yzl0n94XtccCdtM6H3i3vuugWfiun2DfBXDpzMd996X2kfRJjPBGzgXnh81XAX8ON83Y4aGNj6meXcGPsGLY7YMOkGUGGPqH9RGxoVUtySn0C8MPwuSZyQ5yIhXt1Ccr2EiIThQnJcijwv
Zws4f1KTNFfF5FtEywKKUlZJmATxTthVvLDwDbhuy6YhXctNnKYFKcizZOjlqYTetdiD+HrMBfhUcApmHvwFmCTGPveBRtJjoq0HYG5m/bBLPazMGW/Z1z9xij/2cD14XOH8C7hfhse2S+RCX9s/ulN4Jyw3RF7KP8syPF47l4v4TcHY0bGiAKK/XbglrDdDzgfmwssXfa0T14bDvru2MTPNyMn/XZsuby5Ybsm3NTntvYAFej325ileSDQLdK+LabY78KGiJKUIg2KenvMEnuW4M8LfQrmR3+URt9crL7GZmQ6F3gksj0Cm9/4ImalTinTddETi3a6K2zXYlbXA5gP87Cg0C4BLgA2S0iO/YH/Cjfx/TSOWC7BXDHRaJyOcV8rWJbqVeFzp8g9ciTmhtwSm8A/hYQeaq2QeTdg3/B5IuayyxkDHSKfrwC+Gj7HYjBho+9xke3O2EPxQxpdMR2xCLIftrZfbF7rV5gXIarY+2PzKsNyfbX6f0n7RLbywByETYpehLk5fg6cEL67Cngsb/82n/i8E/CtcMMeRBhOh/ZtsOSIm0jOgtgfs+xGYNbErdjDbaO8/eZQhjDKSH8DwnkYH2nLWexTgO+WQYbcKOmgcB1cjs2hTA43zVFYJFCf8No8ITnGA28BhwRlcRU2r/DN8P21mNUXa/951+h+mO85d0w6RM7Hb3LXBsGlkfYrXNdLgAMIRghmDf8OM85yD6VjsInDrWPsuzP2gL8Je7BsicWJH4cZSK/T6IqpAQaV+Pu9gYGR7fOw+Pqdifjkg1If3eb/J+2T2YoTsEM4+buF7Y3Cif5Z5Ka5B7NgY5kUDBfczzC3RsfQdlBQ7AcQGTZjkSe9E/rfx2ETsGMibT/AQvH2zykJ4OtY1MCWScgR+tgTGzFsFrZ7YPG1s4hEAwVF+t+EmPAE5dkHs0C/HjlnjwEPR/bZLCi0oQnKUYO54Q7La5+EjRSGh+3Z2KRybK45oGd4z1m012Mjyx452cL7fcTkjoxJ7pFBce4dtqMBDrkAgLsxH/prxDspugU2F9URMz5uDPfYqZF9dsQSg0o2TLAR/Qs05h78OPR1DjZ6/I/Q//7Ac8TwsEr9hLbiIO1NY3jiRuG9S7hpbo/sdxuR4P029NcB8299iiVpXBduihFY0srPMcssURcH5id+E3ugjabpCOFCzCe7GEug+TMJTkTSODR9C8uy2ztcqN2xiJK7gyK/ItysiU7SRo7Ph9hcQi5BZRw2csr5Rcdjo5zYrLw8GYZgD7cbCEP56HWB1SGaHdmO7aEblMLTmPGR+3+/jkW5nEZwP2Lp9f8bx70Ro+yHAheFz73C9o+B6aFtV2y0NYkYjQPMIHsem3+5JJy7KZgVvVvevkMo3ULfNxzrPYOOGhXOz4Phfvk2Nt83BxsZxHLPtpuVj0Sku6p+JCIjgB+o6qGhvaNaivPGWBzwJar66xj6Ew0HR0Q2wyzzQzBf9T+wSafPQ/vnwCRVfbit/TYjy7aYhTkJi+CYiinv32so+hNk3A1LNnpfVd9OQpaITCdj8wj12A33CnZzzMaiOHIx/EtU9c0kZQnydMOy8N7DLL/nVfWnoc7JOGxupQ9WLOlPCfS/Baa0p2Ex+N1U9Vvhu1pVXR2u3e+q6vGhfd011sa+D8QMjysx18GXsJC7RSIyCauPMgFLnPkycKyqLm5rv21FRLbCMmmHYsbBKVgSz7uYEbUFdi3tGcdxyut7HObfnoRFBT2BuTKvwazozYDHVfXRNvQxBViuqndF2jbFztPfVfWCUJOqC/Cxqi5vbV9NSPspXeQTbzwWuzkCCyFcitWAyH2fG1bOAvaIqc/N87a3Ak7CfLJDQtuWmIJ9gISG9JiiPoxI8g5mgT0BHEwZJkEj/XajMaxseyyy6Eth+wfYBOA1WORCtzLIs25iLWx/B4t02TNcL98O7TlXWewjBpr6sX+JuQK3wHz5F+btezo2SRabHxsbHa3FjBkwd+Qsgisys
t8ozHWZyCilFXJvgY14Tw7bX8PcLJcTscYxP3Osk9mYwr4fG93m/Pd7AbeGz7nQ1xuJuDpb0c8VhHj2vPZ9SLCsQeont8iDMw2rJjcDe6p/ERv6RxX7EZj7YWAM/Q3DLIXv0TQzddNwYz5CZPad5CZFJ2CTQkdgw7XopMrRQbEfRILJTZH+xmND+SMIcxXYhPFUbBi7BDgcSyy6hgSH95hLrF9QZp9hw9hxmD/7Qmz4vg82ujkr/E0iSVc0dYOdDEwNn/fA5nXuDMfsO5ivNraEK8xfOxJ7uP+dxqSq+7BJ0gdC/3UkNCncSrlzrqAJWLTYN4hE6ET2+2Y4hj0SkOFAzGI+C/NpX4m5zXITsv0w46Tk+bHIb0zARm89877vHu7dZCbq0z7BRR6kUZjP6SJsMm4YFvP5KmaBPRgu4p1j6m9XLDRyOmYp3Bv63Dgo8VMw321i8b1YcszLROKMC+xzJFZgaELCx/8gzL2ybjI2tO+GTUb+FdgvtHUg4YgKGudSjseG7zcFhfpYuDG/H74fj01Oxq4Uwu8PwpJ7foIlw5yMRWx0D99vhvn0Z2K+1DgV+n7YvEZucnEi8K/I/bBjOF83ByWfCaUelOmThElHzEK/Oyj2XKRO/6BsFxF/YlG0oNoEbFTzBPBkpL0m+l7Cb/fK2+6JFdW7Mu++OQ6bOE1kNJv6Sd7AAdqWptXkZmFD14vDQdoO80UNxqyVWGNtsQnRe0Mf52Pxxr/ERgqDMesrSWv0OBrDqHpELsAziAxHMdfMFxKUY2vMlbB72Ja87y8Gni7jdTEUm+zsH7a/iY3idsKs8/vDa1PMFZFYWYTQ/2jMT30H9nB5D7PSO+ftF2e1xXHAcuC0sJ2zDMdjBeUOzdu/e7nOTwtyH4AZYqOj5wUbWf0iKPbemGtvDvFGuURLMkQV+77YQ2VKW64VLOrtGuBree1bYnMZN4R+pmIRPMkFMqR9ops5QDthQ+sFwKnYsHp7bLg/Cgtt+hEJpDhHbpCtgdvC572wBR5uxqyHC0nYZxyU1fxw0T2GPVDuwWbrryzjueiDWX7d8xVV5DjdhUXAJJI5W6DPm7BheW4Yfzrmfshl+SYWyhm9Rpr57mQsOW1vIiOWuI5NRDHej/l9R+d9PxGbyD887r7bKHctZiSNCds1ed8fhrmKTsHmzeIuzvUoNnme28632K/C5oW2auXvb4ZFfs0EDinw3R6YC+40EgypVc2uUu+AhTT9ISj208IJfxyrAtgznISLiC8WPd8C7Y5FcszGhrkTQ3sdecOsGP/v/ljoVLew/QNshHA1jfWut8TCBXsmIUNEli2widG+4RxsFTk3uaipAdiDdmYZFGm+5XsNFv+7ddg+E7Neo2nxsSszLJP5kPC5Y6Q9Glt9Sjhm/xFz39tgscy7BUVxGTYZNypvvyOwCJKuWVDoQaaNMIt1v2a+l6BcbyHetQ52pfHhP5embpZouOk3MGOt5Psqcj/0wCJnZmGRctFJ9CMpsaxAq//ntE923sEZjFk5nYJCOROLZhgXDspbwMyw77bEkFqNPUG/kHdycu87YUkHiae4Y37rl7DQrjkE10+B/Y7BLPiuCcqyNfYQzSmvK7AHbO+wnfM5fhtLmU66lsvQcLP0y2u/GnPF5IpkfS9cIxslpcywkLsPaXT/RG/cDnn7DYix3wOwEWp04YTtw/G/vIBiTzz6qEi5BwHbhs+zCIEHNK1VtDlwQfgc23VNY5ZqtGLovKhiD21HY26RVs87RP6XzcK1ejWNhuBxwEfEPD/QrCxpn/TIQRmFWSK5imUdg2K/CHM9bI65AgbG2OcmmBvn9dwNGDk5uczR87B1TpvcwDH/73uHi28kZrFshSU1/YlGq70f5pJ5kQSTeSKK+wwaE6v6YJFHz2IP095YpugiylB/G3vg3YQ95PvmffczbASXe9AkNSlaG/l8BeYKW68vkinfOxabqM5NRkejoIYExf5DIiODpK7VEuXuiY3iLg+fD
8Hq8Hwpb7+vY1Z0nBb6BGwycp+w3Tvy3TrFjhmLy0u9jingISig2H8Y7qE3SbgqaRM50j7x4SDkapPvgllZt2Ixqx2xiJNcFMp2MfZ5EDA/fL4cS6L5QtiO3jQXYe6OJGuiT6YxXjdq7d2C+U4lyPtAwgq9P5Za/p9h+9vhojwQe6j+MByLB7AIhliijYqUbQKNMfDRmP3NgmLvHLaTcLmMCUrniLC9FTbxdVhSfUb63g+bfM2537YJxyDqE/4iZgVfSBnzFoqUfxxmOF2EWecnYdFSx2ITpN/ERqixKT1sQY1VwNlheyAW7hzNaXgUixb6S6nXMTZqurzQsc5T7N/HsnzbXO67JPkyctLfBPaPtHXF0vxzir0W86Hfiblm2nQThT4XE6n9HS68ekKJ1tD2Daz4Uknpwa2Q54fALyPbOatzMBZrnQvhi600azNy1GIWxlU0ul5OIVKoK1ysm5KQRRyRZQjrr216QFDsZxFGbOEcPUGCER6YG+4d7EF2M1Ym4VzCRHrYJwkLvRPmUnoJG1Xmsqa/18zxSmSupxVy70IkiAHza18TFHvncB5/iNVCuS0uhR5RqD2xaJa7sAnY+cCZ4buowXYTkTK+RfaxHzZC/XIRcnQjoSqgG5Qx5ZO/f7hZnsLisntEvssp9t+Gi7uWeHzo+2HlM+eyftZoTrF3woaEy+K64ArIsTmNBZi2x6zNXWjqo+2IRb4MLPN5mUYoLxy2Tw6K/VDKkym6KWYJ3UBe2GhQCJdhFtAVmIsqEV8lNmrMPWCPxZKrrsMeehMxi3N6GY7FGeE6WAocl/d93yQeKK2UVbCJ/LXhdSk2yt4SS5Caik3+54rAxZrTQGO2c4cgy3ex+vHX5e03jla4DcPf/RP4eaStYCw7Kbq/Ult4WkR2xKzg/8SG/CcAY0RkE1i3iPIZWMbgXaq6WlX/3sY+D8BuyAux4delIrJD7ntVPQ/zz67AFMZ4TaBGhohMwEqK3iQil2BDwI7YhMrwyK6HYzf1/8UtQ0SWISLycG6xaBHphbl6PgH2CQsk/wxLhNo3KTmiqOr/YZPFq4DvisiAyHeP0Dg5uBirCPnnuGUI18Uc4GAR2RyLtNkbe6D8Fot0WAqcFRYfj7PvwSKyh4jsDaiq/gR7yK7Gol9y+30DG+LXxtl/a1Hjb9h1uxozjv5GYx37PbDosaki0ltV16rqmmZ/sATCot6zRGQ2ZqF/ARsdXAb0E5FRYb/jsMi6j0v8/XFYfswlwFoROR9A8xbqFpEOoV3b/E+1lhSf6rU0XTvzZCyB4zCalrKtJYbEIsxtcDihNgx2cV2EZQNun7fvaSRn/eWq6R2CKfDZob079nC7BRtFXEiMWbLNyDKCxtVY7sR86k/S6N8/FXvw5nzJSbtcvoiF6w3GrK3cwiOzaFqP+lTyapvELMdI7GF6ARbFcA/myz4CC8vLTV4fjLlG4o5yeTGck8ew0MTcXNOZWMbotjRWnEy8AmYr/48jsYfyUEy5j8SU7cvYCCe2RUEwC3oJlp16NGa4vR/6rMHcdfdjI/E/UPqk6GbhvswtdPJVLHhjat5+Y9iAW6Zsxz6Fk70zYQm2sB1d7msyptgnEu9M+PZAQ1AQx0Tah2BP3vUUe0L/e09sWHpo2N4dcwX9LMi2KRb7/T0suuSLCcpyQHhoHE5j1FEDcEZkn66Yj/1Kkk+2OhhLqnkEC2N9ASvDOiTcjLPCDXpEuGFHJCTHfuH3c4lMX8Ss4b9ilvr1mL82N9SPLZyTxpraX420XYS5KHcK22dgI7ul5bhmS5D7bNaPajkxKPa9wnYXzC0S9zqsfybPCMNcPX/DDJWasL3uOJbw+wdiWdPbRdo6Y3XQ748qdsxF94XUz0cKF8BILHri4khbNPPuBGwC5UDiy8Lrj4U3nYcNY3+BRVN0wUIFp2E+40QzvYIsB2ATLbtgEziX0LhQ9eyk+w8yfDUohuhEcbdwXO7J27cLCU/2YGnjr0eVAuZPX
4w9CLfFXC7PBoWb1DzHhHAe9gzbm9MYVfM1bNL6v4ISGRTa47pGcw/83DxGNDHmYqwYWDfM7XN6VhR6kO86LHb/OezhuwONk/vfwjJcCyYdxdD3pcCn0es18vla4IbweTNKTCwKOuhlbC4pPzmxS1Ds9xAxUrPwSuMC6IRZqHfT1GLP3Ty9MFdM3LVcrsaerB2xIdpD2ATtcMwqnky5Mr7MsllL06d8N2y4negC0aGvs2mMBog+ULtiFvsv41JWRcrzNRoXrY6O3G7EXFWCuWWuTVChb4uNXHLrUW4VbuiJkX0GY9bnEmJ0uUR+/wAiy9zlHYsngV3D50xMjEZk+yoWcrw1jaPO+whZxjRmuCayCHt4kLyVu3cIoyfMTXdDK39zy3DMcyusdcaij7aO/H4t5nK5nYwUTFMt00SpiOwS3juqTYy8jLk8thWR6QCq+m8ROQuzhu5R1Q9i6lvCx/MAxdLfP8DC1P4X813vAfxG4ypS3wJqhffHAd8UkR6h+XDsovksqX4jx2Ib7OEJtsBHTq5PsAnIzpgFkigiUicidZhvf2yQ4bPcpC1WK2MVlkn6AlbgLImJ642xBJTfAioiX8cMgBtVdU5uP1Vdqqq3YnHH78Yth9ok8LnA8yKyWd6x+BdWrAtVXRt3321BVf8bu56+qaonYyOZI4A7RORWbI5gqFqwQ5snEEWkR+S4oKpnY0ba8yLSS8PCMZjb5f9EpEPk2i+Wz7Dj/amIdMFccA9hZUNuFJGeoZ8/YIXV/tHGfys+yvAU3xJLkZ2DKdCob2o37Al/FlZGdQnBGolZBsEmmqZjyup1GlN4h5BCLGnoezzmRz4VK+VblkkvzLp4jMYFLjrQWMjsVCxZI9FRC2aVvoJNqI3CEs4OiciRy+idS4JZq1i1z6uwCb2e4Rp9mbzFDTDf7ZAyXhdv0Bj6dzwWn57I2retkG83LFJrp8j52hGb+5gUZB+DzZ99m0juRwx974q5eq4ilB2OfPcT4K3weQI28mqVmyrojHMwd+1fsbm+EzH38R1E1uHN2qscF0APbPLrXiyb7G2sfklOoXwJiyj4jIQzFIMCX0HeijSpngDz2/07ScVVoM+uNJYwjvqxj8T8/YmujkOeTx/zT34fC0E7NLLf4Vh0SWLKLCiJH2Eug8Hh2FyIGQC5WuVHBHkTTULLk2s85oo5BSt9nIkoFxojTe4M9/IBob0vNlf2EU0Tj2J1FWFG4rPhnL0bztVRke9nYq7Nl2ljBBvmEt0jnP+oK+w2bEnA1M9HoVdZ1igVkTHYk3UsNjF5IfaU/wVmiXUG/qYJDGkLyPJNLIb1SlX9f0n3VwwisnG5ZQlrI56A1SB/AVvp6T+xEgGvJtz32UCDql4rIp3VXG9bYJNqO2DuoT9iD7xjVPWVBGTooaqrwucdsXmWXImKFdhkZEcsk3MPzLXwWtxytCDjgZiiHKEJrKvaCnm+jE0UH6SqfxSRE7GHzldU9ZPw/fXYZO+yhGTYCFOqf8SOzZexaLF3gEtV9bUg1/MJXTeHY67cI1X1jbh/Pw4S8amLyOZhIeQcz2DD/b6YX3skFs3wZWwY95dyKPTAs9joIDOk8XAJN91MLBb7Y6y+yMFJKvRmfPprRKSDWmLZddjk6EPYJNXEhG7MfTH/67Uishs2nP8pdhyOxwqYXY8ZG18GTiy3QgdQW8i8RxYUeqAXVvtnWwC1+YW3gKPCcXwDc1fsE03IiYuwUPdnmFE4CdMlf8f0Sidgmoj8Arg97utGRPqGOb+LscXLM6nQgfgt9ZAteTE2NFuqqheE9rNpXPDiO6r6XyLSFYu+WBWrEC3LWHbL2GkkjNy+D5ynqgtDFl4HVf1cRE7HVlJ6KcH+h2Phd/8OcpyJDeeHYmnlvTBX0CosRK5NmcztHRE5GKvM+Qts9LQvtljLjtjo7hnsQd2AuVvHqer7CcnSAVPm3wt97odFcj0iItsAH6vqygT6zUW6LFHVv8T9+3HSMc4fE
5H9sZvkMmw4dLaIdFXVT1R1VhierQgKvUYt2qLsuEJPnQVYqOKRIoKqLsRSr4/ELLCHkuxcVV8SkV0xq/NfmGLYGxvBbYqFuXYEzlHVktLJK42Qfj8dewB/JiJzMaV6AjbJvI2qqohsjcWCr45LoYvI9tgD9o+q+jk0Rv6IyGIsMes7alFDqOpbcfRbCLVIl0eS+v04ic1SF5Ge2FDoa6r6WxHZHVsG7QFskuHEYMV/DZisqg2xdOy0S9L06UdkyC2cfaaq3hFcBrtgSv7BNFwuWSIo9NlYRuifg+Luj1nph2CTps+o6l0J9X8xlph3O/CsWp0V0aC0RORCbM7j4rQMxCwSm09dVT/Ewr5+EOLSL8PKlM4AhonIzzHLaAcsU8+pYtLw6ReQ4QXMlXC1iJyqqg2q+qKqzqh2hR7YFLPK14hIZyx2f1AwyOYRlqcTkRPi7FREhorInli26JtYxdTRYXSvIpLzMPydBPM62iuxul+CX6sBC4v7vqrOABCRfYCHwwz5VzSmymxO+yYMaZ8Or7RkeCFMnL4gIp+q6u1pyZI1VPXXIqKYO2xjbMm5u4O1/ImIzMOS1xbE1WeYTJ+ILYO3FkuGm4ZlfXcQkafD3MskLGJpklvpTYk9+kWbz5bsLCLdiWQwOk4WCD79L2ETflVPNHJFVX+DKdUOWHQLmHLtEOYbHlIrtxtHvzti0UbXYrHwR4Xt6VhxriOwLPSTsWCM05L0o7dXEotTF5Hx2PD6BuzknFrOobXjOKUhIjtoqE0fXB0Nke8Oxxa4OF9VH476tmPqeyOs/tIeWHjpIqxMRD+sFtEfMVfd3liU0nhVfTmu/iuJRJOPspY84ThOYULI3s1YUtg3Qlu+Yv8aZkV/S1V/H1O/0YnPbbGKnWNCP/+LlRruhxUIew4rOzBfVZfG0X8lknhGqceEO072CS6XQVhI8qeq+u1ce55iPxh4VVXfjKnfbVT1reDOWSsiucU06rFEtHosu3cH4FZV/WMc/VYyZSkT4DhONhGRLwGfqOrrYXsgFv/9uaqeFNqaKPaY+q3Bav68ixVPmxbCXO/BsonfxNYtvg2r/zMZuE/LVEm1PeNK3XGqlODueBarSPgbLKz0TiyJ6ARsjYPTwr6xKvZIzZ9BwBPYwjjDsInXnwYFPxFT7D9W1efj6rvSSW3hacdx0iW4UK7HlPlzWE2mC7CJyjeBvURkVtg3ToU+Frgzkjw0CouQ66CqPw39LcOWNXwsyOcUiVvqjlNl5PzXke0fY2Vmf4KV9zgSKwFwDPD/sOX9YqmnEkqJXIrVkemDVUzNhUw+j1nlMyL7d8yVCHCKw5W641QRIvJVwnKSGlldTER+ghXtuijiX98SW6zkrzH1nSslckio/9QfK8n9a1X9lYgMxvzpP1fVC+Posxpx94vjVBfnYhUp54rI8SKyF4CqnkGoSS4iOwcL+W9xKfTQR66UyAwR2URV38OWjOsZfPZLsTUXjhQr313qEnQObqk7TlUR6jIdDryPrSI0Clv2bYaqLheRK7BFt09S1X8nJMN4rHb+vNDX11X105yrRUQ6eSmR1hNr7RfHcbJHXq7ISmzVsYWqeoOIHIP5t7uHsh7HA7VJKXQAVf2diJyCFQTbMij0Lqr6adjFfehtwN0vjlPBhHLXl4tI/zBB+j7wM+DbIjIZi3Y5CpusXA70VdV/JC2Xqj6GLT7+pIj0jih04iw/UI24+8VxKpRQpuMybPJzTqS9I3A1FuVyjKrOT0lEROQQLNmpDtPnrpDaiCt1x6lAQuTKvcC5obxwZ8zduhnwATAeW3x9x7B/kzDHMsvaTat8hak4cfeL41Qmn2GRJZ+KSBespstDWGGsm4C5wBsici40LhOXBq7Q48UtdcepQEI44NnY0nw7YpmZTwOLscqH92MLRb+kCS0S7aSDK3XHqVBEpBtWT6U/tubqZ6H9duB3qvqrNOVzksGVuuNUEWGxi6nAkar6l7TlceLH49QdpwoQk
b5YtMtJuEKvaNxSd5wqIKxsNAZY4gq9snGl7jiOU0F4SKPjOE4F4UrdcRyngnCl7jiOU0G4Unccx6kgXKk7juNUEK7UHcdxKghX6o7jOBXE/weJZsGU4jnUAwAAAABJRU5ErkJggg==\n" + }, + "metadata": { + "needs_background": "light" + } + } + ], + "source": [ + "import matplotlib.pyplot as plt\n", + "plt.scatter('City','Price',data=new_pumpkins)\n", + "plt.xticks(rotation=45)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ] +} \ No newline at end of file diff --git a/2-Regression/3-Linear/solution/notebook.ipynb b/2-Regression/3-Linear/solution/notebook.ipynb new file mode 100644 index 00000000..f0c5a590 --- /dev/null +++ b/2-Regression/3-Linear/solution/notebook.ipynb @@ -0,0 +1,333 @@ +{ + "metadata": { + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.3-final" + }, + "orig_nbformat": 2, + "kernelspec": { + "name": "python3", + "display_name": "Python 3", + "language": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 2, + "cells": [ + { + "source": [ + "## Pumpkin Pricing Per Bushel, by City\n", + "\n", + "Load up required libraries and dataset. 
Convert the data to a dataframe containing a subset of the data: \n", + "\n", + "- Only get pumpkins priced by the bushel\n", + "- Convert the date to a month\n", + "- Calculate the price to be an average of high and low prices\n", + "- Convert the price to reflect the pricing by bushel quantity" + ], + "cell_type": "markdown", + "metadata": {} + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " City Name Type Package Variety Sub Variety Grade Date \\\n", + "0 BALTIMORE NaN 24 inch bins NaN NaN NaN 4/29/17 \n", + "1 BALTIMORE NaN 24 inch bins NaN NaN NaN 5/6/17 \n", + "2 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 9/24/16 \n", + "3 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 9/24/16 \n", + "4 BALTIMORE NaN 24 inch bins HOWDEN TYPE NaN NaN 11/5/16 \n", + "\n", + " Low Price High Price Mostly Low ... Unit of Sale Quality Condition \\\n", + "0 270.0 280.0 270.0 ... NaN NaN NaN \n", + "1 270.0 280.0 270.0 ... NaN NaN NaN \n", + "2 160.0 160.0 160.0 ... NaN NaN NaN \n", + "3 160.0 160.0 160.0 ... NaN NaN NaN \n", + "4 90.0 100.0 90.0 ... NaN NaN NaN \n", + "\n", + " Appearance Storage Crop Repack Trans Mode Unnamed: 24 Unnamed: 25 \n", + "0 NaN NaN NaN E NaN NaN NaN \n", + "1 NaN NaN NaN E NaN NaN NaN \n", + "2 NaN NaN NaN N NaN NaN NaN \n", + "3 NaN NaN NaN N NaN NaN NaN \n", + "4 NaN NaN NaN N NaN NaN NaN \n", + "\n", + "[5 rows x 26 columns]" + ], + "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
City NameTypePackageVarietySub VarietyGradeDateLow PriceHigh PriceMostly Low...Unit of SaleQualityConditionAppearanceStorageCropRepackTrans ModeUnnamed: 24Unnamed: 25
0BALTIMORENaN24 inch binsNaNNaNNaN4/29/17270.0280.0270.0...NaNNaNNaNNaNNaNNaNENaNNaNNaN
1BALTIMORENaN24 inch binsNaNNaNNaN5/6/17270.0280.0270.0...NaNNaNNaNNaNNaNNaNENaNNaNNaN
2BALTIMORENaN24 inch binsHOWDEN TYPENaNNaN9/24/16160.0160.0160.0...NaNNaNNaNNaNNaNNaNNNaNNaNNaN
3BALTIMORENaN24 inch binsHOWDEN TYPENaNNaN9/24/16160.0160.0160.0...NaNNaNNaNNaNNaNNaNNNaNNaNNaN
4BALTIMORENaN24 inch binsHOWDEN TYPENaNNaN11/5/1690.0100.090.0...NaNNaNNaNNaNNaNNaNNNaNNaNNaN
\n

5 rows × 26 columns

\n
" + }, + "metadata": {}, + "execution_count": 18 + } + ], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "pumpkins = pd.read_csv('../../data/US-pumpkins.csv')\n", + "\n", + "pumpkins.head()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " Month Variety City Package Low Price High Price Price\n", + "70 1 3 1 0 5 3 13.636364\n", + "71 1 3 1 0 10 7 16.363636\n", + "72 2 3 1 0 10 7 16.363636\n", + "73 2 3 1 0 9 6 15.454545\n", + "74 2 3 1 0 5 3 13.636364" + ], + "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
MonthVarietyCityPackageLow PriceHigh PricePrice
7013105313.636364
71131010716.363636
72231010716.363636
7323109615.454545
7423105313.636364
\n
" + }, + "metadata": {}, + "execution_count": 19 + } + ], + "source": [ + "from sklearn.preprocessing import LabelEncoder\n", + "\n", + "pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]\n", + "\n", + "new_columns = ['Package', 'Variety', 'City Name', 'Month', 'Low Price', 'High Price', 'Date']\n", + "\n", + "pumpkins = pumpkins.drop([c for c in pumpkins.columns if c not in new_columns], axis=1)\n", + "\n", + "price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2\n", + "\n", + "month = pd.DatetimeIndex(pumpkins['Date']).month\n", + "\n", + "new_pumpkins = pd.DataFrame({'Month': month, 'Variety': pumpkins['Variety'], 'City': pumpkins['City Name'], 'Package': pumpkins['Package'], 'Low Price': pumpkins['Low Price'],'High Price': pumpkins['High Price'], 'Price': price})\n", + "\n", + "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/1.1\n", + "\n", + "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2\n", + "\n", + "new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit_transform)\n", + "new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit_transform)\n", + "\n", + "\n", + "new_pumpkins.head()\n" + ] + }, + { + "source": [ + "A basic scatterplot reminds us that we only have month data from August through December. We probably need more data to be able to draw conclusions in a linear fashion." + ], + "cell_type": "markdown", + "metadata": {} + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ] + }, + "metadata": {}, + "execution_count": 20 + }, + { + "output_type": "display_data", + "data": { + "text/plain": "
", + "image/svg+xml": "\n\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n 
\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAULUlEQVR4nO3dcYyU9Z3H8c+H7SobtbdyLNy6QOkRQq9X2qWZIA3JhdZ6GDR1NedVUj1yZ4p3qYmNDT2ozam5NpqjVnPJxQtWU+7q2TMpRWNpOUI1TU1LuygCHnLUhlpxA1s9qva2Fpfv/TEPdBlmmefZnWfneXbfr2Qy83znGeb7+AsfHp95nufniBAAoHymtboBAMDYEOAAUFIEOACUFAEOACVFgANASb1rIr9s5syZMX/+/In8SgAovd27d/8qIrpq6xMa4PPnz1d/f/9EfiUAlJ7tX9SrcwgFAEqKAAeAkiLAAaCkCHAAKCkCHABKakLPQsHksfW5I9q4/aBePT6kSzo7tG7lIvUt6Wl1W8CUQoAjs63PHdGGLfs0dGJYknTk+JA2bNknSYQ4MIE4hILMNm4/eDq8Txk6MayN2w+2qCNgaiLAkdmrx4cy1QHkgwBHZpd0dmSqA8gHAY7M1q1cpPZpPqPWPs1at3JRizoCpiYCHGPjBssAcpcqwG0ftr3P9h7b/Ulthu0dtg8lzxfn2yqKYuP2gzoxfOZcqieGgx8xgQmWZQ/8oxHRGxGVZHm9pJ0RsVDSzmQZUwA/YgLFMJ5DKFdL2py83iypb/ztoAz4ERMohrQBHpL+y/Zu22uT2uyIGJCk5HlWHg2ieNatXKSO9rYzah3tbfyICUywtFdiLo+IV23PkrTD9otpvyAJ/LWSNG/evDG0iKI5dbUll9IDreWIaLzWyA/Yd0p6S9KnJa2IiAHb3ZKejohz7oJVKpVgRh4AyMb27hG/P57W8BCK7QtsX3TqtaQ/l7Rf0hOS1iSrrZH0ePPaBQA0kuYQymxJ37Z9av3/iIjv2f6ppMds3yTpZUnX5dcmAKBWwwCPiJ9L+lCd+muSLsujKQBAY1yJCQAlxf3AMSafevBHeual108vL18wQ498+iMt7AiYetgDR2a14S1Jz7z0uj714I9a1BEwNRHgyKw2vBvVAeSDAAeAkiLAAaCkCHBktnzBjEx1APkgwJHZdZV5qpmQR9NcrQOYOAQ4Mtu4/aBO1txC52SICR2ACUaAIzMmdACKgQBHZm21x08a1AHkgwBHZu/UHj9pUAeQDwIcAEqKAAeAkiLAAaCkCHBk1uZRfsQcpQ4gH6kD3Hab7edsP5ks32n7iO09yWNVfm2iSFZfOjdTHUA+suyB3yrpQE3tvojoTR7bmtgXCqzynhl1r8SsvIdL6YGJlCrAbc+RdKWkr+XbDsqAKzGBYki7B36/pM9LOllTv8X2XtsP27643gdtr7Xdb7t/cHBwPL2iILgSEyiGhgFu+ypJxyJid81bD0haIKlX0oCke+t9PiI2RUQlIipdXV3j7RcFcElnR6Y6gHyk2QNfLukTtg9L+qakj9n+RkQcjYjhiDgp6UFJS3PsEwWybuUidbS3nVHraG/TupWLWtQRMDU1DPCI2BARcyJivqTrJX0/Im6w3T1itWsk7c+pRxRM35Ie3X3tYvV0dsiSejo7dPe1i9W3pKfVrQFTynhmpf8n272SQtJhSTc3pSOUQt+SHgIbaLFMAR4RT0t6Onl9Yw79AABS4kpMACgpAhwASooAB4CSIsABoKQIcAAoKQIcAEpqPOeBYwqb
v/47Z9UO33NlCzoBpi72wJFZvfA+Vx1APghwACgpAhwASooAB4CSIsABoKQIcGQ22tkmnIUCTCxOI8SYENZA67EHDgAllXoP3HabpH5JRyLiKtszJP2npPmqTujwlxHxv3k0ieLhQh6g9bLsgd8q6cCI5fWSdkbEQkk7k2VMAVzIAxRDqgC3PUfSlZK+NqJ8taTNyevNkvqa2xoA4FzS7oHfL+nzkk6OqM2OiAFJSp5n1fug7bW2+233Dw4OjqtZAMDvNQxw21dJOhYRu8fyBRGxKSIqEVHp6uoayx8BAKgjzY+YyyV9wvYqSdMlvdv2NyQdtd0dEQO2uyUdy7NRAMCZGu6BR8SGiJgTEfMlXS/p+xFxg6QnJK1JVlsj6fHcukShcCEPUAzjuZDnHkmP2b5J0suSrmtOSygDwhpovUwBHhFPS3o6ef2apMua3xIAIA2uxASAkuJeKBgTrsQEWo89cGTGlZhAMRDgAFBSBDgAlBQBDgAlRYADQEkR4MiMKzGBYiDAMSb3f7JXPZ0dsqSezg7d/8neVrcETDmcB47Mtj53RBu27NPQiWFJ0pHjQ9qwZZ8kqW9JTytbA6YU9sCR2cbtB0+H9ylDJ4a1cfvBFnUETE0EODJ79fhQpjqAfBDgyCwy1gHkgwAHgJIiwAGgpNLMiTnd9k9sP2/7Bdt3JfU7bR+xvSd5rMq/XQDAKWlOI3xb0sci4i3b7ZJ+aPu7yXv3RcRX8msPADCahgEeESHprWSxPXnwexUAtFiqY+C222zvUXXm+R0RsSt56xbbe20/bPvi3LoEAJwlVYBHxHBE9EqaI2mp7Q9IekDSAkm9kgYk3Vvvs7bX2u633T84ONiktgEAmc5CiYjjqk5qfEVEHE2C/aSkByUtHeUzmyKiEhGVrq6ucTcMAKhKcxZKl+3O5HWHpI9LetF294jVrpG0P58WAQD1pDkLpVvSZtttqgb+YxHxpO1/t92r6g+ahyXdnF+bAIBaac5C2StpSZ36jbl0BABIhSsxAaCkCHAAKCkCHABKigAHgJIiwAGgpAhwACgpAhwASooAR2bLF8zIVAeQDwIcmT3z0uuZ6gDyQYADQEkR4ABQUmluZtVS77t9m347/PsJgKa3WS9+mek3W+n8d03T2++crFsHMHEK/TeuNrwl6bfDoffdvq1FHUFS3fA+Vx1APgod4LXh3agOAFNJoQMcADA6AhwASirNlGrTbf/E9vO2X7B9V1KfYXuH7UPJc9NnpZ/e5kx1AJhK0uyBvy3pYxHxIVVnoL/C9jJJ6yXtjIiFknYmy0314pdXnRXWnIUCAFVpplQLSW8li+3JIyRdLWlFUt+s6mz1f9/sBglrAKgv1TFw222290g6JmlHROySNDsiBiQpeZ41ymfX2u633T84ONisvgFgyksV4BExHBG9kuZIWmr7A2m/ICI2RUQlIipdXV1j7RMAUCPTWSgRcVzVQyVXSDpqu1uSkudjTe8OADCqNGehdNnuTF53SPq4pBclPSFpTbLaGkmP59UkiuXwPVdmqgPIR5o98G5JT9neK+mnqh4Df1LSPZIut31I0uXJMqaAL27dl6kOIB9pzkLZK2lJnfprki7LoykU26O7fjlq/Ut9iye4G2Dq4kpMZDYc9e9FM1odQD4IcGTW5vpXwo5WB5APAhyZsQcOFAMBDgAlRYADQEkR4ABQUgQ4AJQUAY7MOAsFKAYCHJmtvnRupjqAfDS8EhOodepqy0d3/VLDEWqztfrSuVyFCUwwxwSeu1upVKK/v3/Cvg8AJgPbuyOiUlvnEAoAlBSHUDAmX9y6j0MoQIsR4Mjsi1v36Rs/fvn08nDE6WVCHJg4HEJBZue6nSyAiZNmRp65tp+yfcD2C7ZvTep32j5ie0/yYPr4KYKbWQHFkOYQyjuSPhcRz9q+SNJu2zuS9+6LiK/k1x6KaJqlk3WyehrX8QATKs2MPAOSBpLXb9o+IKkn78ZQXOe/a5qGTpysWwcwcTL9
jbM9X9Xp1XYlpVts77X9sO2LR/nMWtv9tvsHBwfH1SyK4bd1wvtcdQD5SB3gti+U9C1Jn42INyQ9IGmBpF5V99Dvrfe5iNgUEZWIqHR1dTWhZbTaJZ0dmeoA8pEqwG23qxrej0TEFkmKiKMRMRwRJyU9KGlpfm2iSNatXJSpDiAfac5CsaSHJB2IiK+OqHePWO0aSfub3x6K6F+eOpSpDiAfac5CWS7pRkn7bO9Jal+QtNp2r6SQdFjSzbl0iMI5dOw3meoA8pHmLJQfSqp3gti25rcDAEiL874AoKQIcGS2cNYFmeoA8kGAI7Mdt604K6wXzrpAO25b0ZqGgCmKuxFiTAhroPXYAweAkiLAAaCkOIQCADnKc/YqAhwAcpL37FUcQgGAnOQ9exUBDgA5yXv2KgIcAHLS5vrTVI1Wz4oAB4CcrL50bqZ6VvyICQA5OfVDZV5nobAHDgA5euTHL58+5j0coUdGnJUyXgQ4AOTkveu/o9qfKyOpN0OaGXnm2n7K9gHbL9i+NanPsL3D9qHkue6kxgAwVY12rklzzkFJtwf+jqTPRcSfSFom6TO23y9pvaSdEbFQ0s5kGQAwQRoGeEQMRMSzyes3JR2Q1CPpakmbk9U2S+rLq0kAwNkyHQO3PV/SEkm7JM2OiAGpGvKSZjW7OQDA6FIHuO0LJX1L0mcj4o0Mn1tru992/+Dg4Fh6BADUkSrAbberGt6PRMSWpHzUdnfyfrekY/U+GxGbIqISEZWurq5m9AwAULqzUCzpIUkHIuKrI956QtKa5PUaSY83vz0AKK/lC2ZkqmeV5krM5ZJulLTP9p6k9gVJ90h6zPZNkl6WdF1TOgKASeKZl17PVM+qYYBHxA8ljXbnlcua0gUAIDPuhQJMEnnO/IJiIsCBSSDvmV9QTNwLBZgE8p75BcVEgAOTQN4zv6CYCHBgEsh75hcUEwEOTAJ5z/yCsXn3+W2Z6lkR4MAk8KW+xbph2bzTe9xttm5YNo8fMFts9h9Mz1TPirNQgEniS32LCeyCOXTsN5nqWbEHDgAlxR44xuSDd3xPb7w9fHr53ee3ae9dV7SwI2DqYQ8cmdWGtyS98fawPnjH91rUETA1EeDIrDa8G9UB5IMAB4CSIsABoKQIcGQ2va3+1X2j1QHkgwBHZn94Uf2LEEarA8hHminVHrZ9zPb+EbU7bR+xvSd5rMq3TRTJq8eHMtUB5CPNHvjXJdU7wfe+iOhNHtua2xaK7JLOjkx1APloGOAR8QNJzZnADZPCupWL1NF+5s14OtrbtG7lohZ1BExN4zkGfovtvckhlotHW8n2Wtv9tvsHBwfH8XUoir4lPbr72sXq6eyQJfV0dujuaxerb0lPq1sDppSxXkr/gKR/lBTJ872S/qbeihGxSdImSapUKtxdfpLoW9JDYAMtNqY98Ig4GhHDEXFS0oOSlja3LQBAI2MKcNvdIxavkbR/tHUBAPloeAjF9qOSVkiaafsVSXdIWmG7V9VDKIcl3ZxjjwCAOhoGeESsrlN+KIdeAAAZcCUmAJQUAQ4AJUWAA0BJEeAAUFIEOADkZOGsCzLVsyLAASAnO25bcVZYL5x1gXbctqIpfz4BDgA5+sxHF55x36DPfHRh0/7ssd4LBQDQwNbnjmjDln0aOlGd8PvI8SFt2LJPkppyLyH2wAEgJxu3Hzwd3qcMnRjWxu0Hm/LnE+AAkJO8Z68iwAEgJ3nPXkWAA0BORpulqlmzVxHgAJCTu7f9d6Z6VgQ4AOTk6Ju/y1TPigAHgJJqGODJpMXHbO8fUZthe4ftQ8nzqJMaAwDykWYP/OuSrqiprZe0MyIWStqZLAMARph90XmZ6lk1DPCI+IGk12vKV0vanLzeLKmvKd0AwCSy6/bLzwrr2Redp123X96UP3+sl9LPjogBSYqIAduzRlvR9lpJayVp3rx5Y/w6ACinZoV1Pbn/iBkRmyKiEhGVrq6uvL8OAKaMsQb4UdvdkpQ8H2teSwCA
NMYa4E9IWpO8XiPp8ea0AwBIK81phI9K+pGkRbZfsX2TpHskXW77kKTLk2UAwARq+CNmRKwe5a3LmtwLACADR8TEfZk9KOkXY/z4TEm/amI7rcS2FM9k2Q6JbSmq8WzLeyLirLNAJjTAx8N2f0RUWt1HM7AtxTNZtkNiW4oqj23hXigAUFIEOACUVJkCfFOrG2gitqV4Jst2SGxLUTV9W0pzDBwAcKYy7YEDAEYgwAGgpAoX4LavsH3Q9s9sn3WfcVf9c/L+XtsfbkWfaaTYlhW2f217T/L4h1b02Ui9ST1q3i/FmKTYjlKMhyTZnmv7KdsHbL9g+9Y665RlXNJsS+HHxvZ02z+x/XyyHXfVWae5YxIRhXlIapP0kqQ/lnSepOclvb9mnVWSvivJkpZJ2tXqvsexLSskPdnqXlNsy59J+rCk/aO8X5YxabQdpRiPpNduSR9OXl8k6X9K/HclzbYUfmyS/84XJq/bJe2StCzPMSnaHvhSST+LiJ9HxO8kfVPVySNGulrSv0XVjyV1nrozYsGk2ZZSiPqTeoxUijFJsR2lEREDEfFs8vpNSQck9dSsVpZxSbMthZf8d34rWWxPHrVniTR1TIoW4D2Sfjli+RWdPZBp1imCtH1+JPlfru/a/tOJaa3pyjImaZRuPGzPl7RE1T2+kUo3LufYFqkEY2O7zfYeVW+xvSMich2Tsc7IkxfXqdX+C5ZmnSJI0+ezqt7j4C3bqyRtlbQw986aryxj0kjpxsP2hZK+JemzEfFG7dt1PlLYcWmwLaUYm4gYltRru1PSt21/ICJG/ubS1DEp2h74K5LmjlieI+nVMaxTBA37jIg3Tv0vV0Rsk9Rue+bEtdg0ZRmTcyrbeNhuVzXwHomILXVWKc24NNqWso1NRByX9LTOnhC+qWNStAD/qaSFtt9r+zxJ16s6ecRIT0j6q+TX3GWSfh3J/JwF03BbbP+RbSevl6o6Hq9NeKfjV5YxOacyjUfS50OSDkTEV0dZrRTjkmZbyjA2truSPW/Z7pD0cUkv1qzW1DEp1CGUiHjH9i2Stqt6FsfDEfGC7b9N3v9XSdtU/SX3Z5L+T9Jft6rfc0m5LX8h6e9svyNpSNL1kfxUXSSuTuqxQtJM269IukPVH2hKNSYptqMU45FYLulGSfuSY66S9AVJ86RyjYvSbUsZxqZb0mbbbar+A/NYRDyZZ35xKT0AlFTRDqEAAFIiwAGgpAhwACgpAhwASooAB4CSIsABoKQIcAAoqf8HQKptRCdCEaMAAAAASUVORK5CYII=\n" + }, + "metadata": { + "needs_background": "light" + } + } + ], + "source": [ + "import matplotlib.pyplot as plt\n", + "plt.scatter('Variety','Price',data=new_pumpkins)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "-0.8634790400214403\n" + ] + } + ], + "source": [ + "print(new_pumpkins['Variety'].corr(new_pumpkins['Price']))" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\nInt64Index: 415 entries, 70 to 1742\nData columns (total 7 columns):\n # Column Non-Null Count 
Dtype \n--- ------ -------------- ----- \n 0 Month 415 non-null int64 \n 1 Variety 415 non-null int64 \n 2 City 415 non-null int64 \n 3 Package 415 non-null int64 \n 4 Low Price 415 non-null int64 \n 5 High Price 415 non-null int64 \n 6 Price 415 non-null float64\ndtypes: float64(1), int64(6)\nmemory usage: 25.9 KB\n" + ] + } + ], + "source": [ + "\n", + "new_pumpkins.dropna(inplace=True)\n", + "new_pumpkins.info()\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " Variety Price\n", + "70 3 13.636364\n", + "71 3 16.363636\n", + "72 3 16.363636\n", + "73 3 15.454545\n", + "74 3 13.636364\n", + "... ... ...\n", + "1738 1 30.000000\n", + "1739 1 28.750000\n", + "1740 1 25.750000\n", + "1741 1 24.000000\n", + "1742 1 24.000000\n", + "\n", + "[415 rows x 2 columns]" + ], + "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
VarietyPrice
70313.636364
71316.363636
72316.363636
73315.454545
74313.636364
.........
1738130.000000
1739128.750000
1740125.750000
1741124.000000
1742124.000000
\n

415 rows × 2 columns

\n
" + }, + "metadata": {}, + "execution_count": 23 + } + ], + "source": [ + "# create new dataframe \n", + "new_columns = ['Variety', 'Price']\n", + "ml_pumpkins = new_pumpkins.drop([c for c in new_pumpkins.columns if c not in new_columns], axis='columns')\n", + "\n", + "ml_pumpkins\n" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "X = ml_pumpkins.values[:, :1]\n", + "y = ml_pumpkins.values[:, 1:2]\n" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model Accuracy: 0.7327987875929955\nCoefficients: [[-8.54296764]]\nMean squared error: 23.443815358076087\nCoefficient of determination: 0.7802537224707632\n" + ] + } + ], + "source": [ + "from sklearn.linear_model import LinearRegression\n", + "from sklearn.metrics import r2_score, mean_squared_error\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "\n", + "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", + "lin_reg = LinearRegression()\n", + "lin_reg.fit(X_train,y_train)\n", + "\n", + "pred = lin_reg.predict(X_test)\n", + "\n", + "accuracy_score = lin_reg.score(X_train,y_train)\n", + "print('Model Accuracy: ', accuracy_score)\n", + "\n", + "# The coefficients\n", + "print('Coefficients: ', lin_reg.coef_)\n", + "# The mean squared error\n", + "print('Mean squared error: ',\n", + " mean_squared_error(y_test, pred))\n", + "# The coefficient of determination: 1 is perfect prediction\n", + "print('Coefficient of determination: ',\n", + " r2_score(y_test, pred)) " + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": "
", + "image/svg+xml": "\n\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n", + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWAAAADrCAYAAABXYUzjAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAANg0lEQVR4nO3dMYgcZR/H8f/s7gUyFm8uxhcSdGdsYiFYhLUQu9yphaCgIsKCdpOwnWg3EBCZQmxsvNOxiBZTiAqCVnKHjaSQO4JNJFEkuyGv8r5JPJGs8S578xZ6685mN3fP3PPss7Pz/cAW92effR4sfj75zzwzTpqmAgCYvIrtBQBAWRHAAGAJAQwAlhDAAGAJAQwAlhDAAGBJTeXLR44cSX3fN7QUAJhN6+vr19I0vW+4rhTAvu/L2tqavlUBQAk4jtMeVacFAQCWEMAAYAkBDACWEMAAYAkBDACWGA/gJEnE932pVCri+74kSWJ6SgAoBKXb0FQlSSJBEEi32xURkXa7LUEQiIhIs9k0OTUATD2jO+AwDPvhu6Pb7UoYhianBYBCMBrAnU5HqQ4AZWI0gOv1ulIdAMrEaABHUSSu62ZqrutKFEUmpwWAQjAawM1mU+I4Fs/zxHEc8TxP4jjmAhwAiIij8lLORqOR8jAeAFDjOM56mqaN4ToHMQDAEgIYACwhgAHAEgIYACwhgAHAEgIYACwhgAHAEgIYACwhgAHAEgIYACwhgAHAEgIYACwhgAHAEgIYACwhgAHAEgIYACwxHsBJkojv+1KpVMT3fUmSxPSUAFAINZM/niSJBEHQfzV9u92WIAhERHgtEYDSM7oDDsOwH747ut2uhGFocloAKASjAdzpdJTqAFAmRgP48OHDSnUAKBPuggAAS4wG8I0bN5TqAFAmRgO4Xq8r1QGgTIwGcBRF4rpupua6rkRRZHJaACgEowHcbDYljmPxPE8cxxHP8ySOY+4BBgARcdI03fOXG41Gura2ZnA5ADB7HMdZT9O0MVznLggAsIQABgBLCGAAsIQABgBLCGAAsIQABgBLCGAAsIQABgBLCGAAsIQABgBLCGAAsMR4AC8uLorjOP3P4uKi6SkBoBCMBvDi4qKsrq5maqurq4QwAIjhAB4O393qAFAm9IABwBICGAAsMRrACwsLSnUAKBOjAbyysnJH2C4sLMjKyorJaQGgEGqmJyBsAWA0esAAYAkBDACWGA/gVqsltVpNHMeRWq0mrVbL9JQAUAhGe8CtVkuWl5f7f/d6vf7fS0tLJqcGgKnnpGm65y83Go10bW1tz9+v1WrS6/XuqFerVbl9+/aefwcAisxxnPU0TRvDdaMtiFHhe7c6AJSJ0QCuVqtKdQAoE6MBHASBUh0AysToRbidC21xHEuv15NqtSpBEHABDgDE8EU4AICli3AAgPGMB3CSJOL7vlQqFfF9X5IkMT0lABSC0R5wkiQSBIF0u10REWm32/0LcM1m0+TUADD1jO6AwzDsh++ObrcrYRianBYACsFoAHc6HaU6AJSJ0QCu1+tKdQAoE6MBHEWRu
K6bqbmuK1EUmZwWAArBaAA3m02J41g8zxPHccTzPInjmAtwACAcxAAA4ziIAQBThgAGAEuMB3C1WhXHcfofHkUJAH8x/jzg7e3tTG17e5sQBgAxHMDD4btbHQDKhB4wAFhCAAOAJUYDuFIZ/fPj6gBQJsbfijwctpVKhbciA4AYfh6wCK+gB4Bx6AUAgCUEMABYYjyA5+fnMyfh5ufnTU8JAIVgNIDn5+dlY2MjU9vY2CCEAUAMB/Bw+O5WB4AymUAP+KiIpAMfAIDIBG5DE/nP0N9/hbDj/P0XmQygpIzugA8dOrTrdxznn88HH5hcDQBMF6MBfPPmTRH5156/HwTZQAaAWWa0BbG1tSUiWyKyk6Zq/YbhEKZdAWCWTPgghtP/5AnTwd3xxx/rXhsATJbVk3Bp+s/np5/Uxr70Eu0KAMVmNIDn5ub2XH/wwWwgqxoMYwIZQBEYDeDNzc07wnZubk42Nzd3HTsYxvsN5JUV9fEAYJrxFsTm5qakadr/7CV8RxkM4+++Uxv7xBPsjgFMnwkcxNDvkUeyu2LVUOXuCgDTYCYeR6mzXbG+rn99ADDKTATwsMEwPndObWyjQbsCwGQYD+AkScT3falUKuL7viRJYnrKjMce4+4KANPJaAAnSSJBEEi73ZY0TaXdbksQBBMP4UE62xU//KB/fQDKw2gAh2Eo3W43U+t2uxKGoclplQyG8Zdfqo09fpzdMYD8jN4F0el0lOq2Pf00d1cAmByjO+B0TAKNq08bne2Kn3/Wvz4AxTaTd0GYMhjGH32kNvbYMdoVALII4Jxefpm7KwDsDwGsic52Be8sBcphap6GNmsGw/jtt9XGzs+zOwbKYGqfhjZLXn+ddgWAOxlvQZw9e1Y8zxPHccTzPDl79qzpKaeeznbFrVv61wdgMkp3Em4aDYbxa6+pjT14kN0xUFSOyj25jUYjXVtb2/P3fd+Xdrt9R93zPLl8+fKef6fM9hOqhw6J/PqrvrUAyMdxnPU0TRvDdaM74KKdhJtG+2lXbGxkd8dbW2bWCCAfowF8zz33KNWxu8EwfuUVtbEHDtCuAKaJ0QC+efOmUh1qPvxQ38W8Eye0Lw/ALow+jKfoz4IomsH/rGkqUlH43+v589ld8fY2u2TANKM74Gq1qlSHPo6T3R2fPKk2vlKhXQGYZjSAgyBQqsOc1VV97YpnntG/PqCMjLYglpaWREQkjmPp9XpSrVYlCIJ+HfYMhvD2tojKP0q++CK7K6ajBORj9D5gFNNDD4lcupR/PIEMZFm5DxjFdPGivnZFq6V/fcCsIICxq8Ew/vNPtbHLy1zMA8YhgKHkwIFsINcUryLwZDfgH8YDOEkS8X1fKpWK+L7Pg3hmzNaWvnbFm2/qXx8wzXgaGrQaDOPff1cbe+YMu2OUC09Dw8TsN1S5uwJFxdPQYJ3OB9HHsf71AZNmNIDr9bpSHeUyGMbXr6uNPXWKdgWKz2gAR1Ekrutmaq7rShRFJqdFAR0+zHvzUD5GA7jZbEocx5l3wsVxLM1m0+S0mAE62xWffaZ/fYAOHEVG4Vy5IrKfLhYX8zBp4y7CGX0YD2DCAw9kQ1S15TD8fQIZtnAQA4Wns13x9df61weMw0EMzJzBMP7+e7WxJ09yMQ+Tw0EMlAqHQWADBzEA0duuOH9e//pQLhzEQKkNhvG336qNPXGCdgX2h4MYwN8efZTDIJgsDmIAY+hsV/z4o/71ofg4iAHk8NVXIk89lX88F/PKhYMYgEZPPslhEOwfryQCNNDZrvjlF/3rw3QigAEDBsNY9WFAR49yMa8saEEAhj33HO0KjMYOGJgwne2K337Tvz5MDgEMWDYYxu+9pzb20CHaFUVGAANT5NQpfbvjalX/+qAXAQxMsf20K7a3s4F865aZNSI/AhgokMEwfuMNtbEHD9KumDYEMFBQZ87oa1fcf7/+9c0C0y+U4DY0YEYMh7DKLvfq1ez3t7ZEaiVPh50XSnS7XRGR/gslR
ETb82zYAQMzanB33GqpjZ2bo10RhmE/fHd0u10Jw1DbHAQwUALvvquvXfH44/rXN41Gvc3nbvU8Sv6PDKCcBkM4TUUqCluxc+eyu+Kduy2gjh0wUHKOk90dP/us2vhKhXZFXgQwgIzPP9fXrnjxRf3rm5TqmJMs4+p5EMAA7mowjHs9tbGffFLc3fHOHQ97redBAAPYs0olG8iqF+SK9N68S5cuKdXzIIAB5PbNN/raFa++qn99+7G6uqpUz4MABqDNYBhvbamNfeed4uyOdSGAUWimj4oiv1otG8iqx52L1K7IiwBGYe0cFW2325Kmaf+oKCE8na5c0deueOst/esbtrCwoFTPg9fSo7B83x95KsnzPLl8+fLkF4Tc/vhDxHXzjzf1mqbFxcVMz3dhYUFWVlaUf4fX0mPmdDodpTqm18GD0/nevDxhq4IWBAqrXq8r1VEcOt+b9/77+tenCwGMwoqiSNyhf7e6ritRFFlaEUwZDOMbN9TGnj49vRfzCGAUVrPZlDiOxfM8cRxHPM+TOI61PasV02l+Xt/u2HYgcxEOwEzZT6h++qnI88/rW8uOcRfh2AEDmCmDu+OrV9XGvvDCZHfHBDCAmXXsmM52xb/l4Ycf1ro+AhhAaezv7or/yoULV7WGMAEMoLQGw/jixb2M+E0uXLigbX7jAcxZfQBFcPz4/toVeRg9CTeJ1zoDgBnDV+H071eN7oAn8VpnAJiMbe2/aDSAOasPAOMZDWDO6gPAeEYDmLP6ADCe0QDmrD4AjMezIABghCNHjsj169fvqN97771y7do1pd/iWRAAMGUIYAAYYdTu9271PAhgALCEAAYASwhgALCEAAYASwhgALCEAAaAEZwx7yQaV8+DAAaAEU6fPq1Uz8Po84ABoKiWlpZERCSOY+n1elKtViUIgn5dB44iA4BhHEUGgClDAAOAJQQwAFhCAAOAJQQwAIzRarWkVquJ4zhSq9Wk1Wpp/X1uQwOAEVqtliwvL/f/7vV6/b913YrGbWgAMEKtVpNer3dHvVqtyu3bt5V+i9vQAEDBqPC9Wz0PAhgARqhWq0r1PAhgABghCAKleh5chAOAEXgWBADMAC7CAcCUIYABwBICGAAsIYABwBICGAAsUboLwnGc/4lI29xyAGAmeWma3jdcVApgAIA+tCAAwBICGAAsIYABwBICGAAsIYABwBICGAAsIYABwBICGAAsIYABwJL/A9LCp6YpkCnOAAAAAElFTkSuQmCC\n" + }, + "metadata": {} + } + ], + "source": [ + "\n", + "plt.scatter(X_test, y_test, color='black')\n", + "plt.plot(X_test, pred, color='blue', linewidth=3)\n", + "\n", + "plt.xticks(())\n", + "plt.yticks(())\n", + "\n", + "plt.show()\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ] +} \ No newline at end of file