From 5c41dc39c518351443c24ae32e185e2f69ba67dd Mon Sep 17 00:00:00 2001
From: Jim Bennett
Date: Tue, 14 Feb 2023 17:39:48 -0800
Subject: [PATCH 1/3] Adding the call to the `corr` function

The README is missing the call to the `corr` function that is in the final notebook.
---
 2-Regression/3-Linear/README.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md
index 63596de4..8601abde 100644
--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -103,7 +103,14 @@ This suggests that there should be some correlation, and we can try training lin

 Scatter plot of Price vs. Day of Year

-It looks like there are different clusters of prices corresponding to different pumpkin varieties. To confirm this hypothesis, let's plot each pumpkin category using a different color. By passing an `ax` parameter to the `scatter` plotting function we can plot all points on the same graph:
+Let's see if there is a correlation using the `corr` function:
+
+```python
+print(new_pumpkins['Month'].corr(new_pumpkins['Price']))
+print(new_pumpkins['DayOfYear'].corr(new_pumpkins['Price']))
+```
+
+It looks like the correlation is pretty small, -0.15 by `Month` and -0.17 by `DayOfYear`, but there could be another important relationship. It looks like there are different clusters of prices corresponding to different pumpkin varieties. To confirm this hypothesis, let's plot each pumpkin category using a different color.
By passing an `ax` parameter to the `scatter` plotting function we can plot all points on the same graph:

 ```python
 ax=None
From 44e442039a30b802392a4aa146efa089d745a9e9 Mon Sep 17 00:00:00 2001
From: Jim Bennett
Date: Tue, 14 Feb 2023 18:03:34 -0800
Subject: [PATCH 2/3] Adding bar chart by price

---
 2-Regression/3-Linear/README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md
index 8601abde..7012eb62 100644
--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -122,7 +122,15 @@ for i,var in enumerate(new_pumpkins['Variety'].unique()):

 Scatter plot of Price vs. Day of Year

-Our investigation suggests that variety has more effect on the overall price than the actual selling date. So let us focus for the moment only on one pumpkin variety, and see what effect the date has on the price:
+Our investigation suggests that variety has more effect on the overall price than the actual selling date. We can see this with a bar graph:
+
+```python
+new_pumpkins.groupby('Variety')['Price'].mean().plot(kind='bar')
+```
+
+Scatter plot of Price vs.
Day of Year
+
+Let us focus for the moment only on one pumpkin variety, the 'pie type', and see what effect the date has on the price:

 ```python
 pie_pumpkins = new_pumpkins[new_pumpkins['Variety']=='PIE TYPE']
From 5dcaf60052259aedc513b7e14e71963f5a96c078 Mon Sep 17 00:00:00 2001
From: Jim Bennett
Date: Tue, 14 Feb 2023 18:05:31 -0800
Subject: [PATCH 3/3] Update README.md

---
 2-Regression/3-Linear/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md
index 7012eb62..d213a7b4 100644
--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -128,7 +128,7 @@ Our investigation suggests that variety has more effect on the overall price tha
 new_pumpkins.groupby('Variety')['Price'].mean().plot(kind='bar')
 ```

-Scatter plot of Price vs. Day of Year
+Bar graph of price vs variety

 Let us focus for the moment only on one pumpkin variety, the 'pie type', and see what effect the date has on the price:
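
Review note: the two pandas calls this series adds to the README (`Series.corr` and `groupby(...).mean()`) can be tried without the lesson's dataset. The sketch below is only an illustration under stated assumptions: `toy_pumpkins` is a hypothetical stand-in for `new_pumpkins`, its column names mirror the ones used in the patch, and its values are invented, so the printed correlation will not match the -0.15 and -0.17 quoted above.

```python
import pandas as pd

# Hypothetical stand-in for new_pumpkins; the values are invented for illustration.
toy_pumpkins = pd.DataFrame({
    'Variety': ['PIE TYPE', 'PIE TYPE', 'MINIATURE', 'MINIATURE', 'FAIRYTALE', 'FAIRYTALE'],
    'DayOfYear': [250, 300, 260, 310, 270, 320],
    'Price': [6.0, 5.0, 16.0, 15.0, 30.0, 28.0],
})

# Series.corr computes the Pearson correlation coefficient by default.
day_price_corr = toy_pumpkins['DayOfYear'].corr(toy_pumpkins['Price'])
print(day_price_corr)

# Mean price per variety: the same aggregation the bar chart in patch 2 plots.
mean_price_by_variety = toy_pumpkins.groupby('Variety')['Price'].mean()
print(mean_price_by_variety)
```

In this toy data the per-variety means are far apart while prices within a variety barely move with the date, which is the pattern the README text describes. Appending `.plot(kind='bar')` to the grouped means, as the patch does, additionally requires matplotlib.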