Merge pull request #643 from jimbobbennett/patch-3

Fixing README to match solution notebook
pull/649/head
Carlotta Castelluccio 2 years ago committed by GitHub
commit 65f0066f75
GPG Key ID: 4AEE18F83AFDEB23

@@ -103,7 +103,14 @@ This suggests that there should be some correlation, and we can try training lin
<img alt="Scatter plot of Price vs. Day of Year" src="images/scatter-dayofyear.png" width="50%" />
Let's see if there is a correlation using the `corr` function:
```python
print(new_pumpkins['Month'].corr(new_pumpkins['Price']))
print(new_pumpkins['DayOfYear'].corr(new_pumpkins['Price']))
```
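As a side illustration of what `corr` reports, here is a minimal sketch on made-up numbers (not the pumpkin data; the column names `x`, `y_linear` and `y_cluster` are hypothetical). By default, pandas `Series.corr` computes the Pearson coefficient, which only measures linear association, so a value near zero does not rule out other structure such as distinct price clusters:

```python
import pandas as pd

# Made-up illustration data (NOT the pumpkin dataset).
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y_linear": [10, 8, 6, 4, 2],    # falls by exactly 2 for each step of x
    "y_cluster": [5, 20, 5, 20, 5],  # two price levels that ignore x
})

# Series.corr defaults to method="pearson"; results lie in [-1, 1].
r_linear = df["x"].corr(df["y_linear"])    # perfectly linear, so close to -1
r_cluster = df["x"].corr(df["y_cluster"])  # no linear trend, so close to 0

print(round(r_linear, 2))
print(round(r_cluster, 2))
```

The second column clearly has structure (two levels), yet its coefficient is near zero. For the pumpkin data itself: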
The correlation looks fairly weak, -0.15 for `Month` and -0.17 for `DayOfYear`, but there could be another important relationship. It looks like there are different clusters of prices corresponding to different pumpkin varieties. To confirm this hypothesis, let's plot each pumpkin category using a different color. By passing an `ax` parameter to the `scatter` plotting function we can plot all points on the same graph:
```python
ax=None
@@ -115,7 +122,15 @@ for i,var in enumerate(new_pumpkins['Variety'].unique()):
<img alt="Scatter plot of Price vs. Day of Year" src="images/scatter-dayofyear-color.png" width="50%" />
Our investigation suggests that variety has more effect on the overall price than the actual selling date. We can see this with a bar graph:
```python
new_pumpkins.groupby('Variety')['Price'].mean().plot(kind='bar')
```
<img alt="Bar graph of price vs variety" src="images/price-by-variety.png" width="50%" />
Let us focus for the moment only on one pumpkin variety, the 'pie type', and see what effect the date has on the price:
```python
pie_pumpkins = new_pumpkins[new_pumpkins['Variety']=='PIE TYPE']
