-It looks like the correlation is pretty small, -0.15 by `Month` and -0.17 by the `DayOfMonth`, but there could be another important relationship. It looks like there are different clusters of prices corresponding to different pumpkin varieties. To confirm this hypothesis, let's plot each pumpkin category using a different color. By passing an `ax` parameter to the `scatter` plotting function we can plot all points on the same graph:
+It looks like the correlation is pretty small, -0.15 by `Month` and -0.17 by the `DayOfYear`, but there could be another important relationship. It looks like there are different clusters of prices corresponding to different pumpkin varieties. To confirm this hypothesis, let's plot each pumpkin category using a different color. By passing an `ax` parameter to the `scatter` plotting function we can plot all points on the same graph:
 ```python
 ax=None
@@ -262,7 +262,7 @@ Pipelines can be used in the same manner as the original `LinearRegression` obje
-Using Polynomial Regression, we can get slightly lower MSE and higher determination, but not significantly. We need to take into account other features!
+Using Polynomial Regression, we can get slightly lower RMSE and higher determination, but not significantly. We need to take into account other features!
 > You can see that the minimal pumpkin prices are observed somewhere around Halloween. How can you explain this?
@@ -319,7 +319,7 @@ X = pd.get_dummies(new_pumpkins['Variety']) \
 y = new_pumpkins['Price']
 ```
-Here we also take into account `City` and `Package` type, which gives us MSE 2.84 (10%), and determination 0.94!
+Here we also take into account `City` and `Package` type, which gives us RMSE 2.84 (10%), and determination 0.94!
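
For context on the MSE to RMSE rename, here is a minimal sketch in plain Python of how the two metrics differ. The prices and predictions are made-up illustration values, not taken from the pumpkin dataset, and the helper names `mse`, `rmse`, and `relative_error` are hypothetical. It suggests why an error quoted in price units, and as a percentage of the mean price, is more naturally an RMSE than an MSE.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average squared residual (units: price squared)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE (units: price)."""
    return math.sqrt(mse(y_true, y_pred))


def relative_error(y_true, y_pred):
    """RMSE expressed as a fraction of the mean true price."""
    mean_price = sum(y_true) / len(y_true)
    return rmse(y_true, y_pred) / mean_price


# Made-up values for illustration only (assumption, not lesson data).
y_true = [20.0, 30.0, 25.0, 25.0]
y_pred = [22.0, 28.0, 27.0, 23.0]

print(f"MSE:  {mse(y_true, y_pred):.2f}")             # MSE:  4.00
print(f"RMSE: {rmse(y_true, y_pred):.2f}")            # RMSE: 2.00
print(f"RMSE / mean price: {relative_error(y_true, y_pred):.1%}")  # 8.0%
```

Because RMSE is in the same units as `Price`, dividing it by the mean price gives a meaningful percentage. The same ratio computed from MSE would mix squared and unsquared units.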