## Pumpkin Pricing

Load the necessary libraries and dataset. Transform the data into a dataframe containing a subset of the information:

- Include only pumpkins priced by the bushel
- Convert the date to a month number (and a day of the year)
- Compute the price as the average of the high and low prices
- Normalize the price so it reflects the cost per bushel, regardless of package size
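
The last two steps in the list come down to simple arithmetic. Here is a minimal sketch on a few made-up rows, assuming the package size can be read from the `Package` text (`1 1/9` bushels, `1/2` bushel, otherwise one bushel). The `sample` frame, the `size` series, and the `PricePerBushel` column are hypothetical names used only for illustration; they are not part of the dataset.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from US-pumpkins.csv.
sample = pd.DataFrame({
    'Package': ['1 1/9 bushel cartons', '1/2 bushel cartons', 'bushel cartons'],
    'Low Price': [15.0, 9.0, 20.0],
    'High Price': [25.0, 11.0, 24.0],
})

# Average of the low and high price for each row.
sample['Price'] = (sample['Low Price'] + sample['High Price']) / 2

# Package size in bushels; anything unrecognized is assumed to be one bushel.
size = pd.Series(1.0, index=sample.index)
size.loc[sample['Package'].str.contains('1 1/9', regex=False)] = 1 + 1/9
size.loc[sample['Package'].str.contains('1/2', regex=False)] = 1/2

# Dividing by the package size gives a comparable price per bushel.
sample['PricePerBushel'] = sample['Price'] / size
print(sample[['Package', 'Price', 'PricePerBushel']])
```

Dividing by the package size makes a half-bushel carton and a 1 1/9 bushel carton directly comparable, which is the point of the final step.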


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime

pumpkins = pd.read_csv('../data/US-pumpkins.csv')

pumpkins.head()


In [None]:
pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]

columns_to_select = ['Package', 'Variety', 'City Name', 'Low Price', 'High Price', 'Date']
pumpkins = pumpkins.loc[:, columns_to_select]

price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2

month = pd.DatetimeIndex(pumpkins['Date']).month
day_of_year = pd.to_datetime(pumpkins['Date']).apply(lambda dt: (dt-datetime(dt.year,1,1)).days)

new_pumpkins = pd.DataFrame(
 {'Month': month, 
 'DayOfYear' : day_of_year, 
 'Variety': pumpkins['Variety'], 
 'City': pumpkins['City Name'], 
 'Package': pumpkins['Package'], 
 'Low Price': pumpkins['Low Price'],
 'High Price': pumpkins['High Price'], 
 'Price': price})

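# Normalize to a price per bushel: a 1 1/9 bushel package holds 1 + 1/9 bushels,
# and a 1/2 bushel package holds half a bushel.
new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/(1 + 1/9)
new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2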

new_pumpkins.head()
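
The `DayOfYear` column is built by subtracting January 1 of the same year from each date, so it counts days elapsed and starts at 0. As a rough sketch on made-up dates, the same values can likely be obtained without a lambda through the pandas `.dt.dayofyear` accessor, which counts from 1. The `dates`, `manual`, and `vectorized` names are hypothetical.

```python
import pandas as pd
from datetime import datetime

# Hypothetical dates for illustration only.
dates = pd.to_datetime(pd.Series(['2017-01-01', '2016-09-24', '2016-12-31']))

# Same computation as in the cell above: days elapsed since January 1.
manual = dates.apply(lambda dt: (dt - datetime(dt.year, 1, 1)).days)

# Vectorized alternative: dayofyear counts from 1, so subtract 1 to match.
vectorized = dates.dt.dayofyear - 1

print(manual.tolist())
print(vectorized.tolist())
```

Either form works as a feature. The main thing to keep in mind is the off-by-one difference between the two conventions.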


A basic scatterplot reminds us that the data only covers the months from August through December. We probably need more data before we can draw conclusions from a linear relationship between date and price.


In [None]:
# matplotlib.pyplot was already imported as plt in the first cell
plt.scatter('Month','Price',data=new_pumpkins)

In [None]:

plt.scatter('DayOfYear','Price',data=new_pumpkins)
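
To put numbers behind what the scatterplots suggest, one option is to count the rows per month and compute the Pearson correlation between price and each time feature. This is a minimal sketch on a made-up frame: `toy` and its values are hypothetical stand-ins for `new_pumpkins`, so the printed numbers say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for new_pumpkins; the values are made up.
toy = pd.DataFrame({
    'Month': [8, 9, 9, 10, 10, 11, 12],
    'DayOfYear': [230, 250, 267, 280, 300, 320, 345],
    'Price': [15.0, 18.0, 17.0, 20.0, 16.0, 14.0, 13.0],
})

# Which months actually appear, and how many rows each one has.
month_counts = toy['Month'].value_counts().sort_index()
print(month_counts)

# Pearson correlation of price with each time feature (between -1 and 1).
month_corr = toy['Month'].corr(toy['Price'])
day_corr = toy['DayOfYear'].corr(toy['Price'])
print(month_corr, day_corr)
```

On the real data, a correlation close to zero would be consistent with the caution above that a single straight line may not describe the price trend well.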


---

**Disclaimer**: 
This document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we aim for accuracy, please note that automated translations may include errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is advised. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.
