{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Pumpkin Pricing\n", "\n", "Load the required libraries and the dataset. Convert the data to a dataframe containing a subset of the data:\n", "\n", "- Keep only pumpkins priced by the bushel\n", "- Convert the date to a month\n", "- Calculate the price as the average of the high and low prices\n", "- Convert the price to reflect pricing per bushel\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from datetime import datetime\n", "\n", "pumpkins = pd.read_csv('../data/US-pumpkins.csv')\n", "\n", "pumpkins.head()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Keep only pumpkins sold by the bushel\n", "pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]\n", "\n", "columns_to_select = ['Package', 'Variety', 'City Name', 'Low Price', 'High Price', 'Date']\n", "pumpkins = pumpkins.loc[:, columns_to_select]\n", "\n", "# Price is the average of the low and high prices\n", "price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2\n", "\n", "month = pd.DatetimeIndex(pumpkins['Date']).month\n", "# Days elapsed since January 1 of the same year (0-based)\n", "day_of_year = pd.to_datetime(pumpkins['Date']).apply(lambda dt: (dt-datetime(dt.year,1,1)).days)\n", "\n", "new_pumpkins = pd.DataFrame(\n", "    {'Month': month, \n", "     'DayOfYear' : day_of_year, \n", "     'Variety': pumpkins['Variety'], \n", "     'City': pumpkins['City Name'], \n", "     'Package': pumpkins['Package'], \n", "     'Low Price': pumpkins['Low Price'],\n", "     'High Price': pumpkins['High Price'], \n", "     'Price': price})\n", "\n", "# Normalize prices to one full bushel: a '1 1/9 bushel' package holds 10/9 bushel, a '1/2 bushel' package holds half\n", "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/(1 + 1/9)\n", "new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2\n", "\n", "new_pumpkins.head()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A basic scatterplot reminds us that we only have month data from August through December. We probably need more data before we can draw conclusions from a linear fit.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "plt.scatter('Month','Price',data=new_pumpkins)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "\n", "plt.scatter('DayOfYear','Price',data=new_pumpkins)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**免責聲明**: \n本文件已使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。雖然我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3-final" }, "orig_nbformat": 2, "coopTranslator": { "original_hash": "b032d371c75279373507f003439a577e", "translation_date": "2025-08-29T22:45:22+00:00", "source_file": "2-Regression/3-Linear/notebook.ipynb", "language_code": "mo" } }, "nbformat": 4, "nbformat_minor": 2 }