You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/mo/2-Regression/3-Linear/notebook.ipynb

128 lines
3.8 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 南瓜定價\n",
"\n",
"載入所需的函式庫和數據集。將數據轉換為包含部分數據的資料框:\n",
"\n",
"- 僅選取以蒲式耳定價的南瓜\n",
"- 將日期轉換為月份\n",
"- 計算價格為高價和低價的平均值\n",
"- 將價格轉換為反映每蒲式耳數量的定價\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from datetime import datetime\n",
"\n",
"pumpkins = pd.read_csv('../data/US-pumpkins.csv')\n",
"\n",
"pumpkins.head()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]\n",
"\n",
"columns_to_select = ['Package', 'Variety', 'City Name', 'Low Price', 'High Price', 'Date']\n",
"pumpkins = pumpkins.loc[:, columns_to_select]\n",
"\n",
"price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2\n",
"\n",
"month = pd.DatetimeIndex(pumpkins['Date']).month\n",
"day_of_year = pd.to_datetime(pumpkins['Date']).apply(lambda dt: (dt-datetime(dt.year,1,1)).days)\n",
"\n",
"new_pumpkins = pd.DataFrame(\n",
" {'Month': month, \n",
" 'DayOfYear' : day_of_year, \n",
" 'Variety': pumpkins['Variety'], \n",
" 'City': pumpkins['City Name'], \n",
" 'Package': pumpkins['Package'], \n",
" 'Low Price': pumpkins['Low Price'],\n",
" 'High Price': pumpkins['High Price'], \n",
" 'Price': price})\n",
"\n",
"new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/1.1\n",
"new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2\n",
"\n",
"new_pumpkins.head()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"基本散佈圖提醒我們,我們只有從八月到十二月的月份數據。我們可能需要更多數據才能以線性方式得出結論。\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.scatter('Month','Price',data=new_pumpkins)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"plt.scatter('DayOfYear','Price',data=new_pumpkins)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n本文件已使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。雖然我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3-final"
},
"orig_nbformat": 2,
"coopTranslator": {
"original_hash": "b032d371c75279373507f003439a577e",
"translation_date": "2025-08-29T22:45:22+00:00",
"source_file": "2-Regression/3-Linear/notebook.ipynb",
"language_code": "mo"
}
},
"nbformat": 4,
"nbformat_minor": 2
}