You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/hk/2-Regression/3-Linear/notebook.ipynb

128 lines
3.8 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 南瓜定價\n",
"\n",
"載入所需的庫和數據集。將數據轉換為包含部分數據的數據框:\n",
"\n",
"- 只選取按蒲式耳定價的南瓜\n",
"- 將日期轉換為月份\n",
"- 計算價格為高價和低價的平均值\n",
"- 將價格轉換為按蒲式耳數量定價\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from datetime import datetime\n",
"\n",
"pumpkins = pd.read_csv('../data/US-pumpkins.csv')\n",
"\n",
"pumpkins.head()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]\n",
"\n",
"columns_to_select = ['Package', 'Variety', 'City Name', 'Low Price', 'High Price', 'Date']\n",
"pumpkins = pumpkins.loc[:, columns_to_select]\n",
"\n",
"price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2\n",
"\n",
"month = pd.DatetimeIndex(pumpkins['Date']).month\n",
"day_of_year = pd.to_datetime(pumpkins['Date']).apply(lambda dt: (dt-datetime(dt.year,1,1)).days)\n",
"\n",
"new_pumpkins = pd.DataFrame(\n",
" {'Month': month, \n",
" 'DayOfYear' : day_of_year, \n",
" 'Variety': pumpkins['Variety'], \n",
" 'City': pumpkins['City Name'], \n",
" 'Package': pumpkins['Package'], \n",
" 'Low Price': pumpkins['Low Price'],\n",
" 'High Price': pumpkins['High Price'], \n",
" 'Price': price})\n",
"\n",
"new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/1.1\n",
"new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price*2\n",
"\n",
"new_pumpkins.head()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"基本散佈圖提醒我們,我們只有八月至十二月的月份數據。我們可能需要更多數據才能以線性方式得出結論。\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.scatter('Month','Price',data=new_pumpkins)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"plt.scatter('DayOfYear','Price',data=new_pumpkins)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n此文件已使用人工智能翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議使用專業的人類翻譯。我們對因使用此翻譯而引起的任何誤解或錯誤解讀概不負責。\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3-final"
},
"orig_nbformat": 2,
"coopTranslator": {
"original_hash": "b032d371c75279373507f003439a577e",
"translation_date": "2025-09-03T19:16:33+00:00",
"source_file": "2-Regression/3-Linear/notebook.ipynb",
"language_code": "hk"
}
},
"nbformat": 4,
"nbformat_minor": 2
}