You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Data-Science-For-Beginners/translations/mo/1-Introduction/04-stats-and-probability/assignment.ipynb

260 lines
7.0 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"cells": [
{
"cell_type": "markdown",
"source": [
"## 概論:機率與統計 \n",
"## 作業 \n",
"\n",
"在這次作業中,我們將使用糖尿病患者的數據集,該數據集取自[這裡](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)。 \n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" AGE SEX BMI BP S1 S2 S3 S4 S5 S6 Y\n",
"0 59 2 32.1 101.0 157 93.2 38.0 4.0 4.8598 87 151\n",
"1 48 1 21.6 87.0 183 103.2 70.0 3.0 3.8918 69 75\n",
"2 72 2 30.5 93.0 156 93.6 41.0 4.0 4.6728 85 141\n",
"3 24 1 25.3 84.0 198 131.4 40.0 5.0 4.8903 89 206\n",
"4 50 1 23.0 101.0 192 125.4 52.0 4.0 4.2905 80 135"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGE</th>\n",
" <th>SEX</th>\n",
" <th>BMI</th>\n",
" <th>BP</th>\n",
" <th>S1</th>\n",
" <th>S2</th>\n",
" <th>S3</th>\n",
" <th>S4</th>\n",
" <th>S5</th>\n",
" <th>S6</th>\n",
" <th>Y</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>59</td>\n",
" <td>2</td>\n",
" <td>32.1</td>\n",
" <td>101.0</td>\n",
" <td>157</td>\n",
" <td>93.2</td>\n",
" <td>38.0</td>\n",
" <td>4.0</td>\n",
" <td>4.8598</td>\n",
" <td>87</td>\n",
" <td>151</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>48</td>\n",
" <td>1</td>\n",
" <td>21.6</td>\n",
" <td>87.0</td>\n",
" <td>183</td>\n",
" <td>103.2</td>\n",
" <td>70.0</td>\n",
" <td>3.0</td>\n",
" <td>3.8918</td>\n",
" <td>69</td>\n",
" <td>75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>72</td>\n",
" <td>2</td>\n",
" <td>30.5</td>\n",
" <td>93.0</td>\n",
" <td>156</td>\n",
" <td>93.6</td>\n",
" <td>41.0</td>\n",
" <td>4.0</td>\n",
" <td>4.6728</td>\n",
" <td>85</td>\n",
" <td>141</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>24</td>\n",
" <td>1</td>\n",
" <td>25.3</td>\n",
" <td>84.0</td>\n",
" <td>198</td>\n",
" <td>131.4</td>\n",
" <td>40.0</td>\n",
" <td>5.0</td>\n",
" <td>4.8903</td>\n",
" <td>89</td>\n",
" <td>206</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>50</td>\n",
" <td>1</td>\n",
" <td>23.0</td>\n",
" <td>101.0</td>\n",
" <td>192</td>\n",
" <td>125.4</td>\n",
" <td>52.0</td>\n",
" <td>4.0</td>\n",
" <td>4.2905</td>\n",
" <td>80</td>\n",
" <td>135</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"metadata": {},
"execution_count": 13
}
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"在此數據集中,欄位如下:\n",
"* 年齡和性別不需額外解釋\n",
"* BMI 是身體質量指數\n",
"* BP 是平均血壓\n",
"* S1 到 S6 是不同的血液測量值\n",
"* Y 是疾病在一年內進展的定性指標\n",
"\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"\n",
"### 任務 1計算所有值的平均值和方差\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### 任務 2根據性別繪製 BMI、BP 和 Y 的箱型圖\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### 任務 4測試不同變數與疾病進展 (Y) 之間的相關性\n",
"\n",
"> **提示** 相關矩陣可以提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n本文件使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對於因使用此翻譯而產生的任何誤解或錯誤解讀概不負責。\n"
]
}
],
"metadata": {
"orig_nbformat": 4,
"language_info": {
"name": "python",
"version": "3.8.8",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3.8.8 64-bit (conda)"
},
"interpreter": {
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:11:24+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "mo"
}
},
"nbformat": 4,
"nbformat_minor": 2
}