Merge pull request #666 from microsoft/update-translations

🌐 Update translations via Co-op Translator
Lee Stott 2 weeks ago committed by GitHub
commit 9c3aeb044c

@ -6,7 +6,7 @@
"## مقدمة في الاحتمالات والإحصاء \n",
"## الواجب \n",
"\n",
"في هذا الواجب، سنستخدم مجموعة بيانات مرضى السكري المأخوذة [من هنا](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
"في هذا الواجب، سنستخدم مجموعة البيانات الخاصة بمرضى السكري المأخوذة [من هنا](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### المهمة 3: ما هو توزيع العمر، الجنس، مؤشر كتلة الجسم ومتغيرات Y؟\n"
"### المهمة 3: ما هو توزيع العمر، الجنس، مؤشر كتلة الجسم والمتغيرات Y؟\n"
],
"metadata": {}
},
@ -214,7 +214,7 @@
{
"cell_type": "markdown",
"source": [
"### المهمة 5: اختبار الفرضية بأن درجة تقدم مرض السكري تختلف بين الرجال والنساء\n"
"### المهمة 5: اختبار الفرضية بأن درجة تطور مرض السكري تختلف بين الرجال والنساء\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**إخلاء المسؤولية**: \nتم ترجمة هذا المستند باستخدام خدمة الترجمة بالذكاء الاصطناعي [Co-op Translator](https://github.com/Azure/co-op-translator). بينما نسعى لتحقيق الدقة، يرجى العلم أن الترجمات الآلية قد تحتوي على أخطاء أو معلومات غير دقيقة. يجب اعتبار المستند الأصلي بلغته الأصلية المصدر الموثوق. للحصول على معلومات حاسمة، يُوصى بالاستعانة بترجمة بشرية احترافية. نحن غير مسؤولين عن أي سوء فهم أو تفسيرات خاطئة تنشأ عن استخدام هذه الترجمة.\n"
"\n---\n\n**إخلاء المسؤولية**: \nتمت ترجمة هذا المستند باستخدام خدمة الترجمة الآلية [Co-op Translator](https://github.com/Azure/co-op-translator). بينما نسعى لتحقيق الدقة، يرجى العلم أن الترجمات الآلية قد تحتوي على أخطاء أو معلومات غير دقيقة. يجب اعتبار المستند الأصلي بلغته الأصلية هو المصدر الموثوق. للحصول على معلومات حساسة أو هامة، يُوصى بالاستعانة بترجمة بشرية احترافية. نحن غير مسؤولين عن أي سوء فهم أو تفسيرات خاطئة تنشأ عن استخدام هذه الترجمة.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:16:23+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:05:41+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ar"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## مقدمة في الاحتمالات والإحصاء\n",
"## الواجب\n",
"## مقدمة في الاحتمالات والإحصاء \n",
"## الواجب \n",
"\n",
"في هذا الواجب، سنستخدم مجموعة بيانات مرضى السكري المأخوذة [من هنا](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"في هذا الواجب، سنستخدم مجموعة بيانات مرضى السكري المأخوذة [من هنا](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -155,7 +155,7 @@
"* BMI هو مؤشر كتلة الجسم \n",
"* BP هو متوسط ضغط الدم \n",
"* S1 إلى S6 هي قياسات دم مختلفة \n",
"* Y هو المقياس النوعي لتطور المرض خلال سنة واحدة \n",
"* Y هو مقياس نوعي لتطور المرض خلال سنة واحدة \n",
"\n",
"لنقم بدراسة هذا الملف باستخدام طرق الاحتمالات والإحصاء. \n",
"\n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### المهمة 3: ما هو توزيع العمر، الجنس، مؤشر كتلة الجسم ومتغيرات Y؟\n"
"### المهمة 3: ما هو توزيع العمر، الجنس، مؤشر كتلة الجسم والمتغيرات Y؟\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -846,8 +846,8 @@
{
"cell_type": "markdown",
"source": [
"الاستنتاج: \n",
"* أقوى ارتباط لـ Y هو مؤشر كتلة الجسم (BMI) و S5 (مستوى السكر في الدم). يبدو هذا منطقيًا.\n"
"الخلاصة: \n",
"* أقوى ارتباط مع Y هو مؤشر كتلة الجسم (BMI) و S5 (مستوى السكر في الدم). هذا يبدو منطقياً. \n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -879,7 +879,7 @@
{
"cell_type": "markdown",
"source": [
"### المهمة 5: اختبار الفرضية بأن درجة تقدم مرض السكري تختلف بين الرجال والنساء\n"
"### المهمة 5: اختبار الفرضية بأن درجة تطور مرض السكري تختلف بين الرجال والنساء\n"
],
"metadata": {}
},
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -920,7 +920,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**إخلاء المسؤولية**: \nتم ترجمة هذا المستند باستخدام خدمة الترجمة بالذكاء الاصطناعي [Co-op Translator](https://github.com/Azure/co-op-translator). بينما نسعى لتحقيق الدقة، يرجى العلم أن الترجمات الآلية قد تحتوي على أخطاء أو معلومات غير دقيقة. يجب اعتبار المستند الأصلي بلغته الأصلية المصدر الرسمي. للحصول على معلومات حاسمة، يُوصى بالاستعانة بترجمة بشرية احترافية. نحن غير مسؤولين عن أي سوء فهم أو تفسيرات خاطئة تنشأ عن استخدام هذه الترجمة.\n"
"\n---\n\n**إخلاء المسؤولية**: \nتم ترجمة هذا المستند باستخدام خدمة الترجمة الآلية [Co-op Translator](https://github.com/Azure/co-op-translator). بينما نسعى لتحقيق الدقة، يرجى العلم أن الترجمات الآلية قد تحتوي على أخطاء أو عدم دقة. يجب اعتبار المستند الأصلي بلغته الأصلية المصدر الموثوق. للحصول على معلومات حساسة أو هامة، يُوصى بالاستعانة بترجمة بشرية احترافية. نحن غير مسؤولين عن أي سوء فهم أو تفسيرات خاطئة ناتجة عن استخدام هذه الترجمة.\n"
]
}
],
@ -946,8 +946,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:21:20+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:06:00+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "ar"
}

@ -3,7 +3,7 @@
{
"cell_type": "markdown",
"source": [
"## Въведение в вероятностите и статистиката\n",
"## Въведение в теорията на вероятностите и статистиката\n",
"## Задача\n",
"\n",
"В тази задача ще използваме набора от данни за пациенти с диабет, взет [оттук](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"В този набор от данни колоните са следните:\n",
"* Възраст и пол са ясни сами по себе си\n",
"* BMI е индекс на телесната маса\n",
"* BP е средното кръвно налягане\n",
"* S1 до S6 са различни кръвни измервания\n",
"* Y е качествена мярка за прогресията на заболяването за една година\n",
"В този набор от данни колоните са следните: \n",
"* Възраст и пол са ясни сами по себе си \n",
"* BMI е индекс на телесната маса \n",
"* BP е средното кръвно налягане \n",
"* S1 до S6 са различни кръвни измервания \n",
"* Y е качествена мярка за прогресията на заболяването за една година \n",
"\n",
"Нека изследваме този набор от данни, използвайки методите на вероятността и статистиката.\n",
"\n",
"### Задача 1: Изчислете средните стойности и вариацията за всички стойности\n"
"### Задача 1: Изчислете средните стойности и вариацията за всички стойности \n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### Задача 3: Какво е разпределението на възраст, пол, ИТМ и Y променливи?\n"
"### Задача 3: Какво е разпределението на възраст, пол, ИТМ и променливата Y?\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Отказ от отговорност**: \nТози документ е преведен с помощта на AI услуга за превод [Co-op Translator](https://github.com/Azure/co-op-translator). Въпреки че се стремим към точност, моля, имайте предвид, че автоматизираните преводи може да съдържат грешки или неточности. Оригиналният документ на неговия роден език трябва да се счита за авторитетен източник. За критична информация се препоръчва професионален човешки превод. Не носим отговорност за недоразумения или погрешни интерпретации, произтичащи от използването на този превод.\n"
"\n---\n\n**Отказ от отговорност**: \nТози документ е преведен с помощта на AI услуга за превод [Co-op Translator](https://github.com/Azure/co-op-translator). Въпреки че се стремим към точност, моля, имайте предвид, че автоматичните преводи може да съдържат грешки или неточности. Оригиналният документ на неговия изходен език трябва да се счита за авторитетен източник. За критична информация се препоръчва професионален превод от човек. Не носим отговорност за каквито и да било недоразумения или погрешни интерпретации, произтичащи от използването на този превод.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:16:51+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:54:53+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "bg"
}

@ -3,7 +3,7 @@
{
"cell_type": "markdown",
"source": [
"## Въведение в вероятностите и статистиката\n",
"## Въведение в теорията на вероятностите и статистиката\n",
"## Задача\n",
"\n",
"В тази задача ще използваме набора от данни за пациенти с диабет, взет [оттук](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"В този набор от данни колоните са следните: \n",
"* Възраст и пол са ясни сами по себе си \n",
"* BMI е индекс на телесната маса \n",
"* BP е средното кръвно налягане \n",
"* S1 до S6 са различни кръвни измервания \n",
"* Y е качествена мярка за прогресията на заболяването за една година \n",
"В този набор от данни колоните са следните:\n",
"* Възраст и пол са ясни сами по себе си\n",
"* BMI е индекс на телесната маса\n",
"* BP е средното кръвно налягане\n",
"* S1 до S6 са различни кръвни измервания\n",
"* Y е качествена мярка за прогресията на заболяването за една година\n",
"\n",
"Нека изследваме този набор от данни, използвайки методите на вероятността и статистиката.\n",
"\n",
"### Задача 1: Изчислете средните стойности и вариацията за всички стойности \n"
"### Задача 1: Изчислете средните стойности и вариацията за всички стойности\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Задача 2: Начертайте boxplots за BMI, BP и Y в зависимост от пола\n"
"### Задача 2: Начертайте кутии за BMI, BP и Y в зависимост от пола\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -595,7 +595,7 @@
"Заключения:\n",
"* Възраст - нормална\n",
"* Пол - еднороден\n",
"* BMI, Y - трудно за определяне\n"
"* ИТМ, Y - трудно е да се определи\n"
],
"metadata": {}
},
@ -604,7 +604,7 @@
"source": [
"### Задача 4: Тествайте корелацията между различни променливи и прогресията на заболяването (Y)\n",
"\n",
"> **Подсказка** Корелационната матрица ще ви предостави най-полезната информация за това кои стойности са взаимозависими.\n"
"> **Подсказка** Корелационната матрица ще ви даде най-полезната информация за това кои стойности са зависими.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Отказ от отговорност**: \nТози документ е преведен с помощта на AI услуга за превод [Co-op Translator](https://github.com/Azure/co-op-translator). Въпреки че се стремим към точност, моля, имайте предвид, че автоматизираните преводи може да съдържат грешки или неточности. Оригиналният документ на неговия роден език трябва да се счита за авторитетен източник. За критична информация се препоръчва професионален човешки превод. Ние не носим отговорност за недоразумения или погрешни интерпретации, произтичащи от използването на този превод.\n"
"\n---\n\n**Отказ от отговорност**: \nТози документ е преведен с помощта на AI услуга за превод [Co-op Translator](https://github.com/Azure/co-op-translator). Въпреки че се стремим към точност, моля, имайте предвид, че автоматичните преводи може да съдържат грешки или неточности. Оригиналният документ на неговия изходен език трябва да се счита за авторитетен източник. За критична информация се препоръчва професионален превод от човек. Ние не носим отговорност за каквито и да е недоразумения или погрешни интерпретации, произтичащи от използването на този превод.\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:21:57+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:55:11+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "bg"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## সম্ভাবনা এবং পরিসংখ্যানের পরিচিতি\n",
"## অ্যাসাইনমেন্ট\n",
"## সম্ভাবনা এবং পরিসংখ্যানের পরিচিতি \n",
"## অ্যাসাইনমেন্ট \n",
"\n",
"এই অ্যাসাইনমেন্টে, আমরা ডাাবেটিস রোগীদের ডেটাসেট ব্যবহার করব যা [এখান থেকে নেওয়া হয়েছে](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)।\n"
"এই অ্যাসাইনমেন্টে, আমরা ডায়াবেটিস রোগীদের ডেটাসেট ব্যবহার করব যা [এখান থেকে নেওয়া হয়েছে](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"এই ডেটাসেটে কলামগুলো নিম্নরূপ: \n",
"* বয়স এবং লিঙ্গ স্বতঃব্যাখ্যামূলক \n",
"* BMI হলো শরীরের ভর সূচক \n",
"* BP হলো গড় রক্তচাপ \n",
"* S1 থেকে S6 হলো বিভিন্ন রক্তের পরিমাপ \n",
"* Y হলো এক বছরের মধ্যে রোগের অগ্রগতির গুণগত পরিমাপ \n",
"এই ডেটাসেটে নিম্নলিখিত কলামগুলো রয়েছে:\n",
"* বয়স এবং লিঙ্গ স্বতঃস্পষ্ট\n",
"* BMI হলো শরীরের ভর সূচক\n",
"* BP হলো গড় রক্তচাপ\n",
"* S1 থেকে S6 হলো বিভিন্ন রক্তের পরিমাপ\n",
"* Y হলো এক বছরের মধ্যে রোগের অগ্রগতির গুণগত পরিমাপ\n",
"\n",
"চলুন সম্ভাব্যতা এবং পরিসংখ্যানের পদ্ধতি ব্যবহার করে এই ডেটাসেটটি অধ্যয়ন করি। \n",
"চলুন এই ডেটাসেটটি সম্ভাবনা এবং পরিসংখ্যানের পদ্ধতি ব্যবহার করে অধ্যয়ন করি।\n",
"\n",
"### কাজ ১: সমস্ত মানের জন্য গড় মান এবং বৈচিত্র্য গণনা করুন \n"
"### কাজ ১: সমস্ত মানের জন্য গড় মান এবং বৈচিত্র্য গণনা করুন\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### কাজ ২: লিঙ্গের উপর নির্ভর করে BMI, BP এবং Y এর জন্য বক্সপ্লট আঁকুন\n"
"### টাস্ক ২: লিঙ্গের উপর নির্ভর করে BMI, BP এবং Y এর জন্য বক্সপ্লট আঁকুন\n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### টাস্ক ৩: বয়স, লিঙ্গ, BMI এবং Y ভেরিয়েবলের বিতরণ কী?\n"
"### টাস্ক ৩: বয়স, লিঙ্গ, বিএমআই এবং ওয়াই ভেরিয়েবলের বণ্টন কী?\n"
],
"metadata": {}
},
@ -214,7 +214,7 @@
{
"cell_type": "markdown",
"source": [
"### কাজ ৫: পুরুষ এবং নারীদের মধ্যে ডায়াবেটিসের অগ্রগতির মাত্রা ভিন্ন কিনা তা পরীক্ষা করুন\n"
"### টাস্ক ৫: পুরুষ এবং নারীদের মধ্যে ডায়াবেটিসের অগ্রগতির মাত্রা ভিন্ন কিনা তা পরীক্ষা করুন\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**অস্বীকৃতি**: \nএই নথিটি AI অনুবাদ পরিষেবা [Co-op Translator](https://github.com/Azure/co-op-translator) ব্যবহার করে অনুবাদ করা হয়েছে। আমরা যথাসম্ভব সঠিক অনুবাদ প্রদানের চেষ্টা করি, তবে অনুগ্রহ করে মনে রাখবেন যে স্বয়ংক্রিয় অনুবাদে ত্রুটি বা অসঙ্গতি থাকতে পারে। মূল ভাষায় থাকা নথিটিকে প্রামাণিক উৎস হিসেবে বিবেচনা করা উচিত। গুরুত্বপূর্ণ তথ্যের জন্য, পেশাদার মানব অনুবাদ সুপারিশ করা হয়। এই অনুবাদ ব্যবহারের ফলে কোনো ভুল বোঝাবুঝি বা ভুল ব্যাখ্যা হলে আমরা দায়বদ্ধ থাকব না।\n"
"\n---\n\n**অস্বীকৃতি**: \nএই নথিটি AI অনুবাদ পরিষেবা [Co-op Translator](https://github.com/Azure/co-op-translator) ব্যবহার করে অনুবাদ করা হয়েছে। আমরা যথাসাধ্য সঠিকতার জন্য চেষ্টা করি, তবে অনুগ্রহ করে মনে রাখবেন যে স্বয়ংক্রিয় অনুবাদে ত্রুটি বা অসঙ্গতি থাকতে পারে। মূল ভাষায় থাকা নথিটিকে প্রামাণিক উৎস হিসেবে বিবেচনা করা উচিত। গুরুত্বপূর্ণ তথ্যের জন্য, পেশাদার মানব অনুবাদ সুপারিশ করা হয়। এই অনুবাদ ব্যবহারের ফলে কোনো ভুল বোঝাবুঝি বা ভুল ব্যাখ্যা হলে আমরা দায়বদ্ধ থাকব না।\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:16:37+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:19:49+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "bn"
}

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -151,15 +151,15 @@
"cell_type": "markdown",
"source": [
"এই ডেটাসেটে কলামগুলো নিম্নরূপ: \n",
"* বয়স এবং লিঙ্গ স্বতঃব্যাখ্যামূলক \n",
"* বয়স এবং লিঙ্গ স্বতঃস্পষ্ট \n",
"* BMI হলো শরীরের ভর সূচক \n",
"* BP হলো গড় রক্তচাপ \n",
"* S1 থেকে S6 হলো বিভিন্ন রক্তের পরিমাপ \n",
"* Y হলো এক বছরের মধ্যে রোগের অগ্রগতির গুণগত পরিমাপ \n",
"\n",
"চলুন সম্ভাব্যতা এবং পরিসংখ্যানের পদ্ধতি ব্যবহার করে এই ডেটাসেটটি অধ্যয়ন করি। \n",
"চলুন এই ডেটাসেটটি সম্ভাবনা এবং পরিসংখ্যানের পদ্ধতি ব্যবহার করে অধ্যয়ন করি।\n",
"\n",
"### কাজ ১: সমস্ত মানের জন্য গড় মান এবং বৈচিত্র্য গণনা করুন \n"
"### কাজ ১: সমস্ত মানের জন্য গড় মান এবং বৈচিত্র্য গণনা করুন\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### কাজ ২: লিঙ্গের উপর নির্ভর করে BMI, BP এবং Y এর জন্য বক্সপ্লট আঁকুন\n"
"### টাস্ক ২: লিঙ্গের উপর নির্ভর করে BMI, BP এবং Y এর জন্য বক্সপ্লট আঁকুন\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### টাস্ক ৩: বয়স, লিঙ্গ, BMI এবং Y ভেরিয়েবলগুলোর বণ্টন কী?\n"
"### টাস্ক ৩: বয়স, লিঙ্গ, বিএমআই এবং ওয়াই ভেরিয়েবলের বণ্টন কী?\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -592,10 +592,10 @@
{
"cell_type": "markdown",
"source": [
"উপসংহার:\n",
"* বয়স - স্বাভাবিক\n",
"* লিঙ্গ - একরূপ\n",
"* BMI, Y - বলা কঠিন\n"
"উপসংহারসমূহ: \n",
"* বয়স - স্বাভাবিক \n",
"* লিঙ্গ - অভিন্ন \n",
"* BMI, Y - বলা কঠিন \n"
],
"metadata": {}
},
@ -846,8 +846,8 @@
{
"cell_type": "markdown",
"source": [
"উপসংহার:\n",
"* Y এর সাথে সবচেয়ে শক্তিশালী সম্পর্ক হলো BMI এবং S5 (রক্তে শর্করা)। এটি যুক্তিসঙ্গত শোনাচ্ছে।\n"
"উপসংহার: \n",
"* Y এর সাথে সবচেয়ে শক্তিশালী সম্পর্ক হলো BMI এবং S5 (রক্তে চিনি)। এটি যুক্তিসঙ্গত শোনাচ্ছে। \n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -879,7 +879,7 @@
{
"cell_type": "markdown",
"source": [
"### কাজ ৫: পুরুষ এবং নারীদের মধ্যে ডায়াবেটিসের অগ্রগতির মাত্রা ভিন্ন কিনা তা পরীক্ষা করুন\n"
"### টাস্ক ৫: পুরুষ এবং নারীদের মধ্যে ডায়াবেটিসের অগ্রগতির মাত্রা ভিন্ন কিনা তা পরীক্ষা করুন\n"
],
"metadata": {}
},
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**অস্বীকৃতি**: \nএই নথিটি AI অনুবাদ পরিষেবা [Co-op Translator](https://github.com/Azure/co-op-translator) ব্যবহার করে অনুবাদ করা হয়েছে। আমরা যথাসাধ্য সঠিকতার জন্য চেষ্টা করি, তবে অনুগ্রহ করে মনে রাখবেন যে স্বয়ংক্রিয় অনুবাদে ত্রুটি বা অসঙ্গতি থাকতে পারে। মূল ভাষায় থাকা নথিটিকে প্রামাণিক উৎস হিসেবে বিবেচনা করা উচিত। গুরুত্বপূর্ণ তথ্যের জন্য, পেশাদার মানব অনুবাদ সুপারিশ করা হয়। এই অনুবাদ ব্যবহারের ফলে কোনো ভুল বোঝাবুঝি বা ভুল ব্যাখ্যা হলে আমরা দায়বদ্ধ থাকব না।\n"
"\n---\n\n**অস্বীকৃতি**: \nএই নথিটি AI অনুবাদ পরিষেবা [Co-op Translator](https://github.com/Azure/co-op-translator) ব্যবহার করে অনুবাদ করা হয়েছে। আমরা যথাসম্ভব সঠিক অনুবাদের চেষ্টা করি, তবে অনুগ্রহ করে মনে রাখবেন যে স্বয়ংক্রিয় অনুবাদে ত্রুটি বা অসঙ্গতি থাকতে পারে। নথিটির মূল ভাষায় লেখা সংস্করণটিকেই প্রামাণিক উৎস হিসেবে বিবেচনা করা উচিত। গুরুত্বপূর্ণ তথ্যের জন্য, পেশাদার মানব অনুবাদ ব্যবহার করার পরামর্শ দেওয়া হচ্ছে। এই অনুবাদ ব্যবহারের ফলে সৃষ্ট কোনো ভুল বোঝাবুঝি বা ভুল ব্যাখ্যার জন্য আমরা দায়ী নই।\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:21:38+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:20:08+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "bn"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Neste conjunto de dados, as colunas são as seguintes: \n",
"* Idade e sexo são autoexplicativos \n",
"* IMC é o índice de massa corporal \n",
"* PA é a pressão arterial média \n",
"* S1 até S6 são diferentes medições sanguíneas \n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano \n",
"Neste conjunto de dados, as colunas são as seguintes:\n",
"* Idade e sexo são autoexplicativos\n",
"* IMC é o índice de massa corporal\n",
"* PA é a pressão arterial média\n",
"* S1 até S6 são diferentes medições de sangue\n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano\n",
"\n",
"Vamos estudar este conjunto de dados usando métodos de probabilidade e estatística.\n",
"Vamos estudar este conjunto de dados utilizando métodos de probabilidade e estatística.\n",
"\n",
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores\n"
"### Tarefa 1: Calcular valores médios e variância para todos os valores\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Tarefa 2: Traçar boxplots para IMC, PA e Y dependendo do gênero\n"
"### Tarefa 2: Plotar boxplots para IMC, PA e Y dependendo do gênero\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Aviso Legal**: \nEste documento foi traduzido utilizando o serviço de tradução por IA [Co-op Translator](https://github.com/Azure/co-op-translator). Embora nos esforcemos para garantir a precisão, esteja ciente de que traduções automatizadas podem conter erros ou imprecisões. O documento original em seu idioma nativo deve ser considerado a fonte autoritativa. Para informações críticas, recomenda-se a tradução profissional realizada por humanos. Não nos responsabilizamos por quaisquer mal-entendidos ou interpretações equivocadas decorrentes do uso desta tradução.\n"
"\n---\n\n**Aviso Legal**: \nEste documento foi traduzido utilizando o serviço de tradução por IA [Co-op Translator](https://github.com/Azure/co-op-translator). Embora nos esforcemos para garantir a precisão, esteja ciente de que traduções automáticas podem conter erros ou imprecisões. O documento original em seu idioma nativo deve ser considerado a fonte oficial. Para informações críticas, recomenda-se a tradução profissional realizada por humanos. Não nos responsabilizamos por quaisquer mal-entendidos ou interpretações incorretas decorrentes do uso desta tradução.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:17:02+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:26:33+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "br"
}

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -157,9 +157,9 @@
"* S1 até S6 são diferentes medições sanguíneas \n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano \n",
"\n",
"Vamos estudar este conjunto de dados usando métodos de probabilidade e estatística.\n",
"Vamos estudar este conjunto de dados utilizando métodos de probabilidade e estatística.\n",
"\n",
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores\n"
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Tarefa 2: Plote boxplots para IMC, PA e Y dependendo do gênero\n"
"### Tarefa 2: Plotar boxplots para IMC, PA e Y dependendo do gênero\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -590,10 +590,10 @@
{
"cell_type": "markdown",
"source": [
"Conclusões:\n",
"Conclusões: \n",
"* Idade - normal \n",
"* Sexo - uniforme \n",
"* IMC, Y - difícil dizer \n"
"* IMC, Y - difícil de dizer \n"
],
"metadata": {}
},
@ -845,7 +845,7 @@
"cell_type": "markdown",
"source": [
"Conclusão: \n",
"* As correlações mais fortes de Y são o IMC e S5 (açúcar no sangue). Isso parece razoável.\n"
"* A correlação mais forte de Y é com o IMC e S5 (açúcar no sangue). Isso parece razoável.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:22:12+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:26:48+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "br"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -154,11 +154,11 @@
"* BMI je index tělesné hmotnosti \n",
"* BP je průměrný krevní tlak \n",
"* S1 až S6 jsou různé krevní hodnoty \n",
"* Y je kvalitativní měřítko progrese onemocnění během jednoho roku \n",
"* Y je kvalitativní míra progrese onemocnění během jednoho roku \n",
"\n",
"Pojďme tuto datovou sadu prozkoumat pomocí metod pravděpodobnosti a statistiky.\n",
"\n",
"### Úkol 1: Vypočítejte průměrné hodnoty a rozptyl pro všechny hodnoty\n"
"### Úkol 1: Vypočítejte průměrné hodnoty a rozptyl pro všechny hodnoty \n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### Úkol 3: Jaké je rozložení věku, pohlaví, BMI a proměnných Y?\n"
"### Úkol 3: Jaké je rozložení proměnných Věk, Pohlaví, BMI a Y?\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Prohlášení**: \nTento dokument byl přeložen pomocí služby pro automatický překlad [Co-op Translator](https://github.com/Azure/co-op-translator). Ačkoli se snažíme o přesnost, mějte na paměti, že automatické překlady mohou obsahovat chyby nebo nepřesnosti. Původní dokument v jeho původním jazyce by měl být považován za autoritativní zdroj. Pro důležité informace se doporučuje profesionální lidský překlad. Neodpovídáme za žádné nedorozumění nebo nesprávné interpretace vyplývající z použití tohoto překladu.\n"
"\n---\n\n**Prohlášení**: \nTento dokument byl přeložen pomocí služby pro automatický překlad [Co-op Translator](https://github.com/Azure/co-op-translator). I když se snažíme o co největší přesnost, mějte prosím na paměti, že automatické překlady mohou obsahovat chyby nebo nepřesnosti. Za autoritativní zdroj by měl být považován původní dokument v jeho původním jazyce. Pro důležité informace doporučujeme profesionální lidský překlad. Neodpovídáme za žádná nedorozumění nebo nesprávné výklady vyplývající z použití tohoto překladu.\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:17:15+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:50:42+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "cs"
}

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"V této datové sadě jsou sloupce následující:\n",
"* Věk a pohlaví jsou samovysvětlující\n",
"* BMI je index tělesné hmotnosti\n",
"* BP je průměrný krevní tlak\n",
"* S1 až S6 jsou různé krevní měření\n",
"* Y je kvalitativní míra progrese nemoci během jednoho roku\n",
"V této datové sadě jsou sloupce následující: \n",
"* Věk a pohlaví jsou samovysvětlující \n",
"* BMI je index tělesné hmotnosti \n",
"* BP je průměrný krevní tlak \n",
"* S1 až S6 jsou různé krevní hodnoty \n",
"* Y je kvalitativní míra progrese onemocnění během jednoho roku \n",
"\n",
"Pojďme tuto datovou sadu prozkoumat pomocí metod pravděpodobnosti a statistiky.\n",
"\n",
"### Úkol 1: Vypočítejte průměrné hodnoty a rozptyl pro všechny hodnoty\n"
"### Úkol 1: Vypočítejte průměrné hodnoty a rozptyl pro všechny hodnoty \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### Úkol 3: Jaké je rozložení věku, pohlaví, BMI a proměnných Y?\n"
"### Úkol 3: Jaké je rozložení proměnných Věk, Pohlaví, BMI a Y?\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -592,17 +592,17 @@
{
"cell_type": "markdown",
"source": [
"Závěry:\n",
"* Věk - normální\n",
"* Pohlaví - jednotné\n",
"* BMI, Y - těžko říct\n"
"Závěry: \n",
"* Věk - normální \n",
"* Pohlaví - jednotné \n",
"* BMI, Y - těžko říct \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Úkol 4: Otestujte korelaci mezi různými proměnnými a průběhem nemoci (Y)\n",
"### Úkol 4: Otestujte korelaci mezi různými proměnnými a progresí nemoci (Y)\n",
"\n",
"> **Tip** Korelační matice vám poskytne nejvíce užitečných informací o tom, které hodnoty jsou závislé.\n"
],
@ -846,8 +846,8 @@
{
"cell_type": "markdown",
"source": [
"Závěr:\n",
"* Nejsilnější korelace Y je s BMI a S5 (hladina cukru v krvi). To zní rozumně.\n"
"Závěr: \n",
"* Nejsilnější korelace s Y mají BMI a S5 (hladina cukru v krvi). To zní logicky.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Prohlášení**: \nTento dokument byl přeložen pomocí služby pro automatický překlad [Co-op Translator](https://github.com/Azure/co-op-translator). I když se snažíme o přesnost, mějte prosím na paměti, že automatické překlady mohou obsahovat chyby nebo nepřesnosti. Původní dokument v jeho původním jazyce by měl být považován za autoritativní zdroj. Pro důležité informace se doporučuje profesionální lidský překlad. Neodpovídáme za žádné nedorozumění nebo nesprávné interpretace vyplývající z použití tohoto překladu.\n"
"\n---\n\n**Upozornění**: \nTento dokument byl přeložen pomocí služby pro automatický překlad [Co-op Translator](https://github.com/Azure/co-op-translator). I když se snažíme o co největší přesnost, mějte prosím na paměti, že automatické překlady mohou obsahovat chyby nebo nepřesnosti. Původní dokument v jeho původním jazyce by měl být považován za závazný zdroj. Pro důležité informace doporučujeme profesionální lidský překlad. Neodpovídáme za žádná nedorozumění nebo nesprávné výklady vyplývající z použití tohoto překladu.\n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:22:28+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:50:58+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "cs"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Opgave 2: Plot boksdiagrammer for BMI, BP og Y afhængigt af køn\n"
"### Opgave 2: Plot boksplot for BMI, BP og Y afhængigt af køn\n"
],
"metadata": {}
},
@ -200,9 +200,9 @@
{
"cell_type": "markdown",
"source": [
"### Opgave 4: Test korrelationen mellem forskellige variable og sygdomsprogression (Y)\n",
"### Opgave 4: Test korrelationen mellem forskellige variabler og sygdomsprogression (Y)\n",
"\n",
"> **Tip** En korrelationsmatrix vil give dig den mest nyttige information om, hvilke værdier der er afhængige.\n"
"> **Tip** Korrelationsmatrixen vil give dig den mest nyttige information om, hvilke værdier der er afhængige.\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokument er blevet oversat ved hjælp af AI-oversættelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selvom vi bestræber os på nøjagtighed, skal du være opmærksom på, at automatiserede oversættelser kan indeholde fejl eller unøjagtigheder. Det originale dokument på dets oprindelige sprog bør betragtes som den autoritative kilde. For kritisk information anbefales professionel menneskelig oversættelse. Vi er ikke ansvarlige for eventuelle misforståelser eller fejltolkninger, der måtte opstå som følge af brugen af denne oversættelse.\n"
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokument er blevet oversat ved hjælp af AI-oversættelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selvom vi bestræber os på nøjagtighed, skal du være opmærksom på, at automatiserede oversættelser kan indeholde fejl eller unøjagtigheder. Det originale dokument på dets oprindelige sprog bør betragtes som den autoritative kilde. For kritisk information anbefales professionel menneskelig oversættelse. Vi påtager os intet ansvar for misforståelser eller fejltolkninger, der måtte opstå som følge af brugen af denne oversættelse.\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:17:28+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:36:01+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "da"
}

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"I dette datasæt er kolonnerne som følger: \n",
"* Alder og køn er selvforklarende \n",
"* BMI er body mass index \n",
"* BP er gennemsnitligt blodtryk \n",
"* S1 til S6 er forskellige blodmålinger \n",
"* Y er det kvalitative mål for sygdomsprogression over et år \n",
"I dette datasæt er kolonnerne som følger:\n",
"* Alder og køn er selvforklarende\n",
"* BMI er kropsmasseindeks\n",
"* BP er gennemsnitligt blodtryk\n",
"* S1 til S6 er forskellige blodmålinger\n",
"* Y er det kvalitative mål for sygdomsprogression over et år\n",
"\n",
"Lad os undersøge dette datasæt ved hjælp af sandsynligheds- og statistikmetoder.\n",
"\n",
"### Opgave 1: Beregn gennemsnitsværdier og varians for alle værdier \n"
"### Opgave 1: Beregn gennemsnitsværdier og varians for alle værdier\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -592,10 +592,10 @@
{
"cell_type": "markdown",
"source": [
"Konklusioner:\n",
"* Alder - normal\n",
"* Køn - ensartet\n",
"* BMI, Y - svært at vurdere\n"
"Konklusioner: \n",
"* Alder - normal \n",
"* Køn - ensartet \n",
"* BMI, Y - svært at sige \n"
],
"metadata": {}
},
@ -604,7 +604,7 @@
"source": [
"### Opgave 4: Test korrelationen mellem forskellige variabler og sygdomsprogression (Y)\n",
"\n",
"> **Tip** Korrelationsmatrixen vil give dig den mest nyttige information om, hvilke værdier der er afhængige.\n"
"> **Tip** En korrelationsmatrix vil give dig den mest nyttige information om, hvilke værdier der er afhængige.\n"
],
"metadata": {}
},
@ -846,7 +846,7 @@
{
"cell_type": "markdown",
"source": [
"Konklusion:\n",
"Konklusion: \n",
"* Den stærkeste korrelation med Y er BMI og S5 (blodsukker). Dette virker rimeligt.\n"
],
"metadata": {}
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokument er blevet oversat ved hjælp af AI-oversættelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selvom vi bestræber os på nøjagtighed, skal du være opmærksom på, at automatiserede oversættelser kan indeholde fejl eller unøjagtigheder. Det originale dokument på dets oprindelige sprog bør betragtes som den autoritative kilde. For kritisk information anbefales professionel menneskelig oversættelse. Vi påtager os ikke ansvar for eventuelle misforståelser eller fejltolkninger, der opstår som følge af brugen af denne oversættelse.\n"
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokument er blevet oversat ved hjælp af AI-oversættelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selvom vi bestræber os på nøjagtighed, skal du være opmærksom på, at automatiserede oversættelser kan indeholde fejl eller unøjagtigheder. Det originale dokument på dets oprindelige sprog bør betragtes som den autoritative kilde. For kritisk information anbefales professionel menneskelig oversættelse. Vi påtager os intet ansvar for misforståelser eller fejltolkninger, der måtte opstå som følge af brugen af denne oversættelse.\n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:22:44+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:36:18+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "da"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -173,7 +173,7 @@
{
"cell_type": "markdown",
"source": [
"### Aufgabe 2: Boxplots für BMI, BP und Y abhängig vom Geschlecht erstellen\n"
"### Aufgabe 2: Erstelle Boxplots für BMI, BP und Y in Abhängigkeit vom Geschlecht\n"
],
"metadata": {}
},
@ -201,9 +201,9 @@
{
"cell_type": "markdown",
"source": [
"### Aufgabe 4: Teste die Korrelation zwischen verschiedenen Variablen und dem Krankheitsverlauf (Y)\n",
"### Aufgabe 4: Testen Sie die Korrelation zwischen verschiedenen Variablen und dem Krankheitsverlauf (Y)\n",
"\n",
"> **Hinweis** Eine Korrelationsmatrix liefert die nützlichsten Informationen darüber, welche Werte voneinander abhängig sind.\n"
"> **Tipp** Eine Korrelationsmatrix liefert Ihnen die nützlichsten Informationen darüber, welche Werte voneinander abhängig sind.\n"
],
"metadata": {}
},
@ -226,7 +226,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Haftungsausschluss**: \nDieses Dokument wurde mit dem KI-Übersetzungsdienst [Co-op Translator](https://github.com/Azure/co-op-translator) übersetzt. Obwohl wir uns um Genauigkeit bemühen, beachten Sie bitte, dass automatisierte Übersetzungen Fehler oder Ungenauigkeiten enthalten können. Das Originaldokument in seiner ursprünglichen Sprache sollte als maßgebliche Quelle betrachtet werden. Für kritische Informationen wird eine professionelle menschliche Übersetzung empfohlen. Wir übernehmen keine Haftung für Missverständnisse oder Fehlinterpretationen, die sich aus der Nutzung dieser Übersetzung ergeben.\n"
"\n---\n\n**Haftungsausschluss**: \nDieses Dokument wurde mit dem KI-Übersetzungsdienst [Co-op Translator](https://github.com/Azure/co-op-translator) übersetzt. Obwohl wir uns um Genauigkeit bemühen, weisen wir darauf hin, dass automatisierte Übersetzungen Fehler oder Ungenauigkeiten enthalten können. Das Originaldokument in seiner ursprünglichen Sprache sollte als maßgebliche Quelle betrachtet werden. Für kritische Informationen wird eine professionelle menschliche Übersetzung empfohlen. Wir übernehmen keine Haftung für Missverständnisse oder Fehlinterpretationen, die sich aus der Nutzung dieser Übersetzung ergeben.\n"
]
}
],
@ -252,8 +252,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:17:40+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:03:05+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "de"
}

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -152,11 +152,11 @@
"source": [
"In diesem Datensatz sind die Spalten wie folgt:\n",
"\n",
"* Alter und Geschlecht sind selbsterklärend\n",
"* BMI ist der Body-Mass-Index\n",
"* BP ist der durchschnittliche Blutdruck\n",
"* S1 bis S6 sind verschiedene Blutmesswerte\n",
"* Y ist das qualitative Maß für den Krankheitsverlauf über ein Jahr\n",
"* Alter und Geschlecht sind selbsterklärend \n",
"* BMI ist der Body-Mass-Index \n",
"* BP ist der durchschnittliche Blutdruck \n",
"* S1 bis S6 sind verschiedene Blutmessungen \n",
"* Y ist das qualitative Maß für den Krankheitsverlauf über ein Jahr \n",
"\n",
"Lassen Sie uns diesen Datensatz mit Methoden der Wahrscheinlichkeit und Statistik untersuchen.\n",
"\n",
@ -355,7 +355,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -447,7 +447,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -478,7 +478,7 @@
{
"cell_type": "markdown",
"source": [
"### Aufgabe 2: Boxplots für BMI, BP und Y abhängig vom Geschlecht erstellen\n"
"### Aufgabe 2: Erstelle Boxplots für BMI, BP und Y in Abhängigkeit vom Geschlecht\n"
],
"metadata": {}
},
@ -486,8 +486,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -538,8 +538,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -593,19 +593,19 @@
{
"cell_type": "markdown",
"source": [
"Schlussfolgerungen:\n",
"* Alter - normal\n",
"* Geschlecht - einheitlich\n",
"* BMI, Y - schwer zu beurteilen\n"
"Schlussfolgerungen: \n",
"* Alter - normal \n",
"* Geschlecht - einheitlich \n",
"* BMI, Y - schwer zu sagen \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Aufgabe 4: Teste die Korrelation zwischen verschiedenen Variablen und dem Krankheitsverlauf (Y)\n",
"### Aufgabe 4: Testen Sie die Korrelation zwischen verschiedenen Variablen und dem Krankheitsverlauf (Y)\n",
"\n",
"> **Hinweis** Eine Korrelationsmatrix liefert die nützlichsten Informationen darüber, welche Werte voneinander abhängig sind.\n"
"> **Tipp** Eine Korrelationsmatrix liefert Ihnen die nützlichsten Informationen darüber, welche Werte voneinander abhängig sind.\n"
],
"metadata": {}
},
@ -847,8 +847,8 @@
{
"cell_type": "markdown",
"source": [
"Fazit:\n",
"* Die stärkste Korrelation von Y ist BMI und S5 (Blutzucker). Das klingt vernünftig.\n"
"Fazit: \n",
"* Die stärkste Korrelation von Y besteht mit BMI und S5 (Blutzucker). Das klingt plausibel.\n"
],
"metadata": {}
},
@ -856,10 +856,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -886,9 +886,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -917,7 +917,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Haftungsausschluss**: \nDieses Dokument wurde mit dem KI-Übersetzungsdienst [Co-op Translator](https://github.com/Azure/co-op-translator) übersetzt. Obwohl wir uns um Genauigkeit bemühen, beachten Sie bitte, dass automatisierte Übersetzungen Fehler oder Ungenauigkeiten enthalten können. Das Originaldokument in seiner ursprünglichen Sprache sollte als maßgebliche Quelle betrachtet werden. Für kritische Informationen wird eine professionelle menschliche Übersetzung empfohlen. Wir übernehmen keine Haftung für Missverständnisse oder Fehlinterpretationen, die sich aus der Nutzung dieser Übersetzung ergeben.\n"
"\n---\n\n**Haftungsausschluss**: \nDieses Dokument wurde mithilfe des KI-Übersetzungsdienstes [Co-op Translator](https://github.com/Azure/co-op-translator) übersetzt. Obwohl wir uns um Genauigkeit bemühen, weisen wir darauf hin, dass automatisierte Übersetzungen Fehler oder Ungenauigkeiten enthalten können. Das Originaldokument in seiner ursprünglichen Sprache sollte als maßgebliche Quelle betrachtet werden. Für kritische Informationen wird eine professionelle menschliche Übersetzung empfohlen. Wir übernehmen keine Haftung für Missverständnisse oder Fehlinterpretationen, die sich aus der Nutzung dieser Übersetzung ergeben.\n"
]
}
],
@ -943,8 +943,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:23:00+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:03:21+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "de"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Σε αυτό το σύνολο δεδομένων, οι στήλες είναι οι εξής:\n",
"* Η ηλικία και το φύλο είναι αυτονόητα\n",
"* Το BMI είναι ο δείκτης μάζας σώματος\n",
"* Το BP είναι η μέση αρτηριακή πίεση\n",
"* Το S1 έως S6 είναι διαφορετικές μετρήσεις αίματος\n",
"* Το Y είναι η ποιοτική μέτρηση της εξέλιξης της ασθένειας μέσα σε ένα χρόνο\n",
"Σε αυτό το σύνολο δεδομένων, οι στήλες είναι οι εξής: \n",
"* Η ηλικία και το φύλο είναι αυτονόητα \n",
"* Το BMI είναι ο δείκτης μάζας σώματος \n",
"* Το BP είναι η μέση αρτηριακή πίεση \n",
"* Οι S1 έως S6 είναι διαφορετικές μετρήσεις αίματος \n",
"* Το Y είναι το ποιοτικό μέτρο της εξέλιξης της νόσου μέσα σε ένα έτος \n",
"\n",
"Ας μελετήσουμε αυτό το σύνολο δεδομένων χρησιμοποιώντας μεθόδους πιθανότητας και στατιστικής.\n",
"Ας μελετήσουμε αυτό το σύνολο δεδομένων χρησιμοποιώντας μεθόδους πιθανοτήτων και στατιστικής.\n",
"\n",
"### Εργασία 1: Υπολογίστε τις μέσες τιμές και τη διακύμανση για όλες τις τιμές\n"
"### Εργασία 1: Υπολογίστε τις μέσες τιμές και τη διακύμανση για όλες τις τιμές \n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Εργασία 2: Σχεδιάστε boxplots για BMI, BP και Y ανάλογα με το φύλο\n"
"### Εργασία 2: Σχεδιάστε διαγράμματα κουτιού για BMI, BP και Y ανάλογα με το φύλο\n"
],
"metadata": {}
},
@ -200,7 +200,7 @@
{
"cell_type": "markdown",
"source": [
"### Εργασία 4: Δοκιμάστε τη συσχέτιση μεταξύ διαφορετικών μεταβλητών και της εξέλιξης της ασθένειας (Y)\n",
"### Εργασία 4: Δοκιμάστε τη συσχέτιση μεταξύ διαφορετικών μεταβλητών και της εξέλιξης της νόσου (Y)\n",
"\n",
"> **Υπόδειξη** Ο πίνακας συσχέτισης θα σας δώσει τις πιο χρήσιμες πληροφορίες σχετικά με το ποιες τιμές είναι εξαρτημένες.\n"
],
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Αποποίηση ευθύνης**: \nΑυτό το έγγραφο έχει μεταφραστεί χρησιμοποιώντας την υπηρεσία αυτόματης μετάφρασης [Co-op Translator](https://github.com/Azure/co-op-translator). Παρόλο που καταβάλλουμε προσπάθειες για ακρίβεια, παρακαλούμε να έχετε υπόψη ότι οι αυτοματοποιημένες μεταφράσεις ενδέχεται να περιέχουν λάθη ή ανακρίβειες. Το πρωτότυπο έγγραφο στη μητρική του γλώσσα θα πρέπει να θεωρείται η αυθεντική πηγή. Για κρίσιμες πληροφορίες, συνιστάται επαγγελματική ανθρώπινη μετάφραση. Δεν φέρουμε ευθύνη για τυχόν παρεξηγήσεις ή εσφαλμένες ερμηνείες που προκύπτουν από τη χρήση αυτής της μετάφρασης.\n"
"\n---\n\n**Αποποίηση ευθύνης**: \nΑυτό το έγγραφο έχει μεταφραστεί χρησιμοποιώντας την υπηρεσία αυτόματης μετάφρασης [Co-op Translator](https://github.com/Azure/co-op-translator). Παρόλο που καταβάλλουμε προσπάθειες για ακρίβεια, παρακαλούμε να έχετε υπόψη ότι οι αυτόματες μεταφράσεις ενδέχεται να περιέχουν σφάλματα ή ανακρίβειες. Το πρωτότυπο έγγραφο στη μητρική του γλώσσα θα πρέπει να θεωρείται η αυθεντική πηγή. Για κρίσιμες πληροφορίες, συνιστάται επαγγελματική ανθρώπινη μετάφραση. Δεν φέρουμε ευθύνη για τυχόν παρεξηγήσεις ή εσφαλμένες ερμηνείες που προκύπτουν από τη χρήση αυτής της μετάφρασης.\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:17:53+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:31:57+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "el"
}

File diff suppressed because one or more lines are too long

@ -3,7 +3,7 @@
{
"cell_type": "markdown",
"source": [
"## Εισαγωγή στην Πιθανότητα και Στατιστική\n",
"## Εισαγωγή στην Πιθανότητα και τη Στατιστική\n",
"## Εργασία\n",
"\n",
"Σε αυτή την εργασία, θα χρησιμοποιήσουμε το σύνολο δεδομένων ασθενών με διαβήτη που έχει ληφθεί [από εδώ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"Σε αυτό το σύνολο δεδομένων, οι στήλες είναι οι εξής: \n",
"* Η ηλικία και το φύλο είναι αυτονόητα \n",
"* Το BMI είναι ο δείκτης μάζας σώματος \n",
"* Το BP είναι η μέση αρτηριακή πίεση \n",
"* Οι S1 έως S6 είναι διαφορετικές μετρήσεις αίματος \n",
"* Το Y είναι το ποιοτικό μέτρο της εξέλιξης της νόσου μέσα σε ένα έτος \n",
"Σε αυτό το σύνολο δεδομένων, οι στήλες είναι οι εξής:\n",
"* Η ηλικία και το φύλο είναι αυτονόητα\n",
"* Το BMI είναι ο δείκτης μάζας σώματος\n",
"* Το BP είναι η μέση αρτηριακή πίεση\n",
"* Το S1 έως S6 είναι διαφορετικές μετρήσεις αίματος\n",
"* Το Y είναι η ποιοτική μέτρηση της εξέλιξης της ασθένειας μέσα σε ένα χρόνο\n",
"\n",
"Ας μελετήσουμε αυτό το σύνολο δεδομένων χρησιμοποιώντας μεθόδους πιθανοτήτων και στατιστικής.\n",
"Ας μελετήσουμε αυτό το σύνολο δεδομένων χρησιμοποιώντας μεθόδους πιθανότητας και στατιστικής.\n",
"\n",
"### Εργασία 1: Υπολογίστε τις μέσες τιμές και τη διακύμανση για όλες τις τιμές \n"
"### Εργασία 1: Υπολογίστε τις μέσες τιμές και τη διακύμανση για όλες τις τιμές\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Εργασία 2: Σχεδιάστε boxplots για BMI, BP και Y ανάλογα με το φύλο\n"
"### Εργασία 2: Σχεδιάστε διαγράμματα κουτιού για BMI, BP και Y ανάλογα με το φύλο\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -592,17 +592,17 @@
{
"cell_type": "markdown",
"source": [
"Συμπεράσματα:\n",
"* Ηλικία - φυσιολογική\n",
"* Φύλο - ομοιόμορφο\n",
"* ΔΜΣ, Y - δύσκολο να προσδιοριστεί\n"
"Συμπεράσματα: \n",
"* Ηλικία - φυσιολογική \n",
"* Φύλο - ομοιόμορφο \n",
"* ΔΜΣ, Υ - δύσκολο να προσδιοριστεί \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Εργασία 4: Δοκιμάστε τη συσχέτιση μεταξύ διαφορετικών μεταβλητών και της εξέλιξης της ασθένειας (Y)\n",
"### Εργασία 4: Δοκιμάστε τη συσχέτιση μεταξύ διαφορετικών μεταβλητών και της εξέλιξης της νόσου (Y)\n",
"\n",
"> **Υπόδειξη** Ο πίνακας συσχέτισης θα σας δώσει τις πιο χρήσιμες πληροφορίες σχετικά με το ποιες τιμές είναι εξαρτημένες.\n"
],
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"Συμπέρασμα: \n",
"* Η ισχυρότερη συσχέτιση του Y είναι με το ΔΜΣ και το S5 (σάκχαρο αίματος). Αυτό ακούγεται λογικό.\n"
"* Η ισχυρότερη συσχέτιση του Y είναι με τον ΔΜΣ και το S5 (σάκχαρο αίματος). Αυτό ακούγεται λογικό.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Αποποίηση ευθύνης**: \nΑυτό το έγγραφο έχει μεταφραστεί χρησιμοποιώντας την υπηρεσία αυτόματης μετάφρασης [Co-op Translator](https://github.com/Azure/co-op-translator). Παρόλο που καταβάλλουμε προσπάθειες για ακρίβεια, παρακαλούμε να έχετε υπόψη ότι οι αυτοματοποιημένες μεταφράσεις ενδέχεται να περιέχουν λάθη ή ανακρίβειες. Το πρωτότυπο έγγραφο στη μητρική του γλώσσα θα πρέπει να θεωρείται η αυθεντική πηγή. Για κρίσιμες πληροφορίες, συνιστάται επαγγελματική ανθρώπινη μετάφραση. Δεν φέρουμε ευθύνη για τυχόν παρεξηγήσεις ή εσφαλμένες ερμηνείες που προκύπτουν από τη χρήση αυτής της μετάφρασης.\n"
"\n---\n\n**Αποποίηση Ευθύνης**: \nΑυτό το έγγραφο έχει μεταφραστεί χρησιμοποιώντας την υπηρεσία αυτόματης μετάφρασης [Co-op Translator](https://github.com/Azure/co-op-translator). Παρόλο που καταβάλλουμε προσπάθειες για ακρίβεια, παρακαλούμε να έχετε υπόψη ότι οι αυτόματες μεταφράσεις ενδέχεται να περιέχουν σφάλματα ή ανακρίβειες. Το πρωτότυπο έγγραφο στη μητρική του γλώσσα θα πρέπει να θεωρείται η αυθεντική πηγή. Για κρίσιμες πληροφορίες, συνιστάται επαγγελματική ανθρώπινη μετάφραση. Δεν φέρουμε ευθύνη για τυχόν παρεξηγήσεις ή εσφαλμένες ερμηνείες που προκύπτουν από τη χρήση αυτής της μετάφρασης.\n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:23:18+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:32:16+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "el"
}

@ -6,7 +6,7 @@
"## Introduction to Probability and Statistics\n",
"## Assignment\n",
"\n",
"In this assignment, we will use the dataset of diabetes patients obtained [from here](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"In this assignment, we will use the dataset of diabetes patients available [here](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,12 +149,12 @@
{
"cell_type": "markdown",
"source": [
"In this dataset, the columns are as follows:\n",
"* Age and sex are straightforward\n",
"* BMI refers to body mass index\n",
"* BP represents average blood pressure\n",
"* S1 to S6 are various blood measurements\n",
"* Y is a qualitative indicator of disease progression over the course of one year\n",
"In this dataset, the columns are as follows: \n",
"* Age and sex are self-explanatory \n",
"* BMI is body mass index \n",
"* BP is average blood pressure \n",
"* S1 through S6 are different blood measurements \n",
"* Y is the qualitative measure of disease progression over one year \n",
"\n",
"Let's analyze this dataset using probability and statistical methods.\n",
"\n",
@ -202,7 +202,7 @@
"source": [
"### Task 4: Test the correlation between different variables and disease progression (Y)\n",
"\n",
"> **Hint** The correlation matrix will provide the most valuable insights into which values are interdependent.\n"
"> **Hint** A correlation matrix will provide the most useful insights into which values are interdependent.\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we aim for accuracy, please note that automated translations may include errors or inaccuracies. The original document in its native language should be regarded as the definitive source. For critical information, professional human translation is advised. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.\n"
"\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please note that automated translations may contain errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is recommended. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-03T20:43:46+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T16:59:17+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "en"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,12 +150,12 @@
{
"cell_type": "markdown",
"source": [
"In this dataset, the columns are as follows:\n",
"* Age and sex are straightforward.\n",
"* BMI refers to body mass index.\n",
"* BP represents average blood pressure.\n",
"* S1 through S6 are various blood measurements.\n",
"* Y is a qualitative indicator of disease progression over the course of one year.\n",
"In this dataset, the columns are as follows: \n",
"* Age and sex are self-explanatory \n",
"* BMI is body mass index \n",
"* BP is average blood pressure \n",
"* S1 through S6 are different blood measurements \n",
"* Y is the qualitative measure of disease progression over one year \n",
"\n",
"Let's analyze this dataset using probability and statistical methods.\n",
"\n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -592,19 +592,19 @@
{
"cell_type": "markdown",
"source": [
"Conclusions:\n",
"* Age - normal\n",
"* Sex - uniform\n",
"* BMI, Y - hard to tell\n"
"Conclusions: \n",
"* Age - normal \n",
"* Sex - uniform \n",
"* BMI, Y - hard to tell \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Task 4: Examine the relationship between various variables and disease progression (Y)\n",
"### Task 4: Test the correlation between different variables and disease progression (Y)\n",
"\n",
"> **Hint** A correlation matrix will provide the most valuable insights into which values are interdependent.\n"
"> **Hint** A correlation matrix will provide the most useful insights into which values are interdependent.\n"
],
"metadata": {}
},
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"Conclusion:\n",
"* The strongest correlation of Y is BMI and S5 (blood sugar). This seems logical.\n"
"* The strongest correlation of Y is BMI and S5 (blood sugar). This sounds reasonable.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we aim for accuracy, please note that automated translations may include errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is advised. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.\n"
"\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please note that automated translations may contain errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is recommended. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-03T20:44:02+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T16:59:33+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "en"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"En este conjunto de datos, las columnas son las siguientes: \n",
"* La edad y el sexo son autoexplicativos \n",
"* BMI es el índice de masa corporal \n",
"* BP es la presión arterial promedio \n",
"* S1 a S6 son diferentes mediciones sanguíneas \n",
"* Y es la medida cualitativa de la progresión de la enfermedad a lo largo de un año \n",
"En este conjunto de datos, las columnas son las siguientes:\n",
"* Edad y sexo son autoexplicativos\n",
"* BMI es el índice de masa corporal\n",
"* BP es la presión arterial promedio\n",
"* S1 a S6 son diferentes mediciones de sangre\n",
"* Y es la medida cualitativa de la progresión de la enfermedad durante un año\n",
"\n",
"Estudiemos este conjunto de datos utilizando métodos de probabilidad y estadística.\n",
"\n",
"### Tarea 1: Calcular los valores medios y la varianza para todos los valores \n"
"### Tarea 1: Calcular valores medios y varianza para todos los valores\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Tarea 2: Graficar diagramas de caja para IMC, PA y Y dependiendo del género\n"
"### Tarea 2: Graficar diagramas de caja para BMI, BP y Y dependiendo del género\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Descargo de responsabilidad**: \nEste documento ha sido traducido utilizando el servicio de traducción automática [Co-op Translator](https://github.com/Azure/co-op-translator). Aunque nos esforzamos por garantizar la precisión, tenga en cuenta que las traducciones automatizadas pueden contener errores o imprecisiones. El documento original en su idioma nativo debe considerarse como la fuente autorizada. Para información crítica, se recomienda una traducción profesional realizada por humanos. No nos hacemos responsables de malentendidos o interpretaciones erróneas que puedan surgir del uso de esta traducción.\n"
"\n---\n\n**Descargo de responsabilidad**: \nEste documento ha sido traducido utilizando el servicio de traducción automática [Co-op Translator](https://github.com/Azure/co-op-translator). Si bien nos esforzamos por lograr precisión, tenga en cuenta que las traducciones automáticas pueden contener errores o imprecisiones. El documento original en su idioma nativo debe considerarse como la fuente autorizada. Para información crítica, se recomienda una traducción profesional realizada por humanos. No nos hacemos responsables de malentendidos o interpretaciones erróneas que puedan surgir del uso de esta traducción.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:18:04+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:01:47+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "es"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"En este conjunto de datos, las columnas son las siguientes: \n",
"* Age y sex son autoexplicativas \n",
"* BMI es el índice de masa corporal \n",
"* BP es la presión arterial promedio \n",
"* S1 a S6 son diferentes mediciones sanguíneas \n",
"* Y es la medida cualitativa de la progresión de la enfermedad a lo largo de un año \n",
"En este conjunto de datos, las columnas son las siguientes:\n",
"* Edad y sexo son autoexplicativos\n",
"* BMI es el índice de masa corporal\n",
"* BP es la presión arterial promedio\n",
"* S1 a S6 son diferentes mediciones de sangre\n",
"* Y es la medida cualitativa de la progresión de la enfermedad durante un año\n",
"\n",
"Estudiemos este conjunto de datos utilizando métodos de probabilidad y estadística. \n",
"Estudiemos este conjunto de datos utilizando métodos de probabilidad y estadística.\n",
"\n",
"### Tarea 1: Calcular los valores medios y la varianza para todos los valores \n"
"### Tarea 1: Calcular valores medios y varianza para todos los valores\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -591,9 +591,9 @@
"cell_type": "markdown",
"source": [
"Conclusiones:\n",
"* Edad - normal\n",
"* Sexo - uniforme\n",
"* IMC, Y - difícil de determinar\n"
"* Edad - normal \n",
"* Sexo - uniforme \n",
"* IMC, Y - difícil de determinar \n"
],
"metadata": {}
},
@ -602,7 +602,7 @@
"source": [
"### Tarea 4: Prueba la correlación entre diferentes variables y la progresión de la enfermedad (Y)\n",
"\n",
"> **Sugerencia** La matriz de correlación te proporcionará la información más útil sobre qué valores son dependientes.\n"
"> **Sugerencia** Una matriz de correlación te proporcionará la información más útil sobre qué valores son dependientes.\n"
],
"metadata": {}
},
@ -845,7 +845,7 @@
"cell_type": "markdown",
"source": [
"Conclusión: \n",
"* La correlación más fuerte de Y es con el IMC y S5 (nivel de azúcar en sangre). Esto parece razonable.\n"
"* La correlación más fuerte de Y es con el IMC y S5 (azúcar en sangre). Esto suena razonable.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Descargo de responsabilidad**: \nEste documento ha sido traducido utilizando el servicio de traducción automática [Co-op Translator](https://github.com/Azure/co-op-translator). Aunque nos esforzamos por garantizar la precisión, tenga en cuenta que las traducciones automatizadas pueden contener errores o imprecisiones. El documento original en su idioma nativo debe considerarse como la fuente autorizada. Para información crítica, se recomienda una traducción profesional realizada por humanos. No nos hacemos responsables de malentendidos o interpretaciones erróneas que puedan surgir del uso de esta traducción.\n"
"\n---\n\n**Descargo de responsabilidad**: \nEste documento ha sido traducido utilizando el servicio de traducción automática [Co-op Translator](https://github.com/Azure/co-op-translator). Si bien nos esforzamos por lograr precisión, tenga en cuenta que las traducciones automáticas pueden contener errores o imprecisiones. El documento original en su idioma nativo debe considerarse como la fuente autorizada. Para información crítica, se recomienda una traducción profesional realizada por humanos. No nos hacemos responsables de malentendidos o interpretaciones erróneas que puedan surgir del uso de esta traducción.\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:23:33+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:02:02+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "es"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"در این مجموعه داده، ستون‌ها به شرح زیر هستند: \n",
"* سن و جنسیت نیازی به توضیح ندارند \n",
"* BMI شاخص توده بدنی است \n",
"* BP میانگین فشار خون است \n",
"* S1 تا S6 اندازه‌گیری‌های مختلف خون هستند \n",
"* Y معیار کیفی پیشرفت بیماری در طول یک سال است \n",
"در این مجموعه داده، ستون‌ها به صورت زیر هستند:\n",
"* سن و جنسیت نیازی به توضیح ندارند\n",
"* BMI شاخص توده بدنی است\n",
"* BP میانگین فشار خون است\n",
"* S1 تا S6 اندازه‌گیری‌های مختلف خون هستند\n",
"* Y معیار کیفی پیشرفت بیماری در طول یک سال است\n",
"\n",
"بیایید این مجموعه داده را با استفاده از روش‌های احتمال و آمار بررسی کنیم.\n",
"\n",
"### وظیفه ۱: محاسبه مقادیر میانگین و واریانس برای تمام مقادیر \n"
"### وظیفه ۱: محاسبه میانگین و واریانس برای تمام مقادیر\n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### وظیفه ۳: توزیع متغیرهای سن، جنسیت، شاخص توده بدنی و Y چگونه است؟\n"
"### وظیفه ۳: توزیع سن، جنسیت، شاخص توده بدنی و متغیر Y چگونه است؟\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### وظیفه ۴: آزمایش همبستگی بین متغیرهای مختلف و پیشرفت بیماری (Y)\n",
"\n",
"> **نکته** ماتریس همبستگی اطلاعات مفیدی در مورد اینکه کدام مقادیر وابسته هستند به شما ارائه می‌دهد.\n"
"> **نکته** ماتریس همبستگی اطلاعات مفیدی در مورد اینکه کدام مقادیر وابسته هستند به شما می‌دهد.\n"
],
"metadata": {}
},
@ -214,7 +214,7 @@
{
"cell_type": "markdown",
"source": [
"### وظیفه ۵: فرضیه را آزمایش کنید که درجه پیشرفت دیابت بین مردان و زنان متفاوت است\n"
"### وظیفه ۵: فرضیه را آزمایش کنید که میزان پیشرفت دیابت بین مردان و زنان متفاوت است\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**سلب مسئولیت**: \nاین سند با استفاده از سرویس ترجمه هوش مصنوعی [Co-op Translator](https://github.com/Azure/co-op-translator) ترجمه شده است. در حالی که ما تلاش می‌کنیم دقت را حفظ کنیم، لطفاً توجه داشته باشید که ترجمه‌های خودکار ممکن است شامل خطاها یا نادرستی‌ها باشند. سند اصلی به زبان اصلی آن باید به عنوان منبع معتبر در نظر گرفته شود. برای اطلاعات حساس، توصیه می‌شود از ترجمه حرفه‌ای انسانی استفاده کنید. ما مسئولیتی در قبال سوء تفاهم‌ها یا تفسیرهای نادرست ناشی از استفاده از این ترجمه نداریم.\n"
"\n---\n\n**سلب مسئولیت**: \nاین سند با استفاده از سرویس ترجمه هوش مصنوعی [Co-op Translator](https://github.com/Azure/co-op-translator) ترجمه شده است. در حالی که ما برای دقت تلاش می‌کنیم، لطفاً توجه داشته باشید که ترجمه‌های خودکار ممکن است شامل خطاها یا نادرستی‌هایی باشند. سند اصلی به زبان اصلی آن باید به عنوان منبع معتبر در نظر گرفته شود. برای اطلاعات حساس، ترجمه حرفه‌ای انسانی توصیه می‌شود. ما هیچ مسئولیتی در قبال سوءتفاهم‌ها یا تفسیرهای نادرست ناشی از استفاده از این ترجمه نداریم.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:18:18+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:07:15+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "fa"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### وظیفه ۳: توزیع متغیرهای سن، جنسیت، شاخص توده بدنی و Y چگونه است؟\n"
"### وظیفه ۳: توزیع سن، جنسیت، شاخص توده بدنی و متغیر Y چگونه است؟\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -593,9 +593,9 @@
"cell_type": "markdown",
"source": [
"نتیجه‌گیری‌ها:\n",
"* سن - عادی\n",
"* جنسیت - یکنواخت\n",
"* شاخص توده بدنی، Y - سخت است که مشخص شود\n"
"* سن - عادی \n",
"* جنسیت - یکنواخت \n",
"* شاخص توده بدنی (BMI)، Y - سخت می‌توان قضاوت کرد \n"
],
"metadata": {}
},
@ -604,7 +604,7 @@
"source": [
"### وظیفه ۴: آزمایش همبستگی بین متغیرهای مختلف و پیشرفت بیماری (Y)\n",
"\n",
"> **نکته** ماتریس همبستگی اطلاعات مفیدی در مورد اینکه کدام مقادیر وابسته هستند به شما می‌دهد.\n"
"> **نکته** ماتریس همبستگی اطلاعات بسیار مفیدی در مورد اینکه کدام مقادیر وابسته هستند به شما می‌دهد.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -879,7 +879,7 @@
{
"cell_type": "markdown",
"source": [
"### وظیفه ۵: فرضیه را آزمایش کنید که درجه پیشرفت دیابت بین مردان و زنان متفاوت است\n"
"### وظیفه ۵: فرضیه را آزمایش کنید که میزان پیشرفت دیابت بین مردان و زنان متفاوت است\n"
],
"metadata": {}
},
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -907,7 +907,7 @@
{
"cell_type": "markdown",
"source": [
"نتیجه‌گیری: مقدار p نزدیک به ۰ (معمولاً کمتر از ۰.۰۵) نشان‌دهنده اعتماد بالا به فرضیه ما است. در مورد ما، شواهد قوی وجود ندارد که جنسیت بر پیشرفت دیابت تأثیر بگذارد.\n"
"نتیجه‌گیری: مقدار p نزدیک به ۰ (معمولاً کمتر از ۰.۰۵) نشان‌دهنده اعتماد بالا به فرضیه ما است. در مورد ما، شواهد قوی وجود ندارد که نشان دهد جنسیت بر پیشرفت دیابت تأثیر می‌گذارد.\n"
],
"metadata": {}
},
@ -920,7 +920,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**سلب مسئولیت**: \nاین سند با استفاده از سرویس ترجمه هوش مصنوعی [Co-op Translator](https://github.com/Azure/co-op-translator) ترجمه شده است. در حالی که ما تلاش می‌کنیم دقت را حفظ کنیم، لطفاً توجه داشته باشید که ترجمه‌های خودکار ممکن است شامل خطاها یا نادرستی‌ها باشند. سند اصلی به زبان اصلی آن باید به عنوان منبع معتبر در نظر گرفته شود. برای اطلاعات حساس، توصیه می‌شود از ترجمه حرفه‌ای انسانی استفاده کنید. ما هیچ مسئولیتی در قبال سوء تفاهم‌ها یا تفسیرهای نادرست ناشی از استفاده از این ترجمه نداریم.\n"
"\n---\n\n**سلب مسئولیت**: \nاین سند با استفاده از سرویس ترجمه هوش مصنوعی [Co-op Translator](https://github.com/Azure/co-op-translator) ترجمه شده است. در حالی که ما برای دقت تلاش می‌کنیم، لطفاً توجه داشته باشید که ترجمه‌های خودکار ممکن است شامل خطاها یا نادقتی‌ها باشند. سند اصلی به زبان بومی آن باید به عنوان منبع معتبر در نظر گرفته شود. برای اطلاعات حساس، ترجمه حرفه‌ای انسانی توصیه می‌شود. ما هیچ مسئولیتی در قبال سوءتفاهم‌ها یا تفسیرهای نادرست ناشی از استفاده از این ترجمه نداریم.\n"
]
}
],
@ -946,8 +946,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:23:53+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:07:35+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "fa"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Tässä aineistossa sarakkeet ovat seuraavat: \n",
"* Ikä ja sukupuoli ovat itsestään selviä \n",
"* BMI on kehon painoindeksi \n",
"* BP on keskimääräinen verenpaine \n",
"* S1S6 ovat erilaisia veriarvoja \n",
"* Y on laadullinen mitta taudin etenemisestä yhden vuoden aikana \n",
"Tässä aineistossa sarakkeet ovat seuraavat:\n",
"* Ikä ja sukupuoli ovat itsestään selviä\n",
"* BMI on kehon massan indeksi\n",
"* BP on keskimääräinen verenpaine\n",
"* S1S6 ovat erilaisia verimittauksia\n",
"* Y on taudin etenemisen laadullinen mittari yhden vuoden aikana\n",
"\n",
"Tutkitaan tätä aineistoa todennäköisyyden ja tilastotieteen menetelmillä.\n",
"\n",
"### Tehtävä 1: Laske kaikkien arvojen keskiarvot ja varianssit \n"
"### Tehtävä 1: Laske kaikkien arvojen keskiarvot ja varianssit\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### Tehtävä 4: Testaa eri muuttujien ja sairauden etenemisen (Y) välistä korrelaatiota\n",
"\n",
"> **Vinkki** Korrelaatiomatriisi antaa hyödyllisintä tietoa siitä, mitkä arvot ovat riippuvaisia toisistaan.\n"
"> **Vinkki** Korrelaatiomatriisi antaa sinulle hyödyllisintä tietoa siitä, mitkä arvot ovat riippuvaisia toisistaan.\n"
],
"metadata": {}
},
@ -214,7 +214,7 @@
{
"cell_type": "markdown",
"source": [
"### Tehtävä 5: Testaa hypoteesi, että diabeteksen etenemisaste eroaa miesten ja naisten välillä\n"
"### Tehtävä 5: Testaa hypoteesi, että diabeteksen etenemisaste on erilainen miesten ja naisten välillä\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Vastuuvapauslauseke**: \nTämä asiakirja on käännetty käyttämällä tekoälypohjaista käännöspalvelua [Co-op Translator](https://github.com/Azure/co-op-translator). Vaikka pyrimme tarkkuuteen, huomioithan, että automaattiset käännökset voivat sisältää virheitä tai epätarkkuuksia. Alkuperäistä asiakirjaa sen alkuperäisellä kielellä tulisi pitää ensisijaisena lähteenä. Kriittisen tiedon osalta suositellaan ammattimaista ihmiskäännöstä. Emme ole vastuussa väärinkäsityksistä tai virhetulkinnoista, jotka johtuvat tämän käännöksen käytöstä.\n"
"\n---\n\n**Vastuuvapauslauseke**: \nTämä asiakirja on käännetty käyttämällä tekoälypohjaista käännöspalvelua [Co-op Translator](https://github.com/Azure/co-op-translator). Vaikka pyrimme tarkkuuteen, huomioithan, että automaattiset käännökset voivat sisältää virheitä tai epätarkkuuksia. Alkuperäistä asiakirjaa sen alkuperäisellä kielellä tulee pitää ensisijaisena lähteenä. Kriittisen tiedon osalta suositellaan ammattimaista ihmiskääntämistä. Emme ole vastuussa tämän käännöksen käytöstä aiheutuvista väärinkäsityksistä tai virhetulkinnoista.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:18:32+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:38:38+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "fi"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -151,13 +151,13 @@
"cell_type": "markdown",
"source": [
"Tässä aineistossa sarakkeet ovat seuraavat: \n",
"* Ikä ja sukupuoli ovat itsestään selviä \n",
"* BMI on kehon painoindeksi \n",
"* Ikä ja sukupuoli ovat itsestäänselviä \n",
"* BMI tarkoittaa painoindeksiä \n",
"* BP on keskimääräinen verenpaine \n",
"* S1S6 ovat erilaisia veriarvoja \n",
"* Y on sairauden etenemisen laadullinen mittari yhden vuoden aikana \n",
"* Y on laadullinen mitta taudin etenemisestä yhden vuoden aikana \n",
"\n",
"Tutkitaan tätä aineistoa todennäköisyyden ja tilastotieteen menetelmillä.\n",
"Tutkitaan tätä aineistoa todennäköisyyden ja tilastotieteen menetelmien avulla. \n",
"\n",
"### Tehtävä 1: Laske kaikkien arvojen keskiarvot ja varianssit \n"
],
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -604,7 +604,7 @@
"source": [
"### Tehtävä 4: Testaa eri muuttujien ja sairauden etenemisen (Y) välistä korrelaatiota\n",
"\n",
"> **Vinkki** Korrelaatiomatriisi antaa hyödyllisintä tietoa siitä, mitkä arvot ovat riippuvaisia toisistaan.\n"
"> **Vinkki** Korrelaatiomatriisi antaa sinulle hyödyllisintä tietoa siitä, mitkä arvot ovat riippuvaisia toisistaan.\n"
],
"metadata": {}
},
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"Johtopäätös: \n",
"* Vahvin korrelaatio Y:n kanssa on BMI ja S5 (verensokeri). Tämä vaikuttaa järkevältä.\n"
"* Vahvin korrelaatio Y:n kanssa on BMI ja S5 (verensokeri). Tämä kuulostaa järkevältä.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Vastuuvapauslauseke**: \nTämä asiakirja on käännetty käyttämällä tekoälypohjaista käännöspalvelua [Co-op Translator](https://github.com/Azure/co-op-translator). Vaikka pyrimme tarkkuuteen, huomioithan, että automaattiset käännökset voivat sisältää virheitä tai epätarkkuuksia. Alkuperäistä asiakirjaa sen alkuperäisellä kielellä tulisi pitää ensisijaisena lähteenä. Kriittisen tiedon osalta suositellaan ammattimaista ihmiskäännöstä. Emme ole vastuussa väärinkäsityksistä tai virhetulkinnoista, jotka johtuvat tämän käännöksen käytöstä.\n"
"\n---\n\n**Vastuuvapauslauseke**: \nTämä asiakirja on käännetty käyttämällä tekoälypohjaista käännöspalvelua [Co-op Translator](https://github.com/Azure/co-op-translator). Pyrimme tarkkuuteen, mutta huomioithan, että automaattiset käännökset voivat sisältää virheitä tai epätarkkuuksia. Alkuperäistä asiakirjaa sen alkuperäisellä kielellä tulee pitää ensisijaisena lähteenä. Kriittisen tiedon osalta suositellaan ammattimaista ihmiskääntämistä. Emme ole vastuussa tämän käännöksen käytöstä aiheutuvista väärinkäsityksistä tai virhetulkinnoista.\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:24:11+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:38:56+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "fi"
}

@ -6,7 +6,7 @@
"## Introduction à la Probabilité et aux Statistiques\n",
"## Devoir\n",
"\n",
"Dans ce devoir, nous utiliserons le jeu de données des patients atteints de diabète provenant [d'ici](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Dans ce devoir, nous utiliserons le jeu de données des patients diabétiques disponible [ici](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Dans cet ensemble de données, les colonnes sont les suivantes : \n",
"* L'âge et le sexe sont explicites \n",
"* L'IMC est l'indice de masse corporelle \n",
"* La PA est la pression artérielle moyenne \n",
"* S1 à S6 sont différentes mesures sanguines \n",
"* Y est la mesure qualitative de la progression de la maladie sur une année \n",
"Dans ce jeu de données, les colonnes sont les suivantes :\n",
"* L'âge et le sexe sont explicites\n",
"* L'IMC est l'indice de masse corporelle\n",
"* BP est la pression artérielle moyenne\n",
"* S1 à S6 sont différentes mesures sanguines\n",
"* Y est la mesure qualitative de la progression de la maladie sur une année\n",
"\n",
"Étudions cet ensemble de données en utilisant des méthodes de probabilité et de statistiques.\n",
"Étudions ce jeu de données en utilisant des méthodes de probabilité et de statistiques.\n",
"\n",
"### Tâche 1 : Calculer les valeurs moyennes et la variance pour toutes les valeurs\n"
"### Tâche 1 : Calculer les valeurs moyennes et la variance pour tous les éléments\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Tâche 2 : Tracer des boxplots pour l'IMC, la TA et Y en fonction du sexe\n"
"### Tâche 2 : Tracer des boîtes à moustaches pour l'IMC, la TA et Y en fonction du sexe\n"
],
"metadata": {}
},
@ -200,7 +200,7 @@
"source": [
"### Tâche 4 : Tester la corrélation entre différentes variables et la progression de la maladie (Y)\n",
"\n",
"> **Indice** Une matrice de corrélation vous fournira les informations les plus utiles sur les valeurs qui sont dépendantes.\n"
"> **Conseil** Une matrice de corrélation vous fournira les informations les plus utiles sur les valeurs qui sont dépendantes.\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Avertissement** : \nCe document a été traduit à l'aide du service de traduction automatique [Co-op Translator](https://github.com/Azure/co-op-translator). Bien que nous nous efforcions d'assurer l'exactitude, veuillez noter que les traductions automatisées peuvent contenir des erreurs ou des inexactitudes. Le document original dans sa langue d'origine doit être considéré comme la source faisant autorité. Pour des informations critiques, il est recommandé de recourir à une traduction professionnelle réalisée par un humain. Nous déclinons toute responsabilité en cas de malentendus ou d'interprétations erronées résultant de l'utilisation de cette traduction.\n"
"\n---\n\n**Avertissement** : \nCe document a été traduit à l'aide du service de traduction automatique [Co-op Translator](https://github.com/Azure/co-op-translator). Bien que nous nous efforcions d'assurer l'exactitude, veuillez noter que les traductions automatisées peuvent contenir des erreurs ou des inexactitudes. Le document original dans sa langue d'origine doit être considéré comme la source faisant autorité. Pour des informations critiques, il est recommandé de faire appel à une traduction humaine professionnelle. Nous déclinons toute responsabilité en cas de malentendus ou d'interprétations erronées résultant de l'utilisation de cette traduction.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:18:44+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:00:34+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "fr"
}

File diff suppressed because one or more lines are too long

@ -6,7 +6,7 @@
"## Introduction à la Probabilité et aux Statistiques\n",
"## Devoir\n",
"\n",
"Dans ce devoir, nous utiliserons le jeu de données des patients atteints de diabète provenant [d'ici](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Dans ce devoir, nous utiliserons le jeu de données des patients diabétiques disponible [ici](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"Dans ce jeu de données, les colonnes sont les suivantes :\n",
"* L'âge et le sexe sont explicites\n",
"* L'IMC est l'indice de masse corporelle\n",
"* BP est la pression artérielle moyenne\n",
"* S1 à S6 sont différentes mesures sanguines\n",
"* Y est la mesure qualitative de la progression de la maladie sur une année\n",
"Dans cet ensemble de données, les colonnes sont les suivantes : \n",
"* L'âge et le sexe sont explicites \n",
"* L'IMC correspond à l'indice de masse corporelle \n",
"* La PA est la pression artérielle moyenne \n",
"* S1 à S6 sont différentes mesures sanguines \n",
"* Y est la mesure qualitative de la progression de la maladie sur une année \n",
"\n",
"Étudions ce jeu de données en utilisant des méthodes de probabilité et de statistiques.\n",
"Étudions cet ensemble de données à l'aide des méthodes de probabilité et de statistiques.\n",
"\n",
"### Tâche 1 : Calculer les valeurs moyennes et la variance pour tous les éléments\n"
"### Tâche 1 : Calculer les valeurs moyennes et la variance pour toutes les valeurs \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Tâche 2 : Tracer des boxplots pour l'IMC, la TA et Y en fonction du sexe\n"
"### Tâche 2 : Tracer des boîtes à moustaches pour l'IMC, la TA et Y en fonction du genre\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -602,7 +602,7 @@
"source": [
"### Tâche 4 : Tester la corrélation entre différentes variables et la progression de la maladie (Y)\n",
"\n",
"> **Indice** Une matrice de corrélation vous fournira les informations les plus utiles sur les valeurs qui sont dépendantes.\n"
"> **Conseil** Une matrice de corrélation vous fournira les informations les plus utiles pour identifier quelles valeurs sont dépendantes.\n"
],
"metadata": {}
},
@ -845,7 +845,7 @@
"cell_type": "markdown",
"source": [
"Conclusion : \n",
"* La corrélation la plus forte avec Y est l'IMC et S5 (sucre dans le sang). Cela semble raisonnable.\n"
"* La corrélation la plus forte avec Y est l'IMC et S5 (sucre dans le sang). Cela semble logique.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:24:26+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:00:48+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "fr"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## מבוא להסתברות וסטטיסטיקה \n",
"## משימה \n",
"## מבוא להסתברות וסטטיסטיקה\n",
"## משימה\n",
"\n",
"במשימה זו, נשתמש במאגר הנתונים של חולי סוכרת שנלקח [מכאן](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
"במשימה זו, נשתמש במאגר הנתונים של חולי סוכרת שנלקח [מכאן](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"במערך הנתונים הזה, העמודות הן כדלקמן: \n",
"* גיל ומין מובנים מאליהם \n",
"* BMI הוא מדד מסת הגוף \n",
"* BP הוא לחץ דם ממוצע \n",
"* S1 עד S6 הם מדידות דם שונות \n",
"* Y הוא מדד איכותי להתקדמות המחלה לאורך שנה אחת \n",
"במערך הנתונים הזה, העמודות הן כדלקמן:\n",
"* גיל ומין מובנים מאליהם\n",
"* BMI הוא מדד מסת הגוף\n",
"* BP הוא לחץ דם ממוצע\n",
"* S1 עד S6 הם מדידות דם שונות\n",
"* Y הוא מדד איכותי להתקדמות המחלה לאורך שנה אחת\n",
"\n",
"בואו נלמד את מערך הנתונים הזה באמצעות שיטות של הסתברות וסטטיסטיקה.\n",
"בואו נלמד את מערך הנתונים הזה באמצעות שיטות הסתברות וסטטיסטיקה.\n",
"\n",
"### משימה 1: חישוב ערכי ממוצע ושונות לכל הערכים \n"
"### משימה 1: חישוב ערכי ממוצע ושונות לכל הערכים\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### משימה 2: שרטט תרשימי קופסה עבור BMI, BP ו-Y בהתאם למגדר\n"
"### משימה 2: שרטטו תרשימי קופסה עבור BMI, BP ו-Y בהתאם למגדר\n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### משימה 3: מהי ההתפלגות של משתני גיל, מין, BMI ו-Y?\n"
"### משימה 3: מהי התפלגות הגיל, המין, ה-BMI והמשתנה Y?\n"
],
"metadata": {}
},
@ -200,9 +200,9 @@
{
"cell_type": "markdown",
"source": [
"### משימה 4: בדיקת הקשר בין משתנים שונים להתקדמות המחלה (Y)\n",
"### משימה 4: בדיקת הקורלציה בין משתנים שונים להתקדמות המחלה (Y)\n",
"\n",
"> **רמז** מטריצת מתאם תספק לך את המידע השימושי ביותר על אילו ערכים תלויים זה בזה.\n"
"> **רמז** מטריצת קורלציה תספק לך את המידע השימושי ביותר על אילו ערכים תלויים זה בזה.\n"
],
"metadata": {}
},
@ -214,7 +214,7 @@
{
"cell_type": "markdown",
"source": [
"### משימה 5: בדיקת ההשערה שהדרגת התקדמות הסוכרת שונה בין גברים לנשים\n"
"### משימה 5: בדיקת ההשערה שהדרגה של התקדמות הסוכרת שונה בין גברים לנשים\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**כתב ויתור**: \nמסמך זה תורגם באמצעות שירות תרגום מבוסס בינה מלאכותית [Co-op Translator](https://github.com/Azure/co-op-translator). בעוד שאנו שואפים לדיוק, יש להיות מודעים לכך שתרגומים אוטומטיים עשויים להכיל שגיאות או אי דיוקים. המסמך המקורי בשפתו המקורית צריך להיחשב כמקור סמכותי. עבור מידע קריטי, מומלץ להשתמש בתרגום מקצועי על ידי אדם. איננו נושאים באחריות לאי הבנות או לפרשנויות שגויות הנובעות משימוש בתרגום זה.\n"
"\n---\n\n**כתב ויתור**: \nמסמך זה תורגם באמצעות שירות תרגום מבוסס בינה מלאכותית [Co-op Translator](https://github.com/Azure/co-op-translator). למרות שאנו שואפים לדיוק, יש לקחת בחשבון שתרגומים אוטומטיים עשויים להכיל שגיאות או אי דיוקים. המסמך המקורי בשפתו המקורית צריך להיחשב כמקור סמכותי. עבור מידע קריטי, מומלץ להשתמש בתרגום מקצועי על ידי אדם. איננו נושאים באחריות לאי הבנות או לפרשנויות שגויות הנובעות משימוש בתרגום זה.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:18:58+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:41:18+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "he"
}

File diff suppressed because one or more lines are too long

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## מבוא להסתברות וסטטיסטיקה \n",
"## משימה \n",
"## מבוא להסתברות וסטטיסטיקה\n",
"## משימה\n",
"\n",
"במשימה זו, נשתמש במאגר הנתונים של חולי סוכרת שנלקח [מכאן](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
"במשימה זו, נשתמש במאגר הנתונים של חולי סוכרת שנלקח [מכאן](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,8 +150,8 @@
{
"cell_type": "markdown",
"source": [
"במערך נתונים זה, העמודות הן כדלקמן: \n",
"* גיל ומין מובנים מאליהם \n",
"עבור מערך הנתונים הזה, העמודות הן כדלקמן: \n",
"* גיל ומין הם מובנים מאליהם \n",
"* BMI הוא מדד מסת הגוף \n",
"* BP הוא לחץ דם ממוצע \n",
"* S1 עד S6 הם מדידות דם שונות \n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### משימה 3: מהי ההתפלגות של משתני גיל, מין, BMI ו-Y?\n"
"### משימה 3: מהי התפלגות הגיל, המין, ה-BMI והמשתנה Y?\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -602,9 +602,9 @@
{
"cell_type": "markdown",
"source": [
"### משימה 4: בדיקת הקשר בין משתנים שונים להתקדמות המחלה (Y)\n",
"### משימה 4: בדיקת הקורלציה בין משתנים שונים להתקדמות המחלה (Y)\n",
"\n",
"> **רמז** מטריצת מתאם תספק את המידע השימושי ביותר על אילו ערכים תלויים.\n"
"> **רמז** מטריצת קורלציה תספק לך את המידע השימושי ביותר על אילו ערכים תלויים זה בזה.\n"
],
"metadata": {}
},
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"סיכום: \n",
"* הקשר החזק ביותר של Y הוא עם BMI ו-S5 (רמת סוכר בדם). זה נשמע הגיוני.\n"
"* הקשר החזק ביותר של Y הוא עם BMI ו-S5 (רמת סוכר בדם). זה נשמע הגיוני. \n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -879,7 +879,7 @@
{
"cell_type": "markdown",
"source": [
"### משימה 5: בדוק את ההשערה שהדרגת התקדמות הסוכרת שונה בין גברים לנשים\n"
"### משימה 5: בדיקת ההשערה שהדרגה של התקדמות הסוכרת שונה בין גברים לנשים\n"
],
"metadata": {}
},
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -920,7 +920,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**כתב ויתור**: \nמסמך זה תורגם באמצעות שירות תרגום מבוסס בינה מלאכותית [Co-op Translator](https://github.com/Azure/co-op-translator). בעוד שאנו שואפים לדיוק, יש להיות מודעים לכך שתרגומים אוטומטיים עשויים להכיל שגיאות או אי דיוקים. המסמך המקורי בשפתו המקורית צריך להיחשב כמקור הסמכותי. עבור מידע קריטי, מומלץ להשתמש בתרגום מקצועי על ידי אדם. איננו נושאים באחריות לאי הבנות או לפרשנויות שגויות הנובעות משימוש בתרגום זה.\n"
"\n---\n\n**כתב ויתור**: \nמסמך זה תורגם באמצעות שירות תרגום מבוסס בינה מלאכותית [Co-op Translator](https://github.com/Azure/co-op-translator). למרות שאנו שואפים לדיוק, יש לקחת בחשבון שתרגומים אוטומטיים עשויים להכיל שגיאות או אי-דיוקים. המסמך המקורי בשפתו המקורית נחשב למקור הסמכותי. למידע קריטי, מומלץ להשתמש בתרגום מקצועי על ידי בני אדם. איננו נושאים באחריות לכל אי-הבנה או פרשנות שגויה הנובעת משימוש בתרגום זה. \n"
]
}
],
@ -946,8 +946,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:24:44+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:41:36+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "he"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## संभावना और सांख्यिकी का परिचय\n",
"## असाइनमेंट\n",
"## संभावना और सांख्यिकी का परिचय \n",
"## असाइनमेंट \n",
"\n",
"इस असाइनमेंट में, हम मधुमेह रोगियों के डेटा सेट का उपयोग करेंगे, जो [यहां से लिया गया है](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)।\n"
"इस असाइनमेंट में, हम मधुमेह रोगियों के डेटा सेट का उपयोग करेंगे, जो [यहां से लिया गया है](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### कार्य 2: लिंग के आधार पर BMI, BP और Y के लिए बॉक्सप्लॉट बनाएं\n"
"### कार्य 2: लिंग के अनुसार BMI, BP और Y के लिए बॉक्सप्लॉट बनाएं\n"
],
"metadata": {}
},
@ -198,9 +198,9 @@
{
"cell_type": "markdown",
"source": [
"### कार्य 4: विभिन्न चर और बीमारी की प्रगति (Y) के बीच सहसंबंध का परीक्षण करें\n",
"### कार्य 4: विभिन्न चर और रोग की प्रगति (Y) के बीच सहसंबंध का परीक्षण करें\n",
"\n",
"> **संकेत** सहसंबंध मैट्रिक्स आपको यह समझने में सबसे अधिक मदद करेगा कि कौन से मान एक-दूसरे पर निर्भर हैं।\n"
"> **संकेत** सहसंबंध मैट्रिक्स आपको यह समझने में सबसे अधिक सहायक होगा कि कौन से मान एक-दूसरे पर निर्भर हैं।\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nयह दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) का उपयोग करके अनुवादित किया गया है। जबकि हम सटीकता सुनिश्चित करने का प्रयास करते हैं, कृपया ध्यान दें कि स्वचालित अनुवाद में त्रुटियां या अशुद्धियां हो सकती हैं। मूल भाषा में उपलब्ध मूल दस्तावेज़ को प्रामाणिक स्रोत माना जाना चाहिए। महत्वपूर्ण जानकारी के लिए, पेशेवर मानव अनुवाद की सिफारिश की जाती है। इस अनुवाद के उपयोग से उत्पन्न किसी भी गलतफहमी या गलत व्याख्या के लिए हम उत्तरदायी नहीं हैं।\n"
"\n---\n\n**अस्वीकरण**: \nयह दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) का उपयोग करके अनुवादित किया गया है। जबकि हम सटीकता के लिए प्रयासरत हैं, कृपया ध्यान दें कि स्वचालित अनुवाद में त्रुटियां या अशुद्धियां हो सकती हैं। मूल भाषा में उपलब्ध मूल दस्तावेज़ को आधिकारिक स्रोत माना जाना चाहिए। महत्वपूर्ण जानकारी के लिए, पेशेवर मानव अनुवाद की सिफारिश की जाती है। इस अनुवाद के उपयोग से उत्पन्न किसी भी गलतफहमी या गलत व्याख्या के लिए हम उत्तरदायी नहीं हैं।\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:19:09+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:18:25+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "hi"
}

File diff suppressed because one or more lines are too long

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## संभावना और सांख्यिकी का परिचय\n",
"## असाइनमेंट\n",
"## संभावना और सांख्यिकी का परिचय \n",
"## असाइनमेंट \n",
"\n",
"इस असाइनमेंट में, हम मधुमेह रोगियों के डेटा सेट का उपयोग करेंगे जो [यहां से लिया गया है](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)।\n"
"इस असाइनमेंट में, हम मधुमेह रोगियों के डेटा सेट का उपयोग करेंगे, जिसे [यहां से लिया गया है](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### कार्य 2: लिंग के आधार पर BMI, BP और Y के लिए बॉक्सप्लॉट बनाएं\n"
"### कार्य 2: लिंग के अनुसार BMI, BP और Y के लिए बॉक्सप्लॉट बनाएं\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -590,7 +590,7 @@
{
"cell_type": "markdown",
"source": [
"निष्कर्ष:\n",
"निष्कर्ष: \n",
"* आयु - सामान्य \n",
"* लिंग - समान \n",
"* बीएमआई, वाई - कहना मुश्किल \n"
@ -600,9 +600,9 @@
{
"cell_type": "markdown",
"source": [
"### कार्य 4: विभिन्न चर और रोग की प्रगति (Y) के बीच संबंध का परीक्षण करें\n",
"### कार्य 4: विभिन्न चर और रोग की प्रगति (Y) के बीच सहसंबंध का परीक्षण करें\n",
"\n",
"> **संकेत** संबंध मैट्रिक्स आपको यह समझने में सबसे अधिक सहायक होगा कि कौन से मान एक-दूसरे पर निर्भर हैं।\n"
"> **संकेत** सहसंबंध मैट्रिक्स आपको यह समझने में सबसे अधिक सहायक होगा कि कौन-कौन से मान एक-दूसरे पर निर्भर हैं।\n"
],
"metadata": {}
},
@ -845,7 +845,7 @@
"cell_type": "markdown",
"source": [
"निष्कर्ष:\n",
"* Y का सबसे मजबूत संबंध BMI और S5 (ब्लड शुगर) से है। यह उचित लगता है।\n"
"* Y का सबसे मजबूत सहसंबंध BMI और S5 (ब्लड शुगर) के साथ है। यह तर्कसंगत लगता है।\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nयह दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) का उपयोग करके अनुवादित किया गया है। जबकि हम सटीकता सुनिश्चित करने का प्रयास करते हैं, कृपया ध्यान दें कि स्वचालित अनुवाद में त्रुटियां या अशुद्धियां हो सकती हैं। मूल भाषा में उपलब्ध मूल दस्तावेज़ को प्रामाणिक स्रोत माना जाना चाहिए। महत्वपूर्ण जानकारी के लिए, पेशेवर मानव अनुवाद की सिफारिश की जाती है। इस अनुवाद के उपयोग से उत्पन्न किसी भी गलतफहमी या गलत व्याख्या के लिए हम उत्तरदायी नहीं हैं।\n"
"\n---\n\n**अस्वीकरण**: \nयह दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) का उपयोग करके अनुवादित किया गया है। जबकि हम सटीकता के लिए प्रयासरत हैं, कृपया ध्यान दें कि स्वचालित अनुवाद में त्रुटियां या अशुद्धियां हो सकती हैं। मूल भाषा में उपलब्ध मूल दस्तावेज़ को आधिकारिक स्रोत माना जाना चाहिए। महत्वपूर्ण जानकारी के लिए, पेशेवर मानव अनुवाद की सिफारिश की जाती है। इस अनुवाद के उपयोग से उत्पन्न किसी भी गलतफहमी या गलत व्याख्या के लिए हम उत्तरदायी नहीं हैं। \n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:24:59+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:18:40+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "hi"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,14 +149,14 @@
{
"cell_type": "markdown",
"source": [
"在此數據集中,列的含義如下\n",
"* 年齡和性別不言自明\n",
"在此數據集中,列包含以下內容\n",
"* 年齡和性別不需額外解釋\n",
"* BMI 是身體質量指數\n",
"* BP 是平均血壓\n",
"* S1 至 S6 是不同的血液測量值\n",
"* S1 至 S6 是不同的血液測量值\n",
"* Y 是疾病在一年內進展的定性指標\n",
"\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"\n",
"### 任務 1:計算所有值的平均值和方差\n"
],
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### 任務 3: 年齡、性別、BMI 和 Y 變量的分佈是什麼?\n"
"### 任務 3:年齡、性別、BMI 和 Y 變數的分佈是什麼?\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### 任務 4:測試不同變數與疾病進展(Y)之間的相關性\n",
"\n",
"> **提示** 相關性矩陣可以為你提供最有用的資訊,幫助判斷哪些值是相關的。\n"
"> **提示** 相關性矩陣可以為你提供最有用的資訊,幫助判斷哪些值是相關的。\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n文件已使用人工智能翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業的人類翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n"
"\n---\n\n**免責聲明** \n文件已使用人工智能翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議使用專業的人類翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:49:25+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:12:43+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "hk"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"在此數據集中,欄位如下: \n",
"* Age 和 sex 不需多作解釋 \n",
"* BMI 是身體質量指數 \n",
"* BP 是平均血壓 \n",
"* S1 到 S6 是不同的血液測量值 \n",
"* Y 是一年內疾病進展的定性指標 \n",
"在此數據集中,列包含以下內容:\n",
"* 年齡和性別不需額外解釋\n",
"* BMI 是身體質量指數\n",
"* BP 是平均血壓\n",
"* S1 至 S6 是不同的血液測量值\n",
"* Y 是疾病在一年內進展的定性指標\n",
"\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"\n",
"### 任務 1:計算所有值的平均值和方差 \n"
"### 任務 1:計算所有值的平均值和方差\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -604,7 +604,7 @@
"source": [
"### 任務 4:測試不同變數與疾病進展(Y)之間的相關性\n",
"\n",
"> **提示** 相關矩陣可以為你提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
"> **提示** 相關矩陣可以為你提供最有用的資訊,幫助判斷哪些值是相關的。\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -879,7 +879,7 @@
{
"cell_type": "markdown",
"source": [
"### 任務 5: 測試糖尿病進展程度在男性和女性之間是否存在差異的假設\n"
"### 任務 5:檢驗糖尿病進展程度在男性和女性之間是否存在差異的假設\n"
],
"metadata": {}
},
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n此文件已使用人工智能翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n"
"\n---\n\n**免責聲明** \n此文件已使用人工智能翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或誤釋不承擔責任。\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:58:14+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:13:00+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "hk"
}

@ -6,7 +6,7 @@
"## Uvod u vjerojatnost i statistiku\n",
"## Zadatak\n",
"\n",
"U ovom zadatku koristit ćemo skup podataka o pacijentima s dijabetesom preuzet [odavde](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"U ovom zadatku koristit ćemo skup podataka o pacijentima s dijabetesom preuzet [s ove stranice](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -158,7 +158,7 @@
"\n",
"Proučimo ovaj skup podataka koristeći metode vjerojatnosti i statistike.\n",
"\n",
"### Zadatak 1: Izračunajte srednje vrijednosti i varijancu za sve vrijednosti\n"
"### Zadatak 1: Izračunajte srednje vrijednosti i varijancu za sve vrijednosti \n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### Zadatak 4: Testirajte korelaciju između različitih varijabli i napredovanja bolesti (Y)\n",
"\n",
"> **Savjet** Korelacijska matrica pružit će vam najkorisnije informacije o tome koje vrijednosti su međusobno ovisne.\n"
"> **Savjet** Korelacijska matrica pružit će vam najkorisnije informacije o tome koje su vrijednosti međusobno ovisne.\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Odricanje od odgovornosti**: \nOvaj dokument je preveden pomoću AI usluge za prevođenje [Co-op Translator](https://github.com/Azure/co-op-translator). Iako nastojimo osigurati točnost, imajte na umu da automatski prijevodi mogu sadržavati pogreške ili netočnosti. Izvorni dokument na izvornom jeziku treba smatrati autoritativnim izvorom. Za ključne informacije preporučuje se profesionalni prijevod od strane ljudskog prevoditelja. Ne preuzimamo odgovornost za bilo kakve nesporazume ili pogrešne interpretacije koje proizlaze iz korištenja ovog prijevoda.\n"
"\n---\n\n**Odricanje od odgovornosti**: \nOvaj dokument je preveden korištenjem AI usluge za prevođenje [Co-op Translator](https://github.com/Azure/co-op-translator). Iako nastojimo osigurati točnost, imajte na umu da automatski prijevodi mogu sadržavati pogreške ili netočnosti. Izvorni dokument na izvornom jeziku treba smatrati mjerodavnim izvorom. Za ključne informacije preporučuje se profesionalni prijevod od strane stručnjaka. Ne preuzimamo odgovornost za bilo kakva nesporazuma ili pogrešna tumačenja koja mogu proizaći iz korištenja ovog prijevoda.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:19:23+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:57:44+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "hr"
}

File diff suppressed because one or more lines are too long

@ -6,7 +6,7 @@
"## Uvod u vjerojatnost i statistiku\n",
"## Zadatak\n",
"\n",
"U ovom zadatku koristit ćemo skup podataka o pacijentima s dijabetesom preuzet [odavde](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"U ovom zadatku koristit ćemo skup podataka o pacijentima s dijabetesom preuzet [s ove stranice](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,12 +150,12 @@
{
"cell_type": "markdown",
"source": [
"U ovom skupu podataka, stupci su sljedeći: \n",
"* Dob i spol su sami po sebi razumljivi \n",
"* BMI je indeks tjelesne mase \n",
"* BP je prosječni krvni tlak \n",
"* S1 do S6 su različita mjerenja krvi \n",
"* Y je kvalitativna mjera napredovanja bolesti tijekom jedne godine \n",
"U ovom skupu podataka, stupci su sljedeći:\n",
"* Dob i spol su sami po sebi razumljivi\n",
"* BMI je indeks tjelesne mase\n",
"* BP je prosječni krvni tlak\n",
"* S1 do S6 su različita mjerenja krvi\n",
"* Y je kvalitativna mjera napredovanja bolesti tijekom jedne godine\n",
"\n",
"Proučimo ovaj skup podataka koristeći metode vjerojatnosti i statistike.\n",
"\n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -604,7 +604,7 @@
"source": [
"### Zadatak 4: Testirajte korelaciju između različitih varijabli i napredovanja bolesti (Y)\n",
"\n",
"> **Savjet** Korelacijska matrica pružit će vam najkorisnije informacije o tome koje vrijednosti su međusobno ovisne.\n"
"> **Savjet** Korelacijska matrica pružit će vam najkorisnije informacije o tome koje su vrijednosti međusobno ovisne.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -918,7 +918,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Odricanje od odgovornosti**: \nOvaj dokument je preveden pomoću AI usluge za prevođenje [Co-op Translator](https://github.com/Azure/co-op-translator). Iako nastojimo osigurati točnost, imajte na umu da automatski prijevodi mogu sadržavati pogreške ili netočnosti. Izvorni dokument na izvornom jeziku treba smatrati autoritativnim izvorom. Za ključne informacije preporučuje se profesionalni prijevod od strane čovjeka. Ne preuzimamo odgovornost za bilo kakva nesporazuma ili pogrešna tumačenja koja proizlaze iz korištenja ovog prijevoda.\n"
"\n---\n\n**Odricanje od odgovornosti**: \nOvaj dokument je preveden korištenjem AI usluge za prevođenje [Co-op Translator](https://github.com/Azure/co-op-translator). Iako nastojimo osigurati točnost, imajte na umu da automatski prijevodi mogu sadržavati pogreške ili netočnosti. Izvorni dokument na izvornom jeziku treba smatrati mjerodavnim izvorom. Za ključne informacije preporučuje se profesionalni prijevod od strane stručnjaka. Ne preuzimamo odgovornost za bilo kakva nesporazuma ili pogrešna tumačenja koja mogu proizaći iz korištenja ovog prijevoda.\n"
]
}
],
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:25:18+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:58:02+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "hr"
}

@ -6,7 +6,7 @@
"## Bevezetés a valószínűségszámításba és statisztikába\n",
"## Feladat\n",
"\n",
"Ebben a feladatban a cukorbeteg páciensek adatállományát fogjuk használni, amely [innen származik](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Ebben a feladatban a cukorbeteg páciensek adathalmazát fogjuk használni, amely [innen származik](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,14 +149,14 @@
{
"cell_type": "markdown",
"source": [
"Ebben az adatállományban az oszlopok a következők:\n",
"Ebben az adathalmazban az oszlopok a következők:\n",
"* Az életkor és nem magától értetődőek\n",
"* A BMI a testtömeg-indexet jelenti\n",
"* A BP az átlagos vérnyomást jelöli\n",
"* A BMI a testtömeg-index\n",
"* A BP az átlagos vérnyomás\n",
"* Az S1-től S6-ig különböző vérvizsgálati eredmények\n",
"* Az Y a betegség egyéves előrehaladásának kvalitatív mértéke\n",
"\n",
"Vizsgáljuk meg ezt az adatállományt a valószínűség és statisztika módszereivel.\n",
"Vizsgáljuk meg ezt az adathalmazt a valószínűség és statisztika módszereivel.\n",
"\n",
"### Feladat 1: Számítsuk ki az átlagértékeket és a szórást minden értékre\n"
],
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Feladat 2: Készítsen boxplotokat a BMI, BP és Y értékekről nemek szerint\n"
"### 2. feladat: Készítsen boxplotokat a BMI, BP és Y értékekről nemek szerint\n"
],
"metadata": {}
},
@ -198,9 +198,9 @@
{
"cell_type": "markdown",
"source": [
"### 4. feladat: Vizsgálja meg a különböző változók és a betegség előrehaladása (Y) közötti korrelációt\n",
"### Feladat 4: Vizsgáld meg a különböző változók és a betegség előrehaladása (Y) közötti korrelációt\n",
"\n",
"> **Tipp** A korrelációs mátrix nyújtja a leghasznosabb információt arról, hogy mely értékek függenek egymástól.\n"
"> **Tipp** A korrelációs mátrix nyújtja a leghasznosabb információt arról, hogy mely értékek függnek egymástól.\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Felelősség kizárása**: \nEz a dokumentum az AI fordítási szolgáltatás, a [Co-op Translator](https://github.com/Azure/co-op-translator) segítségével lett lefordítva. Bár törekszünk a pontosságra, kérjük, vegye figyelembe, hogy az automatikus fordítások hibákat vagy pontatlanságokat tartalmazhatnak. Az eredeti dokumentum az eredeti nyelvén tekintendő hiteles forrásnak. Kritikus információk esetén javasolt professzionális emberi fordítást igénybe venni. Nem vállalunk felelősséget semmilyen félreértésért vagy téves értelmezésért, amely a fordítás használatából eredhet.\n"
"\n---\n\n**Felelősségkizárás**: \nEz a dokumentum az [Co-op Translator](https://github.com/Azure/co-op-translator) AI fordítási szolgáltatás segítségével készült. Bár törekszünk a pontosságra, kérjük, vegye figyelembe, hogy az automatikus fordítások hibákat vagy pontatlanságokat tartalmazhatnak. Az eredeti dokumentum az eredeti nyelvén tekintendő hiteles forrásnak. Kritikus információk esetén javasolt professzionális, emberi fordítást igénybe venni. Nem vállalunk felelősséget a fordítás használatából eredő félreértésekért vagy téves értelmezésekért.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:19:36+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:49:19+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "hu"
}

File diff suppressed because one or more lines are too long

@ -6,7 +6,7 @@
"## Bevezetés a valószínűségszámításba és statisztikába\n",
"## Feladat\n",
"\n",
"Ebben a feladatban a cukorbeteg páciensek adatállományát fogjuk használni, amely [innen származik](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Ebben a feladatban a cukorbeteg páciensek adathalmazát fogjuk használni, amely [innen származik](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"Ebben az adathalmazban az oszlopok a következők:\n",
"* Az életkor és nem magától értetődőek\n",
"* A BMI a testtömeg-indexet jelenti\n",
"* A BP az átlagos vérnyomást jelöli\n",
"* Az S1-től S6-ig különböző vérvizsgálati eredmények\n",
"* Az Y a betegség egyéves előrehaladásának kvalitatív mértéke\n",
"Ebben az adathalmazban az oszlopok a következők: \n",
"* Az életkor és a nem magától értetődő \n",
"* A BMI a testtömegindex \n",
"* A BP az átlagos vérnyomás \n",
"* Az S1-től S6-ig különböző vérvizsgálati eredmények \n",
"* Az Y a betegség egyéves előrehaladásának kvalitatív mértéke \n",
"\n",
"Vizsgáljuk meg ezt az adathalmazt a valószínűség és statisztika módszereivel.\n",
"Vizsgáljuk meg ezt az adathalmazt a valószínűség és a statisztika módszereivel.\n",
"\n",
"### Feladat 1: Számítsuk ki az átlagértékeket és a szórást minden értékre\n"
"### 1. feladat: Számítsuk ki az összes érték átlagát és szórását\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Feladat 2: Készítsen boxplotokat a BMI, BP és Y értékekről nemek szerint\n"
"### 2. feladat: Készítsen boxplotokat a BMI, BP és Y értékekről nemek szerint\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -591,7 +591,7 @@
"cell_type": "markdown",
"source": [
"Következtetések:\n",
"* Kor - normális\n",
"* Életkor - normális\n",
"* Nem - egységes\n",
"* BMI, Y - nehéz megállapítani\n"
],
@ -602,7 +602,7 @@
"source": [
"### Feladat 4: Vizsgáld meg a különböző változók és a betegség előrehaladása (Y) közötti korrelációt\n",
"\n",
"> **Tipp** A korrelációs mátrix nyújtja a leghasznosabb információt arról, hogy mely értékek függnek egymástól.\n"
"> **Tipp** A korrelációs mátrix nyújtja a leghasznosabb információt arról, hogy mely értékek függenek egymástól.\n"
],
"metadata": {}
},
@ -844,8 +844,8 @@
{
"cell_type": "markdown",
"source": [
"Következtetés:\n",
"* Y legerősebb korrelációja a BMI és az S5 (vércukor). Ez logikusnak tűnik.\n"
"Következtetés: \n",
"* Az Y legerősebb korrelációja a BMI-vel és az S5-tel (vércukor). Ez ésszerűnek tűnik.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Felelősség kizárása**: \nEz a dokumentum az AI fordítási szolgáltatás, a [Co-op Translator](https://github.com/Azure/co-op-translator) segítségével lett lefordítva. Bár törekszünk a pontosságra, kérjük, vegye figyelembe, hogy az automatikus fordítások hibákat vagy pontatlanságokat tartalmazhatnak. Az eredeti dokumentum az eredeti nyelvén tekintendő hiteles forrásnak. Kritikus információk esetén javasolt professzionális emberi fordítást igénybe venni. Nem vállalunk felelősséget semmilyen félreértésért vagy téves értelmezésért, amely a fordítás használatából eredhet.\n"
"\n---\n\n**Felelősségkizárás**: \nEz a dokumentum az [Co-op Translator](https://github.com/Azure/co-op-translator) AI fordítási szolgáltatás segítségével készült. Bár törekszünk a pontosságra, kérjük, vegye figyelembe, hogy az automatikus fordítások hibákat vagy pontatlanságokat tartalmazhatnak. Az eredeti dokumentum az eredeti nyelvén tekintendő hiteles forrásnak. Kritikus információk esetén javasolt professzionális, emberi fordítást igénybe venni. Nem vállalunk felelősséget a fordítás használatából eredő félreértésekért vagy téves értelmezésekért.\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:25:34+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:49:36+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "hu"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,12 +149,12 @@
{
"cell_type": "markdown",
"source": [
"Dalam dataset ini, kolom-kolomnya adalah sebagai berikut: \n",
"* Usia dan jenis kelamin sudah jelas. \n",
"* BMI adalah indeks massa tubuh. \n",
"* BP adalah tekanan darah rata-rata. \n",
"* S1 hingga S6 adalah berbagai pengukuran darah. \n",
"* Y adalah ukuran kualitatif dari perkembangan penyakit selama satu tahun. \n",
"Dalam dataset ini, kolom-kolomnya adalah sebagai berikut:\n",
"* Usia dan jenis kelamin sudah jelas\n",
"* BMI adalah indeks massa tubuh\n",
"* BP adalah tekanan darah rata-rata\n",
"* S1 hingga S6 adalah berbagai pengukuran darah\n",
"* Y adalah ukuran kualitatif dari perkembangan penyakit selama satu tahun\n",
"\n",
"Mari kita pelajari dataset ini menggunakan metode probabilitas dan statistik.\n",
"\n",
@ -200,7 +200,7 @@
"source": [
"### Tugas 4: Uji korelasi antara berbagai variabel dan perkembangan penyakit (Y)\n",
"\n",
"> **Petunjuk** Matriks korelasi akan memberikan informasi paling berguna tentang nilai-nilai yang saling bergantung.\n"
"> **Petunjuk** Matriks korelasi akan memberikan informasi paling berguna tentang nilai-nilai mana yang saling bergantung.\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan layanan penerjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Meskipun kami berusaha untuk memberikan hasil yang akurat, harap diingat bahwa terjemahan otomatis mungkin mengandung kesalahan atau ketidakakuratan. Dokumen asli dalam bahasa aslinya harus dianggap sebagai sumber yang otoritatif. Untuk informasi yang bersifat kritis, disarankan menggunakan jasa penerjemahan profesional oleh manusia. Kami tidak bertanggung jawab atas kesalahpahaman atau penafsiran yang keliru yang timbul dari penggunaan terjemahan ini.\n"
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan layanan penerjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Meskipun kami berupaya untuk memberikan hasil yang akurat, harap diperhatikan bahwa terjemahan otomatis mungkin mengandung kesalahan atau ketidakakuratan. Dokumen asli dalam bahasa aslinya harus dianggap sebagai sumber yang berwenang. Untuk informasi yang bersifat kritis, disarankan menggunakan jasa penerjemahan manusia profesional. Kami tidak bertanggung jawab atas kesalahpahaman atau penafsiran yang keliru yang timbul dari penggunaan terjemahan ini.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:19:47+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:43:52+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "id"
}
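
Most code-cell hunks in this commit differ only in line endings: each `source` string that ended in `\r\n` now ends in `\n`. A minimal sketch of that normalization follows, assuming the notebook is ordinary nbformat JSON with a `cells` list whose `source` is either a string or a list of strings. The function name `normalize_line_endings` is hypothetical, and this is not necessarily how Co-op Translator produces the change.

```python
import json


def normalize_line_endings(notebook: dict) -> dict:
    """Return a copy of a notebook dict with CRLF replaced by LF in cell sources."""
    # Round-trip through JSON to get a deep copy without mutating the input.
    result = json.loads(json.dumps(notebook))
    for cell in result.get("cells", []):
        source = cell.get("source", [])
        if isinstance(source, str):
            # nbformat also allows a single string instead of a list of lines.
            cell["source"] = source.replace("\r\n", "\n")
        else:
            cell["source"] = [line.replace("\r\n", "\n") for line in source]
    return result


# Illustrative input modeled on the first code cell in the hunks above.
example = {
    "cells": [
        {
            "cell_type": "code",
            "source": [
                "import pandas as pd\r\n",
                "import numpy as np\r\n",
                "\r\n",
                "df.head()",
            ],
        }
    ]
}

cleaned = normalize_line_endings(example)
print(cleaned["cells"][0]["source"])
```

Working on a copy keeps the original dict available for comparison, which is convenient when checking that only line endings changed.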

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,12 +150,12 @@
{
"cell_type": "markdown",
"source": [
"Dalam dataset ini, kolom-kolomnya adalah sebagai berikut: \n",
"* Usia dan jenis kelamin sudah jelas. \n",
"* BMI adalah indeks massa tubuh. \n",
"* BP adalah tekanan darah rata-rata. \n",
"* S1 hingga S6 adalah berbagai pengukuran darah. \n",
"* Y adalah ukuran kualitatif dari perkembangan penyakit selama satu tahun. \n",
"Dalam dataset ini, kolom-kolomnya adalah sebagai berikut:\n",
"* Usia dan jenis kelamin sudah jelas\n",
"* BMI adalah indeks massa tubuh\n",
"* BP adalah tekanan darah rata-rata\n",
"* S1 hingga S6 adalah berbagai pengukuran darah\n",
"* Y adalah ukuran kualitatif dari perkembangan penyakit selama satu tahun\n",
"\n",
"Mari kita pelajari dataset ini menggunakan metode probabilitas dan statistik.\n",
"\n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Tugas 2: Plot boxplot untuk BMI, BP, dan Y berdasarkan jenis kelamin\n"
"### Tugas 2: Buat boxplot untuk BMI, BP, dan Y berdasarkan jenis kelamin\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -593,7 +593,7 @@
"Kesimpulan:\n",
"* Usia - normal \n",
"* Jenis Kelamin - seragam \n",
"* BMI, Y - sulit untuk menentukan \n"
"* BMI, Y - sulit untuk ditentukan \n"
],
"metadata": {}
},
@ -602,7 +602,7 @@
"source": [
"### Tugas 4: Uji korelasi antara berbagai variabel dan perkembangan penyakit (Y)\n",
"\n",
"> **Hint** Matriks korelasi akan memberikan informasi paling berguna tentang nilai-nilai yang saling bergantung.\n"
"> **Petunjuk** Matriks korelasi akan memberikan informasi paling berguna tentang nilai-nilai mana yang saling bergantung.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan layanan penerjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Meskipun kami berupaya untuk memberikan hasil yang akurat, harap diperhatikan bahwa terjemahan otomatis mungkin mengandung kesalahan atau ketidakakuratan. Dokumen asli dalam bahasa aslinya harus dianggap sebagai sumber yang berwenang. Untuk informasi yang bersifat kritis, disarankan menggunakan jasa penerjemahan manusia profesional. Kami tidak bertanggung jawab atas kesalahpahaman atau interpretasi yang keliru yang timbul dari penggunaan terjemahan ini.\n"
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan layanan penerjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Meskipun kami berupaya untuk memberikan hasil yang akurat, harap diperhatikan bahwa terjemahan otomatis mungkin mengandung kesalahan atau ketidakakuratan. Dokumen asli dalam bahasa aslinya harus dianggap sebagai sumber yang berwenang. Untuk informasi yang bersifat kritis, disarankan menggunakan jasa penerjemahan manusia profesional. Kami tidak bertanggung jawab atas kesalahpahaman atau penafsiran yang keliru yang timbul dari penggunaan terjemahan ini.\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:25:50+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:44:07+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "id"
}
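
The last code cell of this solution notebook calls `ttest_ind` with `equal_var=False`, which selects Welch's t-test. As a rough illustration of what that statistic measures, the sketch below computes it with the standard library only. The function name `welch_t` and the two sample lists are made up for illustration; they are not values from `diabetes.tsv`, and the p-value that SciPy also reports is not computed here.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    # Variance of each group mean, based on the sample (n-1) variance.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


# Hypothetical samples standing in for the Y values of the two SEX groups.
group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group_2 = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_1, group_2)
print(f"T-value = {t_stat:.2f}, degrees of freedom = {dof:.2f}")
```

In the notebook cell the groups are selected with `['Y']`, which yields one-column DataFrames, so SciPy returns length-1 arrays and the `print` indexes them with `[0]`. Selecting `'Y'` as a Series should return scalars instead.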

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"In questo dataset, le colonne sono le seguenti: \n",
"* Età e sesso sono autoesplicativi \n",
"* BMI è l'indice di massa corporea \n",
"* BP è la pressione sanguigna media \n",
"* S1 fino a S6 sono diverse misurazioni del sangue \n",
"* Y è la misura qualitativa della progressione della malattia nell'arco di un anno \n",
"In questo dataset, le colonne sono le seguenti:\n",
"* Età e sesso sono autoesplicativi\n",
"* BMI è l'indice di massa corporea\n",
"* BP è la pressione sanguigna media\n",
"* S1 fino a S6 sono diverse misurazioni del sangue\n",
"* Y è la misura qualitativa della progressione della malattia nell'arco di un anno\n",
"\n",
"Studiamo questo dataset utilizzando metodi di probabilità e statistica.\n",
"\n",
"### Attività 1: Calcolare i valori medi e la varianza per tutti i valori \n"
"### Compito 1: Calcolare i valori medi e la varianza per tutti i valori\n"
],
"metadata": {}
},
@ -196,7 +196,7 @@
{
"cell_type": "markdown",
"source": [
"### Compito 4: Testare la correlazione tra diverse variabili e la progressione della malattia (Y)\n",
"### Attività 4: Testare la correlazione tra diverse variabili e la progressione della malattia (Y)\n",
"\n",
"> **Suggerimento** La matrice di correlazione ti fornirà le informazioni più utili su quali valori sono dipendenti.\n"
],
@ -221,7 +221,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nQuesto documento è stato tradotto utilizzando il servizio di traduzione automatica [Co-op Translator](https://github.com/Azure/co-op-translator). Sebbene ci impegniamo per garantire l'accuratezza, si prega di notare che le traduzioni automatiche possono contenere errori o imprecisioni. Il documento originale nella sua lingua nativa dovrebbe essere considerato la fonte autorevole. Per informazioni critiche, si raccomanda una traduzione professionale effettuata da un traduttore umano. Non siamo responsabili per eventuali fraintendimenti o interpretazioni errate derivanti dall'uso di questa traduzione.\n"
"\n---\n\n**Disclaimer**: \nQuesto documento è stato tradotto utilizzando il servizio di traduzione automatica [Co-op Translator](https://github.com/Azure/co-op-translator). Sebbene ci impegniamo per garantire l'accuratezza, si prega di notare che le traduzioni automatiche possono contenere errori o imprecisioni. Il documento originale nella sua lingua nativa dovrebbe essere considerato la fonte autorevole. Per informazioni critiche, si consiglia una traduzione professionale eseguita da un traduttore umano. Non siamo responsabili per eventuali fraintendimenti o interpretazioni errate derivanti dall'uso di questa traduzione.\n"
]
}
],
@ -247,8 +247,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:19:57+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:27:46+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "it"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -483,8 +483,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -533,8 +533,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -591,7 +591,7 @@
"Conclusioni:\n",
"* Età - normale\n",
"* Sesso - uniforme\n",
"* BMI, Y - difficile da dire\n"
"* BMI, Y - difficile da determinare\n"
],
"metadata": {}
},
@ -851,10 +851,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -881,9 +881,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -912,7 +912,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nQuesto documento è stato tradotto utilizzando il servizio di traduzione automatica [Co-op Translator](https://github.com/Azure/co-op-translator). Sebbene ci impegniamo per garantire l'accuratezza, si prega di notare che le traduzioni automatiche possono contenere errori o imprecisioni. Il documento originale nella sua lingua nativa dovrebbe essere considerato la fonte autorevole. Per informazioni critiche, si raccomanda una traduzione professionale effettuata da un traduttore umano. Non siamo responsabili per eventuali incomprensioni o interpretazioni errate derivanti dall'uso di questa traduzione.\n"
"\n---\n\n**Disclaimer**: \nQuesto documento è stato tradotto utilizzando il servizio di traduzione automatica [Co-op Translator](https://github.com/Azure/co-op-translator). Sebbene ci impegniamo per garantire l'accuratezza, si prega di notare che le traduzioni automatiche possono contenere errori o imprecisioni. Il documento originale nella sua lingua nativa dovrebbe essere considerato la fonte autorevole. Per informazioni critiche, si consiglia una traduzione professionale eseguita da un traduttore umano. Non siamo responsabili per eventuali fraintendimenti o interpretazioni errate derivanti dall'uso di questa traduzione.\n"
]
}
],
@ -938,8 +938,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:26:04+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:28:00+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "it"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## 確率と統計の入門 \n",
"## 課題 \n",
"## 確率と統計の入門\n",
"## 課題\n",
"\n",
"この課題では、[こちら](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)から取得した糖尿病患者のデータセットを使用します。 \n"
"この課題では、[こちら](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)から取得した糖尿病患者のデータセットを使用します。\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### タスク3: 年齢、性別、BMI、Y変数の分布は何ですか?\n"
"### タスク3: 年齢、性別、BMI、およびY変数の分布はどうなっていますか?\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### タスク 4: 異なる変数と病気の進行 (Y) の相関をテストする\n",
"\n",
"> **ヒント** 相関行列は、どの値が依存しているかについて最も有用な情報を提供します。\n"
"> **ヒント** 相関行列は、どの値が依存しているかについて最も有用な情報を提供します。\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責事項**: \nこの文書はAI翻訳サービス[Co-op Translator](https://github.com/Azure/co-op-translator)を使用して翻訳されています。正確性を追求しておりますが、自動翻訳には誤りや不正確な部分が含まれる可能性があります。元の言語で記載された文書を正式な情報源としてご参照ください。重要な情報については、専門の人間による翻訳を推奨します。この翻訳の使用に起因する誤解や誤解釈について、当社は責任を負いません。\n"
"\n---\n\n**免責事項**: \nこの文書はAI翻訳サービス [Co-op Translator](https://github.com/Azure/co-op-translator) を使用して翻訳されています。正確性を期すよう努めておりますが、自動翻訳には誤りや不正確な表現が含まれる可能性があります。元の言語で記載された原文を公式な情報源としてご参照ください。重要な情報については、専門の人間による翻訳を推奨します。この翻訳の利用に起因する誤解や誤認について、当方は一切の責任を負いません。\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:20:10+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:15:31+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ja"
}

File diff suppressed because one or more lines are too long

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## 確率と統計の入門 \n",
"## 課題 \n",
"## 確率と統計の入門\n",
"## 課題\n",
"\n",
"この課題では、[こちら](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)から取得した糖尿病患者のデータセットを使用します。 \n"
"この課題では、[こちら](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)から取得した糖尿病患者のデータセットを使用します。\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,14 +150,14 @@
{
"cell_type": "markdown",
"source": [
"このデータセットには、以下の列が含まれています:\n",
"* 年齢と性別はそのままの意味です\n",
"このデータセットには以下の列があります:\n",
"* 年齢と性別はそのまま説明不要です\n",
"* BMIは体格指数を表します\n",
"* BPは平均血圧を表します\n",
"* S1からS6は異なる血液測定値を表します\n",
"* Yは1年間の病気の進行度を定性的に測定したものです\n",
"* BPは平均血圧を示します\n",
"* S1からS6は異なる血液測定値です\n",
"* Yは1年間の疾患進行の定性的な指標です\n",
"\n",
"このデータセットを確率と統計の手法を用いて分析してみましょう。\n",
"このデータセットを確率と統計の手法を使って分析してみましょう。\n",
"\n",
"### タスク 1: 全ての値の平均値と分散を計算する\n"
],
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### タスク3: 年齢、性別、BMI、Y変数の分布は何ですか?\n"
"### タスク3: 年齢、性別、BMI、およびY変数の分布はどうなっていますか?\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -602,9 +602,9 @@
{
"cell_type": "markdown",
"source": [
"### タスク 4: 異なる変数と病気の進行 (Y) の相関をテストする\n",
"### タスク 4: 異なる変数と病気の進行 (Y) の相関をテストする\n",
"\n",
"> **ヒント** 相関行列は、どの値が依存しているかについて最も有用な情報を提供します。\n"
"> **ヒント** 相関行列は、どの値が依存しているかについて最も有用な情報を提供します。\n"
],
"metadata": {}
},
@ -846,8 +846,8 @@
{
"cell_type": "markdown",
"source": [
"結論:\n",
"* Yと最も強い相関はBMIとS5血糖値です。これは妥当な結果のように思われます。\n"
"結論: \n",
"* Yと最も強い相関があるのはBMIとS5血糖値です。これは妥当な結果のように思われます。\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責事項**: \nこの文書はAI翻訳サービス[Co-op Translator](https://github.com/Azure/co-op-translator)を使用して翻訳されています。正確性を追求しておりますが、自動翻訳には誤りや不正確な部分が含まれる可能性があります。元の言語で記載された文書を正式な情報源としてお考えください。重要な情報については、専門の人間による翻訳を推奨します。この翻訳の使用に起因する誤解や誤った解釈について、当方は一切の責任を負いません。\n"
"\n---\n\n**免責事項**: \nこの文書はAI翻訳サービス [Co-op Translator](https://github.com/Azure/co-op-translator) を使用して翻訳されています。正確性を期すよう努めておりますが、自動翻訳には誤りや不正確な表現が含まれる可能性があります。元の言語で記載された原文を公式な情報源としてご参照ください。重要な情報については、専門の人間による翻訳を推奨します。本翻訳の利用に起因する誤解や誤認について、当社は一切の責任を負いません。\n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:26:26+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:15:47+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "ja"
}

@ -6,7 +6,7 @@
"## 확률과 통계 소개\n",
"## 과제\n",
"\n",
"이 과제에서는 [여기에서](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) 가져온 당뇨병 환자 데이터셋을 사용할 것입니다.\n"
"이 과제에서는 [여기에서 가져온](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) 당뇨병 환자 데이터셋을 사용할 것입니다.\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"이 데이터셋의 열은 다음과 같습니다: \n",
"* Age와 sex는 별도의 설명이 필요 없습니다. \n",
"* BMI는 체질량지수입니다. \n",
"* BP는 평균 혈압입니다. \n",
"* S1부터 S6까지는 서로 다른 혈액 측정값입니다. \n",
"* Y는 1년 동안의 질병 진행 정도를 나타내는 정성적 지표입니다. \n",
"이 데이터셋의 열은 다음과 같습니다:\n",
"* Age와 sex는 별도의 설명이 필요 없습니다\n",
"* BMI는 체질량지수입니다\n",
"* BP는 평균 혈압입니다\n",
"* S1부터 S6까지는 서로 다른 혈액 측정값입니다\n",
"* Y는 1년 동안 질병 진행의 정성적 측정값입니다\n",
"\n",
"이 데이터셋을 확률과 통계 방법을 사용하여 분석해 봅시다.\n",
"\n",
"### 작업 1: 모든 값에 대한 평균값과 분산 계산\n"
"### 작업 1: 모든 값의 평균값과 분산을 계산하세요\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Task 2: 성별에 따라 BMI, BP 및 Y의 박스플롯 그리기\n"
"### 작업 2: 성별에 따라 BMI, BP 및 Y에 대한 박스플롯 그리기\n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### 작업 3: 연령, 성별, BMI 및 Y 변수의 분포는 무엇입니까?\n"
"### 작업 3: 나이, 성별, BMI 및 Y 변수의 분포는 무엇인가요?\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### 작업 4: 다양한 변수와 질병 진행(Y) 간의 상관관계 테스트\n",
"\n",
"> **힌트** 상관관계 행렬은 어떤 값들이 서로 의존적인지에 대한 가장 유용한 정보를 제공합니다.\n"
"> **힌트** 상관 행렬은 어떤 값들이 서로 의존적인지에 대한 가장 유용한 정보를 제공합니다.\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있지만, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서의 원어 버전을 권위 있는 출처로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생하는 오해나 잘못된 해석에 대해 책임을 지지 않습니다.\n"
"\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있으나, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서(원어로 작성된 문서)를 권위 있는 자료로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생하는 오해나 잘못된 해석에 대해 당사는 책임을 지지 않습니다. \n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:20:23+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:16:58+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ko"
}

File diff suppressed because one or more lines are too long

@ -6,7 +6,7 @@
"## 확률과 통계 소개\n",
"## 과제\n",
"\n",
"이 과제에서는 [여기에서](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) 가져온 당뇨병 환자 데이터셋을 사용할 것입니다.\n"
"이 과제에서는 [여기에서 가져온](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) 당뇨병 환자 데이터셋을 사용할 것입니다.\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +529,7 @@
{
"cell_type": "markdown",
"source": [
"### 작업 3: Age, Sex, BMI 및 Y 변수의 분포는 무엇입니까?\n"
"### 작업 3: 나이, 성별, BMI 및 Y 변수의 분포는 무엇입니까?\n"
],
"metadata": {}
},
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -602,9 +602,9 @@
{
"cell_type": "markdown",
"source": [
"### 작업 4: 다양한 변수와 질병 진행(Y) 간의 상관관계 테스트하세요\n",
"### 작업 4: 다양한 변수와 질병 진행(Y) 간의 상관관계 테스트\n",
"\n",
"> **힌트** 상관관계 행렬은 어떤 값들이 서로 의존적인지에 대한 가장 유용한 정보를 제공합니다.\n"
"> **힌트** 상관 행렬은 어떤 값들이 서로 의존적인지에 대한 가장 유용한 정보를 제공합니다.\n"
],
"metadata": {}
},
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"결론: \n",
"* Y와 가장 강한 상관관계를 가진 변수는 BMI와 S5(혈당)입니다. 이는 합리적으로 들립니다.\n"
"* Y와 가장 강한 상관관계를 보이는 것은 BMI와 S5(혈당)입니다. 이는 합리적으로 들립니다.\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있지만, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서를 해당 언어로 작성된 상태에서 권위 있는 자료로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생하는 오해나 잘못된 해석에 대해 책임을 지지 않습니다.\n"
"\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있으나, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서를 해당 언어로 작성된 상태에서 권위 있는 자료로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생할 수 있는 오해나 잘못된 해석에 대해 당사는 책임을 지지 않습니다. \n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:26:42+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:17:21+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "ko"
}

@ -6,7 +6,7 @@
"## Įvadas į tikimybes ir statistiką\n",
"## Užduotis\n",
"\n",
"Šioje užduotyje naudosime diabetu sergančių pacientų duomenų rinkinį, paimtą [iš čia](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Šioje užduotyje naudosime diabeto pacientų duomenų rinkinį, paimtą [iš čia](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -152,8 +152,8 @@
"Šiame duomenų rinkinyje stulpeliai yra tokie: \n",
"* Amžius ir lytis yra savaime suprantami \n",
"* KMI yra kūno masės indeksas \n",
"* AKS yra vidutinis kraujo spaudimas \n",
"* S1 iki S6 yra skirtingi kraujo matavimai \n",
"* AKS yra vidutinis kraujospūdis \n",
"* S1 iki S6 yra skirtingi kraujo tyrimų rodikliai \n",
"* Y yra kokybinis ligos progresavimo matas per vienerius metus \n",
"\n",
"Išnagrinėkime šį duomenų rinkinį naudodami tikimybių ir statistikos metodus.\n",
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Užduotis 2: Nubraižykite dėžės diagramas BMI, BP ir Y priklausomai nuo lyties\n"
"### Užduotis 2: Nubraižykite BMI, BP ir Y dėžutinius grafikus pagal lytį\n"
],
"metadata": {}
},
@ -186,7 +186,7 @@
{
"cell_type": "markdown",
"source": [
"### Užduotis 3: Koks yra amžiaus, lyties, KMI ir Y kintamųjų pasiskirstymas?\n"
"### Užduotis 3: Kokia yra Amžiaus, Lyties, KMI ir Y kintamųjų pasiskirstymas?\n"
],
"metadata": {}
},
@ -200,9 +200,9 @@
{
"cell_type": "markdown",
"source": [
"### Užduotis 4: Patikrinti koreliaciją tarp skirtingų kintamųjų ir ligos progresavimo (Y)\n",
"### Užduotis 4: Ištirkite skirtingų kintamųjų ir ligos progresavimo (Y) koreliaciją\n",
"\n",
"> **Patarimas** Koreliacijos matrica suteiks naudingiausią informaciją apie tai, kurie rodikliai yra priklausomi.\n"
"> **Patarimas** Koreliacijos matrica suteiks jums naudingiausią informaciją apie tai, kurie dydžiai yra tarpusavyje priklausomi.\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Atsakomybės apribojimas**: \nŠis dokumentas buvo išverstas naudojant AI vertimo paslaugą [Co-op Translator](https://github.com/Azure/co-op-translator). Nors siekiame tikslumo, prašome atkreipti dėmesį, kad automatiniai vertimai gali turėti klaidų ar netikslumų. Originalus dokumentas jo gimtąja kalba turėtų būti laikomas autoritetingu šaltiniu. Kritinei informacijai rekomenduojama profesionali žmogaus vertimo paslauga. Mes neprisiimame atsakomybės už nesusipratimus ar klaidingus interpretavimus, atsiradusius dėl šio vertimo naudojimo.\n"
"\n---\n\n**Atsakomybės apribojimas**: \nŠis dokumentas buvo išverstas naudojant dirbtinio intelekto vertimo paslaugą [Co-op Translator](https://github.com/Azure/co-op-translator). Nors siekiame tikslumo, atkreipiame dėmesį, kad automatiniai vertimai gali turėti klaidų ar netikslumų. Originalus dokumentas jo gimtąja kalba turėtų būti laikomas autoritetingu šaltiniu. Dėl svarbios informacijos rekomenduojame kreiptis į profesionalius vertėjus. Mes neprisiimame atsakomybės už nesusipratimus ar klaidingus aiškinimus, kylančius dėl šio vertimo naudojimo.\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-01T23:20:36+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T18:03:38+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "lt"
}

@ -6,7 +6,7 @@
"## Įvadas į tikimybes ir statistiką\n",
"## Užduotis\n",
"\n",
"Šioje užduotyje naudosime diabetu sergančių pacientų duomenų rinkinį, paimtą [iš čia](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Šioje užduotyje naudosime diabeto pacientų duomenų rinkinį, paimtą [iš čia](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,12 +150,13 @@
{
"cell_type": "markdown",
"source": [
"Šiame duomenų rinkinyje stulpeliai yra tokie: \n",
"* Amžius ir lytis yra savaime suprantami \n",
"* KMI yra kūno masės indeksas \n",
"* AKS yra vidutinis kraujo spaudimas \n",
"* S1 iki S6 yra skirtingi kraujo matavimai \n",
"* Y yra kokybinis ligos progresavimo matas per vienerius metus \n",
"Šiame duomenų rinkinyje stulpeliai yra tokie:\n",
"\n",
"* Amžius ir lytis yra savaime suprantami\n",
"* KMI yra kūno masės indeksas\n",
"* AKS yra vidutinis kraujo spaudimas\n",
"* S1 iki S6 yra skirtingi kraujo matavimai\n",
"* Y yra kokybinis ligos progresavimo matas per vienerius metus\n",
"\n",
"Išnagrinėkime šį duomenų rinkinį naudodami tikimybių ir statistikos metodus.\n",
"\n",
@ -354,7 +355,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +447,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +478,7 @@
{
"cell_type": "markdown",
"source": [
"### Užduotis 2: Nubraižykite dėžės diagramas BMI, BP ir Y priklausomai nuo lyties\n"
"### Užduotis 2: Nubraižykite BMI, BP ir Y dėžutės diagramas pagal lytį\n"
],
"metadata": {}
},
@ -485,8 +486,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -529,7 +530,7 @@
{
"cell_type": "markdown",
"source": [
"### Užduotis 3: Koks yra amžiaus, lyties, KMI ir Y kintamųjų pasiskirstymas?\n"
"### Užduotis 3: Kokia yra Amžiaus, Lyties, KMI ir Y kintamųjų pasiskirstymo analizė?\n"
],
"metadata": {}
},
@ -537,8 +538,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -593,16 +594,16 @@
"cell_type": "markdown",
"source": [
"Išvados: \n",
"* Amžius - normalus \n",
"* Lytis - vienoda \n",
"* KMI, Y - sunku pasakyti \n"
"* Amžius normalus \n",
"* Lytis vienoda \n",
"* KMI, Y sunku pasakyti \n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Užduotis 4: Patikrinti koreliaciją tarp skirtingų kintamųjų ir ligos progresavimo (Y)\n",
"### Užduotis 4: Ištirkite koreliaciją tarp skirtingų kintamųjų ir ligos progresavimo (Y)\n",
"\n",
"> **Patarimas** Koreliacijos matrica suteiks naudingiausią informaciją apie tai, kurie kintamieji yra priklausomi.\n"
],
@ -847,7 +848,7 @@
"cell_type": "markdown",
"source": [
"Išvada: \n",
"* Stipriausia Y koreliacija yra su KMI ir S5 (cukraus kiekis kraujyje). Tai skamba logiškai.\n"
"* Stipriausia Y koreliacija yra su KMI ir S5 (cukraus kiekis kraujyje). Tai atrodo logiška.\n"
],
"metadata": {}
},
@ -855,10 +856,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +886,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +917,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Atsakomybės apribojimas**: \nŠis dokumentas buvo išverstas naudojant AI vertimo paslaugą [Co-op Translator](https://github.com/Azure/co-op-translator). Nors siekiame tikslumo, prašome atkreipti dėmesį, kad automatiniai vertimai gali turėti klaidų ar netikslumų. Originalus dokumentas jo gimtąja kalba turėtų būti laikomas autoritetingu šaltiniu. Kritinei informacijai rekomenduojama profesionali žmogaus vertimo paslauga. Mes neprisiimame atsakomybės už nesusipratimus ar klaidingus interpretavimus, atsiradusius naudojant šį vertimą.\n"
"\n---\n\n**Atsakomybės apribojimas**: \nŠis dokumentas buvo išverstas naudojant dirbtinio intelekto vertimo paslaugą [Co-op Translator](https://github.com/Azure/co-op-translator). Nors siekiame tikslumo, atkreipiame dėmesį, kad automatiniai vertimai gali turėti klaidų ar netikslumų. Originalus dokumentas jo gimtąja kalba turėtų būti laikomas autoritetingu šaltiniu. Dėl svarbios informacijos rekomenduojame kreiptis į profesionalius vertėjus. Mes neprisiimame atsakomybės už nesusipratimus ar klaidingus aiškinimus, kylančius dėl šio vertimo naudojimo.\n"
]
}
],
@ -942,8 +943,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-01T23:26:59+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T18:03:59+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "lt"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## 概率與統計簡介\n",
"## 作業\n",
"## 概論:機率與統計 \n",
"## 作業 \n",
"\n",
"在這次作業中,我們將使用糖尿病患者的數據集,該數據集取自[此處](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)。\n"
"在這次作業中,我們將使用糖尿病患者的數據集,該數據集取自[這裡](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)。 \n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"在這個數據集中,欄位如下: \n",
"* 年齡和性別不需多作解釋 \n",
"* BMI 是身體質量指數 \n",
"* BP 是平均血壓 \n",
"* S1 到 S6 是不同的血液測量值 \n",
"* Y 是一年內疾病進展的定性指標 \n",
"在此數據集中,欄位如下:\n",
"* 年齡和性別不需額外解釋\n",
"* BMI 是身體質量指數\n",
"* BP 是平均血壓\n",
"* S1 到 S6 是不同的血液測量值\n",
"* Y 是疾病在一年內進展的定性指標\n",
"\n",
"讓我們使用機率與統計的方法來研究這個數據集。\n",
"讓我們使用概率和統計方法來研究這個數據集。\n",
"\n",
"### 任務 1：計算所有值的平均值和變異數\n"
"### 任務 1：計算所有值的平均值和方差\n"
],
"metadata": {}
},
@ -198,9 +198,9 @@
{
"cell_type": "markdown",
"source": [
"### 任務 4：測試不同變數與疾病進展（Y）之間的相關性\n",
"### 任務 4：測試不同變數與疾病進展 (Y) 之間的相關性\n",
"\n",
"> **提示** 相關矩陣可以提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
"> **提示** 相關矩陣可以提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n本文件使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或錯誤解讀概不負責。\n"
"\n---\n\n**免責聲明** \n本文件使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對於因使用此翻譯而產生的任何誤解或錯誤解讀概不負責。\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:43:11+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:11:24+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "mo"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## 概率與統計簡介\n",
"## 概率與統計簡介\n",
"## 作業\n",
"\n",
"在這次作業中,我們將使用糖尿病患者的數據集,該數據集取自[此處](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)。\n"
"在這次作業中,我們將使用從[這裡](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)取得的糖尿病患者數據集。\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,8 +150,8 @@
{
"cell_type": "markdown",
"source": [
"在此數據集中,欄位如下\n",
"* 年齡和性別不需額外解釋\n",
"在此數據集中,列包含以下內容\n",
"* 年齡和性別不需多作解釋\n",
"* BMI 是身體質量指數\n",
"* BP 是平均血壓\n",
"* S1 到 S6 是不同的血液測量值\n",
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -600,9 +600,9 @@
{
"cell_type": "markdown",
"source": [
"### 任務 4：測試不同變數與疾病進展（Y）之間的相關性\n",
"### 任務 4：測試不同變數與疾病進展 (Y) 之間的相關性\n",
"\n",
"> **提示** 相關矩陣可以提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
"> **提示** 相關矩陣可以提供最有用的資訊,幫助判斷哪些值是相互依賴的。\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**免責聲明** \n本文件使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而引起的任何誤解或錯誤解讀概不負責。 \n"
"\n---\n\n**免責聲明** \n本文件使用 AI 翻譯服務 [Co-op Translator](https://github.com/Azure/co-op-translator) 進行翻譯。我們致力於提供準確的翻譯,但請注意,自動翻譯可能包含錯誤或不準確之處。應以原始語言的文件作為權威來源。對於關鍵資訊,建議尋求專業人工翻譯。我們對因使用此翻譯而產生的任何誤解或錯誤解讀概不負責。\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:49:43+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:11:39+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "mo"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -196,7 +196,7 @@
{
"cell_type": "markdown",
"source": [
"### कार्य ४: विविध चल आणि आजार प्रगती (Y) यांच्यातील परस्परसंबंध तपासा\n",
"### कार्य ४: विविध चल आणि रोग प्रगती (Y) यांच्यातील परस्परसंबंध तपासा\n",
"\n",
"> **सूचना** परस्परसंबंध मॅट्रिक्स तुम्हाला कोणते मूल्ये परस्पर अवलंबून आहेत याबद्दल सर्वात उपयुक्त माहिती देईल.\n"
],
@ -221,7 +221,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nहा दस्तऐवज AI भाषांतर सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) चा वापर करून भाषांतरित करण्यात आला आहे. आम्ही अचूकतेसाठी प्रयत्नशील असलो तरी कृपया लक्षात ठेवा की स्वयंचलित भाषांतरे त्रुटी किंवा अचूकतेच्या अभावाने युक्त असू शकतात. मूळ भाषेतील दस्तऐवज हा अधिकृत स्रोत मानला जावा. महत्त्वाच्या माहितीसाठी व्यावसायिक मानवी भाषांतराची शिफारस केली जाते. या भाषांतराचा वापर करून उद्भवलेल्या कोणत्याही गैरसमज किंवा चुकीच्या अर्थासाठी आम्ही जबाबदार राहणार नाही.\n"
"\n---\n\n**अस्वीकरण**: \nहा दस्तऐवज AI भाषांतर सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) चा वापर करून भाषांतरित करण्यात आला आहे. आम्ही अचूकतेसाठी प्रयत्नशील असलो तरी, कृपया लक्षात घ्या की स्वयंचलित भाषांतरांमध्ये त्रुटी किंवा अचूकतेचा अभाव असू शकतो. मूळ भाषेतील मूळ दस्तऐवज हा अधिकृत स्रोत मानला जावा. महत्त्वाच्या माहितीसाठी व्यावसायिक मानवी भाषांतराची शिफारस केली जाते. या भाषांतराचा वापर केल्यामुळे उद्भवणाऱ्या कोणत्याही गैरसमज किंवा चुकीच्या अर्थासाठी आम्ही जबाबदार राहणार नाही.\n"
]
}
],
@ -247,8 +247,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:43:44+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:21:12+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "mr"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## संभाव्यता आणि सांख्यिकीची ओळख\n",
"## असाइनमेंट\n",
"## संभाव्यता आणि सांख्यिकीची ओळख \n",
"## असाइनमेंट \n",
"\n",
"या असाइनमेंटमध्ये, आपण मधुमेह रुग्णांचा डेटासेट वापरणार आहोत जो [येथून घेतलेला आहे](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"या असाइनमेंटमध्ये, आपण मधुमेह रुग्णांचा डेटासेट वापरणार आहोत जो [येथून घेतलेला आहे](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html). \n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"या डेटासेटमध्ये, स्तंभ खालीलप्रमाणे आहेत:\n",
"* वय आणि लिंग हे स्वतः स्पष्ट आहेत\n",
"या डेटासेटमध्ये खालील प्रकारचे स्तंभ आहेत:\n",
"* वय आणि लिंग स्वतः स्पष्ट आहेत\n",
"* BMI म्हणजे शरीराचा वस्तुमान निर्देशांक\n",
"* BP म्हणजे सरासरी रक्तदाब\n",
"* S1 ते S6 हे वेगवेगळ्या रक्ताच्या मोजमापांचे प्रतिनिधित्व करतात\n",
"* Y म्हणजे एका वर्षातील आजाराच्या प्रगतीचे गुणात्मक मोजमाप\n",
"* S1 ते S6 हे वेगवेगळ्या रक्ताचे मोजमाप आहेत\n",
"* Y म्हणजे एका वर्षाच्या कालावधीत रोगाच्या प्रगतीचे गुणात्मक मोजमाप\n",
"\n",
"चला या डेटासेटचा अभ्यास संभाव्यता आणि सांख्यिकीच्या पद्धतींचा वापर करून करूया.\n",
"चला संभाव्यता आणि सांख्यिकीच्या पद्धती वापरून या डेटासेटचा अभ्यास करूया.\n",
"\n",
"### कार्य 1: सर्व मूल्यांसाठी सरासरी आणि वैविध्य (variance) मोजा\n"
"### कार्य 1: सर्व मूल्यांसाठी सरासरी आणि विचलन गणना करा\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -483,8 +483,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -533,8 +533,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -851,10 +851,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -881,9 +881,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -912,7 +912,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nहा दस्तऐवज [Co-op Translator](https://github.com/Azure/co-op-translator) या एआय भाषांतर सेवेचा वापर करून भाषांतरित करण्यात आला आहे. आम्ही अचूकतेसाठी प्रयत्नशील असलो तरी, कृपया लक्षात घ्या की स्वयंचलित भाषांतरांमध्ये त्रुटी किंवा अचूकतेचा अभाव असू शकतो. मूळ भाषेतील दस्तऐवज हा अधिकृत स्रोत मानला जावा. महत्त्वाच्या माहितीसाठी व्यावसायिक मानवी भाषांतराची शिफारस केली जाते. या भाषांतराचा वापर केल्यामुळे उद्भवणाऱ्या कोणत्याही गैरसमज किंवा चुकीच्या अर्थासाठी आम्ही जबाबदार राहणार नाही.\n"
"\n---\n\n**अस्वीकरण**: \nहा दस्तऐवज AI भाषांतर सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) वापरून भाषांतरित करण्यात आला आहे. आम्ही अचूकतेसाठी प्रयत्नशील असलो तरी कृपया लक्षात ठेवा की स्वयंचलित भाषांतरांमध्ये त्रुटी किंवा अचूकतेचा अभाव असू शकतो. मूळ भाषेतील दस्तऐवज हा अधिकृत स्रोत मानला जावा. महत्त्वाच्या माहितीसाठी व्यावसायिक मानवी भाषांतराची शिफारस केली जाते. या भाषांतराचा वापर करून निर्माण होणाऱ्या कोणत्याही गैरसमज किंवा चुकीच्या अर्थासाठी आम्ही जबाबदार राहणार नाही.\n"
]
}
],
@ -938,8 +938,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:50:28+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:21:27+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "mr"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Dalam set data ini, lajur-lajur adalah seperti berikut: \n",
"* Umur dan jantina adalah jelas dengan sendirinya \n",
"* BMI ialah indeks jisim badan \n",
"* BP ialah tekanan darah purata \n",
"* S1 hingga S6 adalah pelbagai ukuran darah \n",
"* Y ialah ukuran kualitatif perkembangan penyakit sepanjang satu tahun \n",
"Dalam dataset ini, lajur-lajur adalah seperti berikut:\n",
"* Umur dan jantina adalah jelas dengan sendirinya\n",
"* BMI adalah indeks jisim badan\n",
"* BP adalah tekanan darah purata\n",
"* S1 hingga S6 adalah pelbagai ukuran darah\n",
"* Y adalah ukuran kualitatif bagi perkembangan penyakit sepanjang satu tahun\n",
"\n",
"Mari kita kaji set data ini menggunakan kaedah kebarangkalian dan statistik. \n",
"Mari kita kaji dataset ini menggunakan kaedah kebarangkalian dan statistik.\n",
"\n",
"### Tugasan 1: Kira nilai purata dan varians untuk semua nilai \n"
"### Tugasan 1: Kira nilai purata dan varians untuk semua nilai\n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Tugasan 2: Plot kotak plot untuk BMI, BP dan Y bergantung kepada jantina\n"
"### Tugasan 2: Plot kotak plot untuk BMI, BP dan Y bergantung pada jantina\n"
],
"metadata": {}
},
@ -227,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan perkhidmatan terjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Walaupun kami berusaha untuk memastikan ketepatan, sila ambil maklum bahawa terjemahan automatik mungkin mengandungi kesilapan atau ketidaktepatan. Dokumen asal dalam bahasa asalnya harus dianggap sebagai sumber yang berwibawa. Untuk maklumat yang kritikal, terjemahan manusia profesional adalah disyorkan. Kami tidak bertanggungjawab atas sebarang salah faham atau salah tafsir yang timbul daripada penggunaan terjemahan ini.\n"
"\n---\n\n**Penafian**: \nDokumen ini telah diterjemahkan menggunakan perkhidmatan terjemahan AI [Co-op Translator](https://github.com/Azure/co-op-translator). Walaupun kami berusaha untuk memastikan ketepatan, sila ambil maklum bahawa terjemahan automatik mungkin mengandungi kesilapan atau ketidaktepatan. Dokumen asal dalam bahasa asalnya harus dianggap sebagai sumber yang berwibawa. Untuk maklumat penting, terjemahan manusia profesional adalah disyorkan. Kami tidak bertanggungjawab atas sebarang salah faham atau salah tafsir yang timbul daripada penggunaan terjemahan ini.\n"
]
}
],
@ -253,8 +253,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:43:27+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:45:12+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ms"
}

@ -6,7 +6,7 @@
"## Pengenalan kepada Kebarangkalian dan Statistik\n",
"## Tugasan\n",
"\n",
"Dalam tugasan ini, kita akan menggunakan dataset pesakit diabetes yang diambil [dari sini](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
"Dalam tugasan ini, kita akan menggunakan set data pesakit diabetes yang diambil [dari sini](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"Dalam set data ini, lajur-lajur adalah seperti berikut: \n",
"* Umur dan jantina adalah jelas dengan sendirinya \n",
"* BMI adalah indeks jisim badan \n",
"* BP adalah tekanan darah purata \n",
"* S1 hingga S6 adalah pelbagai ukuran darah \n",
"* Y adalah ukuran kualitatif perkembangan penyakit sepanjang satu tahun \n",
"Dalam dataset ini, lajur-lajur adalah seperti berikut:\n",
"* Umur dan jantina adalah jelas dengan sendirinya\n",
"* BMI ialah indeks jisim badan\n",
"* BP ialah tekanan darah purata\n",
"* S1 hingga S6 adalah pelbagai ukuran darah\n",
"* Y ialah ukuran kualitatif bagi perkembangan penyakit sepanjang satu tahun\n",
"\n",
"Mari kita kaji set data ini menggunakan kaedah kebarangkalian dan statistik. \n",
"Mari kita kaji dataset ini menggunakan kaedah kebarangkalian dan statistik.\n",
"\n",
"### Tugasan 1: Kira nilai purata dan varians untuk semua nilai \n"
"### Tugasan 1: Kira nilai purata dan varians untuk semua nilai\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Tugasan 2: Plot kotak plot untuk BMI, BP dan Y bergantung kepada jantina\n"
"### Tugasan 2: Plot kotak plot untuk BMI, BP dan Y bergantung pada jantina\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -887,9 +887,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -944,8 +944,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:50:06+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:45:29+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "ms"
}

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## စွမ်းဆောင်ရည်နှင့် သင်္ချာဆိုင်ရာ သင်္ချာအခြေခံ\n",
"## စွမ်းဆောင်ရည်နှင့် စာရင်းအင်းဆိုင်ရာ သဘောတရားအကြောင်း\n",
"## လုပ်ငန်းတာဝန်\n",
"\n",
"ဒီလုပ်ငန်းတာဝန်မှာတော့ [ဒီနေရာမှ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) ယူထားတဲ့ ဆီးချိုရောဂါရှိသူများ၏ ဒေတာစနစ်ကို အသုံးပြုသွားမှာ ဖြစ်ပါတယ်။\n"
"ဒီလုပ်ငန်းတာဝန်မှာ ကျွန်တော်တို့ [ဒီနေရာ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) မှ ယူထားတဲ့ ဆီးချိုရောဂါရှိသူများ၏ ဒေတာဆက်တင်ကို အသုံးပြုသွားမှာ ဖြစ်ပါတယ်။\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"ဒီဒေတာဆက်တင်တွင် ကော်လံများမှာ အောက်ပါအတိုင်းဖြစ်သည် - \n",
"* အသက်နှင့် လိင်သည် အလွယ်တကူနားလည်နိုင်သည် \n",
"* BMI သည် ကိုယ်အလေးချိန်နှင့် အရပ်အမတ်အချိုး \n",
"* BP သည် ပျမ်းမျှ သွေးပေါင်ချိန် \n",
"* S1 မှ S6 အထိသည် သွေးစစ်ဆေးမှုအမျိုးမျိုး \n",
"* Y သည် တစ်နှစ်အတွင်း ရောဂါတိုးတက်မှုအရည်အသွေးတိုင်းတာချက် \n",
"ဒီဒေတာဆက်တင်ထဲမှာ ကော်လံတွေက အောက်ပါအတိုင်းဖြစ်ပါတယ်- \n",
"* အသက်နဲ့ လိင်က အလွယ်တကူနားလည်နိုင်ပါတယ် \n",
"* BMI က ကိုယ်အလေးချိန်နှင့် အရွယ်အစားညှိထားသော အညွှန်းကိန်းဖြစ်ပါတယ် \n",
"* BP က ပျမ်းမျှ သွေးဖိအား \n",
"* S1 ကနေ S6 အထိက သွေးစစ်ဆေးမှုအမျိုးမျိုး \n",
"* Y က တစ်နှစ်အတွင်း ရောဂါတိုးတက်မှုအရည်အချင်းအတိုင်းအတာ \n",
"\n",
"Probability နှင့် Statistics နည်းလမ်းများကို အသုံးပြု၍ ဒီဒေတာဆက်တင်ကို လေ့လာကြမည်။\n",
"ဒီဒေတာဆက်တင်ကို သက်မှတ်နှုန်းနှင့် သင်္ချာဆိုင်ရာ နည်းလမ်းများကို အသုံးပြုပြီး လေ့လာကြမယ်။\n",
"\n",
"### Task 1: တန်ဖိုးအားလုံးအတွက် ပျမ်းမျှနှင့် အပြောင်းအလဲကိုတွက်ချက်ပါ \n"
"### တာဝန် ၁: တန်ဖိုးအားလုံးအတွက် ပျမ်းမျှနှုန်းနဲ့ မျိုးကွဲမှုကို တွက်ချက်ပါ \n"
],
"metadata": {}
},
@ -196,9 +196,9 @@
{
"cell_type": "markdown",
"source": [
"### အလုပ် ၄: အမျိုးမျိုးသော အပြောင်းအလဲများနှင့် ရောဂါတိုးတက်မှု (Y) အကြား ဆက်စပ်မှုကို စမ်းသပ်ပါ\n",
"### တာဝန် ၄: အမျိုးမျိုးသော အပြောင်းအလဲများနှင့် ရောဂါတိုးတက်မှု (Y) အကြား ဆက်စပ်မှုကို စမ်းသပ်ပါ\n",
"\n",
"> **အကြံပြုချက်** ဆက်စပ်မှုအမီတာဇယားက ဘယ်တန်ဖိုးတွေက အချင်းချင်းမူတည်နေတယ်ဆိုတာ အထောက်အကူဖြစ်စေမယ့် အချက်အလက်တွေကို ပိုမိုပေးနိုင်ပါတယ်။\n"
"> **အကြံပြုချက်** ဆက်စပ်မှုအချိုးဇယားသည် ဘယ်တန်ဖိုးများသည် အချင်းချင်းမူတည်နေသည်ကို အထောက်အကူဖြစ်စေမည့် အရေးကြီးသော အချက်အလက်များကို ပေးနိုင်ပါသည်။\n"
],
"metadata": {}
},
@ -221,7 +221,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**ဝက်ဘ်ဆိုက်မှတ်ချက်**: \nဤစာရွက်စာတမ်းကို AI ဘာသာပြန်ဝန်ဆောင်မှု [Co-op Translator](https://github.com/Azure/co-op-translator) ကို အသုံးပြု၍ ဘာသာပြန်ထားပါသည်။ ကျွန်ုပ်တို့သည် တိကျမှန်ကန်မှုအတွက် ကြိုးစားနေပါသော်လည်း၊ အလိုအလျောက်ဘာသာပြန်မှုများတွင် အမှားများ သို့မဟုတ် မမှန်ကန်မှုများ ပါဝင်နိုင်သည်ကို ကျေးဇူးပြု၍ သတိပြုပါ။ မူရင်းစာရွက်စာတမ်းကို ၎င်း၏ မူလဘာသာစကားဖြင့် အာဏာတည်သောရင်းမြစ်အဖြစ် သတ်မှတ်ရန် လိုအပ်ပါသည်။ အရေးကြီးသော အချက်အလက်များအတွက် လူ့ဘာသာပြန်ပညာရှင်များမှ အတည်ပြုထားသော ဘာသာပြန်မှုကို အသုံးပြုရန် အကြံပြုပါသည်။ ဤဘာသာပြန်မှုကို အသုံးပြုခြင်းမှ ဖြစ်ပေါ်လာသော နားလည်မှုမှားမှုများ သို့မဟုတ် အဓိပ္ပာယ်မှားမှုများအတွက် ကျွန်ုပ်တို့သည် တာဝန်မယူပါ။\n"
"\n---\n\n**ဝက်ဘ်ဆိုက်မှတ်ချက်**: \nဤစာရွက်စာတမ်းကို AI ဘာသာပြန်ဝန်ဆောင်မှု [Co-op Translator](https://github.com/Azure/co-op-translator) ကို အသုံးပြု၍ ဘာသာပြန်ထားပါသည်။ ကျွန်ုပ်တို့သည် တိကျမှန်ကန်မှုအတွက် ကြိုးစားနေပါသော်လည်း၊ အလိုအလျောက်ဘာသာပြန်မှုများတွင် အမှားများ သို့မဟုတ် မမှန်ကန်မှုများ ပါဝင်နိုင်သည်ကို ကျေးဇူးပြု၍ သတိပြုပါ။ မူရင်းစာရွက်စာတမ်းကို ၎င်း၏ မူလဘာသာစကားဖြင့် အာဏာတည်သောရင်းမြစ်အဖြစ် သတ်မှတ်သင့်ပါသည်။ အရေးကြီးသော အချက်အလက်များအတွက် လူကူးဘာသာပြန်မှုကို အကြံပြုပါသည်။ ဤဘာသာပြန်မှုကို အသုံးပြုခြင်းမှ ဖြစ်ပေါ်လာသော နားလည်မှုမှားများ သို့မဟုတ် အဓိပ္ပါယ်မှားများအတွက် ကျွန်ုပ်တို့သည် တာဝန်မယူပါ။\n"
]
}
],
@ -247,8 +247,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:44:02+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T18:00:47+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "my"
}

File diff suppressed because one or more lines are too long

@ -3,10 +3,10 @@
{
"cell_type": "markdown",
"source": [
"## မူလိကျ Probability နှင့် Statistics \n",
"## လုပ်ငန်းတာဝန် \n",
"## စွမ်းဆောင်ရည်နှင့် စာရင်းအင်းဆိုင်ရာ သဘောတရားအကြောင်း\n",
"## လုပ်ငန်းတာဝန်\n",
"\n",
"ဒီလုပ်ငန်းတာဝန်မှာ ကျွန်တော်တို့ [ဒီနေရာ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) မှာရရှိတဲ့ ဆီးချိုရောဂါရှိသူများရဲ့ ဒေတာစနစ်ကို အသုံးပြုသွားမှာ ဖြစ်ပါတယ်။ \n"
"ဒီလုပ်ငန်းတာဝန်မှာ ကျွန်တော်တို့ [ဒီနေရာ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html) မှ ယူထားတဲ့ ဆီးချိုရောဂါရှိသူများ၏ ဒေတာဆက်တင်ကို အသုံးပြုသွားမှာ ဖြစ်ပါတယ်။\n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"ဒီဒေတာစနစ်တွင် ကော်လံများမှာ အောက်ပါအတိုင်းဖြစ်သည်။\n",
"* အသက်နှင့် လိင်သည် အလွယ်တကူနားလည်နိုင်သည်။\n",
"* BMI သည် ကိုယ်အလေးချိန်နှင့် အရပ်အမောင်းအချိုးကို ဖော်ပြသည်။\n",
"* BP သည် ပျမ်းမျှ သွေးပေါင်ချိန်ကို ဆိုလိုသည်။\n",
"* S1 မှ S6 သည် သွေးစစ်ဆေးမှုအမျိုးမျိုးကို ဖော်ပြသည်။\n",
"* Y သည် တစ်နှစ်အတွင်း ရောဂါတိုးတက်မှုအရည်အသွေးကို ဖော်ပြသည်။\n",
"ဒီဒေတာဆက်တင်ထဲမှာ ကော်လံတွေက အောက်ပါအတိုင်းဖြစ်ပါတယ်- \n",
"* အသက်နဲ့ လိင်က အလွယ်တကူနားလည်နိုင်ပါတယ် \n",
"* BMI က ကိုယ်အလေးချိန်ညွှန်းကိန်းဖြစ်ပါတယ် \n",
"* BP က ပျမ်းမျှ သွေးဖိအားဖြစ်ပါတယ် \n",
"* S1 ကနေ S6 အထိက သွေးစစ်ဆေးမှုအမျိုးမျိုးဖြစ်ပါတယ် \n",
"* Y က တစ်နှစ်အတွင်း ရောဂါတိုးတက်မှုအရည်အချင်းကို တိုင်းတာထားတဲ့ တန်ဖိုးဖြစ်ပါတယ် \n",
"\n",
"Probability နှင့် Statistics နည်းလမ်းများကို အသုံးပြု၍ ဒီဒေတာစနစ်ကို လေ့လာကြမည်။\n",
"ဒီဒေတာဆက်တင်ကို သက်မှတ်နှုန်းနဲ့ သင်္ချာဆိုင်ရာ နည်းလမ်းတွေကို အသုံးပြုပြီး လေ့လာကြရအောင်။\n",
"\n",
"### Task 1: တန်ဖိုးအားလုံးအတွက် ပျမ်းမျှနှင့် အပြောင်းအလဲကို တွက်ချက်ပါ\n"
"### တာဝန် ၁: တန်ဖိုးအားလုံးအတွက် ပျမ်းမျှတန်ဖိုးနဲ့ မျိုးကွဲမှုကို တွက်ချက်ပါ \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -483,8 +483,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -533,8 +533,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -598,9 +598,9 @@
{
"cell_type": "markdown",
"source": [
"### အလုပ် ၄: အမျိုးမျိုးသော အပြောင်းအလဲများနှင့် ရောဂါတိုးတက်မှု (Y) အကြား ဆက်စပ်မှုကို စမ်းသပ်ပါ\n",
"### အလုပ် ၄: အမျိုးမျိုးသောအပြောင်းအလဲများနှင့် ရောဂါတိုးတက်မှု (Y) အကြား ဆက်စပ်မှုကို စမ်းသပ်ပါ\n",
"\n",
"> **အကြံပြုချက်** ဆက်စပ်မှုအမီတာစ် (correlation matrix) သည် ဘယ်တန်ဖိုးများသည် အချင်းချင်းမူတည်နေသည်ကို အထောက်အကူဖြစ်စေသော အချက်အလက်များကို ပေးနိုင်ပါသည်။\n"
"> **အကြံပြုချက်** ဆက်စပ်မှု အကျဉ်းချုပ်ဇယားသည် ဘယ်တန်ဖိုးများသည် အချင်းချင်းပေါ်မူတည်နေသည်ကို သိရှိရန် အထောက်အကူဖြစ်စေပါမည်။\n"
],
"metadata": {}
},
@ -843,7 +843,7 @@
"cell_type": "markdown",
"source": [
"အနှစ်ချုပ်: \n",
"* Y နှင့် အပြင်းထန်ဆုံး ဆက်စပ်မှုမှာ BMI နဲ့ S5 (သွေးချို) ဖြစ်ပါတယ်။ ဒါဟာ အကျိုးရှိယ်လို့ ထင်ရပါတယ်။ \n"
"* Y နှင့် အပြင်းထန်ဆုံး ဆက်စပ်မှုမှာ BMI နှင့် S5 (သွေးချို) ဖြစ်ပါတယ်။ ဒါဟာ အကျိုးရှိယ်လို့ ထင်ရပါတယ်။ \n"
],
"metadata": {}
},
@ -851,10 +851,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -881,9 +881,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -912,7 +912,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**ဝက်ဘ်ဆိုက်မှတ်ချက်**: \nဤစာရွက်စာတမ်းကို AI ဘာသာပြန်ဝန်ဆောင်မှု [Co-op Translator](https://github.com/Azure/co-op-translator) ကို အသုံးပြု၍ ဘာသာပြန်ထားပါသည်။ ကျွန်ုပ်တို့သည် တိကျမှန်ကန်မှုအတွက် ကြိုးစားနေပါသော်လည်း၊ အလိုအလျောက်ဘာသာပြန်မှုများတွင် အမှားများ သို့မဟုတ် မတိကျမှုများ ပါဝင်နိုင်သည်ကို ကျေးဇူးပြု၍ သတိပြုပါ။ မူရင်းဘာသာစကားဖြင့် ရေးသားထားသော စာရွက်စာတမ်းကို အာဏာတည်သော ရင်းမြစ်အဖြစ် သတ်မှတ်သင့်ပါသည်။ အရေးကြီးသော အချက်အလက်များအတွက် လူသားဘာသာပြန်ပညာရှင်များမှ ပြန်ဆိုမှုကို အကြံပြုပါသည်။ ဤဘာသာပြန်မှုကို အသုံးပြုခြင်းမှ ဖြစ်ပေါ်လာသော နားလည်မှုမှားများ သို့မဟုတ် အဓိပ္ပယ်မှားများအတွက် ကျွန်ုပ်တို့သည် တာဝန်မယူပါ။\n"
"\n---\n\n**ဝက်ဘ်ဆိုက်မှတ်ချက်**: \nဤစာရွက်စာတမ်းကို AI ဘာသာပြန်ဝန်ဆောင်မှု [Co-op Translator](https://github.com/Azure/co-op-translator) ကို အသုံးပြု၍ ဘာသာပြန်ထားပါသည်။ ကျွန်ုပ်တို့သည် တိကျမှန်ကန်မှုအတွက် ကြိုးစားနေပါသော်လည်း၊ အလိုအလျောက်ဘာသာပြန်မှုများတွင် အမှားများ သို့မဟုတ် မမှန်ကန်မှုများ ပါဝင်နိုင်သည်ကို ကျေးဇူးပြု၍ သတိပြုပါ။ မူရင်းစာရွက်စာတမ်းကို ၎င်း၏ မူလဘာသာစကားဖြင့် အာဏာတည်သောရင်းမြစ်အဖြစ် သတ်မှတ်ရန် လိုအပ်ပါသည်။ အရေးကြီးသော အချက်အလက်များအတွက် လူဘာသာပြန်ပညာရှင်များမှ ပြန်ဆိုမှုကို အကြံပြုပါသည်။ ဤဘာသာပြန်မှုကို အသုံးပြုခြင်းမှ ဖြစ်ပေါ်လာသော နားလည်မှုမှားများ သို့မဟုတ် အဓိပ္ပယ်မှားများအတွက် ကျွန်ုပ်တို့သည် တာဝန်မယူပါ။\n"
]
}
],
@ -938,8 +938,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:50:58+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T18:01:06+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "my"
}

@ -6,7 +6,7 @@
"## सम्भाव्यता र तथ्यांकको परिचय \n",
"## असाइनमेन्ट \n",
"\n",
"यस असाइनमेन्टमा, हामी मधुमेहका बिरामीहरूको डटासेट प्रयोग गर्नेछौं जुन [यहाँबाट लिइएको हो](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
"यस असाइनमेन्टमा, हामी मधुमेहका बिरामीहरूको डटासेट प्रयोग गर्नेछौं जुन [यहाँबाट लिइएको हो](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"यस डेटासेटमा, स्तम्भहरू निम्नानुसार छन्: \n",
"* उमेर र लिङ्ग स्वाभाविक रूपमा स्पष्ट छन् \n",
"* BMI भनेको शरीरको मास सूचकांक हो \n",
"* BP भनेको औसत रक्तचाप हो \n",
"* S1 देखि S6 सम्म विभिन्न रक्त परीक्षणका मापनहरू हुन् \n",
"* Y भनेको एक वर्षको अवधिमा रोगको प्रगतिको गुणात्मक मापन हो \n",
"यस डेटासेटमा स्तम्भहरू निम्न प्रकारका छन्:\n",
"* उमेर र लिङ्ग स्पष्ट छन्\n",
"* BMI भनेको शरीरको मास सूचकांक हो\n",
"* BP भनेको औसत रक्तचाप हो\n",
"* S1 देखि S6 विभिन्न रक्त मापनहरू हुन्\n",
"* Y भनेको एक वर्षको अवधिमा रोगको प्रगतिको गुणात्मक मापन हो\n",
"\n",
"आउनुहोस्, सम्भाव्यता र तथ्याङ्कका विधिहरू प्रयोग गरेर यस डेटासेटको अध्ययन गरौँ। \n",
"आउनुहोस्, सम्भाव्यता र तथ्याङ्कका विधिहरू प्रयोग गरेर यस डेटासेटको अध्ययन गरौं।\n",
"\n",
"### कार्य १: सबै मानहरूको औसत र विचलन गणना गर्नुहोस् \n"
"### कार्य १: सबै मानहरूको औसत मान र विचलन गणना गर्नुहोस्\n"
],
"metadata": {}
},
@ -198,9 +198,9 @@
{
"cell_type": "markdown",
"source": [
"### कार्य ४: विभिन्न चरहरू र रोगको प्रगति (Y) बीचको सम्बन्ध परीक्षण गर्नुहोस्\n",
"### कार्य ४: विभिन्न भेरिएबलहरू र रोगको प्रगति (Y) बीचको सम्बन्ध परीक्षण गर्नुहोस्\n",
"\n",
"> **सूचना** सम्बन्ध म्याट्रिक्सले कुन मानहरू परस्पर निर्भर छन् भन्ने सबैभन्दा उपयोगी जानकारी प्रदान गर्नेछ।\n"
"> **संकेत** सम्बन्ध म्याट्रिक्सले कुन मानहरू परनिर्भर छन् भन्ने बारेमा सबैभन्दा उपयोगी जानकारी दिन्छ।\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nयो दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) प्रयोग गरी अनुवाद गरिएको हो। हामी यथासम्भव सटीकता सुनिश्चित गर्न प्रयास गर्छौं, तर कृपया ध्यान दिनुहोस् कि स्वचालित अनुवादहरूमा त्रुटि वा अशुद्धता हुन सक्छ। यसको मूल भाषामा रहेको मूल दस्तावेज़लाई आधिकारिक स्रोत मानिनुपर्छ। महत्त्वपूर्ण जानकारीका लागि, व्यावसायिक मानव अनुवाद सिफारिस गरिन्छ। यस अनुवादको प्रयोगबाट उत्पन्न हुने कुनै पनि गलतफहमी वा गलत व्याख्याक लागि हामी जिम्मेवार हुने छैनौं। \n"
"\n---\n\n**अस्वीकरण**: \nयो दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) प्रयोग गरी अनुवाद गरिएको हो। हामी यथासम्भव सटीकता सुनिश्चित गर्न प्रयास गर्छौं, तर कृपया ध्यान दिनुहोस् कि स्वचालित अनुवादहरूमा त्रुटिहरू वा अशुद्धताहरू हुन सक्छन्। यसको मूल भाषामा रहेको मूल दस्तावेज़लाई आधिकारिक स्रोत मानिनुपर्छ। महत्त्वपूर्ण जानकारीका लागि, व्यावसायिक मानव अनुवाद सिफारिस गरिन्छ। यस अनुवादको प्रयोगबाट उत्पन्न हुने कुनै पनि गलतफहमी वा गलत व्याख्याक लागि हामी जिम्मेवार हुने छैनौं।\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:44:25+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:22:31+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ne"
}

File diff suppressed because one or more lines are too long

@ -6,7 +6,7 @@
"## सम्भाव्यता र तथ्यांकको परिचय \n",
"## असाइनमेन्ट \n",
"\n",
"यस असाइनमेन्टमा, हामी मधुमेहका बिरामीहरूको डेटासेट प्रयोग गर्नेछौं जुन [यहाँबाट लिइएको ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
"यस असाइनमेन्टमा, हामी मधुमेहका बिरामीहरूको डेटासेट प्रयोग गर्नेछौं जुन [यहाँबाट लिइएको हो](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)। \n"
],
"metadata": {}
},
@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"यस डेटासेटमा स्तम्भहरू निम्न प्रकारका छन्: \n",
"* उमेर र लिङ्ग स्वाभाविक रूपमा स्पष्ट छन् \n",
"* BMI भनेको शरीरको मास सूचकांक हो \n",
"* BP भनेको औसत रक्तचाप हो \n",
"* S1 देखि S6 सम्म विभिन्न रक्त परीक्षणका मापनहरू हुन् \n",
"* Y भनेको एक वर्षको अवधिमा रोगको प्रगतिको गुणात्मक मापन हो \n",
"यस डेटासेटमा स्तम्भहरू निम्न प्रकारका छन्:\n",
"* उमेर र लिङ्ग स्पष्ट छन्\n",
"* BMI भनेको शरीरको मास सूचकांक हो\n",
"* BP भनेको औसत रक्तचाप हो\n",
"* S1 देखि S6 विभिन्न रक्त मापनहरू हुन्\n",
"* Y भनेको एक वर्षको अवधिमा रोगको प्रगतिको गुणात्मक मापन हो\n",
"\n",
"आउनुहोस्, सम्भाव्यता र तथ्याङ्कका विधिहरू प्रयोग गरेर यस डेटासेटको अध्ययन गरौँ। \n",
"आउनुहोस्, सम्भाव्यता र तथ्याकका विधिहरू प्रयोग गरेर यस डेटासेटको अध्ययन गरौं।\n",
"\n",
"### कार्य १: सबै मानहरूको औसत र विचलन गणना गर्नुहोस् \n"
"### कार्य १: सबै मानहरूको औसत मान र विचलन गणना गर्नुहोस्\n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -590,10 +590,10 @@
{
"cell_type": "markdown",
"source": [
"निष्कर्षहरू: \n",
"* उमेर - सामान्य \n",
"* लिङ्ग - समान \n",
"* BMI, Y - भन्न गाह्रो \n"
"निष्कर्षहरू:\n",
"* उमेर - सामान्य\n",
"* लिङ्ग - समान\n",
"* BMI, Y - भन्न गाह्रो\n"
],
"metadata": {}
},
@ -602,7 +602,7 @@
"source": [
"### कार्य ४: विभिन्न भेरिएबलहरू र रोगको प्रगति (Y) बीचको सम्बन्ध परीक्षण गर्नुहोस्\n",
"\n",
"> **संकेत** सम्बन्ध म्याट्रिक्सले कुन मानहरू परस्पर निर्भर छन् भन्ने सबैभन्दा उपयोगी जानकारी दिन्छ।\n"
"> **संकेत** सम्बन्ध म्याट्रिक्सले कुन मानहरू परनिर्भर छन् भन्ने बारेमा सबैभन्दा उपयोगी जानकारी दिन्छ।\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**अस्वीकरण**: \nयो दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) प्रयोग गरी अनुवाद गरिएको हो। हामी यथासम्भव शुद्धताको प्रयास गर्छौं, तर कृपया ध्यान दिनुहोस् कि स्वचालित अनुवादहरूमा त्रुटि वा अशुद्धता हुन सक्छ। यसको मूल भाषामा रहेको मूल दस्तावेज़लाई आधिकारिक स्रोत मानिनुपर्छ। महत्त्वपूर्ण जानकारीका लागि, व्यावसायिक मानव अनुवाद सिफारिस गरिन्छ। यस अनुवादको प्रयोगबाट उत्पन्न हुने कुनै पनि गलतफहमी वा गलत व्याख्याक लागि हामी जिम्मेवार हुने छैनौं। \n"
"\n---\n\n**अस्वीकरण**: \nयो दस्तावेज़ AI अनुवाद सेवा [Co-op Translator](https://github.com/Azure/co-op-translator) प्रयोग गरी अनुवाद गरिएको हो। हामी यथासम्भव सटीकता सुनिश्चित गर्न प्रयास गर्छौं, तर कृपया ध्यान दिनुहोस् कि स्वचालित अनुवादहरूमा त्रुटि वा अशुद्धता हुन सक्छ। यसको मूल भाषामा रहेको मूल दस्तावेज़लाई आधिकारिक स्रोत मानिनुपर्छ। महत्त्वपूर्ण जानकारीका लागि, व्यावसायिक मानव अनुवाद सिफारिस गरिन्छ। यस अनुवादको प्रयोगबाट उत्पन्न हुने कुनै पनि गलतफहमी वा गलत व्याख्याक लागि हामी जिम्मेवार हुने छैनौं।\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:51:17+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:22:47+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "ne"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"In deze dataset zijn de kolommen als volgt: \n",
"* Leeftijd en geslacht spreken voor zich \n",
"* BMI is de body mass index \n",
"* BP is de gemiddelde bloeddruk \n",
"* S1 tot en met S6 zijn verschillende bloedmetingen \n",
"* Y is de kwalitatieve maat voor ziekteprogressie over één jaar \n",
"In deze dataset zijn de kolommen als volgt:\n",
"* Leeftijd en geslacht spreken voor zich\n",
"* BMI is de body mass index\n",
"* BP is de gemiddelde bloeddruk\n",
"* S1 tot en met S6 zijn verschillende bloedmetingen\n",
"* Y is de kwalitatieve maat voor ziekteprogressie over één jaar\n",
"\n",
"Laten we deze dataset bestuderen met behulp van methoden uit de waarschijnlijkheidsleer en statistiek. \n",
"Laten we deze dataset bestuderen met behulp van methoden uit de kansrekening en statistiek.\n",
"\n",
"### Taak 1: Bereken gemiddelde waarden en variantie voor alle waarden \n"
"### Taak 1: Bereken gemiddelde waarden en variantie voor alle waarden\n"
],
"metadata": {}
},
@ -200,7 +200,7 @@
"source": [
"### Taak 4: Test de correlatie tussen verschillende variabelen en ziekteprogressie (Y)\n",
"\n",
"> **Tip** Een correlatiematrix geeft je de meest bruikbare informatie over welke waarden afhankelijk zijn.\n"
"> **Hint** Een correlatiematrix geeft je de meest bruikbare informatie over welke waarden afhankelijk zijn.\n"
],
"metadata": {}
},
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:44:44+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:39:55+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "nl"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -157,7 +157,7 @@
"* S1 tot en met S6 zijn verschillende bloedmetingen\n",
"* Y is de kwalitatieve maat voor ziekteprogressie over één jaar\n",
"\n",
"Laten we deze dataset bestuderen met behulp van methoden uit de waarschijnlijkheidsleer en statistiek.\n",
"Laten we deze dataset bestuderen met behulp van methoden uit de kansrekening en statistiek.\n",
"\n",
"### Taak 1: Bereken gemiddelde waarden en variantie voor alle waarden\n"
],
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -590,7 +590,7 @@
{
"cell_type": "markdown",
"source": [
"Conclusies: \n",
"Conclusies:\n",
"* Leeftijd - normaal \n",
"* Geslacht - uniform \n",
"* BMI, Y - moeilijk te zeggen \n"
@ -602,7 +602,7 @@
"source": [
"### Taak 4: Test de correlatie tussen verschillende variabelen en ziekteprogressie (Y)\n",
"\n",
"> **Hint** Een correlatiematrix geeft je de meest bruikbare informatie over welke waarden afhankelijk zijn.\n"
"> **Tip** Een correlatiematrix geeft je de meest bruikbare informatie over welke waarden afhankelijk zijn.\n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -914,7 +914,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nDit document is vertaald met behulp van de AI-vertalingsservice [Co-op Translator](https://github.com/Azure/co-op-translator). Hoewel we streven naar nauwkeurigheid, dient u zich ervan bewust te zijn dat geautomatiseerde vertalingen fouten of onnauwkeurigheden kunnen bevatten. Het originele document in de oorspronkelijke taal moet worden beschouwd als de gezaghebbende bron. Voor kritieke informatie wordt professionele menselijke vertaling aanbevolen. Wij zijn niet aansprakelijk voor misverstanden of verkeerde interpretaties die voortvloeien uit het gebruik van deze vertaling.\n"
"\n---\n\n**Disclaimer**: \nDit document is vertaald met behulp van de AI-vertalingsservice [Co-op Translator](https://github.com/Azure/co-op-translator). Hoewel we streven naar nauwkeurigheid, willen we u erop wijzen dat geautomatiseerde vertalingen fouten of onnauwkeurigheden kunnen bevatten. Het originele document in de oorspronkelijke taal moet worden beschouwd als de gezaghebbende bron. Voor kritieke informatie wordt professionele menselijke vertaling aanbevolen. Wij zijn niet aansprakelijk voor misverstanden of verkeerde interpretaties die voortvloeien uit het gebruik van deze vertaling.\n"
]
}
],
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:51:40+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:40:10+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "nl"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,7 +149,7 @@
{
"cell_type": "markdown",
"source": [
"I dette datasettet har kolonnene følgende betydning:\n",
"I dette datasettet er kolonnene som følger:\n",
"* Alder og kjønn er selvforklarende\n",
"* BMI er kroppsmasseindeks\n",
"* BP er gjennomsnittlig blodtrykk\n",
@ -247,8 +247,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:44:58+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:37:17+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "no"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -483,8 +483,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -533,8 +533,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -598,9 +598,9 @@
{
"cell_type": "markdown",
"source": [
"### Oppgave 4: Test korrelasjonen mellom ulike variabler og sykdomsutvikling (Y)\n",
"### Oppgave 4: Test korrelasjonen mellom ulike variabler og sykdomsprogresjon (Y)\n",
"\n",
"> **Tips** En korrelasjonsmatrise vil gi deg den mest nyttige informasjonen om hvilke verdier som er avhengige.\n"
"> **Hint** Korrelasjonsmatrisen vil gi deg den mest nyttige informasjonen om hvilke verdier som er avhengige.\n"
],
"metadata": {}
},
@ -851,10 +851,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -881,9 +881,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -912,7 +912,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokumentet er oversatt ved hjelp av AI-oversettelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selv om vi tilstreber nøyaktighet, vennligst vær oppmerksom på at automatiske oversettelser kan inneholde feil eller unøyaktigheter. Det originale dokumentet på sitt opprinnelige språk bør anses som den autoritative kilden. For kritisk informasjon anbefales profesjonell menneskelig oversettelse. Vi er ikke ansvarlige for eventuelle misforståelser eller feiltolkninger som oppstår ved bruk av denne oversettelsen.\n"
"\n---\n\n**Ansvarsfraskrivelse**: \nDette dokumentet er oversatt ved hjelp av AI-oversettelsestjenesten [Co-op Translator](https://github.com/Azure/co-op-translator). Selv om vi streber etter nøyaktighet, vær oppmerksom på at automatiserte oversettelser kan inneholde feil eller unøyaktigheter. Det originale dokumentet på sitt opprinnelige språk bør anses som den autoritative kilden. For kritisk informasjon anbefales profesjonell menneskelig oversettelse. Vi er ikke ansvarlige for misforståelser eller feiltolkninger som oppstår ved bruk av denne oversettelsen.\n"
]
}
],
@ -938,8 +938,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:52:02+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:37:30+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "no"
}

@ -6,7 +6,7 @@
"## ਪਰਿਚਯ: ਸੰਭਾਵਨਾ ਅਤੇ ਅੰਕੜੇ\n",
"## ਅਸਾਈਨਮੈਂਟ\n",
"\n",
"ਇਸ ਅਸਾਈਨਮੈਂਟ ਵਿੱਚ, ਅਸੀਂ ਸ਼ੂਗਰ ਦੇ ਮਰੀਜ਼ਾਂ ਦੇ ਡਾਟਾਸੈੱਟ ਦੀ ਵਰਤੋਂ ਕਰਾਂਗੇ ਜੋ [ਇਥੋਂ ਲਿਆ ਗਿਆ ਹੈ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)।\n"
"ਇਸ ਅਸਾਈਨਮੈਂਟ ਵਿੱਚ, ਅਸੀਂ ਸ਼ੂਗਰ ਮਰੀਜ਼ਾਂ ਦੇ ਡਾਟਾਸੈੱਟ ਦੀ ਵਰਤੋਂ ਕਰਾਂਗੇ ਜੋ [ਇਥੋਂ ਲਿਆ ਗਿਆ ਹੈ](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)।\n"
],
"metadata": {}
},
@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,14 +149,14 @@
{
"cell_type": "markdown",
"source": [
"ਇਸ ਡਟਾਸੈੱਟ ਵਿੱਚ ਕਾਲਮ ਹੇਠ ਲਿਖੇ ਹਨ: \n",
"ਇਸ ਡਟਾਸੈੱਟ ਵਿੱਚ ਕਾਲਮ ਹੇਠ ਲਿਖੇ ਹਨ: \n",
"* ਉਮਰ ਅਤੇ ਲਿੰਗ ਸਵੈ-ਸਪਸ਼ਟ ਹਨ \n",
"* BMI ਮਤਲਬ ਬਾਡੀ ਮਾਸ ਇੰਡੈਕਸ ਹੈ \n",
"* BP ਮਤਲਬ ਔਸਤ ਰਕਤ ਦਬਾਅ ਹੈ \n",
"* S1 ਤੋਂ S6 ਵੱਖ-ਵੱਖ ਰਕਤ ਦੇ ਮਾਪ ਹਨ \n",
"* Y ਇੱਕ ਸਾਲ ਵਿੱਚ ਬਿਮਾਰੀ ਦੇ ਵਿਕਾਸ ਦਾ ਗੁਣਾਤਮਕ ਮਾਪ ਹੈ \n",
"* Y ਇੱਕ ਗੁਣਾਤਮਕ ਮਾਪ ਹੈ ਜੋ ਇੱਕ ਸਾਲ ਵਿੱਚ ਬਿਮਾਰੀ ਦੇ ਵਿਕਾਸ ਨੂੰ ਦਰਸਾਉਂਦਾ ਹੈ \n",
"\n",
"ਆਓ ਇਸ ਡਟਾਸੈੱਟ ਦਾ ਅਧਿਐਨ ਸੰਭਾਵਨਾ ਅਤੇ ਅੰਕੜਾ ਵਿਗਿਆਨ ਦੇ ਤਰੀਕਿਆਂ ਨਾਲ ਕਰੀਏ। \n",
"ਆਓ ਇਸ ਡਟਾਸੈੱਟ ਦਾ ਅਧਿਐਨ ਸੰਭਾਵਨਾ ਅਤੇ ਅੰਕੜਾ ਵਿਗਿਆਨ ਦੇ ਤਰੀਕਿਆਂ ਨਾਲ ਕਰੀਏ। \n",
"\n",
"### ਕੰਮ 1: ਸਾਰੇ ਮੁੱਲਾਂ ਲਈ ਔਸਤ ਅਤੇ ਵੈਰੀਅੰਸ ਦੀ ਗਣਨਾ ਕਰੋ \n"
],
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### ਟਾਸਕ 2: ਲਿੰਗ ਦੇ ਧਾਰ 'ਤੇ BMI, BP ਅਤੇ Y ਲਈ ਬਾਕਸਪਲਾਟ ਬਣਾਓ\n"
"### ਟਾਸਕ 2: ਲਿੰਗ ਦੇ ਧਾਰ 'ਤੇ BMI, BP ਅਤੇ Y ਲਈ ਬਾਕਸਪਲਾਟ ਬਣਾਓ\n"
],
"metadata": {}
},
@ -202,7 +202,7 @@
"source": [
"### ਟਾਸਕ 4: ਵੱਖ-ਵੱਖ ਵੈਰੀਏਬਲਾਂ ਅਤੇ ਬਿਮਾਰੀ ਦੇ ਵਿਕਾਸ (Y) ਦੇ ਵਿਚਕਾਰ ਸਬੰਧ ਦੀ ਜਾਂਚ ਕਰੋ\n",
"\n",
"> **ਸੁਝਾਅ** ਸਬੰਧ ਮੈਟ੍ਰਿਕਸ ਤੁਹਾਨੂੰ ਇਹ ਜਾਣਨ ਲਈ ਸਭ ਤੋਂ ਜ਼ਿਆਦਾ ਲਾਭਦਾਇਕ ਜਾਣਕਾਰੀ ਦੇਵੇਗਾ ਕਿ ਕਿਹੜੀਆਂ ਮੁੱਲਾਂ ਦਾ ਇੱਕ ਦੂਜੇ ਨਾਲ ਸਬੰਧ ਹੈ।\n"
"> **ਸੁਝਾਅ** ਸਬੰਧ ਮੈਟ੍ਰਿਕਸ ਤੁਹਾਨੂੰ ਇਹ ਸਮਝਣ ਲਈ ਸਭ ਤੋਂ ਜ਼ਿਆਦਾ ਮਦਦਗਾਰ ਜਾਣਕਾਰੀ ਦੇਵੇਗਾ ਕਿ ਕਿਹੜੀਆਂ ਮੁੱਲਾਂ ਦਾ ਇੱਕ ਦੂਜੇ ਨਾਲ ਸਬੰਧ ਹੈ।\n"
],
"metadata": {}
},
@ -225,7 +225,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**ਅਸਵੀਕਤੀ**: \nਇਹ ਦਸਤਾਵੇਜ਼ AI ਅਨੁਵਾਦ ਸੇਵਾ [Co-op Translator](https://github.com/Azure/co-op-translator) ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਅਨੁਵਾਦ ਕੀਤਾ ਗਿਆ ਹੈ। ਜਦੋਂ ਕਿ ਅਸੀਂ ਸਹੀ ਹੋਣ ਦਾ ਯਤਨ ਕਰਦੇ ਹਾਂ, ਕਿਰਪਾ ਕਰਕੇ ਧਿਆਨ ਦਿਓ ਕਿ ਸਵੈਚਾਲਿਤ ਅਨੁਵਾਦਾਂ ਵਿੱਚ ਗਲਤੀਆਂ ਜਾਂ ਅਸੁੱਤੀਆਂ ਹੋ ਸਕਦੀਆਂ ਹਨ। ਇਸ ਦੀ ਮੂਲ ਭਾਸ਼ਾ ਵਿੱਚ ਮੌਜੂਦ ਮੂਲ ਦਸਤਾਵੇਜ਼ ਨੂੰ ਪ੍ਰਮਾਣਿਕ ਸਰੋਤ ਮੰਨਿਆ ਜਾਣਾ ਚਾਹੀਦਾ ਹੈ। ਮਹੱਤਵਪੂਰਨ ਜਾਣਕਾਰੀ ਲਈ, ਪੇਸ਼ੇਵਰ ਮਨੁੱਖੀ ਅਨੁਵਾਦ ਦੀ ਸਿਫਾਰਸ਼ ਕੀਤੀ ਜਾਂਦੀ ਹੈ। ਇਸ ਅਨੁਵਾਦ ਦੇ ਪ੍ਰਯੋਗ ਤੋਂ ਪੈਦਾ ਹੋਣ ਵਾਲੀਆਂ ਕਿਸੇ ਵੀ ਗਲਤਫਹਮੀਆਂ ਜਾਂ ਗਲਤ ਵਿਆਖਿਆਵਾਂ ਲਈ ਅਸੀਂ ਜ਼ਿੰਮੇਵਾਰ ਨਹੀਂ ਹਾਂ। \n"
"\n---\n\n**ਅਸਵੀਕਤੀ**: \nਇਹ ਦਸਤਾਵੇਜ਼ AI ਅਨੁਵਾਦ ਸੇਵਾ [Co-op Translator](https://github.com/Azure/co-op-translator) ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਅਨੁਵਾਦ ਕੀਤਾ ਗਿਆ ਹੈ। ਜਦੋਂ ਕਿ ਅਸੀਂ ਸਹੀਅਤ ਲਈ ਯਤਨਸ਼ੀਲ ਹਾਂ, ਕਿਰਪਾ ਕਰਕੇ ਧਿਆਨ ਦਿਓ ਕਿ ਸਵੈਚਾਲਿਤ ਅਨੁਵਾਦਾਂ ਵਿੱਚ ਗਲਤੀਆਂ ਜਾਂ ਅਸੁਚਤਤਾਵਾਂ ਹੋ ਸਕਦੀਆਂ ਹਨ। ਮੂਲ ਦਸਤਾਵੇਜ਼ ਨੂੰ ਇਸਦੀ ਮੂਲ ਭਾਸ਼ਾ ਵਿੱਚ ਅਧਿਕਾਰਤ ਸਰੋਤ ਮੰਨਿਆ ਜਾਣਾ ਚਾਹੀਦਾ ਹੈ। ਮਹੱਤਵਪੂਰਨ ਜਾਣਕਾਰੀ ਲਈ, ਪੇਸ਼ੇਵਰ ਮਨੁੱਖੀ ਅਨੁਵਾਦ ਦੀ ਸਿਫਾਰਸ਼ ਕੀਤੀ ਜਾਂਦੀ ਹੈ। ਇਸ ਅਨੁਵਾਦ ਦੀ ਵਰਤੋਂ ਤੋਂ ਪੈਦਾ ਹੋਣ ਵਾਲੇ ਕਿਸੇ ਵੀ ਗਲਤ ਫਹਿਮੀ ਜਾਂ ਗਲਤ ਵਿਆਖਿਆ ਲਈ ਅਸੀਂ ਜ਼ਿੰਮੇਵਾਰ ਨਹੀਂ ਹਾਂ।\n"
]
}
],
@ -251,8 +251,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:45:21+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:24:05+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "pa"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"ਇਸ ਡਾਟਾਸੈਟ ਵਿੱਚ ਕਾਲਮ ਹੇਠਾਂ ਦਿੱਤੇ ਗਏ ਹਨ: \n",
"ਇਸ ਡਾਟਾਸੈਟ ਵਿੱਚ, ਕਾਲਮ ਹੇਠਾਂ ਦਿੱਤੇ ਗਏ ਹਨ: \n",
"* ਉਮਰ ਅਤੇ ਲਿੰਗ ਸਵੈ-ਸਪਸ਼ਟ ਹਨ \n",
"* BMI ਸ਼ਰੀਰ ਦਾ ਭਾਰ ਸੂਚਕਾਂਕ ਹੈ \n",
"* BP ਔਸਤ ਰਕਤ ਦਬਾਅ ਹੈ \n",
"* S1 ਤੋਂ S6 ਵੱਖ-ਵੱਖ ਰਕਤ ਮਾਪ ਹਨ \n",
"* Y ਇੱਕ ਸਾਲ ਵਿੱਚ ਬਿਮਾਰੀ ਦੇ ਵਿਕਾਸ ਦਾ ਗੁਣਾਤਮਕ ਮਾਪ ਹੈ \n",
"* Y ਇੱਕ ਸਾਲ ਦੇ ਦੌਰਾਨ ਬਿਮਾਰੀ ਦੀ ਪ੍ਰਗਤੀ ਦਾ ਗੁਣਾਤਮਕ ਮਾਪ ਹੈ \n",
"\n",
"ਆਓ ਸੰਭਾਵਨਾ ਅਤੇ ਅੰਕੜਾ ਵਿਗਿਆਨ ਦੇ ਤਰੀਕਿਆਂ ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਇਸ ਡਾਟਾਸੈਟ ਦਾ ਅਧਿਐਨ ਕਰੀਏ। \n",
"ਆਓ ਸੰਭਾਵਨਾ ਅਤੇ ਅੰਕੜਾ ਵਿਗਿਆਨ ਦੇ ਤਰੀਕਿਆਂ ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਇਸ ਡਾਟਾਸੈਟ ਦਾ ਅਧਿਐਨ ਕਰੀਏ। \n",
"\n",
"### ਕੰਮ 1: ਸਾਰੇ ਮੁੱਲਾਂ ਲਈ ਔਸਤ ਅਤੇ ਵਿਆਪਨ ਦੀ ਗਣਨਾ ਕਰੋ \n"
"### ਕੰਮ 1: ਸਾਰੇ ਮੁੱਲਾਂ ਲਈ ਔਸਤ ਅਤੇ ਵੈਰੀਅੰਸ ਦੀ ਗਣਨਾ ਕਰੋ \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -537,8 +537,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -593,9 +593,9 @@
"cell_type": "markdown",
"source": [
"ਨਤੀਜੇ: \n",
"* ਉਮਰ - ਸਧਾਰ \n",
"* ਉਮਰ - ਸਧਾਰ \n",
"* ਲਿੰਗ - ਇਕਸਾਰ \n",
"* ਬੀਐਮਆਈ, ਵਾਈ - ਦੱਸਣਾ ਔਖਾ\n"
"* BMI, Y - ਕਹਿਣਾ ਔਖਾ \n"
],
"metadata": {}
},
@ -604,7 +604,7 @@
"source": [
"### ਟਾਸਕ 4: ਵੱਖ-ਵੱਖ ਵੈਰੀਏਬਲਾਂ ਅਤੇ ਬਿਮਾਰੀ ਦੇ ਵਿਕਾਸ (Y) ਦੇ ਵਿਚਕਾਰ ਸਬੰਧ ਦੀ ਜਾਂਚ ਕਰੋ\n",
"\n",
"> **ਸੁਝਾਅ** ਸਬੰਧ ਮੈਟ੍ਰਿਕਸ ਤੁਹਾਨੂੰ ਇਹ ਜਾਣਨ ਲਈ ਸਭ ਤੋਂ ਜ਼ਿਆਦਾ ਲਾਭਦਾਇਕ ਜਾਣਕਾਰੀ ਦੇਵੇਗਾ ਕਿ ਕਿਹੜੀਆਂ ਮੁੱਲਾਂ ਦਾ ਇੱਕ ਦੂਜੇ ਨਾਲ ਸਬੰਧ ਹੈ।\n"
"> **ਸੁਝਾਅ** ਸਬੰਧ ਮੈਟ੍ਰਿਕਸ ਤੁਹਾਨੂੰ ਇਹ ਪਤਾ ਲਗਾਉਣ ਲਈ ਸਭ ਤੋਂ ਜ਼ਿਆਦਾ ਮਦਦਗਾਰ ਜਾਣਕਾਰੀ ਦੇਵੇਗਾ ਕਿ ਕਿਹੜੀਆਂ ਮੁੱਲਾਂ ਇੱਕ ਦੂਜੇ 'ਤੇ ਨਿਰਭਰ ਹਨ।\n"
],
"metadata": {}
},
@ -847,7 +847,7 @@
"cell_type": "markdown",
"source": [
"ਨਤੀਜਾ: \n",
"* Y ਦੇ ਨ ਸਭ ਤੋਂ ਮਜ਼ਬੂਤ ਸੰਬੰਧ BMI ਅਤੇ S5 (ਖੂਨ ਵਿੱਚ ਸ਼ੂਗਰ) ਹੈ। ਇਹ ਵਾਜਬ ਲੱਗਦਾ ਹੈ।\n"
"* Y ਦਾ ਸਭ ਤੋਂ ਮਜ਼ਬੂਤ ਸੰਬੰਧ BMI ਅਤੇ S5 (ਖੂਨ ਵਿੱਚ ਸ਼ੱਕਰ) ਨਾਲ ਹੈ। ਇਹ ਵਾਜਬ ਲੱਗਦਾ ਹੈ।\n"
],
"metadata": {}
},
@ -855,10 +855,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -885,9 +885,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -916,7 +916,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**ਅਸਵੀਕਰਤੀ**: \nਇਹ ਦਸਤਾਵੇਜ਼ AI ਅਨੁਵਾਦ ਸੇਵਾ [Co-op Translator](https://github.com/Azure/co-op-translator) ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਅਨੁਵਾਦ ਕੀਤਾ ਗਿਆ ਹੈ। ਜਦੋਂ ਕਿ ਅਸੀਂ ਸਹੀ ਹੋਣ ਦਾ ਯਤਨ ਕਰਦੇ ਹਾਂ, ਕਿਰਪਾ ਕਰਕੇ ਧਿਆਨ ਦਿਓ ਕਿ ਸਵੈਚਾਲਿਤ ਅਨੁਵਾਦਾਂ ਵਿੱਚ ਗਲਤੀਆਂ ਜਾਂ ਅਸੁੱਤੀਆਂ ਹੋ ਸਕਦੀਆਂ ਹਨ। ਇਸ ਦੀ ਮੂਲ ਭਾਸ਼ਾ ਵਿੱਚ ਮੌਜੂਦ ਮੂਲ ਦਸਤਾਵੇਜ਼ ਨੂੰ ਪ੍ਰਮਾਣਿਕ ਸਰੋਤ ਮੰਨਿਆ ਜਾਣਾ ਚਾਹੀਦਾ ਹੈ। ਮਹੱਤਵਪੂਰਨ ਜਾਣਕਾਰੀ ਲਈ, ਪੇਸ਼ੇਵਰ ਮਨੁੱਖੀ ਅਨੁਵਾਦ ਦੀ ਸਿਫਾਰਸ਼ ਕੀਤੀ ਜਾਂਦੀ ਹੈ। ਇਸ ਅਨੁਵਾਦ ਦੇ ਪ੍ਰਯੋਗ ਤੋਂ ਪੈਦਾ ਹੋਣ ਵਾਲੇ ਕਿਸੇ ਵੀ ਗਲਤਫਹਿਮੀ ਜਾਂ ਗਲਤ ਵਿਆਖਿਆ ਲਈ ਅਸੀਂ ਜ਼ਿੰਮੇਵਾਰ ਨਹੀਂ ਹਾਂ। \n"
"\n---\n\n**ਅਸਵੀਕਰਤੀ**: \nਇਹ ਦਸਤਾਵੇਜ਼ AI ਅਨੁਵਾਦ ਸੇਵਾ [Co-op Translator](https://github.com/Azure/co-op-translator) ਦੀ ਵਰਤੋਂ ਕਰਕੇ ਅਨੁਵਾਦ ਕੀਤਾ ਗਿਆ ਹੈ। ਜਦੋਂ ਕਿ ਅਸੀਂ ਸਹੀ ਹੋਣ ਦਾ ਯਤਨ ਕਰਦੇ ਹਾਂ, ਕਿਰਪਾ ਕਰਕੇ ਧਿਆਨ ਦਿਓ ਕਿ ਸਵੈਚਾਲਿਤ ਅਨੁਵਾਦਾਂ ਵਿੱਚ ਗਲਤੀਆਂ ਜਾਂ ਅਸੁੱਚੀਤਤਾਵਾਂ ਹੋ ਸਕਦੀਆਂ ਹਨ। ਇਸ ਦੀ ਮੂਲ ਭਾਸ਼ਾ ਵਿੱਚ ਮੌਜੂਦ ਮੂਲ ਦਸਤਾਵੇਜ਼ ਨੂੰ ਪ੍ਰਮਾਣਿਕ ਸਰੋਤ ਮੰਨਿਆ ਜਾਣਾ ਚਾਹੀਦਾ ਹੈ। ਮਹੱਤਵਪੂਰਨ ਜਾਣਕਾਰੀ ਲਈ, ਪੇਸ਼ੇਵਰ ਮਨੁੱਖੀ ਅਨੁਵਾਦ ਦੀ ਸਿਫਾਰਸ਼ ਕੀਤੀ ਜਾਂਦੀ ਹੈ। ਇਸ ਅਨੁਵਾਦ ਦੀ ਵਰਤੋਂ ਤੋਂ ਪੈਦਾ ਹੋਣ ਵਾਲੇ ਕਿਸੇ ਵੀ ਗਲਤਫਹਿਮੀ ਜਾਂ ਗਲਤ ਵਿਆਖਿਆ ਲਈ ਅਸੀਂ ਜ਼ਿੰਮੇਵਾਰ ਨਹੀਂ ਹਾਂ। \n"
]
}
],
@ -942,8 +942,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:52:28+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:24:25+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "pa"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,14 +149,14 @@
{
"cell_type": "markdown",
"source": [
"W tym zestawie danych kolumny są następujące:\n",
"* Wiek i płeć są oczywiste\n",
"* BMI to wskaźnik masy ciała\n",
"* BP to średnie ciśnienie krwi\n",
"* S1 do S6 to różne pomiary krwi\n",
"* Y to jakościowy wskaźnik postępu choroby w ciągu jednego roku\n",
"W tym zbiorze danych kolumny przedstawiają następujące informacje: \n",
"* Wiek i płeć są oczywiste \n",
"* BMI to wskaźnik masy ciała \n",
"* BP to średnie ciśnienie krwi \n",
"* S1 do S6 to różne pomiary krwi \n",
"* Y to jakościowa miara postępu choroby w ciągu jednego roku \n",
"\n",
"Przeanalizujmy ten zestaw danych za pomocą metod prawdopodobieństwa i statystyki.\n",
"Przeanalizujmy ten zbiór danych za pomocą metod prawdopodobieństwa i statystyki.\n",
"\n",
"### Zadanie 1: Oblicz średnie wartości i wariancję dla wszystkich danych\n"
],
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Zastrzeżenie**: \nTen dokument został przetłumaczony za pomocą usługi tłumaczeniowej AI [Co-op Translator](https://github.com/Azure/co-op-translator). Chociaż dokładamy wszelkich starań, aby zapewnić dokładność, prosimy pamiętać, że automatyczne tłumaczenia mogą zawierać błędy lub nieścisłości. Oryginalny dokument w jego rodzimym języku powinien być uznawany za wiarygodne źródło. W przypadku informacji krytycznych zaleca się skorzystanie z profesjonalnego tłumaczenia wykonanego przez człowieka. Nie ponosimy odpowiedzialności za jakiekolwiek nieporozumienia lub błędne interpretacje wynikające z korzystania z tego tłumaczenia.\n"
"\n---\n\n**Zastrzeżenie**: \nTen dokument został przetłumaczony za pomocą usługi tłumaczeniowej AI [Co-op Translator](https://github.com/Azure/co-op-translator). Chociaż dokładamy wszelkich starań, aby tłumaczenie było precyzyjne, prosimy pamiętać, że automatyczne tłumaczenia mogą zawierać błędy lub nieścisłości. Oryginalny dokument w jego rodzimym języku powinien być uznawany za wiarygodne źródło. W przypadku informacji krytycznych zaleca się skorzystanie z profesjonalnego tłumaczenia wykonanego przez człowieka. Nie ponosimy odpowiedzialności za jakiekolwiek nieporozumienia lub błędne interpretacje wynikające z korzystania z tego tłumaczenia.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:45:38+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:29:04+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "pl"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,14 +150,14 @@
{
"cell_type": "markdown",
"source": [
"W tym zestawie danych kolumny są następujące:\n",
"* Wiek i płeć są oczywiste\n",
"* BMI to wskaźnik masy ciała\n",
"* BP to średnie ciśnienie krwi\n",
"* S1 do S6 to różne pomiary krwi\n",
"* Y to jakościowy wskaźnik postępu choroby w ciągu jednego roku\n",
"W tym zbiorze danych kolumny przedstawiają następujące informacje: \n",
"* Wiek i płeć są oczywiste \n",
"* BMI to wskaźnik masy ciała \n",
"* BP to średnie ciśnienie krwi \n",
"* S1 do S6 to różne pomiary krwi \n",
"* Y to jakościowa miara postępu choroby w ciągu jednego roku \n",
"\n",
"Przeanalizujmy ten zestaw danych za pomocą metod prawdopodobieństwa i statystyki.\n",
"Przeanalizujmy ten zbiór danych za pomocą metod prawdopodobieństwa i statystyki.\n",
"\n",
"### Zadanie 1: Oblicz średnie wartości i wariancję dla wszystkich danych\n"
],
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -477,7 +477,7 @@
{
"cell_type": "markdown",
"source": [
"### Zadanie 2: Wykreśl wykresy pudełkowe dla BMI, BP i Y w zależności od płci\n"
"### Zadanie 2: Narysuj wykresy pudełkowe dla BMI, BP i Y w zależności od płci\n"
],
"metadata": {}
},
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:52:50+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:29:20+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "pl"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"Neste conjunto de dados, as colunas são as seguintes:\n",
"* Idade e sexo são autoexplicativos\n",
"* IMC é o índice de massa corporal\n",
"* PA é a pressão arterial média\n",
"* S1 até S6 são diferentes medições sanguíneas\n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano\n",
"Neste conjunto de dados, as colunas são as seguintes: \n",
"* Idade e sexo são autoexplicativos \n",
"* IMC é o índice de massa corporal \n",
"* PA é a pressão arterial média \n",
"* S1 até S6 são diferentes medições sanguíneas \n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano \n",
"\n",
"Vamos estudar este conjunto de dados utilizando métodos de probabilidade e estatística.\n",
"\n",
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores\n"
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores \n"
],
"metadata": {}
},
@ -172,7 +172,7 @@
{
"cell_type": "markdown",
"source": [
"### Tarefa 2: Traçar boxplots para IMC, TA e Y dependendo do género\n"
"### Tarefa 2: Traçar boxplots para IMC, PA e Y dependendo do género\n"
],
"metadata": {}
},
@ -223,7 +223,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Aviso Legal**: \nEste documento foi traduzido utilizando o serviço de tradução automática [Co-op Translator](https://github.com/Azure/co-op-translator). Embora nos esforcemos para garantir a precisão, esteja ciente de que traduções automáticas podem conter erros ou imprecisões. O documento original na sua língua nativa deve ser considerado a fonte oficial. Para informações críticas, recomenda-se a tradução profissional realizada por humanos. Não nos responsabilizamos por quaisquer mal-entendidos ou interpretações incorretas resultantes do uso desta tradução.\n"
"\n---\n\n**Aviso Legal**: \nEste documento foi traduzido utilizando o serviço de tradução por IA [Co-op Translator](https://github.com/Azure/co-op-translator). Embora nos esforcemos para garantir a precisão, é importante notar que traduções automáticas podem conter erros ou imprecisões. O documento original na sua língua nativa deve ser considerado a fonte autoritária. Para informações críticas, recomenda-se a tradução profissional realizada por humanos. Não nos responsabilizamos por quaisquer mal-entendidos ou interpretações incorretas decorrentes da utilização desta tradução.\n"
]
}
],
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:45:51+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:25:21+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "pt"
}

File diff suppressed because one or more lines are too long

@ -14,11 +14,11 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"import matplotlib.pyplot as plt\r\n",
"\r\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"df = pd.read_csv(\"../../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -150,16 +150,16 @@
{
"cell_type": "markdown",
"source": [
"Neste conjunto de dados, as colunas são as seguintes:\n",
"* Idade e sexo são autoexplicativos\n",
"* IMC é o índice de massa corporal\n",
"* PA é a pressão arterial média\n",
"* S1 até S6 são diferentes medições de sangue\n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano\n",
"Neste conjunto de dados, as colunas são as seguintes: \n",
"* Idade e sexo são autoexplicativos \n",
"* IMC é o índice de massa corporal \n",
"* PA é a pressão arterial média \n",
"* S1 até S6 são diferentes medições sanguíneas \n",
"* Y é a medida qualitativa da progressão da doença ao longo de um ano \n",
"\n",
"Vamos estudar este conjunto de dados utilizando métodos de probabilidade e estatística.\n",
"\n",
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores\n"
"### Tarefa 1: Calcular os valores médios e a variância para todos os valores \n"
],
"metadata": {}
},
@ -354,7 +354,7 @@
"cell_type": "code",
"execution_count": 8,
"source": [
"# Another way\r\n",
"# Another way\n",
"pd.DataFrame([df.mean(),df.var()],index=['Mean','Variance']).head()"
],
"outputs": [
@ -446,7 +446,7 @@
"cell_type": "code",
"execution_count": 9,
"source": [
"# Or, more simply, for the mean (variance can be done similarly)\r\n",
"# Or, more simply, for the mean (variance can be done similarly)\n",
"df.mean()"
],
"outputs": [
@ -485,8 +485,8 @@
"cell_type": "code",
"execution_count": 17,
"source": [
"for col in ['BMI','BP','Y']:\r\n",
" df.boxplot(column=col,by='SEX')\r\n",
"for col in ['BMI','BP','Y']:\n",
" df.boxplot(column=col,by='SEX')\n",
"plt.show()"
],
"outputs": [
@ -535,8 +535,8 @@
"cell_type": "code",
"execution_count": 19,
"source": [
"for col in ['AGE','SEX','BMI','Y']:\r\n",
" df[col].hist()\r\n",
"for col in ['AGE','SEX','BMI','Y']:\n",
" df[col].hist()\n",
" plt.show()"
],
"outputs": [
@ -590,10 +590,10 @@
{
"cell_type": "markdown",
"source": [
"Conclusões:\n",
"Conclusões: \n",
"* Idade - normal \n",
"* Sexo - uniforme \n",
"* IMC, Y - difícil de dizer \n"
"* IMC, Y - difícil de determinar \n"
],
"metadata": {}
},
@ -853,10 +853,10 @@
"cell_type": "code",
"execution_count": 26,
"source": [
"fig, ax = plt.subplots(1,3,figsize=(10,5))\r\n",
"for i,n in enumerate(['BMI','S5','BP']):\r\n",
" ax[i].scatter(df['Y'],df[n])\r\n",
" ax[i].set_title(n)\r\n",
"fig, ax = plt.subplots(1,3,figsize=(10,5))\n",
"for i,n in enumerate(['BMI','S5','BP']):\n",
" ax[i].scatter(df['Y'],df[n])\n",
" ax[i].set_title(n)\n",
"plt.show()"
],
"outputs": [
@ -883,9 +883,9 @@
"cell_type": "code",
"execution_count": 27,
"source": [
"from scipy.stats import ttest_ind\r\n",
"\r\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\r\n",
"from scipy.stats import ttest_ind\n",
"\n",
"tval, pval = ttest_ind(df.loc[df['SEX']==1,['Y']], df.loc[df['SEX']==2,['Y']],equal_var=False)\n",
"print(f\"T-value = {tval[0]:.2f}\\nP-value: {pval[0]}\")"
],
"outputs": [
@ -940,8 +940,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "1bdbefe3f2486d8e178ee242ac532d43",
"translation_date": "2025-09-02T09:53:10+00:00",
"original_hash": "ebf5783d7ab3f7ab30a437492a30b229",
"translation_date": "2025-09-06T17:25:36+00:00",
"source_file": "1-Introduction/04-stats-and-probability/solution/assignment.ipynb",
"language_code": "pt"
}

@ -14,10 +14,10 @@
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\r\n",
"import numpy as np\r\n",
"\r\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\n",
"df.head()"
],
"outputs": [
@ -149,16 +149,16 @@
{
"cell_type": "markdown",
"source": [
"În acest set de date, coloanele sunt următoarele: \n",
"* Vârsta și sexul sunt auto-explicative \n",
"* BMI este indicele de masă corporală \n",
"* BP este tensiunea arterială medie \n",
"* S1 până la S6 sunt diferite măsurători ale sângelui \n",
"* Y este măsura calitativă a progresiei bolii pe parcursul unui an \n",
"În acest set de date, coloanele sunt următoarele:\n",
"* Vârsta și sexul sunt auto-explicative\n",
"* BMI este indicele de masă corporală\n",
"* BP este tensiunea arterială medie\n",
"* S1 până la S6 sunt diferite măsurători ale sângelui\n",
"* Y este măsura calitativă a progresiei bolii pe parcursul unui an\n",
"\n",
"Să studiem acest set de date folosind metode de probabilitate și statistică.\n",
"\n",
"### Sarcina 1: Calculați valorile medii și varianța pentru toate valorile \n"
"### Sarcina 1: Calculați valorile medii și varianța pentru toate valorile\n"
],
"metadata": {}
},
@ -249,8 +249,8 @@
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "defe9f96b3d327a6f37d795c43ad0219",
"translation_date": "2025-09-02T09:46:06+00:00",
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:53:28+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "ro"
}

Some files were not shown because too many files have changed in this diff
