{
"cells": [
{
"cell_type": "markdown",
"source": [
"## Introduction to Probability and Statistics\n",
"## Assignment\n",
"\n",
"In this assignment, we will use the dataset of diabetes patients taken [from here](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 13,
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"# Load the tab-separated diabetes dataset and preview the first rows\n",
"df = pd.read_csv(\"../../data/diabetes.tsv\", sep='\\t')\n",
"df.head()"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"   AGE  SEX   BMI     BP   S1     S2    S3   S4      S5  S6    Y\n",
"0   59    2  32.1  101.0  157   93.2  38.0  4.0  4.8598  87  151\n",
"1   48    1  21.6   87.0  183  103.2  70.0  3.0  3.8918  69   75\n",
"2   72    2  30.5   93.0  156   93.6  41.0  4.0  4.6728  85  141\n",
"3   24    1  25.3   84.0  198  131.4  40.0  5.0  4.8903  89  206\n",
"4   50    1  23.0  101.0  192  125.4  52.0  4.0  4.2905  80  135"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>AGE</th>\n",
" <th>SEX</th>\n",
" <th>BMI</th>\n",
" <th>BP</th>\n",
" <th>S1</th>\n",
" <th>S2</th>\n",
" <th>S3</th>\n",
" <th>S4</th>\n",
" <th>S5</th>\n",
" <th>S6</th>\n",
" <th>Y</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>59</td>\n",
" <td>2</td>\n",
" <td>32.1</td>\n",
" <td>101.0</td>\n",
" <td>157</td>\n",
" <td>93.2</td>\n",
" <td>38.0</td>\n",
" <td>4.0</td>\n",
" <td>4.8598</td>\n",
" <td>87</td>\n",
" <td>151</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>48</td>\n",
" <td>1</td>\n",
" <td>21.6</td>\n",
" <td>87.0</td>\n",
" <td>183</td>\n",
" <td>103.2</td>\n",
" <td>70.0</td>\n",
" <td>3.0</td>\n",
" <td>3.8918</td>\n",
" <td>69</td>\n",
" <td>75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>72</td>\n",
" <td>2</td>\n",
" <td>30.5</td>\n",
" <td>93.0</td>\n",
" <td>156</td>\n",
" <td>93.6</td>\n",
" <td>41.0</td>\n",
" <td>4.0</td>\n",
" <td>4.6728</td>\n",
" <td>85</td>\n",
" <td>141</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>24</td>\n",
" <td>1</td>\n",
" <td>25.3</td>\n",
" <td>84.0</td>\n",
" <td>198</td>\n",
" <td>131.4</td>\n",
" <td>40.0</td>\n",
" <td>5.0</td>\n",
" <td>4.8903</td>\n",
" <td>89</td>\n",
" <td>206</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>50</td>\n",
" <td>1</td>\n",
" <td>23.0</td>\n",
" <td>101.0</td>\n",
" <td>192</td>\n",
" <td>125.4</td>\n",
" <td>52.0</td>\n",
" <td>4.0</td>\n",
" <td>4.2905</td>\n",
" <td>80</td>\n",
" <td>135</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"metadata": {},
"execution_count": 13
}
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"In this dataset, the columns are as follows:\n",
"\n",
"* AGE and SEX are self-explanatory\n",
"* BMI is the body mass index\n",
"* BP is the average blood pressure\n",
"* S1 through S6 are different blood measurements\n",
"* Y is a quantitative measure of disease progression over one year\n",
"\n",
"Let's study this dataset using methods of probability and statistics.\n",
"\n",
"### Task 1: Compute the mean and variance of every column\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Task 2: Plot boxplots of BMI, BP and Y grouped by sex\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Task 3: What is the distribution of the AGE, SEX, BMI and Y variables?\n"
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"source": [],
"outputs": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"### Task 4: Test the correlation between different variables and disease progression (Y)\n",
"\n",
"> **Hint**: A correlation matrix gives you the most useful information about which values depend on each other.\n"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Disclaimer**: \nThis document was translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.\n"
]
}
],
"metadata": {
"orig_nbformat": 4,
"language_info": {
"name": "python",
"version": "3.8.8",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3.8.8 64-bit (conda)"
},
"interpreter": {
"hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
},
"coopTranslator": {
"original_hash": "6d945fd15163f60cb473dbfe04b2d100",
"translation_date": "2025-09-06T17:03:05+00:00",
"source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
"language_code": "de"
}
},
"nbformat": 4,
"nbformat_minor": 2
}