{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "## Introduction to Probability and Statistics\n",
    "## Assignment\n",
    "\n",
    "In this assignment, we will use the dataset of diabetes patients available [here](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html).\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "df = pd.read_csv(\"../../data/diabetes.tsv\", sep='\\t')\n",
    "df.head()"
   ],
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "   AGE  SEX   BMI     BP   S1     S2    S3   S4      S5  S6    Y\n",
       "0   59    2  32.1  101.0  157   93.2  38.0  4.0  4.8598  87  151\n",
       "1   48    1  21.6   87.0  183  103.2  70.0  3.0  3.8918  69   75\n",
       "2   72    2  30.5   93.0  156   93.6  41.0  4.0  4.6728  85  141\n",
       "3   24    1  25.3   84.0  198  131.4  40.0  5.0  4.8903  89  206\n",
       "4   50    1  23.0  101.0  192  125.4  52.0  4.0  4.2905  80  135"
      ],
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AGE</th>\n",
       "      <th>SEX</th>\n",
       "      <th>BMI</th>\n",
       "      <th>BP</th>\n",
       "      <th>S1</th>\n",
       "      <th>S2</th>\n",
       "      <th>S3</th>\n",
       "      <th>S4</th>\n",
       "      <th>S5</th>\n",
       "      <th>S6</th>\n",
       "      <th>Y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>59</td>\n",
       "      <td>2</td>\n",
       "      <td>32.1</td>\n",
       "      <td>101.0</td>\n",
       "      <td>157</td>\n",
       "      <td>93.2</td>\n",
       "      <td>38.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.8598</td>\n",
       "      <td>87</td>\n",
       "      <td>151</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>48</td>\n",
       "      <td>1</td>\n",
       "      <td>21.6</td>\n",
       "      <td>87.0</td>\n",
       "      <td>183</td>\n",
       "      <td>103.2</td>\n",
       "      <td>70.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>3.8918</td>\n",
       "      <td>69</td>\n",
       "      <td>75</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>72</td>\n",
       "      <td>2</td>\n",
       "      <td>30.5</td>\n",
       "      <td>93.0</td>\n",
       "      <td>156</td>\n",
       "      <td>93.6</td>\n",
       "      <td>41.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.6728</td>\n",
       "      <td>85</td>\n",
       "      <td>141</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>24</td>\n",
       "      <td>1</td>\n",
       "      <td>25.3</td>\n",
       "      <td>84.0</td>\n",
       "      <td>198</td>\n",
       "      <td>131.4</td>\n",
       "      <td>40.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>4.8903</td>\n",
       "      <td>89</td>\n",
       "      <td>206</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>50</td>\n",
       "      <td>1</td>\n",
       "      <td>23.0</td>\n",
       "      <td>101.0</td>\n",
       "      <td>192</td>\n",
       "      <td>125.4</td>\n",
       "      <td>52.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.2905</td>\n",
       "      <td>80</td>\n",
       "      <td>135</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ]
     },
     "metadata": {},
     "execution_count": 13
    }
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "The columns in this dataset are as follows:\n",
    "* AGE and SEX are self-explanatory\n",
    "* BMI is the body mass index\n",
    "* BP is the average blood pressure\n",
    "* S1 through S6 are different blood measurements\n",
    "* Y is a quantitative measure of disease progression over one year\n",
    "\n",
    "Let's study this dataset using methods of probability and statistics.\n",
    "\n",
    "### Task 1: Compute the mean and variance of every column\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Task 2: Plot boxplots for BMI, BP and Y depending on sex\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "### Task 4: Test the correlation between different variables and disease progression (Y)\n",
    "\n",
    "> **Hint** A correlation matrix will give you the most useful information about which variables are dependent.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n---\n\n**Disclaimer**: \nThis document was translated using the machine translation service [Co-op Translator](https://github.com/Azure/co-op-translator). Although we strive for accuracy, please note that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We accept no liability for misunderstandings or misinterpretations arising from the use of this translation.\n"
   ]
  }
 ],
 "metadata": {
  "orig_nbformat": 4,
  "language_info": {
   "name": "python",
   "version": "3.8.8",
   "mimetype": "text/x-python",
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "pygments_lexer": "ipython3",
   "nbconvert_exporter": "python",
   "file_extension": ".py"
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.8.8 64-bit (conda)"
  },
  "interpreter": {
   "hash": "86193a1ab0ba47eac1c69c1756090baa3b420b3eea7d4aafab8b85f8b312f0c5"
  },
  "coopTranslator": {
   "original_hash": "6d945fd15163f60cb473dbfe04b2d100",
   "translation_date": "2025-09-06T17:00:34+00:00",
   "source_file": "1-Introduction/04-stats-and-probability/assignment.ipynb",
   "language_code": "fr"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}