{ "cells": [ { "cell_type": "markdown", "source": [ "## Introduction to Probability and Statistics\r\n", "## Assignment\r\n", "\r\n", "In this assignment, we will use the dataset of diabetes patients taken [from here](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)." ], "metadata": {} }, { "cell_type": "code", "execution_count": 13, "source": [ "import pandas as pd\r\n", "import numpy as np\r\n", "\r\n", "df = pd.read_csv(\"../../data/diabetes.tsv\",sep='\\t')\r\n", "df.head()" ], "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " AGE SEX BMI BP S1 S2 S3 S4 S5 S6 Y\n", "0 59 2 32.1 101.0 157 93.2 38.0 4.0 4.8598 87 151\n", "1 48 1 21.6 87.0 183 103.2 70.0 3.0 3.8918 69 75\n", "2 72 2 30.5 93.0 156 93.6 41.0 4.0 4.6728 85 141\n", "3 24 1 25.3 84.0 198 131.4 40.0 5.0 4.8903 89 206\n", "4 50 1 23.0 101.0 192 125.4 52.0 4.0 4.2905 80 135" ], "text/html": [ "
<table>\n", "  <thead>\n",
"    <tr><th></th><th>AGE</th><th>SEX</th><th>BMI</th><th>BP</th><th>S1</th><th>S2</th><th>S3</th><th>S4</th><th>S5</th><th>S6</th><th>Y</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>59</td><td>2</td><td>32.1</td><td>101.0</td><td>157</td><td>93.2</td><td>38.0</td><td>4.0</td><td>4.8598</td><td>87</td><td>151</td></tr>\n",
"    <tr><th>1</th><td>48</td><td>1</td><td>21.6</td><td>87.0</td><td>183</td><td>103.2</td><td>70.0</td><td>3.0</td><td>3.8918</td><td>69</td><td>75</td></tr>\n",
"    <tr><th>2</th><td>72</td><td>2</td><td>30.5</td><td>93.0</td><td>156</td><td>93.6</td><td>41.0</td><td>4.0</td><td>4.6728</td><td>85</td><td>141</td></tr>\n",
"    <tr><th>3</th><td>24</td><td>1</td><td>25.3</td><td>84.0</td><td>198</td><td>131.4</td><td>40.0</td><td>5.0</td><td>4.8903</td><td>89</td><td>206</td></tr>\n",
"    <tr><th>4</th><td>50</td><td>1</td><td>23.0</td><td>101.0</td><td>192</td><td>125.4</td><td>52.0</td><td>4.0</td><td>4.2905</td><td>80</td><td>135</td></tr>\n",
"  </tbody>\n", "</table>\n", "