diff --git a/1-Introduction/04-stats-and-probability/README.md b/1-Introduction/04-stats-and-probability/README.md
index 4f762e2b..4ea65986 100644
--- a/1-Introduction/04-stats-and-probability/README.md
+++ b/1-Introduction/04-stats-and-probability/README.md
@@ -38,7 +38,7 @@ Suppose we draw a sequence of n samples of a random variable X: x1, x
> It can be demonstrated that for any discrete distribution with values {x1, x2, ..., xN} and corresponding probabilities p1, p2, ..., pN, the expectation would equal to E(X)=x1p1+x2p2+...+xNpN.
-To identify how far the values are spread, we can compute the variance σ2 = ∑(xi - \mu;)2/n), where μ is the mean of the sequence. The value σ is called **standard deviation**, and σ2 is called a **variance**.
+To measure how far the values are spread, we can compute the variance σ² = ∑(xᵢ - μ)²/n, where μ is the mean of the sequence. The value σ is called the **standard deviation**, and σ² is called the **variance**.
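
As a minimal sketch of the variance formula above, the snippet below computes the mean, variance and standard deviation by hand using only the Python standard library. The five values in `x` are hypothetical and chosen purely for illustration. Note that the formula divides by n, while `statistics.variance` (like the pandas `var()` default used in the accompanying notebook) divides by n - 1, so the two results differ slightly.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration
x = [180.0, 215.0, 210.0, 210.0, 188.0]

n = len(x)
mu = sum(x) / n
variance = sum((xi - mu) ** 2 for xi in x) / n   # divides by n, as in the formula
sigma = math.sqrt(variance)

# statistics.pvariance uses the same divide-by-n definition;
# statistics.variance divides by n - 1 instead (the sample variance)
population_variance = statistics.pvariance(x)
sample_variance = statistics.variance(x)

print(f"mean = {mu}, variance = {variance}, std = {sigma}")
print(f"divide-by-(n-1) variance = {sample_variance}")
```

For large n the difference between the two definitions becomes negligible, but for small samples it is worth knowing which one a given library call returns.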
## Real-world Data
@@ -48,12 +48,40 @@ When we analyze data from real life, they often are not random variables as such
[180.0, 215.0, 210.0, 210.0, 188.0, 176.0, 209.0, 200.0, 231.0, 180.0, 188.0, 180.0, 185.0, 160.0, 180.0, 185.0, 197.0, 189.0, 185.0, 219.0]
```
-> When working with real data, we assume that data points are samples drawn from some probability distribution. This assumption allows us to apply machine learning techniques and build working predictive models.
+> When working with real-world data, we assume that all data points are samples drawn from some probability distribution. This assumption allows us to apply machine learning techniques and build working predictive models.
+To see the distribution of our data, we can plot a graph called a **histogram**. The x-axis contains a number of weight intervals (so-called **bins**), and the vertical axis shows how many samples of our random variable fall inside each interval.
+
+From this histogram you can see that the values are centered around a certain mean weight, and the further we move from that weight, the fewer values we encounter. In other words, it is very improbable that the weight of a baseball player differs greatly from the mean weight. The variance of the weights shows the extent to which weights are likely to differ from the mean.
+
+> If we take the weights of other people, not from the baseball league, the distribution is likely to be different. The shape of the distribution is likely to be similar, but the mean and variance would change. So, if we train our model on baseball players, it is likely to give wrong results when applied to university students, because the underlying distribution is different.
## Normal Distribution
+The distribution of weights that we have seen above is very typical, and many real-world measurements follow the same type of distribution, but with a different mean and variance. This distribution is called the **normal distribution**, and it plays a very important role in statistics.
+
+Using a normal distribution is an appropriate way to generate random weights of potential baseball players. Once we know the mean weight `mean` and the standard deviation `std`, we can generate 1000 weight samples in the following way:
+```python
+samples = np.random.normal(mean,std,1000)
+```
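
The same idea can be tried without NumPy or a plotting library. The sketch below is an illustration under stated assumptions, not the lesson's own code: it uses the standard-library `random.gauss` in place of `np.random.normal`, the values `mean = 200.0` and `std = 20.0` are assumed round numbers close to the baseball-player weights, and the bin width of 10 is an arbitrary choice. It prints a rough text histogram and compares the sample statistics with the parameters.

```python
import random
import statistics
from collections import Counter

# Assumed values, close to the baseball-player weights; not taken from the lesson's data
mean, std = 200.0, 20.0

rng = random.Random(42)            # fixed seed so the run is repeatable
samples = [rng.gauss(mean, std) for _ in range(1000)]

# Bin the samples into 10-unit-wide intervals and count each bin
bin_width = 10
counts = Counter(int(s // bin_width) * bin_width for s in samples)

# Print one row per bin; each '#' stands for 5 samples
for left in sorted(counts):
    print(f"{left:4d}-{left + bin_width:<4d} {'#' * (counts[left] // 5)}")

print("sample mean:", statistics.fmean(samples))
print("sample std: ", statistics.pstdev(samples))
```

The rows should be longest near the assumed mean and shrink towards both ends, and the sample mean and standard deviation should land close to, but not exactly on, the parameters passed to the generator.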
+
+If we plot the histogram of the generated samples, we will see a picture very similar to the one shown above. If we increase the number of samples and the number of bins, we can generate a picture of a normal distribution that is closer to the ideal:
+
+
+*Normal Distribution with mean=0 and std.dev=1*
+
+
+## Law of Large Numbers and Central Limit Theorem
+
+One of the reasons why the normal distribution is so important is the so-called **central limit theorem**. Suppose we have a large sample of N independent values X1, ..., XN, drawn from any distribution with mean μ and finite variance σ². Then, for sufficiently large N (in other words, as N→∞), the sample mean (X1 + ... + XN)/N is approximately normally distributed, with mean μ and variance σ²/N.
+
+> Another way to interpret the central limit theorem is to say that, regardless of the underlying distribution (as long as its variance is finite), the mean of a large number of independent values of a random variable is approximately normally distributed.
+
+From the central limit theorem it also follows that, as N→∞, the sample mean converges to μ: the probability that it differs from μ by more than any fixed amount tends to 0. This is known as **the law of large numbers**.
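
Both statements can be checked numerically. The following is a small sketch, assuming draws from the uniform distribution on [0, 1), which has mean 1/2 and variance 1/12. The helper `sample_mean`, the seed, and the values of `N` and `trials` are illustrative choices rather than part of the lesson.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed, an arbitrary choice


def sample_mean(n):
    """Mean of n independent draws from the uniform distribution on [0, 1)."""
    return sum(rng.random() for _ in range(n)) / n


N = 100          # size of each sample
trials = 5000    # number of sample means to collect
means = [sample_mean(N) for _ in range(trials)]

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the theorem predicts
# sample means with mean 1/2 and variance (1/12)/N
predicted_variance = (1 / 12) / N
print("mean of sample means:    ", statistics.fmean(means))
print("variance of sample means:", statistics.pvariance(means))
print("predicted variance:      ", predicted_variance)

# Law of large numbers: the sample mean settles near 1/2 as n grows
big_mean = sample_mean(100_000)
for n in (10, 1000):
    print(n, sample_mean(n))
print(100_000, big_mean)
```

The observed variance of the sample means should be close to the predicted σ²/N, and the sample mean for the largest n should sit very near 1/2.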
+
+
+
## 🚀 Challenge
diff --git a/1-Introduction/04-stats-and-probability/images/normal-histogram.png b/1-Introduction/04-stats-and-probability/images/normal-histogram.png
new file mode 100644
index 00000000..af40eca9
Binary files /dev/null and b/1-Introduction/04-stats-and-probability/images/normal-histogram.png differ
diff --git a/1-Introduction/04-stats-and-probability/notebook.ipynb b/1-Introduction/04-stats-and-probability/notebook.ipynb
index 41dda760..ad4cd849 100644
--- a/1-Introduction/04-stats-and-probability/notebook.ipynb
+++ b/1-Introduction/04-stats-and-probability/notebook.ipynb
@@ -338,6 +338,172 @@
],
"metadata": {}
},
+ {
+ "cell_type": "code",
+ "execution_count": 49,
+ "source": [
+ "mean = df['Weight'].mean()\r\n",
+ "var = df['Weight'].var()\r\n",
+ "std = df['Weight'].std()\r\n",
+ "print(f\"Mean = {mean}\\nVariance = {var}\\nStandard Deviation = {std}\")"
+ ],
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Mean = 201.6892545982575\n",
+ "Variance = 440.6426848120547\n",
+ "Standard Deviation = 20.991490771549664\n"
+ ]
+ }
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Normal Distribution\r\n",
+ "\r\n",
+ "Let's create an artificial sample of weights that follows a normal distribution with the same mean and variance as the real data:"
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 60,
+ "source": [
+ "generated = np.random.normal(mean,std,1000)\r\n",
+ "generated[:20]"
+ ],
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "array([187.05660174, 181.77292853, 183.09148457, 198.30703945,\n",
+ " 201.51640234, 213.21564624, 221.00562653, 218.30263433,\n",
+ " 234.16968198, 187.40138853, 199.34286071, 205.52705493,\n",
+ " 251.03651986, 189.64156046, 222.23536452, 211.37502445,\n",
+ " 205.07287496, 207.90248813, 180.66579133, 226.86092236])"
+ ]
+ },
+ "metadata": {},
+ "execution_count": 60
+ }
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 54,
+ "source": [
+ "plt.hist(generated,bins=15)\r\n",
+ "plt.show()"
+ ],
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/svg+xml": "\r\n\r\n\r\n",
+ "image/png": ""
+ },
+ "metadata": {}
+ }
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 62,
+ "source": [
+ "plt.hist(np.random.normal(0,1,50000),bins=300)\r\n",
+ "plt.show()"
+ ],
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/svg+xml": "\r\n\r\n\r\n",
+ "image/png": ""
+ },
+ "metadata": {}
+ }
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "Since many real-life measurements are approximately normally distributed, we should not use a uniform random number generator to generate such sample data. Here is what happens if we try to generate weights with a uniform distribution (generated by `np.random.rand`):"
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 63,
+ "source": [
+ "wrong_sample = np.random.rand(1000)*2*std+mean-std\r\n",
+ "plt.hist(wrong_sample)\r\n",
+ "plt.show()"
+ ],
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/svg+xml": "\r\n\r\n\r\n",
+ "image/png": ""
+ },
+ "metadata": {}
+ }
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Simulating Normal Distribution with Central Limit Theorem\r\n",
+ "\r\n",
+ "The pseudo-random generator in Python is designed to give us a uniform distribution. If we want to create a generator for the normal distribution, we can use the central limit theorem. To get an approximately normally distributed value, we just compute the mean of a uniformly generated sample."
+ ],
+ "metadata": {}
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 64,
+ "source": [
+ "def normal_random(sample_size=100):\r\n",
+ "    sample = [random.uniform(0, 1) for _ in range(sample_size)]\r\n",
+ "    return sum(sample) / sample_size\r\n",
+ "\r\n",
+ "sample = [normal_random() for _ in range(100)]\r\n",
+ "plt.hist(sample)\r\n",
+ "plt.show()"
+ ],
+ "outputs": [
+ {
+ "output_type": "display_data",
+ "data": {
+ "text/plain": [
+ ""
+ ],
+ "image/svg+xml": "\r\n\r\n\r\n",
+ "image/png": ""
+ },
+ "metadata": {}
+ }
+ ],
+ "metadata": {}
+ },
{
"cell_type": "code",
"execution_count": null,