"You could use `isnull` to do this in place, but that can be laborious, particularly if you have a lot of values to fill. Because this is such a common task in data science, pandas provides `fillna`, which returns a copy of the `Series` or `DataFrame` with the missing values replaced with one of your choosing. Let's create another example `Series` to see how this works in practice."
]
},
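As a minimal sketch of how `fillna` behaves, assume a small hypothetical `Series` named `example` with two missing entries (the name and the values are illustrative and not taken from the notebook). The point to notice is that `fillna` returns a new object and leaves the original untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical example Series with two missing entries (NaN and None).
example = pd.Series([1, np.nan, 2, None, 3], index=list("abcde"))

# fillna returns a new Series; the original is left unchanged.
filled = example.fillna(0)

print(filled)
print(example.isnull().sum())  # missing values still present in the original
```

Here `0` is only a placeholder fill value; which value is appropriate depends on the data.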
{
"cell_type": "markdown",
"metadata": {
"id": "CE8S7louLezV"
},
"source": [
"First let us consider non-numeric data. In datasets, we have columns with categorical data. Eg. Gender, True or False etc.\n",
"\n",
"In most of these cases, we replace missing values with the `mode` of the column. Say, we have 100 data points and 90 have said True, 8 have said False and 2 have not filled. Then, we can will the 2 with True, considering the full column. \n",
"\n",
"Again, here we can use domain knowledge here. Let us consider an example of filling with the mode."
"As we can see, the null value has been replaced. Needless to say, we could have written anything in place or `'True'` and it would have got substituted."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "heYe1I0dOmQ_"
},
"source": [
"Now, coming to numeric data. Here, we have a two common ways of replacing missing values:\n",
"\n",
"1. Replace with Median of the row\n",
"2. Replace with Mean of the row \n",
"\n",
"We replace with Median, in case of skewed data with outliers. This is beacuse median is robust to outliers.\n",
"\n",
"When the data is normalized, we can use mean, as in that case, mean and median would be pretty close."