| ](../../sketchnotes/10-Visualizing-Distributions.png)|
| ](https://github.com/microsoft/Data-Science-For-Beginners/blob/main/sketchnotes/10-Visualizing-Distributions.png)|
|:---:|
| Visualizing Distributions - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |

This gives an overview of the general distribution of body length per bird Order, but it is not the optimal way to display true distributions. That task is usually handled by creating a Histogram.
## Working with histograms
@ -47,7 +47,7 @@ This gives an overview of the general distribution of body length per bird Order

As you can see, most of the 400+ birds in this dataset fall in the range of under 2000 for their Max Body Mass. Gain more insight into the data by changing the `bins` parameter to a higher number, something like 30:
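
As a minimal sketch of how the `bins` parameter changes the picture, the block below builds the same histogram twice. It assumes `ggplot2` is installed and uses `birds_demo`, a hypothetical simulated stand-in for the lesson's birds data (not real measurements), so it can run on its own.

```r
library(ggplot2)

# Hypothetical stand-in for the lesson's birds dataset: 400 simulated,
# right-skewed body masses. These are NOT real measurements.
set.seed(42)
birds_demo <- data.frame(MaxBodyMass = rlnorm(400, meanlog = 5, sdlog = 1.2))

# A coarse view with few bins...
p_coarse <- ggplot(birds_demo, aes(x = MaxBodyMass)) +
  geom_histogram(bins = 10) +
  ylab("Frequency")

# ...and a finer view with 30 bins, as suggested in the text.
p_fine <- ggplot(birds_demo, aes(x = MaxBodyMass)) +
  geom_histogram(bins = 30) +
  ylab("Frequency")

print(p_fine)
```

With real data, swap `birds_demo` for the lesson's data frame and compare the two plots side by side to see which bin count reveals more structure.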
@ -55,7 +55,7 @@ As you can see, most of the 400+ birds in this dataset fall in the range of unde
This chart shows the distribution in a bit more granular fashion. A chart less skewed to the left could be created by ensuring that you only select data within a given range:
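
One way to restrict the data to a range is base R's `subset()`. The sketch below is illustrative only: `birds_demo` is a hypothetical simulated stand-in for the lesson's data, and the bounds `lower` and `upper` are arbitrary values chosen for the example.

```r
library(ggplot2)

# Hypothetical stand-in for the lesson's birds dataset (simulated values).
set.seed(42)
birds_demo <- data.frame(MaxBodyMass = rlnorm(400, meanlog = 5, sdlog = 1.2))

# Illustrative bounds; pick values that suit your own data.
lower <- 1
upper <- 60

# Keep only the rows whose MaxBodyMass lies strictly inside the range.
birds_in_range <- subset(birds_demo, MaxBodyMass > lower & MaxBodyMass < upper)

# Plot the histogram of the filtered data only.
p_range <- ggplot(birds_in_range, aes(x = MaxBodyMass)) +
  geom_histogram(bins = 30) +
  ylab("Frequency")

print(p_range)
```

Filtering before plotting keeps a handful of very large values from compressing the rest of the distribution into the first few bins.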

There doesn't seem to be a good correlation between minimum wingspan and conservation status. Test other elements of the dataset using this method. You can try different filters as well. Do you find any correlation?
@ -126,7 +126,7 @@ Let's work with density plot's now!
You can see how the plot echoes the previous one for Minimum Wingspan data; it's just a bit smoother. If you wanted to revisit that jagged MaxBodyMass line in the second chart you built, you could smooth it out very well by recreating it using this method:
@ -134,7 +134,7 @@ You can see how the plot echoes the previous one for Minimum Wingspan data; it's
✅ Read about the parameters available for this type of plot and experiment!
@ -152,8 +152,29 @@ This type of chart offers beautifully explanatory visualizations. With a few lin
ggplot(data=birds_filtered_1,aes(x = MaxBodyMass, fill = Order)) +
  geom_density(alpha=0.5)
```
![bodymass per order]()

You can also map the density of several variables in one chart. Test the MaxLength and MinLength of birds against their conservation status:
```r
to be inserted
```
![2d density plot]()
Perhaps it's worth researching whether the cluster of 'Vulnerable' birds according to their lengths is meaningful or not.
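
A 2D density plot of this kind could be sketched with `geom_density_2d()`. This is an assumption about the intended chart, not the lesson's own code: `birds_demo`, its simulated lengths, and the status labels `LC`, `NT`, and `VU` are all hypothetical placeholders.

```r
library(ggplot2)

# Hypothetical stand-in data: simulated lengths and made-up status labels.
set.seed(42)
n <- 150
birds_demo <- data.frame(
  ConservationStatus = rep(c("LC", "NT", "VU"), each = n / 3),
  MinLength = runif(n, min = 10, max = 60)
)
# Make MaxLength always larger than MinLength, as it would be for real birds.
birds_demo$MaxLength <- birds_demo$MinLength + runif(n, min = 2, max = 20)

# One set of density contours per conservation status, distinguished by color.
p_2d <- ggplot(birds_demo,
               aes(x = MinLength, y = MaxLength, color = ConservationStatus)) +
  geom_density_2d()

print(p_2d)
```

Each set of contour lines marks where birds of one status concentrate, so clusters that sit apart from the others stand out.
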
## 🚀 Challenge
Histograms are a more sophisticated type of chart than basic scatterplots, bar charts, or line charts. Search the internet for good examples of histograms in use. How are they used, what do they demonstrate, and in which fields or areas of inquiry do they tend to appear?
In this lesson, you used `ggplot2` and started building more sophisticated charts. Do some research on `geom_density_2d()`, a "continuous probability density curve in one or more dimensions". Read through [the documentation](https://ggplot2.tidyverse.org/reference/geom_density_2d.html) to understand how it works.