> ### [This lesson is available in R!](./solution/R/lesson_3-R.ipynb)
> ### [This lesson is available in R!](./solution/R/lesson_3.html)
### Introduction
So far you have explored what regression is with sample data gathered from the pumpkin pricing dataset that we will use throughout this lesson. You have also visualized it using Matplotlib.
@@ -84,8 +84,8 @@ We do so since we want to model a line that has the least cumulative distance fr
>
> In other words, and referring to our pumpkin data's original question: "predict the price of a pumpkin per bushel by month", `X` would refer to the price and `Y` would refer to the month of sale.
>
> ![Infographic by Jen Looper](../../images/calculation.png)
![Infographic by Jen Looper](../../images/calculation.png)
>
> Calculate the value of Y. If you're paying around \$4, it must be April!
>
> The math that calculates the line must demonstrate the slope of the line, which is also dependent on the intercept, or where `Y` is situated when `X = 0`.
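
As a minimal sketch of the slope and intercept described above, assuming an ordinary least-squares fit, the line can be computed by hand. The function name `fit_line` and the sample values are hypothetical and chosen only for illustration; they are not taken from the pumpkin dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how the deviations of Y vary with the deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # Intercept: the value of Y when X = 0.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample points that lie exactly on Y = 2X + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real data the points will not sit exactly on a line, and the same formulas then give the line with the least cumulative squared distance from the points.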
@@ -114,7 +114,7 @@ Load up required libraries and dataset. Convert the data to a data frame contain
- Convert the price to reflect the pricing by bushel quantity
> We covered these steps in the [previous lesson](https://github.com/microsoft/ML-For-Beginners/blob/main/2-Regression/2-Data/solution/lesson_2-R.ipynb).
> We covered these steps in the [previous lesson](https://github.com/microsoft/ML-For-Beginners/blob/main/2-Regression/2-Data/solution/lesson_2.html).
@@ -285,7 +285,7 @@ That's an awesome thought! You see, once your recipe is defined, you can estimat
For that, you'll need two more verbs: `prep()` and `bake()` and as always, our little R friends by [`Allison Horst`](https://github.com/allisonhorst/stats-illustrations) help you in understanding this better!
![Artwork by \@allison_horst](../images/recipes.png){width="550"}
![Artwork by \@allison_horst](../../images/recipes.png){width="550"}
[`prep()`](https://recipes.tidymodels.org/reference/prep.html): estimates the required parameters from a training set so that they can later be applied to other data sets. For instance, for a given predictor column, it determines which observation will be assigned the integer 0, 1, 2, and so on.
For our purposes, we will express this as a binary: 'Orange' or 'Not Orange'. There is also a 'striped' category in our dataset but there are few instances of it, so we will not use it. It disappears once we remove null values from the dataset, anyway.
@@ -148,7 +148,7 @@ pumpkins_select %>%
The goal of data exploration is to try to understand the `relationships` between its attributes; in particular, any apparent correlation between the *features* and the *label* your model will try to predict. One way of doing this is by using data visualization.
Given the data types of our columns, we can `encode` them and be on our way to making some visualizations. This simply involves `translating` a column with `categorical values`, for example our columns of type *char*, into one or more `numeric columns` that take the place of the original, something we did in our [last lesson](https://github.com/microsoft/ML-For-Beginners/blob/main/2-Regression/3-Linear/solution/lesson_3-R.ipynb).
Given the data types of our columns, we can `encode` them and be on our way to making some visualizations. This simply involves `translating` a column with `categorical values`, for example our columns of type *char*, into one or more `numeric columns` that take the place of the original, something we did in our [last lesson](https://github.com/microsoft/ML-For-Beginners/blob/main/2-Regression/3-Linear/solution/lesson_3.html).
Tidymodels provides yet another neat package: [recipes](https://recipes.tidymodels.org/), a package for preprocessing data. We'll define a `recipe` that specifies that all predictor columns should be encoded into a set of integers, `prep` it to estimate the quantities and statistics required by any operations, and finally `bake` it to apply the computations to new data.
@@ -94,12 +94,12 @@ By ensuring that the content aligns with projects, the process is made more enga
| 02 | The History of machine learning | [Introduction](1-Introduction/README.md) | Learn the history underlying this field | [Lesson](1-Introduction/2-history-of-ML/README.md) | Jen and Amy |
| 03 | Fairness and machine learning | [Introduction](1-Introduction/README.md) | What are the important philosophical issues around fairness that students should consider when building and applying ML models? | [Lesson](1-Introduction/3-fairness/README.md) | Tomomi |
| 04 | Techniques for machine learning | [Introduction](1-Introduction/README.md) | What techniques do ML researchers use to build ML models? | [Lesson](1-Introduction/4-techniques-of-ML/README.md) | Chris and Jen |
| 05 | Introduction to regression | [Regression](2-Regression/README.md) | Get started with Python and Scikit-learn for regression models | <ul><li>[Python](2-Regression/1-Tools/README.md)</li><li>[R](2-Regression/1-Tools/solution/R/lesson_1-R.ipynb)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 05 | Introduction to regression | [Regression](2-Regression/README.md) | Get started with Python and Scikit-learn for regression models | <ul><li>[Python](2-Regression/1-Tools/README.md)</li><li>[R](2-Regression/1-Tools/solution/R/lesson_1.html)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 06 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Visualize and clean data in preparation for ML | <ul><li>[Python](2-Regression/2-Data/README.md)</li><li>[R](2-Regression/2-Data/solution/R/lesson_2-R.ipynb)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 06 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Visualize and clean data in preparation for ML | <ul><li>[Python](2-Regression/2-Data/README.md)</li><li>[R](2-Regression/2-Data/solution/R/lesson_2.html)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 07 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build linear and polynomial regression models | <ul><li>[Python](2-Regression/3-Linear/README.md)</li><li>[R](2-Regression/3-Linear/solution/R/lesson_3-R.ipynb)</li></ul> | <ul><li>Jen and Dmitry</li><li>Eric Wanjau</li></ul> |
| 07 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build linear and polynomial regression models | <ul><li>[Python](2-Regression/3-Linear/README.md)</li><li>[R](2-Regression/3-Linear/solution/R/lesson_3.html)</li></ul> | <ul><li>Jen and Dmitry</li><li>Eric Wanjau</li></ul> |
| 08 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build a logistic regression model | <ul><li>[Python](2-Regression/4-Logistic/README.md) </li><li>[R](2-Regression/4-Logistic/solution/R/lesson_4-R.ipynb)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 08 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build a logistic regression model | <ul><li>[Python](2-Regression/4-Logistic/README.md) </li><li>[R](2-Regression/4-Logistic/solution/R/lesson_4.html)</li></ul> | <ul><li>Jen</li><li>Eric Wanjau</li></ul> |
| 09 | A Web App 🔌 | [Web App](3-Web-App/README.md) | Build a web app to use your trained model | [Python](3-Web-App/1-Web-App/README.md) | Jen |
| 10 | Introduction to classification | [Classification](4-Classification/README.md) | Clean, prep, and visualize your data; introduction to classification | <ul><li> [Python](4-Classification/1-Introduction/README.md) </li><li>[R](4-Classification/1-Introduction/solution/R/lesson_10-R.ipynb)</li></ul> | <ul><li>Jen and Cassie</li><li>Eric Wanjau</li></ul> |
| 10 | Introduction to classification | [Classification](4-Classification/README.md) | Clean, prep, and visualize your data; introduction to classification | <ul><li> [Python](4-Classification/1-Introduction/README.md) </li><li>[R](4-Classification/1-Introduction/solution/R/lesson_10.html)</li></ul> | <ul><li>Jen and Cassie</li><li>Eric Wanjau</li></ul> |
| 11 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | Introduction to classifiers | <ul><li> [Python](4-Classification/2-Classifiers-1/README.md)</li><li>[R](4-Classification/2-Classifiers-1/solution/R/lesson_11.html)</li></ul> | <ul><li>Jen and Cassie</li><li>Eric Wanjau</li></ul> |
| 12 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | More classifiers | <ul><li> [Python](4-Classification/3-Classifiers-2/README.md)</li><li>[R](4-Classification/3-Classifiers-2/solution/R/lesson_12.html)</li></ul> | <ul><li>Jen and Cassie</li><li>Eric Wanjau</li></ul> |
| 13 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | Build a recommender web app using your model | [Python](4-Classification/4-Applied/README.md) | Jen |