@@ -89,7 +89,7 @@ In the `notebook.ipynb` file associated to this lesson, clear out all the cells
In this section, you will work with a small dataset about diabetes that is built into Scikit-Learn for learning purposes. Imagine that you wanted to test a treatment for diabetic patients. Machine Learning models might help you determine which patients would respond better to the treatment, based on combinations of variables. Even a very basic Regression model, when visualized, might show information about variables that would help you organize your theoretical clinical trials.
> ✅ There are many types of Regression methods, and which one you pick depends on the answer you're looking for. If you want to predict the probable height for a person of a given age, you'd use Linear Regression, as you're seeking a **numeric value**. If you're interested in discovering whether a type of recipe should be considered vegan or not, you're looking for a **category assignment** so you would use Logistic Regression. You'll learn more about Logistic Regression later. Think a bit about some questions you can ask of data, and which of these methods would be more appropriate.
> ✅ There are many types of Regression methods, and which one you pick depends on the answer you're looking for. If you want to predict the probable height for a person of a given age, you'd use Linear Regression, as you're seeking a **numeric value**. If you're interested in discovering whether a type of cuisine should be considered vegan or not, you're looking for a **category assignment** so you would use Logistic Regression. You'll learn more about Logistic Regression later. Think a bit about some questions you can ask of data, and which of these methods would be more appropriate.
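To make the distinction concrete, here is a minimal sketch using Scikit-Learn's `LinearRegression` and `LogisticRegression`. The data below is invented purely for illustration and is not part of the lesson's datasets; the "animal-derived ingredient count" feature is a hypothetical stand-in for whatever signal a real vegan classifier would use.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up toy data, for illustration only (not from the lesson's datasets).
ages = np.array([[2], [4], [6], [8], [10], [12]])    # years
heights = np.array([86, 102, 115, 128, 138, 149])    # cm, invented values

# Numeric target -> Linear Regression predicts a continuous value.
height_model = LinearRegression().fit(ages, heights)
predicted_height = height_model.predict([[7]])[0]

# Categorical target -> Logistic Regression predicts a class label.
# Hypothetical feature: number of animal-derived ingredients in a dish.
animal_ingredients = np.array([[0], [0], [0], [1], [2], [3], [4], [5]])
is_vegan = np.array([1, 1, 1, 0, 0, 0, 0, 0])  # 1 = vegan, 0 = not vegan
vegan_model = LogisticRegression().fit(animal_ingredients, is_vegan)
predicted_label = vegan_model.predict([[0]])[0]

print(f"Predicted height at age 7: {predicted_height:.1f} cm")
print(f"Predicted class for a dish with 0 animal ingredients: {predicted_label}")
```

The first model returns a number on a continuous scale, while the second returns one of the known categories, which is the practical difference between the two kinds of question.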
@@ -10,7 +10,7 @@ Classification is a form of [supervised learning](https://wikipedia.org/wiki/Sup
Remember, Linear Regression helped you predict relationships between variables and make accurate predictions on where a new datapoint would fall in relationship to that line. So, you could predict what price a pumpkin would be in September vs. December, for example. Logistic Regression helped you discover binary categories: at this price point, is this pumpkin orange or not-orange?
Classification uses various algorithms to determine other ways of determining a data point's label or class. Let's work with this recipe data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.
Classification uses various algorithms to determine a data point's label or class in other ways. Let's work with this cuisine data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.
@@ -21,11 +21,11 @@ Before starting the process of cleaning our data, visualizing it, and prepping i
Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification), classification using classic machine learning uses features, such as 'smoker', 'weight', and 'age', to determine 'likelihood of developing X disease'. As with the Regression exercises you performed earlier, classification is a supervised learning technique: your data is labeled, and the ML algorithms learn from those labels to predict the class of each data point and assign it to a group or outcome.
✅ Take a moment to imagine a dataset about recipes. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to contain Fenugreek? What if you wanted to see if, given a present of a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?
✅ Take a moment to imagine a dataset about cuisines. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to use fenugreek? What if you wanted to see if, given a present of a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?
## Hello 'classifier'
The question we want to ask of this recipe dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?
The question we want to ask of this cuisine dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?
Scikit-Learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.
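As a rough sketch of what a multiclass question looks like in code, the example below fits two Scikit-Learn classifiers on a tiny ingredient matrix. The ingredient columns, the rows, and the cuisine labels are all invented for illustration, and the two classifiers (`KNeighborsClassifier` and `DecisionTreeClassifier`) are chosen only to show that different algorithms share the same `fit`/`predict` interface; they are not necessarily the ones the upcoming lessons use.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical ingredient columns and rows, invented for illustration;
# a real dataset would have many more ingredients and samples.
ingredients = ["cumin", "soy_sauce", "fish_sauce", "turmeric"]
X = np.array([
    [1, 0, 0, 1],  # indian
    [1, 0, 0, 1],  # indian
    [0, 1, 0, 0],  # japanese
    [0, 1, 0, 0],  # japanese
    [0, 0, 1, 0],  # thai
    [0, 1, 1, 0],  # thai
])
y = np.array(["indian", "indian", "japanese", "japanese", "thai", "thai"])

# A new dish that contains cumin and turmeric.
new_dish = np.array([[1, 0, 0, 1]])

# Each classifier picks one class out of the three it has seen.
predictions = {}
for model in (KNeighborsClassifier(n_neighbors=1),
              DecisionTreeClassifier(random_state=0)):
    model.fit(X, y)
    predictions[type(model).__name__] = model.predict(new_dish)[0]

print(predictions)
```

Because there are more than two possible labels, each model has to choose among several classes rather than answer a yes/no question.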
@@ -51,7 +51,7 @@ from imblearn.over_sampling import SMOTE
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about recipes. You will use this dataset with a variety of classifiers to predict a given national cuisine based on a group of ingredients. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about cuisines. You will use this dataset with a variety of classifiers to predict a given national cuisine based on a group of ingredients. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
There are many ways to use the `LogisticRegression` class in Scikit-Learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
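For instance, a minimal sketch of passing a few of those parameters might look like the following. The synthetic data from `make_classification` is only a stand-in for the lesson's dataset, and the specific values chosen for `C`, `solver`, and `max_iter` are arbitrary assumptions rather than recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic three-class data standing in for the lesson's dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, random_state=0,
)

# A few commonly adjusted constructor parameters:
#   C        - inverse of regularization strength (smaller = stronger)
#   solver   - optimization algorithm used to fit the model
#   max_iter - upper limit on solver iterations
model = LogisticRegression(C=0.5, solver="lbfgs", max_iter=1000)
model.fit(X, y)

# predict_proba returns one probability per class for each input row.
probabilities = model.predict_proba(X[:1])
print(model.classes_)
print(probabilities.round(3))
```

The columns of `predict_proba` follow the order of `model.classes_`, so the largest value in a row corresponds to the model's best guess for that sample.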
@@ -181,7 +181,7 @@ The result is printed - Indian cuisine is its best guess, with good probability:
In this lesson, you will build a classification model using some of the techniques you have learned in previous lessons and with the delicious cuisine dataset used throughout this series. In addition, you will build a small web app to use a saved model, leveraging ONNX's web runtime.

One of the most useful practical uses of machine learning is building recommendation systems, and you can take the first step in that direction today!
[Recommendation Systems Introduction](https://youtu.be/giIXNoiqO_U "Recommendation Systems Introduction")
Describe what we will learn
> 🎥 Click the image above for a video: Andrew Ng introduces recommendation system design
## Regional topic: Delicious Asian and Indian Recipes 🍜
## Regional topic: Delicious Asian and Indian Cuisines 🍜
In Asia and India, food traditions are extremely diverse, and very delicious! Let's look at data about regional recipes to try to guess where they originated.
In Asia and India, food traditions are extremely diverse, and very delicious! Let's look at data about regional cuisines to try to guess where they originated.

> Photo by <a href="https://unsplash.com/@changlisheng?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Lisheng Chang</a> on <a href="https://unsplash.com/s/photos/asian-food?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
@@ -22,4 +22,4 @@ In this section, you will build on the skills you learned in Lesson 1 (Regressio
"Getting Started with Classification" was written with ♥️ by [Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper)
The delicious recipes dataset was sourced from [Kaggle](https://www.kaggle.com/hoandan/asian-and-indian-cuisines)
The delicious cuisines dataset was sourced from [Kaggle](https://www.kaggle.com/hoandan/asian-and-indian-cuisines)
@@ -80,9 +80,9 @@ By ensuring that the content aligns with projects, the process is made more enga
| 07 | North American Pumpkin Prices 🎃 | [Regression](2-Regression/README.md) | Build a Logistic Regression model | [lesson](2-Regression/4-Logistic/README.md) | Jen |
| 08 | A Web App 🔌 | [Web App](3-Web-App/README.md) | Build a Web app to use your trained model | [lesson](3-Web-App/README.md) | Jen |
| 09 | Introduction to Classification | [Classification](4-Classification/README.md) | Clean, Prep, and Visualize your Data; Introduction to Classification | [lesson](4-Classification/1-Introduction/README.md) | Cassie |
| 10 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Discriminative Model | [lesson](4-Classification/2-Descriminative/README.md) | Cassie |
| 11 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Generative Model | [lesson](4-Classification/3-Generative/README.md) | Cassie |
| 12 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Web App using your Model | [lesson](4-Classification/4-Applied/README.md) | Jen |
| 10 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Discriminative Model | [lesson](4-Classification/2-Descriminative/README.md) | Cassie |
| 11 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Generative Model | [lesson](4-Classification/3-Generative/README.md) | Cassie |
| 12 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Web App using your Model | [lesson](4-Classification/4-Applied/README.md) | Jen |
| 13 | Introduction to Clustering | [Clustering](5-Clustering/README.md) | Clean, Prep, and Visualize your Data; Introduction to Clustering | [lesson](5-Clustering/1-Visualize/README.md) | Jen |
| 15 | Introduction to Natural Language Processing ☕️ | [Natural Language Processing](6-NLP/README.md) | Learn the basics about NLP by building a simple bot | [lesson](6-NLP/1-Introduction-to-NLP/README.md) | Stephen |