renaming classification content as 'cuisines', not recipes

pull/36/head
Jen Looper 3 years ago
parent 788047ae1d
commit f979b19fc1

@ -89,7 +89,7 @@ In the `notebook.ipynb` file associated to this lesson, clear out all the cells
In this section, you will work with a small dataset about diabetes that is built into Scikit-Learn for learning purposes. Imagine that you wanted to test a treatment for diabetic patients. Machine Learning models might help you determine which patients would respond better to the treatment, based on combinations of variables. Even a very basic Regression model, when visualized, might show information about variables that would help you organize your theoretical clinical trials.
> ✅ There are many types of Regression methods, and which one you pick depends on the answer you're looking for. If you want to predict the probable height for a person of a given age, you'd use Linear Regression, as you're seeking a **numeric value**. If you're interested in discovering whether a type of recipe should be considered vegan or not, you're looking for a **category assignment** so you would use Logistic Regression. You'll learn more about Logistic Regression later. Think a bit about some questions you can ask of data, and which of these methods would be more appropriate.
> ✅ There are many types of Regression methods, and which one you pick depends on the answer you're looking for. If you want to predict the probable height for a person of a given age, you'd use Linear Regression, as you're seeking a **numeric value**. If you're interested in discovering whether a type of cuisine should be considered vegan or not, you're looking for a **category assignment** so you would use Logistic Regression. You'll learn more about Logistic Regression later. Think a bit about some questions you can ask of data, and which of these methods would be more appropriate.
Let's get started on this task.

@ -10,7 +10,7 @@ Classification is a form of [supervised learning](https://wikipedia.org/wiki/Sup
Remember, Linear Regression helped you predict relationships between variables and make accurate predictions on where a new datapoint would fall in relationship to that line. So, you could predict what price a pumpkin would be in September vs. December, for example. Logistic Regression helped you discover binary categories: at this price point, is this pumpkin orange or not-orange?
Classification uses various algorithms to determine other ways of determining a data point's label or class. Let's work with this recipe data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.
Classification uses various algorithms to determine other ways of determining a data point's label or class. Let's work with this cuisine data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/17/)
### Introduction
@ -21,11 +21,11 @@ Before starting the process of cleaning our data, visualizing it, and prepping i
Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification), classification using classic machine learning uses features, such as 'smoker','weight', and 'age' to determine 'likelihood of developing X disease'. As a supervised learning technique similar to the Regression exercises you performed earlier, your data is labeled and the ML algorithms use those labels to classify and predict classes (or 'features') of a dataset and assign them to a group or outcome.
✅ Take a moment to imagine a dataset about recipes. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to contain Fenugreek? What if you wanted to see if, given a present of a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?
✅ Take a moment to imagine a dataset about cuisines. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to use fenugreek? What if you wanted to see if, given a present of a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?
## Hello 'classifier'
The question we want to ask of this recipe dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?
The question we want to ask of this cuisine dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?
Scikit-Learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.
@ -51,7 +51,7 @@ from imblearn.over_sampling import SMOTE
The next task will be to import the data:
```python
df = pd.read_csv('../data/recipes.csv')
df = pd.read_csv('../data/cuisines.csv')
```
Check the data's shape:

@ -19,7 +19,7 @@
"cells": [
{
"source": [
"# Delicious Asian and Indian Recipes "
"# Delicious Asian and Indian Cuisines "
],
"cell_type": "markdown",
"metadata": {}

File diff suppressed because one or more lines are too long

@ -1,6 +1,6 @@
# Recipe Classifiers 1
# Cuisine Classifiers 1
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about recipes. You will use this dataset with a variety of classifiers to predict a given national cuisine based on a group of ingredients. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about cuisines. You will use this dataset with a variety of classifiers to predict a given national cuisine based on a group of ingredients. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/19/)
# Preparation
@ -11,8 +11,8 @@ Working in this lesson's `notebook.ipynb` folder, import that file along with th
```python
import pandas as pd
recipes_df = pd.read_csv("../../data/cleaned_cuisine.csv")
recipes_df.head()
cuisines_df = pd.read_csv("../../data/cleaned_cuisine.csv")
cuisines_df.head()
```
The data looks like this:
@ -37,8 +37,8 @@ import numpy as np
Divide the X and y coordinates into two dataframes for training. `cuisine` can be the labels dataframe:
```python
recipes_label_df = recipes_df['cuisine']
recipes_label_df.head()
cuisines_label_df = cuisines_df['cuisine']
cuisines_label_df.head()
```
It will look like this:
@ -55,8 +55,8 @@ Name: cuisine, dtype: object
Drop that `Unnamed: 0` column and the `cuisine` column and save the rest of the data as trainable features:
```python
recipes_feature_df = recipes_df.drop(['Unnamed: 0', 'cuisine'], axis=1)
recipes_feature_df.head()
cuisines_feature_df = cuisines_df.drop(['Unnamed: 0', 'cuisine'], axis=1)
cuisines_feature_df.head()
```
Your features look like this:
@ -113,7 +113,7 @@ Let's focus on Logistic Regression for our first training trial since you recent
Let's train that model. Split your data into training and testing groups:
```python
X_train, X_test, y_train, y_test = train_test_split(recipes_feature_df, recipes_label_df, test_size=0.3)
X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
```
There are many ways to use the LogisticRegression library in Scikit-Learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
@ -181,7 +181,7 @@ The result is printed - Indian cuisine is its best guess, with good probability:
| korean | 0.017277 | | | | | | | | | | | | | | | | | | | | |
| thai | 0.007634 | | | | | | | | | | | | | | | | | | | | |
✅ Can you explain why the model is pretty sure this is an Indian recipe?
✅ Can you explain why the model is pretty sure this is an Indian cuisine?
Get more detail by printing a classification report, as you did in the Regression lessons:

@ -9,7 +9,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@ -42,18 +42,18 @@
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>cuisine</th>\n <th>almond</th>\n <th>angelica</th>\n <th>anise</th>\n <th>anise_seed</th>\n <th>apple</th>\n <th>apple_brandy</th>\n <th>apricot</th>\n <th>armagnac</th>\n <th>...</th>\n <th>whiskey</th>\n <th>white_bread</th>\n <th>white_wine</th>\n <th>whole_grain_wheat_flour</th>\n <th>wine</th>\n <th>wood</th>\n <th>yam</th>\n <th>yeast</th>\n <th>yogurt</th>\n <th>zucchini</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1</td>\n <td>indian</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>2</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>3</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>4</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 382 columns</p>\n</div>"
},
"metadata": {},
"execution_count": 12
"execution_count": 1
}
],
"source": [
"import pandas as pd\n",
"recipes_df = pd.read_csv(\"../../data/cleaned_cuisine.csv\")\n",
"recipes_df.head()"
"cuisines_df = pd.read_csv(\"../../data/cleaned_cuisine.csv\")\n",
"cuisines_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@ -66,7 +66,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@ -82,17 +82,17 @@
]
},
"metadata": {},
"execution_count": 14
"execution_count": 3
}
],
"source": [
"recipes_label_df = recipes_df['cuisine']\n",
"recipes_label_df.head()"
"cuisines_label_df = cuisines_df['cuisine']\n",
"cuisines_label_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@ -125,33 +125,33 @@
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>almond</th>\n <th>angelica</th>\n <th>anise</th>\n <th>anise_seed</th>\n <th>apple</th>\n <th>apple_brandy</th>\n <th>apricot</th>\n <th>armagnac</th>\n <th>artemisia</th>\n <th>artichoke</th>\n <th>...</th>\n <th>whiskey</th>\n <th>white_bread</th>\n <th>white_wine</th>\n <th>whole_grain_wheat_flour</th>\n <th>wine</th>\n <th>wood</th>\n <th>yam</th>\n <th>yeast</th>\n <th>yogurt</th>\n <th>zucchini</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 380 columns</p>\n</div>"
},
"metadata": {},
"execution_count": 15
"execution_count": 4
}
],
"source": [
"recipes_feature_df = recipes_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"recipes_feature_df.head()"
"cuisines_feature_df = cuisines_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"cuisines_feature_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"X_train, X_test, y_train, y_test = train_test_split(recipes_feature_df, recipes_label_df, test_size=0.3)"
"X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Accuracy is 0.7906588824020017\n"
"Accuracy is 0.8181818181818182\n"
]
}
],
@ -165,14 +165,14 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 7,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"ingredients: Index(['basil', 'coconut', 'coriander', 'cumin', 'fenugreek', 'pepper',\n 'turmeric'],\n dtype='object')\ncuisine: thai\n"
"ingredients: Index(['artemisia', 'black_pepper', 'mushroom', 'shiitake', 'soy_sauce',\n 'vegetable_oil'],\n dtype='object')\ncuisine: korean\n"
]
}
],
@ -184,7 +184,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@ -192,16 +192,16 @@
"data": {
"text/plain": [
" 0\n",
"thai 0.857884\n",
"indian 0.105667\n",
"japanese 0.033860\n",
"chinese 0.002365\n",
"korean 0.000224"
"korean 0.392231\n",
"chinese 0.372872\n",
"japanese 0.218825\n",
"thai 0.013427\n",
"indian 0.002645"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>0</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>thai</th>\n <td>0.857884</td>\n </tr>\n <tr>\n <th>indian</th>\n <td>0.105667</td>\n </tr>\n <tr>\n <th>japanese</th>\n <td>0.033860</td>\n </tr>\n <tr>\n <th>chinese</th>\n <td>0.002365</td>\n </tr>\n <tr>\n <th>korean</th>\n <td>0.000224</td>\n </tr>\n </tbody>\n</table>\n</div>"
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>0</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>korean</th>\n <td>0.392231</td>\n </tr>\n <tr>\n <th>chinese</th>\n <td>0.372872</td>\n </tr>\n <tr>\n <th>japanese</th>\n <td>0.218825</td>\n </tr>\n <tr>\n <th>thai</th>\n <td>0.013427</td>\n </tr>\n <tr>\n <th>indian</th>\n <td>0.002645</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 19
"execution_count": 8
}
],
"source": [
@ -220,14 +220,14 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 9,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" precision recall f1-score support\n\n chinese 0.72 0.69 0.71 239\n indian 0.90 0.89 0.89 244\n japanese 0.74 0.75 0.75 232\n korean 0.82 0.77 0.79 229\n thai 0.78 0.85 0.81 255\n\n accuracy 0.79 1199\n macro avg 0.79 0.79 0.79 1199\nweighted avg 0.79 0.79 0.79 1199\n\n"
" precision recall f1-score support\n\n chinese 0.75 0.73 0.74 223\n indian 0.93 0.88 0.90 255\n japanese 0.78 0.78 0.78 253\n korean 0.87 0.86 0.86 236\n thai 0.76 0.84 0.80 232\n\n accuracy 0.82 1199\n macro avg 0.82 0.82 0.82 1199\nweighted avg 0.82 0.82 0.82 1199\n\n"
]
}
],
@ -239,12 +239,11 @@
],
"metadata": {
"interpreter": {
"hash": "dd61f40108e2a19f4ef0d3ebbc6b6eea57ab3c4bc13b15fe6f390d3d86442534"
"hash": "70b38d7a306a849643e446cd70466270a13445e5987dfa1344ef2b127438fa4d"
},
"kernelspec": {
"name": "python37364bit8d3b438fb5fc4430a93ac2cb74d693a7",
"display_name": "Python 3.7.3 64-bit",
"language": "python"
"name": "python3",
"display_name": "Python 3.7.0 64-bit ('3.7')"
},
"language_info": {
"codemirror_mode": {

@ -1,4 +1,4 @@
# Recipe Classifiers 2
# Cuisine Classifiers 2
In this second Classification lesson, you will explore more ways to classify numeric data, and the ramifications for choosing one over the other.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/21/)
@ -42,7 +42,7 @@ import numpy as np
Split your training and test data:
```python
X_train, X_test, y_train, y_test = train_test_split(recipes_feature_df, recipes_label_df, test_size=0.3)
X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
```
## Linear SVC Classifier

@ -47,8 +47,8 @@
],
"source": [
"import pandas as pd\n",
"recipes_df = pd.read_csv(\"../data/cleaned_cuisine.csv\")\n",
"recipes_df.head()"
"cuisines_df = pd.read_csv(\"../data/cleaned_cuisine.csv\")\n",
"cuisines_df.head()"
]
},
{
@ -73,8 +73,8 @@
}
],
"source": [
"recipes_label_df = recipes_df['cuisine']\n",
"recipes_label_df.head()"
"cuisines_label_df = cuisines_df['cuisine']\n",
"cuisines_label_df.head()"
]
},
{
@ -116,8 +116,8 @@
}
],
"source": [
"recipes_feature_df = recipes_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"recipes_feature_df.head()"
"cuisines_feature_df = cuisines_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"cuisines_feature_df.head()"
]
}
],

@ -9,7 +9,7 @@
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@ -42,18 +42,18 @@
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Unnamed: 0</th>\n <th>cuisine</th>\n <th>almond</th>\n <th>angelica</th>\n <th>anise</th>\n <th>anise_seed</th>\n <th>apple</th>\n <th>apple_brandy</th>\n <th>apricot</th>\n <th>armagnac</th>\n <th>...</th>\n <th>whiskey</th>\n <th>white_bread</th>\n <th>white_wine</th>\n <th>whole_grain_wheat_flour</th>\n <th>wine</th>\n <th>wood</th>\n <th>yam</th>\n <th>yeast</th>\n <th>yogurt</th>\n <th>zucchini</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1</td>\n <td>indian</td>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>2</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>3</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>4</td>\n <td>indian</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 382 columns</p>\n</div>"
},
"metadata": {},
"execution_count": 57
"execution_count": 1
}
],
"source": [
"import pandas as pd\n",
"recipes_df = pd.read_csv(\"../../data/cleaned_cuisine.csv\")\n",
"recipes_df.head()"
"cuisines_df = pd.read_csv(\"../../data/cleaned_cuisine.csv\")\n",
"cuisines_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 58,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@ -69,17 +69,17 @@
]
},
"metadata": {},
"execution_count": 58
"execution_count": 2
}
],
"source": [
"recipes_label_df = recipes_df['cuisine']\n",
"recipes_label_df.head()"
"cuisines_label_df = cuisines_df['cuisine']\n",
"cuisines_label_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 59,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@ -112,12 +112,12 @@
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>almond</th>\n <th>angelica</th>\n <th>anise</th>\n <th>anise_seed</th>\n <th>apple</th>\n <th>apple_brandy</th>\n <th>apricot</th>\n <th>armagnac</th>\n <th>artemisia</th>\n <th>artichoke</th>\n <th>...</th>\n <th>whiskey</th>\n <th>white_bread</th>\n <th>white_wine</th>\n <th>whole_grain_wheat_flour</th>\n <th>wine</th>\n <th>wood</th>\n <th>yam</th>\n <th>yeast</th>\n <th>yogurt</th>\n <th>zucchini</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>...</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n <td>1</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 380 columns</p>\n</div>"
},
"metadata": {},
"execution_count": 59
"execution_count": 3
}
],
"source": [
"recipes_feature_df = recipes_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"recipes_feature_df.head()"
"cuisines_feature_df = cuisines_df.drop(['Unnamed: 0', 'cuisine'], axis=1)\n",
"cuisines_feature_df.head()"
]
},
{
@ -129,7 +129,7 @@
},
{
"cell_type": "code",
"execution_count": 60,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@ -144,16 +144,16 @@
},
{
"cell_type": "code",
"execution_count": 61,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"X_train, X_test, y_train, y_test = train_test_split(recipes_feature_df, recipes_label_df, test_size=0.3)"
"X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
@ -172,77 +172,77 @@
},
{
"cell_type": "code",
"execution_count": 63,
"execution_count": 7,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Accuracy (train) for Linear SVC: 79.7% \n",
"Accuracy (train) for Linear SVC: 76.4% \n",
" precision recall f1-score support\n",
"\n",
" chinese 0.71 0.73 0.72 232\n",
" indian 0.91 0.88 0.89 251\n",
" japanese 0.76 0.78 0.77 239\n",
" korean 0.86 0.75 0.80 244\n",
" thai 0.76 0.84 0.80 233\n",
" chinese 0.64 0.66 0.65 242\n",
" indian 0.91 0.86 0.89 236\n",
" japanese 0.72 0.73 0.73 245\n",
" korean 0.83 0.75 0.79 234\n",
" thai 0.75 0.82 0.78 242\n",
"\n",
" accuracy 0.80 1199\n",
" macro avg 0.80 0.80 0.80 1199\n",
"weighted avg 0.80 0.80 0.80 1199\n",
" accuracy 0.76 1199\n",
" macro avg 0.77 0.76 0.77 1199\n",
"weighted avg 0.77 0.76 0.77 1199\n",
"\n",
"Accuracy (train) for KNN classifier: 72.9% \n",
"Accuracy (train) for KNN classifier: 70.7% \n",
" precision recall f1-score support\n",
"\n",
" chinese 0.62 0.68 0.65 232\n",
" indian 0.87 0.82 0.85 251\n",
" japanese 0.62 0.83 0.71 239\n",
" korean 0.92 0.55 0.68 244\n",
" thai 0.73 0.76 0.75 233\n",
" chinese 0.65 0.63 0.64 242\n",
" indian 0.84 0.81 0.82 236\n",
" japanese 0.60 0.81 0.69 245\n",
" korean 0.89 0.53 0.67 234\n",
" thai 0.69 0.75 0.72 242\n",
"\n",
" accuracy 0.73 1199\n",
" macro avg 0.75 0.73 0.73 1199\n",
"weighted avg 0.76 0.73 0.73 1199\n",
" accuracy 0.71 1199\n",
" macro avg 0.73 0.71 0.71 1199\n",
"weighted avg 0.73 0.71 0.71 1199\n",
"\n",
"Accuracy (train) for SVC: 81.8% \n",
"Accuracy (train) for SVC: 80.1% \n",
" precision recall f1-score support\n",
"\n",
" chinese 0.78 0.71 0.74 232\n",
" indian 0.92 0.90 0.91 251\n",
" japanese 0.79 0.80 0.80 239\n",
" korean 0.85 0.78 0.82 244\n",
" thai 0.75 0.89 0.81 233\n",
" chinese 0.71 0.69 0.70 242\n",
" indian 0.92 0.92 0.92 236\n",
" japanese 0.77 0.78 0.77 245\n",
" korean 0.87 0.77 0.82 234\n",
" thai 0.75 0.86 0.80 242\n",
"\n",
" accuracy 0.82 1199\n",
" macro avg 0.82 0.82 0.82 1199\n",
"weighted avg 0.82 0.82 0.82 1199\n",
" accuracy 0.80 1199\n",
" macro avg 0.80 0.80 0.80 1199\n",
"weighted avg 0.80 0.80 0.80 1199\n",
"\n",
"Accuracy (train) for RFST: 83.3% \n",
"Accuracy (train) for RFST: 82.8% \n",
" precision recall f1-score support\n",
"\n",
" chinese 0.80 0.75 0.77 232\n",
" indian 0.91 0.92 0.91 251\n",
" japanese 0.81 0.82 0.82 239\n",
" korean 0.85 0.81 0.83 244\n",
" thai 0.79 0.86 0.82 233\n",
" chinese 0.80 0.75 0.77 242\n",
" indian 0.90 0.91 0.90 236\n",
" japanese 0.82 0.78 0.80 245\n",
" korean 0.85 0.82 0.83 234\n",
" thai 0.78 0.89 0.83 242\n",
"\n",
" accuracy 0.83 1199\n",
" macro avg 0.83 0.83 0.83 1199\n",
"weighted avg 0.83 0.83 0.83 1199\n",
"\n",
"Accuracy (train) for ADA: 70.8% \n",
"Accuracy (train) for ADA: 71.1% \n",
" precision recall f1-score support\n",
"\n",
" chinese 0.60 0.45 0.51 232\n",
" indian 0.90 0.81 0.85 251\n",
" japanese 0.65 0.72 0.68 239\n",
" korean 0.72 0.76 0.74 244\n",
" thai 0.67 0.79 0.72 233\n",
" chinese 0.60 0.57 0.58 242\n",
" indian 0.87 0.84 0.86 236\n",
" japanese 0.71 0.60 0.65 245\n",
" korean 0.68 0.78 0.72 234\n",
" thai 0.70 0.78 0.74 242\n",
"\n",
" accuracy 0.71 1199\n",
" macro avg 0.71 0.71 0.70 1199\n",
"weighted avg 0.71 0.71 0.70 1199\n",
" macro avg 0.71 0.71 0.71 1199\n",
"weighted avg 0.71 0.71 0.71 1199\n",
"\n"
]
}
@ -272,9 +272,8 @@
"hash": "70b38d7a306a849643e446cd70466270a13445e5987dfa1344ef2b127438fa4d"
},
"kernelspec": {
"name": "python37364bit8d3b438fb5fc4430a93ac2cb74d693a7",
"display_name": "Python 3.7.3 64-bit",
"language": "python"
"name": "python3",
"display_name": "Python 3.7.0 64-bit ('3.7')"
},
"language_info": {
"codemirror_mode": {

@ -1,13 +1,16 @@
# Build a Classifying Web App
# Build a Cuisine Recommender Web App
Add a sketchnote if possible/appropriate
In this lesson, you will build a classification model using some of the techniques you have learned in previous lessons and with the delicious cuisine dataset used throughout this series. In addition, you will build a small web app to use a saved model, leveraging Onnx's web runtime.
![Embed a video here if available](video-url)
One of the most useful practical uses of machine learning is building recommendation systems, and you can take the first step in that direction today!
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/23/)
[![Recommendation Systems Introduction](https://img.youtube.com/vi/giIXNoiqO_U/0.jpg)](https://youtu.be/giIXNoiqO_U "Recommendation Systems Introduction")
Describe what we will learn
> 🎥 Click the image above for a video: Andrew Ng introduces recommendation system design
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/23/)
In this lesson you will learn:
-
### Introduction
Describe what will be covered

@ -0,0 +1,28 @@
{
"metadata": {
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": 3
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2,
"cells": [
{
"source": [
"# Build a cuisine recommender"
],
"cell_type": "markdown",
"metadata": {}
}
]
}

@ -1,7 +1,7 @@
<!DOCTYPE html>
<html>
<header>
<title>Recipe Matcher</title>
<title>Cuisine Matcher</title>
</header>
<body>
<h1>Check your refrigerator. What can you create?</h1>
@ -41,9 +41,9 @@
<label>cumin</label>
</div>
</div>
<div style="padding-top:10px">
<button onClick="startInference()">What kind of cuisine can you make?</button>
</div>
<!-- import ONNXRuntime Web from CDN -->
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.8.0-dev.20210608.0/dist/ort.min.js"></script>
<script>
@ -76,13 +76,25 @@
})
}
function testCheckboxes() {
for (var i = 0; i < checks.length; i++)
if (checks[i].type == "checkbox")
if (checks[i].checked)
return true;
return false;
}
async function startInference() {
let checked = testCheckboxes()
if (checked) {
try {
// create a new session and load the specific model.
//
const session = await ort.InferenceSession.create('./model2.onnx');
const session = await ort.InferenceSession.create('./model.onnx');
const input = new ort.Tensor(new Float32Array(ingredients), [1, 380]);
const feeds = { float_input: input };
@ -96,6 +108,8 @@
} catch (e) {
console.log(`failed to inference ONNX model: ${e}.`);
}
}
else alert("Please check an ingredient")
}
init();

@ -29,6 +29,13 @@
"nbformat": 4,
"nbformat_minor": 2,
"cells": [
{
"source": [
"# Build a cuisine recommender"
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 1,

@ -1,7 +1,7 @@
# Getting Started with Classification
## Regional topic: Delicious Asian and Indian Recipes 🍜
## Regional topic: Delicious Asian and Indian Cuisines 🍜
In Asia and India, food traditions are extremely diverse, and very delicious! Let's look at data about regional recipes to try to guess where they originated.
In Asia and India, food traditions are extremely diverse, and very delicious! Let's look at data about regional cuisines to try to guess where they originated.
![Thai food seller](./images/thai-food.jpg)
> Photo by <a href="https://unsplash.com/@changlisheng?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Lisheng Chang</a> on <a href="https://unsplash.com/s/photos/asian-food?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
@ -22,4 +22,4 @@ In this section, you will build on the skills you learned in Lesson 1 (Regressio
"Getting Started with Classification" was written with ♥️ by [Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper)
The delicious recipes dataset was sourced from [Kaggle](https://www.kaggle.com/hoandan/asian-and-indian-cuisines)
The delicious cuisines dataset was sourced from [Kaggle](https://www.kaggle.com/hoandan/asian-and-indian-cuisines)

File diff suppressed because it is too large Load Diff

@ -80,9 +80,9 @@ By ensuring that the content aligns with projects, the process is made more enga
| 07 | North American Pumpkin Prices 🎃 | [Regression](2-Regression/README.md) | Build a Logistic Regression model | [lesson](2-Regression/4-Logistic/README.md) | Jen |
| 08 | A Web App 🔌 | [Web App](3-Web-App/README.md) | Build a Web app to use your trained model | [lesson](3-Web-App/README.md) | Jen |
| 09 | Introduction to Classification | [Classification](4-Classification/README.md) | Clean, Prep, and Visualize your Data; Introduction to Classification | [lesson](4-Classification/1-Introduction/README.md) | Cassie |
| 10 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Discriminative Model | [lesson](4-Classification/2-Descriminative/README.md) | Cassie |
| 11 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Generative Model | [lesson](4-Classification/3-Generative/README.md) | Cassie |
| 12 | Delicious Asian and Indian Recipes 🍜 | [Classification](4-Classification/README.md) | Build a Web App using your Model | [lesson](4-Classification/4-Applied/README.md) | Jen |
| 10 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Discriminative Model | [lesson](4-Classification/2-Descriminative/README.md) | Cassie |
| 11 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Generative Model | [lesson](4-Classification/3-Generative/README.md) | Cassie |
| 12 | Delicious Asian and Indian Cuisines 🍜 | [Classification](4-Classification/README.md) | Build a Web App using your Model | [lesson](4-Classification/4-Applied/README.md) | Jen |
| 13 | Introduction to Clustering | [Clustering](5-Clustering/README.md) | Clean, Prep, and Visualize your Data; Introduction to Clustering | [lesson](5-Clustering/1-Visualize/README.md) | Jen |
| 14 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](5-Clustering/README.md) | Explore the K-Means Clustering Method | [lesson](5-Clustering/2-K-Means/README.md) | Jen |
| 15 | Introduction to Natural Language Processing ☕️ | [Natural Language Processing](6-NLP/README.md) | Learn the basics about NLP by building a simple bot | [lesson](6-NLP/1-Introduction-to-NLP/README.md) | Stephen |

Loading…
Cancel
Save