Ornella Altunyan 4 years ago
commit 80e0986409

@ -1,21 +1,27 @@
# Cuisine classifiers 1
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about cuisines.
You will use this dataset with a variety of classifiers to _predict a given national cuisine based on a group of ingredients_. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/21/)
# Preparation
Assuming you completed [Lesson 1](../1-Introduction/README.md), make sure that a _cleaned_cuisines.csv_ file exists in the root `/data` folder for these four lessons.
## Exercise - predict a national cuisine
1. Working in this lesson's _notebook.ipynb_ folder, import that file along with the Pandas library:
```python
import pandas as pd
cuisines_df = pd.read_csv("../../data/cleaned_cuisine.csv")
cuisines_df.head()
```
The data looks like this:
```output
| | Unnamed: 0 | cuisine | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini |
| --- | ---------- | ------- | ------ | -------- | ----- | ---------- | ----- | ------------ | ------- | -------- | --- | ------- | ----------- | ---------- | ----------------------- | ---- | ---- | --- | ----- | ------ | -------- |
| 0 | 0 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
@ -23,8 +29,9 @@ The data looks like this:
| 2 | 2 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 3 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 4 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
```
1. Now, import several more libraries:
```python
from sklearn.linear_model import LogisticRegression
@ -34,7 +41,7 @@ from sklearn.svm import SVC
import numpy as np
```
1. Divide the X and y coordinates into two dataframes for training. `cuisine` can be the labels dataframe:
```python
cuisines_label_df = cuisines_df['cuisine']
@ -43,7 +50,7 @@ cuisines_label_df.head()
It will look like this:
```output
0 indian
1 indian
2 indian
@ -52,7 +59,7 @@ It will look like this:
Name: cuisine, dtype: object
```
1. Drop that `Unnamed: 0` column and the `cuisine` column, calling `drop()`. Save the rest of the data as trainable features:
```python
cuisines_feature_df = cuisines_df.drop(['Unnamed: 0', 'cuisine'], axis=1)
@ -70,6 +77,7 @@ Your features look like this:
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Now you are ready to train your model!
## Choosing your classifier
Now that your data is clean and ready for training, you have to decide which algorithm to use for the job.
@ -87,6 +95,8 @@ Scikit-learn groups classification under Supervised Learning, and in that catego
> You can also use [neural networks to classify data](https://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification), but that is outside the scope of this lesson.
### What classifier to go with?
So, which classifier should you choose? Often, running through several and looking for a good result is a way to test. Scikit-learn offers a [side-by-side comparison](https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html) on a created dataset, comparing KNeighbors, SVC two ways, GaussianProcessClassifier, DecisionTreeClassifier, RandomForestClassifier, MLPClassifier, AdaBoostClassifier, GaussianNB and QuadraticDiscriminantAnalysis, showing the results visualized:
![comparison of classifiers](images/comparison.png)
@ -94,6 +104,8 @@ So, which classifier should you choose? Often, running through several and looki
> AutoML solves this problem neatly by running these comparisons in the cloud, allowing you to choose the best algorithm for your data. Try it [here](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-15963-cxa)
### A better approach
A better way than wildly guessing, however, is to follow the ideas on this downloadable [ML Cheat sheet](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-15963-cxa). Here, we discover that, for our multiclass problem, we have some choices:
![cheatsheet for multiclass problems](images/cheatsheet.png)
@ -101,24 +113,25 @@ A better way than wildly guessing, however, is to follow the ideas on this downl
✅ Download this cheat sheet, print it out, and hang it on your wall!
### Reasoning
Let's see if we can reason our way through different approaches given the constraints we have:
- **Neural networks are too heavy**. Given our clean, but minimal dataset, and the fact that we are running training locally via notebooks, neural networks are too heavyweight for this task.
- **No two-class classifier**. We do not use a two-class classifier, so that rules out one-vs-all.
- **Decision tree or logistic regression could work**. A decision tree might work, or logistic regression for multiclass data.
- **Multiclass Boosted Decision Trees solve a different problem**. The multiclass boosted decision tree is most suitable for nonparametric tasks, e.g. tasks designed to build rankings, so it is not useful for us.
### Using Scikit-learn
We will be using Scikit-learn to analyze our data. However, there are many ways to use logistic regression in Scikit-learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
According to the docs, "In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the multi_class option is set to ovr, and uses the cross-entropy loss if the multi_class option is set to multinomial. (Currently the multinomial option is supported only by the lbfgs, sag, saga and newton-cg solvers.)"
Essentially, there are two important parameters, `multi_class` and `solver`, that we need to specify when we ask Scikit-learn to perform a logistic regression. The `multi_class` value applies a certain behavior. The value of the solver is what algorithm to use. Not all solvers can be paired with all `multi_class` values.
According to the docs, in the multiclass case, the training algorithm:
- **Uses the one-vs-rest (OvR) scheme**, if the `multi_class` option is set to `ovr`
- **Uses the cross-entropy loss**, if the `multi_class` option is set to `multinomial`. (Currently the `multinomial` option is supported only by the lbfgs, sag, saga and newton-cg solvers.)
> 🎓 The 'scheme' here can either be 'ovr' (one-vs-rest) or 'multinomial'. Since logistic regression is really designed to support binary classification, these schemes allow it to better handle multiclass classification tasks. [source](https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/)
@ -128,6 +141,21 @@ Scikit-learn offers this table to explain how solvers handle different challenge
![solvers](images/solvers.png)
## Exercise - split the data
We can focus on logistic regression for our first training trial since you recently learned about the latter in a previous lesson.
Split your data into training and testing groups by calling `train_test_split()`:
```python
X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
```
## Exercise - apply logistic regression
Since you are using the multiclass case, you need to choose what _scheme_ to use and what _solver_ to set. Use LogisticRegression with a multiclass setting and the **liblinear** solver to train.
1. Create a logistic regression with multi_class set to `ovr` and the solver set to `liblinear`:
```python
lr = LogisticRegression(multi_class='ovr',solver='liblinear')
model = lr.fit(X_train, np.ravel(y_train))
@ -140,23 +168,25 @@ print ("Accuracy is {}".format(accuracy))
> Note, use Pandas [`ravel`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.ravel.html) function to flatten your data when needed.
The accuracy is good at over **80%**!
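If you're curious, you could also try the `multinomial` scheme described above. The following is only an illustrative sketch, not part of the exercise; it assumes the same `X_train`/`y_train`/`X_test`/`y_test` split and pairs `multinomial` with a compatible solver such as `lbfgs`:

```python
from sklearn.linear_model import LogisticRegression
import numpy as np

# Illustrative alternative: the multinomial scheme needs a solver that supports it, such as lbfgs
lr_multi = LogisticRegression(multi_class='multinomial', solver='lbfgs', max_iter=1000)
model_multi = lr_multi.fit(X_train, np.ravel(y_train))
print("Multinomial accuracy is {}".format(model_multi.score(X_test, y_test)))
```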
1. You can see this model in action by testing one row of data (#50):
```python
print(f'ingredients: {X_test.iloc[50][X_test.iloc[50]!=0].keys()}')
print(f'cuisine: {y_test.iloc[50]}')
```
The result is printed:
```output
ingredients: Index(['cilantro', 'onion', 'pea', 'potato', 'tomato', 'vegetable_oil'], dtype='object')
cuisine: indian
```
✅ Try a different row number and check the results
1. Digging deeper, you can check for the accuracy of this prediction:
```python
test= X_test.iloc[50].values.reshape(-1, 1).T
@ -167,6 +197,7 @@ resultdf = pd.DataFrame(data=proba, columns=classes)
topPrediction = resultdf.T.sort_values(by=[0], ascending = [False])
topPrediction.head()
```
The result is printed - Indian cuisine is its best guess, with good probability:
| | 0 | | | | | | | | | | | | | | | | | | | | |
@ -179,7 +210,7 @@ The result is printed - Indian cuisine is its best guess, with good probability:
✅ Can you explain why the model is pretty sure this is an Indian cuisine?
1. Get more detail by printing a classification report, as you did in the regression lessons:
```python
y_pred = model.predict(X_test)

@ -1,6 +1,6 @@
# Cuisine classifiers 2
In this second classification lesson, you will explore more ways to classify numeric data. You will also learn about the ramifications for choosing one classifier over the other.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/23/)

@ -7,40 +7,51 @@ One of the most useful practical uses of machine learning is building recommenda
[![Recommendation Systems Introduction](https://img.youtube.com/vi/giIXNoiqO_U/0.jpg)](https://youtu.be/giIXNoiqO_U "Recommendation Systems Introduction")
> 🎥 Click the image above for a video: Andrew Ng introduces recommendation system design
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/25/)
In this lesson you will learn:
- How to build a model and save it as an Onnx model
- How to use Netron to inspect the model
- How to use your model in a web app for inference
## Build your model
Building applied ML systems is an important part of leveraging these technologies for your business systems. You can use models within your web applications (and thus use them in an offline context if needed) by using Onnx.
In a [previous lesson](../../3-Web-App/1-Web-App/README.md), you built a Regression model about UFO sightings, "pickled" it, and used it in a Flask app. While this architecture is very useful to know, it is a full-stack Python app, and your requirements may include the use of a JavaScript application.
In this lesson, you can build a basic JavaScript-based system for inference. First, however, you need to train a model and convert it for use with Onnx.
## Exercise - train classification model
First, train a classification model using the cleaned cuisines dataset from the previous lessons.
1. Start by importing useful libraries:
```python
!pip install skl2onnx
import pandas as pd
```
You need '[skl2onnx](https://onnx.ai/sklearn-onnx/)' to help convert your Scikit-learn model to Onnx format.
1. Then, work with your data in the same way you did in previous lessons, by reading a CSV file using `read_csv()`:
```python
data = pd.read_csv('../data/cleaned_cuisine.csv')
data.head()
```
1. Remove the first two unnecessary columns and save the remaining data as 'X':
```python
X = data.iloc[:,2:]
X.head()
```
1. Save the labels as 'y':
```python
y = data[['cuisine']]
@ -48,7 +59,11 @@ y.head()
```
### Commence the training routine
We will use the SVC (Support Vector Classifier) estimator, which gives good accuracy.
1. Import the appropriate libraries from Scikit-learn:
```python
from sklearn.model_selection import train_test_split
@ -56,30 +71,35 @@ from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score,precision_score,confusion_matrix,classification_report
```
1. Separate training and test sets:
```python
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3)
```
1. Build an SVC Classification model as you did in the previous lesson:
```python
model = SVC(kernel='linear', C=10, probability=True,random_state=0)
model.fit(X_train,y_train.values.ravel())
```
1. Now, test your model, calling `predict()`:
```python
y_pred = model.predict(X_test)
```
1. Print out a classification report to check the model's quality:
```python
print(classification_report(y_test,y_pred))
```
As we saw before, the accuracy is good:
```output
precision recall f1-score support
chinese 0.72 0.69 0.70 257
@ -92,7 +112,12 @@ As we saw before, the accuracy is good:
macro avg 0.79 0.79 0.79 1199
weighted avg 0.79 0.79 0.79 1199
```
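The imports above also include `cross_val_score`, which the code shown here doesn't use; as an optional, hedged sanity check you could cross-validate the same kind of model (this refits it several times, so it takes a moment):

```python
from sklearn.model_selection import cross_val_score

# Optional check: 3-fold cross-validation accuracy on the full feature set
scores = cross_val_score(model, X, y.values.ravel(), cv=3)
print("Cross-validation accuracy: {:.2f}".format(scores.mean()))
```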
### Convert your model to Onnx
Make sure to do the conversion with the proper tensor number. This dataset has 380 ingredients listed, so you need to specify that number in `FloatTensorType`:
1. Convert using a tensor number of 380.
```python
from skl2onnx import convert_sklearn
@ -100,6 +125,11 @@ from skl2onnx.common.data_types import FloatTensorType
initial_type = [('float_input', FloatTensorType([None, 380]))]
options = {id(model): {'nocl': True, 'zipmap': False}}
```
1. Create the onx and store as a file **model.onnx**:
```python
onx = convert_sklearn(model, initial_types=initial_type, options=options)
with open("./model.onnx", "wb") as f:
f.write(onx.SerializeToString())
@ -108,6 +138,7 @@ with open("./model.onnx", "wb") as f:
> Note, you can pass in [options](https://onnx.ai/sklearn-onnx/parameterized.html) in your conversion script. In this case, we passed in 'nocl' to be True and 'zipmap' to be False. Since this is a classification model, you have the option to remove ZipMap which produces a list of dictionaries (not necessary). `nocl` refers to class information being included in the model. Reduce your model's size by setting `nocl` to 'True'.
Running the entire notebook will now build an Onnx model and save it to this folder.
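Before moving on, you can optionally sanity-check the saved file from Python. This is a minimal sketch and assumes the `onnxruntime` Python package is installed (the web app below uses the JavaScript runtime instead):

```python
import numpy as np
import onnxruntime as rt

# Load the converted model and run one all-zeros row of 380 ingredient values
sess = rt.InferenceSession("./model.onnx")
input_name = sess.get_inputs()[0].name  # should be 'float_input', as declared above
sample = np.zeros((1, 380), dtype=np.float32)
prediction = sess.run(None, {input_name: sample})
print(prediction[0])
```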
## View your model
Onnx models are not very visible in Visual Studio Code, but there's very good free software that many researchers use to visualize the model to ensure that it is properly built. Download [Netron](https://github.com/lutzroeder/Netron) and open your model.onnx file. You can see your simple model visualized, with its 380 inputs and classifier listed:
@ -117,11 +148,12 @@ Onnx models are not very visible in Visual Studio code, but there's a very good
Netron is a helpful tool to view your models.
Now you are ready to use this neat model in a web app. Let's build an app that will come in handy when you look in your refrigerator and try to figure out which combination of your leftover ingredients you can use to cook a given cuisine, as determined by your model.
## Build a recommender web application
You can use your model directly in a web app. This architecture also allows you to run it locally and even offline if needed. Start by creating an `index.html` file in the same folder where you stored your `model.onnx` file.
1. In this file _index.html_, add the following markup:
```html
<!DOCTYPE html>
@ -135,7 +167,7 @@ In this file, add the following markup:
</html>
```
1. Now, working within the `body` tags, add a little markup to show a list of checkboxes reflecting some ingredients:
```html
<h1>Check your refrigerator. What can you create?</h1>
@ -179,16 +211,20 @@ Now, working within the `body` tags, add a little markup to show a list of check
<button onClick="startInference()">What kind of cuisine can you make?</button>
</div>
```
Notice that each checkbox is given a value. This reflects the index where the ingredient is found according to the dataset. Apple, for example, in this alphabetic list, occupies the fifth column, so its value is '4' since we start counting at 0. You can consult the [ingredients spreadsheet](../data/ingredient_indexes.csv) to discover a given ingredient's index.
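If you prefer to look an index up with code rather than scanning the spreadsheet, a small hedged sketch re-using the cleaned CSV from the training steps could look like this:

```python
import pandas as pd

# The feature columns start after 'Unnamed: 0' and 'cuisine', so their positions
# line up with the checkbox values used in the markup above
data = pd.read_csv('../data/cleaned_cuisine.csv')
ingredients = list(data.columns[2:])
print(ingredients.index('apple'))  # prints 4, matching the 'apple' checkbox value
```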
Continuing your work in the index.html file, add a script block where the model is called after the final closing `</div>`.
1. First, import the [Onnx Runtime](https://www.onnxruntime.ai/):
```html
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.8.0-dev.20210608.0/dist/ort.min.js"></script>
```
> Onnx Runtime is used to enable running your Onnx models across a wide range of hardware platforms, including optimizations and an API to use.
1. Once the Runtime is in place, you can call it:
```javascript
<script>
@ -269,7 +305,7 @@ In this code, there are several things happening:
3. You created a `testCheckboxes` function that checks whether any checkbox was checked.
4. You use that function when the button is pressed and, if any checkbox is checked, you start inference.
5. The inference routine includes:
1. Setting up an asynchronous load of the model
2. Creating a Tensor structure to send to the model
3. Creating 'feeds' that reflects the `float_input` input that you created when training your model (you can use Netron to verify that name)
4. Sending these 'feeds' to the model and waiting for a response
@ -280,12 +316,13 @@ Open a terminal session in Visual Studio Code in the folder where your index.htm
![ingredient web app](images/web-app.png)
Congratulations, you have created a 'recommendation' web app with a few fields. Take some time to build out this system!
## 🚀Challenge
Your web app is very minimal, so continue to build it out using ingredients and their indexes from the [ingredient_indexes](../data/ingredient_indexes.csv) data. What flavor combinations work to create a given national dish?
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/26/)
## Review & Self Study
While this lesson just touched on the utility of creating a recommendation system for food ingredients, this area of ML applications is very rich in examples. Read some more about how these systems are built:
@ -293,6 +330,7 @@ While this lesson just touched on the utility of creating a recommendation syste
- https://www.sciencedirect.com/topics/computer-science/recommendation-engine
- https://www.technologyreview.com/2014/08/25/171547/the-ultimate-challenge-for-recommendation-engines/
- https://www.technologyreview.com/2015/03/23/168831/everything-is-a-recommendation/
## Assignment
[Build a new recommender](assignment.md)

@ -1,4 +1,4 @@
# Sentiment analysis with hotel reviews - processing the data
In this section you will use the techniques in the previous lessons to do some exploratory data analysis of a large dataset. Once you have a good understanding of the usefulness of the various columns, you will learn how to remove the unneeded columns, calculate some new data based on the existing columns, and save the resulting dataset for use in the final challenge.
@ -6,24 +6,20 @@ In this section you will use the techniques in the previous lessons to do some e
### Introduction
So far you've learned how text data is quite unlike numerical types of data. If it's text that was written or spoken by a human, it can be analysed to find patterns and frequencies, sentiment and meaning. This lesson takes you into a real data set with a real challenge: **[515K Hotel Reviews Data in Europe](https://www.kaggle.com/jiashenliu/515k-hotel-reviews-data-in-europe)**, which includes a [CC0: Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/). It was scraped from Booking.com from public sources. The creator of the dataset was Jiashen Liu.
### Preparation
You will need:
* Python 3
* pandas
* NLTK, [which you should install locally](https://www.nltk.org/install.html)
* The data set [515K Hotel Reviews Data in Europe](https://www.kaggle.com/jiashenliu/515k-hotel-reviews-data-in-europe), which is available on Kaggle. It is around 230 MB unzipped. Download it to the `/data` folder associated with these NLP lessons.
## Exploratory data analysis
This challenge assumes you are building a hotel recommendation bot using sentiment analysis and guest reviews scores. The dataset you will be starting from has over 515,000 rows reviewing 1493 different hotels in 6 cities.
Using Python, a dataset of hotel reviews, and NLTK's sentiment analysis you could find out:
@ -33,34 +29,26 @@ Using Python, a dataset of hotel reviews, and NLTK's sentiment analysis you coul
#### Dataset
> "This dataset contains 515,000 customer reviews and scoring of 1493 luxury hotels across Europe. Meanwhile, the geographical location of hotels are also provided for further analysis."
Let's explore the dataset first. Remember to download it and save it locally. Open the file in an editor like VS Code or even Excel. As it's a text-based CSV file, any editor that can handle large text files should be able to open it.
The headers in the dataset are as follows:
*Hotel_Address, Additional_Number_of_Scoring, Review_Date, Average_Score, Hotel_Name, Reviewer_Nationality, Negative_Review, Review_Total_Negative_Word_Counts, Total_Number_of_Reviews, Positive_Review, Review_Total_Positive_Word_Counts, Total_Number_of_Reviews_Reviewer_Has_Given, Reviewer_Score, Tags, days_since_review, lat, lng*
and Jiashen provides the description of each item on Kaggle.
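If you'd like to confirm those headers with code once you have downloaded the file, a quick hedged sketch that reads only the header row (so it doesn't load all 515,000 rows) might look like this; the path assumes the CSV was saved to the `/data` folder:

```python
import pandas as pd

# nrows=0 reads just the header, which is enough to list the column names
cols = pd.read_csv('../../data/Hotel_Reviews.csv', nrows=0).columns
print(list(cols))
```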
Here they are grouped in a way that might be easier to examine:
##### Hotel columns
* `Hotel_Name`, `Hotel_Address`, `lat` (latitude), `lng` (longitude)
* Using *lat* and *lng* you could plot a map with Python showing the hotel locations (perhaps color coded for negative and positive reviews); a minimal sketch is included after these column notes
* Hotel_Address is not obviously useful to us, and we'll probably replace that with a country for easier sorting & searching
**Hotel Meta-review columns**
* `Average_Score`
* According to the dataset creator, this column is the *Average Score of the hotel, calculated based on the latest comment in the last year*. This seems like an unusual way to calculate the score, but it is the data scraped, so we may take it at face value for now.
✅ Based on the other columns in this data, can you think of another way to calculate the average score?
* `Total_Number_of_Reviews`
* The total number of reviews this hotel has received - it is not clear (without writing some code) if this refers to the reviews in the dataset. More on this discrepancy below in the **Average hotel score** section.
* `Additional_Number_of_Scoring`
* This means a review score was given but no positive or negative review was written by the reviewer
@ -73,12 +61,12 @@ Here they are grouped in a way that might be easier to examine:
- If a reviewer wrote nothing, this field will have "**No Negative**"
- Note that a reviewer may write a positive review in the Negative review column (e.g. "there is nothing bad about this hotel")
- `Review_Total_Negative_Word_Counts`
- Higher negative word counts indicate a lower score (without checking the sentimentality)
- `Positive_Review`
- If a reviewer wrote nothing, this field will have "**No Positive**"
- Note that a reviewer may write a negative review in the Positive review column (e.g. "there is nothing good about this hotel at all")
- `Review_Total_Positive_Word_Counts`
- Higher positive word counts indicate a higher score (without checking the sentimentality)
- `Review_Date` and `days_since_review`
- A freshness or staleness measure might be applied to a review (older reviews might not be as accurate as newer ones because hotel management changed, or renovations have been done, or a pool was added etc.)
- `Tags`
@ -88,9 +76,9 @@ Here they are grouped in a way that might be easier to examine:
**Reviewer columns**
- `Total_Number_of_Reviews_Reviewer_Has_Given`
- This might be a factor in a recommendation model, for instance, if you could determine that more prolific reviewers with hundreds of reviews were more likely to be negative rather than positive. However, the reviewer of any particular review is not identified with a unique code, and therefore cannot be linked to a set of reviews. There are 30 reviewers with 100 or more reviews, but it's hard to see how this can aid the recommendation model.
- `Reviewer_Nationality`
- Some people might think that certain nationalities are more likely to give a positive or negative review because of a national inclination. Be careful building such anecdotal views into your models. These are national (and sometimes racial) stereotypes, and each reviewer was an individual who wrote a review based on their experience. It may have been filtered through many lenses such as their previous hotel stays, the distance travelled, and their personal temperament. Thinking that their nationality was the reason for a review score is hard to justify.
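Here is the kind of map sketch mentioned in the hotel columns notes above; a minimal, hedged example that assumes the CSV is in the `/data` folder and that matplotlib is installed:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('../../data/Hotel_Reviews.csv')
hotels = df.drop_duplicates(subset='Hotel_Name')  # one point per hotel

plt.scatter(hotels['lng'], hotels['lat'], s=12, alpha=0.6)
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Hotel locations in the dataset')
plt.show()
```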
##### Examples
@ -102,20 +90,21 @@ As you can see from this guest, they did not have a happy stay at this hotel. Th
##### Tags
As mentioned above, at first glance, the idea to use `Tags` to categorize the data makes sense. Unfortunately these tags are not standardized, which means that in a given hotel, the options might be *Single room*, *Twin room*, and *Double room*, but in the next hotel, they are *Deluxe Single Room*, *Classic Queen Room*, and *Executive King Room*. These might be the same things, but there are so many variations that the choice becomes:
1. Attempt to change all terms to a single standard, which is very difficult, because it is not clear what the conversion path would be in each case (e.g. *Classic single room* maps to *Single room* but *Superior Queen Room with Courtyard Garden or City View* is much harder to map)
1. We can take an NLP approach and measure the frequency of certain terms like *Solo*, *Business Traveller*, or *Family with young kids* as they apply to each hotel, and factor that into the recommendation
Tags are usually (but not always) a single field containing a list of 5 to 6 comma separated values aligning to *Type of trip*, *Type of guests*, *Type of room*, *Number of nights*, and *Type of device review was submitted on*. However, because some reviewers don't fill in each field (they might leave one blank), the values are not always in the same order.
As an example, take *Type of group*. There are 1025 unique possibilities in this field in the `Tags` column, and unfortunately only some of them refer to a group (some are the type of room etc.). If you filter only the ones that mention family, the results contain many *Family room* type results. If you include the term *with*, i.e. count the *Family with* values, the results are better, with over 80,000 of the 515,000 results containing the phrase "Family with young children" or "Family with older children".
This means the tags column is not completely useless to us, but it will take some work to make it useful.
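A minimal sketch of that kind of tag counting, assuming `df` is the DataFrame you will load in the exercise below:

```python
# Count the reviews whose Tags field mentions a family group
family_mask = df["Tags"].str.contains("Family with", na=False)
print(f"Rows tagged 'Family with ...': {family_mask.sum()} of {len(df)}")
```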
##### Average hotel score
There are a number of oddities or discrepancies with the data set that I can't figure out, but are illustrated here so you are aware of them when building your models. If you figure it out, please let us know in the discussion section!
The dataset has the following columns relating to the average score and number of reviews:
@ -125,7 +114,7 @@ The dataset has the following columns relating to the average score and number o
4. Total_Number_of_Reviews
5. Reviewer_Score
The single hotel with the most reviews in this dataset is *Britannia International Hotel Canary Wharf* with 4789 reviews out of 515,000. But if we look at the `Total_Number_of_Reviews` value for this hotel, it is 9086. You might surmise that there are many more scores without reviews, so perhaps we should add in the `Additional_Number_of_Scoring` column value. That value is 2682, and adding it to 4789 gets us 7,471 which is still 1615 short of the `Total_Number_of_Reviews`.
If you take the `Average_Score` columns, you might surmise it is the average of the reviews in the dataset, but the description from Kaggle is "*Average Score of the hotel, calculated based on the latest comment in the last year*". That doesn't seem that useful, but we can calculate our own average based on the reviews scores in the data set. Using the same hotel as an example, the average hotel score is given as 7.1 but the calculated score (average reviewer score *in* the dataset) is 6.8. This is close, but not the same value, and we can only guess that the scores given in the `Additional_Number_of_Scoring` reviews increased the average to 7.1. Unfortunately with no way to test or prove that assertion, it is difficult to use or trust `Average_Score`, `Additional_Number_of_Scoring` and `Total_Number_of_Reviews` when they are based on, or refer to, data we do not have.
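A hedged sketch of that kind of per-hotel check, assuming `df` is the DataFrame you will load in the exercise below:

```python
# Compare the rows actually present per hotel with the stated totals
counts = df.groupby("Hotel_Name").agg(
    Reviews_Found=("Hotel_Name", "size"),
    Total_Number_of_Reviews=("Total_Number_of_Reviews", "first"),
    Additional_Number_of_Scoring=("Additional_Number_of_Scoring", "first"),
)
print(counts.loc["Britannia International Hotel Canary Wharf"])
```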
@ -133,19 +122,17 @@ To complicate things further, the hotel with the second highest number of review
On the possibility that this hotel might be an outlier, and that maybe most of the values tally up (but some do not for some reason), we will write a short program next to explore the values in the dataset and determine the correct usage (or non-usage) of the values.
> 🚨 A note of caution
>
> When working with this dataset you will write code that calculates something from the text without having to read or analyse the text yourself. This is the essence of NLP, interpreting meaning or sentiment without having to have a human do it. However, it is possible that you will read some of the negative reviews. I would urge you not to, because you don't have to. Some of them are silly, or irrelevant negative hotel reviews, such as "The weather wasn't great", something beyond the control of the hotel, or indeed, anyone. But there is a dark side to some reviews too. Sometimes the negative reviews are racist, sexist, or ageist. This is unfortunate but to be expected in a dataset scraped off a public website. Some reviewers leave reviews that you would find distasteful, uncomfortable, or upsetting. Better to let the code measure the sentiment than read them yourself and be upset. That said, it is a minority that write such things, but they exist all the same.
## Exercise
### Load the data
Learning pandas is hard but very worthwhile; it is a great library to master. For this lesson, you need to understand the following: DataFrames, Series, `value_counts()`, `apply()`, `groupby()`, and `transform()`.
That's enough examining the data visually, now you'll write some code and get some answers! This section uses the pandas library. Your very first task is to ensure you can load and read the CSV data. The pandas library has a fast CSV loader, and the result is placed in a dataframe, as in previous lessons. The CSV we are loading has over half a million rows, but only 17 columns. Pandas gives you lots of powerful ways to interact with a dataframe, including the ability to perform operations on every row.
There are some great guides and docs at the [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/) and it's worth following the *Getting started* and *User guide*.
From here on in this lesson, there will be code snippets and some explanations of the code and some discussion about what the results mean. Use the included _notebook.ipynb_ for your code.
Let's start by loading the data file you'll be using:
@ -156,31 +143,29 @@ import time
# importing time so the start and end time can be used to calculate file loading time
print("Loading data file now, this could take a while depending on file size")
start = time.time()
# df is 'DataFrame' - make sure you downloaded the file to the data folder
df = pd.read_csv('../../data/Hotel_Reviews.csv')
end = time.time()
print("Loading took " + str(round(end - start, 2)) + " seconds")
```
Now that the data is loaded, we can perform some operations on it. Keep this code at the top of your program for the next part.
## Explore the data
In this case, the data is already *clean*, that means that it is ready to work with, and does not have characters in other languages that might trip up algorithms expecting only English characters.
✅ You might have to work with data that required some initial processing to format it before applying NLP techniques, but not this time. If you had to, how would you handle non-English characters?
Take a moment to ensure that once the data is loaded, you can explore it with code. It's very easy to want to focus on the `Negative_Review` and `Positive_Review` columns. They are filled with natural text for your NLP algorithms to process. But wait! Before you jump into the NLP and sentiment, you should follow the code below to ascertain if the values given in the dataset match the values you calculate with pandas.
## Dataframe operations
The first task in this lesson is to check if the following assertions are correct by writing some code that examines the data frame (without changing it).
With each of these questions, you can build on the previous answer by adding each solution beneath the previous answer (you don't have to create a new Python file for each answer). Remember to include the code in the *Loading the CSV file* section above; that code is *required* before your code.
> Like many programming tasks, there are several ways to complete this, but good advice is to do it in the simplest, easiest way you can, especially if it will be easier to understand when you come back to this code in the future. With dataframes, there is a comprehensive API that will often have a way to do what you want efficiently.
Here are the questions on their own, followed by the code and explanations. Treat them as coding tasks and attempt to answer them without looking at the solution.
1. Print out the *shape* of the data frame you have just loaded (the shape is the number of rows and columns)
2. Calculate the frequency count for reviewer nationalities:
@ -196,7 +181,7 @@ Here are the questions on their own, followed by the code and explanations:
8. Calculate and print out how many rows have column `Positive_Review` values of "No Positive"
9. Calculate and print out how many rows have column `Positive_Review` values of "No Positive" **and** `Negative_Review` values of "No Negative"
### Code answers
1. Print out the *shape* of the data frame you have just loaded (the shape is the number of rows and columns)
@ -293,7 +278,7 @@ Here are the questions on their own, followed by the code and explanations:
display(hotel_freq_df)
```
| Hotel_Name | Total_Number_of_Reviews | Total_Reviews_Found |
| :----------------------------------------: | :---------------------: | :-----------------: |
| Britannia International Hotel Canary Wharf | 9086 | 4789 |
| Park Plaza Westminster Bridge London | 12158 | 4169 |
| Copthorne Tara Hotel London Kensington | 7105 | 3578 |
@ -326,7 +311,7 @@ Here are the questions on their own, followed by the code and explanations:
display(review_scores_df[["Average_Score_Difference", "Average_Score", "Calc_Average_Score", "Hotel_Name"]])
```
You may also wonder about the `Average_Score` value and why it is sometimes different from the calculated average score. As we can't know why some of the values match, but others have a difference, it's safest in this case to use the review scores that we have to calculate the average ourselves. That said, the differences are usually very small; here are the hotels with the greatest deviation from the dataset average and the calculated average:
| Average_Score_Difference | Average_Score | Calc_Average_Score | Hotel_Name |
| :----------------------: | :-----------: | :----------------: | ------------------------------------------: |
@ -393,11 +378,11 @@ Here are the questions on their own, followed by the code and explanations:
Sum took 0.19 seconds
```
You may have noticed that there are 127 rows that have both "No Negative" and "No Positive" values for the columns `Negative_Review` and `Positive_Review` respectively. That means that the reviewer gave the hotel a numerical score, but declined to write either a positive or negative review. Luckily this is a small amount of rows (127 out of 515738, or 0.02%), so it probably won't skew our model or results in any particular direction, but you might not have expected a data set of reviews to have rows with no reviews, so it's worth exploring the data to discover rows like this.
### Modifying the dataframe
Now that you've explored the dataset, you can see some issues with it. Some columns are filled with useless information, others are just incorrect. If they are correct, it's unclear how they were calculated, and answers cannot be independently verified by your own calculations.
Next, you will add columns that will be useful later, change the values in other columns, and drop certain columns completely.
@ -464,7 +449,7 @@ Follow these steps in order:
2. Hotel Meta-review columns: `Average_Score`, `Total_Number_of_Reviews`, `Additional_Number_of_Scoring`
* Drop `Additional_Number_of_Scoring`
* Replace `Total_Number_of_Reviews` with the total number of reviews for that hotel that are actually in the dataset
* Replace `Average_Score` with our own calculated score (a hedged sketch of these replacements follows the steps below)
@ -488,7 +473,7 @@ Follow these steps in order:
- Drop `Total_Number_of_Reviews_Reviewer_Has_Given`
- Keep `Reviewer_Nationality`
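As a hedged sketch of the replacement steps above (the exact code in your notebook may differ):

```python
# Replace the stated totals and averages with values calculated from the rows we actually have
df["Total_Number_of_Reviews"] = df.groupby("Hotel_Name")["Hotel_Name"].transform("count")
df["Average_Score"] = round(df.groupby("Hotel_Name")["Reviewer_Score"].transform("mean"), 1)
df.drop(["Additional_Number_of_Scoring"], axis=1, inplace=True)
```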
Finally, save the dataset as it is now with a new name.
```python
df.drop(["Review_Total_Negative_Word_Counts", "Review_Total_Positive_Word_Counts", "days_since_review", "Total_Number_of_Reviews_Reviewer_Has_Given"], axis = 1, inplace=True)
@ -501,13 +486,14 @@ df.to_csv(r'Hotel_Reviews_Filtered.csv', index = False)
---
## 🚀Challenge
This lesson demonstrates, as we saw in previous lessons, how critically important it is to understand your data and its foibles before performing operations on it. Text-based data, in particular, bears careful scrutiny. Dig through various text-heavy datasets and see if you can discover areas that could introduce bias or skewed sentiment into a model.
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/38/)
## Review & Self Study
Take [this Learning Path on NLP](https://docs.microsoft.com/learn/paths/explore-natural-language-processing/?WT.mc_id=academic-15963-cxa) to discover tools to try when building speech and text-heavy models.
## Assignment
[NLTK](assignment.md)

@ -1,9 +1,6 @@
# NLTK
## Instructions
NLTK is a well-known library for use in computational linguistics and NLP. Take this opportunity to read through the '[NLTK book](https://www.nltk.org/book/)' and try out its exercises. In this ungraded assignment, you will get to know this library more deeply.

@ -0,0 +1 @@
Download the hotel review data to this folder.

@ -118,7 +118,7 @@
],
"cell_type": "code",
"metadata": {},
"execution_count": 6,
"execution_count": null,
"outputs": []
},
{
@ -237,7 +237,7 @@
},
{
"source": [
"We add small amount `eps` to the original vector in order to avoid division by 0 in the initial case, when all components of the vector are identical.\n",
"We add a small amount of `eps` to the original vector in order to avoid division by 0 in the initial case, when all components of the vector are identical.\n",
"\n",
"The actual learning algorithm we will run for 5000 experiments, also called **epochs**: "
],
@ -354,7 +354,7 @@
},
{
"source": [
"## Investigating Learning Process"
"## Investigating the Learning Process"
],
"cell_type": "markdown",
"metadata": {}

@ -28,7 +28,7 @@
"source": [
"# Peter and the Wolf: Reinforcement Learning Primer\n",
"\n",
"In this tutorial, we will learn how to apply Reinforcement learning to a problem of path finding. The setting is inspired by [Peter and the Wolf](https://en.wikipedia.org/wiki/Peter_and_the_Wolf) musical fairy tale by Russian composer [Segei Prokofiev](https://en.wikipedia.org/wiki/Sergei_Prokofiev). It is a story about young pioneer Peter, who bravely goes out of his house to the forest clearing to chase the wolf. We will train machine learning algorithms that will help Peter to explore the surroinding area and build an optimal navigation map.\n",
"In this tutorial, we will learn how to apply Reinforcement learning to a problem of path finding. The setting is inspired by [Peter and the Wolf](https://en.wikipedia.org/wiki/Peter_and_the_Wolf) musical fairy tale by Russian composer [Sergei Prokofiev](https://en.wikipedia.org/wiki/Sergei_Prokofiev). It is a story about young pioneer Peter, who bravely goes out of his house to the forest clearing to chase the wolf. We will train machine learning algorithms that will help Peter to explore the surroinding area and build an optimal navigation map.\n",
"\n",
"First, let's import a bunch of userful libraries:"
],
@ -128,13 +128,11 @@
},
{
"source": [
"The strategy of our agent (Peter) is defined by so-called **policy**. A policy is a function that returns the action at any given state. In our case, the state of the problem is represented by the board, including the current position of the player. \n",
"\n",
"The goal of reinforcement learning is to eventually learn a good policy that will allow us to solve the problem efficiently. However, as a baseline, let's consider the simplest policy called **random walk**.\n",
"The strategy of our agent (Peter) is defined by a so-called **policy**. Let's consider the simplest policy called **random walk**.\n",
"\n",
"## Random walk\n",
"\n",
"Let's first solve our problem by implementing a random walk strategy. With random walk, we will randomly chose the next action from allowed ones, until we reach the apple. "
"Let's first solve our problem by implementing a random walk strategy."
],
"cell_type": "markdown",
"metadata": {}
@ -223,7 +221,7 @@
"source": [
"## Reward Function\n",
"\n",
"To make out policy more intelligent, we need to understand which moves are \"better\" than others. To do this, we need to define our goal. The goal can be defined in terms of **reward function**, that will return some score value for each state. The higher the number - the better is the reward function\n",
"To make our policy more intelligent, we need to understand which moves are \"better\" than others.\n",
"\n"
],
"cell_type": "markdown",
@ -253,13 +251,9 @@
},
{
"source": [
"Interesting thing about reward function is that in most of the cases *we are only given substantial reward at the end of the game*. It means that out algorithm should somehow remember \"good\" steps that lead to positive reward at the end, and increase their importance. Similarly, all moves that lead to bad results should be discouraged.\n",
"\n",
"## Q-Learning\n",
"\n",
"An algorithm that we will discuss here is called **Q-Learning**. In this algorithm, the policy is defined by a function (or a data structure) called **Q-Table**. It records the \"goodness\" of each of the actions in a given state, i.e. $Q : {S\\times A}\\to\\mathbb{R}$, where $S$ is a set of states, $A$ is the set of actions.\n",
"\n",
"It is called Q-Table because it is often convenient to represent it as a table, or multi-dimensional array. Since our board has dimentions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
"Build a Q-Table, or multi-dimensional array. Since our board has dimentions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
],
"cell_type": "markdown",
"metadata": {}
@ -275,7 +269,7 @@
},
{
"source": [
"Notice that we initially initialize all values of Q-Table with equal value, in our case - 0.25. That corresponds to the \"random walk\" policy, because all moves in each state are equally good. We can pass the Q-Table to the `plot` function in order to visualize the table on the board:"
"Pass the Q-Table to the plot function in order to visualize the table on the board:"
],
"cell_type": "markdown",
"metadata": {}
@ -301,38 +295,11 @@
"m.plot(Q)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"source": [
"In the center of each cell there is an \"arrow\" that indicates the preferred direction of movement. Since all directions are equal, a dot is displayed.\n",
"\n",
"Now we need to run the simulation, explore our environment, and learn better distribution of Q-Table values, which will allow us to find the path to the apple much faster.\n",
"\n",
"## Essence of Q-Learning: Bellman Equation\n",
"\n",
"Once we start moving, each action will have a corresponding reward, i.e. we can theoretically select the next action based on the highest immediate reward. However, in most of the states the move will not achieve our goal or reaching the apple, and thus we cannot immediately decide which direction is better.\n",
"\n",
"> It is not the immediate result that matters, but rather the final result, which we will obtain at the end of the simulation.\n",
"\n",
"In order to account for this delayed reward, we need to use the principles of **[dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming)**, which allows us to think about out problem recursively.\n",
"## Essence of Q-Learning: Bellman Equation and Learning Algorithm\n",
"\n",
"Suppose we are now at the state $s$, and we want to move to the next state $s'$. By doing so, we will receive the immediate reward $r(s,a)$, defined by reward function, plus some future reward. If we suppose that our Q-Table correctly reflects the \"attractiveness\" of each action, then at state $s'$ we will chose an action $a'$ that corresponds to maximum value of $Q(s',a')$. Thus, the best possible future reward we could get at state $s'$ will be defined as $\\max_{a'}Q(s',a')$ (maximum here is computed over all possible actions $a'$ at state $s'$). \n",
"\n",
"This gives the **Bellman formula** for calculating the value of Q-Table at state $s$, given action $a$:\n",
"\n",
"$$Q(s,a) = r(s,a) + \\gamma \\max_{a'} Q(s',a')$$\n",
"\n",
"Here $\\gamma$ is so-called **discount factor** that determines to which extent you should prefer current reward over the future reward and vice versa.\n",
"\n",
"## Learning Algorithm\n",
"\n",
"Given the equation above, we can now write a pseudo-code for our leaning algorithm:\n",
"Write a pseudo-code for our leaning algorithm:\n",
"\n",
"* Initialize Q-Table Q with equal numbers for all states and actions\n",
"* Set learning rate $\\alpha\\leftarrow 1$\n",
@ -349,9 +316,7 @@
"\n",
"## Exploit vs. Explore\n",
"\n",
"In the algorithm above, we did not specify how exactly we should chose an action at step 2.1. If we are choosing the action randomly, we will randomly **explore** the environment, and we are quite likely to die often, and also explore such areas where we would not normally go. An alternative approach would be to **exploit** the Q-Table values that we already know, and thus to chose the best action (with highers Q-Table value) at state $s$. This, however, will prevent us from exploring other states, and quite likely we might not find the optimal solution.\n",
"\n",
"Thus, the best approach is to balance between exploration and exploitation. This can be easily done by choosing the action at state $s$ with probabilities proportional to values in Q-Table. In the beginning, when Q-Table values are all the same, it would correspond to random selection, but as we learn more about our environment, we would be more likely to follow the optimal route, however, choosing the unexplored path once in a while.\n",
"The best approach is to balance between exploration and exploitation. As we learn more about our environment, we would be more likely to follow the optimal route, however, choosing the unexplored path once in a while.\n",
"\n",
"## Python Implementation\n",
"\n",
@ -374,7 +339,7 @@
},
{
"source": [
"We add small amount `eps` to the original vector in order to avoid division by 0 in the initial case, when all components of the vector are identical.\n",
"We add a small amount of `eps` to the original vector in order to avoid division by 0 in the initial case, when all components of the vector are identical.\n",
"\n",
"The actual learning algorithm we will run for 5000 experiments, also called **epochs**: "
],
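The notebook's exact helper is not shown in this hunk, but a normalization function of the kind being described usually looks something like the sketch below: shift the vector so it is non-negative, add a tiny `eps`, and divide by the sum to get action probabilities.

```python
import numpy as np

def probs(v, eps=1e-4):
    """Turn a vector of Q-values into a probability distribution over actions."""
    v = v - v.min() + eps   # eps avoids dividing by zero when all components are identical
    return v / v.sum()

print(probs(np.array([0.25, 0.25, 0.25, 0.25])))   # uniform distribution: [0.25 0.25 0.25 0.25]
```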
@ -431,7 +396,7 @@
},
{
"source": [
"After executing this algorithm, Q-Table should be updated with values that define the attractiveness of different actions at each step. We can try to visualize Q-Table by plotting a vector at each cell that will point in the desired direction of movement. For simplicity, we draw small circle instead of arrow head."
"After executing this algorithm, the Q-Table should be updated with values that define the attractiveness of different actions at each step. Visualize the table here:"
],
"cell_type": "markdown",
"metadata": {}
@ -494,13 +459,11 @@
},
{
"source": [
"If you try the code above several times, you may notice that sometimes it just \"hangs\", and you need to press STOP button in the notebook to interrupt it. This happens because there could be situations when two states \"point\" to each other in terms of optimal Q-Value, in which case the agents ends up moving between those states indefinitely.\n",
"If you try the code above several times, you may notice that sometimes it just \"hangs\", and you need to press the STOP button in the notebook to interrupt it. \n",
"\n",
"> **Task 1:** Modify the `walk` function to limit the maximum length of path by a certain number of steps (say, 100), and watch the code above return this value from time to time.\n",
"\n",
"> **Task 2:** Modify the `walk` function so that it does not go back to the places where is has already been previously. This will prevent `walk` from looping, however, the agent can still end up being \"trapped\" in a location from which it is unable to escape.\n",
"\n",
"Better navigation policy would be the one that we have used during training, which combines exploitation and exploration. In this policy, we will select each action with a certain probability, proportional to the values in Q-Table. This strategy may still result in the agent returning back to the position it has already explored, but, as you can see from the code below, it results in very short average path to the desired location (remember that `print_statistics` runs the simulation 100 times): "
"> **Task 2:** Modify the `walk` function so that it does not go back to the places where is has already been previously. This will prevent `walk` from looping, however, the agent can still end up being \"trapped\" in a location from which it is unable to escape. "
],
"cell_type": "markdown",
"metadata": {}
@ -531,9 +494,7 @@
},
{
"source": [
"## Investigating Learning Process\n",
"\n",
"As we have mentioned, the learning process is a balance between exploration and exploration of gained knowledge about the structure of problem space. We have seen that the result of learning (the ability to help an agent to find short path to the goal) has improved, but it is also interesting to observe how the average path length behaves during the learning process: "
"## Investigating the Learning Process"
],
"cell_type": "markdown",
"metadata": {}
@ -585,9 +546,9 @@
{
"source": [
"## Exercise\n",
"#### More Realistic Peter and the Wolf World\n",
"#### A More Realistic Peter and the Wolf World\n",
"\n",
"In our situation, Peter was able to move around almost without getting tired or hungry. In more realistic world, we has to sit down and rest from time to time, and also to feed himself. Let's make our world more realistic, by implementing the following rules:\n",
"In our situation, Peter was able to move around almost without getting tired or hungry. In a more realistic world, he has to sit down and rest from time to time, and also to feed himself. Let's make our world more realistic by implementing the following rules:\n",
"\n",
"1. By moving from one place to another, Peter loses **energy** and gains some **fatigue**.\n",
"2. Peter can gain more energy by eating apples.\n",
@ -603,13 +564,6 @@
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
]
}
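One hedged way to start on that exercise, modeling only the two rules visible in this diff (names and numbers are placeholders), is to carry energy and fatigue alongside Peter's position so they can later be folded into the state and reward:

```python
from dataclasses import dataclass

@dataclass
class PeterState:
    x: int
    y: int
    energy: float = 10.0
    fatigue: float = 0.0

def step(state, dx, dy, cell):
    """Move Peter one square and apply rules 1 and 2 from the exercise."""
    new = PeterState(state.x + dx, state.y + dy,
                     state.energy - 1.0,      # rule 1: moving costs energy
                     state.fatigue + 1.0)     # rule 1: moving adds fatigue
    if cell == 'apple':
        new.energy += 5.0                     # rule 2: apples restore energy
    return new
```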