diff --git a/README.md b/README.md
index 47792d9b..eb2fb519 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,9 @@

 > 🌍 Travel around the world as we explore Machine Learning by means of world cultures 🌍

-Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about traditional Machine Learning. In this lesson group, you will learn about what is sometimes called 'classic' ML, using primarily Scikit-Learn as a library and avoiding deep learning, which is covered in our 'AI for Beginners' curriculum. Travel with us around the world as we apply these classic techniques to data from many areas of the world. Each lesson includes pre- and post-lesson quizzes, written instructions to complete the lesson, a solution, an assignment and more. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.
+Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about traditional Machine Learning. In this lesson group, you will learn about what is sometimes called 'classic' ML, using primarily Scikit-Learn as a library and avoiding deep learning, which is covered in our forthcoming 'AI for Beginners' curriculum.
+
+Travel with us around the world as we apply these classic techniques to data from many areas of the world. Each lesson includes pre- and post-lesson quizzes, written instructions to complete the lesson, a solution, an assignment and more. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.

 **Hearty thanks to our authors (list all authors here)**

@@ -23,7 +25,7 @@ Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson cur

 - Start with a pre-lecture quiz
 - Read the lecture and complete the activities, pausing and reflecting at each knowledge check.
-- Try to create the projects by comprehending the lessons rather than copying the solution code; however that code is available in the `/solution` folders in each project-oriented lesson.
+- Try to create the projects by comprehending the lessons rather than running the solution code; however, that code is available in the `/solution` folders in each project-oriented lesson.
 - Take the post-lecture quiz
 - Complete the challenge
 - Complete the assignment
@@ -68,14 +70,14 @@ By ensuring that the content aligns with projects, the process is made more enga
 | 05 | North American Pumpkin Prices 🎃 | [Regression](Regression/README.md) | Visualize and clean data in preparation for ML | [lesson](Regression/2-Data/README.md) | Jen |
 | 06 | North American Pumpkin Prices 🎃 | [Regression](Regression/README.md) | Build Linear and Polynomial Regression models | [lesson](Regression/3-Linear/README.md) | Jen |
 | 07 | North American Pumpkin Prices 🎃 | [Regression](Regression/README.md) | Build a Logistic Regression model | [lesson](Regression/4-Logistic/README.md) | Jen |
-| 08 | Introduction to Classification | [Classification](Classification/README.md) | Clean, Prep, and Visualize your Data; Introduction to Classification | [lesson](Classification/1-Data/README.md) | Cassie |
-| 09 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Discriminative Model | [lesson](Classification/2-Descriminative/README.md) | Cassie |
-| 10 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Generative Model | [lesson](Classification/3-Generative/README.md) | Cassie |
-| 11 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Web App using your Model | [lesson](Classification/4-Applied/README.md) | Cassie |
-| 12 | Introduction to Clustering | [Clustering](Clustering/README.md) | Clean, Prep, and Visualize your Data; Introduction to Clustering | [lesson](Clustering/1-Visualize/README.md) | |
-| 13 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](Clustering/README.md) | Explore the K-Means Clustering Method | [lesson](Clustering/2-K-Means/README.md) | |
-| 14 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](Clustering/README.md) | Explore Centroid models for Clustering | [lesson](Clustering/3-Centroid/README.md) | |
-| 15 | A Web App 🔌 | [Web App](Web-App/README.md) | Build a Web app to use your trained model | [lesson](Web-App/README.md) | Jen |
+| 08 | A Web App 🔌 | [Web App](Web-App/README.md) | Build a Web app to use your trained model | [lesson](Web-App/README.md) | Jen |
+| 09 | Introduction to Classification | [Classification](Classification/README.md) | Clean, Prep, and Visualize your Data; Introduction to Classification | [lesson](Classification/1-Data/README.md) | Cassie |
+| 10 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Discriminative Model | [lesson](Classification/2-Descriminative/README.md) | Cassie |
+| 11 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Generative Model | [lesson](Classification/3-Generative/README.md) | Cassie |
+| 12 | Delicious Asian Recipes 🍜 | [Classification](Classification/README.md) | Build a Web App using your Model | [lesson](Classification/4-Applied/README.md) | Cassie |
+| 13 | Introduction to Clustering | [Clustering](Clustering/README.md) | Clean, Prep, and Visualize your Data; Introduction to Clustering | [lesson](Clustering/1-Visualize/README.md) | |
+| 14 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](Clustering/README.md) | Explore the K-Means Clustering Method | [lesson](Clustering/2-K-Means/README.md) | |
+| 15 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](Clustering/README.md) | Explore Centroid models for Clustering | [lesson](Clustering/3-Centroid/README.md) | |
 | 16 | Introduction to NLP | [Natural Language Processing](NLP/README.md) | tbd | [lesson]() | Stephen |
 | 17 | Romantic Hotels of Europe ♥️ | [Natural Language Processing](NLP/README.md) | tbd | [lesson]() | Stephen |
 | 18 | Romantic Hotels of Europe ♥️ | [Natural Language Processing](NLP/README.md) | tbd | [lesson]() | Stephen |
diff --git a/Web-App/1-Web-App/README.md b/Web-App/1-Web-App/README.md
index 4b897e99..9141dfbc 100644
--- a/Web-App/1-Web-App/README.md
+++ b/Web-App/1-Web-App/README.md
@@ -1,6 +1,7 @@
 # Build a Web App to use a ML Model

 In this lesson, you will train a Linear Regression model and a Classification model on a dataset that's out of this world: UFO Sightings over the past century, sourced from [NUFORC's database](https://www.nuforc.org). We will continue our use of notebooks to clean data and train our model, but you can take the process one step further by exploring using a model 'in the wild', so to speak: in a web app. To do this, you need to build a web app using Flask.
+## [Pre-lecture quiz](link-to-quiz-app)

 There are several ways to build web apps to consume machine learning models. Your web architecture may influence the way your model is trained. Imagine that you are working in a business where the data science group has trained a model that they want you to use in an app. There are many questions you need to ask: Is it a web app, or a mobile app? Where will the model reside, in the cloud or locally? Does the app have to work offline? And what technology was used to train the model, because that may influence the tooling you need to use?

@@ -15,12 +16,256 @@ You also have the opportunity to build an entire Flask web app that would be abl
 ## Tools

 For this task, you need two tools: Flask and Pickle, both of which run on Python.
-## [Pre-lecture quiz](link-to-quiz-app)

-✅ Knowledge Check - use this moment to stretch students' knowledge with open questions
+✅ What's [Flask](https://palletsprojects.com/p/flask/)? Defined as a 'micro-framework' by its creators, Flask provides the basic features of web frameworks using Python and a templating engine to build web pages. Take a look at [this Learn module](https://docs.microsoft.com/learn/modules/python-flask-build-ai-web-app?WT.mc_id=academic-15963-cxa) to practice building with Flask.
+
+✅ What's [Pickle](https://docs.python.org/3/library/pickle.html)? Pickle 🥒 is a Python module that serializes and de-serializes a Python object structure. When you 'pickle' a model, you serialize or flatten its structure for use on the web. Be careful: pickle is not intrinsically secure, so only 'un-pickle' files that come from a source you trust. A pickled file has the suffix `.pkl`.
+
+## Clean your data
+
+In this lesson you'll use data from 80,000 UFO sightings, gathered by [NUFORC](https://nuforc.org) (The National UFO Reporting Center). This data has some interesting descriptions of UFO sightings, for example "A man emerges from a beam of light that shines on a grassy field at night and he runs towards the Texas Instruments parking lot" or simply "the lights chased us". The [ufos.csv](./data/ufos.csv) spreadsheet includes columns about the city, state and country where the sighting occurred, the object's shape and its latitude and longitude.
+
+In the blank [notebook](notebook.ipynb) included in this lesson, import pandas, matplotlib, and numpy as you did in previous lessons and import the ufos spreadsheet. Then take a look at a sample of the data:
+
+```python
+import pandas as pd
+import numpy as np
+
+ufos = pd.read_csv('../data/ufos.csv')
+ufos.head()
+```
+Convert the ufos data to a small dataframe with fresh column titles, then check the unique values in the `Country` field.
+
+```python
+from sklearn.preprocessing import LabelEncoder
+
+ufos = pd.DataFrame({'Seconds': ufos['duration (seconds)'], 'Country': ufos['country'],'Latitude': ufos['latitude'],'Longitude': ufos['longitude']})
+
+ufos.Country.unique()
+```
+
+Now, you can reduce the amount of data you need to deal with by dropping any null values and keeping only sightings that lasted between 1 and 60 seconds:
+
+```python
+ufos.dropna(inplace=True)
+
+ufos = ufos[(ufos['Seconds'] >= 1) & (ufos['Seconds'] <= 60)]
+
+ufos.info()
+```
+
+Next, use Scikit-Learn's `LabelEncoder` class to convert the text values for countries to numbers.
+
+✅ LabelEncoder encodes data alphabetically
+
+```python
+from sklearn.preprocessing import LabelEncoder
+
+ufos['Country'] = LabelEncoder().fit_transform(ufos['Country'])
+
+ufos.head()
+```
+
+Your data should look like this:
+
+```
+    Seconds  Country  Latitude   Longitude
+2   20.0     3        53.200000  -2.916667
+3   20.0     4        28.978333  -96.645833
+14  30.0     4        35.823889  -80.253611
+23  60.0     4        45.582778  -122.352222
+24  3.0      3        51.783333  -0.783333
+```
+## Build your model
+
+Now you can get ready to train a model by dividing the data into training and testing groups. Select the three features you want to train on as your X vector; the y vector will be the `Country`. You want to be able to input seconds, latitude and longitude and get a country id in return.
+
+```python
+from sklearn.model_selection import train_test_split
+
+Selected_features = ['Seconds','Latitude','Longitude']
+
+X = ufos[Selected_features]
+y = ufos['Country']
+
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
+```
+
+Finally, train your model using Logistic Regression:
+
+```python
+from sklearn.model_selection import train_test_split
+from sklearn.metrics import accuracy_score, classification_report
+from sklearn.linear_model import LogisticRegression
+model = LogisticRegression()
+model.fit(X_train, y_train)
+predictions = model.predict(X_test)
+
+print(classification_report(y_test, predictions))
+print('Predicted labels: ', predictions)
+print('Accuracy: ', accuracy_score(y_test, predictions))
+```
+
+The accuracy isn't bad (around 95%), which is unsurprising, as `Country` correlates well with `Latitude` and `Longitude`. The model you created isn't very revolutionary, but it's a good exercise to train a model from raw data that you cleaned, export it, and then use it in a web app.
+## Pickle your model
+
+Now, it's time to pickle your model! You can do that in just a few lines of code. Once it's pickled, load the model and test it against a sample data array containing values for seconds, latitude and longitude:
+
+```python
+import pickle
+model_filename = 'ufo-model.pkl'
+pickle.dump(model, open(model_filename,'wb'))
+
+model = pickle.load(open('ufo-model.pkl','rb'))
+print(model.predict([[50,44,-12]]))
+```
+The model returns '3', which is the country code for the UK. Wild! 👽
+
+## Build a Flask app
+
+Now you can build a Flask app to call your model and return similar results, but in a more visually pleasing way.
+
+Start by creating a folder called `web-app` next to the `notebook.ipynb` file where your `ufo-model.pkl` file resides. In that folder create three more folders: `static`, with a folder `css` inside it, and `templates`.
+
+> Refer to the solution folder for a view of the finished app
+
+The first file to create in `web-app` is a `requirements.txt` file. Like `package.json` in a JavaScript app, this file lists dependencies required by the app. In `requirements.txt` add the lines:
+
+```text
+scikit-learn
+pandas
+numpy
+flask
+```
+Now, install these dependencies by navigating to `web-app` (`cd web-app`) in your terminal and typing `pip install -r requirements.txt`
+
+> You might need to use `pip3 install -r requirements.txt`, depending on your local configuration.
+
+Now, you're ready to create three more files to finish the app:
+
+1. Create `app.py` in the root
+1. Create `index.html` in `templates`
+1. Create `styles.css` in `static/css`
+
+Build out the `styles.css` file with a few styles:
+
+```css
+body {
+    width: 100%;
+    height: 100%;
+    font-family: 'Helvetica';
+    background: black;
+    color: #fff;
+    text-align: center;
+    letter-spacing: 1.4px;
+    font-size: 30px;
+}
+
+input {
+    min-width: 150px;
+}
+
+.grid {
+    width: 300px;
+    border: 1px solid #2d2d2d;
+    display: grid;
+    justify-content: center;
+    margin: 20px auto;
+}
+
+.box {
+    color: #fff;
+    background: #2d2d2d;
+    padding: 12px;
+    display: inline-block;
+}
+```
+Next, build out the `index.html` file:
+
+```html
+
+
+According to the number of seconds, latitude and longitude, which country is likely to have reported seeing a UFO?
+
+{{ prediction_text }}
+```

| | datetime | city | state | country | shape | duration (seconds) | duration (hours/min) | comments | date posted | latitude | longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10/10/1949 20:30 | san marcos | tx | us | cylinder | 2700.0 | 45 minutes | This event took place in early fall around 194... | 4/27/2004 | 29.883056 | -97.941111 |
| 1 | 10/10/1949 21:00 | lackland afb | tx | NaN | light | 7200.0 | 1-2 hrs | 1949 Lackland AFB, TX. Lights racing acros... | 12/16/2005 | 29.384210 | -98.581082 |
| 2 | 10/10/1955 17:00 | chester (uk/england) | NaN | gb | circle | 20.0 | 20 seconds | Green/Orange circular disc over Chester, En... | 1/21/2008 | 53.200000 | -2.916667 |
| 3 | 10/10/1956 21:00 | edna | tx | us | circle | 20.0 | 1/2 hour | My older brother and twin sister were leaving ... | 1/17/2004 | 28.978333 | -96.645833 |
| 4 | 10/10/1960 20:00 | kaneohe | hi | us | light | 900.0 | 15 minutes | AS a Marine 1st Lt. flying an FJ4B fighter/att... | 1/22/2004 | 21.418056 | -157.803611 |
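
The reduction from raw rows like these to the four-column frame used for training can be sketched on a few in-memory rows. This is a minimal sketch, assuming pandas is installed; the three sample rows are copied from the output above, and the variable names `raw` and `small` are illustrative rather than taken from the lesson notebook.

```python
import pandas as pd

# Three raw rows copied from the sample output above,
# restricted to the columns the lesson keeps.
raw = pd.DataFrame({
    'duration (seconds)': [2700.0, 7200.0, 20.0],
    'country': ['us', None, 'gb'],
    'latitude': [29.883056, 29.384210, 53.200000],
    'longitude': [-97.941111, -98.581082, -2.916667],
})

# Build the smaller frame with fresh column titles.
small = pd.DataFrame({
    'Seconds': raw['duration (seconds)'],
    'Country': raw['country'],
    'Latitude': raw['latitude'],
    'Longitude': raw['longitude'],
})

# Drop rows with missing values, then keep sightings of 1 to 60 seconds.
small = small.dropna()
small = small[(small['Seconds'] >= 1) & (small['Seconds'] <= 60)]

print(small)
```

Only the row with index 2 survives here: row 0 lasts longer than 60 seconds and row 1 has no country. The surviving rows keep their original index labels, which is why the cleaned frame starts at index 2 rather than 0.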

| | Seconds | Country | Latitude | Longitude |
|---|---|---|---|---|
| 2 | 20.0 | 3 | 53.200000 | -2.916667 |
| 3 | 20.0 | 4 | 28.978333 | -96.645833 |
| 14 | 30.0 | 4 | 35.823889 | -80.253611 |
| 23 | 60.0 | 4 | 45.582778 | -122.352222 |
| 24 | 3.0 | 3 | 51.783333 | -0.783333 |
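
The numeric `Country` values above come from `LabelEncoder`, which assigns codes in sorted order of the labels. The sketch below shows how a code such as 3 maps back to a country. It assumes scikit-learn is installed, and the list of country codes is an assumption for illustration: only `us` and `gb` appear in the sample rows shown here.

```python
from sklearn.preprocessing import LabelEncoder

# Assumed set of country codes; only 'us' and 'gb' are visible
# in the sample rows, the others are hypothetical.
countries = ['us', 'gb', 'ca', 'au', 'de', 'us']

encoder = LabelEncoder()
codes = encoder.fit_transform(countries)

# classes_ holds the sorted unique labels; a label's position is its code.
print(list(encoder.classes_))
print(list(codes))

# Map a numeric prediction back to its text label.
print(encoder.inverse_transform([3])[0])
```

Keeping a reference to the fitted encoder, instead of calling `LabelEncoder().fit_transform(...)` inline, is one way to translate the model's numeric predictions back into country codes later.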