Merge branch 'main' into regressio-linear

Jen Looper 3 years ago committed by GitHub
commit 7e02d168b8
# Introduction to machine learning
[![ML, AI, deep learning - What's the difference?](https://img.youtube.com/vi/lTd9RSxS9ZE/0.jpg)](https://youtu.be/lTd9RSxS9ZE "ML, AI, deep learning - What's the difference?")
> 🎥 Click the image above for a video discussing the difference between machine learning, AI, and deep learning.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/)
### Introduction
Welcome to this course on classical machine learning for beginners! Whether you're completely new to this topic, or an experienced ML practitioner looking to brush up on an area, we're happy to have you join us! We want to create a friendly launching spot for your ML study and would be happy to evaluate, respond to, and incorporate your [feedback](https://github.com/microsoft/ML-For-Beginners/discussions).
[![Introduction to ML](https://img.youtube.com/vi/h0e2HAPTGF4/0.jpg)](https://youtu.be/h0e2HAPTGF4 "Introduction to ML")
> 🎥 Click the image above for a video: MIT's John Guttag introduces machine learning
### Getting started with machine learning
Before starting with this curriculum, you need to have your computer set up and ready to run notebooks locally.
- **Configure your machine with these videos**. Learn more about how to set up your machine in this [set of videos](https://www.youtube.com/playlist?list=PLlrxD0HtieHhS8VzuMCfQD4uJ9yne1mE6).
- **Learn Python**. It's also recommended to have a basic understanding of [Python](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa), a programming language useful for data scientists that we use in this course.
- **Learn Node.js and JavaScript**. We also use JavaScript a few times in this course when building web apps, so you will need to have [node](https://nodejs.org) and [npm](https://www.npmjs.com/) installed, as well as [Visual Studio Code](https://code.visualstudio.com/) available for both Python and JavaScript development.
- **Create a GitHub account**. Since you found us here on [GitHub](https://github.com), you might already have an account, but if not, create one and then fork this curriculum to use on your own. (Feel free to give us a star, too 😊)
- **Explore Scikit-learn**. Familiarize yourself with [Scikit-learn](https://scikit-learn.org/stable/user_guide.html), a set of ML libraries that we reference in these lessons.
### What is machine learning?
The term 'machine learning' is one of the most popular and frequently used terms of today. There is a nontrivial possibility that you have heard this term at least once if you have some sort of familiarity with technology, no matter what domain you work in. The mechanics of machine learning, however, are a mystery to most people. For a machine learning beginner, the subject can sometimes feel overwhelming. Therefore, it is important to understand what machine learning actually is, and to learn about it step by step, through practical examples.
![ml hype curve](images/hype.png)
> Google Trends shows the recent 'hype curve' of the term 'machine learning'
We live in a universe full of fascinating mysteries. Great scientists such as Stephen Hawking, Albert Einstein, and many more have devoted their lives to searching for meaningful information that uncovers the mysteries of the world around us. This is the human condition of learning: a human child learns new things and uncovers the structure of their world year by year as they grow to adulthood.
A child's brain and senses perceive the facts of their surroundings and gradually learn the hidden patterns of life, which help the child to craft logical rules to identify learned patterns. The learning process of the human brain makes humans the most sophisticated living creatures in this world. Learning continuously by discovering hidden patterns and then innovating on those patterns enables us to make ourselves better and better throughout our lifetimes. This learning capacity and evolving capability are related to a concept called [brain plasticity](https://www.simplypsychology.org/brain-plasticity.html). Superficially, we can draw some motivational similarities between the learning process of the human brain and the concepts of machine learning.
The [human brain](https://www.livescience.com/29365-human-brain.html) perceives things from the real world, processes the perceived information, makes rational decisions, and performs certain actions based on circumstances. This is what we called behaving intelligently. When we program a facsimile of the intelligent behavioral process to a machine, it is called artificial intelligence (AI).
Although the terms can be confused, machine learning (ML) is an important subset of artificial intelligence. **ML is concerned with using specialized algorithms to uncover meaningful information and find hidden patterns from perceived data to corroborate the rational decision-making process**.
![AI, ML, deep learning, data science](images/ai-ml-ds.png)
> A diagram showing the relationships between AI, ML, deep learning, and data science. Infographic by [Jen Looper](https://twitter.com/jenlooper) inspired by [this graphic](https://softwareengineering.stackexchange.com/questions/366996/distinction-between-ai-ml-neural-networks-deep-learning-and-data-mining)
## What you will learn in this course
In this curriculum, we are going to cover only the core concepts of machine learning that a beginner must know. We cover what we call 'classical machine learning' primarily using Scikit-learn, an excellent library many students use to learn the basics. To understand broader concepts of artificial intelligence or deep learning, a strong fundamental knowledge of machine learning is indispensable, and so we would like to offer it here.
In this course you will learn:
- core concepts of machine learning
- the history of ML
- ML and fairness
- regression ML techniques
- classification ML techniques
- clustering ML techniques
- natural language processing ML techniques
- time series forecasting ML techniques
- reinforcement learning
- real-world applications for ML
## What we will not cover
- deep learning
- neural networks
- AI
To make for a better learning experience, we will avoid the complexities of neural networks, 'deep learning' - many-layered model-building using neural networks - and AI, which we will discuss in a different curriculum. We will also offer a forthcoming data science curriculum to focus on that aspect of this larger field.
## Why study machine learning?
Machine learning, from a systems perspective, is defined as the creation of automated systems that can learn hidden patterns from data to aid in making intelligent decisions.
This motivation is loosely inspired by how the human brain learns certain things based on the data it perceives from the outside world.
✅ Think for a minute why a business would want to try to use machine learning strategies vs. creating a hard-coded rules-based engine.
### Applications of machine learning
Applications of machine learning are now almost everywhere, and are as ubiquitous as the data that is flowing around our societies, generated by our smartphones, connected devices, and other systems. Considering the immense potential of state-of-the-art machine learning algorithms, researchers have been exploring their capability to solve multi-dimensional and multi-disciplinary real-life problems with great positive outcomes.
**You can use machine learning in many ways**:
- To predict the likelihood of disease from a patient's medical history or reports.
- To leverage weather data to predict weather events.
- To understand the sentiment of a text.
- To detect fake news to stop the spread of propaganda.
Finance, economics, earth science, space exploration, biomedical engineering, cognitive science, and even fields in the humanities have adapted machine learning to solve the arduous, data-processing-heavy problems of their domains.
Machine learning automates the process of pattern-discovery by finding meaningful insights from real-world or generated data. It has proven itself to be highly valuable in business, health, and financial applications, among others.
In the near future, understanding the basics of machine learning is going to be a must for people from any domain due to its widespread adoption.
---
## 🚀 Challenge
Sketch, on paper or using an online app like [Excalidraw](https://excalidraw.com/), your understanding of the differences between AI, ML, deep learning, and data science. Add some ideas of problems that each of these techniques are good at solving.
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/)

## Instructions
In this non-graded assignment, you should brush up on Python and get your environment up and running and able to run notebooks.
Take this [Python Learning Path](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa), and then get your system set up by going through these introductory videos:

# History of machine learning
![Summary of the history of machine learning in a sketchnote](../../sketchnotes/ml-history.png)
> Sketchnote by [Tomomi Imura](https://www.twitter.com/girlie_mac)
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/3/)
In this lesson, we will walk through the major milestones in the history of machine learning and artificial intelligence.
The history of artificial intelligence, AI, as a field is intertwined with the history of machine learning, as the algorithms and computational advances that underpin ML fed into the development of AI. It is useful to remember that, while these fields as distinct areas of inquiry began to crystallize in the 1950s, important [algorithmic, statistical, mathematical, computational and technical discoveries](https://wikipedia.org/wiki/Timeline_of_machine_learning) predated and overlapped this era. In fact, people have been thinking about these questions for [hundreds of years](https://wikipedia.org/wiki/History_of_artificial_intelligence): this article discusses the historical intellectual underpinnings of the idea of a 'thinking machine.'
## Notable discoveries
- 1763, 1812 [Bayes' Theorem](https://wikipedia.org/wiki/Bayes%27_theorem) and its predecessors. This theorem and its applications underlie inference, describing the probability of an event occurring based on prior knowledge.
- 1805 [Least Squares Theory](https://wikipedia.org/wiki/Least_squares) by French mathematician Adrien-Marie Legendre. This theory, which you will learn about in our Regression unit, helps in data fitting.
- 1982 [Recurrent Neural Networks](https://wikipedia.org/wiki/Recurrent_neural_network) are artificial neural networks derived from feedforward neural networks that create temporal graphs.
✅ Do a little research. What other dates stand out as pivotal in the history of ML and AI?
## 1950: Machines that think
Alan Turing, a truly remarkable person who was voted [by the public in 2019](https://wikipedia.org/wiki/Icons:_The_Greatest_Person_of_the_20th_Century) as the greatest person of the 20th century, is credited as helping to lay the foundation for the concept of a 'machine that can think.' He grappled with naysayers and his own need for empirical evidence of this concept in part by creating the [Turing Test](https://www.bbc.com/news/technology-18475646), which you will explore in our NLP lessons.
## 1956: Dartmouth Summer Research Project
"The Dartmouth Summer Research Project on Artificial Intelligence was a seminal event for artificial intelligence as a field," and it was here that the term 'Artificial Intelligence' was coined ([source](https://250.dartmouth.edu/highlights/artificial-intelligence-ai-coined-dartmouth)).
"The Dartmouth Summer Research Project on artificial intelligence was a seminal event for artificial intelligence as a field," and it was here that the term 'artificial intelligence' was coined ([source](https://250.dartmouth.edu/highlights/artificial-intelligence-ai-coined-dartmouth)).
> Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.
The workshop is credited with having initiated and encouraged several discussions including "the rise of symbolic methods, systems focussed on limited domains (early expert systems), and deductive systems versus inductive systems." ([source](https://wikipedia.org/wiki/Dartmouth_workshop)).
## 1956 - 1974: "The Golden Years"
## 1956 - 1974: "The golden years"
From the 1950s through the mid '70s, optimism ran high in the hope that AI could solve many problems. In 1967, Marvin Minsky stated confidently that "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved." (Minsky, Marvin (1967), Computation: Finite and Infinite Machines, Englewood Cliffs, N.J.: Prentice-Hall)
Natural language processing research flourished, search was refined and made more powerful, and the concept of 'micro-worlds' was created, where simple tasks were completed using plain language instructions.
Research was well funded by government agencies, advances were made in computation and algorithms, and prototypes of intelligent machines were built. Some of these machines include:
* "Blocks world" was an example of a micro-world where blocks could be stacked and sorted, and experiments in teaching machines to make decisions could be tested. Advances built with libraries such as [SHRDLU](https://wikipedia.org/wiki/SHRDLU) helped propel language processing forward.
[![blocks world with SHRDLU](https://img.youtube.com/vi/QAJz4YKUwqw/0.jpg)](https://www.youtube.com/watch?v=QAJz4YKUwqw "blocks world with SHRDLU")
> 🎥 Click the image above for a video: Blocks world with SHRDLU
## 1974 - 1980: "AI Winter"
## Now
Today, machine learning and AI touch almost every part of our lives. This era calls for careful understanding of the risks and potential effects of these algorithms on human lives. As Microsoft's Brad Smith has stated, "Information technology raises issues that go to the heart of fundamental human-rights protections like privacy and freedom of expression. These issues heighten responsibility for tech companies that create these products. In our view, they also call for thoughtful government regulation and for the development of norms around acceptable uses" ([source](https://www.technologyreview.com/2019/12/18/102365/the-future-of-ais-impact-on-society/)).
It remains to be seen what the future holds, but it is important to understand these computer systems and the software and algorithms that they run. We hope that this curriculum will help you to gain a better understanding so that you can decide for yourself.
[![The history of deep learning](https://img.youtube.com/vi/mTtDfKgLm54/0.jpg)](https://www.youtube.com/watch?v=mTtDfKgLm54 "The history of deep learning")
> 🎥 Click the image above for a video: Yann LeCun discusses the history of deep learning in this lecture
---
## 🚀Challenge
Dig into one of these historical moments and learn more about the people behind them. There are fascinating characters, and no scientific discovery was ever created in a cultural vacuum. What do you discover?

The tool helps you to assess how a model's predictions affect different groups.
- Learn [how to enable fairness assessments](https://docs.microsoft.com/azure/machine-learning/how-to-machine-learning-fairness-aml?WT.mc_id=academic-15963-cxa) of machine learning models in Azure Machine Learning.
- Check out these [sample notebooks](https://github.com/Azure/MachineLearningNotebooks/tree/master/contrib/fairness) for more fairness assessment scenarios in Azure Machine Learning.
---
## 🚀 Challenge
To prevent biases from being introduced in the first place, we should:
Also, read:
- Microsoft's RAI resource center: [Responsible AI Resources - Microsoft AI](https://www.microsoft.com/ai/responsible-ai-resources?activetab=pivot1%3aprimaryr4)
- Microsoft's FATE research group: [FATE: Fairness, Accountability, Transparency, and Ethics in AI - Microsoft Research](https://www.microsoft.com/research/theme/fate/)
Explore the Fairlearn toolkit

# Techniques of machine learning
The process of building, using, and maintaining machine learning models and the data they use is a very different process from many other development workflows. In this lesson, we will demystify the process, and outline the main techniques you need to know. You will:
- Understand the processes underpinning machine learning at a high level.
- Explore base concepts such as 'models', 'predictions', and 'training data'.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/7/)
## Introduction
At a high level, the craft of creating machine learning (ML) processes consists of a number of steps, sketched in code after this list:
1. **Decide on the question**. Most ML processes start by asking a question that cannot be answered by a simple conditional program or rules-based engine. These questions often revolve around predictions based on a collection of data.
2. **Collect and prepare data**. To be able to answer your question, you need data. The quality and, sometimes, quantity of your data will determine how well you can answer your initial question. Visualizing data is an important aspect of this phase. This phase also includes splitting the data into a training and testing group to build a model.
3. **Choose a training method**. Depending on your question and the nature of your data, you need to choose how you want to train a model to best reflect your data and make accurate predictions against it. This is the part of your ML process that requires specific expertise and, often, a considerable amount of experimentation.
4. **Train the model**. Using your training data, you'll use various algorithms to train a model to recognize patterns in the data. The model might leverage internal weights that can be adjusted to privilege certain parts of the data over others to build a better model.
5. **Evaluate the model**. You use never-before-seen data (your testing data) from your collected set to see how the model is performing.
6. **Parameter tuning**. Based on the performance of your model, you can redo the process using different parameters, or variables, that control the behavior of the algorithms used to train the model.
7. **Predict**. Use new inputs to test the accuracy of your model.
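To make these steps concrete, here is a minimal sketch of steps 2 through 7 in Scikit-learn. The built-in diabetes dataset and the ridge regressor are illustrative choices, not requirements; any dataset and estimator could stand in for them.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# collect and prepare data (step 2): a prepackaged dataset
X, y = load_diabetes(return_X_y=True)

# split the data into a training and a testing group (also step 2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# choose a training method and train the model (steps 3 and 4)
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)

# evaluate the model on never-before-seen data (step 5)
print("R^2 on test data:", model.score(X_test, y_test))

# tune a parameter and retrain (step 6)
model = Ridge(alpha=0.5).fit(X_train, y_train)

# predict on new inputs (step 7)
print(model.predict(X_test[:3]))
```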
## What question to ask
Computers are particularly skilled at discovering hidden patterns in data. This utility is very helpful for researchers who have questions about a given domain that cannot be easily answered by creating a conditionally-based rules engine. Given an actuarial task, for example, a data scientist might be able to construct handcrafted rules around the mortality of smokers vs non-smokers.
When many other variables are brought into the equation, however, an ML model might prove more efficient at predicting future mortality rates based on past health history. A more cheerful example might be making weather predictions for the month of April in a given location based on data that includes latitude, longitude, climate change, proximity to the ocean, patterns of the jet stream, and more.
✅ This [slide deck](https://www2.cisl.ucar.edu/sites/default/files/0900%20June%2024%20Haupt_0.pdf) on weather models offers a historical perspective for using ML in weather analysis.
## Pre-building tasks
Before starting to build your model, there are several tasks you need to complete. To test your question and form a hypothesis based on a model's predictions, you need to identify and configure several elements.
### Data
To be able to answer your question with any kind of certainty, you need a good amount of data of the right type. There are two things you need to do at this point:
- **Collect data**. Keeping in mind the previous lesson on fairness in data analysis, collect your data with care. Be aware of the sources of this data, any inherent biases it might have, and document its origin.
- **Prepare data**. There are several steps in the data preparation process. You might need to collate data and normalize it if it comes from diverse sources. You can improve the data's quality and quantity through various methods such as converting strings to numbers (as we do in [Clustering](../../5-Clustering/1-Visualize/README.md)). You might also generate new data, based on the original (as we do in [Classification](../../4-Classification/1-Introduction/README.md)). You can clean and edit the data (as we did prior to the [Web App](../3-Web-App/README.md) lesson). Finally, you might also need to randomize it and shuffle it, depending on your training techniques. A few of these steps are sketched in code below.
✅ After collecting and processing your data, take a moment to see if its shape will allow you to address your intended question. It may be that the data will not perform well in your given task, as we discover in our [Clustering](../../5-Clustering/1-Visualize/README.md) lessons!
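As a sketch of what preparation can look like in practice, here are three of the steps above - encoding strings as numbers, normalizing, and shuffling - applied to a tiny table whose column names and values are invented for illustration:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.utils import shuffle

# a tiny, invented dataset
df = pd.DataFrame({
    "color": ["white", "orange", "orange", "white"],
    "size": [4.2, 7.1, 5.5, 6.3],
})

# convert strings to numbers by encoding the categories
df["color"] = df["color"].astype("category").cat.codes

# normalize a numeric column to the 0-1 range
df[["size"]] = MinMaxScaler().fit_transform(df[["size"]])

# shuffle the rows
df = shuffle(df, random_state=0)
print(df)
```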
### Selecting your features and target
A [feature](https://www.datasciencecentral.com/profiles/blogs/an-introduction-to-variable-and-feature-selection) is a measurable property of your data. In many datasets it is expressed as a column heading like 'date', 'size' or 'color'. Your features, usually represented as `X` in code, are the model's inputs, while your target variable, usually represented as `y`, represents the answer to the question you are trying to ask of your data: in December, what **color** pumpkins will be cheapest? In San Francisco, what neighborhoods will have the best real estate **price**?
🎓 **Feature Selection and Feature Extraction** How do you know which variable to choose when building a model? You'll probably go through a process of feature selection or feature extraction to choose the right variables for the most performant model. They're not the same thing, however: "Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features." [source](https://wikipedia.org/wiki/Feature_selection)
🎓 **Feature Selection and Feature Extraction** How do you know which variable to choose when building a model? You'll probably go through a process of feature selection or feature extraction to choose the right variables for the most performant model. They're not the same thing, however: "Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features." ([source](https://wikipedia.org/wiki/Feature_selection))
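A small sketch can make the distinction visible; the diabetes dataset and the choice of three columns/components here are arbitrary examples:

```python
from sklearn.datasets import load_diabetes
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_regression

X, y = load_diabetes(return_X_y=True)

# feature selection: keep a subset of the original columns
X_selected = SelectKBest(f_regression, k=3).fit_transform(X, y)

# feature extraction: derive new columns from functions of the originals
X_extracted = PCA(n_components=3).fit_transform(X)

# both have three columns, but they were built very differently
print(X_selected.shape, X_extracted.shape)
```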
### Visualize your data
An important aspect of the data scientist's toolkit is the power to visualize data using several excellent libraries such as Seaborn or Matplotlib. Representing your data visually might allow you to uncover hidden correlations that you can leverage. Your visualizations might also help you to uncover bias or unbalanced data (as we discover in [Classification](../../4-Classification/2-Classifiers-1/README.md)).
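For instance, a single Matplotlib scatterplot can hint at whether a feature relates to your target at all (the dataset here is just an example; Seaborn offers similar plots with added statistical styling):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)

# plot one feature against the target to look for structure
plt.scatter(X[:, 2], y)
plt.xlabel('bmi (normalized)')
plt.ylabel('disease progression')
plt.show()
```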
### Split your dataset
Prior to training, you need to split your dataset into two or more parts of unequal size that still represent the data well (one way to do this is sketched in code after the list below).
- **Training**. This part of the dataset goes into your model to train it; it constitutes the majority of the original dataset.
- **Testing**. A test dataset is another independent group of data, often gathered from the original data, that you use to confirm the performance of the built model.
- **Validating**. A validation set is a smaller independent group of examples that you use to tune the model's hyperparameters, or architecture, to improve the model. Depending on your data's size and the question you are asking, you might not need to build this third set (as we noted in [Time Series Forecasting](../7-TimeSeries/1-Introduction/README.md)).
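Here is one way to produce all three sets with `train_test_split`, applied twice; the 60/20/20 ratios and the toy arrays are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(50, 2), np.arange(50)

# hold out 40% of the data, then split that portion into test and validation
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.5)

print(len(X_train), len(X_test), len(X_val))  # 30 10 10
```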
## Building a model
Using your training data, your goal is to build a model, or a statistical representation of your data, by applying various algorithms to **train** it. Training a model exposes it to data and allows it to make assumptions about perceived patterns it discovers, validates, and accepts or rejects.
### Decide on a training method
Depending on your question and the nature of your data, you will choose a method to train a model. Stepping through [Scikit-learn's documentation](https://scikit-learn.org/stable/user_guide.html) - which we use in this course - you can explore many ways to train a model. Depending on your experience, you might have to try several different methods to build the best model. You are likely to go through a process whereby data scientists evaluate the performance of a model by feeding it unseen data, checking for accuracy, bias, and other quality-degrading issues, and selecting the most appropriate training method for the task at hand.
### Train a model
Armed with your training data, you are ready to 'fit' it to create a model. You will notice that in many ML libraries you will find the code `model.fit` - it is at this time that you send in your data as an array of values (usually `X`) and a target variable (usually `y`).
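In Scikit-learn, that call looks like this; the four-point dataset is invented so that the learned line is easy to check by eye:

```python
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]  # an array of values, one feature per sample
y = [2, 4, 6, 8]          # the target values

model = LinearRegression()
model.fit(X, y)

print(model.coef_, model.intercept_)  # the learned line: slope 2, intercept ~0
```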
### Evaluate the model
Once the training process is complete (it can take many iterations, or 'epochs', to train a large model), you will be able to evaluate the model's quality by using test data to gauge its performance. This data is a subset of the original data that the model has not previously analyzed. You can print out a table of metrics about your model's quality.
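One way to print such a table for a regression model might look like the following; the two metrics shown are common choices, not the only ones:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# a small table of quality metrics, computed on unseen test data
print(f"MSE: {mean_squared_error(y_test, y_pred):.1f}")
print(f"R^2: {r2_score(y_test, y_pred):.2f}")
```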
🎓 **Model fitting**
In the context of machine learning, model fitting refers to the accuracy of the model's underlying function as it attempts to analyze data with which it is not familiar.
![overfitting model](images/overfitting.png)
> Infographic by [Jen Looper](https://twitter.com/jenlooper)
🎓 **Underfitting** and **overfitting** are common problems that degrade the quality of the model as the model fits either not well enough or too well. This causes the model to make predictions either too closely aligned or too loosely aligned with its training data. An overfit model predicts training data too well because it has learned the data's details and noise too well. An underfit model is not accurate as it can neither accurately analyze its training data nor data it has not yet 'seen'.
## Parameter tuning
Once your initial training is complete, observe the quality of the model and consider improving it by tweaking its 'hyperparameters'. Read more about the process [in the documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters?WT.mc_id=academic-15963-cxa).
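As a sketch, Scikit-learn's `GridSearchCV` automates this kind of search; the `alpha` values tried here are arbitrary:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

# try several values of the 'alpha' hyperparameter with 5-fold cross-validation
search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```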
## Prediction
This is the moment where you can use completely new data to test your model's accuracy. In an 'applied' ML setting, where you are building web assets to use the model in production, this process might involve gathering user input (a button press, for example) to set a variable and send it to the model for inference, or evaluation.
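Stripped of the web plumbing, that final step is a single `predict()` call on a value you did not train on; the tiny model and the input value here are invented for illustration:

```python
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit([[1], [2], [3]], [10, 20, 30])

# a value gathered from user input, e.g. a web form field
new_input = [[4]]
print(model.predict(new_input))  # -> approximately [40.]
```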
In these lessons, you will discover how to use these steps to prepare, build, test, evaluate, and predict - all the gestures of a data scientist and more, as you progress in your journey to become a 'full stack' ML engineer.
---
## 🚀Challenge
Draw a flow chart reflecting the steps of an ML practitioner. Where do you see yourself right now in the process? Where do you predict you will find difficulty? What seems easy to you?
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/8/)
## Review & Self Study
Search online for interviews with data scientists who discuss their daily work. Here is [one](https://www.youtube.com/watch?v=Z3IjgbbCEfs).
## Assignment
[Interview a data scientist](assignment.md)

# Interview a data scientist
## Instructions
In your company, in a user group, or among your friends or fellow students, talk to someone who works professionally as a data scientist. Write a short paper (500 words) about their daily work. Are they specialists, or do they work 'full stack'?
## Rubric
| Criteria | Exemplary | Adequate | Needs Improvement |
| -------- | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------ | --------------------- |
| | An essay of the correct length, with attributed sources, is presented as a .doc file | The essay is poorly attributed or shorter than the required length | No essay is presented |


# Introduction to machine learning
In this section of the curriculum, you will be introduced to the base concepts underlying the field of machine learning, what it is, and learn about its history and the techniques researchers use to work with it. Let's explore this new world of ML together!
![globe](images/globe.jpg)
> Photo by <a href="https://unsplash.com/@bill_oxford?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Bill Oxford</a> on <a href="https://unsplash.com/s/photos/globe?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
### Lessons
1. [Introduction to machine learning](1-intro-to-ML/README.md)
1. [The history of machine learning and AI](2-history-of-ML/README.md)
1. [Fairness and machine learning](3-fairness/README.md)
1. [Techniques of machine learning](4-techniques-of-ML/README.md)
### Credits
"Introduction to Machine Learning" was written with ♥️ by a team of folks including [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan), [Ornella Altunyan](https://twitter.com/ornelladotcom) and [Jen Looper](https://twitter.com/jenlooper)


# Get started with Python and Scikit-learn for regression models
![Summary of regressions in a sketchnote](../../sketchnotes/ml-regression.png)
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/9/)
## Introduction
In these four lessons, you will discover how to build regression models. We will discuss what these are for shortly. But before you do anything, make sure you have the right tools in place!
In this lesson, you will learn how to:
- Configure your computer for local machine learning tasks.
- Work with Jupyter notebooks.
- Use Scikit-learn, including installation.
- Explore linear regression with a hands-on exercise.
## Installations and configurations
[![Using Python with Visual Studio Code](https://img.youtube.com/vi/7EXd4_ttIuw/0.jpg)](https://youtu.be/7EXd4_ttIuw "Using Python with Visual Studio Code")
> Get comfortable with Python by working through this collection of [Learn modules](https://docs.microsoft.com/users/jenlooper-2911/collections/mp1pagggd5qrq7?WT.mc_id=academic-15963-cxa)
3. **Install Scikit-learn**, by following [these instructions](https://scikit-learn.org/stable/install.html). Since you need to ensure that you use Python 3, it's recommended that you use a virtual environment. Note, if you are installing this library on an M1 Mac, there are special instructions on the page linked above.
4. **Install Jupyter Notebook**. You will need to [install the Jupyter package](https://pypi.org/project/jupyter/).
## Your ML authoring environment
You are going to use **notebooks** to develop your Python code and create machine learning models. This type of file is a common tool for data scientists, and they can be identified by their suffix or extension `.ipynb`.
Notebooks are an interactive environment that allows the developer to both write code and add notes and documentation around the code, which is quite helpful for experimental or research-oriented projects.
### Exercise - work with a notebook
In this folder, you will find the file _notebook.ipynb_.
1. Open _notebook.ipynb_ in Visual Studio Code.
You can interleave your code with comments to self-document the notebook.
✅ Think for a minute how different a web developer's working environment is versus that of a data scientist.
## Up and running with Scikit-learn
Now that Python is set up in your local environment, and you are comfortable with Jupyter notebooks, let's get equally comfortable with Scikit-learn (pronounce it `sci` as in `science`). Scikit-learn provides an [extensive API](https://scikit-learn.org/stable/modules/classes.html#api-ref) to help you perform ML tasks.
According to their [website](https://scikit-learn.org/stable/getting_started.html), "Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities."
In this course, you will use Scikit-learn and other tools to build machine learning models to perform what we call 'traditional machine learning' tasks. We have deliberately avoided neural networks and deep learning, as they are better covered in our forthcoming 'AI for Beginners' curriculum.
Scikit-learn makes it straightforward to build models and evaluate them for use. It is primarily focused on using numeric data and contains several ready-made datasets for use as learning tools. It also includes pre-built models for students to try. Let's explore the process of loading prepackaged data and using a built-in estimator to create a first ML model with Scikit-learn, using some basic data.
## Exercise - your first Scikit-learn notebook
> This tutorial was inspired by the [linear regression example](https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html#sphx-glr-auto-examples-linear-model-plot-ols-py) on Scikit-learn's web site.
In the _notebook.ipynb_ file associated with this lesson, clear out all the cells by pressing the 'trash can' icon.
In this section, you will work with a small dataset about diabetes that is built into Scikit-learn for learning purposes. Imagine that you wanted to test a treatment for diabetic patients. Machine learning models might help you determine which patients would respond better to the treatment, based on combinations of variables. Even a very basic regression model, when visualized, might show information about variables that would help you organize your theoretical clinical trials.
✅ There are many types of regression methods, and which one you pick depends on the answer you're looking for. If you want to predict the probable height for a person of a given age, you'd use linear regression, as you're seeking a **numeric value**. If you're interested in discovering whether a type of cuisine should be considered vegan or not, you're looking for a **category assignment** so you would use logistic regression. You'll learn more about logistic regression later. Think a bit about some questions you can ask of data, and which of these methods would be more appropriate.
Let's get started on this task.
For this task we will import some libraries:
- **matplotlib**. It's a useful [graphing tool](https://matplotlib.org/) and we will use it to create a line plot.
- **numpy**. [numpy](https://numpy.org/doc/stable/user/whatisnumpy.html) is a useful library for handling numeric data in Python.
- **sklearn**. This is the Scikit-learn library.
Import some libraries to help with your tasks.
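The import cell is assumed to look something like this, based on the description that follows:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model, model_selection
```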
Above, you are importing `matplotlib` and `numpy`, and you are importing `datasets`, `linear_model` and `model_selection` from `sklearn`. `model_selection` is used for splitting data into training and test sets.
### The diabetes dataset
The built-in [diabetes dataset](https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset) includes 442 samples of data around diabetes, with 10 feature variables, some of which include:
- age: age in years
- bmi: body mass index
In a new code cell, load the diabetes dataset by calling `load_diabetes()`.
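One way to make that call, assuming the `datasets` import above (the `return_X_y=True` flag returns the data and the target as two separate arrays):

```python
from sklearn import datasets

X, y = datasets.load_diabetes(return_X_y=True)
print(X.shape)  # (442, 10): 442 samples, 10 feature variables
```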
✅ Think a bit about the relationship between the data and the regression target. Linear regression predicts relationships between feature X and target variable y. Can you find the [target](https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset) for the diabetes dataset in the documentation? What is this dataset demonstrating, given that target?
2. Next, select a portion of this dataset to plot by arranging it into a new array using numpy's `newaxis` function. We are going to use linear regression to generate a line between values in this data, according to a pattern it determines.
```python
X = X[:, np.newaxis, 2]
```
✅ At any time, print out the data to check its shape.
3. Now that you have data ready to be plotted, you can see if a machine can help determine a logical split between the numbers in this dataset. To do this, you need to split both the data (X) and the target (y) into test and training sets. Scikit-learn has a straightforward way to do this; you can split your test data at a given point.
```python
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.33)
```
4. Now you are ready to train your model! Load up the linear regression model and train it with your X and y training sets using `model.fit()`:
```python
model = linear_model.LinearRegression()
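# Train the model on your X and y training sets, as this step describes
model.fit(X_train, y_train)
```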
@ -173,13 +169,13 @@ In a new code cell, load the diabetes dataset by calling `load_diabetes()`. The
`model.fit()` is a function you'll see in many ML libraries such as TensorFlow
5. Then, create a prediction using test data, using the function `predict()`. This will be used to draw the line between data groups
```python
y_pred = model.predict(X_test)
```
6. Now it's time to show the data in a plot. Matplotlib is a very useful tool for this task. Create a scatterplot of all the X and y test data, and use the prediction to draw a line in the most appropriate place, between the model's data groupings.
```python
plt.scatter(X_test, y_test, color='black')
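# Sketch of the finishing plot commands (the lesson's exact lines may differ)
plt.plot(X_test, y_pred, color='blue', linewidth=3)
plt.show()
```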
@ -191,7 +187,7 @@ In a new code cell, load the diabetes dataset by calling `load_diabetes()`. The
✅ Think a bit about what's going on here. A straight line is running through many small dots of data, but what is it doing exactly? Can you see how you should be able to use this line to predict where a new, unseen data point should fit in relationship to the plot's y axis? Try to put into words the practical use of this model.
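For instance, a sketch of that practical use, assuming the fitted `model` above (`0.07` is just an illustrative value on the chosen feature's scale):

```python
new_x = np.array([[0.07]])
print(model.predict(new_x))  # the line's y estimate for this unseen x
```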
Congratulations, you built your first linear regression model, created a prediction with it, and displayed it in a plot!
---
## 🚀Challenge
@ -203,7 +199,7 @@ Plot a different variable from this dataset. Hint: edit this line: `X = X[:, np.
In this tutorial, you worked with simple linear regression, rather than univariate or multiple linear regression. Read a little about the differences between these methods, or take a look at [this video](https://www.coursera.org/lecture/quantifying-relationships-regression-models/linear-vs-nonlinear-categorical-variables-ai2Ef)
Read more about the concept of regression and think about what kinds of questions can be answered by this technique. Take this [tutorial](https://docs.microsoft.com/learn/modules/train-evaluate-regression-models?WT.mc_id=academic-15963-cxa) to deepen your understanding.
## Assignment
@ -1,13 +1,13 @@
# Regression with Scikit-learn
## Instructions
Take a look at the [Linnerud dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_linnerud.html#sklearn.datasets.load_linnerud) in Scikit-learn. This dataset has multiple [targets](https://scikit-learn.org/stable/datasets/toy_dataset.html#linnerrud-dataset): 'It consists of three exercise (data) and three physiological (target) variables collected from twenty middle-aged men in a fitness club'.
In your own words, describe how to create a regression model that would plot the relationship between the waistline and how many situps are accomplished. Do the same for the other datapoints in this dataset.
## Rubric
| Criteria | Exemplary | Adequate | Needs Improvement |
| ------------------------------ | ----------------------------------- | ----------------------------- | -------------------------- |
| Submit a descriptive paragraph | Well-written paragraph is submitted | A few sentences are submitted | No description is supplied |
@ -1,96 +1,130 @@
# Build a regression model using Scikit-learn: prepare and visualize data
> ![Data visualization infographic](./images/data-visualization.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/11/)
## Introduction
Now that you are set up with the tools you need to start tackling machine learning model building with Scikit-learn, you are ready to start asking questions of your data. As you work with data and apply ML solutions, it's very important to understand how to ask the right question to properly unlock the potential of your dataset.
In this lesson, you will learn:
- How to prepare your data for model-building.
- How to use Matplotlib for data visualization.
## Asking the right question of your data
The question you need answered will determine what type of ML algorithms you will leverage. And the quality of the answer you get back will be heavily dependent on the nature of your data.
Take a look at the [data](../data/US-pumpkins.csv) provided for this lesson. You can open this .csv file in VS Code. A quick skim immediately shows that there are blanks and a mix of strings and numeric data. There's also a strange column called 'Package' where the data is a mix between 'sacks', 'bins' and other values. The data, in fact, is a bit of a mess.
In fact, it is not very common to be gifted a dataset that is completely ready to use to create an ML model out of the box. In this lesson, you will learn how to prepare a raw dataset using standard Python libraries. You will also learn various techniques to visualize the data.
## Case study: 'the pumpkin market'
In this folder you will find a .csv file in the root `data` folder called [US-pumpkins.csv](../data/US-pumpkins.csv) which includes 1757 lines of data about the market for pumpkins, sorted into groupings by city. This is raw data extracted from the [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) distributed by the United States Department of Agriculture.
### Preparing data
This data is in the public domain. It can be downloaded in many separate files, per city, from the USDA web site. To avoid too many separate files, we have concatenated all the city data into one spreadsheet, thus we have already _prepared_ the data a bit. Next, let's take a closer look at the data.
### The pumpkin data - early conclusions
What do you notice about this data? You already saw that there is a mix of strings, numbers, blanks and strange values that you need to make sense of.
What question can you ask of this data, using a regression technique? What about "Predict the price of a pumpkin for sale during a given month"? Looking again at the data, there are some changes you need to make to create the data structure necessary for the task.
## Exercise - analyze the pumpkin data
Let's use [Pandas](https://pandas.pydata.org/) (the name stands for `Python Data Analysis`), a tool very useful for shaping data, to analyze and prepare this pumpkin data.
### First, check for missing dates
You will first need to take steps to check for missing dates:
1. Convert the dates to a month format (these are US dates, so the format is `MM/DD/YYYY`).
2. Extract the month to a new column.
Open the _notebook.ipynb_ file in Visual Studio Code and import the spreadsheet into a new Pandas dataframe.
1. Use the `head()` function to view the first five rows.
```python
import pandas as pd
pumpkins = pd.read_csv('../../data/US-pumpkins.csv')
pumpkins.head()
```
✅ What function would you use to view the last five rows?
1. Check if there is missing data in the current dataframe:
```python
pumpkins.isnull().sum()
```
There is missing data, but maybe it won't matter for the task at hand.
1. To make your dataframe easier to work with, drop several of its columns, using `drop()`, keeping only the columns you need:
```python
new_columns = ['Package', 'Month', 'Low Price', 'High Price', 'Date']
pumpkins = pumpkins.drop([c for c in pumpkins.columns if c not in new_columns], axis=1)
```
### Second, determine average price of pumpkin
Think about how to determine the average price of a pumpkin in a given month. What columns would you pick for this task? Hint: you'll need 3 columns.
Solution: take the average of the `Low Price` and `High Price` columns to populate the new Price column, and convert the Date column to only show the month. Fortunately, according to the check above, there is no missing data for dates or prices.
1. To calculate the average, add the following code:
```python
price = (pumpkins['Low Price'] + pumpkins['High Price']) / 2
month = pd.DatetimeIndex(pumpkins['Date']).month
```
✅ Feel free to print any data you'd like to check using `print(month)`.
2. Now, copy your converted data into a fresh Pandas dataframe:
```python
new_pumpkins = pd.DataFrame({'Month': month, 'Package': pumpkins['Package'], 'Low Price': pumpkins['Low Price'],'High Price': pumpkins['High Price'], 'Price': price})
```
Printing out your dataframe will show you a clean, tidy dataset on which you can build your new regression model.
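A quick sanity check might look like this sketch:

```python
print(new_pumpkins.shape)  # row and column counts of the tidy dataframe
new_pumpkins.head()        # first five rows
```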
### But wait! There's something odd here
If you look at the `Package` column, pumpkins are sold in many different configurations. Some are sold in '1 1/9 bushel' measures, and some in '1/2 bushel' measures, some per pumpkin, some per pound, and some in big boxes with varying widths.
> Pumpkins seem very hard to weigh consistently
Digging into the original data, it's interesting that anything with `Unit of Sale` equalling 'EACH' or 'PER BIN' also have the `Package` type per inch, per bin, or 'each'. Pumpkins seem to be very hard to weigh consistently, so let's filter them by selecting only pumpkins with the string 'bushel' in their `Package` column.
1. Add a filter at the top of the file, under the initial .csv import:
```python
pumpkins = pumpkins[pumpkins['Package'].str.contains('bushel', case=True, regex=True)]
```
If you print the data now, you can see that you are only getting the 415 or so rows of data containing pumpkins by the bushel.
### But wait! There's one more thing to do
Did you notice that the bushel amount varies per row? You need to normalize the pricing so that you show the pricing per bushel, so do some math to standardize it.
1. Add these lines after the block creating the new_pumpkins dataframe:
```python
new_pumpkins.loc[new_pumpkins['Package'].str.contains('1 1/9'), 'Price'] = price/(1 + 1/9)
new_pumpkins.loc[new_pumpkins['Package'].str.contains('1/2'), 'Price'] = price/(1/2)
```
✅ According to [The Spruce Eats](https://www.thespruceeats.com/how-much-is-a-bushel-1389308), a bushel's weight depends on the type of produce, as it's a volume measurement. "A bushel of tomatoes, for example, is supposed to weigh 56 pounds... Leaves and greens take up more space with less weight, so a bushel of spinach is only 20 pounds." It's all pretty complicated! Let's not bother with making a bushel-to-pound conversion, and instead price by the bushel. All this study of bushels of pumpkins, however, goes to show how very important it is to understand the nature of your data!
@ -100,56 +134,67 @@ Now, you can analyze the pricing per unit based on their bushel measurement. If
## Visualization strategies
Part of the data scientist's role is to demonstrate the quality and nature of the data they are working with. To do this, they often create interesting visualizations, or plots, graphs, and charts, showing different aspects of data. In this way, they are able to visually show relationships and gaps that are otherwise hard to uncover.
Visualizations can also help determine the machine learning technique most appropriate for the data. A scatterplot that seems to follow a line, for example, indicates that the data is a good candidate for a linear regression exercise.
One data visualization library that works well in Jupyter notebooks is [Matplotlib](https://matplotlib.org/) (which you also saw in the previous lesson).
> Get more experience with data visualization in [these tutorials](https://docs.microsoft.com/learn/modules/explore-analyze-data-with-python?WT.mc_id=academic-15963-cxa).
## Exercise - experiment with Matplotlib
Try to create some basic plots to display the new dataframe you just created. What would a basic line plot show?
1. Import Matplotlib at the top of the file, under the Pandas import:
```python
import matplotlib.pyplot as plt
```
1. Rerun the entire notebook to refresh.
1. At the bottom of the notebook, add a cell to plot the data as a box:
```python
price = new_pumpkins.Price
month = new_pumpkins.Month
plt.scatter(price, month)
plt.show()
```
![A scatterplot showing price to month relationship](./images/scatterplot.png)
Is this a useful plot? Does anything about it surprise you?
It's not particularly useful, as all it does is display your data as a spread of points in a given month.
### Make it useful
To get charts to display useful data, you usually need to group the data somehow. Let's try creating a plot where the y axis shows the months and the data demonstrates the distribution of data.
1. Add a cell to create a grouped bar chart:
```python
new_pumpkins.groupby(['Month'])['Price'].mean().plot(kind='bar')
plt.ylabel("Pumpkin Price")
```
![A bar chart showing price to month relationship](./images/barchart.png)
This is a more useful data visualization! It seems to indicate that the highest price for pumpkins occurs in September and October. Does that meet your expectation? Why or why not?
---
## 🚀Challenge
Explore the different types of visualization that Matplotlib offers. Which types are most appropriate for regression problems?
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/12/)
## Review & Self Study
Take a look at the many ways to visualize data. Make a list of the various libraries available and note which are best for given types of tasks, for example 2D visualizations vs. 3D visualizations. What do you discover?
## Assignment
[Exploring visualization](assignment.md)
@ -1,6 +1,6 @@
# Build a regression model using Scikit-learn: regression two ways
![Linear vs polynomial regression infographic](./images/linear-polynomial.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
@ -16,11 +16,11 @@ Now you are ready to dive deeper into regression for ML. In this lesson, you wil
## Prerequisite
You should be familiar by now with the structure of the pumpkin data that we are examining. You can find it preloaded and pre-cleaned in this lesson's _notebook.ipynb_ file. In the file, the pumpkin price is displayed per bushel in a new dataframe. Make sure you can run these notebooks in kernels in Visual Studio Code.
## Preparation
As a reminder, you are loading this data to ask questions of it:
- When is the best time to buy pumpkins?
- What price can I expect of a case of miniature pumpkins?
@ -28,13 +28,9 @@ As a reminder, you are loading this data so to ask questions of it, questions li
Let's keep digging into this data.
### Limitations of the previous lesson
In the previous lesson, you created a Pandas data frame and populated it with part of the original dataset, standardizing the pricing by the bushel. By doing that, however, you were _only_ able to gather about 400 data points and only for the fall months.
✅ Take a look at the data that we preloaded in this lesson's accompanying notebook _notebook.ipynb_. The data is preloaded and an initial scatter plot is charted to show month data. Maybe we can get a little more detail about the nature of the data by cleaning it more.
## A linear regression line
As you learned in Lesson 1, the goal of a linear regression exercise is to be able to plot a line to:
@ -43,7 +39,7 @@ As you learned in Lesson 1, the goal of a linear regression exercise is to be ab
### Understand the math
Let's focus on understanding the underlying math.
Since you'll use Scikit-learn, there's no reason to do this by hand (although you could!). In the main data-processing block of your lesson notebook, add a library from Scikit-learn to automatically convert all string data to numbers:
This line has an equation:
@ -53,6 +49,26 @@ Y = a + bX`.
It is typical of **Least-Squares Regression** to draw this type of line.
Todo infographic
If you look at the new_pumpkins dataframe now, you see that all the strings are now numeric. This makes it harder for you to read but much more intelligible for Scikit-learn!
```
     Package      Price
70         0  13.636364
71         0  16.363636
72         0  16.363636
73         0  15.454545
74         0  13.636364
...      ...        ...
1738       2  30.000000
1739       2  28.750000
1740       2  25.750000
1741       2  24.000000
1742       2  24.000000

415 rows × 2 columns
```
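The conversion step described above can be done with Scikit-learn's `LabelEncoder`; a minimal sketch, assuming the `new_pumpkins` dataframe from the previous lesson with the numeric `Price` column last:

```python
from sklearn.preprocessing import LabelEncoder

# Encode every string column (all but the final Price column) as numbers
new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit_transform)
```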
`X` is the 'explanatory variable'. `Y` is the 'dependent variable'.
The slope of the line is `b` and `a` is the y-intercept (where the line intersects with the Y-axis), which refers to the value of `Y` when `X = 0`.
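As a tiny worked example with invented numbers, numpy can estimate `a` and `b` for you:

```python
import numpy as np

X = np.array([1, 2, 3, 4])
Y = np.array([2, 4, 5, 8])
b, a = np.polyfit(X, Y, 1)  # a degree-1 fit returns [slope, intercept]
print(f"Y = {a:.2f} + {b:.2f}X")  # here: Y = 0.00 + 1.90X
```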
@ -78,8 +94,6 @@ We want to model a line that has the least cumulative distance from all of our d
One more term to understand is the **Correlation Coefficient** between given `X` and `Y` variables.
For a scatter plot, you can quickly visualize this coefficient.
- **High correlation**. A plot with data points scattered in a neat line has high correlation.
- **Low correlation**. A plot with data points scattered everywhere between X and Y has low correlation. (A quick way to compute the coefficient is sketched below.)
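With numpy, the coefficient itself is one call (same invented numbers as the sketch above):

```python
import numpy as np

X = np.array([1, 2, 3, 4])
Y = np.array([2, 4, 5, 8])
print(np.corrcoef(X, Y)[0, 1])  # ~0.98: these points lie in a nearly neat line
```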
@ -222,13 +236,13 @@ Before building your model, do one more tidy-up of your data.
Congratulations, you just created a model that can help predict the price of a few varieties of pumpkins. Your holiday pumpkin patch will be beautiful. But you can probably create a better model!
## Polynomial regression
Another type of linear regression is polynomial regression. While sometimes there's a linear relationship between variables - the bigger the pumpkin in volume, the higher the price - sometimes these relationships can't be plotted as a plane or straight line.
✅ Here are [some more examples](https://online.stat.psu.edu/stat501/lesson/9/9.8) of data that could use polynomial regression
Take another look at the relationship between `Variety` and `Price` in the previous plot. Does this scatterplot seem like it should necessarily be analyzed by a straight line? Perhaps not. In this case, you can try polynomial regression.
✅ Polynomials are mathematical expressions that might consist of one or more variables and coefficients
@ -262,7 +276,7 @@ A good way to visualize the correlations between data in dataframes is to displa
### Create a pipeline
Scikit-learn includes a helpful API for building polynomial regression models - the `make_pipeline` [API](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html?highlight=pipeline#sklearn.pipeline.make_pipeline). A 'pipeline' is created which is a chain of estimators. In this case, the pipeline includes polynomial features, or predictions that form a nonlinear path.
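A sketch of that idea (the polynomial degree of 4 is an assumption here, not necessarily the lesson's choice):

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Chain a feature expander and an estimator into one pipeline
pipeline = make_pipeline(PolynomialFeatures(4), LinearRegression())
```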
1. Build out the X and y columns:
@ -323,7 +337,7 @@ At this point, you need to create a new dataframe with _sorted_ data so that the
Model Accuracy: 0.8537946517073784
```
That's better! Try to predict a price:
### Do a prediction
@ -343,7 +357,8 @@ Let's see where we are, can we input a new value and get a prediction?
It does make sense, if you compare it to the polynomial plot! And, if this is a better model than the previous one, looking at the same data, you need to budget for these more expensive pumpkins!
🏆 Well done! You created two regression models in one lesson. In the final section on regression, you will learn about logistic regression to determine categories.
---
@ -2,7 +2,7 @@
## Instructions
In this lesson you were shown how to build a model using both linear and polynomial regression. Using this knowledge, find a dataset or use one of Scikit-learn's built-in sets to build a fresh model. Explain in your notebook why you chose the technique you did, and demonstrate your model's accuracy. If it is not accurate, explain why.
## Rubric
@ -1,6 +1,6 @@
# Logistic regression to predict categories
![Logistic vs. linear regression infographic](./images/logistic-linear.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/15/)
@ -10,48 +10,47 @@ In this final lesson on Regression, one of the basic 'classic' ML techniques, we
In this lesson, you will learn:
- A new library for data visualization
- Techniques for logistic regression
Deepen your understanding of working with this type of regression in this [Learn module](https://docs.microsoft.com/learn/modules/train-evaluate-classification-models?WT.mc_id=academic-15963-cxa)
## Prerequisite
Having worked with the pumpkin data, we are now familiar enough with it to realize that there's one binary category that we can work with: Color. Let's build a logistic regression model to predict, given some variables, what color a given pumpkin is likely to be (orange 🎃 or white 👻).
> Why are we talking about binary classification in a lesson grouping about regression? Only for linguistic convenience, as logistic regression is [really a classification method](https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression), albeit a linear-based one. Learn about other ways to classify data in the next lesson group.
For our purposes, we will express this as a binary: 'Orange' or 'Not Orange'. There is also a 'striped' category in our dataset but there are few instances of it, so we will not use it. It disappears once we remove null values from the dataset, anyway.
> 🎃 Fun fact, we sometimes call white pumpkins 'ghost' pumpkins. They aren't very easy to carve, so they aren't as popular as the orange ones but they are cool looking!
## About logistic regression
Logistic regression differs from linear regression, which you learned about previously, in a few important ways.
### Binary classification
Logistic regression does not offer the same features as linear regression. The former offers a prediction about a binary category ("orange or not orange") whereas the latter is capable of predicting continuous values; for example, given the origin of a pumpkin and the time of harvest, it can predict how much its price will rise.
![Pumpkin classification model](./images/pumpkin-classifier.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
### Other classifications
There are other types of logistic regression, including multinomial and ordinal. Multinomial involves having more than two categories - "Orange, White, and Striped". Ordinal involves ordered categories, useful if we wanted to order our outcomes logically, like our pumpkins that are ordered by a finite number of sizes (mini, sm, med, lg, xl, xxl).
![Multinomial vs ordinal regression](./images/multinomial-ordinal.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
### It's still linear
Even though this type of regression is all about category predictions, it still works best when there is a clear linear relationship between the dependent variable (color) and the other independent variables (the rest of the dataset, like city name and size). It's good to get an idea of whether there is any linearity dividing these variables or not.
### Variables DO NOT have to correlate
Remember how linear regression worked better with more correlated variables? Logistic regression is the opposite - the variables don't have to align. That works for this data which has somewhat weak correlations.
### You need a lot of clean data
Logistic regression will give more accurate results if you use more data; our small dataset is not optimal for this task, so keep that in mind.
✅ Think about the types of data that would lend themselves well to logistic regression
## Tidy the data
First, clean the data a bit, dropping null values and selecting only some of the columns:
@ -107,11 +106,11 @@ sns.catplot(x="Color", y="Item Size",
✅ Try creating this plot, and other Seaborn plots, using other variables.
Now that we have an idea of the relationship between the binary categories of color and the larger group of sizes, let's explore logistic regression to determine a given pumpkin's likely color.
> **🧮 Show Me The Math**
>
> Remember how linear regression often used ordinary least squares to arrive at a value? Logistic regression relies on the concept of 'maximum likelihood' using [sigmoid functions](https://wikipedia.org/wiki/Sigmoid_function). A 'sigmoid function' on a plot looks like an 'S' shape. It takes a value and maps it to somewhere between 0 and 1. Its curve is also called a 'logistic curve'. Its formula looks like this:
>
> ![logistic function](images/sigmoid.png)
>
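> A quick numeric check of that S-curve (a sketch, not the lesson's code):
>
> ```python
> import numpy as np
>
> def sigmoid(x):
>     return 1 / (1 + np.exp(-x))
>
> print(sigmoid(np.array([-4, 0, 4])))  # ~[0.018, 0.5, 0.982]
> ```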
@ -119,7 +118,7 @@ Now that we have an idea of the relationship between the binary categories of co
## Build your model
Building a model to find these binary classifications is surprisingly straightforward in Scikit-learn.
Select the variables you want to use in your classification model and split the training and test sets:
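A hedged sketch of those two steps (the column names are assumptions based on this lesson's pumpkin dataframe):

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

Selected_features = ['Origin', 'Item Size', 'Variety', 'City Name', 'Package']
X = new_pumpkins[Selected_features]
y = new_pumpkins['Color']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression()
model.fit(X_train, y_train)
```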
@ -221,7 +220,7 @@ Let's revisit the terms we saw earlier with the help of the confusion matrix's m
🎓 Weighted Avg: The calculation of the mean metrics for each label, taking label imbalance into account by weighting them by their support (the number of true instances for each label).
✅ Can you think which metric you should watch if you want your model to reduce the number of false negatives?
## Visualize the ROC curve of this model
This is not a bad model; its accuracy is in the 80% range so ideally you could use it to predict the color of a pumpkin given a set of variables.
@ -240,7 +239,7 @@ Using Seaborn again, plot the model's [Receiving Operating Characteristic](https
![ROC](./images/ROC.png)
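A sketch of that computation (using matplotlib here rather than Seaborn, and assuming the fitted `model` and encoded test split from this lesson):

```python
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

y_scores = model.predict_proba(X_test)
fpr, tpr, thresholds = roc_curve(y_test, y_scores[:, 1])
plt.plot(fpr, tpr)
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.show()
```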
Finally, use Scikit-learn's [`roc_auc_score` API](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html?highlight=roc_auc#sklearn.metrics.roc_auc_score) to compute the actual 'Area Under the Curve' (AUC):
```python
auc = roc_auc_score(y_test,y_scores[:,1])
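print(auc)  # display the computed area under the curve
```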
@ -253,13 +252,13 @@ In future lessons on classifications, you will learn how to iterate to improve y
---
## 🚀Challenge
There's a lot more to unpack regarding logistic regression! But the best way to learn is to experiment. Find a dataset that lends itself to this type of analysis and build a model with it. What do you learn? Tip: try [Kaggle](https://kaggle.com) for interesting datasets.
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/16/)
## Review & Self Study
Read the first few pages of [this paper from Stanford](https://web.stanford.edu/~jurafsky/slp3/5.pdf) on some practical uses for logistic regression. Think about which tasks are better suited to one or the other of the regression types we have studied up to this point. What would work best?
## Assignment
[Retrying this regression](assignment.md)
@ -1,4 +1,4 @@
# Regression models for machine learning
## Regional topic: Regression models for pumpkin prices in North America 🎃
In North America, pumpkins are often carved into scary faces for Halloween. Let's discover more about these fascinating vegetables!
@ -8,25 +8,25 @@ In North America, pumpkins are often carved into scary faces for Halloween. Let'
## What you will learn
The lessons in this section cover types of regression in the context of machine learning. Regression models can help determine the _relationship_ between variables. This type of model can predict values such as length, temperature, or age, thus uncovering relationships between variables as it analyzes data points.
In this series of lessons, you'll discover the difference between linear and logistic regression, and when you should use one or the other.
In this group of lessons, you will get set up to begin machine learning tasks, including configuring Visual Studio Code to manage notebooks, the common environment for data scientists. You will discover Scikit-learn, a library for machine learning, and you will build your first models, focusing on regression models in this chapter.
> There are useful low-code tools that can help you learn about working with regression models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)
### Lessons
1. [Tools of the trade](1-Tools/README.md)
2. [Managing data](2-Data/README.md)
3. [Linear and polynomial regression](3-Linear/README.md)
4. [Logistic regression](4-Logistic/README.md)
---
### Credits
"ML with Regression" was written with ♥️ by [Jen Looper](https://twitter.com/jenlooper)
"ML with regression" was written with ♥️ by [Jen Looper](https://twitter.com/jenlooper)
♥️ Quiz contributors include: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) and [Ornella Altunyan](https://twitter.com/ornelladotcom)
@ -1,17 +1,17 @@
# Build a web app to use an ML model
In this lesson, you will train an ML model on a dataset that's out of this world: UFO sightings over the past century, sourced from [NUFORC's database](https://www.nuforc.org). We will continue our use of notebooks to clean data and train our model, but you can take the process one step further by exploring using a model 'in the wild', so to speak: in a web app. To do this, you need to build a web app using Flask.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/17/)
There are several ways to build web apps to consume machine learning models. Your web architecture may influence the way your model is trained. Imagine that you are working in a business where the data science group has trained a model that they want you to use in an app. There are many questions you need to ask: Is it a web app, or a mobile app? Where will the model reside, in the cloud or locally? Does the app have to work offline? And what technology was used to train the model, because that may influence the tooling you need to use?
If you are training a model using TensorFlow, for example, that ecosystem provides the ability to convert a TensorFlow model for use in a web app by using [TensorFlow.js](https://www.tensorflow.org/js/). If you are building a mobile app or need to use the model in an IoT context, you could use [TensorFlow Lite](https://www.tensorflow.org/lite/) and use the model in an Android or iOS app.
If you are building a model using a library such as [PyTorch](https://pytorch.org/), you have the option to export it in [ONNX](https://onnx.ai/) (Open Neural Network Exchange) format for use in JavaScript web apps that can use the [Onnx Runtime](https://www.onnxruntime.ai/). This option will be explored in a future lesson for a Scikit-learn-trained model.
If you are using an ML SaaS (Software as a Service) system such as [Lobe.ai](https://lobe.ai/) or [Azure Custom Vision](https://azure.microsoft.com/services/cognitive-services/custom-vision-service/?WT.mc_id=academic-15963-cxa) to train a model, this type of software provides ways to export the model for many platforms, including building a bespoke API to be queried in the cloud by your online application.
You also have the opportunity to build an entire Flask web app that would be able to train the model itself in a web browser. This can also be done using TensorFlow.js in a JavaScript context. For our purposes, since we have been working with Python-based notebooks, let's explore the steps you need to take to export a trained model from such a notebook to a format readable by a Python-built web app.
## Tools
@ -52,7 +52,7 @@ ufos = ufos[(ufos['Seconds'] >= 1) & (ufos['Seconds'] <= 60)]
ufos.info()
```
Next, import Scikit-learn's LabelEncoder library to convert the text values for countries to a number.
✅ LabelEncoder encodes data alphabetically
@ -89,7 +89,7 @@ y = ufos['Country']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```
Finally, train your model using logistic regression:
```python
from sklearn.metrics import accuracy_score, classification_report
@ -103,10 +103,10 @@ print('Predicted labels: ', predictions)
print('Accuracy: ', accuracy_score(y_test, predictions))
```
The accuracy isn't bad (around 95%), unsurprisingly, as country and latitude/longitude correlate. The model you created isn't very revolutionary, as it's obvious you should be able to infer a country from its latitude and longitude, but it's a good exercise to train a model from raw data that you cleaned, export it, and then use it in a web app.
## Pickle your model
Now, it's time to pickle your model! You can do that in just a few lines of code. Once it's pickled, load your pickled model and test it against a sample data array containing values for seconds, latitude and longitude,
```python
import pickle
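# Sketch of the steps just described: save the model, reload it, and test it
# on a sample array of [seconds, latitude, longitude] (illustrative values)
model_filename = 'ufo-model.pkl'
pickle.dump(model, open(model_filename, 'wb'))

model = pickle.load(open(model_filename, 'rb'))
print(model.predict([[50, 44, -12]]))
```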
@ -122,9 +122,9 @@ The model returns '3', which is the country code for the UK. Wild! 👽
Now you can build a Flask app to call your model and return similar results, but in a more visually pleasing way.
Start by creating a folder called web-app next to the _notebook.ipynb_ file where your _ufo-model.pkl_ file resides. In that folder create three more folders: `static`, with a folder `css` inside it, and `templates`.
Refer to the solution folder for a view of the finished app
The first file to create in `web-app` is a `requirements.txt` file. Like `package.json` in a JavaScript app, this file lists dependencies required by the app. In `requirements.txt` add the lines:
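A plausible minimal set of lines (an assumption; the lesson's solution folder has the definitive file):

```
scikit-learn
pandas
numpy
flask
```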
@ -258,11 +258,14 @@ Before doing that, take a look at the parts of `app.py`.
First, dependencies are loaded and the app starts. Then, the model is imported. Then, index.html is rendered on the home route. On the `/predict` route, several things happen when the form is posted:
1. The form variables are gathered and converted to a numpy array. They are then sent to the model and a prediction is returned.
2. The countries that we want displayed are re-rendered as readable text from their predicted country code, and that value is sent back to index.html to be rendered in the template (a sketch of this flow follows below).
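Here is a hedged sketch of that flow (not the lesson's exact `app.py`; the country list and form handling are assumptions):

```python
import pickle
import numpy as np
from flask import Flask, render_template, request

app = Flask(__name__)
model = pickle.load(open('ufo-model.pkl', 'rb'))
# Hypothetical mapping from the encoder's numeric codes back to names
countries = ['Australia', 'Canada', 'Germany', 'UK', 'US']

@app.route('/predict', methods=['POST'])
def predict():
    # 1. Gather the form variables and convert them to a numpy array
    int_features = [int(x) for x in request.form.values()]
    prediction = model.predict(np.array([int_features]))
    # 2. Re-render the predicted country code as readable text for the template
    output = countries[prediction[0]]
    return render_template('index.html', prediction_text=f'Likely country: {output}')
```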
Using a model this way, with Flask and a pickled model, is relatively straightforward. The hardest thing is to understand what shape the data is that must be sent to the model to get a prediction. That all depends on how the model was trained. This one has three data points to be input in order to get a prediction.
In a professional setting, you can see how good communication is necessary between the folks who train the model and those who consume it in a web or mobile app. In our case, it's only one person, you!
---
## 🚀 Challenge:
Instead of working in a notebook and importing the model to the Flask app, you could train the model right within the Flask app! Try converting your Python code in the notebook, perhaps after your data is cleaned, to train the model from within the app on a route called `train`. What are the pros and cons of pursuing this method?
@ -1,6 +1,11 @@
# Build a web app to use your ML model
In this section of the curriculum, you will be introduced to an applied ML topic: how to save your Scikit-learn model as a file that can be used to make predictions within a web application. Once the model is saved, you'll learn how to use it in a web app built in Flask. You'll first create a model using some data that's all about UFO sightings! Then, you'll build a web app that will allow you to input a number of seconds with a latitude and a longitude value to predict which country reported seeing a UFO.
![UFO Parking](images/ufo.jpg)
Photo by <a href="https://unsplash.com/@mdherren?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Michael Herren</a> on <a href="https://unsplash.com/s/photos/ufo?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
## Lessons
@ -1,14 +1,14 @@
# Introduction to Classification
# Introduction to classification
In these four lessons, you will discover the 'meat and potatoes' of classic machine learning - Classification. No pun intended - we will walk through using various classification algorithms with a dataset all about the brilliant cuisines of Asia and India. Hope you're hungry!
In these four lessons, you will discover the 'meat and potatoes' of classic machine learning - classification. No pun intended - we will walk through using various classification algorithms with a dataset all about the brilliant cuisines of Asia and India. Hope you're hungry!
Classification is a form of [supervised learning](https://wikipedia.org/wiki/Supervised_learning) that bears a lot in common with Regression techniques. If machine learning is all about assigning names to things via datasets, then classification generally falls into two groups: binary classification and multiclass classfication.
Classification is a form of [supervised learning](https://wikipedia.org/wiki/Supervised_learning) that bears a lot in common with regression techniques. If machine learning is all about assigning names to things via datasets, then classification generally falls into two groups: binary classification and multiclass classification.
[![Introduction to Classification](https://img.youtube.com/vi/eg8DJYwdMyg/0.jpg)](https://youtu.be/eg8DJYwdMyg "Introduction to Classification")
[![Introduction to classification](https://img.youtube.com/vi/eg8DJYwdMyg/0.jpg)](https://youtu.be/eg8DJYwdMyg "Introduction to classification")
> 🎥 Click the image above for a video: MIT's John Guttag introduces Classification
> 🎥 Click the image above for a video: MIT's John Guttag introduces classification
Remember, Linear Regression helped you predict relationships between variables and make accurate predictions on where a new datapoint would fall in relationship to that line. So, you could predict what price a pumpkin would be in September vs. December, for example. Logistic Regression helped you discover binary categories: at this price point, is this pumpkin orange or not-orange?
Remember, linear regression helped you predict relationships between variables and make accurate predictions on where a new datapoint would fall in relation to that line. So, you could predict what price a pumpkin would be in September vs. December, for example. Logistic regression helped you discover binary categories: at this price point, is this pumpkin orange or not-orange?
Classification uses various algorithms to determine a data point's label or class in other ways. Let's work with this cuisine data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/19/)
@ -19,21 +19,24 @@ Classification is one of the fundamental activities of the machine learning rese
Before starting the process of cleaning our data, visualizing it, and prepping it for our ML tasks, let's learn a bit about the various ways machine learning can be leveraged to classify data.
Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification), classification using classic machine learning uses features, such as 'smoker','weight', and 'age' to determine 'likelihood of developing X disease'. As a supervised learning technique similar to the Regression exercises you performed earlier, your data is labeled and the ML algorithms use those labels to classify and predict classes (or 'features') of a dataset and assign them to a group or outcome.
Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification), classification using classic machine learning uses features, such as 'smoker', 'weight', and 'age' to determine 'likelihood of developing X disease'. As a supervised learning technique similar to the regression exercises you performed earlier, your data is labeled and the ML algorithms use those labels to classify and predict classes (or 'labels') of a dataset and assign them to a group or outcome.
✅ Take a moment to imagine a dataset about cuisines. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to use fenugreek? What if you wanted to see if, given a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?
[![Crazy mystery baskets](https://img.youtube.com/vi/GuTeDbaNoEU/0.jpg)](https://youtu.be/GuTeDbaNoEU "Crazy mystery baskets")
> The whole premise of the show 'Chopped' is the 'mystery basket' where chefs have to make some dish out of a random choice of ingredients. Surely an ML model would have helped!
## Hello 'classifier'
The question we want to ask of this cuisine dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?
Scikit-Learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.
Scikit-learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.
## Clean and Balance Your Data
## Clean and balance your data
The first task at hand before starting this project is to clean and **balance** your data to get better results. Start with the blank `notebook.ipynb` file ini the root of this folder.
The first task at hand before starting this project is to clean and **balance** your data to get better results. Start with the blank `notebook.ipynb` file in the root of this folder.
The first think to install is [imblearn](https://imbalanced-learn.org/stable/). This is a Scikit-Learn package that will allow you to better balance the data (you will learn more about this task in a minute).
The first thing to install is [imblearn](https://imbalanced-learn.org/stable/). This is a Scikit-learn package that will allow you to better balance the data (you will learn more about this task in a minute).
```python
pip install imblearn
@ -206,6 +209,8 @@ The data is nice and clean, balanced, and very delicious! You can take one more
transformed_df.to_csv("../../data/cleaned_cuisine.csv")
```
This fresh CSV can now be found in the root data folder.
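For reference, the balancing step that produces a dataset like this typically relies on imblearn's `fit_resample`; here is a minimal sketch, assuming a `feature_df` and a `labels_df` split off from the original cuisines dataframe:

```python
from imblearn.over_sampling import SMOTE

# SMOTE oversamples the minority classes so each cuisine is equally represented
oversample = SMOTE()
transformed_feature_df, transformed_label_df = oversample.fit_resample(feature_df, labels_df)
```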
---
## 🚀Challenge
This curriculum contains several interesting datasets. Dig through the `data` folders and see if any contain datasets that would be appropriate for binary or multiclass classification. What questions would you ask of these datasets?

@ -2,7 +2,7 @@
## Instructions
In [Scikit-Learn documentation](https://scikit-learn.org/stable/supervised_learning.html) you'll find a large list of ways to classify data. Do a little scavenger hunt in these docs: your goals is to look for classification methods and match a dataset in this curriculum, a question you can ask of it, and a technique of classification. Create a spreadsheet or table in a .doc file and explain how the dataset would work with the classification algorithm.
In [Scikit-learn documentation](https://scikit-learn.org/stable/supervised_learning.html) you'll find a large list of ways to classify data. Do a little scavenger hunt in these docs: your goal is to look for classification methods and match them with a dataset in this curriculum, a question you can ask of it, and a technique of classification. Create a spreadsheet or table in a .doc file and explain how the dataset would work with the classification algorithm.
## Rubric

@ -9,7 +9,7 @@
},
{
"source": [
"Install Imblearn which will enable SMOTE. This is a Scikit-Learn package that helps handle imbalanced data when performing classification. (https://imbalanced-learn.org/stable/)"
"Install Imblearn which will enable SMOTE. This is a Scikit-learn package that helps handle imbalanced data when performing classification. (https://imbalanced-learn.org/stable/)"
],
"cell_type": "markdown",
"metadata": {}

@ -1,13 +1,13 @@
# Cuisine Classifiers 1
# Cuisine classifiers 1
In this lesson, you will use the dataset you saved from the last lesson full of balanced, clean data all about cuisines. You will use this dataset with a variety of classifiers to predict a given national cuisine based on a group of ingredients. While doing so, you'll learn more about some of the ways that algorithms can be leveraged for classification tasks.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/21/)
# Preparation
Assuming you completed Lesson 1, make sure that a `cleaned_cuisines.csv` file exists in the root `/data` folder for these four lessons.
Assuming you completed [Lesson 1](../1-Introduction/README.md), make sure that a _cleaned_cuisine.csv_ file exists in the root `/data` folder for these four lessons.
Working in this lesson's `notebook.ipynb` folder, import that file along with the Pandas library:
Working in this lesson's _notebook.ipynb_ folder, import that file along with the Pandas library:
```python
import pandas as pd
@ -70,12 +70,11 @@ Your features look like this:
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Now you are ready to train your model!
## Choosing your classifier
Now that your data is clean and ready for training, you have to decide which algorithm to use for the job.
Scikit-Learn groups Classification under Supervised Learning, and in that category you will find many ways to classify. [The variety](https://scikit-learn.org/stable/supervised_learning.html) is quite bewildering at first sight. The following methods all include classification techniques:
Scikit-learn groups classification under supervised learning, and in that category you will find many ways to classify. [The variety](https://scikit-learn.org/stable/supervised_learning.html) is quite bewildering at first sight. The following methods all include classification techniques:
- Linear Models
- Support Vector Machines
@ -86,37 +85,34 @@ Scikit-Learn groups Classification under Supervised Learning, and in that catego
- Ensemble methods (voting Classifier)
- Multiclass and multioutput algorithms (multiclass and multilabel classification, multiclass-multioutput classification)
You can also use [neural networks to classify](https://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification), but that is outside the scope of this lesson.
> You can also use [neural networks to classify data](https://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification), but that is outside the scope of this lesson.
So, which classifier should you choose? Often, running through several and looking for a good result is a way to test. Scikit-Learn offers a [side-by-side comparison](https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html) on a created dataset, comparing KNeighbors, SVC two ways, GaussianProcessClassifier, DecisionTreeClassifier, RandomForestClassifier, MLPClassifier, AdaBoostClassifier, GaussianNB and QuadraticDiscrinationAnalysis, showing the results visualized:
So, which classifier should you choose? Often, running through several and looking for a good result is a way to test. Scikit-learn offers a [side-by-side comparison](https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html) on a created dataset, comparing KNeighbors, SVC two ways, GaussianProcessClassifier, DecisionTreeClassifier, RandomForestClassifier, MLPClassifier, AdaBoostClassifier, GaussianNB and QuadraticDiscriminantAnalysis, showing the results visualized:
![comparison of classifiers](images/comparison.png)
> Plots generated on Scikit-Learn's documentation
> Plots generated on Scikit-learn's documentation
> AutoML solves this problem neatly by running these comparisons in the cloud, allowing you to choose the best algorithm for your data. Try it [here](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-15963-cxa)
A better way than wildly guessing, however, is to follow the ideas on this downloadable [ML Cheat sheet](https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-15963-cxa). Here, we discover that, for our multiclass problem, we have some choices:
A better way than wildly guessing, however, is to follow the ideas on this downloadable [ML Cheat sheet](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-15963-cxa). Here, we discover that, for our multiclass problem, we have some choices:
![cheatsheet for multiclass problems](images/cheatsheet.png)
> A section of Microsoft's Algorithm Cheat Sheet, detailing multiclass classification options
✅ Download this cheat sheet, print it out, and hang it on your wall!
Given our clean, but minimal dataset, and the fact that we are running training locally via notebooks, neural networks are too heavyweight for this task. We do not use a two-class classifier, so that rules out One-vs-All. A
Decision Tree might work, or Logistic Regression for multiclass data. The Multiclass Boosted Decision Tree is most suitable for nonparametric tasks, e.g. tasks designed to build rankings, so it is not useful for us.
We can focus on Decision Trees and Logistic Regression.
Given our clean, but minimal dataset, and the fact that we are running training locally via notebooks, neural networks are too heavyweight for this task. We do not use a two-class classifier, so that rules out one-vs-all. A decision tree might work, or logistic regression for multiclass data. The multiclass boosted decision tree is most suitable for nonparametric tasks, e.g. tasks designed to build rankings, so it is not useful for us.
Let's focus on Logistic Regression for our first training trial since you recently learned about it in a previous lesson.
We can focus on logistic regression for our first training trial since you recently learned about it in a previous lesson.
## Train your model
Let's train that model. Split your data into training and testing groups:
Let's train a model. Split your data into training and testing groups:
```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
```
There are many ways to use the LogisticRegression library in Scikit-Learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
There are many ways to use the LogisticRegression library in Scikit-learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
According to the docs, "In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the multi_class option is set to ovr, and uses the cross-entropy loss if the multi_class option is set to multinomial. (Currently the multinomial option is supported only by the lbfgs, sag, saga and newton-cg solvers.)"
@ -124,11 +120,11 @@ Since you are using the multiclass case, you need to choose what scheme to use a
Use LogisticRegression with a multiclass setting and the liblinear solver to train.
> 🎓 The 'scheme' here can either be 'ovr' (one-vs-rest) or 'multinomial'. Since Logistic Regression is really designed to support binary classification, these schemes allow it to better handle multiclass classification tasks. [source](https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/)
> 🎓 The 'scheme' here can either be 'ovr' (one-vs-rest) or 'multinomial'. Since logistic regression is really designed to support binary classification, these schemes allow it to better handle multiclass classification tasks. [source](https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/)
> 🎓 The 'solver' is defined as "the algorithm to use in the optimization problem". [source](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).
Scikit-Learn offers this table to explain how solvers handle different challenges presented by different kinds of data structures:
Scikit-learn offers this table to explain how solvers handle different challenges presented by different kinds of data structures:
![solvers](images/solvers.png)
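Putting the scheme and solver together, the training step might look like the following sketch (assuming the split above; `np.ravel` flattens the label dataframe into the 1D array the estimator expects):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(multi_class='ovr', solver='liblinear')
model = lr.fit(X_train, np.ravel(y_train))

accuracy = model.score(X_test, y_test)
print("Accuracy is {}".format(accuracy))
```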
@ -183,7 +179,7 @@ The result is printed - Indian cuisine is its best guess, with good probability:
✅ Can you explain why the model is pretty sure this is an Indian cuisine?
Get more detail by printing a classification report, as you did in the Regression lessons:
Get more detail by printing a classification report, as you did in the regression lessons:
```python
from sklearn.metrics import classification_report

y_pred = model.predict(X_test)
@ -203,12 +199,12 @@ print(classification_report(y_test,y_pred))
## 🚀Challenge
In this lesson, you used your cleaned data to build a machine learning model that can predict a national cuisine based on a series of ingredients. Take some time to read through the many options Scikit-Learn provides to classify data. Dig deeper into the concept of 'solver' to understand what goes on behind the scenes.
In this lesson, you used your cleaned data to build a machine learning model that can predict a national cuisine based on a series of ingredients. Take some time to read through the many options Scikit-learn provides to classify data. Dig deeper into the concept of 'solver' to understand what goes on behind the scenes.
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/22/)
## Review & Self Study
Dig a little more into the math behind Logistic Regression in [this lesson](https://people.eecs.berkeley.edu/~russell/classes/cs194/f11/lectures/CS194%20Fall%202011%20Lecture%2006.pdf)
Dig a little more into the math behind logistic regression in [this lesson](https://people.eecs.berkeley.edu/~russell/classes/cs194/f11/lectures/CS194%20Fall%202011%20Lecture%2006.pdf)
## Assignment
[Study the solvers](assignment.md)

@ -1,20 +1,20 @@
# Cuisine Classifiers 2
# Cuisine classifiers 2
In this second Classification lesson, you will explore more ways to classify numeric data, and the ramifications for choosing one over the other.
In this second classification lesson, you will explore more ways to classify numeric data. You will also learn about the ramifications of choosing one classifier over another.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/23/)
### Prerequisite
We assume that you have completed the previous lessons and have a cleaned dataset in your `data` folder called `cleaned_cuisine.csv` in the root of this 4-lesson folder.
We assume that you have completed the previous lessons and have a cleaned dataset in your `data` folder called _cleaned_cuisine.csv_ in the root of this 4-lesson folder.
### Preparation
We have loaded your `notebook.ipynb` file with the cleaned dataset and have divided it into X and y dataframes, ready for the model building process.
We have loaded your _notebook.ipynb_ file with the cleaned dataset and have divided it into X and y dataframes, ready for the model building process.
## A Classification Map
## A classification map
Previously, you learned about the various options you have when classifying data using Microsoft's cheat sheet. Scikit-Learn offers a similar, but more granular cheat sheet that can further help narrow down your estimators (another term for classifiers):
Previously, you learned about the various options you have when classifying data using Microsoft's cheat sheet. Scikit-learn offers a similar, but more granular cheat sheet that can further help narrow down your estimators (another term for classifiers):
![ML Map from Scikit-Learn](images/map.png)
![ML Map from Scikit-learn](images/map.png)
> Tip: [visit this map online](https://scikit-learn.org/stable/tutorial/machine_learning_map/) and click along the path to read documentation.
This map is very helpful once you have a clear grasp of your data, as you can 'walk' along its paths to a decision:
@ -28,7 +28,7 @@ This map is very helpful once you have a clear grasp of your data, as you can 'w
- We can try a ✨ KNeighbors Classifier
- If that doesn't work, try ✨ SVC and ✨ Ensemble Classifiers
This is a terrific trail to try. Following this path, we should start by importing some libraries to use:
This is a very helpful trail to follow. To start down this path, we should import some libraries to use:
```python
from sklearn.neighbors import KNeighborsClassifier
@ -44,7 +44,7 @@ Split your training and test data:
```python
X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
```
## Linear SVC Classifier
## Linear SVC classifier
Start by creating an array of classifiers, to which you will add progressively as we test. Begin with a Linear SVC:
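A sketch of what that starting point might look like (in Python this 'array' is naturally a dictionary mapping names to estimators; `C = 10` is an illustrative value):

```python
from sklearn.svm import SVC

C = 10
# a map of names to classifiers; more entries will be added as we test
classifiers = {
    'Linear SVC': SVC(kernel='linear', C=C, probability=True, random_state=0)
}
```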
@ -87,10 +87,10 @@ weighted avg 0.79 0.79 0.79 1199
✅ Learn about Linear SVC
Support-Vector Clustering (SVC) is a child of the Support-Vector machines family of ML techniques (learn more about these below). In this method, you can choose a 'kernel' to decide how to cluster the labels. The 'C' parameter refers to 'regularization' which regulates the influence of parameters. The kernel can be one of [several](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC); here we set it to 'linear' to ensure that we leverage Linear SVC. Probability defaults to 'false'; here we set it to 'true' to gather probability estimates. We set the random state to '0' to shuffle the data to get probabilities.
## K-Neighbors Classifier
Support-Vector classification (SVC) is a child of the Support-Vector Machines family of ML techniques (learn more about these below). In this method, you can choose a 'kernel' to decide how to separate the labels. The 'C' parameter refers to 'regularization', which regulates the influence of parameters. The kernel can be one of [several](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC); here we set it to 'linear' to ensure that we leverage linear SVC. Probability defaults to 'false'; here we set it to 'true' to gather probability estimates. We set the random state to '0' so that the shuffling used to estimate probabilities is reproducible.
## K-Neighbors classifier
The previous classifier was good, and worked well with the data, but maybe we can get better accuracy. Try a K-Neighbors Classifer. Add a line to your classifier array (add a comma after the Linear SVC item):
The previous classifier was good, and worked well with the data, but maybe we can get better accuracy. Try a K-Neighbors classifier. Add a line to your classifier array (add a comma after the Linear SVC item):
```python
'KNN classifier': KNeighborsClassifier(C),  # C is passed positionally here as n_neighbors
@ -140,7 +140,7 @@ weighted avg 0.84 0.83 0.83 1199
✅ Learn about [Support-Vectors](https://scikit-learn.org/stable/modules/svm.html#svm)
Support-Vector Classifiers are part of the [Support-Vector Machine](https://en.wikipedia.org/wiki/Support-vector_machine) family of ML methods that are used for classification and regression tasks. SVMs "map training examples to points in space" to maximize the distance between two categories. Subsequent data is mapped into this space so their category can be predicted.
Support-Vector classifiers are part of the [Support-Vector Machine](https://wikipedia.org/wiki/Support-vector_machine) family of ML methods that are used for classification and regression tasks. SVMs "map training examples to points in space" to maximize the distance between two categories. Subsequent data is mapped into this space so their category can be predicted.
## Ensemble Classifiers
Let's follow the path to the very end, even though the previous test was quite good. Let's try some 'Ensemble Classifiers', specifically Random Forest and AdaBoost:
@ -185,6 +185,8 @@ This method of Machine Learning "combines the predictions of several base estima
- [Random Forest](https://scikit-learn.org/stable/modules/ensemble.html#forest), an averaging method, builds a 'forest' of 'decision trees' infused with randomness to avoid overfitting. The n_estimators parameter is set to the number of trees.
- [AdaBoost](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html) fits a classifier to a dataset and then fits copies of that classifier to the same dataset. It focuses on the weights of incorrectly classified items and adjusts the fit for the next classifier to correct.
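As a sketch (the `n_estimators` value is illustrative), these two could be added to the same map of classifiers and run through the identical fit-and-report loop:

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

# hypothetical additions to the classifiers map built earlier in this lesson
classifiers = {
    'RFST': RandomForestClassifier(n_estimators=100),
    'ADA': AdaBoostClassifier(n_estimators=100)
}
```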
---
## 🚀Challenge
Each of these techniques has a large number of parameters that you can tweak. Research each one's default parameters and think about what tweaking these parameters would mean for the model's quality.
@ -192,7 +194,7 @@ Each of these techniques has a large number of parameters that you can tweak. Re
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/24/)
## Review & Self Study
There's a lot of jargon in these lessons, so take a minute to review [this list](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/glossary?WT.mc_id=academic-15963-cxa) of useful terminology!
There's a lot of jargon in these lessons, so take a minute to review [this list](https://docs.microsoft.com/dotnet/machine-learning/resources/glossary?WT.mc_id=academic-15963-cxa) of useful terminology!
## Assignment
[Parameter Play](assignment.md)
[Parameter play](assignment.md)

@ -24,7 +24,7 @@ First, train a classification model using the cleaned cuisines dataset we used.
pip install skl2onnx
import pandas as pd
```
You need '[skl2onnx](https://onnx.ai/sklearn-onnx/)' to help convert your Scikit-Learn model to Onnx format.
You need '[skl2onnx](https://onnx.ai/sklearn-onnx/)' to help convert your Scikit-learn model to ONNX format.
Then, work with your data in the same way you did in previous lessons:
@ -48,7 +48,7 @@ y.head()
```
Commence the training routine. We will use the 'SVC' library which has good accuracy. Import the appropriate libraries from Scikit-Learn:
Commence the training routine. We will use the 'SVC' estimator, which has good accuracy. Import the appropriate libraries from Scikit-learn:
```python
from sklearn.model_selection import train_test_split

@ -1,4 +1,4 @@
# Getting Started with Classification
# Getting started with classification
## Regional topic: Delicious Asian and Indian Cuisines 🍜
In Asia and India, food traditions are extremely diverse, and very delicious! Let's look at data about regional cuisines to try to guess where they originated.
@ -8,18 +8,18 @@ In Asia and India, food traditions are extremely diverse, and very delicious! Le
## What you will learn
In this section, you will build on the skills you learned in Lesson 1 (Regression) to learn about other classifiers you can use that will help you learn about your data.
In this section, you will build on the skills you learned in the first part of this curriculum, all about regression, to explore other classifiers that can help you learn about your data.
> There are useful low-code tools that can help you learn about working with Classification models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-classification-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)
> There are useful low-code tools that can help you learn about working with classification models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-classification-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)
## Lessons
1. [Introduction to Classification](1-Introduction/README.md)
2. [More Classifiers](2-Classifiers-1/README.md)
3. [Yet Other Classifiers](3-Classifiers-2/README.md)
4. [Applied ML: Build a Web App](4-Applied/README.md)
1. [Introduction to classification](1-Introduction/README.md)
2. [More classifiers](2-Classifiers-1/README.md)
3. [Yet other classifiers](3-Classifiers-2/README.md)
4. [Applied ML: build a web app](4-Applied/README.md)
## Credits
"Getting Started with Classification" was written with ♥️ by [Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper)
"Getting started with classification" was written with ♥️ by [Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper)
The delicious cuisines dataset was sourced from [Kaggle](https://www.kaggle.com/hoandan/asian-and-indian-cuisines)

@ -1,35 +1,35 @@
# Introduction to Clustering
# Introduction to clustering
Clustering is a type of [Unsupervised Learning](https://wikipedia.org/wiki/Unsupervised_learning) that presumes that a dataset is unlabelled. It uses various algorithms to sort through unlabeled data and provide groupings according to patterns it discerns in the data.
Clustering is a type of [Unsupervised Learning](https://wikipedia.org/wiki/Unsupervised_learning) that presumes that a dataset is unlabeled or that its inputs are not matched with predefined outputs. It uses various algorithms to sort through unlabeled data and provide groupings according to patterns it discerns in the data.
[![No One Like You by PSquare](https://img.youtube.com/vi/ty2advRiWJM/0.jpg)](https://youtu.be/ty2advRiWJM "No One Like You by PSquare")
> 🎥 Click the image above for a video. While you're studying Machine Learning with Clustering, enjoy some Nigerian Dance Hall tracks - this is a highly rated song from 2014 by PSquare.
> 🎥 Click the image above for a video. While you're studying machine learning with clustering, enjoy some Nigerian Dance Hall tracks - this is a highly rated song from 2014 by PSquare.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/27/)
### Introduction
[Clustering](https://link.springer.com/referenceworkentry/10.1007%2F978-0-387-30164-8_124) is very useful for data exploration. Let's see if it can help discover trends and patterns in the way Nigerian audiences consume music.
✅ Take a minute to think about the uses of clustering. In real life, clustering happens whenever you have a pile of laundry and need to sort out your family members' clothes 🧦👕👖🩲. In data science, clustering happens when trying to analyze a user's preferences, or determine the characteristics of any unlabeled dataset. Clustering, in a way, helps make sense of chaos.
✅ Take a minute to think about the uses of clustering. In real life, clustering happens whenever you have a pile of laundry and need to sort out your family members' clothes 🧦👕👖🩲. In data science, clustering happens when trying to analyze a user's preferences, or determine the characteristics of any unlabeled dataset. Clustering, in a way, helps make sense of chaos, like a sock drawer.
[![Introduction to ML](https://img.youtube.com/vi/esmzYhuFnds/0.jpg)](https://youtu.be/esmzYhuFnds "Introduction to Clustering")
> 🎥 Click the image above for a video: MIT's John Guttag introduces Clustering
> 🎥 Click the image above for a video: MIT's John Guttag introduces clustering
In a professional setting, clustering can be used for purposes like market segmentation - determining what age groups buy what items, for example. Another use would be anomaly detection, perhaps to detect fraud in a dataset of credit card transactions. Or you might use clustering to identify tumors in a batch of medical scans.
✅ Think a minute about how you might have encountered clustering 'in the wild', in a banking, e-commerce, or business setting.
> 🎓 Interestingly, Cluster Analysis originated in the fields of Anthropology and Psychology in the 1930s. Can you imagine how it might have been used?
> 🎓 Interestingly, cluster analysis originated in the fields of Anthropology and Psychology in the 1930s. Can you imagine how it might have been used?
Alternately, you could use it for grouping search results - by shopping links, images, or reviews, for example. Clustering is useful when you have a large dataset that you want to reduce and on which you want to perform more granular analysis, so the technique can be used to learn about data before other models are constructed.
✅ Once your data is organized in clusters, you assign it a cluster ID, and this technique can be useful when preserving a dataset's privacy; you can instead refer to a data point by its cluster ID, rather than by more revealing identifiable data. Can you think of other reasons why you'd refer to a cluster ID rather than other elements of the cluster to identify it?
Deepen your understanding of Clustering techniques in this [Learn module](https://docs.microsoft.com/learn/modules/train-evaluate-cluster-models?WT.mc_id=academic-15963-cxa)
Deepen your understanding of clustering techniques in this [Learn module](https://docs.microsoft.com/learn/modules/train-evaluate-cluster-models?WT.mc_id=academic-15963-cxa)
## Getting started with clustering
[Scikit-Learn offers a large array](https://scikit-learn.org/stable/modules/clustering.html) of methods to perform clustering. The type you choose will depend on your use case. According to the documentation, each method has various benefits. Here is a simplified table of the methods supported by Scikit-Learn and their appropriate use cases:
[Scikit-learn offers a large array](https://scikit-learn.org/stable/modules/clustering.html) of methods to perform clustering. The type you choose will depend on your use case. According to the documentation, each method has various benefits. Here is a simplified table of the methods supported by Scikit-learn and their appropriate use cases:
| Method name | Use case |
| :--------------------------- | :--------------------------------------------------------------------- |
@ -75,13 +75,13 @@ Deepen your understanding of Clustering techniques in this [Learn module](https:
>
> Data that is 'noisy' is considered to be 'dense'. The distances between points in each of its clusters may prove, on examination, to be more or less dense, or 'crowded', and thus this data needs to be analyzed with the appropriate clustering method. [This article](https://www.kdnuggets.com/2020/02/understanding-density-based-clustering.html) demonstrates the difference between using K-Means clustering vs. HDBSCAN algorithms to explore a noisy dataset with uneven cluster density.
### Clustering Algorithms
### Clustering algorithms
There are over 100 clustering algorithms, and their use depends on the nature of the data at hand. Let's discuss some of the major ones:
**Hierarchical clustering**
If an object is classified by its proximity to a nearby object, rather than to one farther away, clusters are formed based on their members' distance to and from other objects. Scikit-Learn's Agglomerative clustering is hierarchical.
If an object is classified by its proximity to a nearby object, rather than to one farther away, clusters are formed based on their members' distance to and from other objects. Scikit-learn's agglomerative clustering is hierarchical.
![Hierarchical clustering Infographic](./images/hierarchical.png)
> Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
@ -95,7 +95,7 @@ This popular algorithm requires the choice of 'k', or the number of clusters to
**Distribution-based clustering**
Based in statistical modeling, distribution-based clustering centers on determining the probability that a data point belongs to a cluster, and assigning it accordingly. Gaussian Mixture methods belong to this type.
Based in statistical modeling, distribution-based clustering centers on determining the probability that a data point belongs to a cluster, and assigning it accordingly. Gaussian mixture methods belong to this type.
**Density-based clustering**
@ -120,7 +120,7 @@ Append the song data .csv file. Load up a dataframe with some data about the son
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv("../../data/nigerian-songs.csv")
df = pd.read_csv("../data/nigerian-songs.csv")
df.head()
```
@ -211,6 +211,8 @@ df.describe()
| 75% | 2017 | 242098.5 | 31 | 0.8295 | 0.403 | 0.87575 | 0.000234 | 0.164 | -3.331 | 0.177 | 125.03925 | 4 |
| max | 2020 | 511738 | 73 | 0.966 | 0.954 | 0.995 | 0.91 | 0.811 | 0.582 | 0.514 | 206.007 | 5 |
> 🤔 If we are working with clustering, an unsupervised method that does not require labeled data, why are we showing this data with labels? In the data exploration phase, they come in handy, but they are not necessary for the clustering algorithms to work. You could just as well remove the column headers and refer to the data by column number.
Look at the general values of the data. Note that popularity can be '0', which shows songs that have no ranking. Let's remove those shortly.
Use a barplot to find out the most popular genres:
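A sketch of such a barplot, assuming the dataframe `df` loaded above and the dataset's `artist_top_genre` column:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# count songs per genre and plot the five most common
top = df['artist_top_genre'].value_counts()
plt.figure(figsize=(10, 7))
sns.barplot(x=top[:5].index, y=top[:5].values)
plt.xticks(rotation=45)
plt.title('Top genres', color='blue')
plt.show()
```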
@ -271,7 +273,7 @@ Is there any convergence in this dataset around a song's perceived popularity an
✅ Try different datapoints (energy, loudness, speechiness) and more or different musical genres. What can you discover? Take a look at the `df.describe()` table to see the general spread of the data points.
### Data Distribution
### Data distribution
Are these three genres significantly different in the perception of their danceability, based on their popularity? Examine our top three genres' data distribution for popularity and danceability along a given x and y axis.
@ -289,7 +291,7 @@ You can discover concentric circles around a general point of convergence, showi
> 🎓 Note that this example uses a KDE (Kernel Density Estimate) graph that represents the data using a continuous probability density curve. This allows us to interpret data when working with multiple distributions.
In general, the three genres align loosely in terms of their popularity and danceability. Determining clusters in this loosely-aligned data will be interesting:
In general, the three genres align loosely in terms of their popularity and danceability. Determining clusters in this loosely-aligned data will be a challenge:
![distribution](images/distribution.png)
@ -303,7 +305,9 @@ sns.FacetGrid(df, hue="artist_top_genre", size=5) \
![Facetgrid](images/facetgrid.png)
In general, for clustering, you can use scatterplots to show clusters of data, so mastering this type of visualization is very useful. In the next lesson, we will take this filtered data and use k-means clustering to discover groups in this data that seems to overlap in interesting ways.
In general, for clustering, you can use scatterplots to show clusters of data, so mastering this type of visualization is very useful. In the next lesson, we will take this filtered data and use k-means clustering to discover groups in this data that seem to overlap in interesting ways.
---
## 🚀Challenge
In preparation for the next lesson, make a chart about the various clustering algorithms you might discover and use in a production environment. What kinds of problems is the clustering trying to address?

@ -1,17 +1,17 @@
# K-Means Clustering
# K-Means clustering
[![Andrew Ng explains Clustering](https://img.youtube.com/vi/hDmNF9JG3lo/0.jpg)](https://youtu.be/hDmNF9JG3lo "Andrew Ng explains Clustering")
> 🎥 Click the image above for a video: Andrew Ng explains Clustering
> 🎥 Click the image above for a video: Andrew Ng explains clustering
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/29/)
In this lesson, you will learn how to create clusters using Scikit-Learn and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for Clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common Clustering technique. Let's get started!
In this lesson, you will learn how to create clusters using Scikit-learn and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common clustering technique. Let's get started!
Terms you will learn about:
- Silhouette Scoring
- Elbow Method
- Silhouette scoring
- Elbow method
- Inertia
- Variance
### Introduction
@ -22,7 +22,7 @@ Terms you will learn about:
> infographic by [Jen Looper](https://twitter.com/jenlooper)
The K-Means Clustering process [executes in a three-step process](https://scikit-learn.org/stable/modules/clustering.html#k-means):
The K-Means clustering process [executes in a three-step process](https://scikit-learn.org/stable/modules/clustering.html#k-means):
1. The algorithm selects k-number of center points by sampling from the dataset. After this, it loops:
1. It assigns each sample to the nearest centroid
@ -145,7 +145,7 @@ for i in range(1, 11):
> 🎓 Inertia: K-Means algorithms attempt to choose centroids to minimize 'inertia', "a measure of how internally coherent clusters are."[source](https://scikit-learn.org/stable/modules/clustering.html). The value is appended to the wcss variable on each iteration.
> 🎓 k-means++: In [Scikit-Learn](https://scikit-learn.org/stable/modules/clustering.html#k-means) you can use the 'k-means++' optimization, which "initializes the centroids to be (generally) distant from each other, leading to probably better results than random initialization.
> 🎓 k-means++: In [Scikit-learn](https://scikit-learn.org/stable/modules/clustering.html#k-means) you can use the 'k-means++' optimization, which "initializes the centroids to be (generally) distant from each other, leading to probably better results than random initialization."
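Taken together, the loop that builds `wcss` looks roughly like this sketch (assuming a feature matrix `X` selected earlier in the notebook; the `random_state` value is illustrative):

```python
from sklearn.cluster import KMeans

wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)  # inertia: within-cluster sum of squares
```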
### Elbow method
Previously, you surmised that, because you have targeted 3 song genres, you should choose 3 clusters. But is that the case? Use the 'elbow method' to make sure.
@ -162,7 +162,7 @@ plt.show()
Use the `wcss` variable that you built in the previous step to create a chart showing where the 'bend' in the elbow is, which indicates the optimum number of clusters. Maybe it **is** 3!
![elbow method](images/elbow.png)
### Display the Clusters
### Display the clusters
Try the process again, this time setting three clusters, and display the clusters as a scatterplot:
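A sketch under the lesson's assumptions (the feature matrix `X` and dataframe `df` from earlier; the plotted columns are illustrative choices):

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3)
kmeans.fit(X)
labels = kmeans.predict(X)  # a cluster label for each song

# color each song by its assigned cluster
plt.scatter(df['popularity'], df['danceability'], c=labels)
plt.xlabel('popularity')
plt.ylabel('danceability')
plt.show()
```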
@ -192,13 +192,12 @@ This model's accuracy is not very good, and the shape of the clusters gives you
![clusters](images/clusters.png)
This data is too imbalanced, too little correlated and there is too much variance between the column values, to cluster well. In fact, the clusters that form are probably heavily influenced or skewed by the three genre categories we defined above. That was a learning process!
This data is too imbalanced, too weakly correlated, and there is too much variance between the column values for it to cluster well. In fact, the clusters that form are probably heavily influenced or skewed by the three genre categories we defined above. That was a learning process!
In Scikit-Learn's documentation, you can see that a model like this one, with clusters not very well demarcated, has a 'variance' problem:
In Scikit-learn's documentation, you can see that a model like this one, with clusters not very well demarcated, has a 'variance' problem:
![problem models](images/problems.png)
> Infographic from Scikit-Learn
> Infographic from Scikit-learn
## Variance
Variance is defined as "the average of the squared differences from the Mean."[source](https://www.mathsisfun.com/data/standard-deviation.html) In the context of this clustering problem, it refers to the fact that the numbers in our dataset tend to diverge a bit too much from the mean.
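A quick worked example of that definition (the numbers are illustrative):

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])
mean = data.mean()                       # 5.0
variance = ((data - mean) ** 2).mean()   # average squared difference from the mean
print(mean, variance)                    # 5.0 4.0
```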
@ -206,18 +205,20 @@ Variance is defined as "the average of the squared differences from the Mean."[s
✅ This is a great moment to think about all the ways you could correct this issue. Tweak the data a bit more? Use different columns? Use a different algorithm? Hint: Try [scaling your data](https://www.mygreatlearning.com/blog/learning-data-science-with-k-means-clustering/) to normalize it and test other columns.
> Try this '[variance calculator](https://www.calculatorsoup.com/calculators/statistics/variance-calculator.php)' to understand the concept a bit more.
---
## 🚀Challenge
Spend some time with this notebook, tweaking parameters. Can you improve the accuracy of the model by cleaning the data more (removing outliers, for example)? You can use weights to give more importance to some data samples. What else can you do to create better clusters?
Hint: Try to scale your data. There's commented code in the notebook that adds Standard Scaling to make the data columns resemble each other more closely in terms of range. You'll find that while the silhouette score goes down, the 'kink' in the elbow graph smooths out. This is because leaving the data unscaled allows data with less variance to carry more weight. Read a bit more on this problem [here](https://stats.stackexchange.com/questions/21222/are-mean-normalization-and-feature-scaling-needed-for-k-means-clustering/21226#21226).
Hint: Try to scale your data. There's commented code in the notebook that adds standard scaling to make the data columns resemble each other more closely in terms of range. You'll find that while the silhouette score goes down, the 'kink' in the elbow graph smooths out. This is because leaving the data unscaled allows data with less variance to carry more weight. Read a bit more on this problem [here](https://stats.stackexchange.com/questions/21222/are-mean-normalization-and-feature-scaling-needed-for-k-means-clustering/21226#21226).
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/30/)
## Review & Self Study
Take a look at Stanford's K-Means Simulator [here](https://stanford.edu/class/engr108/visualizations/kmeans/kmeans.html). You can use this tool to visualize sample data points and determine their centroids. With fresh data, click 'update' to see how long it takes to find convergence. You can edit the data's randomness, number of clusters and number of centroids. Does this help you get an idea of how the data can be grouped?
Also, take a look at [this handout on k-means](https://stanford.edu/~cpiech/cs221/handouts/kmeans.html) from Stanford
Also, take a look at [this handout on k-means](https://stanford.edu/~cpiech/cs221/handouts/kmeans.html) from Stanford.
## Assignment

@ -1,5 +1,5 @@
# Clustering Models for Machine Learning
## Regional topic: Clustering models for a Nigerian audience's musical taste 🎧
# Clustering models for machine learning
## Regional topic: clustering models for a Nigerian audience's musical taste 🎧
Nigeria's diverse audience has diverse musical tastes. Using data scraped from Spotify (inspired by [this article](https://towardsdatascience.com/country-wise-visual-analysis-of-music-taste-using-spotify-api-seaborn-in-python-77f5b749b421)), let's look at some music popular in Nigeria. This dataset includes data about various songs' 'danceability' score, 'acousticness', loudness, 'speechiness', popularity and energy. It will be interesting to discover patterns in this data!
@ -8,17 +8,17 @@ Nigeria's diverse audience has diverse musical tastes. Using data scraped from S
Photo by <a href="https://unsplash.com/@marcelalaskoski?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Marcela Laskoski</a> on <a href="https://unsplash.com/s/photos/nigerian-music?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
In this series of lessons, you will discover new ways to analyze data using Clustering techniques. Clustering is particularly useful when your dataset lacks labels. If it does have labels, then Classification techniques such as those you learned in previous lessons are more useful. But in cases where you are looking to group unlabelled data, clustering is a great way to discover patterns.
In this series of lessons, you will discover new ways to analyze data using clustering techniques. Clustering is particularly useful when your dataset lacks labels. If it does have labels, then classification techniques such as those you learned in previous lessons might be more useful. But in cases where you are looking to group unlabelled data, clustering is a great way to discover patterns.
> There are useful low-code tools that can help you learn about working with Clustering models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-clustering-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)
> There are useful low-code tools that can help you learn about working with clustering models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-clustering-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)
## Lessons
1. [Introduction to Clustering](1-Visualize/README.md)
2. [K-Means Clustering](2-K-Means/README.md)
1. [Introduction to clustering](1-Visualize/README.md)
2. [K-Means clustering](2-K-Means/README.md)
## Credits
These lessons were written with 🎶 by [Jen Looper](https://www.twitter.com/jenlooper) with helpful reviews by [Rishit Dagli](https://twitter.com/rishit_dagli) and [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan).
The [Nigerian Songs](https://www.kaggle.com/sootersaalu/nigerian-songs-spotify) dataset was sourced from Kaggle as scraped from Spotify.
Useful K-Means examples that aided in creating this lesson include this [iris exploration](https://www.kaggle.com/bburns/iris-exploration-pca-k-means-and-gmm-clustering), this [introductory notebook](https://www.kaggle.com/prashant111/k-means-clustering-with-python), this [hypothetical NGO example](https://www.kaggle.com/ankandash/pca-k-means-clustering-hierarchical-clustering) and
Useful K-Means examples that aided in creating this lesson include this [iris exploration](https://www.kaggle.com/bburns/iris-exploration-pca-k-means-and-gmm-clustering), this [introductory notebook](https://www.kaggle.com/prashant111/k-means-clustering-with-python), and this [hypothetical NGO example](https://www.kaggle.com/ankandash/pca-k-means-clustering-hierarchical-clustering).

@ -1,8 +1,6 @@
# Introduction to Natural Language Processing
# Introduction to natural language processing
This lesson covers a brief history and important concepts of *Computational Linguistics* focusing on *Natural Language Processing*.
[![NLP 101](https://img.youtube.com/vi/C75SiVhXjRM/0.jpg)](https://youtu.be/C75SiVhXjRM "NLP 101")
This lesson covers a brief history and important concepts of *computational linguistics* focusing on *natural language processing*.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/)
## Introduction
@ -13,14 +11,13 @@ NLP, as it is commonly known, is one of the best-known areas where machine learn
You will learn about how the ideas about languages developed and what the major areas of study have been. You will also learn definitions and concepts about how computers process text, including parsing, grammar, and identifying nouns and verbs. There are some coding tasks in this lesson, and several important concepts are introduced that you will learn to code later on in the next lessons.
Computational linguistics is an area of research and development over many decades that studies how computers can work with, and even understand, translate, and communicate with languages. Natural Language Processing (NLP) is a related field focused on how computers can process 'natural', or human, languages. If you have ever dictated to your phone instead of typing or asked a virtual assistant a question, your speech was converted into a text form and then processed or *parsed* from the language you spoke. The detected keywords were then processed into a format that the phone or assistant could understand and act on.
Computational linguistics is an area of research and development over many decades that studies how computers can work with, and even understand, translate, and communicate with languages. Natural language processing (NLP) is a related field focused on how computers can process 'natural', or human, languages. If you have ever dictated to your phone instead of typing or asked a virtual assistant a question, your speech was converted into a text form and then processed or *parsed* from the language you spoke. The detected keywords were then processed into a format that the phone or assistant could understand and act on.
This is possible because someone wrote a computer program to do this. A few decades ago, some science fiction writers predicted that people would mostly speak to their computers, and the computers would always understand exactly what they meant. Sadly, it turned out to be a harder problem than many imagined, and while it is a much better understood problem today, there are significant challenges in achieving 'perfect' natural language processing when it comes to understanding the meaning of a sentence. This is a particularly hard problem when it comes to understanding humour or detecting emotions such as sarcasm in a sentence.
At this point, you may be remembering school classes where the teacher covered the parts of grammar in a sentence. In some countries, students are taught grammar and linguistics as a dedicated subject, but in many, these topics are included as part of learning a language: either your first language in primary school (learning to read and write) and perhaps a second language in post-primary, or high school. Don't worry if you are not an expert at differentiating nouns from verbs or adverbs from adjectives!
At this point, you may be remembering school classes where the teacher covered the parts of grammar in a sentence. In some countries, students are taught grammar and linguistics as a dedicated subject, but in many, these topics are included as part of learning a language: either your first language in primary school (learning to read and write) and perhaps a second language in post-primary, or high school. Don't worry if you are not an expert at differentiating nouns from verbs or adverbs from adjectives!
If you struggle with the difference between the *simple present* and *present progressive*, you are not alone. This is a challenging thing for many people, even native speakers of a language. The good news is that computers are really good at applying formal rules, and you will learn to write code that can *parse* a sentence as well as a human. The greater challenge you will examine later is understanding the *meaning*, and *sentiment*, of a sentence.
## Prerequisites
For this lesson, the main prerequisite is being able to read and understand the language of this lesson. There are no math problems or equations to solve. While the original author wrote this lesson in English, it is also translated into other languages, so you could be reading a translation. There are examples where a number of different languages are used (to compare the different grammar rules of different languages). These are *not* translated, but the explanatory text is, so the meaning should be clear.
@ -43,17 +40,20 @@ In this section, you will need:
## Conversing with Eliza
The history of trying to make computers understand human language goes back decades, and one of the earliest scientists to consider natural language processing was *Alan Turing*. When Turing was researching *Artificial Intelligence* in the 1950's, he considered if a conversational test could be given to a human and computer (via typed correspondence) where the human in the conversation was not sure if they were conversing with another human or a computer. If, after a certain length of conversation, the human could not determine that the answers were from a computer or not, then could the computer be said to be *thinking*?
[![Chatting with Eliza](https://img.youtube.com/vi/QD8mQXaUFG4/0.jpg)](https://youtu.be/QD8mQXaUFG4 "Chatting with Eliza")
The history of trying to make computers understand human language goes back decades, and one of the earliest scientists to consider natural language processing was *Alan Turing*. When Turing was researching *artificial intelligence* in the 1950's, he considered if a conversational test could be given to a human and a computer (via typed correspondence) where the human in the conversation was not sure whether they were conversing with another human or a computer. If, after a certain length of conversation, the human could not determine whether the answers were from a computer or not, then could the computer be said to be *thinking*?
The idea for this came from a party game called *The Imitation Game* where an interrogator is alone in a room and tasked with determining which of two people (in another room) is male and which is female. The interrogator can send notes, and must try to think of questions where the written answers reveal the gender of the mystery person. Of course, the players in the other room are trying to trick the interrogator by answering questions in such a way as to mislead or confuse the interrogator, whilst also giving the appearance of answering honestly.
In the 1960s an MIT scientist called *Joseph Weizenbaum* developed [*Eliza*](https://wikipedia.org/wiki/ELIZA), a computer 'therapist' that would ask the human questions and give the appearance of understanding their answers. However, while Eliza could parse a sentence and identify certain grammatical constructs and keywords so as to give a reasonable answer, it could not be said to *understand* the sentence. If Eliza was presented with a sentence following the format "**I am** <u>sad</u>", it might rearrange and substitute words in the sentence to form the response "How long have **you been** <u>sad</u>?".
This gave the impression that Eliza understood the statement and was asking a follow-on question, whereas in reality, it was changing the tense and adding some words. If Eliza could not identify a keyword that it had a response for, it would instead give a random response that could be applicable to many different statements. Eliza could be easily tricked; for instance, if a user wrote "**You are** a <u>bicycle</u>", it might respond with "How long have **I been** a <u>bicycle</u>?", instead of a more reasoned response.
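To make the rearrange-and-substitute idea concrete, here is a minimal sketch of a single Eliza-style rule in Python. This is our own illustration, not Weizenbaum's actual program; the function name and fallback reply are invented:

```python
import re

def eliza_style_response(sentence):
    # reflect "I am <something>" back with a tense change, as described above
    match = re.match(r"[Ii] am (.+)", sentence.strip().rstrip("."))
    if match:
        return f"How long have you been {match.group(1)}?"
    # no keyword pattern matched: fall back to a generic, widely applicable reply
    return "Please tell me more."

print(eliza_style_response("I am sad"))  # How long have you been sad?
```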
[![Chatting with Eliza](https://img.youtube.com/vi/RMK9AphfLco/0.jpg)](https://youtu.be/RMK9AphfLco "Chatting with Eliza")
> 🎥 Click the image above for a video about the original ELIZA program
> Note: You can read the original description of [Eliza](https://cacm.acm.org/magazines/1966/1/13317-elizaa-computer-program-for-the-study-of-natural-language-communication-between-man-and-machine/abstract) published in 1966 if you have an ACM account. Alternatively, read about Eliza on [Wikipedia](https://wikipedia.org/wiki/ELIZA).
### Exercise: Coding a basic conversational bot
A conversational bot, like Eliza, is a program that elicits user input and seems to understand and respond intelligently. Unlike Eliza, our bot will not have several rules giving it the appearance of having an intelligent conversation. Instead, our bot will have one ability only: to keep the conversation going with random responses that might work in almost any trivial conversation.
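As a hint of where to start, here is a minimal sketch of such a random-response bot. The response list and exit word are our own placeholders, not the lesson's provided solution:

```python
import random

# stock replies that could plausibly follow almost any statement
responses = [
    "Tell me more.",
    "Why do you say that?",
    "That sounds interesting.",
]

while True:
    user_input = input("> ")
    if user_input.lower() == "bye":
        print("Goodbye!")
        break
    print(random.choice(responses))
```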
One possible solution to the task is [here](solution/bot.py)
2. What features would the bot need to be more effective?
3. If a bot could really 'understand' the meaning of a sentence, would it need to 'remember' the meaning of previous sentences in a conversation too?
---
## 🚀Challenge
Choose one of the "stop and consider" elements above and either try to implement it in code or write a solution on paper using pseudocode.
Take a look at the references below as further reading opportunities.
1. Schubert, Lenhart, "Computational Linguistics", *The Stanford Encyclopedia of Philosophy* (Spring 2020 Edition), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/spr2020/entries/computational-linguistics/>.
2. Princeton University "About WordNet." [WordNet](https://wordnet.princeton.edu/). Princeton University. 2010.
## Assignment
[Search for a bot](assignment.md)

# Common natural language processing tasks and techniques
For most *natural language processing* tasks, the text to be processed must be broken down, examined, and the results stored or cross-referenced with rules and data sets. This allows the programmer to derive the meaning, the intent, or merely the frequency of terms and words in a text.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/33/)
Let's discover common techniques used in processing text. Combined with machine learning, these techniques help you to analyze large amounts of text efficiently. Before applying ML to these tasks, however, let's understand the problems encountered by an NLP specialist.
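For example, one of the simplest such derivations - term frequency - needs no ML at all. Here is a minimal sketch using Python's standard library (the sample sentence is our own):

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog and the fox runs"
tokens = text.lower().split()  # naive whitespace tokenization
print(Counter(tokens).most_common(3))
# e.g. [('the', 3), ('fox', 2), ('quick', 1)]
```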
## Tasks common to NLP
One possible solution to the task is [here](solution/bot.py)
2. Does identifying the noun phrase make the bot more 'believable'?
3. Why would extracting a 'noun phrase' from a sentence be a useful thing to do?
---
## 🚀Challenge
Take a task in the prior knowledge check and try to implement it. Test the bot on a friend. Can it trick them? Can you make your bot more 'believable'?

# Translation and sentiment analysis with ML
In the previous lessons you learned how to build a basic bot using TextBlob, a library that embeds ML behind the scenes to perform basic NLP tasks such as noun phrase extraction. Another important challenge in computational linguistics is accurate *translation* of a sentence from one spoken or written language to another.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/35/)
This is a very hard problem compounded by the fact that there are thousands of languages and each can have very different grammar rules. One approach is to convert the formal grammar rules for one language, such as English, into a non-language dependent structure, and then translate it by converting back to another language. This means that you would take the following steps:
1. Identify or tag the words in the input language as nouns, verbs, etc.
So far, you've learned about the formal rules approach to natural language processing. Another approach is to ignore the meaning of the words, and _instead use machine learning to detect patterns_. This can work in translation if you have lots of text (a *corpus*) or texts (*corpora*) in both the origin and target languages. For instance, consider the case of *Pride and Prejudice*, a well-known English novel written by Jane Austen in 1813. If you consult the book in English and a human translation of the book in *French*, you could detect phrases in one that are idiomatically translated into the other. You'll do that in a minute.
For instance, when an English phrase such as `I have no money` is translated literally to French, it might become `Je n'ai pas de monnaie`. "Monnaie" is a tricky French 'false cognate', as 'money' and 'monnaie' are not synonymous. A better translation that a human might make would be `Je n'ai pas d'argent`, because it better conveys the meaning that you have no money (rather than 'loose change', which is the meaning of 'monnaie'). If an ML model has enough human translations to build a model on, it can improve the accuracy of translations by identifying common patterns in texts that have been previously translated by expert human speakers of both languages.
### Task: Translation
Another area where machine learning can work very well is sentiment analysis.
This approach is easily tricked as you may have seen in the Marvin task - the sentence `Great, that was a wonderful waste of time, I'm glad we are lost on this dark road` is a sarcastic, negative sentiment sentence, but the simple algorithm detects 'great', 'wonderful', 'glad' as positive and 'waste', 'lost' and 'dark' as negative. The overall sentiment is swayed by these conflicting words.
✅ Stop a second and think about how we convey sarcasm as human speakers. Tone inflection plays a large role. Try to say the phrase "Well, that film was awesome" in different ways to discover how your voice conveys meaning.
### Machine learning approaches
The ML approach would be to hand-gather negative and positive bodies of text - tweets, or movie reviews, or anything where the human has given a score *and* a written opinion. Then NLP techniques can be applied to opinions and scores, so that patterns emerge (e.g., positive movie reviews tend to have the phrase 'Oscar worthy' more than negative movie reviews, or positive restaurant reviews say 'gourmet' much more than 'disgusting').
✅ Does this process sound like processes you have used in previous lessons?
### Exercise: sentimental sentences
Sentiment is measured with a *polarity* of -1 to 1, meaning -1 is the most negative sentiment and 1 is the most positive. Sentiment is also measured with a 0 to 1 score for objectivity (0) and subjectivity (1).
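For instance, a quick check with TextBlob (the library used for the bot in the earlier lessons) might look like this; the sample sentence is our own, and TextBlob must be installed first:

```python
from textblob import TextBlob

blob = TextBlob("I love this warm, sunny morning, but the coffee was dreadful.")
# sentiment is a namedtuple of (polarity, subjectivity)
print(blob.sentiment.polarity)      # between -1 (negative) and 1 (positive)
print(blob.sentiment.subjectivity)  # between 0 (objective) and 1 (subjective)
```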
Here is a sample [solution](solutions/book.py).
✅ Knowledge Check
1. The sentiment is based on words used in the sentence, but does the code *understand* the words?
2. Do you think the sentiment polarity is accurate, or in other words, do you *agree* with the scores?
   1. In particular, do you agree or disagree with the absolute **positive** polarity of the following sentences?
      * “What an excellent father you have, girls!” said she, when the door was shut.
- The pause was to Elizabeth's feelings dreadful.
- It would be dreadful!
✅ Any aficionado of Jane Austen will understand that she often uses her books to critique the more ridiculous aspects of English Regency society. Elizabeth Bennett, the main character in *Pride and Prejudice*, is a keen social observer (like the author) and her language is often heavily nuanced. Even Mr. Darcy (the love interest in the story) notes Elizabeth's playful and teasing use of language: "I have had the pleasure of your acquaintance long enough to know that you find great enjoyment in occasionally professing opinions which in fact are not your own."
---
## 🚀Challenge
Can you make Marvin even better by extracting other features from the user input?
## Assignment
[Poetic license](assignment.md)

# Getting started with natural language processing
## Regional topic: European languages and literature and romantic hotels of Europe ❤️
In this section of the curriculum, you will be introduced to one of the most widespread uses of machine learning: natural language processing (NLP). Derived from computational linguistics, this category of artificial intelligence is the bridge between humans and machines via voice or textual communication.
In these lessons we'll learn the basics of NLP by building small conversational bots to learn how machine learning aids in making these conversations more and more 'smart'. You'll travel back in time, chatting with Elizabeth Bennett and Mr. Darcy from Jane Austen's classic novel, **Pride and Prejudice**, published in 1813. Then, you'll further your knowledge by learning about sentiment analysis via hotel reviews in Europe.
![Pride and Prejudice book and tea](images/p&p.jpg)
> Photo by <a href="https://unsplash.com/@elaineh?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Elaine Howlin</a> on <a href="https://unsplash.com/s/photos/pride-and-prejudice?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
## Lessons
1. [Introduction to natural language processing](1-Introduction-to-NLP/README.md)
2. [Common NLP tasks and techniques](2-Tasks/README.md)
3. [Translation and sentiment analysis with machine learning](3-Translation-Sentiment/README.md)
4. TBD
5. TBD
## Credits
These natural language processing lessons were written with ☕ by [Stephen Howell](https://twitter.com/Howell_MSFT)

# Introduction to time series forecasting
![Summary of time series in a sketchnote](../../sketchnotes/ml-timeseries.png)
> Sketchnote by [Tomomi Imura](https://www.twitter.com/girlie_mac)
In this lesson and the following one, you will learn a bit about time series forecasting, an interesting and valuable part of an ML scientist's repertoire that is a bit less well known than other topics. Time series forecasting is a sort of crystal ball: based on the past performance of a variable such as price, you can predict its future potential value.
[![Introduction to time series forecasting](https://img.youtube.com/vi/wGUV_XqchbE/0.jpg)](https://youtu.be/wGUV_XqchbE "Introduction to time series forecasting")
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/39/)
It's a useful and interesting field with real value to business, given its direct application to problems of pricing, inventory, and supply chain issues. While deep learning techniques have started to be used to gain more insights in the prediction of future performance, time series forecasting remains a field greatly informed by classic ML techniques.
> Penn State's useful time series curriculum can be found [here](https://online.stat.psu.edu/stat510/lesson/1)
### Introduction
Suppose you maintain an array of smart parking meters that provide data about how often they are used and for how long over time. What if you could generate revenue to maintain your streets by slightly raising meter prices when there is greater demand for them? What if you could predict, based on a meter's past performance, its future value according to the laws of supply and demand? This is a challenge that could be tackled by time series forecasting. It wouldn't make those folks in search of a rare parking spot in busy times very happy to have to pay more for it, but it would be a sure way to generate revenue to clean the streets!
Let's explore some of the types of time series algorithms and start a notebook to clean and prepare some data. The data you will analyze is taken from the GEFCom2014 forecasting competition. It consists of 3 years of hourly electricity load and temperature values between 2012 and 2014. Given the historical patterns of electricity load and temperature, you can predict future values of electricity load. In this example, you'll learn how to forecast one time step ahead, using historical load data only.
Before starting, however, it's useful to understand what's going on behind the scenes.
## Some definitions
When encountering the term 'time series', you need to understand its use in several different contexts.
### Time series
In mathematics, "a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time." An example of a time series is the daily closing value of the [Dow Jones Industrial Average](https://wikipedia.org/wiki/Time_series). The use of time series plots and statistical modeling is frequently encountered in signal processing, weather forecasting, earthquake prediction, and other fields where events occur and data points can be plotted over time.
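As a tiny illustration, such an equally spaced series maps naturally onto a pandas `Series` with a datetime index; the closing values below are invented:

```python
import pandas as pd

# three consecutive daily closing values, indexed in time order
closes = pd.Series(
    [16441.35, 16469.99, 16530.94],
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-04"]),
)
print(closes)
```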
### Time series analysis
Time series analysis is the analysis of the above mentioned time series data. Time series data can take distinct forms, including 'interrupted time series' which detects patterns in a time series' evolution before and after an interrupting event. The type of analysis needed for the time series depends on the nature of the data. Time series data itself can take the form of series of numbers or characters.
The analysis can be performed using a variety of methods, including frequency-domain and time-domain, linear and nonlinear, and more. [Learn more](https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm) about the many ways to analyze this type of data.
### Time series forecasting
Time series forecasting is the use of a model to predict future values based on patterns displayed by previously gathered data as it occurred in the past. While it is possible to use regression models to explore time series data, with time indices as x variables on a plot, this type of data is best analyzed using special types of models.
Time series data is a list of ordered observations, unlike data that can be analyzed by linear regression. The most common such model is ARIMA, an acronym that stands for "Autoregressive Integrated Moving Average".
The data might display an abrupt change that might need further analysis.
✅ Here is a [sample time series plot](https://www.kaggle.com/kashnitsky/topic-9-part-1-time-series-analysis-in-python) showing daily in-game currency spent over a few years. Can you identify any of the characteristics listed above in this data?
![In-game currency spend](./images/currency.png)
## Getting started with power usage data
In the next lesson, you will create an ARIMA model to create some forecasts.
---
## 🚀Challenge
Make a list of all the industries and areas of inquiry you can think of that would benefit from time series forecasting. Can you think of an application of these techniques in the arts? In Econometrics? Ecology? Retail? Industry? Finance? Where else?
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/40/)
## Review & Self Study
Although we won't cover them here, neural networks are sometimes used to enhance classic methods of time series forecasting. Read more about them [in this article](https://medium.com/microsoftazure/neural-networks-for-forecasting-financial-and-economic-time-series-6aca370ff412)
## Assignment
[Visualize some more time series](assignment.md)

# Time series forecasting with ARIMA
In the previous lesson, you learned a bit about time series forecasting and loaded a dataset showing the fluctuations of electrical load over a time period.
[![Introduction to ARIMA](https://img.youtube.com/vi/IUSk-YDau10/0.jpg)](https://youtu.be/IUSk-YDau10 "Introduction to ARIMA")
> 🎥 Click the image above for a video: A brief introduction to ARIMA models. The example is done in R, but the concepts are universal.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/41/)
In this lesson, you will discover a specific way to build models with [ARIMA: *A*uto*R*egressive *I*ntegrated *M*oving *A*verage](https://wikipedia.org/wiki/Autoregressive_integrated_moving_average). ARIMA models are particularly suited to fit data that shows [non-stationarity](https://wikipedia.org/wiki/Stationary_process).
@ -13,7 +13,7 @@ In this lesson, you will discover a specific way to build models with [ARIMA: *A
> 🎓 [Differencing](https://wikipedia.org/wiki/Autoregressive_integrated_moving_average#Differencing) data, again from a statistical context, refers to the process of transforming non-stationary data to make it stationary by removing its non-constant trend. "Differencing removes the changes in the level of a time series, eliminating trend and seasonality and consequently stabilizing the mean of the time series." [Paper by Shixiong et al](https://arxiv.org/abs/1904.07632)
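As a minimal sketch, first-order differencing is a one-liner in pandas; the series values here are our own:

```python
import pandas as pd

s = pd.Series([3, 5, 9, 15])  # a short series with an upward trend
print(s.diff())               # NaN, 2.0, 4.0, 6.0 - each value minus the previous one
```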
Let's unpack the parts of ARIMA to better understand how it helps us model time series and make predictions against them.
## AR - for AutoRegressive
Autoregressive models, as the name implies, look 'back' in time to analyze previous values in your data and make assumptions about them. These previous values are called 'lags'. An example would be data that shows monthly sales of pencils. Each month's sales total would be considered an 'evolving variable' in the dataset. This model is built as the "evolving variable of interest is regressed on its own lagged (i.e., prior) values." [wikipedia](https://wikipedia.org/wiki/Autoregressive_integrated_moving_average)
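As a compact preview (not this lesson's notebook code), fitting such a model with the statsmodels library might look like the sketch below; the synthetic pencil-sales series and the `order` values are our own placeholders:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# a synthetic monthly pencil-sales series: a trend plus noise
rng = np.random.default_rng(0)
sales = 100 + np.arange(36) * 2 + rng.normal(0, 5, 36)

model = ARIMA(sales, order=(1, 1, 1))  # (AR lags, differencing, MA window)
fit = model.fit()
print(fit.forecast(steps=3))           # predict the next three months
```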
![a time series model](images/accuracy.png)
🏆 A very nice plot, showing a model with good accuracy. Well done!
---
## 🚀Challenge
Dig into the ways to test the accuracy of a time series model. We touch on MAPE in this lesson, but are there other methods you could use? Research them and annotate them. A helpful document can be found [here](https://otexts.com/fpp2/accuracy.html)
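As a starting point, MAPE itself takes only a few lines of numpy; the toy arrays below are our own:

```python
import numpy as np

def mape(actual, predicted):
    # mean absolute percentage error; assumes `actual` contains no zeros
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs((actual - predicted) / actual)) * 100

print(mape([100, 200, 300], [110, 190, 310]))  # ≈ 6.11
```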

# Introduction to time series forecasting
## Regional topic: worldwide electricity usage ✨
In these two lessons, you will be introduced to time series forecasting, a somewhat lesser known area of machine learning that is nevertheless extremely valuable for industry and business applications, among other fields. While neural networks can be used to enhance the utility of these models, we will study them in the context of classical machine learning, as these models help predict future performance based on the past.
Our regional focus is electrical usage in the world, an interesting dataset to learn about forecasting future power usage based on patterns of past load. You can see how this kind of forecasting can be extremely helpful in a business environment.
![electric grid](images/electric-grid.jpg)
Photo by <a href="https://unsplash.com/@shutter_log?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Peddi Sai hrithik</a> of electrical towers on a road in Rajasthan on <a href="https://unsplash.com/s/photos/electric-india?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
## Lessons
1. [Introduction to time series forecasting](1-Introduction/README.md)
2. [Building ARIMA time series models](2-ARIMA/README.md)
## Credits
"Time Series Forecasting" was written with ⚡️ by [Francesca Lazzeri](https://twitter.com/frlazzeri) and [Jen Looper](https://twitter.com/jenlooper)
"Introduction to time series forecasting" was written with ⚡️ by [Francesca Lazzeri](https://twitter.com/frlazzeri) and [Jen Looper](https://twitter.com/jenlooper)

### Prerequisites and Setup
In this lesson, we will be experimenting with some code in Python. You should be able to run the Jupyter Notebook code from this lesson, either on your computer or somewhere in the cloud.
You can open [the lesson notebook](notebook.ipynb) and continue reading the material there, or continue reading here, and run the code in your favorite Python environment.
> **Note:** If you are opening this code from the cloud, you also need to fetch the [`rlboard.py`](rlboard.py) file, which is used in the notebook code. Add it to the same directory as the notebook.
## Introduction
**Reinforcement Learning** (RL) is a learning technique that allows us to learn an optimal behavior of an **agent** in some **environment** by running many experiments. An agent in this environment should have some **goal**, defined by a **reward function**.
## The environment
For simplicity, let's consider Peter's world to be a square board of size `width` x `height`, like this:
Each cell in this board can either be:
* **ground**, on which Peter and other creatures can walk
* **water**, on which you obviously cannot walk
* a **tree** or **grass**, a place where you can rest
* an **apple**, which represents something Peter would be glad to find in order to feed himself
* a **wolf**, which is dangerous and should be avoided
There is a separate Python module, [`rlboard.py`](rlboard.py), which contains the code to work with this environment. Because this code is not important for understanding our concepts, we will just import the module and use it to create the sample board (code block 1):
```python
# earlier setup (imports and board size) elided by the diff
m = Board(width,height)
m.randomize(seed=13)
m.plot()
```
This code should print a picture of the environment similar to the one above.
## Actions and policy
In our example, Peter's goal would be to find an apple, while avoiding the wolf and other obstacles. To do this, he can essentially walk around until he finds an apple. Therefore, at any position he can choose between one of the following actions: up, down, left and right. We will define those actions as a dictionary, and map them to pairs of corresponding coordinate changes. For example, moving right (`R`) would correspond to a pair `(1,0)`. (code block 2)
```python
actions = { "U" : (0,-1), "D" : (0,1), "L" : (-1,0), "R" : (1,0) }
action_idx = { a : i for i,a in enumerate(actions.keys()) }
```
The strategy of our agent (Peter) is defined by a so-called **policy**. A policy is a function that returns the action at any given state. In our case, the state of the problem is represented by the board, including the current position of the player.
The goal of reinforcement learning is to eventually learn a good policy that will allow us to solve the problem efficiently. However, as a baseline, let's consider the simplest policy called **random walk**.
## Random walk
Let's first solve our problem by implementing a random walk strategy. With random walk, we will randomly choose the next action from the allowed actions, until we reach the apple (code block 3).
```python
def random_policy(m):
    # choose one of the four actions at random, ignoring the current state
    return random.choice(list(actions))

# walk(m,policy,start_position=None) is defined in the lesson's notebook;
# it runs the given policy until Peter reaches the apple
walk(m,random_policy)
```
The call to `walk` should return the length of the corresponding path, which can vary from one run to another. We can run the walk experiment a number of times (say, 100), and print the resulting statistics (code block 4):
```python
def print_statistics(policy):
    # body elided by the diff: run the walk experiment 100 times and
    # print the average resulting path length
    ...

print_statistics(random_policy)
```
Note that the average length of a path is around 30-40 steps, which is quite a lot, given the fact that the average distance to the nearest apple is around 5-6 steps.
You can also see what Peter's movement looks like during the random walk:
![Peter's Random Walk](images/random_walk.gif)
## Reward function
To make our policy more intelligent, we need to understand which moves are "better" than others. To do this, we need to define our goal. The goal can be defined in terms of a **reward function**, which will return some score value for each state. The higher the number, the better the reward function. (code block 5)
```python
move_reward = -0.1
# goal and penalty reward values elided by the diff

def reward(m,pos=None):
    # body elided: a large positive score is returned at the apple and a
    # large negative score at the wolf or water; an ordinary move returns:
    return move_reward
```
An interesting thing about reward functions is that in most cases, *we are only given a substantial reward at the end of the game*. This means that our algorithm should somehow remember "good" steps that lead to a positive reward at the end, and increase their importance. Similarly, all moves that lead to bad results should be discouraged.
## Q-Learning
An algorithm that we will discuss here is called **Q-Learning**. In this algorithm, the policy is defined by a function (or a data structure) called a **Q-Table**. It records the "goodness" of each of the actions in a given state.
It is called a Q-Table because it is often convenient to represent it as a table, or multi-dimensional array. Since our board has dimensions `width` x `height`, we can represent the Q-Table using a numpy array with shape `width` x `height` x `len(actions)`: (code block 6)
```python
Q = np.ones((width,height,len(actions)),dtype=float)*1.0/len(actions) # plain float: np.float was removed in newer numpy
```
Notice that we initialize all the values of the Q-Table with an equal value, in our case - 0.25. This corresponds to the "random walk" policy, because all moves in each state are equally good. We can pass the Q-Table to the `plot` function in order to visualize the table on the board: `m.plot(Q)`.
![Peter's Environment](images/env_init.png)
## Essence of Q-Learning: Bellman Equation
Once we start moving, each action will have a corresponding reward, i.e. we can theoretically select the next action based on the highest immediate reward. However, in most states, the move will not achieve our goal of reaching the apple, and thus we cannot immediately decide which direction is better.
> Remember that it is not the immediate result that matters, but rather the final result, which we will obtain at the end of the simulation.
In order to account for this delayed reward, we need to use the principles of **[dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming)**, which allow us to think about our problem recursively.
Suppose we are now at the state *s*, and we want to move to the next state *s'*. By doing so, we will receive the immediate reward *r(s,a)*, defined by the reward function, plus some future reward. If we suppose that our Q-Table correctly reflects the "attractiveness" of each action, then at state *s'* we will choose the action *a'* that corresponds to the maximum value of *Q(s',a')*. Thus, the best possible future reward we could get at state *s* will be defined as `max`<sub>a'</sub>*Q(s',a')* (the maximum here is computed over all possible actions *a'* at state *s'*).
This gives the **Bellman formula** for calculating the value of the Q-Table at state *s*, given action *a*:
<img src="images/bellmaneq.gif"/>
Here γ is the so-called **discount factor** that determines to which extent you should prefer the current reward over the future reward and vice versa.
## Learning Algorithm
Given the equation above, we can now write pseudo-code for our learning algorithm:
6. *s* ← *s'*
7. Update the total reward and decrease α.
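In Python, the core update behind this pseudo-code might look like the sketch below; the function name and the default values for α and γ are our own illustration, not the lesson's notebook code:

```python
def bellman_update(Q, s, a, s_next, r, alpha=0.3, gamma=0.9):
    # blend the old estimate for action a at state s with the observed
    # reward plus the discounted best future value at the next state
    x, y = s
    nx, ny = s_next
    Q[x, y, a] = (1 - alpha) * Q[x, y, a] + alpha * (r + gamma * Q[nx, ny].max())
```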
## Exploit vs. explore
In the algorithm above, we did not specify how exactly we should choose an action at step 2.1. If we are choosing the action randomly, we will randomly **explore** the environment, and we are quite likely to die often as well as explore areas where we would not normally go. An alternative approach would be to **exploit** the Q-Table values that we already know, and thus to choose the best action (with the highest Q-Table value) at state *s*. This, however, will prevent us from exploring other states, and it's likely we might not find the optimal solution.
Thus, the best approach is to strike a balance between exploration and exploitation. This can be done by choosing the action at state *s* with probabilities proportional to values in the Q-Table. In the beginning, when Q-Table values are all the same, it would correspond to a random selection, but as we learn more about our environment, we would be more likely to follow the optimal route while allowing the agent to choose the unexplored path once in a while.
## Python implementation
We are now ready to implement the learning algorithm. Before we do that, we also need some function that will convert arbitrary numbers in the Q-Table into a vector of probabilities for corresponding actions: (code block 7)
```python
def probs(v,eps=1e-4):
    # (reconstructed) shift values up so they are all positive, then
    # normalize them into a probability distribution; eps keeps
    # zero-valued entries selectable
    v = v - v.min() + eps
    return v / v.sum()

# training loop, body elided by the diff: each epoch runs one simulated
# walk, applying the Bellman update to the Q-Table and counting steps in n
for epoch in range(5000):
    ...
    n+=1
```
After executing this algorithm, the Q-Table should be updated with values that define the attractiveness of different actions at each step. We can try to visualize the Q-Table by plotting a vector at each cell that will point in the desired direction of movement. For simplicity, we draw a small circle instead of an arrow head.
<img src="images/learned.png"/>
## Checking the policy
Since the Q-Table lists the "attractiveness" of each action at each state, it is quite easy to use it to define the efficient navigation in our world. In the simplest case, we can select the action corresponding to the highest Q-Table value: (code block 9)
```python
def qpolicy_strict(m):
    # body elided by the diff: look up the Q-Table row for the current
    # position and return the action with the highest value
    ...

walk(m,qpolicy_strict)
```
## Navigation
A better navigation policy would be the one that we used during training, which combines exploitation and exploration. In this policy, we will select each action with a certain probability, proportional to the values in the Q-Table. This strategy may still result in the agent returning back to a position it has already explored, but, as you can see from the code below, it results in a very short average path to the desired location (remember that `print_statistics` runs the simulation 100 times): (code block 10)
```python
def qpolicy(m):
    # body elided by the diff: sample the next action with probability
    # proportional to its Q-Table values at the current position
    ...

print_statistics(qpolicy)
```

After running this code, you should get a much smaller average path length than before.
## Investigating the learning process
As we have mentioned, the learning process is a balance between exploration and the exploitation of knowledge gained about the structure of the problem space. We have seen that the results of learning (the ability to help an agent find a short path to the goal) have improved, but it is also interesting to observe how the average path length behaves during the learning process:
<img src="images/lpathlen1.png"/>
What we see here is that at first, the average path length increases. This is probably due to the fact that when we know nothing about the environment, we are likely to get trapped in bad states, water or wolf. As we learn more and start using this knowledge, we can explore the environment for longer, but we still do not know where the apples are very well.
Once we learn enough, it becomes easier for the agent to achieve the goal, and the path length starts to decrease. However, we are still open to exploration, so we often diverge away from the best path, and explore new options, making the path longer than optimal.
What we also observe on this graph is that at some point, the length increased abruptly. This indicates the stochastic nature of the process, and that we can at some point "spoil" the Q-Table coefficients by overwriting them with new values. This ideally should be minimized by decreasing learning rate (for example, towards the end of training, we only adjust Q-Table values by a small value).
Overall, it is important to remember that the success and quality of the learning process significantly depends on parameters, such as learning rate, learning rate decay, and discount factor. Those are often called **hyperparameters**, to distinguish them from **parameters**, which we optimize during training (for example, Q-Table coefficients). The process of finding the best hyperparameter values is called **hyperparameter optimization**, and it deserves a separate topic.
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/44/)
## Assignment

[A More Realistic World](assignment.md)

# A More Realistic World
In our situation, Peter was able to move around almost without getting tired or hungry. In a more realistic world, he has to sit down and rest from time to time, and also to feed himself. Let's make our world more realistic by implementing the following rules:
1. By moving from one place to another, Peter loses **energy** and gains some **fatigue**.
2. Peter can gain more energy by eating apples.

# CartPole Skating
The problem we have been solving in the previous lesson might seem like a toy problem, not really applicable to real-life scenarios. This is not the case, because many real-world problems share this scenario - including playing chess or Go. They are similar because we also have a board with given rules and a **discrete state**.
In this lesson we will apply the same principles of Q-Learning to a problem with **continuous state**, i.e. a state that is given by one or more real numbers. We will deal with the following problem:
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/45/)
> **Problem**: If Peter wants to escape from the wolf, he needs to be able to move faster. We will see how Peter can learn to skate, in particular, to keep balance, using Q-Learning.
We will use a simplified version of balancing known as a **CartPole** problem. In the cartpole world, we have a horizontal slider that can move left or right, and the goal is to balance a vertical pole on top of the slider.
<img alt="a cartpole" src="images/cartpole.png" width="200"/>
## Prerequisites
In this lesson, we will be using a library called **OpenAI Gym** to simulate different **environments**. You can run this lesson's code locally (e.g. from Visual Studio Code), in which case the simulation will open in a new window. When running the code online, you may need to make some tweaks to the code, as described [here](https://towardsdatascience.com/rendering-openai-gym-envs-on-binder-and-google-colab-536f99391cc7).
## OpenAI Gym
In the previous lesson, the rules of the game and the state were given by the `Board` class which we defined ourselves. Here we will use a special **simulation environment**, which will simulate the physics behind the balancing pole. One of the most popular simulation environments for training reinforcement learning algorithms is [Gym](https://gym.openai.com/), which is maintained by [OpenAI](https://openai.com/). By using this gym we can create different **environments**, from a cartpole simulation to Atari games.
> **Note**: You can see other environments available from OpenAI Gym [here](https://gym.openai.com/envs/#classic_control).
First, let's install the gym and import required libraries (code block 1):
```python
import sys
!{sys.executable} -m pip install gym
import gym
import matplotlib.pyplot as plt
import numpy as np
import random
```
## A cartpole environment
To work with a cartpole balancing problem, we need to initialize the corresponding environment. Each environment is associated with:
* **Observation space** that defines the structure of the information that we receive from the environment. For the cartpole problem, we receive the position of the pole, its velocity, and some other values.
* **Action space** that defines possible actions. In our case the action space is discrete, and consists of two actions - **left** and **right**. (code block 2)
```python
env = gym.make("CartPole-v1")
print(env.action_space)
print(env.observation_space)
print(env.action_space.sample())
```
To see how the environment works, let's run a short simulation for 100 steps. At each step, we provide one of the actions to be taken - in this simulation we just randomly select an action from `action_space`. Run the code below and see what it leads to.
> **Note**: Remember that it is preferred to run this code on a local Python installation! (code block 3)
```python
env.reset()
for i in range(100):
   env.render()
   env.step(env.action_space.sample())
env.close()
```
You should see something similar to this:
![non-balancing cartpole](images/cartpole-nobalance.gif)
During the simulation, we need to get observations in order to decide how to act. In fact, the `step` function returns current observations, a reward value, and a `done` flag that indicates whether it makes sense to continue the simulation or not: (code block 4)
```python
env.reset()
done = False
while not done:
   env.render()
   obs, rew, done, info = env.step(env.action_space.sample())
   print(f"{obs} -> {rew}")
env.close()
```
You will end up seeing something like this in the notebook output:
```text
[ 0.03403272 -0.24301182 0.02669811 0.2895829 ] -> 1.0
[ 0.02917248 -0.04828055 0.03248977 0.00543839] -> 1.0
[ 0.02820687 0.14636075 0.03259854 -0.27681916] -> 1.0
[ 0.03113408 0.34100283 0.02706215 -0.55904489] -> 1.0
[ 0.03795414 0.53573468 0.01588125 -0.84308041] -> 1.0
...
[ 0.17299878 0.15868546 -0.20754175 -0.55975453] -> 1.0
[ 0.17617249 0.35602306 -0.21873684 -0.90998894] -> 1.0
```
The observation vector that is returned at each step of the simulation contains the following values:
* Position of cart
* Velocity of cart
* Angle of pole
* Rotation rate of pole
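For readability, you can unpack the observation into named variables; the names below are our own labels, not part of the Gym API:

```python
# obs is the observation returned by env.step(...) above; the variable names are ours
cart_position, cart_velocity, pole_angle, pole_rotation_rate = obs
```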
We can get the min and max values of those numbers: (code block 5)
```python
print(env.observation_space.low)
print(env.observation_space.high)
```
You may also notice that the reward value on each simulation step is always 1. This is because our goal is to survive as long as possible, i.e. keep the pole in a reasonably vertical position for the longest period of time.
> In fact, the CartPole simulation is considered solved if we manage to get an average reward of 195 over 100 consecutive trials.
## State discretization
In Q-Learning, we need to build a Q-Table that defines what to do at each state. To be able to do this, we need the state to be **discrete** - more precisely, it should contain a finite number of discrete values. Thus, we need to somehow **discretize** our observations, mapping them to a finite set of states.
There are a few ways we can do this:
* If we know the interval of a certain value, we can divide this interval into a number of **bins**, and then replace the value by the number of the bin that it belongs to. This can be done using the numpy [`digitize`](https://numpy.org/doc/stable/reference/generated/numpy.digitize.html) method. In this case, we will know the state size precisely, because it will depend on the number of bins we select for digitization.
* We can use linear interpolation to bring values to some finite interval (say, from -20 to 20), and then convert the numbers to integers by rounding them. This gives us a bit less control over the size of the state, especially if we do not know the exact ranges of the input values. For example, in our case 2 out of 4 values do not have upper/lower bounds, which may result in an infinite number of states.
In our example, we will go with the second approach. As you may notice later, despite the undefined upper/lower bounds, those variables rarely take values outside of certain finite intervals, so states with extreme values will be very rare.
Here is a function that takes the observation from our model and produces a tuple of 4 integer values: (code block 6)
```python
def discretize(x):
    # scale each component to a coarser grid, then truncate to an integer
    return tuple((x/np.array([0.25, 0.25, 0.01, 0.1])).astype(int))
```
Let's also explore another discretization method using bins: (code block 7)
```python
def create_bins(i, num):
    # split the interval i = (lo, hi) into num equal bins and return the bin edges
    return np.arange(num+1)*(i[1]-i[0])/num+i[0]

print("Sample bins for interval (-5,5) with 10 bins\n", create_bins((-5,5), 10))

ints = [(-5,5), (-2,2), (-0.5,0.5), (-2,2)] # intervals of values for each parameter
nbins = [20, 20, 10, 10] # number of bins for each parameter
bins = [create_bins(ints[i], nbins[i]) for i in range(4)]

def discretize_bins(x):
    # map each observation component to its bin number
    return tuple(np.digitize(x[i], bins[i]) for i in range(4))
```
Let's now run a short simulation and observe those discrete environment values. Feel free to try both `discretize` and `discretize_bins` and see if there is a difference.
> **Note**: `discretize_bins` returns the bin number, which is 0-based, so for values of the input variable around 0 it returns the number from the middle of the interval (10). In `discretize`, we did not care about the range of output values, allowing them to be negative, so the state values are not shifted, and 0 corresponds to 0. (code block 8)
```python
env.reset()

done = False
while not done:
    #env.render()
    obs, rew, done, info = env.step(env.action_space.sample())
    #print(discretize_bins(obs))
    print(discretize(obs))
env.close()
```
> **Note**: Uncomment the line starting with `env.render` if you want to see how the environment executes. Otherwise you can execute it in the background, which is faster. We will use this "invisible" execution during our Q-Learning process.
## The Q-Table structure
In our previous lesson, the state was a simple pair of numbers from 0 to 8, and thus it was convenient to represent the Q-Table by a numpy tensor with a shape of 8x8x2. If we use bins discretization, the size of our state vector is also known, so we can use the same approach and represent the state by an array of shape 20x20x10x10x2 (here 2 is the dimension of the action space, and the first dimensions correspond to the number of bins we have selected for each of the parameters in the observation space). A quick sketch of this array-based alternative follows.
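Here is a minimal sketch of that array-based alternative; it assumes the `nbins` values chosen above and is not used further in this lesson:

```python
# Q-Table as a numpy array: one cell per (binned state, action) combination
Q_array = np.zeros(nbins + [env.action_space.n])  # shape (20, 20, 10, 10, 2)
state = (3, 5, 2, 7)                   # a hypothetical discretized state (bin numbers)
best_action = Q_array[state].argmax()  # greedy action for that state
```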
However, sometimes the precise dimensions of the observation space are not known. In the case of the `discretize` function, we may never be sure that our state stays within certain limits, because some of the original values are not bounded. Thus, we will use a slightly different approach and represent the Q-Table by a dictionary.
We will use the pair *(state,action)* as the dictionary key, and the value will correspond to the Q-Table entry value. (code block 9)
```python
Q = {}
actions = (0, 1)

def qvalues(state):
    return [Q.get((state, a), 0) for a in actions]
```
Here we also define a function `qvalues`, which returns a list of Q-Table values for a given state that corresponds to all possible actions. If the entry is not present in the Q-Table, we will return 0 as the default.
## Let's start Q-Learning!
Now we are ready to teach Peter to balance! First, let's set some hyperparameters: (code block 10)
```python
# hyperparameters
alpha = 0.3
gamma = 0.9
epsilon = 0.90
```
Here, `alpha` is the **learning rate** that defines to what extent we should adjust the current values of the Q-Table at each step. In the previous lesson we started with 1, and then decreased `alpha` to lower values during training. In this example we will keep it constant just for simplicity, and you can experiment with adjusting `alpha` values later.
`gamma` is the **discount factor** that shows to what extent we should prioritize future reward over current reward.
`epsilon` is the **exploration/exploitation factor** that determines whether we should prefer exploration over exploitation or vice versa. In our algorithm, we will select the next action according to the Q-Table values in an `epsilon` fraction of cases, and in the remaining cases we will execute a random action. This will allow us to explore areas of the search space that we have never seen before.
✅ In terms of balancing - choosing a random action (exploration) would act as a random punch in the wrong direction, and the pole would have to learn how to recover its balance from those "mistakes".
We can also make two improvements to our algorithm from the previous lesson:
* We will calculate the average cumulative reward over a number of simulations. We will print the progress every 5000 iterations, and we will average out our cumulative reward over that period of time. It means that if we get more than 195 points, we can consider the problem solved, with even higher quality than required.
* We will calculate the maximum average cumulative result, `Qmax`, and we will store the Q-Table corresponding to that result. When you run the training you will notice that sometimes the average cumulative result starts to drop, and we want to keep the values of the Q-Table that correspond to the best model observed during training.
We will also collect all cumulative rewards at each simulation in the `rewards` vector for further plotting. (code block 11)
```python
def probs(v, eps=1e-4):
    # convert a vector of Q-values into a probability distribution
    v = v - v.min() + eps
    v = v / v.sum()
    return v

Qmax = 0
cum_rewards = []
rewards = []
for epoch in range(100000):
    obs = env.reset()
    done = False
    cum_reward = 0
    # == do the simulation ==
    while not done:
        s = discretize(obs)
        if random.random() < epsilon:
            # exploitation - choose the action according to Q-Table probabilities
            v = probs(np.array(qvalues(s)))
            a = random.choices(actions, weights=v)[0]
        else:
            # exploration - randomly choose an action
            a = np.random.randint(env.action_space.n)
        obs, rew, done, info = env.step(a)
        cum_reward += rew
        ns = discretize(obs)
        Q[(s,a)] = (1 - alpha) * Q.get((s,a), 0) + alpha * (rew + gamma * max(qvalues(ns)))
    cum_rewards.append(cum_reward)
    rewards.append(cum_reward)
    # == periodically print results and calculate average reward ==
    if epoch % 5000 == 0:
        print(f"{epoch}: {np.average(cum_rewards)}, alpha={alpha}, epsilon={epsilon}")
        if np.average(cum_rewards) > Qmax:
            Qmax = np.average(cum_rewards)
            Qbest = Q
        cum_rewards = []
```
What you may notice from those results:
* We are very close to achieving the goal of getting a cumulative reward of 195 over 100+ consecutive runs of the simulation, or we may have actually achieved it! Even if we get smaller numbers, we still do not know for sure, because we average over 5000 runs, while only 100 runs are required by the formal criterion.
* Sometimes the rewards start to drop, which means that we can "destroy" already-learned values in the Q-Table with ones that make the situation worse.
This is more clearly visible if we plot training progress.
## Plotting training progress
During training, we have collected the cumulative reward value at each iteration in the `rewards` vector. Here is how it looks when we plot it against the iteration number:
```python
plt.plot(rewards)
```
![raw progress](images/train_progress_raw.png)
From this graph, it is not possible to tell anything because, due to the nature of the stochastic training process, the length of training sessions varies greatly. To make more sense of this graph, we can calculate the **running average** over a series of experiments, let's say 100. This can be done conveniently using `np.convolve`: (code block 12)
```python
def running_average(x, window):
    return np.convolve(x, np.ones(window)/window, mode='valid')

plt.plot(running_average(rewards, 100))
```
![training progress](images/train_progress_runav.png)
## Varying hyperparameters
To make learning more stable, it makes sense to adjust some of our hyperparameters during training. In particular:
* For the **learning rate**, `alpha`, we may start with values close to 1, and then keep decreasing the parameter. With time, we will be getting good probability values in the Q-Table, and thus we should be adjusting them slightly, and not overwriting them completely with new values.
* We may want to increase `epsilon` slowly, in order to explore less and exploit more. It probably makes sense to start with a lower value of `epsilon` and move up to almost 1. A sketch of such schedules follows this list.
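For example, here is a minimal sketch of such schedules, applied once per training epoch; the specific decay and growth rates are illustrative assumptions, not values from this lesson:

```python
# Inside the training loop, once per epoch (the rates are just an example)
alpha = max(0.05, alpha * 0.9999)       # slowly decay the learning rate
epsilon = min(0.99, epsilon * 1.00001)  # slowly shift from exploration to exploitation
```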
> **Task 1**: Play with the hyperparameter values and see if you can achieve a higher cumulative reward. Are you getting above 195?
> **Task 2**: To formally solve the problem, you need to get a 195 average reward across 100 consecutive runs. Measure that during training and make sure that you have formally solved the problem! One way to measure it is sketched below.
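A hedged sketch of such a measurement, using a sliding window of the last 100 episode rewards (the `solved` helper is our own, not part of the lesson's solution):

```python
from collections import deque
import numpy as np

last100 = deque(maxlen=100)  # keeps only the 100 most recent episode rewards

def solved(episode_reward):
    # Record one episode reward; return True once the formal criterion is met
    last100.append(episode_reward)
    return len(last100) == 100 and np.average(last100) >= 195
```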
## Seeing the result in action
It would be interesting to actually see how the trained model behaves. Let's run the simulation and follow the same action selection strategy as during training, sampling according to the probability distribution in the Q-Table: (code block 13)
```python
obs = env.reset()
done = False
while not done:
    s = discretize(obs)
    env.render()
    v = probs(np.array(qvalues(s)))
    a = random.choices(actions, weights=v)[0]
    obs, _, done, _ = env.step(a)
env.close()
```
You should see something like this:
![a balancing cartpole](images/cartpole-balance.gif)
---
## 🚀Challenge
> **Task 3**: Here, we were using the final copy of the Q-Table, which may not be the best one. Remember that we have stored the best-performing Q-Table into the `Qbest` variable! Try the same example with the best-performing Q-Table by copying `Qbest` over to `Q` and see if you notice the difference.
> **Task 4**: Here we were not selecting the best action on each step, but rather sampling with the corresponding probability distribution. Would it make more sense to always select the best action, the one with the highest Q-Table value? This can be done by using the `np.argmax` function to find the action number corresponding to the highest Q-Table value. Implement this strategy (a minimal sketch follows below) and see if it improves the balancing.
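A minimal sketch of that greedy strategy, replacing the sampling lines of code block 13; it reuses the `discretize` and `qvalues` functions defined above:

```python
obs = env.reset()
done = False
while not done:
    s = discretize(obs)
    env.render()
    a = np.argmax(qvalues(s))  # always pick the action with the highest Q-value
    obs, _, done, _ = env.step(a)
env.close()
```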
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/46/)

## Conclusion

We have now learned how to train agents to achieve good results just by providing them with a reward function that defines the desired state of the game, and by giving them an opportunity to intelligently explore the search space. We have successfully applied the Q-Learning algorithm in the cases of discrete and continuous environments, but with discrete actions.

## Review & Self Study

It's important to also study situations where the action space is also continuous, and where the observation space is much more complex, such as an image from the Atari game screen. In those problems we often need to use more powerful machine learning techniques, such as neural networks, in order to achieve good results. Those more advanced topics are the subject of our forthcoming, more advanced AI course.

## Assignment: [Train a Mountain Car](assignment.md)

@ -1,9 +1,43 @@
# Train Mountain Car
[OpenAI Gym](http://gym.openai.com) has been designed in such a way that all environments provide the same API - i.e. the same methods `reset`, `step` and `render`, and the same abstractions of **action space** and **observation space**. Thus it should be possible to adapt the same reinforcement learning algorithms to different environments with minimal code changes.
## A Mountain Car Environment
The [Mountain Car environment](https://gym.openai.com/envs/MountainCar-v0/) contains a car stuck in a valley:
<img src="images/mountaincar.png" width="300"/>
The goal is to get out of the valley and capture the flag, by doing at each step one of the following actions:
| Value | Meaning |
|---|---|
| 0 | Accelerate to the left |
| 1 | Do not accelerate |
| 2 | Accelerate to the right |
The main trick of this problem is, however, that the car's engine is not strong enough to scale the mountain in a single pass. Therefore, the only way to succeed is to drive back and forth to build up momentum.
The observation space consists of just two values:
| Num | Observation | Min | Max |
|-----|--------------|-----|-----|
| 0 | Car Position | -1.2| 0.6 |
| 1 | Car Velocity | -0.07 | 0.07 |
The reward system for the mountain car is rather tricky:
* A reward of 0 is awarded if the agent reaches the flag (position = 0.5) on top of the mountain.
* A reward of -1 is awarded if the position of the agent is less than 0.5.
The episode terminates if the car position is more than 0.5, or if the episode length is greater than 200.
## Instructions
Adapt our reinforcement learning algorithm to solve the mountain car problem. Start with the existing [notebook.ipynb](notebook.ipynb) code, substitute the new environment, change the state discretization functions, and try to make the existing algorithm train with minimal code modifications. Optimize the result by adjusting hyperparameters.
> **Note**: Hyperparameter adjustment is likely to be needed to make the algorithm converge.
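As a starting point, here is a hedged sketch of the environment setup and a bins-based discretization for the two observation values; the bin counts are our assumption, and tuning them is part of the assignment:

```python
import gym
import numpy as np

env = gym.make("MountainCar-v0")
print(env.action_space)       # Discrete(3): left, idle, right
print(env.observation_space)  # position in [-1.2, 0.6], velocity in [-0.07, 0.07]

nbins = [20, 20]  # an assumption - experiment with the number of bins
ints = [(-1.2, 0.6), (-0.07, 0.07)]
bins = [np.linspace(lo, hi, n + 1) for (lo, hi), n in zip(ints, nbins)]

def discretize(x):
    # map (position, velocity) to a pair of bin numbers
    return tuple(np.digitize(x[i], bins[i]) for i in range(2))

print(discretize(env.reset()))
```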
## Rubric
| Criteria | Exemplary | Adequate | Needs Improvement |
| -------- | --------- | -------- | ----------------- |
| | Q-Learning algorithm is successfully adapted from the CartPole example, with minimal code modifications, and is able to solve the problem of capturing the flag in under 200 steps. | A new Q-Learning algorithm has been adopted from the Internet, but is well-documented; or the existing algorithm was adopted, but does not reach the desired results | Student was not able to successfully adapt any algorithm, but has made substantial steps towards a solution (implemented state discretization, Q-Table data structure, etc.) |


@ -1,39 +1,39 @@
# Introduction to reinforcement learning
[![Peter and the Wolf](https://img.youtube.com/vi/Fmi5zHg4QSM/0.jpg)](https://www.youtube.com/watch?v=Fmi5zHg4QSM)
> 🎥 Click the image above to listen to Peter and the Wolf by Prokofiev
## Regional topic: Peter and the Wolf (Russia)
[Peter and the Wolf](https://en.wikipedia.org/wiki/Peter_and_the_Wolf) is a musical fairy tale written by the Russian composer [Sergei Prokofiev](https://en.wikipedia.org/wiki/Sergei_Prokofiev). It is a story about the young pioneer Peter, who bravely goes out of his house to the forest clearing to chase the wolf. In this section, we will train machine learning algorithms that will help Peter:
- **Explore** the surrounding area and build an optimal navigation map
- **Learn** how to use a skateboard and balance on it, in order to move around faster.
## Introduction to reinforcement learning
In previous sections, you have seen two examples of machine learning problems:
* **Supervised**, where we have datasets that suggest sample solutions to the problem we want to solve. [Classification](../4-Classification/README.md) and [regression](../2-Regression/README.md) are supervised learning tasks.
* **Unsupervised**, in which we do not have labeled training data. The main example of unsupervised learning is [Clustering](../5-Clustering/README.md).
In this section, we will introduce you to a new type of learning problem which does not require labeled training data. There are several types of such problems:
* **[Semi-supervised learning](https://wikipedia.org/wiki/Semi-supervised_learning)**, where we have a lot of unlabeled data that can be used to pre-train the model.
* **[Reinforcement learning](https://wikipedia.org/wiki/Reinforcement_learning)**, in which an agent learns how to behave by performing experiments in some simulated environment.
Suppose you want to teach computer to play a game, such as chess, or [Super Mario](https://wikipedia.org/wiki/Super_Mario). For the computer to play a game, we need it to predict which move to make in each of the game states. While this may seem like a classification problem, it is not - because we do not have a dataset with states and corresponding actions. While we may have some data like existing chess matches or recording of players playing Super Mario, it is likely that that data will not sufficiently cover a large enough number of possible states.
Instead of looking for existing game data, **Reinforcement Learning** (RL) is based on the idea of *making the computer play* many times and observing the result. Thus, to apply Reinforcement Learning, we need two things:
1. **An environment** and **a simulator** which allow us to play a game many times. This simulator would define all the game rules as well as possible states and actions.
2. **A reward function**, which would tell us how well we did during each move or game.
The main difference between other types of machine learning and RL is that in RL we typically do not know whether we win or lose until we finish the game. Thus, we cannot say whether a certain move alone is good or not - we only receive a reward at the end of the game. And our goal is to design algorithms that will allow us to train a model under uncertain conditions. We will learn about one RL algorithm called **Q-learning**.
## Lessons
1. [Introduction to reinforcement learning and Q-Learning](1-QLearning/README.md)
2. [Using a gym simulation environment](2-Gym/README.md)
## Credits

@ -1,8 +1,8 @@
# Machine learning in the real world
In this curriculum, you have learned many ways to prepare data for training and create machine learning models. You built a series of classic regression, clustering, classification, natural language processing, and time series models. Congratulations! Now, you might be wondering what it's all for... what are the real world applications for these models?
While a lot of interest in industry has been garnered by AI, which usually leverages deep learning, there are still valuable applications for classical machine learning models. You might even use some of these applications today! In this lesson, you'll explore how eight different industries and subject-matter domains use these types of models to make their applications more performant, reliable, intelligent, and valuable to users.
## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/47/)
@ -10,7 +10,7 @@ While a lot of interest in industry has been garnered by AI, which usually lever
### Credit card fraud detection
We learned about [k-means clustering](../../5-Clustering/2-K-Means/README.md) earlier in the course, but how can it be used to solve problems related to credit card fraud?
K-means clustering comes in handy during a credit card fraud detection technique called **outlier detection**. Outliers, or deviations in observations about a set of data, can tell us if a credit card is being used in a normal capacity or if something unusual is going on. As shown in the paper linked below, you can sort credit card data using a k-means clustering algorithm and assign each transaction to a cluster based on how much of an outlier it appears to be. Then, you can evaluate the riskiest clusters for fraudulent versus legitimate transactions.
@ -20,12 +20,11 @@ https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.680.1195&rep=rep1&type
In wealth management, an individual or firm handles investments on behalf of their clients. Their job is to sustain and grow wealth in the long-term, so it is essential to choose investments that perform well.
One way to evaluate how a particular investment performs is through statistical regression. [Linear regression](../../2-Regression/1-Tools/README.md) is a valuable tool for understanding how a fund performs relative to some benchmark. We can also deduce whether or not the results of the regression are statistically significant, or how much they would affect a client's investments. You could even further expand your analysis using multiple regression, where additional risk factors can be taken into account. For an example of how this would work for a specific fund, check out the paper below on evaluating fund performance using regression.
http://www.brightwoodventures.com/evaluating-fund-performance-using-regression/
## 🎓 Education
### Predicting student behavior
[Coursera](https://coursera.com), an online open course provider, has a great tech blog where they discuss many engineering decisions. In this case study, they plotted a regression line to try to explore any correlation between a low NPS (Net Promoter Score) rating and course retention or drop-off.
@ -34,12 +33,11 @@ https://medium.com/coursera-engineering/controlled-regression-quantifying-the-im
### Mitigating bias
[Grammarly](https://grammarly.com), a writing assistant that checks for spelling and grammar errors, uses sophisticated [natural language processing systems](../../6-NLP/README.md) throughout its products. They published an interesting case study in their tech blog about how they dealt with gender bias in machine learning, which you learned about in our [introductory fairness lesson](../../1-Introduction/3-fairness/README.md).
https://www.grammarly.com/blog/engineering/mitigating-gender-bias-in-autocorrect/
## 👜 Retail
### Personalizing the customer journey
At Wayfair, a company that sells home goods like furniture, helping customers find the right products for their taste and needs is paramount. In this article, engineers from the company describe how they use ML and NLP to "surface the right results for customers". Notably, their Query Intent Engine has been built to use entity extraction, classifier training, asset and opinion extraction, and sentiment tagging on customer reviews. This is a classic use case of how NLP works in online retail.
@ -56,13 +54,13 @@ https://www.zdnet.com/article/how-stitch-fix-uses-machine-learning-to-master-the
### Managing clinical trials
Toxicity in clinical trials is a major concern to drug makers. How much toxicity is tolerable? In this study, analyzing various clinical trial methods led to the development of a new approach for predicting the odds of clinical trial outcomes. Specifically, they were able to use random forest to produce a [classifier](../../4-Classification/README.md) that is able to distinguish between groups of drugs.
https://www.sciencedirect.com/science/article/pii/S2451945616302914
### Hospital readmission management
Hospital care is costly, especially when patients have to be readmitted. This paper discusses a company that uses ML to predict readmission potential using [clustering](../../5-Clustering/README.md) algorithms. These clusters help analysts to "discover groups of readmissions that may share a common cause".
https://healthmanagement.org/c/healthmanagement/issuearticle/hospital-readmissions-and-machine-learning
@ -76,7 +74,7 @@ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7979218/
### Forest management
You learned about [Reinforcement Learning](../../8-Reinforcement/README.md) in previous lessons. It can be very useful when trying to predict patterns in nature. In particular, it can be used to track ecological problems like forest fires and the spread of invasive species. In Canada, a group of researchers used Reinforcement Learning to build forest wildfire dynamics models from satellite images. Using an innovative "spatially spreading process (SSP)", they envisioned a forest fire as "the agent at any cell in the landscape." "The set of actions the fire can take from a location at any point in time includes spreading north, south, east, or west or not spreading.
This approach inverts the usual RL setup since the dynamics of the corresponding Markov Decision Process (MDP) is a known function for immediate wildfire spread." Read more about the classic algorithms used by this group at the link below.
@ -92,7 +90,7 @@ https://druckhaus-hofmann.de/gallery/31-wj-feb-2020.pdf
### ⚡️ Energy Management
In our lessons on [time series forecasting](../../7-TimeSeries/README.md), we invoked the concept of smart parking meters to generate revenue for a town based on understanding supply and demand. This article discusses in detail how clustering, regression and time series forecasting combined to help predict future energy use in Ireland, based off of smart metering.
https://www-cdn.knime.com/sites/default/files/inline-images/knime_bigdata_energy_timeseries_whitepaper.pdf
@ -113,7 +111,6 @@ Detecting fake news has become a game of cat and mouse in today's media. In this
https://www.irjet.net/archives/V7/i6/IRJET-V7I6688.pdf
This article shows how combining different ML domains can produce interesting results that can help stop fake news from spreading and creating real damage; in this case, the impetus was the spread of rumors about COVID treatments that incited mob violence.
### Museum ML
Museums are at the cusp of an AI revolution in which cataloging and digitizing collections and finding links between artifacts is becoming easier as technology advances. Projects such as [In Codice Ratio](https://www.sciencedirect.com/science/article/abs/pii/S0306457321001035#:~:text=1.,studies%20over%20large%20historical%20sources.) are helping unlock the mysteries of inaccessible collections such as the Vatican Archives. But, the business aspect of museums benefits from ML models as well.
@ -138,7 +135,7 @@ Identify another sector that benefits from some of the techniques you learned in
## Review & Self Study
The Wayfair data science team has several interesting videos on how they use ML at their company. It's worth [taking a look](https://www.youtube.com/channel/UCe2PjkQXqOuwkW1gw6Ameuw/videos)!
## Assignment

@ -1,6 +1,11 @@
# Real World Applications of Classic Machine Learning
In this section of the curriculum, you will be introduced to some real-world applications of classical ML. We have scoured the internet to find whitepapers and articles about applications that have used these strategies, avoiding neural networks, deep learning and AI as much as possible. Learn about how ML is used in business systems, ecological applications, finance, arts and culture, and more.
![chess](images/chess.jpg)
> Photo by <a href="https://unsplash.com/@childeye?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Alexis Fauvet</a> on <a href="https://unsplash.com/s/photos/artificial-intelligence?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
## Lesson
1. [Real-World Applications for ML](1-Applications/README.md)


@ -12,7 +12,7 @@
> 🌍 Travel around the world as we explore Machine Learning by means of world cultures 🌍
Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about traditional Machine Learning. In this lesson group, you will learn about what is sometimes called 'classic' ML, using primarily Scikit-learn as a library and avoiding deep learning, which is covered in our forthcoming 'AI for Beginners' curriculum. Pair this curriculum with our forthcoming 'Data Science for Beginners' curriculum, as well!
Travel with us around the world as we apply these classic techniques to data from many areas of the world. Each lesson includes pre- and post-lesson quizzes, written instructions to complete the lesson, a solution, an assignment and more. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.
@ -69,34 +69,38 @@ By ensuring that the content aligns with projects, the process is made more enga
> **A note about quizzes**: All quizzes are contained [in this app](https://jolly-sea-0a877260f.azurestaticapps.net), for 48 total quizzes of three questions each. They are linked from within the lessons but the quiz app can be run locally; follow the instruction in the `quiz-app` folder.
| Lesson Number | Topic | Section | Learning Objectives | Linked Lesson | Author |
| :-----------: | :------------------------------------------------------: | :-------------------------------------------------: | ------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------: | :------------: |
| 01 | Introduction to machine learning | [Introduction](1-Introduction/README.md) | Learn the basic concepts behind machine learning | [lesson](1-Introduction/1-intro-to-ML/README.md) | Muhammad |
| 02 | The history of machine learning | [Introduction](1-Introduction/README.md) | Learn the history underlying this field | [lesson](1-Introduction/2-history-of-ML/README.md) | Jen and Amy |
| 03 | Fairness and machine learning | [Introduction](1-Introduction/README.md) | What are the important philosophical issues around fairness that students should consider when building and applying ML models? | [lesson](1-Introduction/3-fairness/README.md) | Tomomi |
| 04 | Techniques for machine learning | [Introduction](1-Introduction/README.md) | What techniques do ML researchers use to build ML models? | [lesson](1-Introduction/4-techniques-of-ML/README.md) | Chris and Jen |
| 05 | Introduction to regression | [Regression](2-Regression/README.md) | Get started with Python and Scikit-learn for regression models | [lesson](2-Regression/1-Tools/README.md) | Jen |
| 06 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Visualize and clean data in preparation for ML | [lesson](2-Regression/2-Data/README.md) | Jen |
| 07 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build linear and polynomial regression models | [lesson](2-Regression/3-Linear/README.md) | Jen |
| 08 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build a logistic regression model | [lesson](2-Regression/4-Logistic/README.md) | Jen |
| 09 | A Web App 🔌 | [Web App](3-Web-App/README.md) | Build a web app to use your trained model | [lesson](3-Web-App/README.md) | Jen |
| 10 | Introduction to classification | [Classification](4-Classification/README.md) | Clean, prep, and visualize your data; introduction to classification | [lesson](4-Classification/1-Introduction/README.md) | Jen and Cassie |
| 11 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | Introduction to classifiers | [lesson](4-Classification/2-Classifiers-1/README.md) | Jen and Cassie |
| 12 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | More classifiers | [lesson](4-Classification/3-Classifiers-2/README.md) | Jen and Cassie |
| 13 | Delicious Asian and Indian cuisines 🍜 | [Classification](4-Classification/README.md) | Build a recommender web app using your model | [lesson](4-Classification/4-Applied/README.md) | Jen |
| 14 | Introduction to clustering | [Clustering](5-Clustering/README.md) | Clean, prep, and visualize your data; Introduction to clustering | [lesson](5-Clustering/1-Visualize/README.md) | Jen |
| 15 | Exploring Nigerian Musical Tastes 🎧 | [Clustering](5-Clustering/README.md) | Explore the K-Means clustering method | [lesson](5-Clustering/2-K-Means/README.md) | Jen |
| 16 | Introduction to natural language processing ☕️ | [Natural language processing](6-NLP/README.md) | Learn the basics about NLP by building a simple bot | [lesson](6-NLP/1-Introduction-to-NLP/README.md) | Stephen |
| 17 | Common NLP Tasks ☕️ | [Natural language processing](6-NLP/README.md) | Deepen your NLP knowledge by understanding common tasks required when dealing with language structures | [lesson](6-NLP/2-Tasks/README.md) | Stephen |
| 18 | Translation and sentiment analysis ♥️ | [Natural language processing](6-NLP/README.md) | Translation and sentiment analysis with Jane Austen | [lesson](6-NLP/3-Translation-Sentiment/README.md) | Stephen |
| 19 | Romantic hotels of Europe ♥️ | [Natural language processing](6-NLP/README.md) | Sentiment analysis, continued | [lesson]() | Stephen |
| 20 | Introduction to time series forecasting | [Time series](7-TimeSeries/README.md) | Introduction to time series forecasting | [lesson](7-TimeSeries/1-Introduction/README.md) | Francesca |
| 21 | ⚡️ World Power Usage ⚡️ - time series forecasting with ARIMA | [Time series](7-TimeSeries/README.md) | Time series forecasting with ARIMA | [lesson](7-TimeSeries/2-ARIMA/README.md) | Francesca |
| 22 | Introduction to reinforcement learning | [Reinforcement learning](8-Reinforcement/README.md) | Introduction to reinforcement learning with Q-Learning | [lesson](8-Reinforcement/1-QLearning/README.md) | Dmitry |
| 23 | Help Peter avoid the wolf! 🐺 | [Reinforcement learning](8-Reinforcement/README.md) | Reinforcement learning Gym | [lesson](8-Reinforcement/2-Gym/README.md) | Dmitry |
| 24 | Real-World ML scenarios and applications | [ML in the Wild](9-Real-World/README.md) | Interesting and revealing real-world applications of classical ML | [lesson](9-Real-World/1-Applications/README.md) | Team |
## Offline access
You can run this documentation offline by using [Docsify](https://docsify.js.org/#/). Fork this repo, [install Docsify](https://docsify.js.org/#/quickstart) on your local machine, and then in the root folder of this repo, type `docsify serve`. The website will be served on port 3000 on your localhost: `localhost:3000`.
## PDFs
Find a PDF of the curriculum with links [here](pdf/readme.pdf)

@ -0,0 +1,39 @@
- Introduction
  - [Introduction to Machine Learning](../1-Introduction/1-intro-to-ML/README.md)
  - [History of Machine Learning](../1-Introduction/2-history-of-ML/README.md)
  - [ML and Fairness](../1-Introduction/3-fairness/README.md)
  - [Techniques of ML](../1-Introduction/4-techniques-of-ML/README.md)
- Regression
  - [Tools of the Trade](../2-Regression/1-Tools/README.md)
  - [Data](../2-Regression/2-Data/README.md)
  - [Linear Regression](../2-Regression/3-Linear/README.md)
  - [Logistic Regression](../2-Regression/4-Logistic/README.md)
- Build a Web App
  - [Web App](../3-Web-App/1-Web-App/README.md)
- Classification
  - [Intro to Classification](../4-Classification/1-Introduction/README.md)
  - [Classifiers 1](../4-Classification/2-Classifiers-1/README.md)
  - [Classifiers 2](../4-Classification/3-Classifiers-2/README.md)
  - [Applied ML](../4-Classification/4-Applied/README.md)
- Clustering
  - [Visualize your Data](../5-Clustering/1-Visualize/README.md)
  - [K-Means](../5-Clustering/2-K-Means/README.md)
- NLP
  - [Introduction to NLP](../6-NLP/1-Introduction-to-NLP/README.md)
  - [NLP Tasks](../6-NLP/2-Tasks/README.md)
  - [Translation and Sentiment](../6-NLP/3-Translation-Sentiment/README.md)
- Time Series Forecasting
  - [Introduction to Time Series Forecasting](../7-TimeSeries/1-Introduction/README.md)
  - [ARIMA](../7-TimeSeries/2-ARIMA/README.md)
- Reinforcement Learning
  - [Q-Learning](../8-Reinforcement/1-QLearning/README.md)
- Real World ML
  - [Applications](../9-Real-World/1-Applications/README.md)

@ -0,0 +1,9 @@
module.exports = {
  contents: ['docs/_sidebar.md'], // array of "table of contents" file paths
  pathToPublic: 'pdf/readme.pdf', // path where the pdf will be stored
  pdfOptions: {
    margin: { top: '100px', bottom: '100px' }
  }, // reference: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions
  removeTemp: true, // remove the generated .md and .html files or not
  emulateMedia: 'print', // mediaType emulated by puppeteer for rendering the pdf; 'print' by default (reference: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pageemulatemediamediatype)
};


@ -0,0 +1,29 @@
{
  "name": "ml-for-beginners",
  "version": "1.0.0",
  "description": "Machine Learning for Beginners - A Curriculum",
  "main": "index.js",
  "scripts": {
    "convert": "node_modules/.bin/docsify-to-pdf"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/microsoft/ML-For-Beginners.git"
  },
  "keywords": [
    "machine",
    "learning",
    "ml",
    "ai",
    "curriculum"
  ],
  "author": "Jen Looper and team",
  "license": "MIT",
  "bugs": {
    "url": "https://github.com/microsoft/ML-For-Beginners/issues"
  },
  "homepage": "https://github.com/microsoft/ML-For-Beginners#readme",
  "devDependencies": {
    "docsify-to-pdf": "0.0.5"
  }
}


@ -82,11 +82,11 @@
"questionText": "What is an example of a classical ML technique?",
"answerOptions": [
{
"answerText": "Natural Language Processing",
"answerText": "natural language processing",
"isCorrect": "true"
},
{
"answerText": "Deep Learning",
"answerText": "deep learning",
"isCorrect": "false"
},
{
@ -210,7 +210,7 @@
]
},
{
"questionText": "Which event was foundational in the creation and expansion of the field of Artificial Intelligence?",
"questionText": "Which event was foundational in the creation and expansion of the field of artificial intelligence?",
"answerOptions": [
{
"answerText": "Turing Test",
@ -347,48 +347,52 @@
"title": "Tools and Techniques: Pre-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "When building a model, you should:",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
"answerText": "prepare your data, then train your model",
"isCorrect": "true"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "choose a training method, then prepare your data",
"isCorrect": "false"
},
{
"answerText": "c",
"answerText": "tune parameters, then train your model",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "Your data's ___ will impact the quality of your ML model",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "true"
"answerText": "quantity",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "shape",
"isCorrect": "false"
},
{
"answerText": "both of the above",
"isCorrect": "true"
}
]
},
{
"questionText": "q3",
"questionText": "A feature variable is:",
"answerOptions": [
{
"answerText": "a",
"answerText": "a quality of your data",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "a measurable property of your data",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "a row of your data",
"isCorrect": "false"
}
]
@ -400,49 +404,53 @@
"title": "Tools and Techniques: Post-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "You should visualize your data because",
"answerOptions": [
{
"answerText": "a",
"answerText": "you can discover outliers",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "you can discover potential cause for bias",
"isCorrect": "true"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "both of these",
"isCorrect": "true"
}
]
},
{
"questionText": "q2",
"questionText": "Split your data into:",
"answerOptions": [
{
"answerText": "a",
"answerText": "training and turing sets",
"isCorrect": "false"
},
{
"answerText": "training and test sets",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "validation and evaluation sets",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "A common command to start the training process in various ML libraries is:",
"answerOptions": [
{
"answerText": "a",
"answerText": "model.travel",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "model.train",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "model.fit",
"isCorrect": "true"
}
]
}
@ -574,7 +582,7 @@
"isCorrect": "false"
},
{
"answerText": "Scikit-Learn",
"answerText": "Scikit-learn",
"isCorrect": "false"
},
{
@ -1000,7 +1008,7 @@
]
},
{
"questionText": "What does Scikit-Learn's LabelEncoder library do?",
"questionText": "What does Scikit-learn's LabelEncoder library do?",
"answerOptions": [
{
"answerText": "Encodes data alphabetically",
@ -1243,49 +1251,53 @@
"title": "Classification 3: Pre-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "A good initial classifier to try is:",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
"answerText": "Linear SVC",
"isCorrect": "true"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "K-Means",
"isCorrect": "false"
},
{
"answerText": "c",
"answerText": "Logical SVC",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "Regularization controls:",
"answerOptions": [
{
"answerText": "a",
"answerText": "the influence of parameters",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "the influence of training speed",
"isCorrect": "false"
},
{
"answerText": "the influence of outliers",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "K-Neighbors classifier can be used for:",
"answerOptions": [
{
"answerText": "a",
"answerText": "supervised learning",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "unsupervised learning",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "both of these",
"isCorrect": "true"
}
]
}
@ -1296,48 +1308,52 @@
"title": "Classification 3: Post-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "Support-Vector classifiers can be used for",
"answerOptions": [
{
"answerText": "a",
"answerText": "classification",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "regression",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "both of these",
"isCorrect": "true"
}
]
},
{
"questionText": "q2",
"questionText": "Random Forest is a ___ type of classifier",
"answerOptions": [
{
"answerText": "a",
"answerText": "Ensemble",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "Dissemble",
"isCorrect": "false"
},
{
"answerText": "Assemble",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "Adaboost is known for:",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
"answerText": "focusing on the weights of incorrectly classified items",
"isCorrect": "true"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "focusing on outliers",
"isCorrect": "false"
},
{
"answerText": "c",
"answerText": "focusing on incorrect data",
"isCorrect": "false"
}
]
@ -1349,48 +1365,48 @@
"title": "Classification 4: Pre-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "Recommendation systems might be used for",
"answerOptions": [
{
"answerText": "a",
"answerText": "Recommending a good restaurant",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "Recommending fashions to try",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "Both of these",
"isCorrect": "true"
}
]
},
{
"questionText": "q2",
"questionText": "Embedding a model in a web app helps it to be offline-capable",
"answerOptions": [
{
"answerText": "a",
"answerText": "true",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "false",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "Onnx Runtime can be used for",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
"answerText": "Running models in a web app",
"isCorrect": "true"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "Training models",
"isCorrect": "false"
},
{
"answerText": "c",
"answerText": "Hyperparameter tuning",
"isCorrect": "false"
}
]
@ -1402,48 +1418,52 @@
"title": "Classification 4: Post-Lecture Quiz",
"quiz": [
{
"questionText": "q11",
"questionText": "Netron app helps you:",
"answerOptions": [
{
"answerText": "a",
"answerText": "Visualize data",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "Visualize your model's structure",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "Test your web app",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "Convert your Scikit-learn model for use with Onnx using:",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "true"
"answerText": "sklearn-app",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "sklearn-web",
"isCorrect": "false"
},
{
"answerText": "sklearn-onnx",
"isCorrect": "true"
}
]
},
{
"questionText": "q3",
"questionText": "Using your model in a web app is called:",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
"answerText": "inference",
"isCorrect": "true"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "interference",
"isCorrect": "false"
},
{
"answerText": "c",
"answerText": "insurance",
"isCorrect": "false"
}
]
@ -1569,49 +1589,53 @@
"title": "K-Means Clustering: Pre-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "K-Means is derived from:",
"answerOptions": [
{
"answerText": "a",
"answerText": "electrical engineering",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "signal processing",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "computational linguistics",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "A good Silhouette score means:",
"answerOptions": [
{
"answerText": "a",
"answerText": "clusters are well-separated and well-defined",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "there are few clusters",
"isCorrect": "false"
},
{
"answerText": "there are many clusters",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "Variance is:",
"answerOptions": [
{
"answerText": "a",
"answerText": "the average of the squared differences from the mean",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "a problem for clustering if it becomes too high",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "both of these",
"isCorrect": "true"
}
]
}
@ -1622,48 +1646,48 @@
"title": "K-Means Clustering: Post-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "A Voronoi diagram shows:",
"answerOptions": [
{
"answerText": "a",
"answerText": "a cluster's variance",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "a cluster's seed and its region",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "a cluster's inertia",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "Inertia is",
"answerOptions": [
{
"answerText": "a",
"answerText": "a measure of how internally coherent clusters are",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "a measure of how much clusters move",
"isCorrect": "false"
},
{
"answerText": "a measure of cluster quality",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "Using K-Means, you must first determine the value of 'k'",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "true",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "false",
"isCorrect": "false"
}
]
@ -1682,7 +1706,7 @@
"isCorrect": "false"
},
{
"answerText": "Natural Language Processing",
"answerText": "natural language processing",
"isCorrect": "true"
},
{
@ -2351,48 +2375,48 @@
"title": "Reinforcement 1: Pre-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "What is reinforcement learning?",
"answerOptions": [
{
"answerText": "a",
"answerText": "teaching someone something over and over again until they understand",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "a learning technique that deciphers the optimal behavior of an agent in some environment by running many experiments",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "understanding how to run multiple experiments at once",
"isCorrect": "false"
}
]
},
{
"questionText": "q2",
"questionText": "What is a policy?",
"answerOptions": [
{
"answerText": "a",
"answerText": "a function that returns the action at any given state",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "a document that tells you whether or not you can return an item",
"isCorrect": "false"
},
{
"answerText": "a function that is used for a random purpose",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "A reward function returns a score for each state of an environment.",
"answerOptions": [
{
"answerText": "a",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "true",
"isCorrect": "true"
},
{
"answerText": "c",
"answerText": "false",
"isCorrect": "false"
}
]
@ -2404,49 +2428,49 @@
"title": "Reinforcement 1: Post-Lecture Quiz",
"quiz": [
{
"questionText": "q1",
"questionText": "What is Q-Learning?",
"answerOptions": [
{
"answerText": "a",
"answerText": "a mechanism for recording the 'goodness' of each state",
"isCorrect": "false"
},
{
"answerText": "b",
"isCorrect": "true"
"answerText": "an algorithm where the policy is defined by a Q-Table",
"isCorrect": "false"
},
{
"answerText": "c",
"isCorrect": "false"
"answerText": "both of the above",
"isCorrect": "true"
}
]
},
{
"questionText": "q2",
"questionText": "For what values does a Q-Table correspond to the random walk policy?",
"answerOptions": [
{
"answerText": "a",
"answerText": "all equal values",
"isCorrect": "true"
},
{
"answerText": "b",
"answerText": "-0.25",
"isCorrect": "false"
},
{
"answerText": "all different values",
"isCorrect": "false"
}
]
},
{
"questionText": "q3",
"questionText": "It was better to use exploration than exploitation during the learning process in our lesson.",
"answerOptions": [
{
"answerText": "a",
"answerText": "true",
"isCorrect": "false"
},
{
"answerText": "b",
"answerText": "false",
"isCorrect": "true"
},
{
"answerText": "c",
"isCorrect": "false"
}
]
}

@ -1,3 +1,3 @@
All the curriculum's sketchnotes can be downloaded here.
Credits:
Credits: Tomomi Imura