@ -93,4 +93,6 @@ Here are items to watch and listen to:
[![The history of AI by Amy Boyd](https://img.youtube.com/vi/EJt3_bFYKss/0.jpg)](https://www.youtube.com/watch?v=EJt3_bFYKss "The history of AI by Amy Boyd")
[![The history of AI by Amy Boyd](https://img.youtube.com/vi/EJt3_bFYKss/0.jpg)](https://www.youtube.com/watch?v=EJt3_bFYKss "The history of AI by Amy Boyd")
**Assignment**: [Create a timeline](assignment.md)
@ -187,4 +187,6 @@ In this tutorial, you worked with simple linear regression, rather than univaria
Read more about the concept of Regression and think about what kinds of questions can be answered by this technique. Take this [tutorial](https://docs.microsoft.com/learn/modules/train-evaluate-regression-models?WT.mc_id=academic-15963-cxa) to deepen your understanding.
Read more about the concept of Regression and think about what kinds of questions can be answered by this technique. Take this [tutorial](https://docs.microsoft.com/learn/modules/train-evaluate-regression-models?WT.mc_id=academic-15963-cxa) to deepen your understanding.
**Assignment**: [A different dataset](assignment.md)
@ -149,4 +149,6 @@ Explore the different types of visualization that matplotlib offers. Which types
Take a look at the many ways to visualize data. Make a list of the various libraries available and note which are best for given types of tasks, for example 2D visualizations vs. 3D visualizations. What do you discover?
Take a look at the many ways to visualize data. Make a list of the various libraries available and note which are best for given types of tasks, for example 2D visualizations vs. 3D visualizations. What do you discover?
@ -255,4 +255,6 @@ Test several different variables in this notebook to see how correlation corresp
In this lesson we learned about Linear Regression. There are other important types of Regression. Read about Stepwise, Ridge, Lasso and Elasticnet techniques. A good course to study to learn more is the [Stanford Statistical Learning course](https://online.stanford.edu/courses/sohs-ystatslearning-statistical-learning)
In this lesson we learned about Linear Regression. There are other important types of Regression. Read about Stepwise, Ridge, Lasso and Elasticnet techniques. A good course to study to learn more is the [Stanford Statistical Learning course](https://online.stanford.edu/courses/sohs-ystatslearning-statistical-learning)
@ -256,4 +256,6 @@ There's a lot more to unpack regarding Logistic Regression! But the best way to
Read the first few pages of [this paper from Stanford](https://web.stanford.edu/~jurafsky/slp3/5.pdf) on some practical uses for Logistic Regression. Think about tasks that are better suited for one or the other type of Regression tasks that we have studied up to this point. What would work best?
Read the first few pages of [this paper from Stanford](https://web.stanford.edu/~jurafsky/slp3/5.pdf) on some practical uses for Logistic Regression. Think about tasks that are better suited for one or the other type of Regression tasks that we have studied up to this point. What would work best?
**Assignment**: [Retrying this Regression](assignment.md)
@ -273,6 +273,8 @@ Instead of working in a notebook and importing the model to the Flask app, you c
There are many ways to build a web app to consume ML models. Make a list of the ways you could use JavaScript or Python to build a web app to leverage machine learning. Consider architecture: should the model stay in the app or live in the cloud? If the latter, how would you access it? Draw out an architectural model for an applied ML web solution.
There are many ways to build a web app to consume ML models. Make a list of the ways you could use JavaScript or Python to build a web app to leverage machine learning. Consider architecture: should the model stay in the app or live in the cloud? If the latter, how would you access it? Draw out an architectural model for an applied ML web solution.
**Assignment**: [Try a different model](assignment.md)
@ -312,4 +312,6 @@ Before you apply clustering algorithms, as we have learned, it's a good idea to
[This helpful article](https://www.freecodecamp.org/news/8-clustering-algorithms-in-machine-learning-that-all-data-scientists-should-know/) walks you through the different ways that various clustering algorithms behave, given different data shapes.
[This helpful article](https://www.freecodecamp.org/news/8-clustering-algorithms-in-machine-learning-that-all-data-scientists-should-know/) walks you through the different ways that various clustering algorithms behave, given different data shapes.
**Assignment**: [Research other visualizations for clustering](assignment.md)
## Assignment
[Research other visualizations for clustering](assignment.md)
In this lesson, you will learn how to create clusters using Scikit-Learn and the Nigerian music dataset you imported earlier. We will cover
In this lesson, you will learn how to create clusters using Scikit-Learn and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for Clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common Clustering technique. Let's get started!
- Data variance
- Data variance
- Silhouette Scoring
- Elbow Method
- K-Means for Clustering
### Introduction
### Introduction
@ -17,7 +20,9 @@ In this lesson, you will learn how to create clusters using Scikit-Learn and the
Preparatory steps to start this lesson
Preparatory steps to start this lesson
### Silhouette score
"The value of the Silhouette score varies from -1 to 1. If the score is 1, the cluster is dense and well-separated than other clusters. A value near 0 represents overlapping clusters with samples very close to the decision boundary of the neighboring clusters. A negative score [-1, 0] indicates that the samples might have got assigned to the wrong clusters." - https://dzone.com/articles/kmeans-silhouette-score-explained-with-python-exam
✅ Knowledge Check - use this moment to stretch students' knowledge with open questions
✅ Knowledge Check - use this moment to stretch students' knowledge with open questions
@ -25,9 +30,7 @@ Preparatory steps to start this lesson
## 🚀Challenge
## 🚀Challenge
Spend some time with this notebook, tweaking parameters. Can you improve the accuracy of the model by cleaning the data more (removing outliers, for example)? What else can you do to create better clusters?
Spend some time with this notebook, tweaking parameters. Can you improve the accuracy of the model by cleaning the data more (removing outliers, for example)? What else can you do to create better clusters?
Take a look at Stanford's K-Means Simulator [here](https://stanford.edu/class/engr108/visualizations/kmeans/kmeans.html). You can use this tool to visualize sample data points and determine its centroids. With fresh data, click 'update' to see how long it takes to find convergence. You can edit the data's randomness, numbers of clusters and numbers of centroids. Does this help you get an idea of how the data can be grouped?
Take a look at Stanford's K-Means Simulator [here](https://stanford.edu/class/engr108/visualizations/kmeans/kmeans.html). You can use this tool to visualize sample data points and determine its centroids. With fresh data, click 'update' to see how long it takes to find convergence. You can edit the data's randomness, numbers of clusters and numbers of centroids. Does this help you get an idea of how the data can be grouped?
@ -35,4 +38,6 @@ Take a look at Stanford's K-Means Simulator [here](https://stanford.edu/class/en
Also, take a look at [this handout on k-means](https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
Also, take a look at [this handout on k-means](https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
) from Stanford
) from Stanford
**Assignment**: [Try different clustering methods](assignment.md)
"Those numbers don't mean much to us, so let's get a 'silhouette score' to see the accuracy. \"The value of the Silhouette score varies from -1 to 1. If the score is 1, the cluster is dense and well-separated than other clusters. A value near 0 represents overlapping clusters with samples very close to the decision boundary of the neighboring clusters. A negative score [-1, 0] indicates that the samples might have got assigned to the wrong clusters.\" - https://dzone.com/articles/kmeans-silhouette-score-explained-with-python-exam\n",
"Those numbers don't mean much to us, so let's get a 'silhouette score' to see the accuracy. Our score is in the middle."
@ -185,4 +185,6 @@ Take a task in the prior knowledge check and try to implement it. Test the bot o
In the next few lessons you will learn more about sentiment analysis. Research this interesting technique in articles such as these on [KDNuggets](https://www.kdnuggets.com/tag/nlp)
In the next few lessons you will learn more about sentiment analysis. Research this interesting technique in articles such as these on [KDNuggets](https://www.kdnuggets.com/tag/nlp)
**Assignment**: [Make a bot talk back](assignment.md)
@ -144,4 +144,6 @@ Can you make Marvin even better by extracting other features from the user input
There are many ways to extract sentiment from text. Think of the business applications that might make use of this technique. Think about how it can go awry. Read more about sophisticated enterprise-ready systems that analyze sentiment such as [Azure Text Analysis](https://docs.microsoft.com/en-us/azure/cognitive-services/Text-Analytics/how-tos/text-analytics-how-to-sentiment-analysis?tabs=version-3-1?WT.mc_id=academic-15963-cxa). Test some of the Pride and Prejudice sentences above and see if it can detect nuance.
There are many ways to extract sentiment from text. Think of the business applications that might make use of this technique. Think about how it can go awry. Read more about sophisticated enterprise-ready systems that analyze sentiment such as [Azure Text Analysis](https://docs.microsoft.com/en-us/azure/cognitive-services/Text-Analytics/how-tos/text-analytics-how-to-sentiment-analysis?tabs=version-3-1?WT.mc_id=academic-15963-cxa). Test some of the Pride and Prejudice sentences above and see if it can detect nuance.
@ -150,4 +150,6 @@ Make a list of all the industries and areas of inquiry you can think of that wou
Although we won't cover them here, neural networks are sometimes used to enhance classic methods of Time Series Forecasting. Read more about them [in this article](https://medium.com/microsoftazure/neural-networks-for-forecasting-financial-and-economic-time-series-6aca370ff412)
Although we won't cover them here, neural networks are sometimes used to enhance classic methods of Time Series Forecasting. Read more about them [in this article](https://medium.com/microsoftazure/neural-networks-for-forecasting-financial-and-economic-time-series-6aca370ff412)
**Assignment**: [Visualize some more Time Series](assignment.md)
@ -357,4 +357,6 @@ Dig into the ways to test the accuracy of a Time Series Model. We touch on MAPE
This lesson touches on only the basics of Time Series Forecasting with ARIMA. Take some time to deepen your knowledge by digging into [this repository](https://microsoft.github.io/forecasting/) and its various model types to learn other ways to build Time Series models.
This lesson touches on only the basics of Time Series Forecasting with ARIMA. Take some time to deepen your knowledge by digging into [this repository](https://microsoft.github.io/forecasting/) and its various model types to learn other ways to build Time Series models.
**Assignment**: [A new ARIMA model](assignment.md)
@ -133,4 +133,6 @@ Discover one more sector that benefits from some of the techniques you learned i
Wayfair Data Science group has several interesting videos on how they use ML in their company. It's worth [taking a look](https://www.youtube.com/channel/UCe2PjkQXqOuwkW1gw6Ameuw/videos)!
Wayfair Data Science group has several interesting videos on how they use ML in their company. It's worth [taking a look](https://www.youtube.com/channel/UCe2PjkQXqOuwkW1gw6Ameuw/videos)!
**Assignment**: [A ML scavenger hunt](assignment.md)