# Machine Learning in the Real World

In this curriculum, you have learned many ways to prepare data for training and create machine learning models. You built a series of classic Regression, Clustering, Classification, Natural Language Processing, and Time Series models. Congratulations! Now, you might be wondering what it's all for... what are the real-world applications for these models?

While AI, which usually leverages Deep Learning, has garnered a lot of interest in industry, there are still valuable applications for classical machine learning models. You might even use some of these applications today! In this lesson, you'll explore how eight different industries and subject-matter domains use these types of models to make their applications more performant, reliable, intelligent, and valuable to users.

## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/47/)

## 💰 Finance

### Credit card fraud detection

We learned about [k-means clustering](5-Clustering/2-K-Means/README.md) earlier in the course, but how can it be used to solve problems related to credit card fraud?

K-means clustering comes in handy during a credit card fraud detection technique called **outlier detection**. Outliers, or deviations in observations about a set of data, can tell us if a credit card is being used in a normal capacity or if something unusual is going on. As shown in the paper linked below, you can sort credit card data using a k-means clustering algorithm and assign each transaction to a cluster based on how much of an outlier it appears to be. Then, you can evaluate the riskiest clusters for fraudulent versus legitimate transactions.

https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.680.1195&rep=rep1&type=pdf
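
To make the idea concrete, here is a minimal sketch of distance-based outlier scoring with k-means. It is not the method from the paper: the transaction data is synthetic, the two features (amount and hour of day) are assumptions chosen for illustration, and the 99th-percentile threshold is an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, made-up purchase history: columns are [amount, hour of day].
# Two habits are simulated: small midday purchases and larger evening ones.
rng = np.random.default_rng(0)
history = np.vstack([
    rng.normal([20.0, 12.0], [5.0, 2.0], size=(150, 2)),
    rng.normal([80.0, 20.0], [10.0, 1.5], size=(150, 2)),
])

# Standardize with statistics from the history so that neither feature
# dominates the Euclidean distance.
mean, std = history.mean(axis=0), history.std(axis=0)

def scale(transactions):
    return (np.asarray(transactions, dtype=float) - mean) / std

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scale(history))

def outlier_score(transactions):
    # transform() gives the distance to every centroid; the smallest one
    # is the distance to the closest cluster of typical behavior.
    return kmeans.transform(scale(transactions)).min(axis=1)

# Anything farther from a centroid than 99% of the history is flagged.
threshold = np.percentile(outlier_score(history), 99)

new_transactions = np.array([[22.0, 13.0], [950.0, 3.0]])
scores = outlier_score(new_transactions)
flags = scores > threshold
print(flags)
```

In practice the flagged transactions would go to a human or a second model for review, since being far from a centroid only means unusual, not fraudulent.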

### Wealth management

In wealth management, an individual or firm handles investments on behalf of their clients. Their job is to sustain and grow wealth in the long term, so it is essential to choose investments that perform well.

One way to evaluate how a particular investment performs is through statistical regression. [Linear regression](2-Regression/1-Tools/README.md) is a valuable tool for understanding how a fund performs relative to some benchmark. We can also deduce whether or not the results of the regression are statistically significant, or how much they would affect a client's investments. You could even further expand your analysis using multiple regression, where additional risk factors can be taken into account. For an example of how this would work for a specific fund, check out the paper below on evaluating fund performance using regression.

http://www.brightwoodventures.com/evaluating-fund-performance-using-regression/
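
As a rough illustration of regressing a fund against a benchmark, the sketch below fits a straight line to monthly returns. The numbers are invented for the example and the fund is hypothetical. The slope is commonly called beta (sensitivity to the benchmark) and the intercept alpha (return not explained by the benchmark).

```python
import numpy as np

# Invented monthly returns in percent, for illustration only.
benchmark = np.array([1.2, -0.5, 0.8, 2.1, -1.4, 0.3, 1.7, -0.9, 0.6, 1.1])
fund = np.array([1.6, -0.4, 1.1, 2.6, -1.5, 0.6, 2.2, -0.9, 0.9, 1.5])

# Ordinary least squares fit of: fund = alpha + beta * benchmark.
# polyfit returns coefficients from the highest degree down.
beta, alpha = np.polyfit(benchmark, fund, deg=1)

# R^2: the share of the fund's variance explained by the benchmark.
predicted = alpha + beta * benchmark
ss_res = np.sum((fund - predicted) ** 2)
ss_tot = np.sum((fund - fund.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"alpha={alpha:.2f}, beta={beta:.2f}, R^2={r_squared:.3f}")
```

Extending this to multiple regression would mean adding more columns (other risk factors) and fitting with a tool such as `numpy.linalg.lstsq` or scikit-learn's `LinearRegression`.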

## 🎓 Education

### Predicting student behavior

[Coursera](https://coursera.com), an online open course provider, has a great tech blog where they discuss many engineering decisions. In this case study, they plotted a regression line to try to explore any correlation between a low NPS (Net Promoter Score) rating and course retention or drop-off.

https://medium.com/coursera-engineering/controlled-regression-quantifying-the-impact-of-course-quality-on-learner-retention-31f956bd592a

### Mitigating bias

[Grammarly](https://grammarly.com), a writing assistant that checks for spelling and grammar errors, uses sophisticated [NLP](6-NLP/README.md) throughout its products. They published an interesting case study in their tech blog about how they dealt with gender bias in machine learning, which you learned about in our [introductory fairness lesson](1-Introduction/3-fairness/README.md).

https://www.grammarly.com/blog/engineering/mitigating-gender-bias-in-autocorrect/

## 👜 Retail

### Personalizing the customer journey

At Wayfair, a company that sells home goods like furniture, helping customers find the right products for their taste and needs is paramount. In this article, engineers from the company describe how they use ML and NLP to "surface the right results for customers". Notably, their Query Intent Engine has been built to use entity extraction, classifier training, asset and opinion extraction, and sentiment tagging on customer reviews. This is a classic use case of how NLP works in online retail.

https://www.aboutwayfair.com/tech-innovation/how-we-use-machine-learning-and-natural-language-processing-to-empower-search

### Inventory management

Innovative, nimble companies like [StitchFix](https://stitchfix.com), a box service that ships clothing to consumers, rely heavily on ML for recommendations and inventory management. Their styling teams work together with their merchandising teams, in fact: "one of our data scientists tinkered with a genetic algorithm and applied it to apparel to predict what would be a successful piece of clothing that doesn't exist today. We brought that to the merchandise team and now they can use that as a tool."

https://www.zdnet.com/article/how-stitch-fix-uses-machine-learning-to-master-the-science-of-styling/

## 🏥 Health Care

### Managing clinical trials

Toxicity in clinical trials is a major concern to drug makers. How much toxicity is tolerable? In this study, analyzing various clinical trial methods led to the development of a new approach for predicting the odds of clinical trial outcomes. Specifically, they were able to use random forest to produce a [classifier](4-Classification/README.md) that is able to distinguish between groups of drugs.

https://www.sciencedirect.com/science/article/pii/S2451945616302914
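
The following is a minimal sketch of training a random forest classifier with scikit-learn. It does not reproduce the study: the features and labels come from `make_classification`, a synthetic data generator, and merely stand in for whatever drug properties and outcome groups a real dataset would contain.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 300 samples, 8 numeric features, 2 classes.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hold out a quarter of the samples to estimate how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = forest.score(X_test, y_test)

# Relative contribution of each feature to the forest's splits (sums to 1).
importances = forest.feature_importances_
print(f"held-out accuracy: {accuracy:.2f}")
```

Feature importances are one reason random forests are popular in applied settings: they give a first hint about which inputs drive the prediction.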

### Hospital readmission management

Hospital care is costly, especially when patients have to be readmitted. This paper discusses a company that uses ML to predict readmission potential using [clustering](5-Clustering/README.md) algorithms. These clusters help analysts to "discover groups of readmissions that may share a common cause".

https://healthmanagement.org/c/healthmanagement/issuearticle/hospital-readmissions-and-machine-learning

### Disease management

The recent pandemic has shone a bright light on the ways that machine learning can aid in stopping the spread of disease. In this article, you'll recognize the use of ARIMA, logistic curves, linear regression, and SARIMA. "This work is an attempt to calculate the rate of spread of this virus and thus to predict the deaths, recoveries, and confirmed cases, so that it may help us to prepare better and survive."

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7979218/

## 🌲 Ecology and Green Tech

### Forest management

You learned about [Reinforcement Learning](8-Reinforcement/README.md) in previous lessons. It can be very useful when trying to predict patterns in nature. In particular, it can be used to track ecological problems like forest fires and the spread of invasive species. In Canada, a group of researchers used Reinforcement Learning to build forest wildfire dynamics models from satellite images. Using an innovative "spatially spreading process (SSP)", they envisioned a forest fire as "the agent at any cell in the landscape." "The set of actions the fire can take from a location at any point in time includes spreading north, south, east, or west or not spreading.

This approach inverts the usual RL setup since the dynamics of the corresponding Markov Decision Process (MDP) is a known function for immediate wildfire spread." Read more about the classic algorithms used by this group at the link below.

https://www.frontiersin.org/articles/10.3389/fict.2018.00006/full
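
To picture the "fire as agent" framing, here is a small sketch of the action set described in the quote, applied to a grid. It is only an illustration: the grid size, the `step` function, and the fixed `always_east` policy are hypothetical, whereas the researchers learn the spreading behavior from satellite images.

```python
# The five actions named in the quote, as (row, column) offsets.
ACTIONS = {
    "north": (-1, 0),
    "south": (1, 0),
    "east": (0, 1),
    "west": (0, -1),
    "stay": (0, 0),
}

def step(burning, policy, shape):
    """Let every burning cell take one action; return the new burning set."""
    rows, cols = shape
    next_burning = set(burning)  # in this sketch, burning cells keep burning
    for row, col in burning:
        d_row, d_col = ACTIONS[policy((row, col))]
        target = (row + d_row, col + d_col)
        # Spreading off the edge of the landscape has no effect.
        if 0 <= target[0] < rows and 0 <= target[1] < cols:
            next_burning.add(target)
    return next_burning

def always_east(cell):
    # Hypothetical fixed policy; a learned policy would depend on the cell.
    return "east"

burning = {(1, 0)}
for _ in range(3):
    burning = step(burning, always_east, shape=(3, 4))

print(sorted(burning))
```

Swapping `always_east` for a function that looks at terrain, wind, or vegetation at each cell is where a learned policy would plug in.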

### Motion sensing of animals

While deep learning has created a revolution in visually tracking animal movements (you can build your own [polar bear tracker](https://docs.microsoft.com/learn/modules/build-ml-model-with-azure-stream-analytics/?WT.mc_id=academic-15963-cxa) here), classic ML still has a place in this task.

Sensors that track the movements of farm animals, and IoT more broadly, make use of this type of visual processing, but more basic ML techniques are useful for preprocessing the data. For example, in this paper, sheep postures were monitored and analyzed using various classifier algorithms. You might recognize the ROC curve on page 335.

https://druckhaus-hofmann.de/gallery/31-wj-feb-2020.pdf

### ⚡️ Energy Management

In our lesson on [Time Series](7-TimeSeries/README.md), we invoked the concept of smart parking meters to generate revenue for a town based on understanding supply and demand. This article discusses in detail how clustering, regression, and time series forecasting were combined to help predict future energy use in Ireland, based on smart metering.

https://www-cdn.knime.com/sites/default/files/inline-images/knime_bigdata_energy_timeseries_whitepaper.pdf

## 💼 Insurance

### Volatility Management

MetLife, a life insurance provider, is forthcoming about the way they analyze and mitigate volatility in their financial models. In this article, you'll notice binary and ordinal classification visualizations. You'll also discover forecasting visualizations.

https://investments.metlife.com/content/dam/metlifecom/us/investments/insights/research-topics/macro-strategy/pdf/MetLifeInvestmentManagement_MachineLearnedRanking_070920.pdf

## 🎨 Arts, Culture, and Literature

### Fake news detection

Detecting fake news has become a game of cat and mouse in today's media. In this article, researchers suggest that a system combining several of the ML techniques we have studied can be tested and the best model deployed: "This system is based on natural language processing to extract features from the data and then these features are used for the training of machine learning classifiers such as Naive Bayes, Support Vector Machine (SVM), Random Forest (RF), Stochastic Gradient Descent (SGD), and Logistic Regression(LR)."

https://www.irjet.net/archives/V7/i6/IRJET-V7I6688.pdf

This article shows how combining different ML domains can produce interesting results that can help stop fake news from spreading and creating real damage; in this case, the impetus was the spread of rumors about COVID treatments that incited mob violence.
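
The general shape of such a system (text features first, then several candidate classifiers) can be sketched as below. This is not the researchers' code: the eight headlines are invented, far too few for real training, and only two of the classifiers named in the quote are tried.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy headlines, for illustration only.
texts = [
    "city council approves new budget for road repairs",
    "health ministry publishes vaccination schedule",
    "university releases annual research report",
    "local court schedules hearing next month",
    "miracle cure doctors hate revealed",
    "secret miracle herb cures every disease overnight",
    "shocking secret they do not want you to know",
    "share this miracle remedy before it is deleted",
]
labels = ["real"] * 4 + ["fake"] * 4

candidates = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

headline = "miracle cure secret revealed"
predictions = {}
for name, classifier in candidates.items():
    # TF-IDF turns each text into a weighted word-count vector,
    # which the classifier then learns from.
    model = make_pipeline(TfidfVectorizer(), classifier)
    model.fit(texts, labels)
    predictions[name] = model.predict([headline])[0]

print(predictions)
```

With a realistic dataset, you would compare the candidates on held-out data (for example with `cross_val_score`) before picking one to deploy.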

### Museum ML

Museums are on the cusp of an AI revolution in which cataloging and digitizing collections and finding links between artifacts are becoming easier as technology advances. Projects such as [In Codice Ratio](https://www.sciencedirect.com/science/article/abs/pii/S0306457321001035#:~:text=1.,studies%20over%20large%20historical%20sources.) are helping unlock the mysteries of inaccessible collections such as the Vatican Archives. But the business aspect of museums benefits from ML models as well.

For example, the Art Institute of Chicago built models to predict what audiences are interested in and when they will attend expositions. The goal is to create individualized and optimized visitor experiences each time the user visits the museum. "During fiscal 2017, the model predicted attendance and admissions within 1 percent of accuracy, says Andrew Simnick, senior vice president at the Art Institute."

https://www.chicagobusiness.com/article/20180518/ISSUE01/180519840/art-institute-of-chicago-uses-data-to-make-exhibit-choices

## 🏷 Marketing

### Customer segmentation

The most effective marketing strategies target customers in different ways based on various groupings. This article discusses how clustering algorithms can support differentiated marketing, which helps companies improve brand recognition, reach more customers, and make more money.

https://ai.inqline.com/machine-learning-for-marketing-customer-segmentation/

## 🚀 Challenge

Identify another sector that benefits from some of the techniques you learned in this curriculum, and discover how it uses ML.

## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/48/)

## Review & Self Study

The Wayfair Data Science team has several interesting videos on how they use ML at their company. It's worth [taking a look](https://www.youtube.com/channel/UCe2PjkQXqOuwkW1gw6Ameuw/videos)!

## Assignment

[An ML scavenger hunt](assignment.md)