Scikit-learn spelling audit

4 years ago · b7c3a8ba99
parent 8752679f69
commit b7c3a8ba99
19 changed files with 44 additions and 44 deletions
--- a/1-Introduction/1-intro-to-ML/README.md
+++ b/1-Introduction/1-intro-to-ML/README.md
@@ -21,7 +21,7 @@ Before starting with this curriculum, you need to have your computer set up and
 - **Learn Python**. It's also recommended to have a basic understanding of [Python](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa), a programming language useful for data scientists that we use in this course.
 - **Learn Node.js and JavaScript**. We also use JavaScript a few times in this course when building web apps, so you will need to have [node](https://nodejs.org) and [npm](https://www.npmjs.com/) installed, as well as [Visual Studio Code](https://code.visualstudio.com/) available for both Python and JavaScript development.
 - **Create a GitHub account**. Since you found us here on [GitHub](https://github.com), you might already have an account, but if not, create one and then fork this curriculum to use on your own. (Feel free to give us a star, too 😊)
-- **Explore Scikit-Learn**. Familiarize yourself with [Scikit-Learn]([https://scikit-learn.org/stable/user_guide.html), a set of ML libraries that we reference in these lessons.
+- **Explore Scikit-learn**. Familiarize yourself with [Scikit-learn]([https://scikit-learn.org/stable/user_guide.html), a set of ML libraries that we reference in these lessons.

 ### What is machine learning?

@@ -45,7 +45,7 @@ Although the terms can be confused, machine learning (ML) is an important subset

 ## What you will learn in this course

-In this curriculum, we are going to cover only the core concepts of machine learning that a beginner must know. We cover what we call 'Classical machine learning' primarily using Scikit-Learn, an excellent library many students use to learn the basics.  To understand broader concepts of artificial intelligence or deep learning, a strong fundamental knowledge of machine learning is indispensable, and so we would like to offer it here. 
+In this curriculum, we are going to cover only the core concepts of machine learning that a beginner must know. We cover what we call 'Classical machine learning' primarily using Scikit-learn, an excellent library many students use to learn the basics.  To understand broader concepts of artificial intelligence or deep learning, a strong fundamental knowledge of machine learning is indispensable, and so we would like to offer it here. 

 You will additionally learn the basics of Regression, Classification, Clustering, Natural Language Processing, Time Series Forecasting, and Reinforcement Learning, as well as real-world applications, the history of ML, ML and Fairness, and how to use your model in web apps.

--- a/2-Regression/1-Tools/assignment.md
+++ b/2-Regression/1-Tools/assignment.md
@@ -1,13 +1,13 @@
-# Regression with Scikit-Learn
+# Regression with Scikit-learn

 ## Instructions

-Take a look at the [Linnerud dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_linnerud.html#sklearn.datasets.load_linnerud) in Scikit-Learn. This dataset has multiple [targets](https://scikit-learn.org/stable/datasets/toy_dataset.html#linnerrud-dataset): 'It consists of three excercise (data) and three physiological (target) variables collected from twenty middle-aged men in a fitness club'.
+Take a look at the [Linnerud dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_linnerud.html#sklearn.datasets.load_linnerud) in Scikit-learn. This dataset has multiple [targets](https://scikit-learn.org/stable/datasets/toy_dataset.html#linnerrud-dataset): 'It consists of three excercise (data) and three physiological (target) variables collected from twenty middle-aged men in a fitness club'.

 In your own words, describe how to create a Regression model that would plot the relationship between the waistline and how many situps are accomplished. Do the same for the other datapoints in this dataset.

 ## Rubric

-| Criteria | Exemplary | Adequate | Needs Improvement |
-| -------- | --------- | -------- | ----------------- |
-| Submit a descriptive paragraph         |  Well-written paragraph is submitted         |  A few sentences are submitted        | No description is supplied                  |
+| Criteria                       | Exemplary                           | Adequate                      | Needs Improvement          |
+| ------------------------------ | ----------------------------------- | ----------------------------- | -------------------------- |
+| Submit a descriptive paragraph | Well-written paragraph is submitted | A few sentences are submitted | No description is supplied |
--- a/2-Regression/2-Data/README.md
+++ b/2-Regression/2-Data/README.md
@@ -1,13 +1,13 @@
-# Build a Regression Model using Scikit-Learn: Prepare and Visualize Data
+# Build a regression model using Scikit-learn: prepare and visualize data

-> ![Data Vizualization Infographic](./images/data-visualization.png)
+> ![Data visualization infographic](./images/data-visualization.png)
 > Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)

 ## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/11/)

 ## Introduction

-Now that you are set up with the tools you need to start tackling machine learning model-building with Scikit-Learn, you are ready to start asking questions of your data. As you work with data and apply ML solutions, it's very important to understand how to ask the right question to properly unlock the potentials of your dataset.
+Now that you are set up with the tools you need to start tackling machine learning model building with Scikit-learn, you are ready to start asking questions of your data. As you work with data and apply ML solutions, it's very important to understand how to ask the right question to properly unlock the potentials of your dataset.

 In this lesson, you will learn:

--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -1,4 +1,4 @@
-# Build a Regression Model using Scikit-Learn: Regression Two Ways
+# Build a Regression Model using Scikit-learn: Regression Two Ways

 ![Linear vs Polynomial Regression Infographic](./images/linear-polynomial.png)
 > Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
@@ -43,7 +43,7 @@ As you learned in Lesson 1, the goal of a linear regression exercise is to be ab

 Now that you have an understanding of the math behind this exercise, create a Regression model to see if you can predict which package of pumpkins will have the best pumpkin prices. Someone buying pumpkins for a holiday pumpkin patch might want this information to be able to optimize their purchases of pumpkin packages for the patch.

-Since you'll use Scikit-Learn, there's no reason to do this by hand (although you could!). In the main data-processing block of your lesson notebook, add a library from Scikit-Learn to automatically convert all string data to numbers:
+Since you'll use Scikit-learn, there's no reason to do this by hand (although you could!). In the main data-processing block of your lesson notebook, add a library from Scikit-learn to automatically convert all string data to numbers:

 ```python
 from sklearn.preprocessing import LabelEncoder
@@ -52,7 +52,7 @@ new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit
 new_pumpkins.iloc[:, 0:-1] = new_pumpkins.iloc[:, 0:-1].apply(LabelEncoder().fit_transform)
 ```

-If you look at the new_pumpkins dataframe now, you see that all the strings are now numeric. This makes it harder for you to read but much more intelligible for Scikit-Learn!
+If you look at the new_pumpkins dataframe now, you see that all the strings are now numeric. This makes it harder for you to read but much more intelligible for Scikit-learn!

 Now you can make more educated decisions (not just based on eyeballing a scatterplot) about the data that is best suited to regression.

@@ -189,7 +189,7 @@ X=poly_pumpkins.iloc[:,3:4].values
 y=poly_pumpkins.iloc[:,4:5].values
 ```

-Scikit-Learn includes a helpful API for building polynomial regression models - the `make_pipeline` [API](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html?highlight=pipeline#sklearn.pipeline.make_pipeline). A 'pipeline' is created which is a chain of estimators. In this case, the pipeline includes Polynomial Features, or predictions that form a nonlinear path.
+Scikit-learn includes a helpful API for building polynomial regression models - the `make_pipeline` [API](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html?highlight=pipeline#sklearn.pipeline.make_pipeline). A 'pipeline' is created which is a chain of estimators. In this case, the pipeline includes Polynomial Features, or predictions that form a nonlinear path.

 ```python
 from sklearn.preprocessing import PolynomialFeatures
--- a/2-Regression/3-Linear/assignment.md
+++ b/2-Regression/3-Linear/assignment.md
@@ -2,7 +2,7 @@

 ## Instructions

-In this lesson you were shown how to build a model using both Linear and Polynomial Regression. Using this knowledge, find a dataset or use one of Scikit-Learn's built-in sets to build a fresh model. Explain in your notebook why you chose the technique you did, and demonstrate your model's accuracy. If it is not accurate, explain why.
+In this lesson you were shown how to build a model using both Linear and Polynomial Regression. Using this knowledge, find a dataset or use one of Scikit-learn's built-in sets to build a fresh model. Explain in your notebook why you chose the technique you did, and demonstrate your model's accuracy. If it is not accurate, explain why.

 ## Rubric

--- a/2-Regression/4-Logistic/README.md
+++ b/2-Regression/4-Logistic/README.md
@@ -119,7 +119,7 @@ Now that we have an idea of the relationship between the binary categories of co

 ## Build your model

-Building a model to find these binary classification is surprisingly straightforward in Scikit-Learn.
+Building a model to find these binary classification is surprisingly straightforward in Scikit-learn.

 Select the variables you want to use in your classification model and split the training and test sets:

@@ -240,7 +240,7 @@ Using Seaborn again, plot the model's [Receiving Operating Characteristic](https

 ![ROC](./images/ROC.png)

-Finally, use Scikit-Learn's [`roc_auc_score` API](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html?highlight=roc_auc#sklearn.metrics.roc_auc_score) to compute the actual 'Area Under the Curve' (AUC):
+Finally, use Scikit-learn's [`roc_auc_score` API](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html?highlight=roc_auc#sklearn.metrics.roc_auc_score) to compute the actual 'Area Under the Curve' (AUC):

 ```python
 auc = roc_auc_score(y_test,y_scores[:,1])
--- a/2-Regression/README.md
+++ b/2-Regression/README.md
@@ -12,7 +12,7 @@ The lessons in this section cover types of Regression in the context of machine

 In this series of lessons, you'll discover the difference between Linear vs. Logistic Regression, and when you should use one or the other.

-In this group of lessons, you will get set up to begin machine learning tasks, including configuring Visual Studio code to manage notebooks, the common environment for data scientists. You will discover Scikit-Learn, a library for machine learning, and you will build your first models, focusing on Regression models in this chapter.
+In this group of lessons, you will get set up to begin machine learning tasks, including configuring Visual Studio code to manage notebooks, the common environment for data scientists. You will discover Scikit-learn, a library for machine learning, and you will build your first models, focusing on Regression models in this chapter.

 > There are useful low-code tools that can help you learn about working with Regression models. Try [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa)

--- a/3-Web-App/1-Web-App/README.md
+++ b/3-Web-App/1-Web-App/README.md
@@ -52,7 +52,7 @@ ufos = ufos[(ufos['Seconds'] >= 1) & (ufos['Seconds'] <= 60)]
 ufos.info()
 ```

-Next, import Scikit-Learn's LabelEncoder library to convert the text values for countries to a number. 
+Next, import Scikit-learn's LabelEncoder library to convert the text values for countries to a number. 

 ✅ LabelEncoder encodes data alphabetically
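
Reviewer note on the context line above: the statement that LabelEncoder encodes data alphabetically can be illustrated without the library. The sketch below is plain Python and only approximates the idea (sorted unique values mapped to consecutive integers). The function name `encode_labels` and the sample country codes are hypothetical and do not come from the lesson.

```python
def encode_labels(values):
    """Map each distinct value to its position in sorted (alphabetical) order."""
    classes = sorted(set(values))                      # distinct labels, sorted
    index = {label: i for i, label in enumerate(classes)}
    return classes, [index[v] for v in values]


# Hypothetical sample data, for illustration only.
classes, codes = encode_labels(["us", "gb", "ca", "us", "au"])
print(classes)  # ['au', 'ca', 'gb', 'us']
print(codes)    # [3, 2, 1, 3, 0]
```

Because the integer assigned to a label depends on which labels are present, the same mapping has to be reused at prediction time rather than recomputed on new data.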

--- a/3-Web-App/README.md
+++ b/3-Web-App/README.md
@@ -1,6 +1,6 @@
 # Build a Web App to use your ML Model

-In this section of the curriculum, you will be introduced to an applied ML topic: how to save your Scikit-Learn model as a file that can be used to make predictions within a web application. Once the model is saved, you'll learn how to use it in a web app built in Flask. You'll first create a model using some data that's all about UFO sightings! Then, you'll build a web app that will allow you to input a number of seconds with a latitude and a longitude value to predict which country reported seeing a UFO. 
+In this section of the curriculum, you will be introduced to an applied ML topic: how to save your Scikit-learn model as a file that can be used to make predictions within a web application. Once the model is saved, you'll learn how to use it in a web app built in Flask. You'll first create a model using some data that's all about UFO sightings! Then, you'll build a web app that will allow you to input a number of seconds with a latitude and a longitude value to predict which country reported seeing a UFO. 

 ## Lessons

--- a/4-Classification/1-Introduction/README.md
+++ b/4-Classification/1-Introduction/README.md
@@ -27,13 +27,13 @@ Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification)

 The question we want to ask of this cuisine dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?

-Scikit-Learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.
+Scikit-learn offers several different algorithms to use to classify data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.

 ## Clean and Balance Your Data

 The first task at hand before starting this project is to clean and **balance** your data to get better results. Start with the blank `notebook.ipynb` file ini the root of this folder.

-The first think to install is [imblearn](https://imbalanced-learn.org/stable/). This is a Scikit-Learn package that will allow you to better balance the data (you will learn more about this task in a minute).
+The first think to install is [imblearn](https://imbalanced-learn.org/stable/). This is a Scikit-learn package that will allow you to better balance the data (you will learn more about this task in a minute).

 ```python
 pip install imblearn
--- a/4-Classification/1-Introduction/assignment.md
+++ b/4-Classification/1-Introduction/assignment.md
@@ -2,7 +2,7 @@

 ## Instructions

-In [Scikit-Learn documentation](https://scikit-learn.org/stable/supervised_learning.html) you'll find a large list of ways to classify data. Do a little scavenger hunt in these docs: your goals is to look for classification methods and match a dataset in this curriculum, a question you can ask of it, and a technique of classification. Create a spreadsheet or table in a .doc file and explain how the dataset would work with the classification algorithm.
+In [Scikit-learn documentation](https://scikit-learn.org/stable/supervised_learning.html) you'll find a large list of ways to classify data. Do a little scavenger hunt in these docs: your goals is to look for classification methods and match a dataset in this curriculum, a question you can ask of it, and a technique of classification. Create a spreadsheet or table in a .doc file and explain how the dataset would work with the classification algorithm.

 ## Rubric

--- a/4-Classification/1-Introduction/solution/notebook.ipynb
+++ b/4-Classification/1-Introduction/solution/notebook.ipynb
@@ -9,7 +9,7 @@
  },
  {
   "source": [
-    "Install Imblearn which will enable SMOTE. This is a Scikit-Learn package that helps handle imbalanced data when performing classification. (https://imbalanced-learn.org/stable/)"
+    "Install Imblearn which will enable SMOTE. This is a Scikit-learn package that helps handle imbalanced data when performing classification. (https://imbalanced-learn.org/stable/)"
   ],
   "cell_type": "markdown",
   "metadata": {}
--- a/4-Classification/2-Classifiers-1/README.md
+++ b/4-Classification/2-Classifiers-1/README.md
@@ -75,7 +75,7 @@ Now you are ready to train your model!

 Now that your data is clean and ready for training, you have to decide which algorithm to use for the job. 

-Scikit-Learn groups Classification under Supervised Learning, and in that category you will find many ways to classify. [The variety](https://scikit-learn.org/stable/supervised_learning.html) is quite bewildering at first sight. The following methods all include classification techniques:
+Scikit-learn groups Classification under Supervised Learning, and in that category you will find many ways to classify. [The variety](https://scikit-learn.org/stable/supervised_learning.html) is quite bewildering at first sight. The following methods all include classification techniques:

 - Linear Models
 - Support Vector Machines
@@ -88,10 +88,10 @@ Scikit-Learn groups Classification under Supervised Learning, and in that catego

 You can also use [neural networks to classify](https://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification), but that is outside the scope of this lesson.

-So, which classifier should you choose? Often, running through several and looking for a good result is a way to test. Scikit-Learn offers a [side-by-side comparison](https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html) on a created dataset, comparing KNeighbors, SVC two ways, GaussianProcessClassifier, DecisionTreeClassifier, RandomForestClassifier, MLPClassifier, AdaBoostClassifier, GaussianNB and QuadraticDiscrinationAnalysis, showing the results visualized: 
+So, which classifier should you choose? Often, running through several and looking for a good result is a way to test. Scikit-learn offers a [side-by-side comparison](https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html) on a created dataset, comparing KNeighbors, SVC two ways, GaussianProcessClassifier, DecisionTreeClassifier, RandomForestClassifier, MLPClassifier, AdaBoostClassifier, GaussianNB and QuadraticDiscrinationAnalysis, showing the results visualized: 

 ![comparison of classifiers](images/comparison.png)
-> Plots generated on Scikit-Learn's documentation
+> Plots generated on Scikit-learn's documentation

 > AutoML solves this problem neatly by running these comparisons in the cloud, allowing you to choose the best algorithm for your data. Try it [here](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-15963-cxa)

@@ -116,7 +116,7 @@ Let's train that model. Split your data into training and testing groups:
 X_train, X_test, y_train, y_test = train_test_split(cuisines_feature_df, cuisines_label_df, test_size=0.3)
 ```

-There are many ways to use the LogisticRegression library in Scikit-Learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).  
+There are many ways to use the LogisticRegression library in Scikit-learn. Take a look at the [parameters to pass](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression).  

 According to the docs, "In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the ‘multi_class’ option is set to ‘ovr’, and uses the cross-entropy loss if the ‘multi_class’ option is set to ‘multinomial’. (Currently the ‘multinomial’ option is supported only by the ‘lbfgs’, ‘sag’, ‘saga’ and ‘newton-cg’ solvers.)"

@@ -128,7 +128,7 @@ Use LogisticRegression with a multiclass setting and the liblinear solver to tra

 > 🎓 The 'solver' is defined as "the algorithm to use in the optimization problem". [source](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic%20regressio#sklearn.linear_model.LogisticRegression). 

-Scikit-Learn offers this table to explain how solvers handle different challenges presented by different kinds of data structures:
+Scikit-learn offers this table to explain how solvers handle different challenges presented by different kinds of data structures:

 ![solvers](images/solvers.png)

@@ -203,7 +203,7 @@ print(classification_report(y_test,y_pred))

 ## 🚀Challenge

-In this lesson, you used your cleaned data to build a machine learning model that can predict a national cuisine based on a series of ingredients. Take some time to read through the many options Scikit-Learn provides to classify data. Dig deeper into the concept of 'solver' to understand what goes on behind the scenes.
+In this lesson, you used your cleaned data to build a machine learning model that can predict a national cuisine based on a series of ingredients. Take some time to read through the many options Scikit-learn provides to classify data. Dig deeper into the concept of 'solver' to understand what goes on behind the scenes.

 ## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/22/)
 ## Review & Self Study
--- a/4-Classification/3-Classifiers-2/README.md
+++ b/4-Classification/3-Classifiers-2/README.md
@@ -12,9 +12,9 @@ We have loaded your `notebook.ipynb` file with the cleaned dataset and have divi

 ## A Classification Map

-Previously, you learned about the various options you have when classifying data using Microsoft's cheat sheet. Scikit-Learn offers a similar, but more granular cheat sheet that can further help narrow down your estimators (another term for classifiers):
+Previously, you learned about the various options you have when classifying data using Microsoft's cheat sheet. Scikit-learn offers a similar, but more granular cheat sheet that can further help narrow down your estimators (another term for classifiers):

-![ML Map from Scikit-Learn](images/map.png)
+![ML Map from Scikit-learn](images/map.png)
 > Tip: [visit this map online](https://scikit-learn.org/stable/tutorial/machine_learning_map/) and click along the path to read documentation.

 This map is very helpful once you have a clear grasp of your data, as you can 'walk' along its paths to a decision:
--- a/4-Classification/4-Applied/README.md
+++ b/4-Classification/4-Applied/README.md
@@ -24,7 +24,7 @@ First, train a classification model using the cleaned cuisines dataset we used.
 pip install skl2onnx
 import pandas as pd 
 ```
-You need '[skl2onnx](https://onnx.ai/sklearn-onnx/)' to help convert your Scikit-Learn model to Onnx format.
+You need '[skl2onnx](https://onnx.ai/sklearn-onnx/)' to help convert your Scikit-learn model to Onnx format.

 Then, work with your data in the same way you did in previous lessons:

@@ -48,7 +48,7 @@ y.head()

 ```

-Commence the training routine. We will use the 'SVC' library which has good accuracy. Import the appropriate libraries from Scikit-Learn:
+Commence the training routine. We will use the 'SVC' library which has good accuracy. Import the appropriate libraries from Scikit-learn:

 ```python
 from sklearn.model_selection import train_test_split
--- a/5-Clustering/1-Visualize/README.md
+++ b/5-Clustering/1-Visualize/README.md
@@ -29,7 +29,7 @@ Alternately, you could use it for grouping search results - by shopping links, i
 Deepen your understanding of Clustering techniques in this [Learn module](https://docs.microsoft.com/learn/modules/train-evaluate-cluster-models?WT.mc_id=academic-15963-cxa)
 ## Getting started with clustering

-[Scikit-Learn offers a large array](https://scikit-learn.org/stable/modules/clustering.html) of methods to perform clustering. The type you choose will depend on your use case. According to the documentation, each method has various benefits. Here is a simplified table of the methods supported by Scikit-Learn and their appropriate use cases:
+[Scikit-learn offers a large array](https://scikit-learn.org/stable/modules/clustering.html) of methods to perform clustering. The type you choose will depend on your use case. According to the documentation, each method has various benefits. Here is a simplified table of the methods supported by Scikit-learn and their appropriate use cases:

 | Method name                  | Use case                                                               |
 | :--------------------------- | :--------------------------------------------------------------------- |
@@ -81,7 +81,7 @@ There are over 100 clustering algorithms, and their use depends on the nature of

 **Hierarchical clustering** 

-If an object is classified by its proximity to a nearby object, rather than to one farther away, clusters are formed based on their members' distance to and from other objects. Scikit-Learn's Agglomerative clustering is hierarchical.
+If an object is classified by its proximity to a nearby object, rather than to one farther away, clusters are formed based on their members' distance to and from other objects. Scikit-learn's Agglomerative clustering is hierarchical.

 ![Hierarchical clustering Infographic](./images/hierarchical.png)
 > Infographic by [Dasani Madipalli](https://twitter.com/dasani_decoded)
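
Reviewer note on the hierarchical clustering paragraph in this hunk: the idea of forming clusters from members' distances to one another can be sketched in plain Python. The example below assumes single linkage (distance between the closest pair of members) on 1-D points, which is only one of several linkage choices and is not necessarily what Scikit-learn's Agglomerative clustering uses by default. The function name `single_linkage` and the sample points are hypothetical.

```python
def single_linkage(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical sample data, for illustration only.
groups = single_linkage([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(groups)  # [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

The brute-force pair search is quadratic per merge, which is acceptable for a handful of points but not for a real dataset.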
--- a/5-Clustering/2-K-Means/README.md
+++ b/5-Clustering/2-K-Means/README.md
@@ -6,7 +6,7 @@

 ## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/29/)

-In this lesson, you will learn how to create clusters using Scikit-Learn and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for Clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common Clustering technique. Let's get started!
+In this lesson, you will learn how to create clusters using Scikit-learn and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for Clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common Clustering technique. Let's get started!

 Terms you will learn about:

@@ -145,7 +145,7 @@ for i in range(1, 11):

 > 🎓 Inertia: K-Means algorithms attempt to choose centroids to minimize 'inertia', "a measure of how internally coherent clusters are."[source](https://scikit-learn.org/stable/modules/clustering.html). The value is appended to the wcss variable on each iteration.

-> 🎓 k-means++: In [Scikit-Learn](https://scikit-learn.org/stable/modules/clustering.html#k-means) you can use the 'k-means++' optimization, which "initializes the centroids to be (generally) distant from each other, leading to probably better results than random initialization.
+> 🎓 k-means++: In [Scikit-learn](https://scikit-learn.org/stable/modules/clustering.html#k-means) you can use the 'k-means++' optimization, which "initializes the centroids to be (generally) distant from each other, leading to probably better results than random initialization.
 ### Elbow method

 Previously, you surmised that, because you have targeted 3 song genres, you should choose 3 clusters. But is that the case? Use the 'elbow method' to make sure.
@@ -194,11 +194,11 @@ This model's accuracy is not very good, and the shape of the clusters gives you

 This data is too imbalanced, too little correlated and there is too much variance between the column values, to cluster well. In fact, the clusters that form are probably heavily influenced or skewed by the three genre categories we defined above. That was a learning process!

-In Scikit-Learn's documentation, you can see that a model like this one, with clusters not very well demarcated, has a 'variance' problem:
+In Scikit-learn's documentation, you can see that a model like this one, with clusters not very well demarcated, has a 'variance' problem:

 ![problem models](images/problems.png)

-> Infographic from Scikit-Learn
+> Infographic from Scikit-learn
 ## Variance

 Variance is defined as "the average of the squared differences from the Mean."[source](https://www.mathsisfun.com/data/standard-deviation.html) In the context of this clustering problem, it refers to data that the numbers of our dataset tend to diverge a bit too much from the mean. 
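
Reviewer note on the variance definition quoted above: "the average of the squared differences from the Mean" can be checked with a few lines of Python. This is a minimal sketch of the population variance; the helper name `variance` and the sample values are hypothetical, and the comparison against `statistics.pvariance` is included only as a sanity check.

```python
from statistics import pvariance


def variance(values):
    """Average of the squared differences from the mean (population variance)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical sample data, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(sample))                       # 4.0
print(variance(sample) == pvariance(sample))  # True
```

A column whose values sit far from the mean produces a large result here, which is the sense in which the lesson says the data diverges too much to cluster well.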
--- a/README.md
+++ b/README.md
@@ -12,7 +12,7 @@

 > 🌍 Travel around the world as we explore Machine Learning by means of world cultures 🌍

-Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about traditional Machine Learning. In this lesson group, you will learn about what is sometimes called 'classic' ML, using primarily Scikit-Learn as a library and avoiding deep learning, which is covered in our forthcoming 'AI for Beginners' curriculum. 
+Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about traditional Machine Learning. In this lesson group, you will learn about what is sometimes called 'classic' ML, using primarily Scikit-learn as a library and avoiding deep learning, which is covered in our forthcoming 'AI for Beginners' curriculum. 

 Travel with us around the world as we apply these classic techniques to data from many areas of the world. Each lesson includes pre- and post-lesson quizzes, written instructions to complete the lesson, a solution, an assignment and more. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.

@@ -75,7 +75,7 @@ By ensuring that the content aligns with projects, the process is made more enga
 |      02       |          [Introduction](1-Introduction/README.md)          |           The History of Machine Learning           | Learn the history underlying this field                                                                                         |   [lesson](Introduction/2-history-of-ML/README.md)    |  Jen and Amy   |
 |      03       |          [Introduction](1-Introduction/README.md)          |            Fairness and Machine Learning            | What are the important philosophical issues around fairness that students should consider when building and applying ML models? |     [lesson](1-Introduction/3-fairness/README.md)     |     Tomomi     |
 |      04       |          [Introduction](1-Introduction/README.md)          |           Techniques for Machine Learning           | What techniques do ML researchers use to build ML models?                                                                       | [lesson](1-Introduction/4-techniques-of-ML/README.md) | Chris and Jen  |
-|      05       |                 Introduction to Regression                 |        [Regression](2-Regression/README.md)         | Get started with Python and Scikit-Learn for Regression models                                                                  |       [lesson](2-Regression/1-Tools/README.md)        |      Jen       |
+|      05       |                 Introduction to Regression                 |        [Regression](2-Regression/README.md)         | Get started with Python and Scikit-learn for Regression models                                                                  |       [lesson](2-Regression/1-Tools/README.md)        |      Jen       |
 |      06       |              North American Pumpkin Prices 🎃               |        [Regression](2-Regression/README.md)         | Visualize and clean data in preparation for ML                                                                                  |        [lesson](2-Regression/2-Data/README.md)        |      Jen       |
 |      07       |              North American Pumpkin Prices 🎃               |        [Regression](2-Regression/README.md)         | Build Linear and Polynomial Regression models                                                                                   |       [lesson](2-Regression/3-Linear/README.md)       |      Jen       |
 |      08       |              North American Pumpkin Prices 🎃               |        [Regression](2-Regression/README.md)         | Build a Logistic Regression model                                                                                               |      [lesson](2-Regression/4-Logistic/README.md)      |      Jen       |
--- a/quiz-app/src/assets/translations/en.json
+++ b/quiz-app/src/assets/translations/en.json
@@ -574,7 +574,7 @@
 								"isCorrect": "false"
 							},
 							{
-								"answerText": "Scikit-Learn",
+								"answerText": "Scikit-learn",
 								"isCorrect": "false"
 							},
 							{
@@ -1000,7 +1000,7 @@
 						]
 					},
 					{
-						"questionText": "What does Scikit-Learn's LabelEncoder library do?",
+						"questionText": "What does Scikit-learn's LabelEncoder library do?",
 						"answerOptions": [
 							{
 								"answerText": "Encodes data alphabetically",