From d97bb8e8789287001c4f593d4d5fd12fc732b256 Mon Sep 17 00:00:00 2001
From: Jen Looper
Date: Tue, 18 May 2021 20:19:53 -0400
Subject: [PATCH] kc

---
 Clustering/1-Visualize/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Clustering/1-Visualize/README.md b/Clustering/1-Visualize/README.md
index ae22571c..695a3dd5 100644
--- a/Clustering/1-Visualize/README.md
+++ b/Clustering/1-Visualize/README.md
@@ -12,7 +12,7 @@ Clustering is a type of unsupervised learning that presumes that a dataset is un
 
 In real life, clustering can be used to determine things like market segmentation, determining what age groups buy what items, for example. Another use would be anomaly detection, perhaps to detect fraud from a dataset of credit card transactions. Or you might use clustering to determine tumors in a batch of medical scans. Alternately, you could use it for grouping search results - by shopping links, images, or reviews, for example. Clustering is useful when you have a large dataset that you want to reduce and on which you want to perform more granular analysis, so the technique can be used to learn about data before other models are constructed.
 
-> ✅ Once your data is organized in clusters, you assign it a cluster Id, and this technique can be useful when preserving a dataset's privacy; you can instead refer to a data point by its cluster id, rather than by more revealing identifiable data. Can you think of other reasons why you'd refer to a cluster Id rather than other elements of the cluster to identify it?
+✅ Once your data is organized in clusters, you assign it a cluster Id, and this technique can be useful when preserving a dataset's privacy; you can instead refer to a data point by its cluster id, rather than by more revealing identifiable data. Can you think of other reasons why you'd refer to a cluster Id rather than other elements of the cluster to identify it?
 
 ## Getting started with clustering
 [Scikit-Learn offers a large array](https://scikit-learn.org/stable/modules/clustering.html) of methods to perform clustering. The type you choose will depend on your use case. According to the documentation, each method has various benefits. Here is a simplified table of the methods supported by Scikit-Learn and their appropriate use cases: