{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "anaconda-cloud": "", "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.1" }, "colab": { "name": "lesson_14.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "GULATlQXLXyR" }, "source": [ "## Explore K-Means clustering using R and Tidy data principles.\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/29/)\n", "\n", "In this lesson, you will learn how to create clusters using the Tidymodels package and other packages in the R ecosystem (we'll call them friends 🧑🤝🧑), and the Nigerian music dataset you imported earlier. We will cover the basics of K-Means for Clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters and the method you use depends on your data. We will try K-Means as it's the most common clustering technique. Let's get started!\n", "\n", "Terms you will learn about:\n", "\n", "- Silhouette scoring\n", "\n", "- Elbow method\n", "\n", "- Inertia\n", "\n", "- Variance\n", "\n", "### **Introduction**\n", "\n", "[K-Means Clustering](https://wikipedia.org/wiki/K-means_clustering) is a method derived from the domain of signal processing. It is used to divide and partition groups of data into `k clusters` based on similarities in their features.\n", "\n", "The clusters can be visualized as [Voronoi diagrams](https://wikipedia.org/wiki/Voronoi_diagram), which include a point (or 'seed') and its corresponding region.\n", "\n", "
\n",
" \n",
"
\n",
" \n",
"
\n",
" \n",
"
\n",
" \n",
"