{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "anaconda-cloud": "", "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.1" }, "colab": { "name": "lesson_14.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "coopTranslator": { "original_hash": "ad65fb4aad0a156b42216e4929f490fc", "translation_date": "2025-09-06T15:36:34+00:00", "source_file": "5-Clustering/2-K-Means/solution/R/lesson_15-R.ipynb", "language_code": "en" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "GULATlQXLXyR" }, "source": [ "## Explore K-Means clustering using R and Tidy data principles.\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/29/)\n", "\n", "In this lesson, you will learn how to create clusters using the Tidymodels package and other packages in the R ecosystem (we'll call them friends 🧑🤝🧑), along with the Nigerian music dataset you imported earlier. We will cover the basics of K-Means clustering. Remember, as you learned in the previous lesson, there are many ways to work with clusters, and the method you choose depends on your data. We'll try K-Means since it's the most common clustering technique. Let's dive in!\n", "\n", "Terms you will learn about:\n", "\n", "- Silhouette scoring \n", "- Elbow method \n", "- Inertia \n", "- Variance \n", "\n", "### **Introduction**\n", "\n", "[K-Means Clustering](https://wikipedia.org/wiki/K-means_clustering) is a method derived from the field of signal processing. It partitions data into `k` clusters by assigning each observation to the cluster whose center (centroid) is nearest, so that observations with similar features are grouped together.\n", "\n", "The clusters can be visualized as [Voronoi diagrams](https://wikipedia.org/wiki/Voronoi_diagram), in which each cell consists of a point (or 'seed') and its corresponding region: the area closer to that seed than to any other.\n", "\n", "
\n",
"