{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "anaconda-cloud": "", "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.1" }, "colab": { "name": "lesson_14.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "coopTranslator": { "original_hash": "ad65fb4aad0a156b42216e4929f490fc", "translation_date": "2025-09-03T20:17:16+00:00", "source_file": "5-Clustering/2-K-Means/solution/R/lesson_15-R.ipynb", "language_code": "zh" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "GULATlQXLXyR" }, "source": [ "## Explore K-Means clustering using R and Tidy data principles\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/29/)\n", "\n", "In this lesson, you will learn how to create clusters using the Tidymodels package and other packages in the R ecosystem (we'll call them friends 🧑🤝🧑), together with the Nigerian music dataset you imported earlier. We will cover the basics of K-Means clustering. Keep in mind that, as you learned in the earlier lesson, there are many ways to work with clusters, and the method you use depends on your data. We'll try K-Means because it is the most common clustering technique. Let's get started!\n", "\n", "Terms you will learn about:\n", "\n", "- Silhouette scoring\n", "\n", "- Elbow method\n", "\n", "- Inertia\n", "\n", "- Variance\n", "\n", "### **Introduction**\n", "\n", "[K-Means clustering](https://wikipedia.org/wiki/K-means_clustering) is a method derived from the domain of signal processing. It partitions data into `k clusters` based on the similarity of their features.\n", "\n", "The clusters can be visualized as [Voronoi diagrams](https://wikipedia.org/wiki/Voronoi_diagram), in which each cluster consists of a point (or “seed”) and its corresponding region.\n", "\n", "
\n",
"