{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "lesson_12-R.ipynb", "provenance": [], "collapsed_sections": [] }, "kernelspec": { "name": "ir", "display_name": "R" }, "language_info": { "name": "R" }, "coopTranslator": { "original_hash": "fab50046ca413a38939d579f8432274f", "translation_date": "2025-09-03T20:31:31+00:00", "source_file": "4-Classification/3-Classifiers-2/solution/R/lesson_12-R.ipynb", "language_code": "zh" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "jsFutf_ygqSx" }, "source": [ "# 构建分类模型:美味的亚洲和印度美食\n" ] }, { "cell_type": "markdown", "metadata": { "id": "HD54bEefgtNO" }, "source": [ "## 美食分类器 2\n", "\n", "在第二节分类课程中,我们将探索`更多方法`来分类类别数据。同时,我们还会学习选择不同分类器所带来的影响。\n", "\n", "### [**课前测验**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/23/)\n", "\n", "### **前置知识**\n", "\n", "我们假设你已经完成了之前的课程,因为我们会继续使用之前学到的一些概念。\n", "\n", "在本课程中,我们需要以下软件包:\n", "\n", "- `tidyverse`: [tidyverse](https://www.tidyverse.org/) 是一个[由 R 包组成的集合](https://www.tidyverse.org/packages),旨在让数据科学更快、更简单、更有趣!\n", "\n", "- `tidymodels`: [tidymodels](https://www.tidymodels.org/) 框架是一个[由 R 包组成的集合](https://www.tidymodels.org/packages),用于建模和机器学习。\n", "\n", "- `themis`: [themis 包](https://themis.tidymodels.org/) 提供了额外的配方步骤,用于处理不平衡数据。\n", "\n", "你可以通过以下命令安装它们:\n", "\n", "`install.packages(c(\"tidyverse\", \"tidymodels\", \"kernlab\", \"themis\", \"ranger\", \"xgboost\", \"kknn\"))`\n", "\n", "或者,下面的脚本会检查你是否已经安装了完成本模块所需的软件包,并在缺少时为你安装它们。\n" ] }, { "cell_type": "code", "metadata": { "id": "vZ57IuUxgyQt" }, "source": [ "suppressWarnings(if (!require(\"pacman\"))install.packages(\"pacman\"))\n", "\n", "pacman::p_load(tidyverse, tidymodels, themis, kernlab, ranger, xgboost, kknn)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "z22M-pj4g07x" }, "source": [ "## **1. 分类图**\n", "\n", "在我们[上一节课](https://github.com/microsoft/ML-For-Beginners/tree/main/4-Classification/2-Classifiers-1)中,我们尝试解决一个问题:如何在多个模型之间进行选择?在很大程度上,这取决于数据的特性以及我们想要解决的问题类型(例如分类或回归)。\n", "\n", "之前,我们学习了使用微软的速查表对数据进行分类的各种选项。Python的机器学习框架Scikit-learn提供了一个类似但更细化的速查表,可以进一步帮助缩小你的估算器(分类器的另一种说法)的选择范围:\n", "\n", "
\n",
" \n",
"
\n",
" \n",
"
\n",
" \n",
"