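{ "nbformat": 4, "nbformat_minor": 2, "metadata": { "colab": { "name": "lesson_10-R.ipynb", "provenance": [], "collapsed_sections": [] }, "kernelspec": { "name": "ir", "display_name": "R" }, "language_info": { "name": "R" }, "coopTranslator": { "original_hash": "2621e24705e8100893c9bf84e0fc8aef", "translation_date": "2025-09-03T20:41:29+00:00", "source_file": "4-Classification/1-Introduction/solution/R/lesson_10-R.ipynb", "language_code": "hk" } }, "cells": [
{ "cell_type": "markdown", "source": [ "# Build a classification model: Delicious Asian and Indian Cuisines\n" ], "metadata": { "id": "ItETB4tSFprR" } },
{ "cell_type": "markdown", "source": [ "## Introduction to classification: Clean, prep, and visualize your data\n", "\n", "In these four lessons, you will explore a fundamental focus of classic machine learning: *classification*. We will walk through using various classification algorithms with a dataset about all the brilliant cuisines of Asia and India. Hope you're hungry!\n", "\n", "*Celebrate pan-Asian cuisines in these lessons! Image by Jen Looper*\n", "\n", "Classification is a form of [supervised learning](https://wikipedia.org/wiki/Supervised_learning) that has a lot in common with regression techniques. In classification, you train a model to predict which `category` an item belongs to. If machine learning is all about predicting values or names for things by using datasets, then classification generally falls into two groups: *binary classification* and *multiclass classification*.\n", "\n", "Remember:\n", "\n", "- **Linear regression** helped you predict relationships between variables and make accurate predictions about where a new data point would fall in relation to that line. For example, you could predict numeric values such as *what price a pumpkin would be in September vs. December*.\n", "\n", "- **Logistic regression** helped you discover \"binary categories\": at this price point, *is this pumpkin orange or not-orange*?\n", "\n", "Classification uses various algorithms to determine a data point's label or class. Let's work with this cuisine data to see whether, by observing a group of ingredients, we can determine its cuisine of origin.\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/19/)\n", "\n", "### **Introduction**\n", "\n", "Classification is one of the fundamental activities of the machine learning researcher and data scientist. From basic classification of a binary value (\"is this email spam or not?\") to complex image classification and segmentation using computer vision, it is always useful to be able to sort data into classes and ask questions of it.\n", "\n", "To state the process in a more scientific way, your classification method creates a predictive model that enables you to map the relationship between input variables and output variables.\n", "\n", "*Binary vs. multiclass problems for classification algorithms to handle. Infographic by Jen Looper*\n", "\n", "Before starting the process of cleaning our data, visualizing it, and prepping it for our ML tasks, let's learn a bit about the various ways machine learning can be used to classify data.\n", "\n", "Derived from [statistics](https://wikipedia.org/wiki/Statistical_classification), classification using classic machine learning uses features, such as `smoker`, `weight`, and `age`, to determine the *likelihood of developing X disease*. As a supervised learning technique similar to the regression exercises you performed earlier, your data is labeled, and the ML algorithms use those labels to predict the class (label) of each observation and assign it to a group or outcome.\n", "\n", "✅ Take a moment to imagine a dataset about cuisines. What would a multiclass model be able to answer? What would a binary model be able to answer? What if you wanted to determine whether a given cuisine was likely to use fenugreek? What if you wanted to see whether, given a grocery bag full of star anise, artichokes, cauliflower, and horseradish, you could create a typical Indian dish?\n", "\n", "### **Hello 'classifier'**\n", "\n", "The question we want to ask of this cuisine dataset is actually a **multiclass question**, as we have several potential national cuisines to work with. Given a batch of ingredients, which of these many classes will the data fit?\n", "\n", "Tidymodels offers several different algorithms for classifying data, depending on the kind of problem you want to solve. In the next two lessons, you'll learn about several of these algorithms.\n", "\n", "#### **Prerequisite**\n", "\n", "For this lesson, we'll require the following packages to clean, prep, and visualize our data:\n", "\n", "- `tidyverse`: The [tidyverse](https://www.tidyverse.org/) is a [collection of R packages](https://www.tidyverse.org/packages) designed to make data science faster, easier, and more fun!\n", "\n", "- `tidymodels`: The [tidymodels](https://www.tidymodels.org/) framework is a [collection of packages](https://www.tidymodels.org/packages/) for modeling and machine learning.\n", "\n", "- `DataExplorer`: The [DataExplorer package](https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html) is meant to simplify and automate the exploratory data analysis (EDA) process and report generation.\n", "\n", "- `themis`: The [themis package](https://themis.tidymodels.org/) provides extra recipes steps for dealing with unbalanced data.\n", "\n", "You can have them installed as:\n", "\n", "`install.packages(c(\"tidyverse\", \"tidymodels\", \"DataExplorer\", \"themis\", \"here\"))`\n", "\n", "Alternatively, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.\n" ], "metadata": { "id": "ri5bQxZ-Fz_0" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Install pacman if it is missing\r\n", "suppressWarnings(if (!require(\"pacman\")) install.packages(\"pacman\"))\r\n", "\r\n", "# Load the required packages, installing any that are missing\r\n", "pacman::p_load(tidyverse, tidymodels, DataExplorer, themis, here)" ], "outputs": [], "metadata": { "id": "KIPxa4elGAPI" } },
{ "cell_type": "markdown", "source": [ "We'll later load these awesome packages and make them available in our current R session. (This is for mere illustration; `pacman::p_load()` already did that for you.)\n" ], "metadata": { "id": "YkKAxOJvGD4C" } },
{ "cell_type": "markdown", "source": [ "## Exercise - Clean and balance your data\n", "\n", "The first task at hand, before starting this project, is to clean and **balance** your data to get better results.\n", "\n", "Let's meet the data! 🕵️\n" ], "metadata": { "id": "PFkQDlk0GN5O" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Import data\r\n", "df <- read_csv(file = \"https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/4-Classification/data/cuisines.csv\")\r\n", "\r\n", "# View the first 5 rows\r\n", "df %>%\r\n", "  slice_head(n = 5)\r\n" ], "outputs": [], "metadata": { "id": "Qccw7okxGT0S" } },
{ "cell_type": "markdown", "source": [ "Interesting! From the looks of it, the first column is a kind of `id` column. Let's get a little more information about the data.\n" ], "metadata": { "id": "XrWnlgSrGVmR" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Basic information about the data\r\n", "df %>%\r\n", "  introduce()\r\n", "\r\n", "# Visualize basic information above\r\n", "df %>%\r\n", "  plot_intro(ggtheme = theme_light())" ], "outputs": [], "metadata": { "id": "4UcGmxRxGieA" } },
{ "cell_type": "markdown", "source": [ "From the output, we can immediately see that we have `2448` rows and `385` columns, with no missing values. We also have one discrete column, *cuisine*.\n", "\n", "## Exercise - Learning about cuisines\n", "\n", "Now the work starts to become more interesting. Let's discover the distribution of data per cuisine.\n" ], "metadata": { "id": "AaPubl__GmH5" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Count observations per cuisine\r\n", "df %>%\r\n", "  count(cuisine) %>%\r\n", "  arrange(n)\r\n", "\r\n", "# Plot the distribution\r\n", "theme_set(theme_light())\r\n", "df %>%\r\n", "  count(cuisine) %>%\r\n", "  ggplot(mapping = aes(x = n, y = reorder(cuisine, -n))) +\r\n", "  geom_col(fill = \"midnightblue\", alpha = 0.7) +\r\n", "  ylab(\"cuisine\")" ], "outputs": [], "metadata": { "id": "FRsBVy5eGrrv" } },
{ "cell_type": "markdown", "source": [ "There are a finite number of cuisines, but the distribution of data is uneven. You can fix that! Before doing so, explore a little more.\n", "\n", "Next, let's assign each cuisine to its own tibble and find out how much data is available (rows, columns) per cuisine.\n", "\n", "> A [tibble](https://tibble.tidyverse.org/) is a modern data frame.\n", "\n", "*Artwork by @allison_horst*\n" ], "metadata": { "id": "vVvyDb1kG2in" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Create individual tibble for the cuisines\r\n", "thai_df <- df %>%\r\n", "  filter(cuisine == \"thai\")\r\n", "japanese_df <- df %>%\r\n", "  filter(cuisine == \"japanese\")\r\n", "chinese_df <- df %>%\r\n", "  filter(cuisine == \"chinese\")\r\n", "indian_df <- df %>%\r\n", "  filter(cuisine == \"indian\")\r\n", "korean_df <- df %>%\r\n", "  filter(cuisine == \"korean\")\r\n", "\r\n", "\r\n", "# Find out how much data is available per cuisine (rows, columns)\r\n", "cat(\" thai_df:\", dim(thai_df), \"\\n\",\r\n", "    \"japanese_df:\", dim(japanese_df), \"\\n\",\r\n", "    \"chinese_df:\", dim(chinese_df), \"\\n\",\r\n", "    \"indian_df:\", dim(indian_df), \"\\n\",\r\n", "    \"korean_df:\", dim(korean_df))" ], "outputs": [], "metadata": { "id": "0TvXUxD3G8Bk" } },
{ "cell_type": "markdown", "source": [ "## **Exercise - Discovering top ingredients by cuisine using dplyr**\n", "\n", "Now you can dig deeper into the data and learn what the typical ingredients are per cuisine. You should clean out recurrent data that creates confusion between cuisines, so let's learn about this problem.\n", "\n", "Create a function `create_ingredient()` in R that returns an ingredient data frame. This function will start by dropping an unhelpful column and then sort the ingredients by their count.\n", "\n", "The basic structure of a function in R is:\n", "\n", "`myFunction <- function(arglist){`\n", "\n", "**`...`**\n", "\n", "**`return`**`(value)`\n", "\n", "`}`\n", "\n", "A tidy introduction to R functions can be found [here](https://skirmer.github.io/presentations/functions_with_r.html#1).\n", "\n", "Let's get right into it! We'll make use of [dplyr verbs](https://dplyr.tidyverse.org/), which we have been learning in our previous lessons, plus one reshaping function from tidyr. As a recap:\n", "\n", "- `dplyr::select()`: helps you pick which **columns** to keep or exclude.\n", "\n", "- `tidyr::pivot_longer()`: helps you \"lengthen\" data, increasing the number of rows and decreasing the number of columns. (It belongs to tidyr rather than dplyr, and is loaded with the tidyverse.)\n", "\n", "- `dplyr::group_by()` and `dplyr::summarise()`: help you find summary statistics for different groups and put them in a nice table.\n", "\n", "- `dplyr::filter()`: creates a subset of the data containing only the rows that satisfy your conditions.\n", "\n", "- `dplyr::mutate()`: helps you create or modify columns.\n", "\n", "Check out this [*art*-filled learnr tutorial](https://allisonhorst.shinyapps.io/dplyr-learnr/#section-welcome) by Allison Horst, which introduces some useful data wrangling functions in dplyr *(part of the Tidyverse)*.\n" ], "metadata": { "id": "K3RF5bSCHC76" } },
{ "cell_type": "code", "execution_count": null, "source": 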
[ "# Create a function that returns the top ingredients for a cuisine\r\n", "\r\n", "create_ingredient <- function(df){\r\n", "  \r\n", "  # Drop the id column, which is the first column\r\n", "  ingredient_df <- df %>% select(-1) %>%\r\n", "    # Reshape the data to a long format\r\n", "    pivot_longer(!cuisine, names_to = \"ingredients\", values_to = \"count\") %>%\r\n", "    # Count how often each ingredient is used in this cuisine\r\n", "    group_by(ingredients) %>%\r\n", "    summarise(n_instances = sum(count)) %>%\r\n", "    # Keep only the ingredients that are actually used\r\n", "    filter(n_instances != 0) %>%\r\n", "    # Arrange in descending order\r\n", "    arrange(desc(n_instances)) %>%\r\n", "    # Set the factor levels in order of appearance, so plots keep this ordering\r\n", "    mutate(ingredients = factor(ingredients) %>% fct_inorder())\r\n", "  \r\n", "  \r\n", "  return(ingredient_df)\r\n", "} # End of function" ], "outputs": [], "metadata": { "id": "uB_0JR82HTPa" } },
{ "cell_type": "markdown", "source": [ "Now we can use the function to get an idea of the ten most popular ingredients by cuisine. Let's take it out for a spin with `thai_df`.\n" ], "metadata": { "id": "h9794WF8HWmc" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Call create_ingredient and display popular ingredients\r\n", "thai_ingredient_df <- create_ingredient(df = thai_df)\r\n", "\r\n", "thai_ingredient_df %>%\r\n", "  slice_head(n = 10)" ], "outputs": [], "metadata": { "id": "agQ-1HrcHaEA" } },
{ "cell_type": "markdown", "source": [ "In the previous section, we used `geom_col()`. Let's see how you can use `geom_bar()` to create bar charts too. Use `?geom_bar` for further reading.\n" ], "metadata": { "id": "kHu9ffGjHdcX" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Make a bar chart of the most popular ingredients in Thai cuisine\r\n", "thai_ingredient_df %>%\r\n", "  slice_head(n = 10) %>%\r\n", "  ggplot(aes(x = n_instances, y = ingredients)) +\r\n", "  geom_bar(stat = \"identity\", width = 0.5, fill = \"steelblue\") +\r\n", "  xlab(\"\") + ylab(\"\")" ], "outputs": [], "metadata": { "id": "fb3Bx_3DHj6e" } },
{ "cell_type": "markdown", "source": [], "metadata": { "id": "RHP_xgdkHnvM" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Get popular ingredients for Japanese cuisine and make a bar chart\r\n", "create_ingredient(df = japanese_df) %>%\r\n", "  slice_head(n = 10) %>%\r\n", "  ggplot(aes(x = n_instances, y = ingredients)) +\r\n", "  geom_bar(stat = \"identity\", width = 0.5, fill = \"darkorange\", alpha = 0.8) +\r\n", "  xlab(\"\") + ylab(\"\")\r\n" ], "outputs": [], "metadata": { "id": "019v8F0XHrRU" } },
{ "cell_type": "markdown", "source": [ "What about Chinese cuisine?\n" ], "metadata": { "id": "iIGM7vO8Hu3v" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Get popular ingredients for Chinese cuisine and make a bar chart\r\n", "create_ingredient(df = chinese_df) %>%\r\n", "  slice_head(n = 10) %>%\r\n", "  ggplot(aes(x = n_instances, y = ingredients)) +\r\n", "  geom_bar(stat = \"identity\", width = 0.5, fill = \"cyan4\", alpha = 0.8) +\r\n", "  xlab(\"\") + ylab(\"\")" ], "outputs": [], "metadata": { "id": "lHd9_gd2HyzU" } },
{ "cell_type": "markdown", "source": [], "metadata": { "id": "ir8qyQbNH1c7" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Get popular ingredients for Indian cuisine and make a bar chart\r\n", "create_ingredient(df = indian_df) %>%\r\n", "  slice_head(n = 10) %>%\r\n", "  ggplot(aes(x = n_instances, y = ingredients)) +\r\n", "  geom_bar(stat = \"identity\", width = 0.5, fill = \"#041E42FF\", alpha = 0.8) +\r\n", "  xlab(\"\") + ylab(\"\")" ], "outputs": [], "metadata": { "id": "ApukQtKjH5FO" } },
{ "cell_type": "markdown", "source": [ "Finally, plot the Korean ingredients.\n" ], "metadata": { "id": "qv30cwY1H-FM" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Get popular ingredients for Korean cuisine and make a bar chart\r\n", "create_ingredient(df = korean_df) %>%\r\n", "  slice_head(n = 10) %>%\r\n", "  ggplot(aes(x = n_instances, y = ingredients)) +\r\n", "  geom_bar(stat = \"identity\", width = 0.5, fill = \"#852419FF\", alpha = 0.8) +\r\n", "  xlab(\"\") + ylab(\"\")" ], "outputs": [], "metadata": { "id": "lumgk9cHIBie" } },
{ "cell_type": "markdown", "source": [ "From the data visualizations, we can now use `dplyr::select()` to drop the most common ingredients, which create confusion between distinct cuisines.\n", "\n", "Everyone loves rice, garlic, and ginger!\n" ], "metadata": { "id": "iO4veMXuIEta" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Drop id column, rice, garlic and ginger from our original data set\r\n", "df_select <- df %>%\r\n", "  select(-c(1, rice, garlic, ginger))\r\n", "\r\n", "# Display new data set\r\n", "df_select %>%\r\n", "  slice_head(n = 5)" ], "outputs": [], "metadata": { "id": "iHJPiG6rIUcK" } },
{ "cell_type": "markdown", "source": [ "## Preprocessing data using recipes 👩‍🍳👨‍🍳 - Dealing with imbalanced data ⚖️\n", "\n", "*Artwork by @allison_horst*\n", "\n", "Given that this lesson is about cuisines, we have to put `recipes` into context.\n", "\n", "Tidymodels provides yet another neat package: `recipes`, a package for preprocessing data.\n" ], "metadata": { "id": "kkFd-JxdIaL6" } },
{ "cell_type": "markdown", "source": [ "Let's take a look at the distribution of our cuisines again.\n" ], "metadata": { "id": "6l2ubtTPJAhY" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Distribution of cuisines\r\n", "old_label_count <- df_select %>%\r\n", "  count(cuisine) %>%\r\n", "  arrange(desc(n))\r\n", "\r\n", "old_label_count" ], "outputs": [], "metadata": { "id": "1e-E9cb7JDVi" } },
{ "cell_type": "markdown", "source": [ "As you can see, the number of observations per cuisine is quite unequal. There are almost three times as many Korean recipes as Thai recipes. Imbalanced data often has negative effects on model performance. Think about a binary classification: if most of your data belongs to one class, an ML model is going to predict that class more frequently, just because there is more data for it. Balancing the data corrects this skew. Many models perform best when the number of observations per class is equal and, thus, tend to struggle with unbalanced data.\n", "\n", "There are mainly two ways of dealing with imbalanced data sets:\n", "\n", "- adding observations to the minority class: `Over-sampling`, e.g. using a SMOTE algorithm\n", "\n", "- removing observations from the majority class: `Under-sampling`\n", "\n", "Let's now demonstrate how to deal with imbalanced data sets using a `recipe`. A recipe can be thought of as a blueprint that describes what steps should be applied to a data set in order to get it ready for data analysis.\n" ], "metadata": { "id": "soAw6826JKx9" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Load themis package for dealing with imbalanced data\r\n", "library(themis)\r\n", "\r\n", "# step_smote() samples on a factor variable, so encode cuisine as a factor\r\n", "df_select <- df_select %>%\r\n", "  mutate(cuisine = factor(cuisine))\r\n", "\r\n", "# Create a recipe for preprocessing data\r\n", "cuisines_recipe <- recipe(cuisine ~ ., data = df_select) %>%\r\n", "  step_smote(cuisine)\r\n", "\r\n", "cuisines_recipe" ], "outputs": [], "metadata": { "id": "HS41brUIJVJy" } },
{ "cell_type": "markdown", "source": [ "Let's break down our preprocessing steps.\n", "\n", "- The call to `recipe()` with a formula tells the recipe the *roles* of the variables, using the `df_select` data as the reference. For instance, the `cuisine` column has been assigned an `outcome` role, while the rest of the columns have been assigned a `predictor` role.\n", "\n", "- [`step_smote(cuisine)`](https://themis.tidymodels.org/reference/step_smote.html) creates a *specification* of a recipe step that synthetically generates new examples of the minority classes using the nearest neighbors of these cases.\n", "\n", "Now, if we wanted to see the preprocessed data, we'd have to [**`prep()`**](https://recipes.tidymodels.org/reference/prep.html) and [**`bake()`**](https://recipes.tidymodels.org/reference/bake.html) our recipe.\n", "\n", "`prep()`: estimates the required parameters from a training set, which can later be applied to other data sets.\n", "\n", "`bake()`: takes a prepped recipe and applies its operations to any data set.\n" ], "metadata": { "id": "Yb-7t7XcJaC8" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Prep and bake the recipe\r\n", "preprocessed_df <- cuisines_recipe %>%\r\n", "  prep() %>%\r\n", "  bake(new_data = NULL) %>%\r\n", "  relocate(cuisine)\r\n", "\r\n", "# Display data\r\n", "preprocessed_df %>%\r\n", "  slice_head(n = 5)\r\n", "\r\n", "# Quick summary stats\r\n", "preprocessed_df %>%\r\n", "  introduce()" ], "outputs": [], "metadata": { "id": "9QhSgdpxJl44" } },
{ "cell_type": "markdown", "source": [ "Let's now check the distribution of our cuisines and compare it with the imbalanced data.\n" ], "metadata": { "id": "dmidELh_LdV7" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Distribution of cuisines\r\n", "new_label_count <- preprocessed_df %>%\r\n", "  count(cuisine) %>%\r\n", "  arrange(desc(n))\r\n", "\r\n", "list(new_label_count = new_label_count,\r\n", "     old_label_count = old_label_count)" ], "outputs": [], "metadata": { "id": "aSh23klBLwDz" } },
{ "cell_type": "markdown", "source": [ "Yum! The data is nice and clean, balanced, and very delicious 😋!\n", "\n", "> A recipe is normally used as a preprocessor for modeling, where it defines what steps should be applied to a data set in order to get it ready for modeling. In that case, a `workflow()` is typically used (as we have already seen in our previous lessons) instead of manually estimating a recipe.\n", ">\n", "> As such, you don't typically need to **`prep()`** and **`bake()`** recipes when you use tidymodels, but they are helpful functions to have in your toolkit for confirming that recipes are doing what you expect, as in our case.\n", ">\n", "> When you **`bake()`** a prepped recipe with **`new_data = NULL`**, you get back the data that you provided when defining the recipe, but with the preprocessing steps applied.\n", "\n", "Let's now save a copy of this data for use in future lessons:\n" ], "metadata": { "id": "HEu80HZ8L7ae" } },
{ "cell_type": "code", "execution_count": null, "source": [ "# Save preprocessed data\r\n", "write_csv(preprocessed_df, \"../../../data/cleaned_cuisines_R.csv\")" ], "outputs": [], "metadata": { "id": "cBmCbIgrMOI6" } },
{ "cell_type": "markdown", "source": [ "This fresh CSV can now be found in the `data` folder that the path above points to.\n", "\n", "**🚀Challenge**\n", "\n", "This curriculum contains several interesting datasets. Dig through the `data` folders and see if any contain datasets that would be appropriate for binary or multiclass classification. What questions would you ask of these datasets?\n", "\n", "## [**Post-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/20/)\n", "\n", "## **Review & Self Study**\n", "\n", "- Check out the [themis package](https://github.com/tidymodels/themis). What other techniques could we use to deal with imbalanced data?\n", "\n", "- The tidymodels [reference website](https://www.tidymodels.org/start/).\n", "\n", "- H. Wickham and G. Grolemund, [*R for Data Science: Visualize, Model, Transform, Tidy, and Import Data*](https://r4ds.had.co.nz/).\n", "\n", "#### THANK YOU TO:\n", "\n", "[`Allison Horst`](https://twitter.com/allison_horst/) for creating the amazing illustrations that make R more welcoming and engaging. Find more illustrations in her [gallery](https://github.com/allisonhorst/stats-illustrations).\n", "\n", "[Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper) for creating the original Python version of this module ♥️\n", "\n", "*Artwork by @allison_horst*\n" ], "metadata": { "id": "WQs5621pMGwf" } },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.\n" ] } ] }