You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/de/4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb

1300 lines
78 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"nbformat": 4,
"nbformat_minor": 2,
"metadata": {
"colab": {
"name": "lesson_11-R.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "ir",
"display_name": "R"
},
"language_info": {
"name": "R"
},
"coopTranslator": {
"original_hash": "6ea6a5171b1b99b7b5a55f7469c048d2",
"translation_date": "2025-09-04T02:28:50+00:00",
"source_file": "4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb",
"language_code": "de"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"# Erstellen Sie ein Klassifikationsmodell: Köstliche asiatische und indische Küchen\n"
],
"metadata": {
"id": "zs2woWv_HoE8"
}
},
{
"cell_type": "markdown",
"source": [
"## Küchenklassifikatoren 1\n",
"\n",
"In dieser Lektion werden wir eine Vielzahl von Klassifikatoren erkunden, um *eine nationale Küche basierend auf einer Gruppe von Zutaten vorherzusagen.* Dabei lernen wir mehr über einige der Möglichkeiten, wie Algorithmen für Klassifikationsaufgaben eingesetzt werden können.\n",
"\n",
"### [**Quiz vor der Vorlesung**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/21/)\n",
"\n",
"### **Vorbereitung**\n",
"\n",
"Diese Lektion baut auf unserer [vorherigen Lektion](https://github.com/microsoft/ML-For-Beginners/blob/main/4-Classification/1-Introduction/solution/lesson_10-R.ipynb) auf, in der wir:\n",
"\n",
"- Eine sanfte Einführung in Klassifikationen anhand eines Datensatzes über die großartigen Küchen Asiens und Indiens 😋 gemacht haben.\n",
"\n",
"- Einige [dplyr-Verben](https://dplyr.tidyverse.org/) erkundet haben, um unsere Daten vorzubereiten und zu bereinigen.\n",
"\n",
"- Wunderschöne Visualisierungen mit ggplot2 erstellt haben.\n",
"\n",
"- Demonstriert haben, wie man mit unausgewogenen Daten umgeht, indem man sie mit [recipes](https://recipes.tidymodels.org/articles/Simple_Example.html) vorverarbeitet.\n",
"\n",
"- Gezeigt haben, wie man unser Rezept `prep` und `bake`, um sicherzustellen, dass es wie vorgesehen funktioniert.\n",
"\n",
"#### **Voraussetzungen**\n",
"\n",
"Für diese Lektion benötigen wir die folgenden Pakete, um unsere Daten zu bereinigen, vorzubereiten und zu visualisieren:\n",
"\n",
"- `tidyverse`: Das [tidyverse](https://www.tidyverse.org/) ist eine [Sammlung von R-Paketen](https://www.tidyverse.org/packages), die darauf abzielt, Datenwissenschaft schneller, einfacher und unterhaltsamer zu machen!\n",
"\n",
"- `tidymodels`: Das [tidymodels](https://www.tidymodels.org/) Framework ist eine [Sammlung von Paketen](https://www.tidymodels.org/packages/) für Modellierung und maschinelles Lernen.\n",
"\n",
"- `themis`: Das [themis-Paket](https://themis.tidymodels.org/) bietet zusätzliche Rezeptschritte für den Umgang mit unausgewogenen Daten.\n",
"\n",
"- `nnet`: Das [nnet-Paket](https://cran.r-project.org/web/packages/nnet/nnet.pdf) bietet Funktionen zur Schätzung von Feedforward-Neuronalen Netzen mit einer einzigen versteckten Schicht sowie für multinomiale logistische Regressionsmodelle.\n",
"\n",
"Sie können diese Pakete wie folgt installieren:\n"
],
"metadata": {
"id": "iDFOb3ebHwQC"
}
},
{
"cell_type": "markdown",
"source": [
"`install.packages(c(\"tidyverse\", \"tidymodels\", \"DataExplorer\", \"here\"))`\n",
"\n",
"Alternativ überprüft das folgende Skript, ob die für dieses Modul benötigten Pakete vorhanden sind, und installiert sie für Sie, falls sie fehlen.\n"
],
"metadata": {
"id": "4V85BGCjII7F"
}
},
{
"cell_type": "code",
"execution_count": 2,
"source": [
"suppressWarnings(if (!require(\"pacman\"))install.packages(\"pacman\"))\r\n",
"\r\n",
"pacman::p_load(tidyverse, tidymodels, themis, here)"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Loading required package: pacman\n",
"\n"
]
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "an5NPyyKIKNR",
"outputId": "834d5e74-f4b8-49f9-8ab5-4c52ff2d7bc8"
}
},
{
"cell_type": "markdown",
"source": [
"Jetzt legen wir los!\n",
"\n",
"## 1. Teile die Daten in Trainings- und Testdatensätze auf.\n",
"\n",
"Wir beginnen mit ein paar Schritten aus unserer vorherigen Lektion.\n",
"\n",
"### Entferne die häufigsten Zutaten, die Verwirrung zwischen verschiedenen Küchenstilen stiften, mit `dplyr::select()`.\n",
"\n",
"Jeder liebt Reis, Knoblauch und Ingwer!\n"
],
"metadata": {
"id": "0ax9GQLBINVv"
}
},
{
"cell_type": "code",
"execution_count": 3,
"source": [
"# Load the original cuisines data\r\n",
"df <- read_csv(file = \"https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/4-Classification/data/cuisines.csv\")\r\n",
"\r\n",
"# Drop id column, rice, garlic and ginger from our original data set\r\n",
"df_select <- df %>% \r\n",
" select(-c(1, rice, garlic, ginger)) %>%\r\n",
" # Encode cuisine column as categorical\r\n",
" mutate(cuisine = factor(cuisine))\r\n",
"\r\n",
"# Display new data set\r\n",
"df_select %>% \r\n",
" slice_head(n = 5)\r\n",
"\r\n",
"# Display distribution of cuisines\r\n",
"df_select %>% \r\n",
" count(cuisine) %>% \r\n",
" arrange(desc(n))"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"New names:\n",
"* `` -> ...1\n",
"\n",
"\u001b[1m\u001b[1mRows: \u001b[1m\u001b[22m\u001b[34m\u001b[34m2448\u001b[34m\u001b[39m \u001b[1m\u001b[1mColumns: \u001b[1m\u001b[22m\u001b[34m\u001b[34m385\u001b[34m\u001b[39m\n",
"\n",
"\u001b[36m──\u001b[39m \u001b[1m\u001b[1mColumn specification\u001b[1m\u001b[22m \u001b[36m────────────────────────────────────────────────────────\u001b[39m\n",
"\u001b[1mDelimiter:\u001b[22m \",\"\n",
"\u001b[31mchr\u001b[39m (1): cuisine\n",
"\u001b[32mdbl\u001b[39m (384): ...1, almond, angelica, anise, anise_seed, apple, apple_brandy, a...\n",
"\n",
"\n",
"\u001b[36m\u001b[39m Use \u001b[30m\u001b[47m\u001b[30m\u001b[47m`spec()`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to retrieve the full column specification for this data.\n",
"\u001b[36m\u001b[39m Specify the column types or set \u001b[30m\u001b[47m\u001b[30m\u001b[47m`show_col_types = FALSE`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to quiet this message.\n",
"\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n",
"1 indian 0 0 0 0 0 0 0 0 \n",
"2 indian 1 0 0 0 0 0 0 0 \n",
"3 indian 0 0 0 0 0 0 0 0 \n",
"4 indian 0 0 0 0 0 0 0 0 \n",
"5 indian 0 0 0 0 0 0 0 0 \n",
" artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n",
"1 0 ⋯ 0 0 0 0 0 0 \n",
"2 0 ⋯ 0 0 0 0 0 0 \n",
"3 0 ⋯ 0 0 0 0 0 0 \n",
"4 0 ⋯ 0 0 0 0 0 0 \n",
"5 0 ⋯ 0 0 0 0 0 0 \n",
" yam yeast yogurt zucchini\n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"5 0 0 1 0 "
],
"text/markdown": [
"\n",
"A tibble: 5 × 381\n",
"\n",
"| cuisine &lt;fct&gt; | almond &lt;dbl&gt; | angelica &lt;dbl&gt; | anise &lt;dbl&gt; | anise_seed &lt;dbl&gt; | apple &lt;dbl&gt; | apple_brandy &lt;dbl&gt; | apricot &lt;dbl&gt; | armagnac &lt;dbl&gt; | artemisia &lt;dbl&gt; | ⋯ ⋯ | whiskey &lt;dbl&gt; | white_bread &lt;dbl&gt; | white_wine &lt;dbl&gt; | whole_grain_wheat_flour &lt;dbl&gt; | wine &lt;dbl&gt; | wood &lt;dbl&gt; | yam &lt;dbl&gt; | yeast &lt;dbl&gt; | yogurt &lt;dbl&gt; | zucchini &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 381\n",
"\\begin{tabular}{lllllllllllllllllllll}\n",
" cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n",
" <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & ⋯ & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 381</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>almond</th><th scope=col>angelica</th><th scope=col>anise</th><th scope=col>anise_seed</th><th scope=col>apple</th><th scope=col>apple_brandy</th><th scope=col>apricot</th><th scope=col>armagnac</th><th scope=col>artemisia</th><th scope=col>⋯</th><th scope=col>whiskey</th><th scope=col>white_bread</th><th scope=col>white_wine</th><th scope=col>whole_grain_wheat_flour</th><th scope=col>wine</th><th scope=col>wood</th><th scope=col>yam</th><th scope=col>yeast</th><th scope=col>yogurt</th><th scope=col>zucchini</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>⋯</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine n \n",
"1 korean 799\n",
"2 indian 598\n",
"3 chinese 442\n",
"4 japanese 320\n",
"5 thai 289"
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | n &lt;int&gt; |\n",
"|---|---|\n",
"| korean | 799 |\n",
"| indian | 598 |\n",
"| chinese | 442 |\n",
"| japanese | 320 |\n",
"| thai | 289 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & n\\\\\n",
" <fct> & <int>\\\\\n",
"\\hline\n",
"\t korean & 799\\\\\n",
"\t indian & 598\\\\\n",
"\t chinese & 442\\\\\n",
"\t japanese & 320\\\\\n",
"\t thai & 289\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>n</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;int&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>korean </td><td>799</td></tr>\n",
"\t<tr><td>indian </td><td>598</td></tr>\n",
"\t<tr><td>chinese </td><td>442</td></tr>\n",
"\t<tr><td>japanese</td><td>320</td></tr>\n",
"\t<tr><td>thai </td><td>289</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 735
},
"id": "jhCrrH22IWVR",
"outputId": "d444a85c-1d8b-485f-bc4f-8be2e8f8217c"
}
},
{
"cell_type": "markdown",
"source": [
"Perfekt! Jetzt teilen wir die Daten so auf, dass 70 % der Daten für das Training und 30 % für das Testen verwendet werden. Dabei wenden wir eine `Stratifizierung` an, um `das Verhältnis der einzelnen Küchenarten` in den Trainings- und Validierungsdatensätzen beizubehalten.\n",
"\n",
"[rsample](https://rsample.tidymodels.org/), ein Paket in Tidymodels, bietet eine Infrastruktur für effizientes Aufteilen und Resampling von Daten:\n"
],
"metadata": {
"id": "AYTjVyajIdny"
}
},
{
"cell_type": "code",
"execution_count": 4,
"source": [
"# Load the core Tidymodels packages into R session\r\n",
"library(tidymodels)\r\n",
"\r\n",
"# Create split specification\r\n",
"set.seed(2056)\r\n",
"cuisines_split <- initial_split(data = df_select,\r\n",
" strata = cuisine,\r\n",
" prop = 0.7)\r\n",
"\r\n",
"# Extract the data in each split\r\n",
"cuisines_train <- training(cuisines_split)\r\n",
"cuisines_test <- testing(cuisines_split)\r\n",
"\r\n",
"# Print the number of cases in each split\r\n",
"cat(\"Training cases: \", nrow(cuisines_train), \"\\n\",\r\n",
" \"Test cases: \", nrow(cuisines_test), sep = \"\")\r\n",
"\r\n",
"# Display the first few rows of the training set\r\n",
"cuisines_train %>% \r\n",
" slice_head(n = 5)\r\n",
"\r\n",
"\r\n",
"# Display distribution of cuisines in the training set\r\n",
"cuisines_train %>% \r\n",
" count(cuisine) %>% \r\n",
" arrange(desc(n))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Training cases: 1712\n",
"Test cases: 736"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n",
"1 chinese 0 0 0 0 0 0 0 0 \n",
"2 chinese 0 0 0 0 0 0 0 0 \n",
"3 chinese 0 0 0 0 0 0 0 0 \n",
"4 chinese 0 0 0 0 0 0 0 0 \n",
"5 chinese 0 0 0 0 0 0 0 0 \n",
" artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n",
"1 0 ⋯ 0 0 0 0 1 0 \n",
"2 0 ⋯ 0 0 0 0 1 0 \n",
"3 0 ⋯ 0 0 0 0 0 0 \n",
"4 0 ⋯ 0 0 0 0 0 0 \n",
"5 0 ⋯ 0 0 0 0 0 0 \n",
" yam yeast yogurt zucchini\n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"5 0 0 0 0 "
],
"text/markdown": [
"\n",
"A tibble: 5 × 381\n",
"\n",
"| cuisine &lt;fct&gt; | almond &lt;dbl&gt; | angelica &lt;dbl&gt; | anise &lt;dbl&gt; | anise_seed &lt;dbl&gt; | apple &lt;dbl&gt; | apple_brandy &lt;dbl&gt; | apricot &lt;dbl&gt; | armagnac &lt;dbl&gt; | artemisia &lt;dbl&gt; | ⋯ ⋯ | whiskey &lt;dbl&gt; | white_bread &lt;dbl&gt; | white_wine &lt;dbl&gt; | whole_grain_wheat_flour &lt;dbl&gt; | wine &lt;dbl&gt; | wood &lt;dbl&gt; | yam &lt;dbl&gt; | yeast &lt;dbl&gt; | yogurt &lt;dbl&gt; | zucchini &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 381\n",
"\\begin{tabular}{lllllllllllllllllllll}\n",
" cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n",
" <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & ⋯ & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 381</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>almond</th><th scope=col>angelica</th><th scope=col>anise</th><th scope=col>anise_seed</th><th scope=col>apple</th><th scope=col>apple_brandy</th><th scope=col>apricot</th><th scope=col>armagnac</th><th scope=col>artemisia</th><th scope=col>⋯</th><th scope=col>whiskey</th><th scope=col>white_bread</th><th scope=col>white_wine</th><th scope=col>whole_grain_wheat_flour</th><th scope=col>wine</th><th scope=col>wood</th><th scope=col>yam</th><th scope=col>yeast</th><th scope=col>yogurt</th><th scope=col>zucchini</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>⋯</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine n \n",
"1 korean 559\n",
"2 indian 418\n",
"3 chinese 309\n",
"4 japanese 224\n",
"5 thai 202"
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | n &lt;int&gt; |\n",
"|---|---|\n",
"| korean | 559 |\n",
"| indian | 418 |\n",
"| chinese | 309 |\n",
"| japanese | 224 |\n",
"| thai | 202 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & n\\\\\n",
" <fct> & <int>\\\\\n",
"\\hline\n",
"\t korean & 559\\\\\n",
"\t indian & 418\\\\\n",
"\t chinese & 309\\\\\n",
"\t japanese & 224\\\\\n",
"\t thai & 202\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>n</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;int&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>korean </td><td>559</td></tr>\n",
"\t<tr><td>indian </td><td>418</td></tr>\n",
"\t<tr><td>chinese </td><td>309</td></tr>\n",
"\t<tr><td>japanese</td><td>224</td></tr>\n",
"\t<tr><td>thai </td><td>202</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 535
},
"id": "w5FWIkEiIjdN",
"outputId": "2e195fd9-1a8f-4b91-9573-cce5582242df"
}
},
{
"cell_type": "markdown",
"source": [
"## 2. Umgang mit unausgewogenen Daten\n",
"\n",
"Wie Sie vielleicht im ursprünglichen Datensatz sowie in unserem Trainingssatz bemerkt haben, gibt es eine ziemlich ungleiche Verteilung der Anzahl der Küchen. Koreanische Küchen sind *fast* dreimal so häufig wie thailändische Küchen. Unaussgewogene Daten haben oft negative Auswirkungen auf die Modellleistung. Viele Modelle funktionieren am besten, wenn die Anzahl der Beobachtungen gleich ist, und haben daher Schwierigkeiten mit unausgewogenen Daten.\n",
"\n",
"Es gibt hauptsächlich zwei Möglichkeiten, mit unausgewogenen Datensätzen umzugehen:\n",
"\n",
"- Hinzufügen von Beobachtungen zur Minderheitsklasse: `Over-Sampling`, z. B. mit einem SMOTE-Algorithmus, der synthetisch neue Beispiele der Minderheitsklasse anhand der nächsten Nachbarn dieser Fälle generiert.\n",
"\n",
"- Entfernen von Beobachtungen aus der Mehrheitsklasse: `Under-Sampling`\n",
"\n",
"In unserer vorherigen Lektion haben wir demonstriert, wie man mit unausgewogenen Datensätzen mithilfe eines `recipe` umgehen kann. Ein Rezept kann als eine Art Blaupause betrachtet werden, die beschreibt, welche Schritte auf einen Datensatz angewendet werden sollten, um ihn für die Datenanalyse vorzubereiten. In unserem Fall möchten wir eine gleichmäßige Verteilung der Anzahl unserer Küchen für unseren `training set` erreichen. Lassen Sie uns direkt loslegen.\n"
],
"metadata": {
"id": "daBi9qJNIwqW"
}
},
{
"cell_type": "code",
"execution_count": 5,
"source": [
"# Load themis package for dealing with imbalanced data\r\n",
"library(themis)\r\n",
"\r\n",
"# Create a recipe for preprocessing training data\r\n",
"cuisines_recipe <- recipe(cuisine ~ ., data = cuisines_train) %>% \r\n",
" step_smote(cuisine)\r\n",
"\r\n",
"# Print recipe\r\n",
"cuisines_recipe"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Data Recipe\n",
"\n",
"Inputs:\n",
"\n",
" role #variables\n",
" outcome 1\n",
" predictor 380\n",
"\n",
"Operations:\n",
"\n",
"SMOTE based on cuisine"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 200
},
"id": "Az6LFBGxI1X0",
"outputId": "29d71d85-64b0-4e62-871e-bcd5398573b6"
}
},
{
"cell_type": "markdown",
"source": [
"Du kannst natürlich bestätigen (mithilfe von prep+bake), dass das Rezept wie erwartet funktioniert alle Küchenlabels haben `559` Beobachtungen.\n",
"\n",
"Da wir dieses Rezept als Vorverarbeitung für das Modellieren verwenden werden, übernimmt ein `workflow()` den gesamten Vorbereitungs- und Backprozess für uns, sodass wir das Rezept nicht manuell schätzen müssen.\n",
"\n",
"Jetzt sind wir bereit, ein Modell zu trainieren 👩‍💻👨‍💻!\n",
"\n",
"## 3. Auswahl des Klassifikators\n",
"\n",
"<p >\n",
" <img src=\"../../images/parsnip.jpg\"\n",
" width=\"600\"/>\n",
" <figcaption>Kunstwerk von @allison_horst</figcaption>\n"
],
"metadata": {
"id": "NBL3PqIWJBBB"
}
},
{
"cell_type": "markdown",
"source": [
"Jetzt müssen wir entscheiden, welchen Algorithmus wir für die Aufgabe verwenden 🤔.\n",
"\n",
"In Tidymodels bietet das [`parsnip package`](https://parsnip.tidymodels.org/index.html) eine einheitliche Schnittstelle für die Arbeit mit Modellen über verschiedene Engines (Pakete) hinweg. Bitte sehen Sie sich die parsnip-Dokumentation an, um [Modelltypen und Engines](https://www.tidymodels.org/find/parsnip/#models) sowie deren entsprechende [Modellargumente](https://www.tidymodels.org/find/parsnip/#model-args) zu erkunden. Die Vielfalt kann auf den ersten Blick ziemlich überwältigend sein. Zum Beispiel umfassen die folgenden Methoden alle Klassifikationstechniken:\n",
"\n",
"- C5.0 Regelbasierte Klassifikationsmodelle\n",
"\n",
"- Flexible Diskriminanzmodelle\n",
"\n",
"- Lineare Diskriminanzmodelle\n",
"\n",
"- Regularisierte Diskriminanzmodelle\n",
"\n",
"- Logistische Regressionsmodelle\n",
"\n",
"- Multinomiale Regressionsmodelle\n",
"\n",
"- Naive Bayes Modelle\n",
"\n",
"- Support Vector Machines\n",
"\n",
"- Nächste Nachbarn\n",
"\n",
"- Entscheidungsbäume\n",
"\n",
"- Ensemble-Methoden\n",
"\n",
"- Neuronale Netze\n",
"\n",
"Die Liste geht weiter!\n",
"\n",
"### **Welchen Klassifikator wählen?**\n",
"\n",
"Welchen Klassifikator sollten Sie also wählen? Oft ist es eine gute Methode, mehrere auszuprobieren und nach einem guten Ergebnis zu suchen.\n",
"\n",
"> AutoML löst dieses Problem elegant, indem es diese Vergleiche in der Cloud durchführt und Ihnen ermöglicht, den besten Algorithmus für Ihre Daten auszuwählen. Probieren Sie es [hier](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-77952-leestott) aus.\n",
"\n",
"Die Wahl des Klassifikators hängt auch von unserem Problem ab. Wenn das Ergebnis beispielsweise in `mehr als zwei Klassen` kategorisiert werden kann, wie in unserem Fall, müssen Sie einen `Multiklassen-Klassifikationsalgorithmus` anstelle eines `binären Klassifikationsalgorithmus` verwenden.\n",
"\n",
"### **Ein besserer Ansatz**\n",
"\n",
"Ein besserer Ansatz als wildes Raten ist jedoch, die Ideen auf diesem herunterladbaren [ML Cheat Sheet](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-77952-leestott) zu verfolgen. Hier entdecken wir, dass wir für unser Multiklassenproblem einige Optionen haben:\n",
"\n",
"<p >\n",
" <img src=\"../../images/cheatsheet.png\"\n",
" width=\"500\"/>\n",
" <figcaption>Ein Abschnitt des Algorithmus-Cheat Sheets von Microsoft, der Multiklassen-Klassifikationsoptionen beschreibt</figcaption>\n"
],
"metadata": {
"id": "a6DLAZ3vJZ14"
}
},
{
"cell_type": "markdown",
"source": [
"### **Überlegungen**\n",
"\n",
"Schauen wir uns verschiedene Ansätze an, die wir unter den gegebenen Einschränkungen verfolgen können:\n",
"\n",
"- **Tiefe neuronale Netze sind zu aufwendig.** Angesichts unseres sauberen, aber minimalen Datensatzes und der Tatsache, dass wir das Training lokal über Notebooks durchführen, sind tiefe neuronale Netze für diese Aufgabe zu schwergewichtig.\n",
"\n",
"- **Kein Zwei-Klassen-Klassifikator.** Wir verwenden keinen Zwei-Klassen-Klassifikator, daher scheidet ein One-vs-All-Ansatz aus.\n",
"\n",
"- **Entscheidungsbaum oder logistische Regression könnten funktionieren.** Ein Entscheidungsbaum könnte geeignet sein, oder auch eine multinomiale Regression/multiklassige logistische Regression für Daten mit mehreren Klassen.\n",
"\n",
"- **Multiklassige Boosted Decision Trees lösen ein anderes Problem.** Der multiklassige Boosted Decision Tree eignet sich am besten für nichtparametrische Aufgaben, z. B. Aufgaben zur Erstellung von Rankings, und ist daher für uns nicht nützlich.\n",
"\n",
"Generell ist es eine gute Idee, vor der Anwendung komplexerer Machine-Learning-Modelle wie Ensemble-Methoden zunächst das einfachste Modell zu erstellen, um ein grundlegendes Verständnis für die Daten zu bekommen. Daher beginnen wir in dieser Lektion mit einem `multinomial regression`-Modell.\n",
"\n",
"> Logistische Regression ist eine Technik, die verwendet wird, wenn die Zielvariable kategorisch (oder nominal) ist. Bei der binären logistischen Regression gibt es zwei Zielvariablen, während es bei der multinomialen logistischen Regression mehr als zwei Zielvariablen gibt. Weitere Informationen finden Sie unter [Advanced Regression Methods](https://bookdown.org/chua/ber642_advanced_regression/multinomial-logistic-regression.html).\n",
"\n",
"## 4. Ein multinomiales logistische Regressionsmodell trainieren und evaluieren\n",
"\n",
"In Tidymodels definiert `parsnip::multinom_reg()` ein Modell, das lineare Prädiktoren verwendet, um Multiklass-Daten mithilfe der multinomialen Verteilung vorherzusagen. Siehe `?multinom_reg()` für die verschiedenen Möglichkeiten/Engines, mit denen Sie dieses Modell anpassen können.\n",
"\n",
"In diesem Beispiel passen wir ein multinomiales Regressionsmodell über die Standard-Engine [nnet](https://cran.r-project.org/web/packages/nnet/nnet.pdf) an.\n",
"\n",
"> Ich habe den Wert für `penalty` mehr oder weniger zufällig gewählt. Es gibt bessere Methoden, diesen Wert zu bestimmen, z. B. durch `Resampling` und `Tuning` des Modells, was wir später besprechen werden.\n",
">\n",
"> Siehe [Tidymodels: Get Started](https://www.tidymodels.org/start/tuning/), falls Sie mehr darüber erfahren möchten, wie man Modell-Hyperparameter optimiert.\n"
],
"metadata": {
"id": "gWMsVcbBJemu"
}
},
{
"cell_type": "code",
"execution_count": 6,
"source": [
"# Create a multinomial regression model specification\r\n",
"mr_spec <- multinom_reg(penalty = 1) %>% \r\n",
" set_engine(\"nnet\", MaxNWts = 2086) %>% \r\n",
" set_mode(\"classification\")\r\n",
"\r\n",
"# Print model specification\r\n",
"mr_spec"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Multinomial Regression Model Specification (classification)\n",
"\n",
"Main Arguments:\n",
" penalty = 1\n",
"\n",
"Engine-Specific Arguments:\n",
" MaxNWts = 2086\n",
"\n",
"Computational engine: nnet \n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 166
},
"id": "Wq_fcyQiJvfG",
"outputId": "c30449c7-3864-4be7-f810-72a003743e2d"
}
},
{
"cell_type": "markdown",
"source": [
"Großartige Arbeit 🥳! Jetzt, da wir ein Rezept und eine Modellspezifikation haben, müssen wir eine Möglichkeit finden, diese zusammen in ein Objekt zu bündeln, das zuerst die Daten vorverarbeitet, dann das Modell auf den vorverarbeiteten Daten anpasst und auch potenzielle Nachbearbeitungsaktivitäten ermöglicht. In Tidymodels wird dieses praktische Objekt [`workflow`](https://workflows.tidymodels.org/) genannt und hält bequem deine Modellierungskomponenten! Das ist das, was wir in *Python* als *Pipelines* bezeichnen würden.\n",
"\n",
"Also, lass uns alles in einen Workflow bündeln!📦\n"
],
"metadata": {
"id": "NlSbzDfgJ0zh"
}
},
{
"cell_type": "code",
"execution_count": 7,
"source": [
"# Bundle recipe and model specification\r\n",
"mr_wf <- workflow() %>% \r\n",
" add_recipe(cuisines_recipe) %>% \r\n",
" add_model(mr_spec)\r\n",
"\r\n",
"# Print out workflow\r\n",
"mr_wf"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"══ Workflow ════════════════════════════════════════════════════════════════════\n",
"\u001b[3mPreprocessor:\u001b[23m Recipe\n",
"\u001b[3mModel:\u001b[23m multinom_reg()\n",
"\n",
"── Preprocessor ────────────────────────────────────────────────────────────────\n",
"1 Recipe Step\n",
"\n",
"• step_smote()\n",
"\n",
"── Model ───────────────────────────────────────────────────────────────────────\n",
"Multinomial Regression Model Specification (classification)\n",
"\n",
"Main Arguments:\n",
" penalty = 1\n",
"\n",
"Engine-Specific Arguments:\n",
" MaxNWts = 2086\n",
"\n",
"Computational engine: nnet \n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 333
},
"id": "Sc1TfPA4Ke3_",
"outputId": "82c70013-e431-4e7e-cef6-9fcf8aad4a6c"
}
},
{
"cell_type": "markdown",
"source": [
"Workflows 👌👌! Ein **`workflow()`** kann auf ähnliche Weise angepasst werden wie ein Modell. Also, Zeit, ein Modell zu trainieren!\n"
],
"metadata": {
"id": "TNQ8i85aKf9L"
}
},
{
"cell_type": "code",
"execution_count": 8,
"source": [
"# Train a multinomial regression model\n",
"mr_fit <- fit(object = mr_wf, data = cuisines_train)\n",
"\n",
"mr_fit"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"══ Workflow [trained] ══════════════════════════════════════════════════════════\n",
"\u001b[3mPreprocessor:\u001b[23m Recipe\n",
"\u001b[3mModel:\u001b[23m multinom_reg()\n",
"\n",
"── Preprocessor ────────────────────────────────────────────────────────────────\n",
"1 Recipe Step\n",
"\n",
"• step_smote()\n",
"\n",
"── Model ───────────────────────────────────────────────────────────────────────\n",
"Call:\n",
"nnet::multinom(formula = ..y ~ ., data = data, decay = ~1, MaxNWts = ~2086, \n",
" trace = FALSE)\n",
"\n",
"Coefficients:\n",
" (Intercept) almond angelica anise anise_seed apple\n",
"indian 0.19723325 0.2409661 0 -5.004955e-05 -0.1657635 -0.05769734\n",
"japanese 0.13961959 -0.6262400 0 -1.169155e-04 -0.4893596 -0.08585717\n",
"korean 0.22377347 -0.1833485 0 -5.560395e-05 -0.2489401 -0.15657804\n",
"thai -0.04336577 -0.6106258 0 4.903828e-04 -0.5782866 0.63451105\n",
" apple_brandy apricot armagnac artemisia artichoke asparagus\n",
"indian 0 0.37042636 0 -0.09122797 0 -0.27181970\n",
"japanese 0 0.28895643 0 -0.12651100 0 0.14054037\n",
"korean 0 -0.07981259 0 0.55756709 0 -0.66979948\n",
"thai 0 -0.33160904 0 -0.10725182 0 -0.02602152\n",
" avocado bacon baked_potato balm banana barley\n",
"indian -0.46624197 0.16008055 0 0 -0.2838796 0.2230625\n",
"japanese 0.90341344 0.02932727 0 0 -0.4142787 2.0953906\n",
"korean -0.06925382 -0.35804134 0 0 -0.2686963 -0.7233404\n",
"thai -0.21473955 -0.75594439 0 0 0.6784880 -0.4363320\n",
" bartlett_pear basil bay bean beech\n",
"indian 0 -0.7128756 0.1011587 -0.8777275 -0.0004380795\n",
"japanese 0 0.1288697 0.9425626 -0.2380748 0.3373437611\n",
"korean 0 -0.2445193 -0.4744318 -0.8957870 -0.0048784496\n",
"thai 0 1.5365848 0.1333256 0.2196970 -0.0113078024\n",
" beef beef_broth beef_liver beer beet\n",
"indian -0.7985278 0.2430186 -0.035598065 -0.002173738 0.01005813\n",
"japanese 0.2241875 -0.3653020 -0.139551027 0.128905553 0.04923911\n",
"korean 0.5366515 -0.6153237 0.213455197 -0.010828645 0.27325423\n",
"thai 0.1570012 -0.9364154 -0.008032213 -0.035063746 -0.28279823\n",
" bell_pepper bergamot berry bitter_orange black_bean\n",
"indian 0.49074330 0 0.58947607 0.191256164 -0.1945233\n",
"japanese 0.09074167 0 -0.25917977 -0.118915977 -0.3442400\n",
"korean -0.57876763 0 -0.07874180 -0.007729435 -0.5220672\n",
"thai 0.92554006 0 -0.07210196 -0.002983296 -0.4614426\n",
" black_currant black_mustard_seed_oil black_pepper black_raspberry\n",
"indian 0 0.38935801 -0.4453495 0\n",
"japanese 0 -0.05452887 -0.5440869 0\n",
"korean 0 -0.03929970 0.8025454 0\n",
"thai 0 -0.21498372 -0.9854806 0\n",
" black_sesame_seed black_tea blackberry blackberry_brandy\n",
"indian -0.2759246 0.3079977 0.191256164 0\n",
"japanese -0.6101687 -0.1671913 -0.118915977 0\n",
"korean 1.5197674 -0.3036261 -0.007729435 0\n",
"thai -0.1755656 -0.1487033 -0.002983296 0\n",
" blue_cheese blueberry bone_oil bourbon_whiskey brandy\n",
"indian 0 0.216164294 -0.2276744 0 0.22427587\n",
"japanese 0 -0.119186087 0.3913019 0 -0.15595599\n",
"korean 0 -0.007821986 0.2854487 0 -0.02562342\n",
"thai 0 -0.004947048 -0.0253658 0 -0.05715244\n",
"\n",
"...\n",
"and 308 more lines."
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "GMbdfVmTKkJI",
"outputId": "adf9ebdf-d69d-4a64-e9fd-e06e5322292e"
}
},
{
"cell_type": "markdown",
"source": [
"Die Ausgabe zeigt die Koeffizienten, die das Modell während des Trainings gelernt hat.\n",
"\n",
"### Das trainierte Modell bewerten\n",
"\n",
"Es ist an der Zeit, die Leistung des Modells 📏 zu überprüfen, indem wir es mit einem Testdatensatz evaluieren! Beginnen wir damit, Vorhersagen für den Testdatensatz zu erstellen.\n"
],
"metadata": {
"id": "tt2BfOxrKmcJ"
}
},
{
"cell_type": "code",
"execution_count": 9,
"source": [
"# Make predictions on the test set\n",
"results <- cuisines_test %>% select(cuisine) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test))\n",
"\n",
"# Print out results\n",
"results %>% \n",
" slice_head(n = 5)"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine .pred_class\n",
"1 indian thai \n",
"2 indian indian \n",
"3 indian indian \n",
"4 indian indian \n",
"5 indian indian "
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | .pred_class &lt;fct&gt; |\n",
"|---|---|\n",
"| indian | thai |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & .pred\\_class\\\\\n",
" <fct> & <fct>\\\\\n",
"\\hline\n",
"\t indian & thai \\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>.pred_class</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;fct&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>thai </td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 248
},
"id": "CqtckvtsKqax",
"outputId": "e57fe557-6a68-4217-fe82-173328c5436d"
}
},
{
"cell_type": "markdown",
"source": [
"Großartige Arbeit! In Tidymodels kann die Bewertung der Modellleistung mit [yardstick](https://yardstick.tidymodels.org/) durchgeführt werden einem Paket, das verwendet wird, um die Effektivität von Modellen anhand von Leistungsmetriken zu messen. Wie wir es in unserer Lektion zur logistischen Regression gemacht haben, beginnen wir mit der Berechnung einer Konfusionsmatrix.\n"
],
"metadata": {
"id": "8w5N6XsBKss7"
}
},
{
"cell_type": "code",
"execution_count": 10,
"source": [
"# Confusion matrix for categorical data\n",
"conf_mat(data = results, truth = cuisine, estimate = .pred_class)\n"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" Truth\n",
"Prediction chinese indian japanese korean thai\n",
" chinese 83 1 8 15 10\n",
" indian 4 163 1 2 6\n",
" japanese 21 5 73 25 1\n",
" korean 15 0 11 191 0\n",
" thai 10 11 3 7 70"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 133
},
"id": "YvODvsLkK0iG",
"outputId": "bb69da84-1266-47ad-b174-d43b88ca2988"
}
},
{
"cell_type": "markdown",
"source": [
"Wenn man mit mehreren Klassen arbeitet, ist es in der Regel intuitiver, dies als Heatmap zu visualisieren, wie hier:\n"
],
"metadata": {
"id": "c0HfPL16Lr6U"
}
},
{
"cell_type": "code",
"execution_count": 11,
"source": [
"update_geom_defaults(geom = \"tile\", new = list(color = \"black\", alpha = 0.7))\n",
"# Visualize confusion matrix\n",
"results %>% \n",
" conf_mat(cuisine, .pred_class) %>% \n",
" autoplot(type = \"heatmap\")"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"plot without title"
],
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tMTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////isF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3deWBU9b3//0+ibApWrbYuvYorXaxoaatWvVqpqG2HsCmLBAqoVXBDjCKbKMqOQUDFFVxKqyhVFLUqWKJsxg3Lz2IFGilLiEqptMX0hpzvnJkMCbx5/W5vz5k5Z+D5/OOc85nEz3w8Mw9mMjmo84gocC7qBRDtCQGJKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQijHkLb+NUZVRb2Ahn26OeoVNCxWpyZWi/mbeGbnGNJl42PUmbNiVM+7nohRlz4ao/o/HqOuFc/sHEMasShGdfwsRt3+1qYYNXJ9jLq9MkZNE89sIMUkIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSbI+EVLQktatJvB8FpNGtv9LoqMteTx7d/f2vND6h5M3oIS07xT0XxjyhQJpx+sGNj79pbfCJgkJ6o7V72t/f4FKdFSmkRa3dnNTBgnYHNPneY7GCVPvB1ggg3ezaTZraq+C8RYvGF7a64cbW7rLIIU1sdmR8IE1ynX4957qC9pFDGtvsiDSkywsn+D0eJaRxycWkIC1pcdzYSecUzIwTpP+0YJBOONJ/CTqncP6iI49YsGjRwqMOjhrSS03GTY0PpJNaVia3P92nImJIc5vcWZqG1LVFoInCgPRCkzF3pyF1bLa8snLdd1pGC+nTOy8uvvdLr+iVEZ2KF/hv7WoTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZgHS8cf624sKF5RdO84/+plbEDGk8oWfxQjSt7/pb7vu80ngmYJBWvTa+jpIPz0ickhLFlSmIa1vVuSPR7lXI4V0w9jN6wdM94qu+fCfj3XZ5v+MVDRwi/dKl23eoPFfVD/es3pj+/e3b7xudmaYDUjD3C+fmz+6aee64Zsnfz3QdOF82BAjSFPckOV/nrFfv+AzBf6woQ7SWa3Wr18dLaRkaUiL3FB/MMdNjhLS6sTG5KbcK3ra8zYmKlKQ5nrepsQnqxKbkz8zdStblVjtedu9zDD5zyxpn+wPIUJadFsz5wp7pz5i+P1vH2i372gg7dT9+yfPz/WVwScKC1Lrlh0PdAddvyYOkJ5zd/mDN9ywKCG92b42tS9anHwvl/g4BSl9WJZINbv2ng4ls9Z7mWHye9/4cbL3QoR0T/MzRt91SWHqI4bJzh0+KdBsex6kZw/4yYzfXL7PzfGB1LKw20P3F7mL4gDpSTfVHyxzN0YJaVH77WlIS+ohpQ+XJjJv4zbNG9mhrH64uwJBeuPwE/0Xo66FTya3L44f2ragF5AatPGo7/ovRlcULo0NpLff87dd3ZwYQHrOTfIHZW54lJDWJCo876MXdgNpbWJl8usbvZotyd30wZlhFiA97VJwJrjMLL9wDwGpvrfddf7uCXdPbCCle8IFmi8kSEvcLf7gqfQLU1SQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpgVSD383Wg3+PkbHkyTugVI9ZW7/v7uEXdXbCCtXOlv73fjYgBpQ4uf+4MhrixSSFvu6NJz2rbdQdo8ruslJSu82ll9Ova6+++ZYRYgvdH8mDeSuw7usRcLT/WPLnGTgVTfxq+02pjc9Xa/jwukdwsv9AfnFbwRA0iVlzZ5p7Jy7bHfDjTXHnGt3UB32qgJFxe2XbSo2H332pLzC77zRsSQ5pWW9nADSkvfiQOkTXe6Hz/wxOWFRcFnCgbp2QkTurorJ0xYvL6Paztu1OmuX6DpgkGaO2lSN9d/0qRlle8dfPTQO37QaA6QFo06qWmjlleWLVr0Zkmrps2O7fFqoNlCgNQ7fS2ZezAWkDY9+P39Gp8wZH3wiYJB6ll3Vu5dv3ZM6xZNT5kYaLaAkHrVLWZ6ZeWiC1s0Pe2ZQLPtIZDCjau/ZVz9rQKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEiqmEDqOSJGnXJfjOpw+z0xqtOUGNU76rPRsMvEMzvHkEa+GqMuHBajznv8tRg1IOoFNGzgKzFqiHhm5xjSvZ/GqF/MjFHd3on6zWXD7ox6AQ27oypG3Sue2UCKSUCSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyhQTpraYHhzDLL/7zp/3YY9zg1MFDHQ7Z92sXz0ge3fTt5o2O6j0jWkjLTnHPhTFPGJDmt23e/OTSquAThQFpXcl/NW45bFPwiUKEVJN4f6fxpkTFrjdlGVLVmS5aSL0bH1wH6QeFF155puswc+b1BUf37HWi6xQppInNjowNpJcbtbx90jnuluAzhQEpsc9V0y9xJcEnChFS7Qdbd4W0601ZhjSp8bmRQhrWqPiyNKQS1y25/f43Z8z82qEPzJz58GEHRAnppSbjpsYG0o8O+ONnn1V9Z7+NgWcKAdJsd1ty+/Mzg78kZfGtXRLSv/29oUD6wwEll0YKadyomXWQftT0ofRND/e4zt+d7R6IEFL5ws/iA2nydH/bx/0p8EwhQOrSfF3wSVKF+9auNrFwRP++8z1v9aAuVy9Mv7WrGN6964gN3o4vZQ/SRSeujxZSsjpIh540c2aDH4tmnPDV/3TCkD5siA+kdOceGnyOECAdfW5VVWXwaapC/xmpaOAW75Uu22r7lW6rGpKGdGXptn+MKfEyX8oepIcK5n0aE0gzCs7t8/WC/S9KvQw9dNfw0/e5BkgNe9jdHnyS4JA2FfaadEzBQf0/iR+kuf5buk/+mNjoeUvSkLZ+6XmLO9RmvpT8xgVtkr0dNqQ/HdL307hAut8deuxVN15Y0Ma/qcS5Q274jyfcIyH9utlFsfjUrsId9b0Hnrqq8Gfxg7TY8zYnPi5rv93zPklDWj6kuLhboibzpeQ3lvdM9mHYkLoeviY2kB50zacndz9xtya3U6+/7LSCBJDqG7dPpw0hTBMc0l/cQWuSu8vcK7GDtCSlZX77Ws9bk4K0odPsam+pD2lJBtJuCg7pqYKHKyoquh1csS4GkGY2+6a/HeT61N3cPkUKSKmudIM+DWOeEH5GanGmv/2NuyvwTNmBtDxR6XllKUhlRTWe92j2IfVzdZ0fB0itDvO317orphQP948Gur5AqmtgQWkIs3wWCqQzjve3j7l7As+UHUjVPUq3rrs5BWllYsW/Fg5OVGUb0tsv+LU74IU34wCplytJbs8oHD+14Jv+p3ftUmMgJXvahfWJRQiQxrunk9su+7wVeKbsQPI+ur7z1e8k/uzfNKN7jylbB3bblGVI6aL9GWlonz5nu4v69Jkw86GWTdr3+6E7f+bMn7nju/c+veC4//QaoTAgzSst7eEGlJYGnyq4gcrjDipN9V7gqUKAtK71foPuLnKXB59pz7rWLmJIP657d3nVzJn3tv3KPocVJ/XM6H1046bfuGj6fzpnGJB6163rwcAzBYf0UeYt+GOBpwrjEqGP+3yt0XFj43WtXZC4+lvF1d8yrv62AUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAcl28S9jVKt2Mer4blGfjob9KOoFNOxnV8Soi8QzO8eQ7loVo7r/JUYNf/KNGDXkTzFq1PoYNU08s3MMafLaGNUz6rcJDbvtmWUxanhFjIrV+8z7xDMbSDEJSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIs/yBt6n1EoUsFpFwHJFn+Qbp437a9+6UCUq4Dkiz/IH312WwBAtL/FpBk+QdpvyogRRWQZPkH6ezXgRRVQJLlH6S3f7gYSBEFJFn+QTrzv9x+R6cCUq4Dkiz/IJ3dNhOQch2QZPkHKfsBSQUkWT5C+uyFBx56+Qsg5T4gyfIP0vZBjfzLGvYfD6ScByRZ/kEa7zo+/OIL91/gHgVSrgOSLP8gfeuG9P6K7wEp1wFJln+QmsxP7+c1A1KuA5Is/yDt/3x6/2xzIOU6IMnyD9JZP672d9vanQukXAckWf5Bmldw1JWjbr/8iMJXgZTrgCTLP0jeb7/pf/z93XnZcgQkGZBkeQjJ89a/VV6ZNUZA0gFJlpeQshyQVECS5RmkVqO9VjsCUq4DkizPIJ1W6p22IyDlOiDJ8gxSTgKSCkiy/IPU5sP0/ulvASnXAUmWf5BceWr3P7c1BlKuA5Is3yC5+rhoNecBSZZvkN6/2xWl/uuQl434C5ByHZBk+QbJ8y74U7YAAel/C0iy/IPkbZyS3FTdtglIOQ9IsvyDtPIw/1OGCnfYaiDlOiDJ8g9Sh+Pf8ncfHt8JSLkOSLL8g3ToI+n9/S2AlOuAJMs/SM2eSO9/tV9MIc07t3nzk8ZW+Ie/P9k9GT2kkvSvC86OGtJrmV9cjF+2bNoPvtL4xMFLo4T0/Dn773/SmDUVFdenV3Vm9JCWneKeC2OefwvSjy6o8Xdf/ODMzC01ifdjBOnZfY8eNuYsd2PycHSzI+IA6ZeFd/n9OmpIbw5JdX7Br5ZNKmx1402nuCsihPTbfY8eOvosN6iiom/hWL+ZkUOa2OzIHEJ6ueDYASNH9Dm08OXMLbUfbI0RpNNbvLt2bcW391uz9rdNRk2KA6TuBwSfI10Yb+1eP7TDsmXfOLJs2bJFRx8cIaTTWrxdUbHmW/utqri4RaCJQoP0UpNxU3MIyXuljf9CfHJc/4bs+Lv9bbFbvrbsd2tjAelnRwafI10YkC458NVliwdO9A9/7sqigzRusr/t6d6ruPDweEAqX/hZTiF53mcf/H8N/4vF/lu7iuHdu47Y4FUnXh7cr+9SLzOuTSwc0b/vfM/bPL5Xl8GrPO+1qzoX31u9Y5gFSOnOPiS1iwWks1tVVa0NPk1VKJCeLCzJHC5tfVigqcL4sOHsQyoqzjyxomJlDCAlyzGkXfIhXVm67R9jSpKH1/3Ve7XDlszYKxq4xXulyzZv0Pgvqh/vWb2x/fvbN143OzNM/sP/XJfsy7Ah3eeGxQfSKcd0PsgdNOgvsYB0/qFvpPZvzH34gn3HRg3pHje0ouLklkUHuoOu/WhvgrTbvyHrQ9qatLC4Q21N4jnP2971lczYK5rreZsSn6xKbE7+LNWtbFVidfLrXmaY/IcXtEn2dsiQZjZrVxEfSMcU9pj5UEf30+AzBYf0ZOGg9MFU5w4vDTZXcEiPNDt/TUVFy8JL7r8n4S7YmyDt9m/I+pCWDyku7paoqUksS95w1azM2CtanHxbl/i4LJFqdu09HUpmrfcyw+T3rrg52c4XSQSGNGqfotVr4wPp/RX+trubG3im4JC6Nn49ffC7icPPL/hFtJBu36f9x8ndknJ/cLF7ai+CtNuSkDZ0ml3tLfUh+f9fzCt+nRl7RUtSkJYmquu+edO8kR3K6oe7Kyikfu7aT9bGCFK637hRgecIDGnp13/UYNTXzYgSUl93zZ/rR4+6EUB6v6yoxvMe9SE97XnVnV/LjDOQ1iZWJr9xo1ezJbmbPjgzzAqkqwvG7jiOBaTVq/3tQ25i4JkCQ3rE3eLvXip5xN/d5YZGCGlAwZj0wYoV/vYeN3ovgrR/g3b8DdkkpJWJFf9aODhRVZMYUFE9q+PfMuMMJG9oSVXNi10+f7XPx7Wbh0zJDLMB6VduZP0gDpA+KLzI37UtWBI9pKvdr/zd7wq/tyS56+qmRgfp8cwr0LLCdv7u3ILX9yJIXZO1anRG5w6nFLS5ugEkb0b3HlO2Duy2IfHiTZ37lXuZ8aYMpM3jul5SssKrndWnY6+7/54ZZgHSmmMPHDvOb8naOePGXeJ+OW7cm9FCqurnzp8w+gx3efCZAkP6uft9at/bnXz9ze0KTloSGaRVxxw4JnVBw6KK3u680SN/6PoEmS4MSPNKS3u4AaWl7+QAUrLZJ23wdyu/ObchpB2H74g5/g8FgvR+5oKyB9deWnc0LWJIG8efckDTU0uDTxQc0tmF6f3Swa2aNjuu5+uBJgsE6d3M4/RAxeo7Tm7RtPW4QI7CgNQ788zJDaSTnkrv72tdd8P2jxI7PnWLHlLYcfW3jKu/Vf8WpMavpfezm9TdsLDDqFog5SQgyfIP0hGXpna1XQ8PTgZI/7eAJMs/SLe67147atSAb7nBQMp1QJLlH6TacYf7P5EdMrwGSLkOSLL8g5Sk9Mmypau3Z4sRkHRAkuUjpG1vzfnU+x8g5T4gyfIQ0sQWzi3xhvwia5SApAKSLP8gPeDaT09CenTf8UDKdUCS5R+kk6/0tiUhebecCKRcByRZ/kFq+moa0u8aASnXAUmWf5C+9nwa0lMHACnXAUmWf5B+cs4/fUifn9QOSLkOSLL8g/T6Psdf5/r2PqDRm0DKdUCS5R8k77VT/Ssbfvj7bDkCkgxIsjyE5Hmb3ntvc9YYAUkHJFn+QToje/+JVSD9LwFJln+QvjEJSFEFJFn+QXruW7/9F5CiCUiy/IN09ndd4yOO9gNSrgOSLP8gnXle27qAlOuAJMs/SNkPSCogyfIO0rZlb24BUkQBSZZvkCa3cK5R/y/FNwIpuwFJlmeQnnEtbxh2lrtafCOQshuQZHkG6eyW/v8utm+jvwEpioAkyzNIzYf727dc1i5YBdL/X0CS5Rkkd7+/3eBeFt8JpKwGJFm+QXrQ3250LwEpioAkAxKQ/v2AJMs3SLcsSTbPlfo7IOU6IMnyDVLDgJTrgCTLM0i3NgxIuQ5IsjyDlJOApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZOvQM0Yd/4sYdWq7TjHqB5fGqJ/2iVEXimd2jiFNWR+jij+PUaOWboxRiWkxanTUj03DYvKKBCQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiRbUEhvtHZP+/sbXKqzIodUduEBTdr8KoSJAkNa1No9s+tRFJBGHOWuSx1cf3zjxifcsPNtkUEK7XEKBVJN4p1oIY1tdkQa0uWFE/wejxrS2y2OmzD53IIngs8UFNK45Kl5ZpejKCB1a3xQGs2V7shLLj1s35sa3hYZpPAepz0C0twmd5amIXVtEWii0CB1bvbh559vOumY4DMFhPR8k9GT03zqj6KANKjRJT3TaA498K5p0ya0aNXwtsgghfc47RGQFr22vg7ST4+IBaSqZh393Wj3euCpAkJaPH9jHZ/6oygg3XrLtDSaMe4sf9y2YHz9bZFBCvFxCg1SzbCRNX8d36tzyYfe9sTv+k32No/v1WXwKs+rGN6964gNXm1i4Yj+fednBVKyOkhntVq/fnX0kJa54f5urpsaeKrgHzbU84kQUrI0mjvcef6gixtYf1tkkEJ8nEKDVFrypTfo1i1fPtz1b17RwFX/9AaN/6L68Z7V3pWl2/4xpsRL3rjFe6XLtuS3f74s2d+yAql1y44HuoOuXxMxpBfc3f5uiRsReKo9DdLU/Y7yB23cZTGAFOLjFBakJ/p/4a1OrPW86osXeEVPet6qxGbPq+1W5m390vMWd6j1iuZ63qbEJ8lvX9Am2dtZgdSysNtD9xe5iyKG9Iy7z9+9424KPNWeBmlawv33yNsuaOH6xgBSiI9TSJDGJv7geW+2r00O+v/GKyrzvLJEqtne8iHFxd0SNV7RYs/bnPcB240AABDoSURBVPg4+R2rpyRblxVIb7/nb7u6OdFCmucm+7vF7tbAU+1xkO4+r8C5b13qrowBpBAfp5Ag9RsxsKYO0lVPeEVLPG9pojr1tQ2dZlcnBzWpG9OQdlNYkNI94UZGC+ltN8zfzUn/gReoPQ7StGljS+5M/ow0LAaQQnycQoJUvrXPI94a/43bts7zU2bWJlYmv7LRKyuq8bxHcwVp5Up/e78bFy2kT1sk/N1wtzjwVHsgJL/v7jclBpBCfJxC+7BhRYd3vZKRX2y7r+c/Uma8oSVVNS92+XxlYsW/Fg5OVOUE0ruFF/qD8wreiBbS58VNln/++YZjvxN8pj0O0umHTp42bXDhORZX7iGF+DiF93ukx4u3VN3R89Lbkj/8pCBtHtf1kpIVnjeje48pWwd225RFSM9OmNDVXTlhwuL1fVzbcaNOd/0CTRcCpD98teXwMT9sNDf4TAEhzZ04sZu7auLEpQ2OooB0Q48ep7u2PXqMnHZFwQnFHZp/dWzD2yKDFN7jtEdca9czfYWdu3f92jGtWzQ9ZWKg2UK51m7ZRS2anvFcCBMFhFRcd2rua3AUBaSz6u69z7Rpfb7RqPlpd+58W1SQwnuc9ghIIcfV3zKu/lYByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSRUTSB2LY9SJfWJUm469Y1TLs2NUIurHpmEXimd2jiFNq4xRvaP+c79ht77zWYwatSlGXV8eo24Xz2wgxSQgyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSLagkBa1dnNSBwvaHdDke49FCym5mGd2PYoW0pz/PrjJSZM+DT5RcEh/cnXNjBbSgsw6JpSXzzq7eeOT7oo1pKIlOw1rEu9nBdK4ZkekIS1pcdzYSecUzIwSkr+YZ3Y5ihbSrMKTx0443Q2OA6R1d6UqKng9WkiLh6Y6v2BW+Zz9j7p56GkFE+MKafnHBlLtB1uzAemFJmPuTkPq2Gx5ZeW677SMENLzTUZPTvOpP4oYUsuj13322cbjD40DpHSrDy8OPkkIb+0WHtqxvPyCpi+Vly894RtxhXTbiwaSLBikJQsq05DWNyvyx6Pcq9FBWjx/Yx2f+qNoIVXe8YS/6+HWxQbSZQd/FHySECB1PXB++bKm5/uHg9wT8YQ0pH2n672iV0Z0Kl7geRXDu3cdsSFrb+0q6yAtckP9wRw3OTpIyer5xAJSuk9P+0bwSUKC9Gbh2BBmCQ5pduHN5eVPuwH+8XQ3Ip6QvH7+K9I1H/7zsS7bvCtLt/1jTEkdpPXPJKvKBqTn3F3+4A03DEg7tWH5y50bzQw+T0iQOhz+lxBmCQ6p3aGLyssfcMP846fc1XGG9LTnbUxUeFu/9LzFHWrTkBa0SfZ2NiA96ab6g2XuRiDt1DPOHfWbEOYJB9KbhXeGMU1gSLMLb0xup7nb/MGz7vI4Q1rseZsTH3vLhxQXd0vUZP8VaZI/KHPDgbRTH/1qaseCgcHnCQfS5Y1XhzFNYEjdGi9Mbh90Q/3BU+6aOENakoK0odPsam9pBtJuCgnSEneLP3gq/cIEpJ0a5F4NPEcokCqP+EkY0wSG9NbXz/R3c1x/f3dP+oUp3pDKimo879HsQ9rQ4uf+YIgrA1J9fxz3O3/3azc5HpBecpPCmCYwpBnpl6Jl+5/n7wa4p2IKqf/Df89AWplY8a+FgxNV2YZUeWmTdyor1x777UBz7WmQPio8syq5+6V7Jh6Qhrvgv4z1CwrpGjcrte/Q+Pny8kX/dUKgybIIaW7nPhlI3ozuPaZsHdhtQ3YgzZ00qZvrP2nSssr3Dj566B0/aDQnQkhzJ07s5q6aOHFpg6NoIX12nfvhqImdCr5fFQ9I3d2aMKYJDCnhFqb28w48csCgk/edHldI/5eCQepVd9nU9MrKRRe2aHraM4FmCwipuG4x9zU4ihjSp5NObrb/t66uCD5TKJAuKAxjluCQ/ruw7uDpc/Zvcup9wSbbIyCFHFd/y7j6WwUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkVUwgzXg8Rg2MegENGzY16hU07NqoF9CwWD1O08QzO8eQiPbMgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQghIRCGUl5CmTY56BQ2ad+emqJdQ34d3Lo16CfV9eeesqJfQoBl3ZnX6vISUaBf1Chp0R5uPo15Cfa+2eTzqJdS3tc3VUS+hQb2/n9XpgRQ0IKmAFPeApAKSDEg2IKmAJAMSUfwDElEIAYkohPIDUtGS1K4m8X7ECzFL2JSoyPmqYnAaTDWJd6Jegq7u6ZMpK+cvryDVfrA14oWYJSQh5XxVMTgNpthCWv6xgZSV85dXkGJYElLUS4hFsYV024u5efrEHNKnd15cfO+XXtErIzoVL/Bfk2sTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZruGS1g9qMvVC9Nv7SqGd+86YoO340vZXkPmDqsTLw/u13epZxaQ69PjQ6oZNrLmr+N7dS750Nue+F2/yTvuNadnZ+eGtO90febpk1nH3vjW7oaxm9cPmO4VXfPhPx/rss0/A0UDt3ivdNnmDRr/RfXjPas3tn9/+8brZmeGWV9QgyXU9ivdVjUkDenK0m3/GFPi7Vhd1tdQd4c1iev+6r3aYYtZQK5Pjw+ptORLb9CtW758uOvfkutY9c8d95rTs7NL/fxXpPTTp/6k7XWQVic2JjflXtHTnrcx/ZQtmuu/n/pkVWJz8s1ut7JVidWet93LDLO+ogZL+KO/uCXpVW390vMWd6jNfCn7a6i7w5rEc8l//a6v7LqAnJ+eJKQn+n+RfMDWel71xQu8oie9+nvN6dnZpRSk9NOn/qTtdZDebF+b2hctTr5ZSXycehanD8sSqWbX3tOhZNZ6LzPM+ooaLqH9ds/7JA1p+ZDi4m6JmsyXsr+GujusSSxL3nDVrF0XkPPTU5MYm/hD5gHr/xuvKIl2x73m9OzsUgpS3f3uOGl7HaRF/nPVS/+0mIGUPlyayLxP2TRvZIey+mGWa7CE+f6TZk0K0oZOs6u9pf5TZUluIGXusCaRfI54V/x61wXk/PTUJPqNGFhTB+mqJ1LryNxrbs/OLvV7ccfTp/6k7XWQ1vifiX30wm4grU2sTH59o1ezJbmbPjgzzHoNlrA8Uen/qetDKiuq8bxHcwgpc4c1ieS7lurOr+26gJyfnppE+dY+jyQfsOQbt22d56fWkbnX3J6dXWoAqf6k7XWQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpj1BTVYQnWP0q3rbk5BWplY8a+FgxNVOYOUucOaxICK6lkd/2YWkOvT43/YsKLDu17JyC+23dfzH+lPnOvuNbdnZ5f6P/z3zP3Wn7S9D9KWO7r0nLZtd5A2j+t6SckKr3ZWn4697v57Zpj1Gi7ho+s7X/1O4s/+TTO695iydWC3TbmClLnDDYkXb+rcr9wzC8j16Un9Hunx4i1Vd/S89LZ1db+6ydxrTs/OLs3t3GfHA7bjpO19kGg3NfgTNba/B93rAlLetf0j/zPtdECKS0DKuxZ2GFWbOQZSXAISUQgBiSiEgEQUQkAiCiEgEYUQkPK3X7pMp+32622Pzu169uqAlL+9PnXq1Gtd5+TWXNb9nv+4AimHASm/e92V7u7mKUDKcUDK7+ognXn28984w2vd2j8u+qp3QfLtXhuv7XFrLmze/JLsX8lLQMr36iCdd/I373mhHtKfilz5h17blq1HP3tjwS+iXeFeEpDyuzpIbd2c5HYHJK+f23Hjj74W4fL2noCU32UgNf6XZyE19a/J61UY4fL2noCU32UgHeFvd4V0tD/sx0OcizjL+V0G0tH+FkjRxVnO73aCdOpJ/vY0IEUQZzm/2wnSeYckfyja1CwJ6TL3P0DKaZzl/G4nSJPdmMp3f/ydJKQR7rangZTLOMv53U6Qqm84sknr5we08Ly/nNqoFZByGWeZKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYXQ/wMhANIDIZLX1QAAAABJRU5ErkJggg=="
},
"metadata": {
"image/png": {
"width": 420,
"height": 420
}
}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 436
},
"id": "HsAtwukyLsvt",
"outputId": "3032a224-a2c8-4270-b4f2-7bb620317400"
}
},
{
"cell_type": "markdown",
"source": [
"Die dunkleren Felder in der Konfusionsmatrix zeigen eine hohe Anzahl von Fällen an, und idealerweise sehen Sie eine diagonale Linie aus dunkleren Feldern, die Fälle markieren, bei denen die vorhergesagte und die tatsächliche Kategorie übereinstimmen.\n",
"\n",
"Lassen Sie uns nun zusammenfassende Statistiken für die Konfusionsmatrix berechnen.\n"
],
"metadata": {
"id": "oOJC87dkLwPr"
}
},
{
"cell_type": "code",
"execution_count": 12,
"source": [
"# Summary stats for confusion matrix\n",
"conf_mat(data = results, truth = cuisine, estimate = .pred_class) %>% \n",
"summary()"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" .metric .estimator .estimate\n",
"1 accuracy multiclass 0.7880435\n",
"2 kap multiclass 0.7276583\n",
"3 sens macro 0.7780927\n",
"4 spec macro 0.9477598\n",
"5 ppv macro 0.7585583\n",
"6 npv macro 0.9460080\n",
"7 mcc multiclass 0.7292724\n",
"8 j_index macro 0.7258524\n",
"9 bal_accuracy macro 0.8629262\n",
"10 detection_prevalence macro 0.2000000\n",
"11 precision macro 0.7585583\n",
"12 recall macro 0.7780927\n",
"13 f_meas macro 0.7641862"
],
"text/markdown": [
"\n",
"A tibble: 13 × 3\n",
"\n",
"| .metric &lt;chr&gt; | .estimator &lt;chr&gt; | .estimate &lt;dbl&gt; |\n",
"|---|---|---|\n",
"| accuracy | multiclass | 0.7880435 |\n",
"| kap | multiclass | 0.7276583 |\n",
"| sens | macro | 0.7780927 |\n",
"| spec | macro | 0.9477598 |\n",
"| ppv | macro | 0.7585583 |\n",
"| npv | macro | 0.9460080 |\n",
"| mcc | multiclass | 0.7292724 |\n",
"| j_index | macro | 0.7258524 |\n",
"| bal_accuracy | macro | 0.8629262 |\n",
"| detection_prevalence | macro | 0.2000000 |\n",
"| precision | macro | 0.7585583 |\n",
"| recall | macro | 0.7780927 |\n",
"| f_meas | macro | 0.7641862 |\n",
"\n"
],
"text/latex": [
"A tibble: 13 × 3\n",
"\\begin{tabular}{lll}\n",
" .metric & .estimator & .estimate\\\\\n",
" <chr> & <chr> & <dbl>\\\\\n",
"\\hline\n",
"\t accuracy & multiclass & 0.7880435\\\\\n",
"\t kap & multiclass & 0.7276583\\\\\n",
"\t sens & macro & 0.7780927\\\\\n",
"\t spec & macro & 0.9477598\\\\\n",
"\t ppv & macro & 0.7585583\\\\\n",
"\t npv & macro & 0.9460080\\\\\n",
"\t mcc & multiclass & 0.7292724\\\\\n",
"\t j\\_index & macro & 0.7258524\\\\\n",
"\t bal\\_accuracy & macro & 0.8629262\\\\\n",
"\t detection\\_prevalence & macro & 0.2000000\\\\\n",
"\t precision & macro & 0.7585583\\\\\n",
"\t recall & macro & 0.7780927\\\\\n",
"\t f\\_meas & macro & 0.7641862\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 13 × 3</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>.metric</th><th scope=col>.estimator</th><th scope=col>.estimate</th></tr>\n",
"\t<tr><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>accuracy </td><td>multiclass</td><td>0.7880435</td></tr>\n",
"\t<tr><td>kap </td><td>multiclass</td><td>0.7276583</td></tr>\n",
"\t<tr><td>sens </td><td>macro </td><td>0.7780927</td></tr>\n",
"\t<tr><td>spec </td><td>macro </td><td>0.9477598</td></tr>\n",
"\t<tr><td>ppv </td><td>macro </td><td>0.7585583</td></tr>\n",
"\t<tr><td>npv </td><td>macro </td><td>0.9460080</td></tr>\n",
"\t<tr><td>mcc </td><td>multiclass</td><td>0.7292724</td></tr>\n",
"\t<tr><td>j_index </td><td>macro </td><td>0.7258524</td></tr>\n",
"\t<tr><td>bal_accuracy </td><td>macro </td><td>0.8629262</td></tr>\n",
"\t<tr><td>detection_prevalence</td><td>macro </td><td>0.2000000</td></tr>\n",
"\t<tr><td>precision </td><td>macro </td><td>0.7585583</td></tr>\n",
"\t<tr><td>recall </td><td>macro </td><td>0.7780927</td></tr>\n",
"\t<tr><td>f_meas </td><td>macro </td><td>0.7641862</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 494
},
"id": "OYqetUyzL5Wz",
"outputId": "6a84d65e-113d-4281-dfc1-16e8b70f37e6"
}
},
{
"cell_type": "markdown",
"source": [
"Wenn wir uns auf einige Metriken wie Genauigkeit, Sensitivität und PPV konzentrieren, ist das für den Anfang gar nicht so schlecht 🥳!\n",
"\n",
"## 4. Tiefer eintauchen\n",
"\n",
"Stellen wir uns eine subtile Frage: Nach welchen Kriterien wird eine bestimmte Art von Küche als vorhergesagtes Ergebnis ausgewählt?\n",
"\n",
"Nun, statistische Machine-Learning-Algorithmen, wie die logistische Regression, basieren auf `Wahrscheinlichkeit`. Das bedeutet, dass ein Klassifikator tatsächlich eine Wahrscheinlichkeitsverteilung über eine Menge möglicher Ergebnisse vorhersagt. Die Klasse mit der höchsten Wahrscheinlichkeit wird dann als das wahrscheinlichste Ergebnis für die gegebenen Beobachtungen ausgewählt.\n",
"\n",
"Schauen wir uns das in der Praxis an, indem wir sowohl harte Klassenentscheidungen als auch Wahrscheinlichkeiten betrachten.\n"
],
"metadata": {
"id": "43t7vz8vMJtW"
}
},
{
"cell_type": "code",
"execution_count": 13,
"source": [
"# Make hard class prediction and probabilities\n",
"results_prob <- cuisines_test %>%\n",
" select(cuisine) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test)) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test, type = \"prob\"))\n",
"\n",
"# Print out results\n",
"results_prob %>% \n",
" slice_head(n = 5)"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine .pred_class .pred_chinese .pred_indian .pred_japanese .pred_korean\n",
"1 indian thai 1.551259e-03 0.4587877 5.988039e-04 2.428503e-04\n",
"2 indian indian 2.637133e-05 0.9999488 6.648651e-07 2.259993e-05\n",
"3 indian indian 1.049433e-03 0.9909982 1.060937e-03 1.644947e-05\n",
"4 indian indian 6.237482e-02 0.4763035 9.136702e-02 3.660913e-01\n",
"5 indian indian 1.431745e-02 0.9418551 2.945239e-02 8.721782e-03\n",
" .pred_thai \n",
"1 5.388194e-01\n",
"2 1.577948e-06\n",
"3 6.874989e-03\n",
"4 3.863391e-03\n",
"5 5.653283e-03"
],
"text/markdown": [
"\n",
"A tibble: 5 × 7\n",
"\n",
"| cuisine &lt;fct&gt; | .pred_class &lt;fct&gt; | .pred_chinese &lt;dbl&gt; | .pred_indian &lt;dbl&gt; | .pred_japanese &lt;dbl&gt; | .pred_korean &lt;dbl&gt; | .pred_thai &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|\n",
"| indian | thai | 1.551259e-03 | 0.4587877 | 5.988039e-04 | 2.428503e-04 | 5.388194e-01 |\n",
"| indian | indian | 2.637133e-05 | 0.9999488 | 6.648651e-07 | 2.259993e-05 | 1.577948e-06 |\n",
"| indian | indian | 1.049433e-03 | 0.9909982 | 1.060937e-03 | 1.644947e-05 | 6.874989e-03 |\n",
"| indian | indian | 6.237482e-02 | 0.4763035 | 9.136702e-02 | 3.660913e-01 | 3.863391e-03 |\n",
"| indian | indian | 1.431745e-02 | 0.9418551 | 2.945239e-02 | 8.721782e-03 | 5.653283e-03 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 7\n",
"\\begin{tabular}{lllllll}\n",
" cuisine & .pred\\_class & .pred\\_chinese & .pred\\_indian & .pred\\_japanese & .pred\\_korean & .pred\\_thai\\\\\n",
" <fct> & <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t indian & thai & 1.551259e-03 & 0.4587877 & 5.988039e-04 & 2.428503e-04 & 5.388194e-01\\\\\n",
"\t indian & indian & 2.637133e-05 & 0.9999488 & 6.648651e-07 & 2.259993e-05 & 1.577948e-06\\\\\n",
"\t indian & indian & 1.049433e-03 & 0.9909982 & 1.060937e-03 & 1.644947e-05 & 6.874989e-03\\\\\n",
"\t indian & indian & 6.237482e-02 & 0.4763035 & 9.136702e-02 & 3.660913e-01 & 3.863391e-03\\\\\n",
"\t indian & indian & 1.431745e-02 & 0.9418551 & 2.945239e-02 & 8.721782e-03 & 5.653283e-03\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 7</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>.pred_class</th><th scope=col>.pred_chinese</th><th scope=col>.pred_indian</th><th scope=col>.pred_japanese</th><th scope=col>.pred_korean</th><th scope=col>.pred_thai</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>thai </td><td>1.551259e-03</td><td>0.4587877</td><td>5.988039e-04</td><td>2.428503e-04</td><td>5.388194e-01</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>2.637133e-05</td><td>0.9999488</td><td>6.648651e-07</td><td>2.259993e-05</td><td>1.577948e-06</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>1.049433e-03</td><td>0.9909982</td><td>1.060937e-03</td><td>1.644947e-05</td><td>6.874989e-03</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>6.237482e-02</td><td>0.4763035</td><td>9.136702e-02</td><td>3.660913e-01</td><td>3.863391e-03</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>1.431745e-02</td><td>0.9418551</td><td>2.945239e-02</td><td>8.721782e-03</td><td>5.653283e-03</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 248
},
"id": "xdKNs-ZPMTJL",
"outputId": "68f6ac5a-725a-4eff-9ea6-481fef00e008"
}
},
{
"cell_type": "markdown",
"source": [
"✅ Können Sie erklären, warum das Modell ziemlich sicher ist, dass die erste Beobachtung thailändisch ist?\n",
"\n",
"## **🚀Herausforderung**\n",
"\n",
"In dieser Lektion haben Sie Ihre bereinigten Daten verwendet, um ein Machine-Learning-Modell zu erstellen, das anhand einer Reihe von Zutaten eine nationale Küche vorhersagen kann. Nehmen Sie sich etwas Zeit, um die [vielen Optionen](https://www.tidymodels.org/find/parsnip/#models) zu erkunden, die Tidymodels zur Klassifizierung von Daten bietet, sowie [andere Möglichkeiten](https://parsnip.tidymodels.org/articles/articles/Examples.html#multinom_reg-models), um eine multinomiale Regression anzupassen.\n",
"\n",
"#### EIN GROSSES DANKESCHÖN AN:\n",
"\n",
"[`Allison Horst`](https://twitter.com/allison_horst/) für die Erstellung der großartigen Illustrationen, die R einladender und ansprechender machen. Weitere Illustrationen finden Sie in ihrer [Galerie](https://www.google.com/url?q=https://github.com/allisonhorst/stats-illustrations&sa=D&source=editors&ust=1626380772530000&usg=AOvVaw3zcfyCizFQZpkSLzxiiQEM).\n",
"\n",
"[Cassie Breviu](https://www.twitter.com/cassieview) und [Jen Looper](https://www.twitter.com/jenlooper) für die Erstellung der ursprünglichen Python-Version dieses Moduls ♥️\n",
"\n",
"<br>\n",
"Hätte gerne ein paar Witze eingebaut, aber ich verstehe keine Food-Wortspiele 😅.\n",
"\n",
"<br>\n",
"\n",
"Viel Spaß beim Lernen,\n",
"\n",
"[Eric](https://twitter.com/ericntay), Gold Microsoft Learn Student Ambassador.\n"
],
"metadata": {
"id": "2tWVHMeLMYdM"
}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Haftungsausschluss**: \nDieses Dokument wurde mithilfe des KI-Übersetzungsdienstes [Co-op Translator](https://github.com/Azure/co-op-translator) übersetzt. Obwohl wir uns um Genauigkeit bemühen, weisen wir darauf hin, dass automatisierte Übersetzungen Fehler oder Ungenauigkeiten enthalten können. Das Originaldokument in seiner ursprünglichen Sprache sollte als maßgebliche Quelle betrachtet werden. Für kritische Informationen wird eine professionelle menschliche Übersetzung empfohlen. Wir übernehmen keine Haftung für Missverständnisse oder Fehlinterpretationen, die aus der Nutzung dieser Übersetzung entstehen.\n"
]
}
]
}