You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/translations/lt/4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb

1300 lines
77 KiB

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

{
"nbformat": 4,
"nbformat_minor": 2,
"metadata": {
"colab": {
"name": "lesson_11-R.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "ir",
"display_name": "R"
},
"language_info": {
"name": "R"
},
"coopTranslator": {
"original_hash": "6ea6a5171b1b99b7b5a55f7469c048d2",
"translation_date": "2025-09-03T20:21:38+00:00",
"source_file": "4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb",
"language_code": "lt"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"# Sukurkite klasifikavimo modelį: Skanūs Azijos ir Indijos patiekalai\n"
],
"metadata": {
"id": "zs2woWv_HoE8"
}
},
{
"cell_type": "markdown",
"source": [
"## Virtuvės klasifikatoriai 1\n",
"\n",
"Šioje pamokoje nagrinėsime įvairius klasifikatorius, kad *prognozuotume tam tikrą nacionalinę virtuvę pagal ingredientų grupę.* Tuo pačiu sužinosime daugiau apie tai, kaip algoritmai gali būti naudojami klasifikavimo užduotims.\n",
"\n",
"### [**Prieš paskaitos testas**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/21/)\n",
"\n",
"### **Pasiruošimas**\n",
"\n",
"Ši pamoka remiasi mūsų [ankstesne pamoka](https://github.com/microsoft/ML-For-Beginners/blob/main/4-Classification/1-Introduction/solution/lesson_10-R.ipynb), kurioje:\n",
"\n",
"- Švelniai supažindinome su klasifikacijomis, naudodami duomenų rinkinį apie visas nuostabias Azijos ir Indijos virtuves 😋.\n",
"\n",
"- Išnagrinėjome keletą [dplyr veiksmažodžių](https://dplyr.tidyverse.org/), kad paruoštume ir išvalytume duomenis.\n",
"\n",
"- Sukūrėme gražias vizualizacijas naudodami ggplot2.\n",
"\n",
"- Parodėme, kaip spręsti nesubalansuotų duomenų problemą, juos iš anksto apdorojant naudojant [recipes](https://recipes.tidymodels.org/articles/Simple_Example.html).\n",
"\n",
"- Parodėme, kaip `prep` ir `bake` mūsų receptą, kad įsitikintume, jog jis veikia taip, kaip numatyta.\n",
"\n",
"#### **Būtinos sąlygos**\n",
"\n",
"Šiai pamokai mums reikės šių paketų, kad galėtume išvalyti, paruošti ir vizualizuoti duomenis:\n",
"\n",
"- `tidyverse`: [tidyverse](https://www.tidyverse.org/) yra [R paketų kolekcija](https://www.tidyverse.org/packages), sukurta tam, kad duomenų mokslas būtų greitesnis, lengvesnis ir smagesnis!\n",
"\n",
"- `tidymodels`: [tidymodels](https://www.tidymodels.org/) sistema yra [paketų kolekcija](https://www.tidymodels.org/packages/) modeliavimui ir mašininio mokymosi užduotims.\n",
"\n",
"- `themis`: [themis paketas](https://themis.tidymodels.org/) suteikia papildomus receptų žingsnius nesubalansuotų duomenų tvarkymui.\n",
"\n",
"- `nnet`: [nnet paketas](https://cran.r-project.org/web/packages/nnet/nnet.pdf) siūlo funkcijas, skirtas vieno paslėpto sluoksnio tiesioginio perdavimo neuroniniams tinklams ir daugianario logistinės regresijos modeliams įvertinti.\n",
"\n",
"Galite juos įdiegti taip:\n"
],
"metadata": {
"id": "iDFOb3ebHwQC"
}
},
{
"cell_type": "markdown",
"source": [
"`install.packages(c(\"tidyverse\", \"tidymodels\", \"DataExplorer\", \"here\"))`\n",
"\n",
"Arba, žemiau pateiktas scenarijus patikrina, ar turite moduliui užbaigti reikalingus paketus, ir, jei jų trūksta, įdiegia juos už jus.\n"
],
"metadata": {
"id": "4V85BGCjII7F"
}
},
{
"cell_type": "code",
"execution_count": 2,
"source": [
"suppressWarnings(if (!require(\"pacman\"))install.packages(\"pacman\"))\r\n",
"\r\n",
"pacman::p_load(tidyverse, tidymodels, themis, here)"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Loading required package: pacman\n",
"\n"
]
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "an5NPyyKIKNR",
"outputId": "834d5e74-f4b8-49f9-8ab5-4c52ff2d7bc8"
}
},
{
"cell_type": "markdown",
"source": [
"## 1. Padalinkite duomenis į mokymo ir testavimo rinkinius.\n",
"\n",
"Pradėsime pasirinkdami kelis žingsnius iš ankstesnės pamokos.\n",
"\n",
"### Pašalinkite dažniausiai pasitaikančius ingredientus, kurie sukelia painiavą tarp skirtingų virtuvių, naudodami `dplyr::select()`.\n",
"\n",
"Visi mėgsta ryžius, česnaką ir imbierą!\n"
],
"metadata": {
"id": "0ax9GQLBINVv"
}
},
{
"cell_type": "code",
"execution_count": 3,
"source": [
"# Load the original cuisines data\r\n",
"df <- read_csv(file = \"https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/4-Classification/data/cuisines.csv\")\r\n",
"\r\n",
"# Drop id column, rice, garlic and ginger from our original data set\r\n",
"df_select <- df %>% \r\n",
" select(-c(1, rice, garlic, ginger)) %>%\r\n",
" # Encode cuisine column as categorical\r\n",
" mutate(cuisine = factor(cuisine))\r\n",
"\r\n",
"# Display new data set\r\n",
"df_select %>% \r\n",
" slice_head(n = 5)\r\n",
"\r\n",
"# Display distribution of cuisines\r\n",
"df_select %>% \r\n",
" count(cuisine) %>% \r\n",
" arrange(desc(n))"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"New names:\n",
"* `` -> ...1\n",
"\n",
"\u001b[1m\u001b[1mRows: \u001b[1m\u001b[22m\u001b[34m\u001b[34m2448\u001b[34m\u001b[39m \u001b[1m\u001b[1mColumns: \u001b[1m\u001b[22m\u001b[34m\u001b[34m385\u001b[34m\u001b[39m\n",
"\n",
"\u001b[36m──\u001b[39m \u001b[1m\u001b[1mColumn specification\u001b[1m\u001b[22m \u001b[36m────────────────────────────────────────────────────────\u001b[39m\n",
"\u001b[1mDelimiter:\u001b[22m \",\"\n",
"\u001b[31mchr\u001b[39m (1): cuisine\n",
"\u001b[32mdbl\u001b[39m (384): ...1, almond, angelica, anise, anise_seed, apple, apple_brandy, a...\n",
"\n",
"\n",
"\u001b[36m\u001b[39m Use \u001b[30m\u001b[47m\u001b[30m\u001b[47m`spec()`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to retrieve the full column specification for this data.\n",
"\u001b[36m\u001b[39m Specify the column types or set \u001b[30m\u001b[47m\u001b[30m\u001b[47m`show_col_types = FALSE`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to quiet this message.\n",
"\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n",
"1 indian 0 0 0 0 0 0 0 0 \n",
"2 indian 1 0 0 0 0 0 0 0 \n",
"3 indian 0 0 0 0 0 0 0 0 \n",
"4 indian 0 0 0 0 0 0 0 0 \n",
"5 indian 0 0 0 0 0 0 0 0 \n",
" artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n",
"1 0 ⋯ 0 0 0 0 0 0 \n",
"2 0 ⋯ 0 0 0 0 0 0 \n",
"3 0 ⋯ 0 0 0 0 0 0 \n",
"4 0 ⋯ 0 0 0 0 0 0 \n",
"5 0 ⋯ 0 0 0 0 0 0 \n",
" yam yeast yogurt zucchini\n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"5 0 0 1 0 "
],
"text/markdown": [
"\n",
"A tibble: 5 × 381\n",
"\n",
"| cuisine &lt;fct&gt; | almond &lt;dbl&gt; | angelica &lt;dbl&gt; | anise &lt;dbl&gt; | anise_seed &lt;dbl&gt; | apple &lt;dbl&gt; | apple_brandy &lt;dbl&gt; | apricot &lt;dbl&gt; | armagnac &lt;dbl&gt; | artemisia &lt;dbl&gt; | ⋯ ⋯ | whiskey &lt;dbl&gt; | white_bread &lt;dbl&gt; | white_wine &lt;dbl&gt; | whole_grain_wheat_flour &lt;dbl&gt; | wine &lt;dbl&gt; | wood &lt;dbl&gt; | yam &lt;dbl&gt; | yeast &lt;dbl&gt; | yogurt &lt;dbl&gt; | zucchini &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 381\n",
"\\begin{tabular}{lllllllllllllllllllll}\n",
" cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n",
" <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & ⋯ & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 381</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>almond</th><th scope=col>angelica</th><th scope=col>anise</th><th scope=col>anise_seed</th><th scope=col>apple</th><th scope=col>apple_brandy</th><th scope=col>apricot</th><th scope=col>armagnac</th><th scope=col>artemisia</th><th scope=col>⋯</th><th scope=col>whiskey</th><th scope=col>white_bread</th><th scope=col>white_wine</th><th scope=col>whole_grain_wheat_flour</th><th scope=col>wine</th><th scope=col>wood</th><th scope=col>yam</th><th scope=col>yeast</th><th scope=col>yogurt</th><th scope=col>zucchini</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>⋯</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>indian</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine n \n",
"1 korean 799\n",
"2 indian 598\n",
"3 chinese 442\n",
"4 japanese 320\n",
"5 thai 289"
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | n &lt;int&gt; |\n",
"|---|---|\n",
"| korean | 799 |\n",
"| indian | 598 |\n",
"| chinese | 442 |\n",
"| japanese | 320 |\n",
"| thai | 289 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & n\\\\\n",
" <fct> & <int>\\\\\n",
"\\hline\n",
"\t korean & 799\\\\\n",
"\t indian & 598\\\\\n",
"\t chinese & 442\\\\\n",
"\t japanese & 320\\\\\n",
"\t thai & 289\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>n</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;int&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>korean </td><td>799</td></tr>\n",
"\t<tr><td>indian </td><td>598</td></tr>\n",
"\t<tr><td>chinese </td><td>442</td></tr>\n",
"\t<tr><td>japanese</td><td>320</td></tr>\n",
"\t<tr><td>thai </td><td>289</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 735
},
"id": "jhCrrH22IWVR",
"outputId": "d444a85c-1d8b-485f-bc4f-8be2e8f8217c"
}
},
{
"cell_type": "markdown",
"source": [
"Puiku! Dabar laikas padalyti duomenis taip, kad 70% duomenų būtų skirta mokymui, o 30% testavimui. Taip pat pritaikysime `stratifikacijos` techniką, kad `išlaikytume kiekvienos virtuvės proporcijas` mokymo ir validacijos rinkiniuose.\n",
"\n",
"[rsample](https://rsample.tidymodels.org/), Tidymodels paketas, suteikia infrastruktūrą efektyviam duomenų skirstymui ir resamplingui:\n"
],
"metadata": {
"id": "AYTjVyajIdny"
}
},
{
"cell_type": "code",
"execution_count": 4,
"source": [
"# Load the core Tidymodels packages into R session\r\n",
"library(tidymodels)\r\n",
"\r\n",
"# Create split specification\r\n",
"set.seed(2056)\r\n",
"cuisines_split <- initial_split(data = df_select,\r\n",
" strata = cuisine,\r\n",
" prop = 0.7)\r\n",
"\r\n",
"# Extract the data in each split\r\n",
"cuisines_train <- training(cuisines_split)\r\n",
"cuisines_test <- testing(cuisines_split)\r\n",
"\r\n",
"# Print the number of cases in each split\r\n",
"cat(\"Training cases: \", nrow(cuisines_train), \"\\n\",\r\n",
" \"Test cases: \", nrow(cuisines_test), sep = \"\")\r\n",
"\r\n",
"# Display the first few rows of the training set\r\n",
"cuisines_train %>% \r\n",
" slice_head(n = 5)\r\n",
"\r\n",
"\r\n",
"# Display distribution of cuisines in the training set\r\n",
"cuisines_train %>% \r\n",
" count(cuisine) %>% \r\n",
" arrange(desc(n))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Training cases: 1712\n",
"Test cases: 736"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n",
"1 chinese 0 0 0 0 0 0 0 0 \n",
"2 chinese 0 0 0 0 0 0 0 0 \n",
"3 chinese 0 0 0 0 0 0 0 0 \n",
"4 chinese 0 0 0 0 0 0 0 0 \n",
"5 chinese 0 0 0 0 0 0 0 0 \n",
" artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n",
"1 0 ⋯ 0 0 0 0 1 0 \n",
"2 0 ⋯ 0 0 0 0 1 0 \n",
"3 0 ⋯ 0 0 0 0 0 0 \n",
"4 0 ⋯ 0 0 0 0 0 0 \n",
"5 0 ⋯ 0 0 0 0 0 0 \n",
" yam yeast yogurt zucchini\n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"5 0 0 0 0 "
],
"text/markdown": [
"\n",
"A tibble: 5 × 381\n",
"\n",
"| cuisine &lt;fct&gt; | almond &lt;dbl&gt; | angelica &lt;dbl&gt; | anise &lt;dbl&gt; | anise_seed &lt;dbl&gt; | apple &lt;dbl&gt; | apple_brandy &lt;dbl&gt; | apricot &lt;dbl&gt; | armagnac &lt;dbl&gt; | artemisia &lt;dbl&gt; | ⋯ ⋯ | whiskey &lt;dbl&gt; | white_bread &lt;dbl&gt; | white_wine &lt;dbl&gt; | whole_grain_wheat_flour &lt;dbl&gt; | wine &lt;dbl&gt; | wood &lt;dbl&gt; | yam &lt;dbl&gt; | yeast &lt;dbl&gt; | yogurt &lt;dbl&gt; | zucchini &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 381\n",
"\\begin{tabular}{lllllllllllllllllllll}\n",
" cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n",
" <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & ⋯ & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 381</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>almond</th><th scope=col>angelica</th><th scope=col>anise</th><th scope=col>anise_seed</th><th scope=col>apple</th><th scope=col>apple_brandy</th><th scope=col>apricot</th><th scope=col>armagnac</th><th scope=col>artemisia</th><th scope=col>⋯</th><th scope=col>whiskey</th><th scope=col>white_bread</th><th scope=col>white_wine</th><th scope=col>whole_grain_wheat_flour</th><th scope=col>wine</th><th scope=col>wood</th><th scope=col>yam</th><th scope=col>yeast</th><th scope=col>yogurt</th><th scope=col>zucchini</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>⋯</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"\t<tr><td>chinese</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>⋯</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine n \n",
"1 korean 559\n",
"2 indian 418\n",
"3 chinese 309\n",
"4 japanese 224\n",
"5 thai 202"
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | n &lt;int&gt; |\n",
"|---|---|\n",
"| korean | 559 |\n",
"| indian | 418 |\n",
"| chinese | 309 |\n",
"| japanese | 224 |\n",
"| thai | 202 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & n\\\\\n",
" <fct> & <int>\\\\\n",
"\\hline\n",
"\t korean & 559\\\\\n",
"\t indian & 418\\\\\n",
"\t chinese & 309\\\\\n",
"\t japanese & 224\\\\\n",
"\t thai & 202\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>n</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;int&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>korean </td><td>559</td></tr>\n",
"\t<tr><td>indian </td><td>418</td></tr>\n",
"\t<tr><td>chinese </td><td>309</td></tr>\n",
"\t<tr><td>japanese</td><td>224</td></tr>\n",
"\t<tr><td>thai </td><td>202</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 535
},
"id": "w5FWIkEiIjdN",
"outputId": "2e195fd9-1a8f-4b91-9573-cce5582242df"
}
},
{
"cell_type": "markdown",
"source": [
"## 2. Spręsti duomenų disbalansą\n",
"\n",
"Kaip galėjote pastebėti pradiniame duomenų rinkinyje, taip pat ir mūsų mokymo rinkinyje, yra gana nelygiavertis virtuvių skaičiaus pasiskirstymas. Korėjiečių virtuvės yra *beveik* 3 kartus daugiau nei tailandiečių virtuvės. Duomenų disbalansas dažnai neigiamai veikia modelio našumą. Daugelis modelių geriausiai veikia, kai stebėjimų skaičius yra vienodas, todėl jiems dažnai sunku dirbti su nesubalansuotais duomenimis.\n",
"\n",
"Yra du pagrindiniai būdai spręsti nesubalansuotų duomenų rinkinius:\n",
"\n",
"- pridėti stebėjimų prie mažumos klasės: `Per-sampling` (pvz., naudojant SMOTE algoritmą, kuris sintetiškai generuoja naujus mažumos klasės pavyzdžius, remdamasis artimiausiais kaimynais).\n",
"\n",
"- pašalinti stebėjimus iš daugumos klasės: `Under-sampling`\n",
"\n",
"Ankstesnėje pamokoje parodėme, kaip spręsti nesubalansuotų duomenų rinkinius naudojant `recipe`. Receptą galima laikyti planu, kuris aprašo, kokius veiksmus reikia taikyti duomenų rinkiniui, kad jis būtų paruoštas duomenų analizei. Mūsų atveju, norime užtikrinti vienodą virtuvių pasiskirstymą mūsų `mokymo rinkinyje`. Pradėkime!\n"
],
"metadata": {
"id": "daBi9qJNIwqW"
}
},
{
"cell_type": "code",
"execution_count": 5,
"source": [
"# Load themis package for dealing with imbalanced data\r\n",
"library(themis)\r\n",
"\r\n",
"# Create a recipe for preprocessing training data\r\n",
"cuisines_recipe <- recipe(cuisine ~ ., data = cuisines_train) %>% \r\n",
" step_smote(cuisine)\r\n",
"\r\n",
"# Print recipe\r\n",
"cuisines_recipe"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Data Recipe\n",
"\n",
"Inputs:\n",
"\n",
" role #variables\n",
" outcome 1\n",
" predictor 380\n",
"\n",
"Operations:\n",
"\n",
"SMOTE based on cuisine"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 200
},
"id": "Az6LFBGxI1X0",
"outputId": "29d71d85-64b0-4e62-871e-bcd5398573b6"
}
},
{
"cell_type": "markdown",
"source": [
"Galite, žinoma, patvirtinti (naudodami prep+bake), kad receptas veikia taip, kaip tikitės - visi virtuvės etiketės turi „559“ stebėjimus.\n",
"\n",
"Kadangi šį receptą naudosime kaip išankstinį apdorojimą modeliavimui, `workflow()` atliks visą paruošimą ir kepimą už mus, todėl nereikės rankiniu būdu įvertinti recepto.\n",
"\n",
"Dabar esame pasiruošę treniruoti modelį 👩‍💻👨‍💻!\n",
"\n",
"## 3. Klasifikatoriaus pasirinkimas\n",
"\n",
"<p >\n",
" <img src=\"../../images/parsnip.jpg\"\n",
" width=\"600\"/>\n",
" <figcaption>Piešinys @allison_horst</figcaption>\n"
],
"metadata": {
"id": "NBL3PqIWJBBB"
}
},
{
"cell_type": "markdown",
"source": [
"Dabar turime nuspręsti, kurį algoritmą naudoti šiam darbui 🤔.\n",
"\n",
"Tidymodels pakete [`parsnip package`](https://parsnip.tidymodels.org/index.html) pateikiama nuosekli sąsaja darbui su modeliais, naudojant skirtingus variklius (paketus). Peržiūrėkite parsnip dokumentaciją, kad susipažintumėte su [modelių tipais ir varikliais](https://www.tidymodels.org/find/parsnip/#models) bei jų atitinkamais [modelių argumentais](https://www.tidymodels.org/find/parsnip/#model-args). Įvairovė iš pradžių gali atrodyti stulbinanti. Pavyzdžiui, šie metodai apima klasifikavimo technikas:\n",
"\n",
"- C5.0 taisyklėmis pagrįsti klasifikavimo modeliai\n",
"\n",
"- Lankstūs diskriminantų modeliai\n",
"\n",
"- Linijiniai diskriminantų modeliai\n",
"\n",
"- Reguliarizuoti diskriminantų modeliai\n",
"\n",
"- Logistinės regresijos modeliai\n",
"\n",
"- Daugianarės regresijos modeliai\n",
"\n",
"- Naivūs Bayes modeliai\n",
"\n",
"- Atraminių vektorių mašinos\n",
"\n",
"- Artimiausių kaimynų metodas\n",
"\n",
"- Sprendimų medžiai\n",
"\n",
"- Ansamblio metodai\n",
"\n",
"- Neuroniniai tinklai\n",
"\n",
"Sąrašas tęsiasi!\n",
"\n",
"### **Kokį klasifikatorių pasirinkti?**\n",
"\n",
"Taigi, kurį klasifikatorių reikėtų rinktis? Dažnai geriausias būdas yra išbandyti kelis ir ieškoti gero rezultato.\n",
"\n",
"> AutoML puikiai išsprendžia šią problemą, atlikdamas palyginimus debesyje ir leisdamas pasirinkti geriausią algoritmą jūsų duomenims. Išbandykite [čia](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-77952-leestott)\n",
"\n",
"Be to, klasifikatoriaus pasirinkimas priklauso nuo mūsų problemos. Pavyzdžiui, kai rezultatas gali būti suskirstytas į `daugiau nei dvi klases`, kaip mūsų atveju, turite naudoti `daugianarės klasifikacijos algoritmą`, o ne `dvejetainę klasifikaciją.`\n",
"\n",
"### **Geresnis požiūris**\n",
"\n",
"Tačiau geresnis būdas nei spėliojimas yra vadovautis idėjomis iš šio atsisiunčiamo [ML Cheat sheet](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-77952-leestott). Čia sužinome, kad mūsų daugianarės klasifikacijos problemai turime keletą pasirinkimų:\n",
"\n",
"<p >\n",
" <img src=\"../../images/cheatsheet.png\"\n",
" width=\"500\"/>\n",
" <figcaption>Microsoft algoritmų apgaulės lapo dalis, apibūdinanti daugianarės klasifikacijos galimybes</figcaption>\n"
],
"metadata": {
"id": "a6DLAZ3vJZ14"
}
},
{
"cell_type": "markdown",
"source": [
"### **Samprotavimas**\n",
"\n",
"Pažiūrėkime, ar galime logiškai apsvarstyti skirtingus požiūrius, atsižvelgdami į turimus apribojimus:\n",
"\n",
"- **Gilūs neuroniniai tinklai yra per sudėtingi**. Atsižvelgiant į mūsų švarų, bet minimalų duomenų rinkinį ir tai, kad mokymą vykdome lokaliai per užrašų knygeles, gilūs neuroniniai tinklai yra per sudėtingi šiai užduočiai.\n",
"\n",
"- **Dviejų klasių klasifikatorius netinka**. Mes nenaudojame dviejų klasių klasifikatoriaus, todėl vienos prieš visas metodas netinka.\n",
"\n",
"- **Sprendimų medis arba logistinė regresija galėtų veikti**. Sprendimų medis galėtų būti tinkamas, arba daugialypė regresija/daugialypė logistinė regresija, skirta daugiaklasiams duomenims.\n",
"\n",
"- **Daugiaklasiai sustiprinti sprendimų medžiai sprendžia kitokią problemą**. Daugiaklasiai sustiprinti sprendimų medžiai labiausiai tinka neparametrinėms užduotims, pvz., užduotims, skirtoms reitingų sudarymui, todėl jie mums nėra naudingi.\n",
"\n",
"Be to, paprastai prieš pradedant naudoti sudėtingesnius mašininio mokymosi modelius, pvz., ansamblio metodus, verta sukurti kuo paprastesnį modelį, kad suprastume, kas vyksta. Todėl šioje pamokoje pradėsime nuo `daugialypės regresijos` modelio.\n",
"\n",
"> Logistinė regresija yra technika, naudojama, kai rezultato kintamasis yra kategorinis (arba nominalus). Dvejetainėje logistinėje regresijoje rezultatų kintamųjų skaičius yra du, o daugialypėje logistinėje regresijoje rezultatų kintamųjų skaičius yra daugiau nei du. Daugiau informacijos rasite [Pažangūs regresijos metodai](https://bookdown.org/chua/ber642_advanced_regression/multinomial-logistic-regression.html).\n",
"\n",
"## 4. Mokykite ir įvertinkite daugialypės logistinės regresijos modelį.\n",
"\n",
"Naudojant Tidymodels, `parsnip::multinom_reg()` apibrėžia modelį, kuris naudoja linijinius prognozatorius, kad prognozuotų daugiaklasius duomenis, remdamasis daugialype skirstinio funkcija. Žr. `?multinom_reg()` dėl skirtingų būdų/variklių, kuriuos galite naudoti šiam modeliui pritaikyti.\n",
"\n",
"Šiame pavyzdyje pritaikysime daugialypės regresijos modelį naudodami numatytąjį [nnet](https://cran.r-project.org/web/packages/nnet/nnet.pdf) variklį.\n",
"\n",
"> Atsitiktinai pasirinkau `penalty` reikšmę. Yra geresnių būdų pasirinkti šią reikšmę, pavyzdžiui, naudojant `resampling` ir `modelio derinimą`, apie kuriuos kalbėsime vėliau.\n",
">\n",
"> Žr. [Tidymodels: Pradėkite](https://www.tidymodels.org/start/tuning/), jei norite sužinoti daugiau apie modelio hiperparametrų derinimą.\n"
],
"metadata": {
"id": "gWMsVcbBJemu"
}
},
{
"cell_type": "code",
"execution_count": 6,
"source": [
"# Create a multinomial regression model specification\r\n",
"mr_spec <- multinom_reg(penalty = 1) %>% \r\n",
" set_engine(\"nnet\", MaxNWts = 2086) %>% \r\n",
" set_mode(\"classification\")\r\n",
"\r\n",
"# Print model specification\r\n",
"mr_spec"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Multinomial Regression Model Specification (classification)\n",
"\n",
"Main Arguments:\n",
" penalty = 1\n",
"\n",
"Engine-Specific Arguments:\n",
" MaxNWts = 2086\n",
"\n",
"Computational engine: nnet \n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 166
},
"id": "Wq_fcyQiJvfG",
"outputId": "c30449c7-3864-4be7-f810-72a003743e2d"
}
},
{
"cell_type": "markdown",
"source": [
"Puikus darbas 🥳! Dabar, kai turime receptą ir modelio specifikaciją, turime rasti būdą, kaip juos sujungti į objektą, kuris pirmiausia apdoros duomenis, tada pritaikys modelį apdorotiems duomenims ir taip pat leis atlikti galimas poapdorojimo veiklas. Tidymodels bibliotekoje šis patogus objektas vadinamas [`workflow`](https://workflows.tidymodels.org/) ir patogiai talpina jūsų modeliavimo komponentus! Tai, ką Python'e vadintume *pipelines*.\n",
"\n",
"Taigi, sujunkime viską į workflow!📦\n"
],
"metadata": {
"id": "NlSbzDfgJ0zh"
}
},
{
"cell_type": "code",
"execution_count": 7,
"source": [
"# Bundle recipe and model specification\r\n",
"mr_wf <- workflow() %>% \r\n",
" add_recipe(cuisines_recipe) %>% \r\n",
" add_model(mr_spec)\r\n",
"\r\n",
"# Print out workflow\r\n",
"mr_wf"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"══ Workflow ════════════════════════════════════════════════════════════════════\n",
"\u001b[3mPreprocessor:\u001b[23m Recipe\n",
"\u001b[3mModel:\u001b[23m multinom_reg()\n",
"\n",
"── Preprocessor ────────────────────────────────────────────────────────────────\n",
"1 Recipe Step\n",
"\n",
"• step_smote()\n",
"\n",
"── Model ───────────────────────────────────────────────────────────────────────\n",
"Multinomial Regression Model Specification (classification)\n",
"\n",
"Main Arguments:\n",
" penalty = 1\n",
"\n",
"Engine-Specific Arguments:\n",
" MaxNWts = 2086\n",
"\n",
"Computational engine: nnet \n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 333
},
"id": "Sc1TfPA4Ke3_",
"outputId": "82c70013-e431-4e7e-cef6-9fcf8aad4a6c"
}
},
{
"cell_type": "markdown",
"source": [
"Darbo eigos 👌👌! **`workflow()`** gali būti pritaikyta panašiai kaip ir modelis. Taigi, laikas treniruoti modelį!\n"
],
"metadata": {
"id": "TNQ8i85aKf9L"
}
},
{
"cell_type": "code",
"execution_count": 8,
"source": [
"# Train a multinomial regression model\n",
"mr_fit <- fit(object = mr_wf, data = cuisines_train)\n",
"\n",
"mr_fit"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"══ Workflow [trained] ══════════════════════════════════════════════════════════\n",
"\u001b[3mPreprocessor:\u001b[23m Recipe\n",
"\u001b[3mModel:\u001b[23m multinom_reg()\n",
"\n",
"── Preprocessor ────────────────────────────────────────────────────────────────\n",
"1 Recipe Step\n",
"\n",
"• step_smote()\n",
"\n",
"── Model ───────────────────────────────────────────────────────────────────────\n",
"Call:\n",
"nnet::multinom(formula = ..y ~ ., data = data, decay = ~1, MaxNWts = ~2086, \n",
" trace = FALSE)\n",
"\n",
"Coefficients:\n",
" (Intercept) almond angelica anise anise_seed apple\n",
"indian 0.19723325 0.2409661 0 -5.004955e-05 -0.1657635 -0.05769734\n",
"japanese 0.13961959 -0.6262400 0 -1.169155e-04 -0.4893596 -0.08585717\n",
"korean 0.22377347 -0.1833485 0 -5.560395e-05 -0.2489401 -0.15657804\n",
"thai -0.04336577 -0.6106258 0 4.903828e-04 -0.5782866 0.63451105\n",
" apple_brandy apricot armagnac artemisia artichoke asparagus\n",
"indian 0 0.37042636 0 -0.09122797 0 -0.27181970\n",
"japanese 0 0.28895643 0 -0.12651100 0 0.14054037\n",
"korean 0 -0.07981259 0 0.55756709 0 -0.66979948\n",
"thai 0 -0.33160904 0 -0.10725182 0 -0.02602152\n",
" avocado bacon baked_potato balm banana barley\n",
"indian -0.46624197 0.16008055 0 0 -0.2838796 0.2230625\n",
"japanese 0.90341344 0.02932727 0 0 -0.4142787 2.0953906\n",
"korean -0.06925382 -0.35804134 0 0 -0.2686963 -0.7233404\n",
"thai -0.21473955 -0.75594439 0 0 0.6784880 -0.4363320\n",
" bartlett_pear basil bay bean beech\n",
"indian 0 -0.7128756 0.1011587 -0.8777275 -0.0004380795\n",
"japanese 0 0.1288697 0.9425626 -0.2380748 0.3373437611\n",
"korean 0 -0.2445193 -0.4744318 -0.8957870 -0.0048784496\n",
"thai 0 1.5365848 0.1333256 0.2196970 -0.0113078024\n",
" beef beef_broth beef_liver beer beet\n",
"indian -0.7985278 0.2430186 -0.035598065 -0.002173738 0.01005813\n",
"japanese 0.2241875 -0.3653020 -0.139551027 0.128905553 0.04923911\n",
"korean 0.5366515 -0.6153237 0.213455197 -0.010828645 0.27325423\n",
"thai 0.1570012 -0.9364154 -0.008032213 -0.035063746 -0.28279823\n",
" bell_pepper bergamot berry bitter_orange black_bean\n",
"indian 0.49074330 0 0.58947607 0.191256164 -0.1945233\n",
"japanese 0.09074167 0 -0.25917977 -0.118915977 -0.3442400\n",
"korean -0.57876763 0 -0.07874180 -0.007729435 -0.5220672\n",
"thai 0.92554006 0 -0.07210196 -0.002983296 -0.4614426\n",
" black_currant black_mustard_seed_oil black_pepper black_raspberry\n",
"indian 0 0.38935801 -0.4453495 0\n",
"japanese 0 -0.05452887 -0.5440869 0\n",
"korean 0 -0.03929970 0.8025454 0\n",
"thai 0 -0.21498372 -0.9854806 0\n",
" black_sesame_seed black_tea blackberry blackberry_brandy\n",
"indian -0.2759246 0.3079977 0.191256164 0\n",
"japanese -0.6101687 -0.1671913 -0.118915977 0\n",
"korean 1.5197674 -0.3036261 -0.007729435 0\n",
"thai -0.1755656 -0.1487033 -0.002983296 0\n",
" blue_cheese blueberry bone_oil bourbon_whiskey brandy\n",
"indian 0 0.216164294 -0.2276744 0 0.22427587\n",
"japanese 0 -0.119186087 0.3913019 0 -0.15595599\n",
"korean 0 -0.007821986 0.2854487 0 -0.02562342\n",
"thai 0 -0.004947048 -0.0253658 0 -0.05715244\n",
"\n",
"...\n",
"and 308 more lines."
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "GMbdfVmTKkJI",
"outputId": "adf9ebdf-d69d-4a64-e9fd-e06e5322292e"
}
},
{
"cell_type": "markdown",
"source": [
"Modelio mokymo metu gauti koeficientai.\n",
"\n",
"### Įvertinkite apmokytą modelį\n",
"\n",
"Atėjo metas patikrinti, kaip modelis pasirodė 📏, įvertinant jį su testiniu rinkiniu! Pradėkime nuo prognozių sudarymo testiniame rinkinyje.\n"
],
"metadata": {
"id": "tt2BfOxrKmcJ"
}
},
{
"cell_type": "code",
"execution_count": 9,
"source": [
"# Make predictions on the test set\n",
"results <- cuisines_test %>% select(cuisine) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test))\n",
"\n",
"# Print out results\n",
"results %>% \n",
" slice_head(n = 5)"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine .pred_class\n",
"1 indian thai \n",
"2 indian indian \n",
"3 indian indian \n",
"4 indian indian \n",
"5 indian indian "
],
"text/markdown": [
"\n",
"A tibble: 5 × 2\n",
"\n",
"| cuisine &lt;fct&gt; | .pred_class &lt;fct&gt; |\n",
"|---|---|\n",
"| indian | thai |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"| indian | indian |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 2\n",
"\\begin{tabular}{ll}\n",
" cuisine & .pred\\_class\\\\\n",
" <fct> & <fct>\\\\\n",
"\\hline\n",
"\t indian & thai \\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\t indian & indian\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 2</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>.pred_class</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;fct&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>thai </td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 248
},
"id": "CqtckvtsKqax",
"outputId": "e57fe557-6a68-4217-fe82-173328c5436d"
}
},
{
"cell_type": "markdown",
"source": [
"Puikus darbas! Tidymodels aplinkoje modelio veikimas gali būti vertinamas naudojant [yardstick](https://yardstick.tidymodels.org/) - paketą, skirtą modelių efektyvumui matuoti naudojant veikimo metrikas. Kaip darėme logistinės regresijos pamokoje, pradėkime nuo painiavos matricos skaičiavimo.\n"
],
"metadata": {
"id": "8w5N6XsBKss7"
}
},
{
"cell_type": "code",
"execution_count": 10,
"source": [
"# Confusion matrix for categorical data\n",
"conf_mat(data = results, truth = cuisine, estimate = .pred_class)\n"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" Truth\n",
"Prediction chinese indian japanese korean thai\n",
" chinese 83 1 8 15 10\n",
" indian 4 163 1 2 6\n",
" japanese 21 5 73 25 1\n",
" korean 15 0 11 191 0\n",
" thai 10 11 3 7 70"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 133
},
"id": "YvODvsLkK0iG",
"outputId": "bb69da84-1266-47ad-b174-d43b88ca2988"
}
},
{
"cell_type": "markdown",
"source": [
"Kai dirbama su keliomis klasėmis, paprastai yra intuityviau tai vizualizuoti kaip šilumos žemėlapį, štai taip:\n"
],
"metadata": {
"id": "c0HfPL16Lr6U"
}
},
{
"cell_type": "code",
"execution_count": 11,
"source": [
"update_geom_defaults(geom = \"tile\", new = list(color = \"black\", alpha = 0.7))\n",
"# Visualize confusion matrix\n",
"results %>% \n",
" conf_mat(cuisine, .pred_class) %>% \n",
" autoplot(type = \"heatmap\")"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"plot without title"
],
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tMTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////isF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3deWBU9b3//0+ibApWrbYuvYorXaxoaatWvVqpqG2HsCmLBAqoVXBDjCKbKMqOQUDFFVxKqyhVFLUqWKJsxg3Lz2IFGilLiEqptMX0hpzvnJkMCbx5/W5vz5k5Z+D5/OOc85nEz3w8Mw9mMjmo84gocC7qBRDtCQGJKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQijHkLb+NUZVRb2Ahn26OeoVNCxWpyZWi/mbeGbnGNJl42PUmbNiVM+7nohRlz4ao/o/HqOuFc/sHEMasShGdfwsRt3+1qYYNXJ9jLq9MkZNE89sIMUkIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSbI+EVLQktatJvB8FpNGtv9LoqMteTx7d/f2vND6h5M3oIS07xT0XxjyhQJpx+sGNj79pbfCJgkJ6o7V72t/f4FKdFSmkRa3dnNTBgnYHNPneY7GCVPvB1ggg3ezaTZraq+C8RYvGF7a64cbW7rLIIU1sdmR8IE1ynX4957qC9pFDGtvsiDSkywsn+D0eJaRxycWkIC1pcdzYSecUzIwTpP+0YJBOONJ/CTqncP6iI49YsGjRwqMOjhrSS03GTY0PpJNaVia3P92nImJIc5vcWZqG1LVFoInCgPRCkzF3pyF1bLa8snLdd1pGC+nTOy8uvvdLr+iVEZ2KF/hv7WoTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZgHS8cf624sKF5RdO84/+plbEDGk8oWfxQjSt7/pb7vu80ngmYJBWvTa+jpIPz0ickhLFlSmIa1vVuSPR7lXI4V0w9jN6wdM94qu+fCfj3XZ5v+MVDRwi/dKl23eoPFfVD/es3pj+/e3b7xudmaYDUjD3C+fmz+6aee64Zsnfz3QdOF82BAjSFPckOV/nrFfv+AzBf6woQ7SWa3Wr18dLaRkaUiL3FB/MMdNjhLS6sTG5KbcK3ra8zYmKlKQ5nrepsQnqxKbkz8zdStblVjtedu9zDD5zyxpn+wPIUJadFsz5wp7pz5i+P1vH2i372gg7dT9+yfPz/WVwScKC1Lrlh0PdAddvyYOkJ5zd/mDN9ywKCG92b42tS9anHwvl/g4BSl9WJZINbv2ng4ls9Z7mWHye9/4cbL3QoR0T/MzRt91SWHqI4bJzh0+KdBsex6kZw/4yYzfXL7PzfGB1LKw20P3F7mL4gDpSTfVHyxzN0YJaVH77WlIS+ohpQ+XJjJv4zbNG9mhrH64uwJBeuPwE/0Xo66FTya3L44f2ragF5AatPGo7/ovRlcULo0NpLff87dd3ZwYQHrOTfIHZW54lJDWJCo876MXdgNpbWJl8usbvZotyd30wZlhFiA97VJwJrjMLL9wDwGpvrfddf7uCXdPbCCle8IFmi8kSEvcLf7gqfQLU1SQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpgVSD383Wg3+PkbHkyTugVI9ZW7/v7uEXdXbCCtXOlv73fjYgBpQ4uf+4MhrixSSFvu6NJz2rbdQdo8ruslJSu82ll9Ova6+++ZYRYgvdH8mDeSuw7usRcLT/WPLnGTgVTfxq+02pjc9Xa/jwukdwsv9AfnFbwRA0iVlzZ5p7Jy7bHfDjTXHnGt3UB32qgJFxe2XbSo2H332pLzC77zRsSQ5pWW9nADSkvfiQOkTXe6Hz/wxOWFRcFnCgbp2QkTurorJ0xYvL6Paztu1OmuX6DpgkGaO2lSN9d/0qRlle8dfPTQO37QaA6QFo06qWmjlleWLVr0Zkmrps2O7fFqoNlCgNQ7fS2ZezAWkDY9+P39Gp8wZH3wiYJB6ll3Vu5dv3ZM6xZNT5kYaLaAkHrVLWZ6ZeWiC1s0Pe2ZQLPtIZDCjau/ZVz9rQKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEiqmEDqOSJGnXJfjOpw+z0xqtOUGNU76rPRsMvEMzvHkEa+GqMuHBajznv8tRg1IOoFNGzgKzFqiHhm5xjSvZ/GqF/MjFHd3on6zWXD7ox6AQ27oypG3Sue2UCKSUCSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyhQTpraYHhzDLL/7zp/3YY9zg1MFDHQ7Z92sXz0ge3fTt5o2O6j0jWkjLTnHPhTFPGJDmt23e/OTSquAThQFpXcl/NW45bFPwiUKEVJN4f6fxpkTFrjdlGVLVmS5aSL0bH1wH6QeFF155puswc+b1BUf37HWi6xQppInNjowNpJcbtbx90jnuluAzhQEpsc9V0y9xJcEnChFS7Qdbd4W0601ZhjSp8bmRQhrWqPiyNKQS1y25/f43Z8z82qEPzJz58GEHRAnppSbjpsYG0o8O+ONnn1V9Z7+NgWcKAdJsd1ty+/Mzg78kZfGtXRLSv/29oUD6wwEll0YKadyomXWQftT0ofRND/e4zt+d7R6IEFL5ws/iA2nydH/bx/0p8EwhQOrSfF3wSVKF+9auNrFwRP++8z1v9aAuVy9Mv7WrGN6964gN3o4vZQ/SRSeujxZSsjpIh540c2aDH4tmnPDV/3TCkD5siA+kdOceGnyOECAdfW5VVWXwaapC/xmpaOAW75Uu22r7lW6rGpKGdGXptn+MKfEyX8oepIcK5n0aE0gzCs7t8/WC/S9KvQw9dNfw0/e5BkgNe9jdHnyS4JA2FfaadEzBQf0/iR+kuf5buk/+mNjoeUvSkLZ+6XmLO9RmvpT8xgVtkr0dNqQ/HdL307hAut8deuxVN15Y0Ma/qcS5Q274jyfcIyH9utlFsfjUrsId9b0Hnrqq8Gfxg7TY8zYnPi5rv93zPklDWj6kuLhboibzpeQ3lvdM9mHYkLoeviY2kB50zacndz9xtya3U6+/7LSCBJDqG7dPpw0hTBMc0l/cQWuSu8vcK7GDtCSlZX77Ws9bk4K0odPsam+pD2lJBtJuCg7pqYKHKyoquh1csS4GkGY2+6a/HeT61N3cPkUKSKmudIM+DWOeEH5GanGmv/2NuyvwTNmBtDxR6XllKUhlRTWe92j2IfVzdZ0fB0itDvO317orphQP948Gur5AqmtgQWkIs3wWCqQzjve3j7l7As+UHUjVPUq3rrs5BWllYsW/Fg5OVGUb0tsv+LU74IU34wCplytJbs8oHD+14Jv+p3ftUmMgJXvahfWJRQiQxrunk9su+7wVeKbsQPI+ur7z1e8k/uzfNKN7jylbB3bblGVI6aL9GWlonz5nu4v69Jkw86GWTdr3+6E7f+bMn7nju/c+veC4//QaoTAgzSst7eEGlJYGnyq4gcrjDipN9V7gqUKAtK71foPuLnKXB59pz7rWLmJIP657d3nVzJn3tv3KPocVJ/XM6H1046bfuGj6fzpnGJB6163rwcAzBYf0UeYt+GOBpwrjEqGP+3yt0XFj43WtXZC4+lvF1d8yrv62AUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAcl28S9jVKt2Mer4blGfjob9KOoFNOxnV8Soi8QzO8eQ7loVo7r/JUYNf/KNGDXkTzFq1PoYNU08s3MMafLaGNUz6rcJDbvtmWUxanhFjIrV+8z7xDMbSDEJSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIs/yBt6n1EoUsFpFwHJFn+Qbp437a9+6UCUq4Dkiz/IH312WwBAtL/FpBk+QdpvyogRRWQZPkH6ezXgRRVQJLlH6S3f7gYSBEFJFn+QTrzv9x+R6cCUq4Dkiz/IJ3dNhOQch2QZPkHKfsBSQUkWT5C+uyFBx56+Qsg5T4gyfIP0vZBjfzLGvYfD6ScByRZ/kEa7zo+/OIL91/gHgVSrgOSLP8gfeuG9P6K7wEp1wFJln+QmsxP7+c1A1KuA5Is/yDt/3x6/2xzIOU6IMnyD9JZP672d9vanQukXAckWf5Bmldw1JWjbr/8iMJXgZTrgCTLP0jeb7/pf/z93XnZcgQkGZBkeQjJ89a/VV6ZNUZA0gFJlpeQshyQVECS5RmkVqO9VjsCUq4DkizPIJ1W6p22IyDlOiDJ8gxSTgKSCkiy/IPU5sP0/ulvASnXAUmWf5BceWr3P7c1BlKuA5Is3yC5+rhoNecBSZZvkN6/2xWl/uuQl434C5ByHZBk+QbJ8y74U7YAAel/C0iy/IPkbZyS3FTdtglIOQ9IsvyDtPIw/1OGCnfYaiDlOiDJ8g9Sh+Pf8ncfHt8JSLkOSLL8g3ToI+n9/S2AlOuAJMs/SM2eSO9/tV9MIc07t3nzk8ZW+Ie/P9k9GT2kkvSvC86OGtJrmV9cjF+2bNoPvtL4xMFLo4T0/Dn773/SmDUVFdenV3Vm9JCWneKeC2OefwvSjy6o8Xdf/ODMzC01ifdjBOnZfY8eNuYsd2PycHSzI+IA6ZeFd/n9OmpIbw5JdX7Br5ZNKmx1402nuCsihPTbfY8eOvosN6iiom/hWL+ZkUOa2OzIHEJ6ueDYASNH9Dm08OXMLbUfbI0RpNNbvLt2bcW391uz9rdNRk2KA6TuBwSfI10Yb+1eP7TDsmXfOLJs2bJFRx8cIaTTWrxdUbHmW/utqri4RaCJQoP0UpNxU3MIyXuljf9CfHJc/4bs+Lv9bbFbvrbsd2tjAelnRwafI10YkC458NVliwdO9A9/7sqigzRusr/t6d6ruPDweEAqX/hZTiF53mcf/H8N/4vF/lu7iuHdu47Y4FUnXh7cr+9SLzOuTSwc0b/vfM/bPL5Xl8GrPO+1qzoX31u9Y5gFSOnOPiS1iwWks1tVVa0NPk1VKJCeLCzJHC5tfVigqcL4sOHsQyoqzjyxomJlDCAlyzGkXfIhXVm67R9jSpKH1/3Ve7XDlszYKxq4xXulyzZv0Pgvqh/vWb2x/fvbN143OzNM/sP/XJfsy7Ah3eeGxQfSKcd0PsgdNOgvsYB0/qFvpPZvzH34gn3HRg3pHje0ouLklkUHuoOu/WhvgrTbvyHrQ9qatLC4Q21N4jnP2971lczYK5rreZsSn6xKbE7+LNWtbFVidfLrXmaY/IcXtEn2dsiQZjZrVxEfSMcU9pj5UEf30+AzBYf0ZOGg9MFU5w4vDTZXcEiPNDt/TUVFy8JL7r8n4S7YmyDt9m/I+pCWDyku7paoqUksS95w1azM2CtanHxbl/i4LJFqdu09HUpmrfcyw+T3rrg52c4XSQSGNGqfotVr4wPp/RX+trubG3im4JC6Nn49ffC7icPPL/hFtJBu36f9x8ndknJ/cLF7ai+CtNuSkDZ0ml3tLfUh+f9fzCt+nRl7RUtSkJYmquu+edO8kR3K6oe7Kyikfu7aT9bGCFK637hRgecIDGnp13/UYNTXzYgSUl93zZ/rR4+6EUB6v6yoxvMe9SE97XnVnV/LjDOQ1iZWJr9xo1ezJbmbPjgzzAqkqwvG7jiOBaTVq/3tQ25i4JkCQ3rE3eLvXip5xN/d5YZGCGlAwZj0wYoV/vYeN3ovgrR/g3b8DdkkpJWJFf9aODhRVZMYUFE9q+PfMuMMJG9oSVXNi10+f7XPx7Wbh0zJDLMB6VduZP0gDpA+KLzI37UtWBI9pKvdr/zd7wq/tyS56+qmRgfp8cwr0LLCdv7u3ILX9yJIXZO1anRG5w6nFLS5ugEkb0b3HlO2Duy2IfHiTZ37lXuZ8aYMpM3jul5SssKrndWnY6+7/54ZZgHSmmMPHDvOb8naOePGXeJ+OW7cm9FCqurnzp8w+gx3efCZAkP6uft9at/bnXz9ze0KTloSGaRVxxw4JnVBw6KK3u680SN/6PoEmS4MSPNKS3u4AaWl7+QAUrLZJ23wdyu/ObchpB2H74g5/g8FgvR+5oKyB9deWnc0LWJIG8efckDTU0uDTxQc0tmF6f3Swa2aNjuu5+uBJgsE6d3M4/RAxeo7Tm7RtPW4QI7CgNQ788zJDaSTnkrv72tdd8P2jxI7PnWLHlLYcfW3jKu/Vf8WpMavpfezm9TdsLDDqFog5SQgyfIP0hGXpna1XQ8PTgZI/7eAJMs/SLe67147atSAb7nBQMp1QJLlH6TacYf7P5EdMrwGSLkOSLL8g5Sk9Mmypau3Z4sRkHRAkuUjpG1vzfnU+x8g5T4gyfIQ0sQWzi3xhvwia5SApAKSLP8gPeDaT09CenTf8UDKdUCS5R+kk6/0tiUhebecCKRcByRZ/kFq+moa0u8aASnXAUmWf5C+9nwa0lMHACnXAUmWf5B+cs4/fUifn9QOSLkOSLL8g/T6Psdf5/r2PqDRm0DKdUCS5R8k77VT/Ssbfvj7bDkCkgxIsjyE5Hmb3ntvc9YYAUkHJFn+QToje/+JVSD9LwFJln+QvjEJSFEFJFn+QXruW7/9F5CiCUiy/IN09ndd4yOO9gNSrgOSLP8gnXle27qAlOuAJMs/SNkPSCogyfIO0rZlb24BUkQBSZZvkCa3cK5R/y/FNwIpuwFJlmeQnnEtbxh2lrtafCOQshuQZHkG6eyW/v8utm+jvwEpioAkyzNIzYf727dc1i5YBdL/X0CS5Rkkd7+/3eBeFt8JpKwGJFm+QXrQ3250LwEpioAkAxKQ/v2AJMs3SLcsSTbPlfo7IOU6IMnyDVLDgJTrgCTLM0i3NgxIuQ5IsjyDlJOApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZOvQM0Yd/4sYdWq7TjHqB5fGqJ/2iVEXimd2jiFNWR+jij+PUaOWboxRiWkxanTUj03DYvKKBCQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiRbUEhvtHZP+/sbXKqzIodUduEBTdr8KoSJAkNa1No9s+tRFJBGHOWuSx1cf3zjxifcsPNtkUEK7XEKBVJN4p1oIY1tdkQa0uWFE/wejxrS2y2OmzD53IIngs8UFNK45Kl5ZpejKCB1a3xQGs2V7shLLj1s35sa3hYZpPAepz0C0twmd5amIXVtEWii0CB1bvbh559vOumY4DMFhPR8k9GT03zqj6KANKjRJT3TaA498K5p0ya0aNXwtsgghfc47RGQFr22vg7ST4+IBaSqZh393Wj3euCpAkJaPH9jHZ/6oygg3XrLtDSaMe4sf9y2YHz9bZFBCvFxCg1SzbCRNX8d36tzyYfe9sTv+k32No/v1WXwKs+rGN6964gNXm1i4Yj+fednBVKyOkhntVq/fnX0kJa54f5urpsaeKrgHzbU84kQUrI0mjvcef6gixtYf1tkkEJ8nEKDVFrypTfo1i1fPtz1b17RwFX/9AaN/6L68Z7V3pWl2/4xpsRL3rjFe6XLtuS3f74s2d+yAql1y44HuoOuXxMxpBfc3f5uiRsReKo9DdLU/Y7yB23cZTGAFOLjFBakJ/p/4a1OrPW86osXeEVPet6qxGbPq+1W5m390vMWd6j1iuZ63qbEJ8lvX9Am2dtZgdSysNtD9xe5iyKG9Iy7z9+9424KPNWeBmlawv33yNsuaOH6xgBSiI9TSJDGJv7geW+2r00O+v/GKyrzvLJEqtne8iHFxd0SNV7RYs/bnPcB240AABDoSURBVPg4+R2rpyRblxVIb7/nb7u6OdFCmucm+7vF7tbAU+1xkO4+r8C5b13qrowBpBAfp5Ag9RsxsKYO0lVPeEVLPG9pojr1tQ2dZlcnBzWpG9OQdlNYkNI94UZGC+ltN8zfzUn/gReoPQ7StGljS+5M/ow0LAaQQnycQoJUvrXPI94a/43bts7zU2bWJlYmv7LRKyuq8bxHcwVp5Up/e78bFy2kT1sk/N1wtzjwVHsgJL/v7jclBpBCfJxC+7BhRYd3vZKRX2y7r+c/Uma8oSVVNS92+XxlYsW/Fg5OVOUE0ruFF/qD8wreiBbS58VNln/++YZjvxN8pj0O0umHTp42bXDhORZX7iGF+DiF93ukx4u3VN3R89Lbkj/8pCBtHtf1kpIVnjeje48pWwd225RFSM9OmNDVXTlhwuL1fVzbcaNOd/0CTRcCpD98teXwMT9sNDf4TAEhzZ04sZu7auLEpQ2OooB0Q48ep7u2PXqMnHZFwQnFHZp/dWzD2yKDFN7jtEdca9czfYWdu3f92jGtWzQ9ZWKg2UK51m7ZRS2anvFcCBMFhFRcd2rua3AUBaSz6u69z7Rpfb7RqPlpd+58W1SQwnuc9ghIIcfV3zKu/lYByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSRUTSB2LY9SJfWJUm469Y1TLs2NUIurHpmEXimd2jiFNq4xRvaP+c79ht77zWYwatSlGXV8eo24Xz2wgxSQgyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSLagkBa1dnNSBwvaHdDke49FCym5mGd2PYoW0pz/PrjJSZM+DT5RcEh/cnXNjBbSgsw6JpSXzzq7eeOT7oo1pKIlOw1rEu9nBdK4ZkekIS1pcdzYSecUzIwSkr+YZ3Y5ihbSrMKTx0443Q2OA6R1d6UqKng9WkiLh6Y6v2BW+Zz9j7p56GkFE+MKafnHBlLtB1uzAemFJmPuTkPq2Gx5ZeW677SMENLzTUZPTvOpP4oYUsuj13322cbjD40DpHSrDy8OPkkIb+0WHtqxvPyCpi+Vly894RtxhXTbiwaSLBikJQsq05DWNyvyx6Pcq9FBWjx/Yx2f+qNoIVXe8YS/6+HWxQbSZQd/FHySECB1PXB++bKm5/uHg9wT8YQ0pH2n672iV0Z0Kl7geRXDu3cdsSFrb+0q6yAtckP9wRw3OTpIyer5xAJSuk9P+0bwSUKC9Gbh2BBmCQ5pduHN5eVPuwH+8XQ3Ip6QvH7+K9I1H/7zsS7bvCtLt/1jTEkdpPXPJKvKBqTn3F3+4A03DEg7tWH5y50bzQw+T0iQOhz+lxBmCQ6p3aGLyssfcMP846fc1XGG9LTnbUxUeFu/9LzFHWrTkBa0SfZ2NiA96ab6g2XuRiDt1DPOHfWbEOYJB9KbhXeGMU1gSLMLb0xup7nb/MGz7vI4Q1rseZsTH3vLhxQXd0vUZP8VaZI/KHPDgbRTH/1qaseCgcHnCQfS5Y1XhzFNYEjdGi9Mbh90Q/3BU+6aOENakoK0odPsam9pBtJuCgnSEneLP3gq/cIEpJ0a5F4NPEcokCqP+EkY0wSG9NbXz/R3c1x/f3dP+oUp3pDKimo879HsQ9rQ4uf+YIgrA1J9fxz3O3/3azc5HpBecpPCmCYwpBnpl6Jl+5/n7wa4p2IKqf/Df89AWplY8a+FgxNV2YZUeWmTdyor1x777UBz7WmQPio8syq5+6V7Jh6Qhrvgv4z1CwrpGjcrte/Q+Pny8kX/dUKgybIIaW7nPhlI3ozuPaZsHdhtQ3YgzZ00qZvrP2nSssr3Dj566B0/aDQnQkhzJ07s5q6aOHFpg6NoIX12nfvhqImdCr5fFQ9I3d2aMKYJDCnhFqb28w48csCgk/edHldI/5eCQepVd9nU9MrKRRe2aHraM4FmCwipuG4x9zU4ihjSp5NObrb/t66uCD5TKJAuKAxjluCQ/ruw7uDpc/Zvcup9wSbbIyCFHFd/y7j6WwUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkVUwgzXg8Rg2MegENGzY16hU07NqoF9CwWD1O08QzO8eQiPbMgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQghIRCGUl5CmTY56BQ2ad+emqJdQ34d3Lo16CfV9eeesqJfQoBl3ZnX6vISUaBf1Chp0R5uPo15Cfa+2eTzqJdS3tc3VUS+hQb2/n9XpgRQ0IKmAFPeApAKSDEg2IKmAJAMSUfwDElEIAYkohPIDUtGS1K4m8X7ECzFL2JSoyPmqYnAaTDWJd6Jegq7u6ZMpK+cvryDVfrA14oWYJSQh5XxVMTgNpthCWv6xgZSV85dXkGJYElLUS4hFsYV024u5efrEHNKnd15cfO+XXtErIzoVL/Bfk2sTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZruGS1g9qMvVC9Nv7SqGd+86YoO340vZXkPmDqsTLw/u13epZxaQ69PjQ6oZNrLmr+N7dS750Nue+F2/yTvuNadnZ+eGtO90febpk1nH3vjW7oaxm9cPmO4VXfPhPx/rss0/A0UDt3ivdNnmDRr/RfXjPas3tn9/+8brZmeGWV9QgyXU9ivdVjUkDenK0m3/GFPi7Vhd1tdQd4c1iev+6r3aYYtZQK5Pjw+ptORLb9CtW758uOvfkutY9c8d95rTs7NL/fxXpPTTp/6k7XWQVic2JjflXtHTnrcx/ZQtmuu/n/pkVWJz8s1ut7JVidWet93LDLO+ogZL+KO/uCXpVW390vMWd6jNfCn7a6i7w5rEc8l//a6v7LqAnJ+eJKQn+n+RfMDWel71xQu8oie9+nvN6dnZpRSk9NOn/qTtdZDebF+b2hctTr5ZSXycehanD8sSqWbX3tOhZNZ6LzPM+ooaLqH9ds/7JA1p+ZDi4m6JmsyXsr+GujusSSxL3nDVrF0XkPPTU5MYm/hD5gHr/xuvKIl2x73m9OzsUgpS3f3uOGl7HaRF/nPVS/+0mIGUPlyayLxP2TRvZIey+mGWa7CE+f6TZk0K0oZOs6u9pf5TZUluIGXusCaRfI54V/x61wXk/PTUJPqNGFhTB+mqJ1LryNxrbs/OLvV7ccfTp/6k7XWQ1vifiX30wm4grU2sTH59o1ezJbmbPjgzzHoNlrA8Uen/qetDKiuq8bxHcwgpc4c1ieS7lurOr+26gJyfnppE+dY+jyQfsOQbt22d56fWkbnX3J6dXWoAqf6k7XWQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpj1BTVYQnWP0q3rbk5BWplY8a+FgxNVOYOUucOaxICK6lkd/2YWkOvT43/YsKLDu17JyC+23dfzH+lPnOvuNbdnZ5f6P/z3zP3Wn7S9D9KWO7r0nLZtd5A2j+t6SckKr3ZWn4697v57Zpj1Gi7ho+s7X/1O4s/+TTO695iydWC3TbmClLnDDYkXb+rcr9wzC8j16Un9Hunx4i1Vd/S89LZ1db+6ydxrTs/OLs3t3GfHA7bjpO19kGg3NfgTNba/B93rAlLetf0j/zPtdECKS0DKuxZ2GFWbOQZSXAISUQgBiSiEgEQUQkAiCiEgEYUQkPK3X7pMp+32622Pzu169uqAlL+9PnXq1Gtd5+TWXNb9nv+4AimHASm/e92V7u7mKUDKcUDK7+ognXn28984w2vd2j8u+qp3QfLtXhuv7XFrLmze/JLsX8lLQMr36iCdd/I373mhHtKfilz5h17blq1HP3tjwS+iXeFeEpDyuzpIbd2c5HYHJK+f23Hjj74W4fL2noCU32UgNf6XZyE19a/J61UY4fL2noCU32UgHeFvd4V0tD/sx0OcizjL+V0G0tH+FkjRxVnO73aCdOpJ/vY0IEUQZzm/2wnSeYckfyja1CwJ6TL3P0DKaZzl/G4nSJPdmMp3f/ydJKQR7rangZTLOMv53U6Qqm84sknr5we08Ly/nNqoFZByGWeZKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYXQ/wMhANIDIZLX1QAAAABJRU5ErkJggg=="
},
"metadata": {
"image/png": {
"width": 420,
"height": 420
}
}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 436
},
"id": "HsAtwukyLsvt",
"outputId": "3032a224-a2c8-4270-b4f2-7bb620317400"
}
},
{
"cell_type": "markdown",
"source": [
"Tamsesni kvadratai painiavos matricos grafike rodo didelį atvejų skaičių, ir tikimės, kad matote įstrižą liniją iš tamsesnių kvadratų, kurie nurodo atvejus, kai prognozuota ir faktinė etiketė sutampa.\n",
"\n",
"Dabar apskaičiuokime painiavos matricos apibendrinančią statistiką.\n"
],
"metadata": {
"id": "oOJC87dkLwPr"
}
},
{
"cell_type": "code",
"execution_count": 12,
"source": [
"# Summary stats for confusion matrix\n",
"conf_mat(data = results, truth = cuisine, estimate = .pred_class) %>% \n",
"summary()"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" .metric .estimator .estimate\n",
"1 accuracy multiclass 0.7880435\n",
"2 kap multiclass 0.7276583\n",
"3 sens macro 0.7780927\n",
"4 spec macro 0.9477598\n",
"5 ppv macro 0.7585583\n",
"6 npv macro 0.9460080\n",
"7 mcc multiclass 0.7292724\n",
"8 j_index macro 0.7258524\n",
"9 bal_accuracy macro 0.8629262\n",
"10 detection_prevalence macro 0.2000000\n",
"11 precision macro 0.7585583\n",
"12 recall macro 0.7780927\n",
"13 f_meas macro 0.7641862"
],
"text/markdown": [
"\n",
"A tibble: 13 × 3\n",
"\n",
"| .metric &lt;chr&gt; | .estimator &lt;chr&gt; | .estimate &lt;dbl&gt; |\n",
"|---|---|---|\n",
"| accuracy | multiclass | 0.7880435 |\n",
"| kap | multiclass | 0.7276583 |\n",
"| sens | macro | 0.7780927 |\n",
"| spec | macro | 0.9477598 |\n",
"| ppv | macro | 0.7585583 |\n",
"| npv | macro | 0.9460080 |\n",
"| mcc | multiclass | 0.7292724 |\n",
"| j_index | macro | 0.7258524 |\n",
"| bal_accuracy | macro | 0.8629262 |\n",
"| detection_prevalence | macro | 0.2000000 |\n",
"| precision | macro | 0.7585583 |\n",
"| recall | macro | 0.7780927 |\n",
"| f_meas | macro | 0.7641862 |\n",
"\n"
],
"text/latex": [
"A tibble: 13 × 3\n",
"\\begin{tabular}{lll}\n",
" .metric & .estimator & .estimate\\\\\n",
" <chr> & <chr> & <dbl>\\\\\n",
"\\hline\n",
"\t accuracy & multiclass & 0.7880435\\\\\n",
"\t kap & multiclass & 0.7276583\\\\\n",
"\t sens & macro & 0.7780927\\\\\n",
"\t spec & macro & 0.9477598\\\\\n",
"\t ppv & macro & 0.7585583\\\\\n",
"\t npv & macro & 0.9460080\\\\\n",
"\t mcc & multiclass & 0.7292724\\\\\n",
"\t j\\_index & macro & 0.7258524\\\\\n",
"\t bal\\_accuracy & macro & 0.8629262\\\\\n",
"\t detection\\_prevalence & macro & 0.2000000\\\\\n",
"\t precision & macro & 0.7585583\\\\\n",
"\t recall & macro & 0.7780927\\\\\n",
"\t f\\_meas & macro & 0.7641862\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 13 × 3</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>.metric</th><th scope=col>.estimator</th><th scope=col>.estimate</th></tr>\n",
"\t<tr><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;chr&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>accuracy </td><td>multiclass</td><td>0.7880435</td></tr>\n",
"\t<tr><td>kap </td><td>multiclass</td><td>0.7276583</td></tr>\n",
"\t<tr><td>sens </td><td>macro </td><td>0.7780927</td></tr>\n",
"\t<tr><td>spec </td><td>macro </td><td>0.9477598</td></tr>\n",
"\t<tr><td>ppv </td><td>macro </td><td>0.7585583</td></tr>\n",
"\t<tr><td>npv </td><td>macro </td><td>0.9460080</td></tr>\n",
"\t<tr><td>mcc </td><td>multiclass</td><td>0.7292724</td></tr>\n",
"\t<tr><td>j_index </td><td>macro </td><td>0.7258524</td></tr>\n",
"\t<tr><td>bal_accuracy </td><td>macro </td><td>0.8629262</td></tr>\n",
"\t<tr><td>detection_prevalence</td><td>macro </td><td>0.2000000</td></tr>\n",
"\t<tr><td>precision </td><td>macro </td><td>0.7585583</td></tr>\n",
"\t<tr><td>recall </td><td>macro </td><td>0.7780927</td></tr>\n",
"\t<tr><td>f_meas </td><td>macro </td><td>0.7641862</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 494
},
"id": "OYqetUyzL5Wz",
"outputId": "6a84d65e-113d-4281-dfc1-16e8b70f37e6"
}
},
{
"cell_type": "markdown",
"source": [
"Jei apsiribosime tam tikrais rodikliais, tokiais kaip tikslumas, jautrumas, ppv, pradžiai visai neblogai 🥳!\n",
"\n",
"## 4. Gilinamės\n",
"\n",
"Paklauskime subtilaus klausimo: kokiais kriterijais remiantis pasirenkama tam tikra virtuvės rūšis kaip numatytas rezultatas?\n",
"\n",
"Na, statistiniai mašininio mokymosi algoritmai, tokie kaip logistinė regresija, yra pagrįsti `tikimybe`; taigi, ką iš tikrųjų prognozuoja klasifikatorius, yra tikimybių pasiskirstymas tarp galimų rezultatų rinkinio. Klasė, turinti didžiausią tikimybę, tada pasirenkama kaip labiausiai tikėtinas rezultatas pagal pateiktus stebėjimus.\n",
"\n",
"Pažiūrėkime, kaip tai veikia praktiškai, atlikdami tiek griežtas klasių prognozes, tiek tikimybių skaičiavimus.\n"
],
"metadata": {
"id": "43t7vz8vMJtW"
}
},
{
"cell_type": "code",
"execution_count": 13,
"source": [
"# Make hard class prediction and probabilities\n",
"results_prob <- cuisines_test %>%\n",
" select(cuisine) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test)) %>% \n",
" bind_cols(mr_fit %>% predict(new_data = cuisines_test, type = \"prob\"))\n",
"\n",
"# Print out results\n",
"results_prob %>% \n",
" slice_head(n = 5)"
],
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
" cuisine .pred_class .pred_chinese .pred_indian .pred_japanese .pred_korean\n",
"1 indian thai 1.551259e-03 0.4587877 5.988039e-04 2.428503e-04\n",
"2 indian indian 2.637133e-05 0.9999488 6.648651e-07 2.259993e-05\n",
"3 indian indian 1.049433e-03 0.9909982 1.060937e-03 1.644947e-05\n",
"4 indian indian 6.237482e-02 0.4763035 9.136702e-02 3.660913e-01\n",
"5 indian indian 1.431745e-02 0.9418551 2.945239e-02 8.721782e-03\n",
" .pred_thai \n",
"1 5.388194e-01\n",
"2 1.577948e-06\n",
"3 6.874989e-03\n",
"4 3.863391e-03\n",
"5 5.653283e-03"
],
"text/markdown": [
"\n",
"A tibble: 5 × 7\n",
"\n",
"| cuisine &lt;fct&gt; | .pred_class &lt;fct&gt; | .pred_chinese &lt;dbl&gt; | .pred_indian &lt;dbl&gt; | .pred_japanese &lt;dbl&gt; | .pred_korean &lt;dbl&gt; | .pred_thai &lt;dbl&gt; |\n",
"|---|---|---|---|---|---|---|\n",
"| indian | thai | 1.551259e-03 | 0.4587877 | 5.988039e-04 | 2.428503e-04 | 5.388194e-01 |\n",
"| indian | indian | 2.637133e-05 | 0.9999488 | 6.648651e-07 | 2.259993e-05 | 1.577948e-06 |\n",
"| indian | indian | 1.049433e-03 | 0.9909982 | 1.060937e-03 | 1.644947e-05 | 6.874989e-03 |\n",
"| indian | indian | 6.237482e-02 | 0.4763035 | 9.136702e-02 | 3.660913e-01 | 3.863391e-03 |\n",
"| indian | indian | 1.431745e-02 | 0.9418551 | 2.945239e-02 | 8.721782e-03 | 5.653283e-03 |\n",
"\n"
],
"text/latex": [
"A tibble: 5 × 7\n",
"\\begin{tabular}{lllllll}\n",
" cuisine & .pred\\_class & .pred\\_chinese & .pred\\_indian & .pred\\_japanese & .pred\\_korean & .pred\\_thai\\\\\n",
" <fct> & <fct> & <dbl> & <dbl> & <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t indian & thai & 1.551259e-03 & 0.4587877 & 5.988039e-04 & 2.428503e-04 & 5.388194e-01\\\\\n",
"\t indian & indian & 2.637133e-05 & 0.9999488 & 6.648651e-07 & 2.259993e-05 & 1.577948e-06\\\\\n",
"\t indian & indian & 1.049433e-03 & 0.9909982 & 1.060937e-03 & 1.644947e-05 & 6.874989e-03\\\\\n",
"\t indian & indian & 6.237482e-02 & 0.4763035 & 9.136702e-02 & 3.660913e-01 & 3.863391e-03\\\\\n",
"\t indian & indian & 1.431745e-02 & 0.9418551 & 2.945239e-02 & 8.721782e-03 & 5.653283e-03\\\\\n",
"\\end{tabular}\n"
],
"text/html": [
"<table class=\"dataframe\">\n",
"<caption>A tibble: 5 × 7</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cuisine</th><th scope=col>.pred_class</th><th scope=col>.pred_chinese</th><th scope=col>.pred_indian</th><th scope=col>.pred_japanese</th><th scope=col>.pred_korean</th><th scope=col>.pred_thai</th></tr>\n",
"\t<tr><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;fct&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>indian</td><td>thai </td><td>1.551259e-03</td><td>0.4587877</td><td>5.988039e-04</td><td>2.428503e-04</td><td>5.388194e-01</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>2.637133e-05</td><td>0.9999488</td><td>6.648651e-07</td><td>2.259993e-05</td><td>1.577948e-06</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>1.049433e-03</td><td>0.9909982</td><td>1.060937e-03</td><td>1.644947e-05</td><td>6.874989e-03</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>6.237482e-02</td><td>0.4763035</td><td>9.136702e-02</td><td>3.660913e-01</td><td>3.863391e-03</td></tr>\n",
"\t<tr><td>indian</td><td>indian</td><td>1.431745e-02</td><td>0.9418551</td><td>2.945239e-02</td><td>8.721782e-03</td><td>5.653283e-03</td></tr>\n",
"</tbody>\n",
"</table>\n"
]
},
"metadata": {}
}
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 248
},
"id": "xdKNs-ZPMTJL",
"outputId": "68f6ac5a-725a-4eff-9ea6-481fef00e008"
}
},
{
"cell_type": "markdown",
"source": [
"Daug geriau!\n",
"\n",
"✅ Ar galite paaiškinti, kodėl modelis yra gana tikras, kad pirmasis stebėjimas yra tailandietiškas?\n",
"\n",
"## **🚀Iššūkis**\n",
"\n",
"Šioje pamokoje jūs naudojote išvalytus duomenis, kad sukurtumėte mašininio mokymosi modelį, galintį prognozuoti nacionalinę virtuvę pagal ingredientų sąrašą. Skirkite šiek tiek laiko peržiūrėti [daugybę parinkčių](https://www.tidymodels.org/find/parsnip/#models), kurias Tidymodels siūlo duomenų klasifikavimui, ir [kitus būdus](https://parsnip.tidymodels.org/articles/articles/Examples.html#multinom_reg-models), kaip pritaikyti daugianarę regresiją.\n",
"\n",
"#### AČIŪ:\n",
"\n",
"[`Allison Horst`](https://twitter.com/allison_horst/) už nuostabias iliustracijas, kurios padaro R labiau prieinamą ir įdomų. Daugiau iliustracijų rasite jos [galerijoje](https://www.google.com/url?q=https://github.com/allisonhorst/stats-illustrations&sa=D&source=editors&ust=1626380772530000&usg=AOvVaw3zcfyCizFQZpkSLzxiiQEM).\n",
"\n",
"[Cassie Breviu](https://www.twitter.com/cassieview) ir [Jen Looper](https://www.twitter.com/jenlooper) už originalios šio modulio Python versijos sukūrimą ♥️\n",
"\n",
"<br>\n",
"Būčiau pridėjęs keletą juokelių, bet nesuprantu maisto kalambūrų 😅.\n",
"\n",
"<br>\n",
"\n",
"Sėkmingo mokymosi,\n",
"\n",
"[Eric](https://twitter.com/ericntay), Gold Microsoft Learn Student Ambasadorius.\n"
],
"metadata": {
"id": "2tWVHMeLMYdM"
}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n---\n\n**Atsakomybės apribojimas**: \nŠis dokumentas buvo išverstas naudojant AI vertimo paslaugą [Co-op Translator](https://github.com/Azure/co-op-translator). Nors siekiame tikslumo, prašome atkreipti dėmesį, kad automatiniai vertimai gali turėti klaidų ar netikslumų. Originalus dokumentas jo gimtąja kalba turėtų būti laikomas autoritetingu šaltiniu. Kritinei informacijai rekomenduojama profesionali žmogaus vertimo paslauga. Mes neprisiimame atsakomybės už nesusipratimus ar klaidingus interpretavimus, atsiradusius naudojant šį vertimą.\n"
]
}
]
}