{ "nbformat": 4, "nbformat_minor": 2, "metadata": { "colab": { "name": "lesson_11-R.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "kernelspec": { "name": "ir", "display_name": "R" }, "language_info": { "name": "R" }, "coopTranslator": { "original_hash": "6ea6a5171b1b99b7b5a55f7469c048d2", "translation_date": "2025-09-04T02:25:43+00:00", "source_file": "4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb", "language_code": "ja" } }, "cells": [ { "cell_type": "markdown", "source": [ "# Build a classification model: Delicious Asian and Indian Cuisines\n" ], "metadata": { "id": "zs2woWv_HoE8" } }, { "cell_type": "markdown", "source": [ "## Cuisine classifiers 1\n", "\n", "In this lesson, we'll explore a variety of classifiers to *predict a given national cuisine based on a group of ingredients*. Along the way, we'll learn more about how algorithms can be applied to classification tasks.\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/21/)\n", "\n", "### **Preparation**\n", "\n", "This lesson builds on the [previous lesson](https://github.com/microsoft/ML-For-Beginners/blob/main/4-Classification/1-Introduction/solution/lesson_10-R.ipynb), where we:\n", "\n", "- Made a gentle introduction to classification using a dataset about the brilliant cuisines of Asia and India 😋.\n", "\n", "- Explored some [dplyr verbs](https://dplyr.tidyverse.org/) to prepare and clean the data.\n", "\n", "- Made beautiful visualizations using ggplot2.\n", "\n", "- Demonstrated how to deal with imbalanced data by preprocessing it with [recipes](https://recipes.tidymodels.org/articles/Simple_Example.html).\n", "\n", "- Demonstrated how to `prep` and `bake` a recipe to confirm that it works as intended.\n", "\n", "#### **Prerequisites**\n", "\n", "For this lesson, we'll need the following packages to clean, prepare and visualize the data:\n", "\n", "- `tidyverse`: The [tidyverse](https://www.tidyverse.org/) is a [collection of R packages](https://www.tidyverse.org/packages) designed to make data science faster, easier and more fun!\n", "\n", "- `tidymodels`: The [tidymodels](https://www.tidymodels.org/) framework is a [collection of packages](https://www.tidymodels.org/packages/) for modeling and machine learning.\n", "\n", "- `themis`: The [themis package](https://themis.tidymodels.org/) provides extra recipe steps for dealing with imbalanced data.\n", "\n", "- `nnet`: 
The [nnet package](https://cran.r-project.org/web/packages/nnet/nnet.pdf) provides functions for estimating feed-forward neural networks with a single hidden layer, and for multinomial logistic regression models.\n", "\n", "You can install them with:\n" ], "metadata": { "id": "iDFOb3ebHwQC" } }, { "cell_type": "markdown", "source": [ "`install.packages(c(\"tidyverse\", \"tidymodels\", \"themis\", \"nnet\", \"here\"))`\n", "\n", "Alternatively, the script below checks whether the packages required to complete this module are installed and installs any that are missing.\n" ], "metadata": { "id": "4V85BGCjII7F" } }, { "cell_type": "code", "execution_count": 2, "source": [ "suppressWarnings(if (!require(\"pacman\"))install.packages(\"pacman\"))\r\n", "\r\n", "pacman::p_load(tidyverse, tidymodels, themis, here)" ], "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Loading required package: pacman\n", "\n" ] } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "an5NPyyKIKNR", "outputId": "834d5e74-f4b8-49f9-8ab5-4c52ff2d7bc8" } }, { "cell_type": "markdown", "source": [ "## 1. Split the data into training and test sets\n", "\n", "We'll start by picking a few steps from the previous lesson.\n", "\n", "### Drop the most common ingredients that create confusion between distinct cuisines, using `dplyr::select()`\n", "\n", "Everyone loves rice, garlic and ginger!\n" ], "metadata": { "id": "0ax9GQLBINVv" } }, { "cell_type": "code", "execution_count": 3, "source": [ "# Load the original cuisines data\r\n", "df <- read_csv(file = \"https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/4-Classification/data/cuisines.csv\")\r\n", "\r\n", "# Drop id column, rice, garlic and ginger from our original data set\r\n", "df_select <- df %>% \r\n", " select(-c(1, rice, garlic, ginger)) %>%\r\n", " # Encode cuisine column as categorical\r\n", " mutate(cuisine = factor(cuisine))\r\n", "\r\n", "# Display new data set\r\n", "df_select %>% \r\n", " slice_head(n = 5)\r\n", "\r\n", "# Display distribution of cuisines\r\n", "df_select %>% \r\n", " count(cuisine) %>% \r\n", " arrange(desc(n))" ], "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "New names:\n", "* `` -> ...1\n", "\n", 
"\u001b[1m\u001b[1mRows: \u001b[1m\u001b[22m\u001b[34m\u001b[34m2448\u001b[34m\u001b[39m \u001b[1m\u001b[1mColumns: \u001b[1m\u001b[22m\u001b[34m\u001b[34m385\u001b[34m\u001b[39m\n", "\n", "\u001b[36m──\u001b[39m \u001b[1m\u001b[1mColumn specification\u001b[1m\u001b[22m \u001b[36m────────────────────────────────────────────────────────\u001b[39m\n", "\u001b[1mDelimiter:\u001b[22m \",\"\n", "\u001b[31mchr\u001b[39m (1): cuisine\n", "\u001b[32mdbl\u001b[39m (384): ...1, almond, angelica, anise, anise_seed, apple, apple_brandy, a...\n", "\n", "\n", "\u001b[36mℹ\u001b[39m Use \u001b[30m\u001b[47m\u001b[30m\u001b[47m`spec()`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to retrieve the full column specification for this data.\n", "\u001b[36mℹ\u001b[39m Specify the column types or set \u001b[30m\u001b[47m\u001b[30m\u001b[47m`show_col_types = FALSE`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to quiet this message.\n", "\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n", "1 indian 0 0 0 0 0 0 0 0 \n", "2 indian 1 0 0 0 0 0 0 0 \n", "3 indian 0 0 0 0 0 0 0 0 \n", "4 indian 0 0 0 0 0 0 0 0 \n", "5 indian 0 0 0 0 0 0 0 0 \n", " artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n", "1 0 ⋯ 0 0 0 0 0 0 \n", "2 0 ⋯ 0 0 0 0 0 0 \n", "3 0 ⋯ 0 0 0 0 0 0 \n", "4 0 ⋯ 0 0 0 0 0 0 \n", "5 0 ⋯ 0 0 0 0 0 0 \n", " yam yeast yogurt zucchini\n", "1 0 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", "5 0 0 1 0 " ], "text/markdown": [ "\n", "A tibble: 5 × 381\n", "\n", "| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |\n", 
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 381\n", "\\begin{tabular}{lllllllllllllllllllll}\n", " cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n", " & & & & & & & & & & ⋯ & & & & & & & & & & \\\\\n", "\\hline\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
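A tibble: 5 × 381
| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |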
\n" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine n \n", "1 korean 799\n", "2 indian 598\n", "3 chinese 442\n", "4 japanese 320\n", "5 thai 289" ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | n <int> |\n", "|---|---|\n", "| korean | 799 |\n", "| indian | 598 |\n", "| chinese | 442 |\n", "| japanese | 320 |\n", "| thai | 289 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & n\\\\\n", " & \\\\\n", "\\hline\n", "\t korean & 799\\\\\n", "\t indian & 598\\\\\n", "\t chinese & 442\\\\\n", "\t japanese & 320\\\\\n", "\t thai & 289\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
| cuisine <fct> | n <int> |
|---|---|
| korean | 799 |
| indian | 598 |
| chinese | 442 |
| japanese | 320 |
| thai | 289 |
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 735 }, "id": "jhCrrH22IWVR", "outputId": "d444a85c-1d8b-485f-bc4f-8be2e8f8217c" } }, { "cell_type": "markdown", "source": [ "Perfect! Now it's time to split the data so that 70% goes to training and 30% goes to testing. We'll also apply `stratification` when splitting, so that the proportion of each cuisine is maintained in both the training and test sets.\n", "\n", "[rsample](https://rsample.tidymodels.org/), a package in Tidymodels, provides infrastructure for efficient data splitting and resampling.\n" ], "metadata": { "id": "AYTjVyajIdny" } }, { "cell_type": "code", "execution_count": 4, "source": [ "# Load the core Tidymodels packages into R session\r\n", "library(tidymodels)\r\n", "\r\n", "# Create split specification\r\n", "set.seed(2056)\r\n", "cuisines_split <- initial_split(data = df_select,\r\n", " strata = cuisine,\r\n", " prop = 0.7)\r\n", "\r\n", "# Extract the data in each split\r\n", "cuisines_train <- training(cuisines_split)\r\n", "cuisines_test <- testing(cuisines_split)\r\n", "\r\n", "# Print the number of cases in each split\r\n", "cat(\"Training cases: \", nrow(cuisines_train), \"\\n\",\r\n", " \"Test cases: \", nrow(cuisines_test), sep = \"\")\r\n", "\r\n", "# Display the first few rows of the training set\r\n", "cuisines_train %>% \r\n", " slice_head(n = 5)\r\n", "\r\n", "\r\n", "# Display distribution of cuisines in the training set\r\n", "cuisines_train %>% \r\n", " count(cuisine) %>% \r\n", " arrange(desc(n))" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Training cases: 1712\n", "Test cases: 736" ] }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n", "1 chinese 0 0 0 0 0 0 0 0 \n", "2 chinese 0 0 0 0 0 0 0 0 \n", "3 chinese 0 0 0 0 0 0 0 0 \n", "4 chinese 0 0 0 0 0 0 0 0 \n", "5 chinese 0 0 0 0 0 0 0 0 \n", " artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n", "1 0 ⋯ 0 0 0 0 1 0 \n", "2 0 ⋯ 0 0 0 0 1 0 \n", "3 0 ⋯ 0 0 0 0 0 0 \n", "4 0 ⋯ 0 
0 0 0 0 0 \n", "5 0 ⋯ 0 0 0 0 0 0 \n", " yam yeast yogurt zucchini\n", "1 0 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", "5 0 0 0 0 " ], "text/markdown": [ "\n", "A tibble: 5 × 381\n", "\n", "| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |\n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 381\n", "\\begin{tabular}{lllllllllllllllllllll}\n", " cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n", " & & & & & & & & & & ⋯ & & & & & & & & & & \\\\\n", "\\hline\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 
0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
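A tibble: 5 × 381
| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |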
\n" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine n \n", "1 korean 559\n", "2 indian 418\n", "3 chinese 309\n", "4 japanese 224\n", "5 thai 202" ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | n <int> |\n", "|---|---|\n", "| korean | 559 |\n", "| indian | 418 |\n", "| chinese | 309 |\n", "| japanese | 224 |\n", "| thai | 202 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & n\\\\\n", " & \\\\\n", "\\hline\n", "\t korean & 559\\\\\n", "\t indian & 418\\\\\n", "\t chinese & 309\\\\\n", "\t japanese & 224\\\\\n", "\t thai & 202\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
| cuisine <fct> | n <int> |
|---|---|
| korean | 559 |
| indian | 418 |
| chinese | 309 |
| japanese | 224 |
| thai | 202 |
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 535 }, "id": "w5FWIkEiIjdN", "outputId": "2e195fd9-1a8f-4b91-9573-cce5582242df" } }, { "cell_type": "markdown", "source": [ "## 2. Deal with imbalanced data\n", "\n", "As you might have noticed in the original data set as well as in the training set, the cuisines are quite unevenly distributed. There are *almost* 3 times as many Korean recipes as Thai ones. Imbalanced data often has a negative effect on model performance. Many models perform best when the number of observations per class is equal and, thus, tend to struggle with unbalanced data.\n", "\n", "There are two main ways of dealing with imbalanced data sets:\n", "\n", "- Adding observations to the minority class: `over-sampling`, e.g. using the SMOTE algorithm, which synthetically generates new minority-class examples from the nearest neighbors of existing cases.\n", "\n", "- Removing observations from the majority class: `under-sampling`\n", "\n", "In the previous lesson, we demonstrated how to deal with imbalanced data sets using a `recipe`. A recipe can be thought of as a blueprint that describes which steps should be applied to a data set to get it ready for data analysis. In our case, we want the cuisines to be evenly distributed in the `training set`. Let's get right into it.\n" ], "metadata": { "id": "daBi9qJNIwqW" } }, { "cell_type": "code", "execution_count": 5, "source": [ "# Load themis package for dealing with imbalanced data\r\n", "library(themis)\r\n", "\r\n", "# Create a recipe for preprocessing training data\r\n", "cuisines_recipe <- recipe(cuisine ~ ., data = cuisines_train) %>% \r\n", " step_smote(cuisine)\r\n", "\r\n", "# Print recipe\r\n", "cuisines_recipe" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "Data Recipe\n", "\n", "Inputs:\n", "\n", " role #variables\n", " outcome 1\n", " predictor 380\n", "\n", "Operations:\n", "\n", "SMOTE based on cuisine" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 200 }, "id": "Az6LFBGxI1X0", "outputId": "29d71d85-64b0-4e62-871e-bcd5398573b6" } }, { "cell_type": "markdown", "source": [ "You can of course confirm (using prep + bake) that the recipe works as expected: every cuisine label should end up with `559` observations.\n", "\n", "Since we'll be using this recipe as a preprocessor for modeling, a `workflow()` will do all the prep and bake for us, so we won't have to estimate the recipe manually.\n", "\n", "Now we are ready to train a model 👩‍💻👨‍💻!\n", "\n", "## 3. Choosing your classifier\n", "\n", "

\n", " \n", "

Artwork by @allison_horst
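\n" ], "metadata": { "id": "NBL3PqIWJBBB" } }, { "cell_type": "markdown", "source": [ "Now we have to decide which algorithm to use for the job 🤔.\n", "\n", "In Tidymodels, the [`parsnip package`](https://parsnip.tidymodels.org/index.html) provides a consistent interface for working with models across different engines (packages). See the parsnip documentation to explore [model types & engines](https://www.tidymodels.org/find/parsnip/#models) and their corresponding [model arguments](https://www.tidymodels.org/find/parsnip/#model-args). The variety is quite bewildering at first sight. For instance, the following methods all include classification techniques:\n", "\n", "- C5.0 Rule-Based Classification Models \n", "- Flexible Discriminant Models \n", "- Linear Discriminant Models \n", "- Regularized Discriminant Models \n", "- Logistic Regression Models \n", "- Multinomial Regression Models \n", "- Naive Bayes Models \n", "- Support Vector Machines \n", "- Nearest Neighbors \n", "- Decision Trees \n", "- Ensemble methods \n", "- Neural Networks \n", "\n", "The list goes on!\n", "\n", "### **What classifier to go with?**\n", "\n", "So, which classifier should you choose? Often, one way to find out is to run through several and look for a good result.\n", "\n", "> AutoML solves this problem neatly by running these comparisons in the cloud, allowing you to choose the best algorithm for your data. Try it [here](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-77952-leestott).\n", "\n", "The choice of classifier also depends on the problem. For instance, when the outcome can be categorized into `more than two classes`, as in our case, you must use a `multiclass classification algorithm` as opposed to `binary classification`.\n", "\n", "### **A better approach**\n", "\n", "A better way than wildly guessing, however, is to follow the ideas in this downloadable [ML Cheat sheet](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-77952-leestott). Here, we discover that, for our multiclass problem, we have some choices:\n", "\n", "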

\n", " \n", "

A section of Microsoft's Algorithm Cheat Sheet, detailing multiclass classification options
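\n" ], "metadata": { "id": "a6DLAZ3vJZ14" } }, { "cell_type": "markdown", "source": [ "### **Reasoning**\n", "\n", "Let's see if we can reason our way through the different approaches, given the constraints we have:\n", "\n", "- **Deep neural networks are too heavy**. Given our clean but minimal dataset, and the fact that we are running training locally via notebooks, deep neural networks are too heavyweight for this task.\n", "\n", "- **No two-class classifier**. We do not use a two-class classifier, so that rules out one-vs-all.\n", "\n", "- **Decision tree or logistic regression could work**. A decision tree might work, as might multinomial regression (multiclass logistic regression) for multiclass data.\n", "\n", "- **Multiclass boosted decision trees solve a different problem**. The multiclass boosted decision tree is most suitable for nonparametric tasks, e.g. tasks designed to build rankings, so it is not useful for us.\n", "\n", "Also, before embarking on more complex machine learning models such as ensemble methods, it's usually a good idea to build the simplest possible model to get an idea of what is going on. So for this lesson, we'll start with a `multinomial regression` model.\n", "\n", "> Logistic regression is a technique used when the outcome variable is categorical (or nominal). In binary logistic regression the outcome variable has two categories, whereas in multinomial logistic regression it has more than two. See [Advanced Regression Methods](https://bookdown.org/chua/ber642_advanced_regression/multinomial-logistic-regression.html) for further reading.\n", "\n", "## 4. Train and evaluate a multinomial logistic regression model\n", "\n", "In Tidymodels, `parsnip::multinom_reg()` defines a model that uses linear predictors to predict multiclass data using the multinomial distribution. See `?multinom_reg()` for the different ways/engines you can use to fit this model.\n", "\n", "For this example, we'll fit a multinomial regression model via the default [nnet](https://cran.r-project.org/web/packages/nnet/nnet.pdf) engine.\n", "\n", "> The value for `penalty` was picked more or less arbitrarily. There are better ways to choose this value, namely by using `resampling` and `tuning` the model, which we'll discuss later.\n", ">\n", "> See [Tidymodels: Get Started](https://www.tidymodels.org/start/tuning/) if you want to learn more about how to tune model hyperparameters.\n" ], "metadata": { "id": "gWMsVcbBJemu" } }, { "cell_type": "code", "execution_count": 6, "source": [ "# Create a multinomial regression model specification\r\n", "mr_spec <- multinom_reg(penalty = 1) %>% \r\n", " set_engine(\"nnet\", MaxNWts = 2086) %>% \r\n", " set_mode(\"classification\")\r\n", "\r\n", "# Print model specification\r\n", "mr_spec" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "Multinomial Regression Model Specification (classification)\n", "\n", "Main Arguments:\n", " penalty = 1\n", "\n", "Engine-Specific Arguments:\n", " 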
MaxNWts = 2086\n", "\n", "Computational engine: nnet \n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 166 }, "id": "Wq_fcyQiJvfG", "outputId": "c30449c7-3864-4be7-f810-72a003743e2d" } }, { "cell_type": "markdown", "source": [ "Great job 🥳! Now that we have a recipe and a model specification, we need a way of bundling them together into an object that will first preprocess the data, then fit the model on the preprocessed data, and also allow for potential post-processing. In Tidymodels, this convenient object is called a [`workflow`](https://workflows.tidymodels.org/) and holds your modeling components together! It is the counterpart of what we'd call a *pipeline* in *Python*.\n", "\n", "So let's bundle everything up into a workflow!📦\n" ], "metadata": { "id": "NlSbzDfgJ0zh" } }, { "cell_type": "code", "execution_count": 7, "source": [ "# Bundle recipe and model specification\r\n", "mr_wf <- workflow() %>% \r\n", " add_recipe(cuisines_recipe) %>% \r\n", " add_model(mr_spec)\r\n", "\r\n", "# Print out workflow\r\n", "mr_wf" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "══ Workflow ════════════════════════════════════════════════════════════════════\n", "\u001b[3mPreprocessor:\u001b[23m Recipe\n", "\u001b[3mModel:\u001b[23m multinom_reg()\n", "\n", "── Preprocessor ────────────────────────────────────────────────────────────────\n", "1 Recipe Step\n", "\n", "• step_smote()\n", "\n", "── Model ───────────────────────────────────────────────────────────────────────\n", "Multinomial Regression Model Specification (classification)\n", "\n", "Main Arguments:\n", " penalty = 1\n", "\n", "Engine-Specific Arguments:\n", " MaxNWts = 2086\n", "\n", "Computational engine: nnet \n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 333 }, "id": "Sc1TfPA4Ke3_", "outputId": "82c70013-e431-4e7e-cef6-9fcf8aad4a6c" } }, { "cell_type": "markdown", "source": [ "Workflows 👌👌! 
A **`workflow()`** can be fit in much the same way as a model. So, time to train a model!\n" ], "metadata": { "id": "TNQ8i85aKf9L" } }, { "cell_type": "code", "execution_count": 8, "source": [ "# Train a multinomial regression model\n", "mr_fit <- fit(object = mr_wf, data = cuisines_train)\n", "\n", "mr_fit" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "══ Workflow [trained] ══════════════════════════════════════════════════════════\n", "\u001b[3mPreprocessor:\u001b[23m Recipe\n", "\u001b[3mModel:\u001b[23m multinom_reg()\n", "\n", "── Preprocessor ────────────────────────────────────────────────────────────────\n", "1 Recipe Step\n", "\n", "• step_smote()\n", "\n", "── Model ───────────────────────────────────────────────────────────────────────\n", "Call:\n", "nnet::multinom(formula = ..y ~ ., data = data, decay = ~1, MaxNWts = ~2086, \n", " trace = FALSE)\n", "\n", "Coefficients:\n", " (Intercept) almond angelica anise anise_seed apple\n", "indian 0.19723325 0.2409661 0 -5.004955e-05 -0.1657635 -0.05769734\n", "japanese 0.13961959 -0.6262400 0 -1.169155e-04 -0.4893596 -0.08585717\n", "korean 0.22377347 -0.1833485 0 -5.560395e-05 -0.2489401 -0.15657804\n", "thai -0.04336577 -0.6106258 0 4.903828e-04 -0.5782866 0.63451105\n", " apple_brandy apricot armagnac artemisia artichoke asparagus\n", "indian 0 0.37042636 0 -0.09122797 0 -0.27181970\n", "japanese 0 0.28895643 0 -0.12651100 0 0.14054037\n", "korean 0 -0.07981259 0 0.55756709 0 -0.66979948\n", "thai 0 -0.33160904 0 -0.10725182 0 -0.02602152\n", " avocado bacon baked_potato balm banana barley\n", "indian -0.46624197 0.16008055 0 0 -0.2838796 0.2230625\n", "japanese 0.90341344 0.02932727 0 0 -0.4142787 2.0953906\n", "korean -0.06925382 -0.35804134 0 0 -0.2686963 -0.7233404\n", "thai -0.21473955 -0.75594439 0 0 0.6784880 -0.4363320\n", " bartlett_pear basil bay bean beech\n", "indian 0 -0.7128756 0.1011587 -0.8777275 -0.0004380795\n", "japanese 0 0.1288697 0.9425626 -0.2380748 0.3373437611\n", "korean 0 
-0.2445193 -0.4744318 -0.8957870 -0.0048784496\n", "thai 0 1.5365848 0.1333256 0.2196970 -0.0113078024\n", " beef beef_broth beef_liver beer beet\n", "indian -0.7985278 0.2430186 -0.035598065 -0.002173738 0.01005813\n", "japanese 0.2241875 -0.3653020 -0.139551027 0.128905553 0.04923911\n", "korean 0.5366515 -0.6153237 0.213455197 -0.010828645 0.27325423\n", "thai 0.1570012 -0.9364154 -0.008032213 -0.035063746 -0.28279823\n", " bell_pepper bergamot berry bitter_orange black_bean\n", "indian 0.49074330 0 0.58947607 0.191256164 -0.1945233\n", "japanese 0.09074167 0 -0.25917977 -0.118915977 -0.3442400\n", "korean -0.57876763 0 -0.07874180 -0.007729435 -0.5220672\n", "thai 0.92554006 0 -0.07210196 -0.002983296 -0.4614426\n", " black_currant black_mustard_seed_oil black_pepper black_raspberry\n", "indian 0 0.38935801 -0.4453495 0\n", "japanese 0 -0.05452887 -0.5440869 0\n", "korean 0 -0.03929970 0.8025454 0\n", "thai 0 -0.21498372 -0.9854806 0\n", " black_sesame_seed black_tea blackberry blackberry_brandy\n", "indian -0.2759246 0.3079977 0.191256164 0\n", "japanese -0.6101687 -0.1671913 -0.118915977 0\n", "korean 1.5197674 -0.3036261 -0.007729435 0\n", "thai -0.1755656 -0.1487033 -0.002983296 0\n", " blue_cheese blueberry bone_oil bourbon_whiskey brandy\n", "indian 0 0.216164294 -0.2276744 0 0.22427587\n", "japanese 0 -0.119186087 0.3913019 0 -0.15595599\n", "korean 0 -0.007821986 0.2854487 0 -0.02562342\n", "thai 0 -0.004947048 -0.0253658 0 -0.05715244\n", "\n", "...\n", "and 308 more lines." 
] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "GMbdfVmTKkJI", "outputId": "adf9ebdf-d69d-4a64-e9fd-e06e5322292e" } }, { "cell_type": "markdown", "source": [ "The output shows the coefficients that the model learned during training.\n", "\n", "### Evaluate the trained model\n", "\n", "It's time to see how the model performs 📏 by evaluating it on the test set! Let's begin by making predictions on the test set.\n" ], "metadata": { "id": "tt2BfOxrKmcJ" } }, { "cell_type": "code", "execution_count": 9, "source": [ "# Make predictions on the test set\n", "results <- cuisines_test %>% select(cuisine) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test))\n", "\n", "# Print out results\n", "results %>% \n", " slice_head(n = 5)" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " cuisine .pred_class\n", "1 indian thai \n", "2 indian indian \n", "3 indian indian \n", "4 indian indian \n", "5 indian indian " ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | .pred_class <fct> |\n", "|---|---|\n", "| indian | thai |\n", "| indian | indian |\n", "| indian | indian |\n", "| indian | indian |\n", "| indian | indian |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & .pred\\_class\\\\\n", " & \\\\\n", "\\hline\n", "\t indian & thai \\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
| cuisine <fct> | .pred_class <fct> |
|---|---|
| indian | thai |
| indian | indian |
| indian | indian |
| indian | indian |
| indian | indian |
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 248 }, "id": "CqtckvtsKqax", "outputId": "e57fe557-6a68-4217-fe82-173328c5436d" } }, { "cell_type": "markdown", "source": [ "Great job! In Tidymodels, model performance can be evaluated using [yardstick](https://yardstick.tidymodels.org/), a package for measuring the effectiveness of models with performance metrics. As we did in the logistic regression lesson, let's begin by computing a confusion matrix.\n" ], "metadata": { "id": "8w5N6XsBKss7" } }, { "cell_type": "code", "execution_count": 10, "source": [ "# Confusion matrix for categorical data\n", "conf_mat(data = results, truth = cuisine, estimate = .pred_class)\n" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " Truth\n", "Prediction chinese indian japanese korean thai\n", " chinese 83 1 8 15 10\n", " indian 4 163 1 2 6\n", " japanese 21 5 73 25 1\n", " korean 15 0 11 191 0\n", " thai 10 11 3 7 70" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 133 }, "id": "YvODvsLkK0iG", "outputId": "bb69da84-1266-47ad-b174-d43b88ca2988" } }, { "cell_type": "markdown", "source": [ "When dealing with multiple classes, it's generally more intuitive to visualize the confusion matrix as a heat map, like this:\n" ], "metadata": { "id": "c0HfPL16Lr6U" } }, { "cell_type": "code", "execution_count": 11, "source": [ "update_geom_defaults(geom = \"tile\", new = list(color = \"black\", alpha = 0.7))\n", "# Visualize confusion matrix\n", "results %>% \n", " conf_mat(cuisine, .pred_class) %>% \n", " autoplot(type = \"heatmap\")" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "plot without title" ], "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tMTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////isF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3deWBU9b3//0+ibApWrbYuvYorXaxoaatWvVqpqG2HsCmLBAqoVXBDjCKbKMqOQUDFFVxKqyhVFLUqWKJsxg3Lz2IFGilLiEqptMX0hpzvnJkMCbx5/W5vz5k5Z+D5/OOc85nEz3w8Mw9mMjmo84gocC7qBRDtCQGJKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQijHkLb+NUZVRb2Ahn26OeoVNCxWpyZWi/mbeGbnGNJl42PUmbNiVM+7nohRlz4ao/o/HqOuFc/sHEMasShGdfwsRt3+1qYYNXJ9jLq9MkZNE89sIMUkIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg
2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSbI+EVLQktatJvB8FpNGtv9LoqMteTx7d/f2vND6h5M3oIS07xT0XxjyhQJpx+sGNj79pbfCJgkJ6o7V72t/f4FKdFSmkRa3dnNTBgnYHNPneY7GCVPvB1ggg3ezaTZraq+C8RYvGF7a64cbW7rLIIU1sdmR8IE1ynX4957qC9pFDGtvsiDSkywsn+D0eJaRxycWkIC1pcdzYSecUzIwTpP+0YJBOONJ/CTqncP6iI49YsGjRwqMOjhrSS03GTY0PpJNaVia3P92nImJIc5vcWZqG1LVFoInCgPRCkzF3pyF1bLa8snLdd1pGC+nTOy8uvvdLr+iVEZ2KF/hv7WoTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZgHS8cf624sKF5RdO84/+plbEDGk8oWfxQjSt7/pb7vu80ngmYJBWvTa+jpIPz0ickhLFlSmIa1vVuSPR7lXI4V0w9jN6wdM94qu+fCfj3XZ5v+MVDRwi/dKl23eoPFfVD/es3pj+/e3b7xudmaYDUjD3C+fmz+6aee64Zsnfz3QdOF82BAjSFPckOV/nrFfv+AzBf6woQ7SWa3Wr18dLaRkaUiL3FB/MMdNjhLS6sTG5KbcK3ra8zYmKlKQ5nrepsQnqxKbkz8zdStblVjtedu9zDD5zyxpn+wPIUJadFsz5wp7pz5i+P1vH2i372gg7dT9+yfPz/WVwScKC1Lrlh0PdAddvyYOkJ5zd/mDN9ywKCG92b42tS9anHwvl/g4BSl9WJZINbv2ng4ls9Z7mWHye9/4cbL3QoR0T/MzRt91SWHqI4bJzh0+KdBsex6kZw/4yYzfXL7PzfGB1LKw20P3F7mL4gDpSTfVHyxzN0YJaVH77WlIS+ohpQ+XJjJv4zbNG9mhrH64uwJBeuPwE/0Xo66FTya3L44f2ragF5AatPGo7/ovRlcULo0NpLff87dd3ZwYQHrOTfIHZW54lJDWJCo876MXdgNpbWJl8usbvZotyd30wZlhFiA97VJwJrjMLL9wDwGpvrfddf7uCXdPbCCle8IFmi8kSEvcLf7gqfQLU1SQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpgVSD383Wg3+PkbHkyTugVI9ZW7/v7uEXdXbCCtXOlv73fjYgBpQ4uf+4MhrixSSFvu6NJz2rbdQdo8ruslJSu82ll9Ova6+++ZYRYgvdH8mDeSuw7usRcLT/WPLnGTgVTfxq+02pjc9Xa/jwukdwsv9AfnFbwRA0iVlzZ5p7Jy7bHfDjTXHnGt3UB32qgJFxe2XbSo2H332pLzC77zRsSQ5pWW9nADSkvfiQOkTXe6Hz/wxOWFRcFnCgbp2QkTurorJ0xYvL6Paztu1OmuX6DpgkGaO2lSN9d/0qRlle8dfPTQO37QaA6QFo06qWmjlleWLVr0Zkmrps2O7fFqoNlCgNQ7fS2ZezAWkDY9+P39Gp8wZH3wiYJB6ll3Vu5dv3ZM6xZNT5kYaLaAkHrVLWZ6ZeWiC1s0Pe2ZQLPtIZDCjau/ZVz9rQKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMm
ApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEiqmEDqOSJGnXJfjOpw+z0xqtOUGNU76rPRsMvEMzvHkEa+GqMuHBajznv8tRg1IOoFNGzgKzFqiHhm5xjSvZ/GqF/MjFHd3on6zWXD7ox6AQ27oypG3Sue2UCKSUCSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyhQTpraYHhzDLL/7zp/3YY9zg1MFDHQ7Z92sXz0ge3fTt5o2O6j0jWkjLTnHPhTFPGJDmt23e/OTSquAThQFpXcl/NW45bFPwiUKEVJN4f6fxpkTFrjdlGVLVmS5aSL0bH1wH6QeFF155puswc+b1BUf37HWi6xQppInNjowNpJcbtbx90jnuluAzhQEpsc9V0y9xJcEnChFS7Qdbd4W0601ZhjSp8bmRQhrWqPiyNKQS1y25/f43Z8z82qEPzJz58GEHRAnppSbjpsYG0o8O+ONnn1V9Z7+NgWcKAdJsd1ty+/Mzg78kZfGtXRLSv/29oUD6wwEll0YKadyomXWQftT0ofRND/e4zt+d7R6IEFL5ws/iA2nydH/bx/0p8EwhQOrSfF3wSVKF+9auNrFwRP++8z1v9aAuVy9Mv7WrGN6964gN3o4vZQ/SRSeujxZSsjpIh540c2aDH4tmnPDV/3TCkD5siA+kdOceGnyOECAdfW5VVWXwaapC/xmpaOAW75Uu22r7lW6rGpKGdGXptn+MKfEyX8oepIcK5n0aE0gzCs7t8/WC/S9KvQw9dNfw0/e5BkgNe9jdHnyS4JA2FfaadEzBQf0/iR+kuf5buk/+mNjoeUvSkLZ+6XmLO9RmvpT8xgVtkr0dNqQ/HdL307hAut8deuxVN15Y0Ma/qcS5Q274jyfcIyH9utlFsfjUrsId9b0Hnrqq8Gfxg7TY8zYnPi5rv93zPklDWj6kuLhboibzpeQ3lvdM9mHYkLoeviY2kB50zacndz9xtya3U6+/7LSCBJDqG7dPpw0hTBMc0l/cQWuSu8vcK7GDtCSlZX77Ws9bk4K0odPsam+pD2lJBtJuCg7pqYKHKyoquh1csS4GkGY2+6a/HeT61N3cPkUKSKmudIM+DWOeEH5GanGmv/2NuyvwTNmBtDxR6XllKUhlRTWe92j2IfVzdZ0fB0itDvO317orphQP948Gur5AqmtgQWkIs3wWCqQzjve3j7l7As+UHUjVPUq3rrs5BWllYsW/Fg5OVGUb0tsv+LU74IU
34wCplytJbs8oHD+14Jv+p3ftUmMgJXvahfWJRQiQxrunk9su+7wVeKbsQPI+ur7z1e8k/uzfNKN7jylbB3bblGVI6aL9GWlonz5nu4v69Jkw86GWTdr3+6E7f+bMn7nju/c+veC4//QaoTAgzSst7eEGlJYGnyq4gcrjDipN9V7gqUKAtK71foPuLnKXB59pz7rWLmJIP657d3nVzJn3tv3KPocVJ/XM6H1046bfuGj6fzpnGJB6163rwcAzBYf0UeYt+GOBpwrjEqGP+3yt0XFj43WtXZC4+lvF1d8yrv62AUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAcl28S9jVKt2Mer4blGfjob9KOoFNOxnV8Soi8QzO8eQ7loVo7r/JUYNf/KNGDXkTzFq1PoYNU08s3MMafLaGNUz6rcJDbvtmWUxanhFjIrV+8z7xDMbSDEJSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIs/yBt6n1EoUsFpFwHJFn+Qbp437a9+6UCUq4Dkiz/IH312WwBAtL/FpBk+QdpvyogRRWQZPkH6ezXgRRVQJLlH6S3f7gYSBEFJFn+QTrzv9x+R6cCUq4Dkiz/IJ3dNhOQch2QZPkHKfsBSQUkWT5C+uyFBx56+Qsg5T4gyfIP0vZBjfzLGvYfD6ScByRZ/kEa7zo+/OIL91/gHgVSrgOSLP8gfeuG9P6K7wEp1wFJln+QmsxP7+c1A1KuA5Is/yDt/3x6/2xzIOU6IMnyD9JZP672d9vanQukXAckWf5Bmldw1JWjbr/8iMJXgZTrgCTLP0jeb7/pf/z93XnZcgQkGZBkeQjJ89a/VV6ZNUZA0gFJlpeQshyQVECS5RmkVqO9VjsCUq4DkizPIJ1W6p22IyDlOiDJ8gxSTgKSCkiy/IPU5sP0/ulvASnXAUmWf5BceWr3P7c1BlKuA5Is3yC5+rhoNecBSZZvkN6/2xWl/uuQl434C5ByHZBk+QbJ8y74U7YAAel/C0iy/IPkbZyS3FTdtglIOQ9IsvyDtPIw/1OGCnfYaiDlOiDJ8g9Sh+Pf8ncfHt8JSLkOSLL8g3ToI+n9/S2AlOuAJMs/SM2eSO9/tV9MIc07t3nzk8ZW+Ie/P9k9GT2kkvSvC86OGtJrmV9cjF+2bNoPvtL4xMF
Lo4T0/Dn773/SmDUVFdenV3Vm9JCWneKeC2OefwvSjy6o8Xdf/ODMzC01ifdjBOnZfY8eNuYsd2PycHSzI+IA6ZeFd/n9OmpIbw5JdX7Br5ZNKmx1402nuCsihPTbfY8eOvosN6iiom/hWL+ZkUOa2OzIHEJ6ueDYASNH9Dm08OXMLbUfbI0RpNNbvLt2bcW391uz9rdNRk2KA6TuBwSfI10Yb+1eP7TDsmXfOLJs2bJFRx8cIaTTWrxdUbHmW/utqri4RaCJQoP0UpNxU3MIyXuljf9CfHJc/4bs+Lv9bbFbvrbsd2tjAelnRwafI10YkC458NVliwdO9A9/7sqigzRusr/t6d6ruPDweEAqX/hZTiF53mcf/H8N/4vF/lu7iuHdu47Y4FUnXh7cr+9SLzOuTSwc0b/vfM/bPL5Xl8GrPO+1qzoX31u9Y5gFSOnOPiS1iwWks1tVVa0NPk1VKJCeLCzJHC5tfVigqcL4sOHsQyoqzjyxomJlDCAlyzGkXfIhXVm67R9jSpKH1/3Ve7XDlszYKxq4xXulyzZv0Pgvqh/vWb2x/fvbN143OzNM/sP/XJfsy7Ah3eeGxQfSKcd0PsgdNOgvsYB0/qFvpPZvzH34gn3HRg3pHje0ouLklkUHuoOu/WhvgrTbvyHrQ9qatLC4Q21N4jnP2971lczYK5rreZsSn6xKbE7+LNWtbFVidfLrXmaY/IcXtEn2dsiQZjZrVxEfSMcU9pj5UEf30+AzBYf0ZOGg9MFU5w4vDTZXcEiPNDt/TUVFy8JL7r8n4S7YmyDt9m/I+pCWDyku7paoqUksS95w1azM2CtanHxbl/i4LJFqdu09HUpmrfcyw+T3rrg52c4XSQSGNGqfotVr4wPp/RX+trubG3im4JC6Nn49ffC7icPPL/hFtJBu36f9x8ndknJ/cLF7ai+CtNuSkDZ0ml3tLfUh+f9fzCt+nRl7RUtSkJYmquu+edO8kR3K6oe7Kyikfu7aT9bGCFK637hRgecIDGnp13/UYNTXzYgSUl93zZ/rR4+6EUB6v6yoxvMe9SE97XnVnV/LjDOQ1iZWJr9xo1ezJbmbPjgzzAqkqwvG7jiOBaTVq/3tQ25i4JkCQ3rE3eLvXip5xN/d5YZGCGlAwZj0wYoV/vYeN3ovgrR/g3b8DdkkpJWJFf9aODhRVZMYUFE9q+PfMuMMJG9oSVXNi10+f7XPx7Wbh0zJDLMB6VduZP0gDpA+KLzI37UtWBI9pKvdr/zd7wq/tyS56+qmRgfp8cwr0LLCdv7u3ILX9yJIXZO1anRG5w6nFLS5ugEkb0b3HlO2Duy2IfHiTZ37lXuZ8aYMpM3jul5SssKrndWnY6+7/54ZZgHSmmMPHDvOb8naOePGXeJ+OW7cm9FCqurnzp8w+gx3efCZAkP6uft9at/bnXz9ze0KTloSGaRVxxw4JnVBw6KK3u680SN/6PoEmS4MSPNKS3u4AaWl7+QAUrLZJ23wdyu/ObchpB2H74g5/g8FgvR+5oKyB9deWnc0LWJIG8efckDTU0uDTxQc0tmF6f3Swa2aNjuu5+uBJgsE6d3M4/RAxeo7Tm7RtPW4QI7CgNQ788zJDaSTnkrv72tdd8P2jxI7PnWLHlLYcfW3jKu/Vf8WpMavpfezm9TdsLDDqFog5SQgyfIP0hGXpna1XQ8PTgZI/7eAJMs/SLe67147atSAb7nBQMp1QJLlH6TacYf7P5EdMrwGSLkOSLL8g5Sk9Mmypau3Z4sRkHRAkuUjpG1vzfnU+x8g5T4gyfIQ0sQWzi3xhvwia5SApAKSLP8gPeDaT09CenTf8UDKdUCS5R+kk6/0tiUhebecCKRcByRZ/kFq+moa0u8aASnXAUmWf5C+9nwa0lMHACnXAUmWf5B+cs4/fUifn9QOSLkOSLL8g/T6Psdf5/r2PqDRm0DKdUCS5R8k77VT/Ssbfvj7bDkCkgxIsjyE5Hmb3ntvc9YYAUkHJFn+QTo
je/+JVSD9LwFJln+QvjEJSFEFJFn+QXruW7/9F5CiCUiy/IN09ndd4yOO9gNSrgOSLP8gnXle27qAlOuAJMs/SNkPSCogyfIO0rZlb24BUkQBSZZvkCa3cK5R/y/FNwIpuwFJlmeQnnEtbxh2lrtafCOQshuQZHkG6eyW/v8utm+jvwEpioAkyzNIzYf727dc1i5YBdL/X0CS5Rkkd7+/3eBeFt8JpKwGJFm+QXrQ3250LwEpioAkAxKQ/v2AJMs3SLcsSTbPlfo7IOU6IMnyDVLDgJTrgCTLM0i3NgxIuQ5IsjyDlJOApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZOvQM0Yd/4sYdWq7TjHqB5fGqJ/2iVEXimd2jiFNWR+jij+PUaOWboxRiWkxanTUj03DYvKKBCQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiRbUEhvtHZP+/sbXKqzIodUduEBTdr8KoSJAkNa1No9s+tRFJBGHOWuSx1cf3zjxifcsPNtkUEK7XEKBVJN4p1oIY1tdkQa0uWFE/wejxrS2y2OmzD53IIngs8UFNK45Kl5ZpejKCB1a3xQGs2V7shLLj1s35sa3hYZpPAepz0C0twmd5amIXVtEWii0CB1bvbh559vOumY4DMFhPR8k9GT03zqj6KANKjRJT3TaA498K5p0ya0aNXwtsgghfc47RGQFr22vg7ST4+IBaSqZh393Wj3euCpAkJaPH9jHZ/6oygg3XrLtDSaMe4sf9y2YHz9bZFBCvFxCg1SzbCRNX8d36tzyYfe9sTv+k32No/v1WXwKs+rGN6964gNXm1i4Yj+fednBVKyOkhntVq/fnX0kJa54f5urpsaeKrgHzbU84kQUrI0mjvcef6gixtYf1tkkEJ8nEKDVFrypTfo1i1fPtz1b17RwFX/9AaN/6L68Z7V3pWl2/4xpsRL3rjFe6XLtuS3f74s2d+yAql1y44HuoOuXxMxpBfc3f5uiRsReKo9DdLU/Y7yB23cZTGAFOLjFBakJ/p/4a1OrPW86osXeEVPet6qxGbPq+1W5m390vMWd6j1iuZ63qbEJ8lvX9Am2dtZgdSysNtD9xe5iyKG9Iy7z9+9424KPNWeBmlawv33yNsuaOH6xgBSiI9TSJD
GJv7geW+2r00O+v/GKyrzvLJEqtne8iHFxd0SNV7RYs/bnPcB240AABDoSURBVPg4+R2rpyRblxVIb7/nb7u6OdFCmucm+7vF7tbAU+1xkO4+r8C5b13qrowBpBAfp5Ag9RsxsKYO0lVPeEVLPG9pojr1tQ2dZlcnBzWpG9OQdlNYkNI94UZGC+ltN8zfzUn/gReoPQ7StGljS+5M/ow0LAaQQnycQoJUvrXPI94a/43bts7zU2bWJlYmv7LRKyuq8bxHcwVp5Up/e78bFy2kT1sk/N1wtzjwVHsgJL/v7jclBpBCfJxC+7BhRYd3vZKRX2y7r+c/Uma8oSVVNS92+XxlYsW/Fg5OVOUE0ruFF/qD8wreiBbS58VNln/++YZjvxN8pj0O0umHTp42bXDhORZX7iGF+DiF93ukx4u3VN3R89Lbkj/8pCBtHtf1kpIVnjeje48pWwd225RFSM9OmNDVXTlhwuL1fVzbcaNOd/0CTRcCpD98teXwMT9sNDf4TAEhzZ04sZu7auLEpQ2OooB0Q48ep7u2PXqMnHZFwQnFHZp/dWzD2yKDFN7jtEdca9czfYWdu3f92jGtWzQ9ZWKg2UK51m7ZRS2anvFcCBMFhFRcd2rua3AUBaSz6u69z7Rpfb7RqPlpd+58W1SQwnuc9ghIIcfV3zKu/lYByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSRUTSB2LY9SJfWJUm469Y1TLs2NUIurHpmEXimd2jiFNq4xRvaP+c79ht77zWYwatSlGXV8eo24Xz2wgxSQgyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSLagkBa1dnNSBwvaHdDke49FCym5mGd2PYoW0pz
/PrjJSZM+DT5RcEh/cnXNjBbSgsw6JpSXzzq7eeOT7oo1pKIlOw1rEu9nBdK4ZkekIS1pcdzYSecUzIwSkr+YZ3Y5ihbSrMKTx0443Q2OA6R1d6UqKng9WkiLh6Y6v2BW+Zz9j7p56GkFE+MKafnHBlLtB1uzAemFJmPuTkPq2Gx5ZeW677SMENLzTUZPTvOpP4oYUsuj13322cbjD40DpHSrDy8OPkkIb+0WHtqxvPyCpi+Vly894RtxhXTbiwaSLBikJQsq05DWNyvyx6Pcq9FBWjx/Yx2f+qNoIVXe8YS/6+HWxQbSZQd/FHySECB1PXB++bKm5/uHg9wT8YQ0pH2n672iV0Z0Kl7geRXDu3cdsSFrb+0q6yAtckP9wRw3OTpIyer5xAJSuk9P+0bwSUKC9Gbh2BBmCQ5pduHN5eVPuwH+8XQ3Ip6QvH7+K9I1H/7zsS7bvCtLt/1jTEkdpPXPJKvKBqTn3F3+4A03DEg7tWH5y50bzQw+T0iQOhz+lxBmCQ6p3aGLyssfcMP846fc1XGG9LTnbUxUeFu/9LzFHWrTkBa0SfZ2NiA96ab6g2XuRiDt1DPOHfWbEOYJB9KbhXeGMU1gSLMLb0xup7nb/MGz7vI4Q1rseZsTH3vLhxQXd0vUZP8VaZI/KHPDgbRTH/1qaseCgcHnCQfS5Y1XhzFNYEjdGi9Mbh90Q/3BU+6aOENakoK0odPsam9pBtJuCgnSEneLP3gq/cIEpJ0a5F4NPEcokCqP+EkY0wSG9NbXz/R3c1x/f3dP+oUp3pDKimo879HsQ9rQ4uf+YIgrA1J9fxz3O3/3azc5HpBecpPCmCYwpBnpl6Jl+5/n7wa4p2IKqf/Df89AWplY8a+FgxNV2YZUeWmTdyor1x777UBz7WmQPio8syq5+6V7Jh6Qhrvgv4z1CwrpGjcrte/Q+Pny8kX/dUKgybIIaW7nPhlI3ozuPaZsHdhtQ3YgzZ00qZvrP2nSssr3Dj566B0/aDQnQkhzJ07s5q6aOHFpg6NoIX12nfvhqImdCr5fFQ9I3d2aMKYJDCnhFqb28w48csCgk/edHldI/5eCQepVd9nU9MrKRRe2aHraM4FmCwipuG4x9zU4ihjSp5NObrb/t66uCD5TKJAuKAxjluCQ/ruw7uDpc/Zvcup9wSbbIyCFHFd/y7j6WwUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkVUwgzXg8Rg2MegENGzY16hU07NqoF9CwWD1O08QzO8eQiPbMgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQghIRCGUl5CmTY56BQ2ad+emqJdQ34d3Lo16CfV9eeesqJfQoBl3ZnX6vISUaBf1Chp0R5uPo15Cfa+2eTzqJdS3tc3VUS+hQb2/n9X
pgRQ0IKmAFPeApAKSDEg2IKmAJAMSUfwDElEIAYkohPIDUtGS1K4m8X7ECzFL2JSoyPmqYnAaTDWJd6Jegq7u6ZMpK+cvryDVfrA14oWYJSQh5XxVMTgNpthCWv6xgZSV85dXkGJYElLUS4hFsYV024u5efrEHNKnd15cfO+XXtErIzoVL/Bfk2sTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZruGS1g9qMvVC9Nv7SqGd+86YoO340vZXkPmDqsTLw/u13epZxaQ69PjQ6oZNrLmr+N7dS750Nue+F2/yTvuNadnZ+eGtO90febpk1nH3vjW7oaxm9cPmO4VXfPhPx/rss0/A0UDt3ivdNnmDRr/RfXjPas3tn9/+8brZmeGWV9QgyXU9ivdVjUkDenK0m3/GFPi7Vhd1tdQd4c1iev+6r3aYYtZQK5Pjw+ptORLb9CtW758uOvfkutY9c8d95rTs7NL/fxXpPTTp/6k7XWQVic2JjflXtHTnrcx/ZQtmuu/n/pkVWJz8s1ut7JVidWet93LDLO+ogZL+KO/uCXpVW390vMWd6jNfCn7a6i7w5rEc8l//a6v7LqAnJ+eJKQn+n+RfMDWel71xQu8oie9+nvN6dnZpRSk9NOn/qTtdZDebF+b2hctTr5ZSXycehanD8sSqWbX3tOhZNZ6LzPM+ooaLqH9ds/7JA1p+ZDi4m6JmsyXsr+GujusSSxL3nDVrF0XkPPTU5MYm/hD5gHr/xuvKIl2x73m9OzsUgpS3f3uOGl7HaRF/nPVS/+0mIGUPlyayLxP2TRvZIey+mGWa7CE+f6TZk0K0oZOs6u9pf5TZUluIGXusCaRfI54V/x61wXk/PTUJPqNGFhTB+mqJ1LryNxrbs/OLvV7ccfTp/6k7XWQ1vifiX30wm4grU2sTH59o1ezJbmbPjgzzHoNlrA8Uen/qetDKiuq8bxHcwgpc4c1ieS7lurOr+26gJyfnppE+dY+jyQfsOQbt22d56fWkbnX3J6dXWoAqf6k7XWQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpj1BTVYQnWP0q3rbk5BWplY8a+FgxNVOYOUucOaxICK6lkd/2YWkOvT43/YsKLDu17JyC+23dfzH+lPnOvuNbdnZ5f6P/z3zP3Wn7S9D9KWO7r0nLZtd5A2j+t6SckKr3ZWn4697v57Zpj1Gi7ho+s7X/1O4s/+TTO695iydWC3TbmClLnDDYkXb+rcr9wzC8j16Un9Hunx4i1Vd/S89LZ1db+6ydxrTs/OLs3t3GfHA7bjpO19kGg3NfgTNba/B93rAlLetf0j/zPtdECKS0DKuxZ2GFWbOQZSXAISUQgBiSiEgEQUQkAiCiEgEYUQkPK3X7pMp+32622Pzu169uqAlL+9PnXq1Gtd5+TWXNb9nv+4AimHASm/e92V7u7mKUDKcUDK7+ognXn28984w2vd2j8u+qp3QfLtXhuv7XFrLmze/JLsX8lLQMr36iCdd/I373mhHtKfilz5h17blq1HP3tjwS+iXeFeEpDyuzpIbd2c5HYHJK+f23Hjj74W4fL2noCU32UgNf6XZyE19a/J61UY4fL2noCU32UgHeFvd4V0tD/sx0OcizjL+V0G0tH+FkjRxVnO73aCdOpJ/vY0IEUQZzm/2wnSeYckfyja1CwJ6TL3P0DKaZzl/G4nSJPdmMp3f/ydJKQR7rangZTLOMv53U6Qqm84sknr5we08Ly/nNqoFZByGWeZKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYXQ/wMhANIDIZLX1QAAAABJRU5ErkJggg==" }, "metadata": { "image/png": { "width": 420, "height": 420 } } } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", 
"height": 436 }, "id": "HsAtwukyLsvt", "outputId": "3032a224-a2c8-4270-b4f2-7bb620317400" } }, { "cell_type": "markdown", "source": [ "混同行列プロットの中で、濃い色の四角は多くのケースを示しており、予測ラベルと実際のラベルが一致しているケースを示す濃い色の四角が対角線上にあることが確認できるはずです。\n", "\n", "次に、混同行列の要約統計量を計算してみましょう。\n" ], "metadata": { "id": "oOJC87dkLwPr" } }, { "cell_type": "code", "execution_count": 12, "source": [ "# Summary stats for confusion matrix\n", "conf_mat(data = results, truth = cuisine, estimate = .pred_class) %>% \n", "summary()" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " .metric .estimator .estimate\n", "1 accuracy multiclass 0.7880435\n", "2 kap multiclass 0.7276583\n", "3 sens macro 0.7780927\n", "4 spec macro 0.9477598\n", "5 ppv macro 0.7585583\n", "6 npv macro 0.9460080\n", "7 mcc multiclass 0.7292724\n", "8 j_index macro 0.7258524\n", "9 bal_accuracy macro 0.8629262\n", "10 detection_prevalence macro 0.2000000\n", "11 precision macro 0.7585583\n", "12 recall macro 0.7780927\n", "13 f_meas macro 0.7641862" ], "text/markdown": [ "\n", "A tibble: 13 × 3\n", "\n", "| .metric <chr> | .estimator <chr> | .estimate <dbl> |\n", "|---|---|---|\n", "| accuracy | multiclass | 0.7880435 |\n", "| kap | multiclass | 0.7276583 |\n", "| sens | macro | 0.7780927 |\n", "| spec | macro | 0.9477598 |\n", "| ppv | macro | 0.7585583 |\n", "| npv | macro | 0.9460080 |\n", "| mcc | multiclass | 0.7292724 |\n", "| j_index | macro | 0.7258524 |\n", "| bal_accuracy | macro | 0.8629262 |\n", "| detection_prevalence | macro | 0.2000000 |\n", "| precision | macro | 0.7585583 |\n", "| recall | macro | 0.7780927 |\n", "| f_meas | macro | 0.7641862 |\n", "\n" ], "text/latex": [ "A tibble: 13 × 3\n", "\\begin{tabular}{lll}\n", " .metric & .estimator & .estimate\\\\\n", " & & \\\\\n", "\\hline\n", "\t accuracy & multiclass & 0.7880435\\\\\n", "\t kap & multiclass & 0.7276583\\\\\n", "\t sens & macro & 0.7780927\\\\\n", "\t spec & macro & 0.9477598\\\\\n", "\t ppv & macro & 0.7585583\\\\\n", "\t npv & 
macro & 0.9460080\\\\\n", "\t mcc & multiclass & 0.7292724\\\\\n", "\t j\\_index & macro & 0.7258524\\\\\n", "\t bal\\_accuracy & macro & 0.8629262\\\\\n", "\t detection\\_prevalence & macro & 0.2000000\\\\\n", "\t precision & macro & 0.7585583\\\\\n", "\t recall & macro & 0.7780927\\\\\n", "\t f\\_meas & macro & 0.7641862\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
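<table>
<caption>A tibble: 13 × 3</caption>
<thead>
<tr><th>.metric</th><th>.estimator</th><th>.estimate</th></tr>
<tr><th>&lt;chr&gt;</th><th>&lt;chr&gt;</th><th>&lt;dbl&gt;</th></tr>
</thead>
<tbody>
<tr><td>accuracy</td><td>multiclass</td><td>0.7880435</td></tr>
<tr><td>kap</td><td>multiclass</td><td>0.7276583</td></tr>
<tr><td>sens</td><td>macro</td><td>0.7780927</td></tr>
<tr><td>spec</td><td>macro</td><td>0.9477598</td></tr>
<tr><td>ppv</td><td>macro</td><td>0.7585583</td></tr>
<tr><td>npv</td><td>macro</td><td>0.9460080</td></tr>
<tr><td>mcc</td><td>multiclass</td><td>0.7292724</td></tr>
<tr><td>j_index</td><td>macro</td><td>0.7258524</td></tr>
<tr><td>bal_accuracy</td><td>macro</td><td>0.8629262</td></tr>
<tr><td>detection_prevalence</td><td>macro</td><td>0.2000000</td></tr>
<tr><td>precision</td><td>macro</td><td>0.7585583</td></tr>
<tr><td>recall</td><td>macro</td><td>0.7780927</td></tr>
<tr><td>f_meas</td><td>macro</td><td>0.7641862</td></tr>
</tbody>
</table>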
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 494 }, "id": "OYqetUyzL5Wz", "outputId": "6a84d65e-113d-4281-dfc1-16e8b70f37e6" } }, { "cell_type": "markdown", "source": [ "If we narrow the list down to a few metrics such as accuracy, sensitivity, and PPV, we are not badly off for a start 🥳!\n", "\n", "## 4. Digging Deeper\n", "\n", "Let's ask one subtle question: what criterion is used to settle on a given type of cuisine as the predicted outcome?\n", "\n", "Statistical machine learning algorithms such as logistic regression are based on `probability`. What a classifier actually predicts is a probability distribution over the set of possible outcomes. The class with the highest probability is then chosen as the most likely outcome for the given observation.\n", "\n", "Let's see this in action by making both hard class predictions and probability predictions.\n" ], "metadata": { "id": "43t7vz8vMJtW" } }, { "cell_type": "code", "execution_count": 13, "source": [ "# Make hard class prediction and probabilities\n", "results_prob <- cuisines_test %>%\n", " select(cuisine) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test)) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test, type = \"prob\"))\n", "\n", "# Print out results\n", "results_prob %>% \n", " slice_head(n = 5)" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " cuisine .pred_class .pred_chinese .pred_indian .pred_japanese .pred_korean\n", "1 indian thai 1.551259e-03 0.4587877 5.988039e-04 2.428503e-04\n", "2 indian indian 2.637133e-05 0.9999488 6.648651e-07 2.259993e-05\n", "3 indian indian 1.049433e-03 0.9909982 1.060937e-03 1.644947e-05\n", "4 indian indian 6.237482e-02 0.4763035 9.136702e-02 3.660913e-01\n", "5 indian indian 1.431745e-02 0.9418551 2.945239e-02 8.721782e-03\n", " .pred_thai \n", "1 5.388194e-01\n", "2 1.577948e-06\n", "3 6.874989e-03\n", "4 3.863391e-03\n", "5 5.653283e-03" ], "text/markdown": [ "\n", "A tibble: 5 × 7\n", "\n", "| cuisine <fct> | .pred_class <fct> | .pred_chinese <dbl> | .pred_indian <dbl> | .pred_japanese <dbl> | .pred_korean <dbl> | .pred_thai <dbl> |\n", "|---|---|---|---|---|---|---|\n", "| indian | thai | 1.551259e-03 | 0.4587877 | 5.988039e-04 | 2.428503e-04 | 5.388194e-01 |\n", "| indian | indian | 2.637133e-05 | 
0.9999488 | 6.648651e-07 | 2.259993e-05 | 1.577948e-06 |\n", "| indian | indian | 1.049433e-03 | 0.9909982 | 1.060937e-03 | 1.644947e-05 | 6.874989e-03 |\n", "| indian | indian | 6.237482e-02 | 0.4763035 | 9.136702e-02 | 3.660913e-01 | 3.863391e-03 |\n", "| indian | indian | 1.431745e-02 | 0.9418551 | 2.945239e-02 | 8.721782e-03 | 5.653283e-03 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 7\n", "\\begin{tabular}{lllllll}\n", " cuisine & .pred\\_class & .pred\\_chinese & .pred\\_indian & .pred\\_japanese & .pred\\_korean & .pred\\_thai\\\\\n", " & & & & & & \\\\\n", "\\hline\n", "\t indian & thai & 1.551259e-03 & 0.4587877 & 5.988039e-04 & 2.428503e-04 & 5.388194e-01\\\\\n", "\t indian & indian & 2.637133e-05 & 0.9999488 & 6.648651e-07 & 2.259993e-05 & 1.577948e-06\\\\\n", "\t indian & indian & 1.049433e-03 & 0.9909982 & 1.060937e-03 & 1.644947e-05 & 6.874989e-03\\\\\n", "\t indian & indian & 6.237482e-02 & 0.4763035 & 9.136702e-02 & 3.660913e-01 & 3.863391e-03\\\\\n", "\t indian & indian & 1.431745e-02 & 0.9418551 & 2.945239e-02 & 8.721782e-03 & 5.653283e-03\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
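<table>
<caption>A tibble: 5 × 7</caption>
<thead>
<tr><th>cuisine</th><th>.pred_class</th><th>.pred_chinese</th><th>.pred_indian</th><th>.pred_japanese</th><th>.pred_korean</th><th>.pred_thai</th></tr>
<tr><th>&lt;fct&gt;</th><th>&lt;fct&gt;</th><th>&lt;dbl&gt;</th><th>&lt;dbl&gt;</th><th>&lt;dbl&gt;</th><th>&lt;dbl&gt;</th><th>&lt;dbl&gt;</th></tr>
</thead>
<tbody>
<tr><td>indian</td><td>thai</td><td>1.551259e-03</td><td>0.4587877</td><td>5.988039e-04</td><td>2.428503e-04</td><td>5.388194e-01</td></tr>
<tr><td>indian</td><td>indian</td><td>2.637133e-05</td><td>0.9999488</td><td>6.648651e-07</td><td>2.259993e-05</td><td>1.577948e-06</td></tr>
<tr><td>indian</td><td>indian</td><td>1.049433e-03</td><td>0.9909982</td><td>1.060937e-03</td><td>1.644947e-05</td><td>6.874989e-03</td></tr>
<tr><td>indian</td><td>indian</td><td>6.237482e-02</td><td>0.4763035</td><td>9.136702e-02</td><td>3.660913e-01</td><td>3.863391e-03</td></tr>
<tr><td>indian</td><td>indian</td><td>1.431745e-02</td><td>0.9418551</td><td>2.945239e-02</td><td>8.721782e-03</td><td>5.653283e-03</td></tr>
</tbody>
</table>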
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 248 }, "id": "xdKNs-ZPMTJL", "outputId": "68f6ac5a-725a-4eff-9ea6-481fef00e008" } }, { "cell_type": "markdown", "source": [ "Much better!\n", "\n", "✅ Can you explain why the model predicts Thai for the first observation, with a probability of about 0.54?\n", "\n", "## **🚀Challenge**\n", "\n", "In this lesson, you used your cleaned data to build a machine learning model that predicts a national cuisine from a combination of ingredients. Take some time to read through the [many options](https://www.tidymodels.org/find/parsnip/#models) that Tidymodels provides for classifying data, and the [other ways](https://parsnip.tidymodels.org/articles/articles/Examples.html#multinom_reg-models) to fit a multinomial logistic regression.\n", "\n", "#### Thank you to:\n", "\n", "[`Allison Horst`](https://twitter.com/allison_horst/) for creating the amazing illustrations that make R more welcoming and engaging. Find more illustrations in her [gallery](https://www.google.com/url?q=https://github.com/allisonhorst/stats-illustrations&sa=D&source=editors&ust=1626380772530000&usg=AOvVaw3zcfyCizFQZpkSLzxiiQEM).\n", "\n", "[Cassie Breviu](https://www.twitter.com/cassieview) and [Jen Looper](https://www.twitter.com/jenlooper) for creating the original Python version of this module ♥️\n", "\n", "
\n", "I would have thrown in some jokes, but I don't understand food puns 😅.\n", "\n", "
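\n", "\n", "Happy learning!\n", "\n", "[Eric](https://twitter.com/ericntay), Gold Microsoft Learn Student Ambassador\n" ], "metadata": { "id": "2tWVHMeLMYdM" } }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We accept no liability for any misunderstandings or misinterpretations arising from the use of this translation.\n" ] } ] }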