{ "nbformat": 4, "nbformat_minor": 2, "metadata": { "colab": { "name": "lesson_11-R.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "kernelspec": { "name": "ir", "display_name": "R" }, "language_info": { "name": "R" }, "coopTranslator": { "original_hash": "6ea6a5171b1b99b7b5a55f7469c048d2", "translation_date": "2025-09-04T02:24:35+00:00", "source_file": "4-Classification/2-Classifiers-1/solution/R/lesson_11-R.ipynb", "language_code": "fr" } }, "cells": [ { "cell_type": "markdown", "source": [ "# Construire un modèle de classification : Délicieuses cuisines asiatiques et indiennes\n" ], "metadata": { "id": "zs2woWv_HoE8" } }, { "cell_type": "markdown", "source": [ "## Classificateurs de cuisine 1\n", "\n", "Dans cette leçon, nous allons explorer une variété de classificateurs pour *prédire une cuisine nationale donnée en fonction d'un groupe d'ingrédients.* Ce faisant, nous découvrirons certaines des façons dont les algorithmes peuvent être utilisés pour des tâches de classification.\n", "\n", "### [**Quiz avant le cours**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/21/)\n", "\n", "### **Préparation**\n", "\n", "Cette leçon s'appuie sur notre [leçon précédente](https://github.com/microsoft/ML-For-Beginners/blob/main/4-Classification/1-Introduction/solution/lesson_10-R.ipynb) où nous avons :\n", "\n", "- Fait une introduction douce aux classifications en utilisant un jeu de données sur toutes les brillantes cuisines d'Asie et d'Inde 😋.\n", "\n", "- Exploré quelques [verbes dplyr](https://dplyr.tidyverse.org/) pour préparer et nettoyer nos données.\n", "\n", "- Réalisé de belles visualisations avec ggplot2.\n", "\n", "- Montré comment gérer des données déséquilibrées en les prétraitant avec [recipes](https://recipes.tidymodels.org/articles/Simple_Example.html).\n", "\n", "- Démontré comment `prep` et `bake` notre recette pour confirmer qu'elle fonctionne comme prévu.\n", "\n", "#### **Prérequis**\n", "\n", "Pour cette leçon, nous aurons besoin des packages suivants pour nettoyer, préparer et visualiser nos données :\n", "\n", "- `tidyverse` : Le [tidyverse](https://www.tidyverse.org/) est une [collection de packages R](https://www.tidyverse.org/packages) conçue pour rendre la science des données plus rapide, plus facile et plus amusante !\n", "\n", "- `tidymodels` : Le framework [tidymodels](https://www.tidymodels.org/) est une [collection de packages](https://www.tidymodels.org/packages/) pour la modélisation et l'apprentissage automatique.\n", "\n", "- `themis` : Le package [themis](https://themis.tidymodels.org/) fournit des étapes supplémentaires pour les recettes afin de gérer les données déséquilibrées.\n", "\n", "- `nnet` : Le package [nnet](https://cran.r-project.org/web/packages/nnet/nnet.pdf) propose des fonctions pour estimer des réseaux de neurones à propagation avant avec une seule couche cachée, ainsi que des modèles de régression logistique multinomiale.\n", "\n", "Vous pouvez les installer comme suit :\n" ], "metadata": { "id": "iDFOb3ebHwQC" } }, { "cell_type": "markdown", "source": [ "`install.packages(c(\"tidyverse\", \"tidymodels\", \"DataExplorer\", \"here\"))`\n", "\n", "Alternativement, le script ci-dessous vérifie si vous avez les packages nécessaires pour compléter ce module et les installe pour vous s'ils sont manquants.\n" ], "metadata": { "id": "4V85BGCjII7F" } }, { "cell_type": "code", "execution_count": 2, "source": [ "suppressWarnings(if (!require(\"pacman\"))install.packages(\"pacman\"))\r\n", "\r\n", "pacman::p_load(tidyverse, tidymodels, themis, here)" ], "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Loading required package: pacman\n", "\n" ] } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "an5NPyyKIKNR", "outputId": "834d5e74-f4b8-49f9-8ab5-4c52ff2d7bc8" } }, { "cell_type": "markdown", "source": [ "## 1. Divisez les données en ensembles d'entraînement et de test.\n", "\n", "Commençons par reprendre quelques étapes de notre leçon précédente.\n", "\n", "### Supprimez les ingrédients les plus courants qui créent de la confusion entre les cuisines distinctes, en utilisant `dplyr::select()`.\n", "\n", "Tout le monde adore le riz, l'ail et le gingembre !\n" ], "metadata": { "id": "0ax9GQLBINVv" } }, { "cell_type": "code", "execution_count": 3, "source": [ "# Load the original cuisines data\r\n", "df <- read_csv(file = \"https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/4-Classification/data/cuisines.csv\")\r\n", "\r\n", "# Drop id column, rice, garlic and ginger from our original data set\r\n", "df_select <- df %>% \r\n", " select(-c(1, rice, garlic, ginger)) %>%\r\n", " # Encode cuisine column as categorical\r\n", " mutate(cuisine = factor(cuisine))\r\n", "\r\n", "# Display new data set\r\n", "df_select %>% \r\n", " slice_head(n = 5)\r\n", "\r\n", "# Display distribution of cuisines\r\n", "df_select %>% \r\n", " count(cuisine) %>% \r\n", " arrange(desc(n))" ], "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "New names:\n", "* `` -> ...1\n", "\n", "\u001b[1m\u001b[1mRows: \u001b[1m\u001b[22m\u001b[34m\u001b[34m2448\u001b[34m\u001b[39m \u001b[1m\u001b[1mColumns: \u001b[1m\u001b[22m\u001b[34m\u001b[34m385\u001b[34m\u001b[39m\n", "\n", "\u001b[36m──\u001b[39m \u001b[1m\u001b[1mColumn specification\u001b[1m\u001b[22m \u001b[36m────────────────────────────────────────────────────────\u001b[39m\n", "\u001b[1mDelimiter:\u001b[22m \",\"\n", "\u001b[31mchr\u001b[39m (1): cuisine\n", "\u001b[32mdbl\u001b[39m (384): ...1, almond, angelica, anise, anise_seed, apple, apple_brandy, a...\n", "\n", "\n", "\u001b[36mℹ\u001b[39m Use \u001b[30m\u001b[47m\u001b[30m\u001b[47m`spec()`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to retrieve the full column specification for this data.\n", "\u001b[36mℹ\u001b[39m Specify the column types or set \u001b[30m\u001b[47m\u001b[30m\u001b[47m`show_col_types = FALSE`\u001b[47m\u001b[30m\u001b[49m\u001b[39m to quiet this message.\n", "\n" ] }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n", "1 indian 0 0 0 0 0 0 0 0 \n", "2 indian 1 0 0 0 0 0 0 0 \n", "3 indian 0 0 0 0 0 0 0 0 \n", "4 indian 0 0 0 0 0 0 0 0 \n", "5 indian 0 0 0 0 0 0 0 0 \n", " artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n", "1 0 ⋯ 0 0 0 0 0 0 \n", "2 0 ⋯ 0 0 0 0 0 0 \n", "3 0 ⋯ 0 0 0 0 0 0 \n", "4 0 ⋯ 0 0 0 0 0 0 \n", "5 0 ⋯ 0 0 0 0 0 0 \n", " yam yeast yogurt zucchini\n", "1 0 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", "5 0 0 1 0 " ], "text/markdown": [ "\n", "A tibble: 5 × 381\n", "\n", "| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |\n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 381\n", "\\begin{tabular}{lllllllllllllllllllll}\n", " cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n", " & & & & & & & & & & ⋯ & & & & & & & & & & \\\\\n", "\\hline\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t indian & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 381
cuisinealmondangelicaaniseanise_seedappleapple_brandyapricotarmagnacartemisiawhiskeywhite_breadwhite_winewhole_grain_wheat_flourwinewoodyamyeastyogurtzucchini
<fct><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl>
indian0000000000000000000
indian1000000000000000000
indian0000000000000000000
indian0000000000000000000
indian0000000000000000010
\n" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine n \n", "1 korean 799\n", "2 indian 598\n", "3 chinese 442\n", "4 japanese 320\n", "5 thai 289" ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | n <int> |\n", "|---|---|\n", "| korean | 799 |\n", "| indian | 598 |\n", "| chinese | 442 |\n", "| japanese | 320 |\n", "| thai | 289 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & n\\\\\n", " & \\\\\n", "\\hline\n", "\t korean & 799\\\\\n", "\t indian & 598\\\\\n", "\t chinese & 442\\\\\n", "\t japanese & 320\\\\\n", "\t thai & 289\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
cuisinen
<fct><int>
korean 799
indian 598
chinese 442
japanese320
thai 289
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 735 }, "id": "jhCrrH22IWVR", "outputId": "d444a85c-1d8b-485f-bc4f-8be2e8f8217c" } }, { "cell_type": "markdown", "source": [ "Parfait ! Maintenant, il est temps de diviser les données de manière à ce que 70 % des données soient utilisées pour l'entraînement et 30 % pour les tests. Nous appliquerons également une technique de `stratification` lors de la division des données afin de `maintenir la proportion de chaque cuisine` dans les ensembles de données d'entraînement et de validation.\n", "\n", "[rsample](https://rsample.tidymodels.org/), un package de Tidymodels, offre une infrastructure pour une division et un échantillonnage efficaces des données :\n" ], "metadata": { "id": "AYTjVyajIdny" } }, { "cell_type": "code", "execution_count": 4, "source": [ "# Load the core Tidymodels packages into R session\r\n", "library(tidymodels)\r\n", "\r\n", "# Create split specification\r\n", "set.seed(2056)\r\n", "cuisines_split <- initial_split(data = df_select,\r\n", " strata = cuisine,\r\n", " prop = 0.7)\r\n", "\r\n", "# Extract the data in each split\r\n", "cuisines_train <- training(cuisines_split)\r\n", "cuisines_test <- testing(cuisines_split)\r\n", "\r\n", "# Print the number of cases in each split\r\n", "cat(\"Training cases: \", nrow(cuisines_train), \"\\n\",\r\n", " \"Test cases: \", nrow(cuisines_test), sep = \"\")\r\n", "\r\n", "# Display the first few rows of the training set\r\n", "cuisines_train %>% \r\n", " slice_head(n = 5)\r\n", "\r\n", "\r\n", "# Display distribution of cuisines in the training set\r\n", "cuisines_train %>% \r\n", " count(cuisine) %>% \r\n", " arrange(desc(n))" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Training cases: 1712\n", "Test cases: 736" ] }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine almond angelica anise anise_seed apple apple_brandy apricot armagnac\n", "1 chinese 0 0 0 0 0 0 0 0 \n", "2 chinese 0 0 0 0 0 0 0 0 \n", "3 chinese 0 0 0 0 0 0 0 0 \n", "4 chinese 0 0 0 0 0 0 0 0 \n", "5 chinese 0 0 0 0 0 0 0 0 \n", " artemisia ⋯ whiskey white_bread white_wine whole_grain_wheat_flour wine wood\n", "1 0 ⋯ 0 0 0 0 1 0 \n", "2 0 ⋯ 0 0 0 0 1 0 \n", "3 0 ⋯ 0 0 0 0 0 0 \n", "4 0 ⋯ 0 0 0 0 0 0 \n", "5 0 ⋯ 0 0 0 0 0 0 \n", " yam yeast yogurt zucchini\n", "1 0 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", "5 0 0 0 0 " ], "text/markdown": [ "\n", "A tibble: 5 × 381\n", "\n", "| cuisine <fct> | almond <dbl> | angelica <dbl> | anise <dbl> | anise_seed <dbl> | apple <dbl> | apple_brandy <dbl> | apricot <dbl> | armagnac <dbl> | artemisia <dbl> | ⋯ ⋯ | whiskey <dbl> | white_bread <dbl> | white_wine <dbl> | whole_grain_wheat_flour <dbl> | wine <dbl> | wood <dbl> | yam <dbl> | yeast <dbl> | yogurt <dbl> | zucchini <dbl> |\n", "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "| chinese | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 381\n", "\\begin{tabular}{lllllllllllllllllllll}\n", " cuisine & almond & angelica & anise & anise\\_seed & apple & apple\\_brandy & apricot & armagnac & artemisia & ⋯ & whiskey & white\\_bread & white\\_wine & whole\\_grain\\_wheat\\_flour & wine & wood & yam & yeast & yogurt & zucchini\\\\\n", " & & & & & & & & & & ⋯ & & & & & & & & & & \\\\\n", "\\hline\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\t chinese & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & ⋯ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 381
cuisinealmondangelicaaniseanise_seedappleapple_brandyapricotarmagnacartemisiawhiskeywhite_breadwhite_winewhole_grain_wheat_flourwinewoodyamyeastyogurtzucchini
<fct><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl><dbl>
chinese0000000000000100000
chinese0000000000000100000
chinese0000000000000000000
chinese0000000000000000000
chinese0000000000000000000
\n" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ " cuisine n \n", "1 korean 559\n", "2 indian 418\n", "3 chinese 309\n", "4 japanese 224\n", "5 thai 202" ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | n <int> |\n", "|---|---|\n", "| korean | 559 |\n", "| indian | 418 |\n", "| chinese | 309 |\n", "| japanese | 224 |\n", "| thai | 202 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & n\\\\\n", " & \\\\\n", "\\hline\n", "\t korean & 559\\\\\n", "\t indian & 418\\\\\n", "\t chinese & 309\\\\\n", "\t japanese & 224\\\\\n", "\t thai & 202\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
cuisinen
<fct><int>
korean 559
indian 418
chinese 309
japanese224
thai 202
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 535 }, "id": "w5FWIkEiIjdN", "outputId": "2e195fd9-1a8f-4b91-9573-cce5582242df" } }, { "cell_type": "markdown", "source": [ "## 2. Gérer les données déséquilibrées\n", "\n", "Comme vous avez peut-être remarqué dans le jeu de données original ainsi que dans notre ensemble d'entraînement, il y a une distribution assez inégale dans le nombre de cuisines. Les cuisines coréennes sont *presque* 3 fois plus nombreuses que les cuisines thaïlandaises. Les données déséquilibrées ont souvent des effets négatifs sur les performances du modèle. De nombreux modèles fonctionnent mieux lorsque le nombre d'observations est équilibré et, par conséquent, ont tendance à rencontrer des difficultés avec des données déséquilibrées.\n", "\n", "Il existe principalement deux façons de gérer des ensembles de données déséquilibrés :\n", "\n", "- ajouter des observations à la classe minoritaire : `Sur-échantillonnage`, par exemple en utilisant un algorithme SMOTE qui génère de manière synthétique de nouveaux exemples de la classe minoritaire en utilisant les plus proches voisins de ces cas.\n", "\n", "- supprimer des observations de la classe majoritaire : `Sous-échantillonnage`\n", "\n", "Dans notre leçon précédente, nous avons démontré comment gérer des ensembles de données déséquilibrés en utilisant une `recette`. Une recette peut être considérée comme un plan décrivant les étapes à appliquer à un ensemble de données pour le préparer à l'analyse. Dans notre cas, nous souhaitons obtenir une distribution égale du nombre de nos cuisines pour notre `ensemble d'entraînement`. Passons directement à la pratique.\n" ], "metadata": { "id": "daBi9qJNIwqW" } }, { "cell_type": "code", "execution_count": 5, "source": [ "# Load themis package for dealing with imbalanced data\r\n", "library(themis)\r\n", "\r\n", "# Create a recipe for preprocessing training data\r\n", "cuisines_recipe <- recipe(cuisine ~ ., data = cuisines_train) %>% \r\n", " step_smote(cuisine)\r\n", "\r\n", "# Print recipe\r\n", "cuisines_recipe" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "Data Recipe\n", "\n", "Inputs:\n", "\n", " role #variables\n", " outcome 1\n", " predictor 380\n", "\n", "Operations:\n", "\n", "SMOTE based on cuisine" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 200 }, "id": "Az6LFBGxI1X0", "outputId": "29d71d85-64b0-4e62-871e-bcd5398573b6" } }, { "cell_type": "markdown", "source": [ "Vous pouvez bien sûr confirmer (en utilisant prep+bake) que la recette fonctionnera comme prévu - toutes les étiquettes de cuisine ayant `559` observations.\n", "\n", "Étant donné que nous utiliserons cette recette comme préprocesseur pour la modélisation, un `workflow()` effectuera tout le travail de préparation et de cuisson pour nous, donc nous n'aurons pas à estimer manuellement la recette.\n", "\n", "Nous sommes maintenant prêts à entraîner un modèle 👩‍💻👨‍💻 !\n", "\n", "## 3. Choisir votre classificateur\n", "\n", "

\n", " \n", "

Illustration par @allison_horst
\n" ], "metadata": { "id": "NBL3PqIWJBBB" } }, { "cell_type": "markdown", "source": [ "Maintenant, nous devons décider quel algorithme utiliser pour cette tâche 🤔.\n", "\n", "Dans Tidymodels, le [`package parsnip`](https://parsnip.tidymodels.org/index.html) offre une interface cohérente pour travailler avec des modèles à travers différents moteurs (packages). Veuillez consulter la documentation de parsnip pour explorer les [types de modèles et moteurs](https://www.tidymodels.org/find/parsnip/#models) ainsi que leurs [arguments de modèles correspondants](https://www.tidymodels.org/find/parsnip/#model-args). La variété peut sembler déroutante au premier abord. Par exemple, les méthodes suivantes incluent toutes des techniques de classification :\n", "\n", "- Modèles de classification basés sur des règles C5.0\n", "\n", "- Modèles discriminants flexibles\n", "\n", "- Modèles discriminants linéaires\n", "\n", "- Modèles discriminants régularisés\n", "\n", "- Modèles de régression logistique\n", "\n", "- Modèles de régression multinomiale\n", "\n", "- Modèles de Bayes naïfs\n", "\n", "- Machines à vecteurs de support\n", "\n", "- Plus proches voisins\n", "\n", "- Arbres de décision\n", "\n", "- Méthodes d'ensemble\n", "\n", "- Réseaux de neurones\n", "\n", "Et la liste continue !\n", "\n", "### **Quel classifieur choisir ?**\n", "\n", "Alors, quel classifieur devriez-vous choisir ? Souvent, tester plusieurs modèles et rechercher un bon résultat est une méthode pour évaluer.\n", "\n", "> AutoML résout ce problème de manière élégante en effectuant ces comparaisons dans le cloud, vous permettant de choisir le meilleur algorithme pour vos données. Essayez-le [ici](https://docs.microsoft.com/learn/modules/automate-model-selection-with-azure-automl/?WT.mc_id=academic-77952-leestott)\n", "\n", "Le choix du classifieur dépend également de notre problème. Par exemple, lorsque le résultat peut être catégorisé en `plus de deux classes`, comme dans notre cas, vous devez utiliser un `algorithme de classification multiclasses` plutôt qu'une `classification binaire.`\n", "\n", "### **Une meilleure approche**\n", "\n", "Une meilleure approche que de deviner au hasard est de suivre les idées présentées dans cette [fiche pratique ML téléchargeable](https://docs.microsoft.com/azure/machine-learning/algorithm-cheat-sheet?WT.mc_id=academic-77952-leestott). Ici, nous découvrons que, pour notre problème multiclasses, nous avons plusieurs options :\n", "\n", "

\n", " \n", "

Une section de la fiche pratique des algorithmes de Microsoft, détaillant les options de classification multiclasses
\n" ], "metadata": { "id": "a6DLAZ3vJZ14" } }, { "cell_type": "markdown", "source": [ "### **Raisonnement**\n", "\n", "Voyons si nous pouvons réfléchir à différentes approches en fonction des contraintes que nous avons :\n", "\n", "- **Les réseaux neuronaux profonds sont trop lourds**. Étant donné notre ensemble de données propre mais minimal, et le fait que nous exécutons l'entraînement localement via des notebooks, les réseaux neuronaux profonds sont trop lourds pour cette tâche.\n", "\n", "- **Pas de classificateur à deux classes**. Nous n'utilisons pas de classificateur à deux classes, ce qui exclut l'approche un-contre-tous.\n", "\n", "- **Un arbre de décision ou une régression logistique pourraient fonctionner**. Un arbre de décision pourrait convenir, ou une régression multinomiale/régression logistique multiclasses pour des données multiclasses.\n", "\n", "- **Les arbres de décision multiclasses boostés résolvent un problème différent**. L'arbre de décision multiclasses boosté est le plus adapté aux tâches non paramétriques, par exemple les tâches conçues pour établir des classements, donc il ne nous est pas utile.\n", "\n", "En général, avant de se lancer dans des modèles d'apprentissage automatique plus complexes, comme les méthodes d'ensemble, il est judicieux de construire le modèle le plus simple possible pour avoir une idée de ce qui se passe. Donc, pour cette leçon, nous commencerons par un modèle de `régression multinomiale`.\n", "\n", "> La régression logistique est une technique utilisée lorsque la variable de résultat est catégorielle (ou nominale). Pour la régression logistique binaire, le nombre de variables de résultat est de deux, tandis que pour la régression logistique multinomiale, le nombre de variables de résultat est supérieur à deux. Voir [Méthodes de régression avancées](https://bookdown.org/chua/ber642_advanced_regression/multinomial-logistic-regression.html) pour en savoir plus.\n", "\n", "## 4. Entraîner et évaluer un modèle de régression logistique multinomiale.\n", "\n", "Dans Tidymodels, `parsnip::multinom_reg()` définit un modèle qui utilise des prédicteurs linéaires pour prédire des données multiclasses en utilisant la distribution multinomiale. Consultez `?multinom_reg()` pour découvrir les différentes façons/moteurs que vous pouvez utiliser pour ajuster ce modèle.\n", "\n", "Pour cet exemple, nous ajusterons un modèle de régression multinomiale via le moteur par défaut [nnet](https://cran.r-project.org/web/packages/nnet/nnet.pdf).\n", "\n", "> J'ai choisi une valeur pour `penalty` un peu au hasard. Il existe de meilleures façons de choisir cette valeur, notamment en utilisant la `validation croisée` et en `ajustant` le modèle, ce que nous aborderons plus tard.\n", ">\n", "> Consultez [Tidymodels : Commencer](https://www.tidymodels.org/start/tuning/) si vous souhaitez en savoir plus sur l'ajustement des hyperparamètres du modèle.\n" ], "metadata": { "id": "gWMsVcbBJemu" } }, { "cell_type": "code", "execution_count": 6, "source": [ "# Create a multinomial regression model specification\r\n", "mr_spec <- multinom_reg(penalty = 1) %>% \r\n", " set_engine(\"nnet\", MaxNWts = 2086) %>% \r\n", " set_mode(\"classification\")\r\n", "\r\n", "# Print model specification\r\n", "mr_spec" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "Multinomial Regression Model Specification (classification)\n", "\n", "Main Arguments:\n", " penalty = 1\n", "\n", "Engine-Specific Arguments:\n", " MaxNWts = 2086\n", "\n", "Computational engine: nnet \n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 166 }, "id": "Wq_fcyQiJvfG", "outputId": "c30449c7-3864-4be7-f810-72a003743e2d" } }, { "cell_type": "markdown", "source": [ "Excellent travail 🥳 ! Maintenant que nous avons une recette et une spécification de modèle, nous devons trouver un moyen de les regrouper dans un objet qui prétraitera d'abord les données, puis ajustera le modèle sur les données prétraitées, tout en permettant également des activités de post-traitement potentielles. Dans Tidymodels, cet objet pratique s'appelle un [`workflow`](https://workflows.tidymodels.org/) et contient commodément vos composants de modélisation ! C'est ce que nous appellerions des *pipelines* en *Python*.\n", "\n", "Alors, regroupons tout dans un workflow !📦\n" ], "metadata": { "id": "NlSbzDfgJ0zh" } }, { "cell_type": "code", "execution_count": 7, "source": [ "# Bundle recipe and model specification\r\n", "mr_wf <- workflow() %>% \r\n", " add_recipe(cuisines_recipe) %>% \r\n", " add_model(mr_spec)\r\n", "\r\n", "# Print out workflow\r\n", "mr_wf" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "══ Workflow ════════════════════════════════════════════════════════════════════\n", "\u001b[3mPreprocessor:\u001b[23m Recipe\n", "\u001b[3mModel:\u001b[23m multinom_reg()\n", "\n", "── Preprocessor ────────────────────────────────────────────────────────────────\n", "1 Recipe Step\n", "\n", "• step_smote()\n", "\n", "── Model ───────────────────────────────────────────────────────────────────────\n", "Multinomial Regression Model Specification (classification)\n", "\n", "Main Arguments:\n", " penalty = 1\n", "\n", "Engine-Specific Arguments:\n", " MaxNWts = 2086\n", "\n", "Computational engine: nnet \n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 333 }, "id": "Sc1TfPA4Ke3_", "outputId": "82c70013-e431-4e7e-cef6-9fcf8aad4a6c" } }, { "cell_type": "markdown", "source": [ "Les flux de travail 👌👌 ! Un **`workflow()`** peut être ajusté de la même manière qu'un modèle. Alors, il est temps d'entraîner un modèle !\n" ], "metadata": { "id": "TNQ8i85aKf9L" } }, { "cell_type": "code", "execution_count": 8, "source": [ "# Train a multinomial regression model\n", "mr_fit <- fit(object = mr_wf, data = cuisines_train)\n", "\n", "mr_fit" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "══ Workflow [trained] ══════════════════════════════════════════════════════════\n", "\u001b[3mPreprocessor:\u001b[23m Recipe\n", "\u001b[3mModel:\u001b[23m multinom_reg()\n", "\n", "── Preprocessor ────────────────────────────────────────────────────────────────\n", "1 Recipe Step\n", "\n", "• step_smote()\n", "\n", "── Model ───────────────────────────────────────────────────────────────────────\n", "Call:\n", "nnet::multinom(formula = ..y ~ ., data = data, decay = ~1, MaxNWts = ~2086, \n", " trace = FALSE)\n", "\n", "Coefficients:\n", " (Intercept) almond angelica anise anise_seed apple\n", "indian 0.19723325 0.2409661 0 -5.004955e-05 -0.1657635 -0.05769734\n", "japanese 0.13961959 -0.6262400 0 -1.169155e-04 -0.4893596 -0.08585717\n", "korean 0.22377347 -0.1833485 0 -5.560395e-05 -0.2489401 -0.15657804\n", "thai -0.04336577 -0.6106258 0 4.903828e-04 -0.5782866 0.63451105\n", " apple_brandy apricot armagnac artemisia artichoke asparagus\n", "indian 0 0.37042636 0 -0.09122797 0 -0.27181970\n", "japanese 0 0.28895643 0 -0.12651100 0 0.14054037\n", "korean 0 -0.07981259 0 0.55756709 0 -0.66979948\n", "thai 0 -0.33160904 0 -0.10725182 0 -0.02602152\n", " avocado bacon baked_potato balm banana barley\n", "indian -0.46624197 0.16008055 0 0 -0.2838796 0.2230625\n", "japanese 0.90341344 0.02932727 0 0 -0.4142787 2.0953906\n", "korean -0.06925382 -0.35804134 0 0 -0.2686963 -0.7233404\n", "thai -0.21473955 -0.75594439 0 0 0.6784880 -0.4363320\n", " bartlett_pear basil bay bean beech\n", "indian 0 -0.7128756 0.1011587 -0.8777275 -0.0004380795\n", "japanese 0 0.1288697 0.9425626 -0.2380748 0.3373437611\n", "korean 0 -0.2445193 -0.4744318 -0.8957870 -0.0048784496\n", "thai 0 1.5365848 0.1333256 0.2196970 -0.0113078024\n", " beef beef_broth beef_liver beer beet\n", "indian -0.7985278 0.2430186 -0.035598065 -0.002173738 0.01005813\n", "japanese 0.2241875 -0.3653020 -0.139551027 0.128905553 0.04923911\n", "korean 0.5366515 -0.6153237 0.213455197 -0.010828645 0.27325423\n", "thai 0.1570012 -0.9364154 -0.008032213 -0.035063746 -0.28279823\n", " bell_pepper bergamot berry bitter_orange black_bean\n", "indian 0.49074330 0 0.58947607 0.191256164 -0.1945233\n", "japanese 0.09074167 0 -0.25917977 -0.118915977 -0.3442400\n", "korean -0.57876763 0 -0.07874180 -0.007729435 -0.5220672\n", "thai 0.92554006 0 -0.07210196 -0.002983296 -0.4614426\n", " black_currant black_mustard_seed_oil black_pepper black_raspberry\n", "indian 0 0.38935801 -0.4453495 0\n", "japanese 0 -0.05452887 -0.5440869 0\n", "korean 0 -0.03929970 0.8025454 0\n", "thai 0 -0.21498372 -0.9854806 0\n", " black_sesame_seed black_tea blackberry blackberry_brandy\n", "indian -0.2759246 0.3079977 0.191256164 0\n", "japanese -0.6101687 -0.1671913 -0.118915977 0\n", "korean 1.5197674 -0.3036261 -0.007729435 0\n", "thai -0.1755656 -0.1487033 -0.002983296 0\n", " blue_cheese blueberry bone_oil bourbon_whiskey brandy\n", "indian 0 0.216164294 -0.2276744 0 0.22427587\n", "japanese 0 -0.119186087 0.3913019 0 -0.15595599\n", "korean 0 -0.007821986 0.2854487 0 -0.02562342\n", "thai 0 -0.004947048 -0.0253658 0 -0.05715244\n", "\n", "...\n", "and 308 more lines." ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "GMbdfVmTKkJI", "outputId": "adf9ebdf-d69d-4a64-e9fd-e06e5322292e" } }, { "cell_type": "markdown", "source": [ "Les résultats affichent les coefficients que le modèle a appris pendant l'entraînement.\n", "\n", "### Évaluer le modèle entraîné\n", "\n", "Il est temps de voir comment le modèle s'est comporté 📏 en l'évaluant sur un jeu de test ! Commençons par faire des prédictions sur le jeu de test.\n" ], "metadata": { "id": "tt2BfOxrKmcJ" } }, { "cell_type": "code", "execution_count": 9, "source": [ "# Make predictions on the test set\n", "results <- cuisines_test %>% select(cuisine) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test))\n", "\n", "# Print out results\n", "results %>% \n", " slice_head(n = 5)" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " cuisine .pred_class\n", "1 indian thai \n", "2 indian indian \n", "3 indian indian \n", "4 indian indian \n", "5 indian indian " ], "text/markdown": [ "\n", "A tibble: 5 × 2\n", "\n", "| cuisine <fct> | .pred_class <fct> |\n", "|---|---|\n", "| indian | thai |\n", "| indian | indian |\n", "| indian | indian |\n", "| indian | indian |\n", "| indian | indian |\n", "\n" ], "text/latex": [ "A tibble: 5 × 2\n", "\\begin{tabular}{ll}\n", " cuisine & .pred\\_class\\\\\n", " & \\\\\n", "\\hline\n", "\t indian & thai \\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\t indian & indian\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 2
cuisine.pred_class
<fct><fct>
indianthai
indianindian
indianindian
indianindian
indianindian
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 248 }, "id": "CqtckvtsKqax", "outputId": "e57fe557-6a68-4217-fe82-173328c5436d" } }, { "cell_type": "markdown", "source": [ "Beau travail ! Dans Tidymodels, l'évaluation des performances des modèles peut être effectuée à l'aide de [yardstick](https://yardstick.tidymodels.org/) - un package utilisé pour mesurer l'efficacité des modèles à l'aide de métriques de performance. Comme nous l'avons fait dans notre leçon sur la régression logistique, commençons par calculer une matrice de confusion.\n" ], "metadata": { "id": "8w5N6XsBKss7" } }, { "cell_type": "code", "execution_count": 10, "source": [ "# Confusion matrix for categorical data\n", "conf_mat(data = results, truth = cuisine, estimate = .pred_class)\n" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " Truth\n", "Prediction chinese indian japanese korean thai\n", " chinese 83 1 8 15 10\n", " indian 4 163 1 2 6\n", " japanese 21 5 73 25 1\n", " korean 15 0 11 191 0\n", " thai 10 11 3 7 70" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 133 }, "id": "YvODvsLkK0iG", "outputId": "bb69da84-1266-47ad-b174-d43b88ca2988" } }, { "cell_type": "markdown", "source": [ "Lorsqu'on traite plusieurs classes, il est généralement plus intuitif de visualiser cela comme une carte thermique, comme ceci :\n" ], "metadata": { "id": "c0HfPL16Lr6U" } }, { "cell_type": "code", "execution_count": 11, "source": [ "update_geom_defaults(geom = \"tile\", new = list(color = \"black\", alpha = 0.7))\n", "# Visualize confusion matrix\n", "results %>% \n", " conf_mat(cuisine, .pred_class) %>% \n", " autoplot(type = \"heatmap\")" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "plot without title" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tMTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////isF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3deWBU9b3//0+ibApWrbYuvYorXaxoaatWvVqpqG2HsCmLBAqoVXBDjCKbKMqOQUDFFVxKqyhVFLUqWKJsxg3Lz2IFGilLiEqptMX0hpzvnJkMCbx5/W5vz5k5Z+D5/OOc85nEz3w8Mw9mMjmo84gocC7qBRDtCQGJKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQijHkLb+NUZVRb2Ahn26OeoVNCxWpyZWi/mbeGbnGNJl42PUmbNiVM+7nohRlz4ao/o/HqOuFc/sHEMasShGdfwsRt3+1qYYNXJ9jLq9MkZNE89sIMUkIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSbI+EVLQktatJvB8FpNGtv9LoqMteTx7d/f2vND6h5M3oIS07xT0XxjyhQJpx+sGNj79pbfCJgkJ6o7V72t/f4FKdFSmkRa3dnNTBgnYHNPneY7GCVPvB1ggg3ezaTZraq+C8RYvGF7a64cbW7rLIIU1sdmR8IE1ynX4957qC9pFDGtvsiDSkywsn+D0eJaRxycWkIC1pcdzYSecUzIwTpP+0YJBOONJ/CTqncP6iI49YsGjRwqMOjhrSS03GTY0PpJNaVia3P92nImJIc5vcWZqG1LVFoInCgPRCkzF3pyF1bLa8snLdd1pGC+nTOy8uvvdLr+iVEZ2KF/hv7WoTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZgHS8cf624sKF5RdO84/+plbEDGk8oWfxQjSt7/pb7vu80ngmYJBWvTa+jpIPz0ickhLFlSmIa1vVuSPR7lXI4V0w9jN6wdM94qu+fCfj3XZ5v+MVDRwi/dKl23eoPFfVD/es3pj+/e3b7xudmaYDUjD3C+fmz+6aee64Zsnfz3QdOF82BAjSFPckOV/nrFfv+AzBf6woQ7SWa3Wr18dLaRkaUiL3FB/MMdNjhLS6sTG5KbcK3ra8zYmKlKQ5nrepsQnqxKbkz8zdStblVjtedu9zDD5zyxpn+wPIUJadFsz5wp7pz5i+P1vH2i372gg7dT9+yfPz/WVwScKC1Lrlh0PdAddvyYOkJ5zd/mDN9ywKCG92b42tS9anHwvl/g4BSl9WJZINbv2ng4ls9Z7mWHye9/4cbL3QoR0T/MzRt91SWHqI4bJzh0+KdBsex6kZw/4yYzfXL7PzfGB1LKw20P3F7mL4gDpSTfVHyxzN0YJaVH77WlIS+ohpQ+XJjJv4zbNG9mhrH64uwJBeuPwE/0Xo66FTya3L44f2ragF5AatPGo7/ovRlcULo0NpLff87dd3ZwYQHrOTfIHZW54lJDWJCo876MXdgNpbWJl8usbvZotyd30wZlhFiA97VJwJrjMLL9wDwGpvrfddf7uCXdPbCCle8IFmi8kSEvcLf7gqfQLU1SQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpgVSD383Wg3+PkbHkyTugVI9ZW7/v7uEXdXbCCtXOlv73fjYgBpQ4uf+4MhrixSSFvu6NJz2rbdQdo8ruslJSu82ll9Ova6+++ZYRYgvdH8mDeSuw7usRcLT/WPLnGTgVTfxq+02pjc9Xa/jwukdwsv9AfnFbwRA0iVlzZ5p7Jy7bHfDjTXHnGt3UB32qgJFxe2XbSo2H332pLzC77zRsSQ5pWW9nADSkvfiQOkTXe6Hz/wxOWFRcFnCgbp2QkTurorJ0xYvL6Paztu1OmuX6DpgkGaO2lSN9d/0qRlle8dfPTQO37QaA6QFo06qWmjlleWLVr0Zkmrps2O7fFqoNlCgNQ7fS2ZezAWkDY9+P39Gp8wZH3wiYJB6ll3Vu5dv3ZM6xZNT5kYaLaAkHrVLWZ6ZeWiC1s0Pe2ZQLPtIZDCjau/ZVz9rQKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEgqINmAJAOSCkg2IMmApAKSDUgyIKmAZAOSDEiqmEDqOSJGnXJfjOpw+z0xqtOUGNU76rPRsMvEMzvHkEa+GqMuHBajznv8tRg1IOoFNGzgKzFqiHhm5xjSvZ/GqF/MjFHd3on6zWXD7ox6AQ27oypG3Sue2UCKSUCSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyhQTpraYHhzDLL/7zp/3YY9zg1MFDHQ7Z92sXz0ge3fTt5o2O6j0jWkjLTnHPhTFPGJDmt23e/OTSquAThQFpXcl/NW45bFPwiUKEVJN4f6fxpkTFrjdlGVLVmS5aSL0bH1wH6QeFF155puswc+b1BUf37HWi6xQppInNjowNpJcbtbx90jnuluAzhQEpsc9V0y9xJcEnChFS7Qdbd4W0601ZhjSp8bmRQhrWqPiyNKQS1y25/f43Z8z82qEPzJz58GEHRAnppSbjpsYG0o8O+ONnn1V9Z7+NgWcKAdJsd1ty+/Mzg78kZfGtXRLSv/29oUD6wwEll0YKadyomXWQftT0ofRND/e4zt+d7R6IEFL5ws/iA2nydH/bx/0p8EwhQOrSfF3wSVKF+9auNrFwRP++8z1v9aAuVy9Mv7WrGN6964gN3o4vZQ/SRSeujxZSsjpIh540c2aDH4tmnPDV/3TCkD5siA+kdOceGnyOECAdfW5VVWXwaapC/xmpaOAW75Uu22r7lW6rGpKGdGXptn+MKfEyX8oepIcK5n0aE0gzCs7t8/WC/S9KvQw9dNfw0/e5BkgNe9jdHnyS4JA2FfaadEzBQf0/iR+kuf5buk/+mNjoeUvSkLZ+6XmLO9RmvpT8xgVtkr0dNqQ/HdL307hAut8deuxVN15Y0Ma/qcS5Q274jyfcIyH9utlFsfjUrsId9b0Hnrqq8Gfxg7TY8zYnPi5rv93zPklDWj6kuLhboibzpeQ3lvdM9mHYkLoeviY2kB50zacndz9xtya3U6+/7LSCBJDqG7dPpw0hTBMc0l/cQWuSu8vcK7GDtCSlZX77Ws9bk4K0odPsam+pD2lJBtJuCg7pqYKHKyoquh1csS4GkGY2+6a/HeT61N3cPkUKSKmudIM+DWOeEH5GanGmv/2NuyvwTNmBtDxR6XllKUhlRTWe92j2IfVzdZ0fB0itDvO317orphQP948Gur5AqmtgQWkIs3wWCqQzjve3j7l7As+UHUjVPUq3rrs5BWllYsW/Fg5OVGUb0tsv+LU74IU34wCplytJbs8oHD+14Jv+p3ftUmMgJXvahfWJRQiQxrunk9su+7wVeKbsQPI+ur7z1e8k/uzfNKN7jylbB3bblGVI6aL9GWlonz5nu4v69Jkw86GWTdr3+6E7f+bMn7nju/c+veC4//QaoTAgzSst7eEGlJYGnyq4gcrjDipN9V7gqUKAtK71foPuLnKXB59pz7rWLmJIP657d3nVzJn3tv3KPocVJ/XM6H1046bfuGj6fzpnGJB6163rwcAzBYf0UeYt+GOBpwrjEqGP+3yt0XFj43WtXZC4+lvF1d8yrv62AUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAckGJBWQZECyAUkFJBmQbEBSAUkGJBuQVECSAcl28S9jVKt2Mer4blGfjob9KOoFNOxnV8Soi8QzO8eQ7loVo7r/JUYNf/KNGDXkTzFq1PoYNU08s3MMafLaGNUz6rcJDbvtmWUxanhFjIrV+8z7xDMbSDEJSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIs/yBt6n1EoUsFpFwHJFn+Qbp437a9+6UCUq4Dkiz/IH312WwBAtL/FpBk+QdpvyogRRWQZPkH6ezXgRRVQJLlH6S3f7gYSBEFJFn+QTrzv9x+R6cCUq4Dkiz/IJ3dNhOQch2QZPkHKfsBSQUkWT5C+uyFBx56+Qsg5T4gyfIP0vZBjfzLGvYfD6ScByRZ/kEa7zo+/OIL91/gHgVSrgOSLP8gfeuG9P6K7wEp1wFJln+QmsxP7+c1A1KuA5Is/yDt/3x6/2xzIOU6IMnyD9JZP672d9vanQukXAckWf5Bmldw1JWjbr/8iMJXgZTrgCTLP0jeb7/pf/z93XnZcgQkGZBkeQjJ89a/VV6ZNUZA0gFJlpeQshyQVECS5RmkVqO9VjsCUq4DkizPIJ1W6p22IyDlOiDJ8gxSTgKSCkiy/IPU5sP0/ulvASnXAUmWf5BceWr3P7c1BlKuA5Is3yC5+rhoNecBSZZvkN6/2xWl/uuQl434C5ByHZBk+QbJ8y74U7YAAel/C0iy/IPkbZyS3FTdtglIOQ9IsvyDtPIw/1OGCnfYaiDlOiDJ8g9Sh+Pf8ncfHt8JSLkOSLL8g3ToI+n9/S2AlOuAJMs/SM2eSO9/tV9MIc07t3nzk8ZW+Ie/P9k9GT2kkvSvC86OGtJrmV9cjF+2bNoPvtL4xMFLo4T0/Dn773/SmDUVFdenV3Vm9JCWneKeC2OefwvSjy6o8Xdf/ODMzC01ifdjBOnZfY8eNuYsd2PycHSzI+IA6ZeFd/n9OmpIbw5JdX7Br5ZNKmx1402nuCsihPTbfY8eOvosN6iiom/hWL+ZkUOa2OzIHEJ6ueDYASNH9Dm08OXMLbUfbI0RpNNbvLt2bcW391uz9rdNRk2KA6TuBwSfI10Yb+1eP7TDsmXfOLJs2bJFRx8cIaTTWrxdUbHmW/utqri4RaCJQoP0UpNxU3MIyXuljf9CfHJc/4bs+Lv9bbFbvrbsd2tjAelnRwafI10YkC458NVliwdO9A9/7sqigzRusr/t6d6ruPDweEAqX/hZTiF53mcf/H8N/4vF/lu7iuHdu47Y4FUnXh7cr+9SLzOuTSwc0b/vfM/bPL5Xl8GrPO+1qzoX31u9Y5gFSOnOPiS1iwWks1tVVa0NPk1VKJCeLCzJHC5tfVigqcL4sOHsQyoqzjyxomJlDCAlyzGkXfIhXVm67R9jSpKH1/3Ve7XDlszYKxq4xXulyzZv0Pgvqh/vWb2x/fvbN143OzNM/sP/XJfsy7Ah3eeGxQfSKcd0PsgdNOgvsYB0/qFvpPZvzH34gn3HRg3pHje0ouLklkUHuoOu/WhvgrTbvyHrQ9qatLC4Q21N4jnP2971lczYK5rreZsSn6xKbE7+LNWtbFVidfLrXmaY/IcXtEn2dsiQZjZrVxEfSMcU9pj5UEf30+AzBYf0ZOGg9MFU5w4vDTZXcEiPNDt/TUVFy8JL7r8n4S7YmyDt9m/I+pCWDyku7paoqUksS95w1azM2CtanHxbl/i4LJFqdu09HUpmrfcyw+T3rrg52c4XSQSGNGqfotVr4wPp/RX+trubG3im4JC6Nn49ffC7icPPL/hFtJBu36f9x8ndknJ/cLF7ai+CtNuSkDZ0ml3tLfUh+f9fzCt+nRl7RUtSkJYmquu+edO8kR3K6oe7Kyikfu7aT9bGCFK637hRgecIDGnp13/UYNTXzYgSUl93zZ/rR4+6EUB6v6yoxvMe9SE97XnVnV/LjDOQ1iZWJr9xo1ezJbmbPjgzzAqkqwvG7jiOBaTVq/3tQ25i4JkCQ3rE3eLvXip5xN/d5YZGCGlAwZj0wYoV/vYeN3ovgrR/g3b8DdkkpJWJFf9aODhRVZMYUFE9q+PfMuMMJG9oSVXNi10+f7XPx7Wbh0zJDLMB6VduZP0gDpA+KLzI37UtWBI9pKvdr/zd7wq/tyS56+qmRgfp8cwr0LLCdv7u3ILX9yJIXZO1anRG5w6nFLS5ugEkb0b3HlO2Duy2IfHiTZ37lXuZ8aYMpM3jul5SssKrndWnY6+7/54ZZgHSmmMPHDvOb8naOePGXeJ+OW7cm9FCqurnzp8w+gx3efCZAkP6uft9at/bnXz9ze0KTloSGaRVxxw4JnVBw6KK3u680SN/6PoEmS4MSPNKS3u4AaWl7+QAUrLZJ23wdyu/ObchpB2H74g5/g8FgvR+5oKyB9deWnc0LWJIG8efckDTU0uDTxQc0tmF6f3Swa2aNjuu5+uBJgsE6d3M4/RAxeo7Tm7RtPW4QI7CgNQ788zJDaSTnkrv72tdd8P2jxI7PnWLHlLYcfW3jKu/Vf8WpMavpfezm9TdsLDDqFog5SQgyfIP0hGXpna1XQ8PTgZI/7eAJMs/SLe67147atSAb7nBQMp1QJLlH6TacYf7P5EdMrwGSLkOSLL8g5Sk9Mmypau3Z4sRkHRAkuUjpG1vzfnU+x8g5T4gyfIQ0sQWzi3xhvwia5SApAKSLP8gPeDaT09CenTf8UDKdUCS5R+kk6/0tiUhebecCKRcByRZ/kFq+moa0u8aASnXAUmWf5C+9nwa0lMHACnXAUmWf5B+cs4/fUifn9QOSLkOSLL8g/T6Psdf5/r2PqDRm0DKdUCS5R8k77VT/Ssbfvj7bDkCkgxIsjyE5Hmb3ntvc9YYAUkHJFn+QToje/+JVSD9LwFJln+QvjEJSFEFJFn+QXruW7/9F5CiCUiy/IN09ndd4yOO9gNSrgOSLP8gnXle27qAlOuAJMs/SNkPSCogyfIO0rZlb24BUkQBSZZvkCa3cK5R/y/FNwIpuwFJlmeQnnEtbxh2lrtafCOQshuQZHkG6eyW/v8utm+jvwEpioAkyzNIzYf727dc1i5YBdL/X0CS5Rkkd7+/3eBeFt8JpKwGJFm+QXrQ3250LwEpioAkAxKQ/v2AJMs3SLcsSTbPlfo7IOU6IMnyDVLDgJTrgCTLM0i3NgxIuQ5IsjyDlJOApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZAOSCkgyINmApAKSDEg2IKmAJAOSDUgqIMmAZOvQM0Yd/4sYdWq7TjHqB5fGqJ/2iVEXimd2jiFNWR+jij+PUaOWboxRiWkxanTUj03DYvKKBCQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiQbkFRAkgHJBiQVkGRAsgFJBSQZkGxAUgFJBiRbUEhvtHZP+/sbXKqzIodUduEBTdr8KoSJAkNa1No9s+tRFJBGHOWuSx1cf3zjxifcsPNtkUEK7XEKBVJN4p1oIY1tdkQa0uWFE/wejxrS2y2OmzD53IIngs8UFNK45Kl5ZpejKCB1a3xQGs2V7shLLj1s35sa3hYZpPAepz0C0twmd5amIXVtEWii0CB1bvbh559vOumY4DMFhPR8k9GT03zqj6KANKjRJT3TaA498K5p0ya0aNXwtsgghfc47RGQFr22vg7ST4+IBaSqZh393Wj3euCpAkJaPH9jHZ/6oygg3XrLtDSaMe4sf9y2YHz9bZFBCvFxCg1SzbCRNX8d36tzyYfe9sTv+k32No/v1WXwKs+rGN6964gNXm1i4Yj+fednBVKyOkhntVq/fnX0kJa54f5urpsaeKrgHzbU84kQUrI0mjvcef6gixtYf1tkkEJ8nEKDVFrypTfo1i1fPtz1b17RwFX/9AaN/6L68Z7V3pWl2/4xpsRL3rjFe6XLtuS3f74s2d+yAql1y44HuoOuXxMxpBfc3f5uiRsReKo9DdLU/Y7yB23cZTGAFOLjFBakJ/p/4a1OrPW86osXeEVPet6qxGbPq+1W5m390vMWd6j1iuZ63qbEJ8lvX9Am2dtZgdSysNtD9xe5iyKG9Iy7z9+9424KPNWeBmlawv33yNsuaOH6xgBSiI9TSJDGJv7geW+2r00O+v/GKyrzvLJEqtne8iHFxd0SNV7RYs/bnPcB240AABDoSURBVPg4+R2rpyRblxVIb7/nb7u6OdFCmucm+7vF7tbAU+1xkO4+r8C5b13qrowBpBAfp5Ag9RsxsKYO0lVPeEVLPG9pojr1tQ2dZlcnBzWpG9OQdlNYkNI94UZGC+ltN8zfzUn/gReoPQ7StGljS+5M/ow0LAaQQnycQoJUvrXPI94a/43bts7zU2bWJlYmv7LRKyuq8bxHcwVp5Up/e78bFy2kT1sk/N1wtzjwVHsgJL/v7jclBpBCfJxC+7BhRYd3vZKRX2y7r+c/Uma8oSVVNS92+XxlYsW/Fg5OVOUE0ruFF/qD8wreiBbS58VNln/++YZjvxN8pj0O0umHTp42bXDhORZX7iGF+DiF93ukx4u3VN3R89Lbkj/8pCBtHtf1kpIVnjeje48pWwd225RFSM9OmNDVXTlhwuL1fVzbcaNOd/0CTRcCpD98teXwMT9sNDf4TAEhzZ04sZu7auLEpQ2OooB0Q48ep7u2PXqMnHZFwQnFHZp/dWzD2yKDFN7jtEdca9czfYWdu3f92jGtWzQ9ZWKg2UK51m7ZRS2anvFcCBMFhFRcd2rua3AUBaSz6u69z7Rpfb7RqPlpd+58W1SQwnuc9ghIIcfV3zKu/lYByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSRUTSB2LY9SJfWJUm469Y1TLs2NUIurHpmEXimd2jiFNq4xRvaP+c79ht77zWYwatSlGXV8eo24Xz2wgxSQgyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSDYgqYAkA5INSCogyYBkA5IKSDIg2YCkApIMSLagkBa1dnNSBwvaHdDke49FCym5mGd2PYoW0pz/PrjJSZM+DT5RcEh/cnXNjBbSgsw6JpSXzzq7eeOT7oo1pKIlOw1rEu9nBdK4ZkekIS1pcdzYSecUzIwSkr+YZ3Y5ihbSrMKTx0443Q2OA6R1d6UqKng9WkiLh6Y6v2BW+Zz9j7p56GkFE+MKafnHBlLtB1uzAemFJmPuTkPq2Gx5ZeW677SMENLzTUZPTvOpP4oYUsuj13322cbjD40DpHSrDy8OPkkIb+0WHtqxvPyCpi+Vly894RtxhXTbiwaSLBikJQsq05DWNyvyx6Pcq9FBWjx/Yx2f+qNoIVXe8YS/6+HWxQbSZQd/FHySECB1PXB++bKm5/uHg9wT8YQ0pH2n672iV0Z0Kl7geRXDu3cdsSFrb+0q6yAtckP9wRw3OTpIyer5xAJSuk9P+0bwSUKC9Gbh2BBmCQ5pduHN5eVPuwH+8XQ3Ip6QvH7+K9I1H/7zsS7bvCtLt/1jTEkdpPXPJKvKBqTn3F3+4A03DEg7tWH5y50bzQw+T0iQOhz+lxBmCQ6p3aGLyssfcMP846fc1XGG9LTnbUxUeFu/9LzFHWrTkBa0SfZ2NiA96ab6g2XuRiDt1DPOHfWbEOYJB9KbhXeGMU1gSLMLb0xup7nb/MGz7vI4Q1rseZsTH3vLhxQXd0vUZP8VaZI/KHPDgbRTH/1qaseCgcHnCQfS5Y1XhzFNYEjdGi9Mbh90Q/3BU+6aOENakoK0odPsam9pBtJuCgnSEneLP3gq/cIEpJ0a5F4NPEcokCqP+EkY0wSG9NbXz/R3c1x/f3dP+oUp3pDKimo879HsQ9rQ4uf+YIgrA1J9fxz3O3/3azc5HpBecpPCmCYwpBnpl6Jl+5/n7wa4p2IKqf/Df89AWplY8a+FgxNV2YZUeWmTdyor1x777UBz7WmQPio8syq5+6V7Jh6Qhrvgv4z1CwrpGjcrte/Q+Pny8kX/dUKgybIIaW7nPhlI3ozuPaZsHdhtQ3YgzZ00qZvrP2nSssr3Dj566B0/aDQnQkhzJ07s5q6aOHFpg6NoIX12nfvhqImdCr5fFQ9I3d2aMKYJDCnhFqb28w48csCgk/edHldI/5eCQepVd9nU9MrKRRe2aHraM4FmCwipuG4x9zU4ihjSp5NObrb/t66uCD5TKJAuKAxjluCQ/ruw7uDpc/Zvcup9wSbbIyCFHFd/y7j6WwUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkFZBsQJIBSQUkG5BkQFIByQYkGZBUQLIBSQYkVUwgzXg8Rg2MegENGzY16hU07NqoF9CwWD1O08QzO8eQiPbMgEQUQkAiCiEgEYUQkIhCCEhEIQQkohACElEIAYkohIBEFEJAIgohIBGFEJCIQghIRCGUl5CmTY56BQ2ad+emqJdQ34d3Lo16CfV9eeesqJfQoBl3ZnX6vISUaBf1Chp0R5uPo15Cfa+2eTzqJdS3tc3VUS+hQb2/n9XpgRQ0IKmAFPeApAKSDEg2IKmAJAMSUfwDElEIAYkohPIDUtGS1K4m8X7ECzFL2JSoyPmqYnAaTDWJd6Jegq7u6ZMpK+cvryDVfrA14oWYJSQh5XxVMTgNpthCWv6xgZSV85dXkGJYElLUS4hFsYV024u5efrEHNKnd15cfO+XXtErIzoVL/Bfk2sTC0f07zvf8zaP79Vl8CrPe+2qzsX3Vu8YZruGS1g9qMvVC9Nv7SqGd+86YoO340vZXkPmDqsTLw/u13epZxaQ69PjQ6oZNrLmr+N7dS750Nue+F2/yTvuNadnZ+eGtO90febpk1nH3vjW7oaxm9cPmO4VXfPhPx/rss0/A0UDt3ivdNnmDRr/RfXjPas3tn9/+8brZmeGWV9QgyXU9ivdVjUkDenK0m3/GFPi7Vhd1tdQd4c1iev+6r3aYYtZQK5Pjw+ptORLb9CtW758uOvfkutY9c8d95rTs7NL/fxXpPTTp/6k7XWQVic2JjflXtHTnrcx/ZQtmuu/n/pkVWJz8s1ut7JVidWet93LDLO+ogZL+KO/uCXpVW390vMWd6jNfCn7a6i7w5rEc8l//a6v7LqAnJ+eJKQn+n+RfMDWel71xQu8oie9+nvN6dnZpRSk9NOn/qTtdZDebF+b2hctTr5ZSXycehanD8sSqWbX3tOhZNZ6LzPM+ooaLqH9ds/7JA1p+ZDi4m6JmsyXsr+GujusSSxL3nDVrF0XkPPTU5MYm/hD5gHr/xuvKIl2x73m9OzsUgpS3f3uOGl7HaRF/nPVS/+0mIGUPlyayLxP2TRvZIey+mGWa7CE+f6TZk0K0oZOs6u9pf5TZUluIGXusCaRfI54V/x61wXk/PTUJPqNGFhTB+mqJ1LryNxrbs/OLvV7ccfTp/6k7XWQ1vifiX30wm4grU2sTH59o1ezJbmbPjgzzHoNlrA8Uen/qetDKiuq8bxHcwgpc4c1ieS7lurOr+26gJyfnppE+dY+jyQfsOQbt22d56fWkbnX3J6dXWoAqf6k7XWQvEEjKtddd+9uIHlDS6pqXuzy+at9Pq7dPGRKZpj1BTVYQnWP0q3rbk5BWplY8a+FgxNVOYOUucOaxICK6lkd/2YWkOvT43/YsKLDu17JyC+23dfzH+lPnOvuNbdnZ5f6P/z3zP3Wn7S9D9KWO7r0nLZtd5A2j+t6SckKr3ZWn4697v57Zpj1Gi7ho+s7X/1O4s/+TTO695iydWC3TbmClLnDDYkXb+rcr9wzC8j16Un9Hunx4i1Vd/S89LZ1db+6ydxrTs/OLs3t3GfHA7bjpO19kGg3NfgTNba/B93rAlLetf0j/zPtdECKS0DKuxZ2GFWbOQZSXAISUQgBiSiEgEQUQkAiCiEgEYUQkPK3X7pMp+32622Pzu169uqAlL+9PnXq1Gtd5+TWXNb9nv+4AimHASm/e92V7u7mKUDKcUDK7+ognXn28984w2vd2j8u+qp3QfLtXhuv7XFrLmze/JLsX8lLQMr36iCdd/I373mhHtKfilz5h17blq1HP3tjwS+iXeFeEpDyuzpIbd2c5HYHJK+f23Hjj74W4fL2noCU32UgNf6XZyE19a/J61UY4fL2noCU32UgHeFvd4V0tD/sx0OcizjL+V0G0tH+FkjRxVnO73aCdOpJ/vY0IEUQZzm/2wnSeYckfyja1CwJ6TL3P0DKaZzl/G4nSJPdmMp3f/ydJKQR7rangZTLOMv53U6Qqm84sknr5we08Ly/nNqoFZByGWeZKISARBRCQCIKISARhRCQiEIISEQhBCSiEAISUQgBiSiEgEQUQkAiCiEgEYXQ/wMhANIDIZLX1QAAAABJRU5ErkJggg==" }, "metadata": { "image/png": { "width": 420, "height": 420 } } } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 436 }, "id": "HsAtwukyLsvt", "outputId": "3032a224-a2c8-4270-b4f2-7bb620317400" } }, { "cell_type": "markdown", "source": [ "Les cases plus foncées dans le graphique de la matrice de confusion indiquent un grand nombre de cas, et vous devriez normalement voir une ligne diagonale de cases plus foncées représentant les cas où l'étiquette prédite et l'étiquette réelle sont identiques.\n", "\n", "Calculons maintenant les statistiques récapitulatives pour la matrice de confusion.\n" ], "metadata": { "id": "oOJC87dkLwPr" } }, { "cell_type": "code", "execution_count": 12, "source": [ "# Summary stats for confusion matrix\n", "conf_mat(data = results, truth = cuisine, estimate = .pred_class) %>% \n", "summary()" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " .metric .estimator .estimate\n", "1 accuracy multiclass 0.7880435\n", "2 kap multiclass 0.7276583\n", "3 sens macro 0.7780927\n", "4 spec macro 0.9477598\n", "5 ppv macro 0.7585583\n", "6 npv macro 0.9460080\n", "7 mcc multiclass 0.7292724\n", "8 j_index macro 0.7258524\n", "9 bal_accuracy macro 0.8629262\n", "10 detection_prevalence macro 0.2000000\n", "11 precision macro 0.7585583\n", "12 recall macro 0.7780927\n", "13 f_meas macro 0.7641862" ], "text/markdown": [ "\n", "A tibble: 13 × 3\n", "\n", "| .metric <chr> | .estimator <chr> | .estimate <dbl> |\n", "|---|---|---|\n", "| accuracy | multiclass | 0.7880435 |\n", "| kap | multiclass | 0.7276583 |\n", "| sens | macro | 0.7780927 |\n", "| spec | macro | 0.9477598 |\n", "| ppv | macro | 0.7585583 |\n", "| npv | macro | 0.9460080 |\n", "| mcc | multiclass | 0.7292724 |\n", "| j_index | macro | 0.7258524 |\n", "| bal_accuracy | macro | 0.8629262 |\n", "| detection_prevalence | macro | 0.2000000 |\n", "| precision | macro | 0.7585583 |\n", "| recall | macro | 0.7780927 |\n", "| f_meas | macro | 0.7641862 |\n", "\n" ], "text/latex": [ "A tibble: 13 × 3\n", "\\begin{tabular}{lll}\n", " .metric & .estimator & .estimate\\\\\n", " & & \\\\\n", "\\hline\n", "\t accuracy & multiclass & 0.7880435\\\\\n", "\t kap & multiclass & 0.7276583\\\\\n", "\t sens & macro & 0.7780927\\\\\n", "\t spec & macro & 0.9477598\\\\\n", "\t ppv & macro & 0.7585583\\\\\n", "\t npv & macro & 0.9460080\\\\\n", "\t mcc & multiclass & 0.7292724\\\\\n", "\t j\\_index & macro & 0.7258524\\\\\n", "\t bal\\_accuracy & macro & 0.8629262\\\\\n", "\t detection\\_prevalence & macro & 0.2000000\\\\\n", "\t precision & macro & 0.7585583\\\\\n", "\t recall & macro & 0.7780927\\\\\n", "\t f\\_meas & macro & 0.7641862\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 13 × 3
.metric.estimator.estimate
<chr><chr><dbl>
accuracy multiclass0.7880435
kap multiclass0.7276583
sens macro 0.7780927
spec macro 0.9477598
ppv macro 0.7585583
npv macro 0.9460080
mcc multiclass0.7292724
j_index macro 0.7258524
bal_accuracy macro 0.8629262
detection_prevalencemacro 0.2000000
precision macro 0.7585583
recall macro 0.7780927
f_meas macro 0.7641862
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 494 }, "id": "OYqetUyzL5Wz", "outputId": "6a84d65e-113d-4281-dfc1-16e8b70f37e6" } }, { "cell_type": "markdown", "source": [ "Si nous nous concentrons sur certains indicateurs comme la précision, la sensibilité, le ppv, nous ne sommes pas mal partis pour un début 🥳 !\n", "\n", "## 4. Approfondir\n", "\n", "Posons une question subtile : Quels critères sont utilisés pour choisir un type de cuisine donné comme résultat prédit ?\n", "\n", "Eh bien, les algorithmes d'apprentissage automatique statistiques, comme la régression logistique, sont basés sur la `probabilité` ; ce qui est réellement prédit par un classificateur, c'est une distribution de probabilité sur un ensemble de résultats possibles. La classe avec la probabilité la plus élevée est alors choisie comme le résultat le plus probable pour les observations données.\n", "\n", "Voyons cela en action en réalisant à la fois des prédictions de classes strictes et des probabilités.\n" ], "metadata": { "id": "43t7vz8vMJtW" } }, { "cell_type": "code", "execution_count": 13, "source": [ "# Make hard class prediction and probabilities\n", "results_prob <- cuisines_test %>%\n", " select(cuisine) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test)) %>% \n", " bind_cols(mr_fit %>% predict(new_data = cuisines_test, type = \"prob\"))\n", "\n", "# Print out results\n", "results_prob %>% \n", " slice_head(n = 5)" ], "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ " cuisine .pred_class .pred_chinese .pred_indian .pred_japanese .pred_korean\n", "1 indian thai 1.551259e-03 0.4587877 5.988039e-04 2.428503e-04\n", "2 indian indian 2.637133e-05 0.9999488 6.648651e-07 2.259993e-05\n", "3 indian indian 1.049433e-03 0.9909982 1.060937e-03 1.644947e-05\n", "4 indian indian 6.237482e-02 0.4763035 9.136702e-02 3.660913e-01\n", "5 indian indian 1.431745e-02 0.9418551 2.945239e-02 8.721782e-03\n", " .pred_thai \n", "1 5.388194e-01\n", "2 1.577948e-06\n", "3 6.874989e-03\n", "4 3.863391e-03\n", "5 5.653283e-03" ], "text/markdown": [ "\n", "A tibble: 5 × 7\n", "\n", "| cuisine <fct> | .pred_class <fct> | .pred_chinese <dbl> | .pred_indian <dbl> | .pred_japanese <dbl> | .pred_korean <dbl> | .pred_thai <dbl> |\n", "|---|---|---|---|---|---|---|\n", "| indian | thai | 1.551259e-03 | 0.4587877 | 5.988039e-04 | 2.428503e-04 | 5.388194e-01 |\n", "| indian | indian | 2.637133e-05 | 0.9999488 | 6.648651e-07 | 2.259993e-05 | 1.577948e-06 |\n", "| indian | indian | 1.049433e-03 | 0.9909982 | 1.060937e-03 | 1.644947e-05 | 6.874989e-03 |\n", "| indian | indian | 6.237482e-02 | 0.4763035 | 9.136702e-02 | 3.660913e-01 | 3.863391e-03 |\n", "| indian | indian | 1.431745e-02 | 0.9418551 | 2.945239e-02 | 8.721782e-03 | 5.653283e-03 |\n", "\n" ], "text/latex": [ "A tibble: 5 × 7\n", "\\begin{tabular}{lllllll}\n", " cuisine & .pred\\_class & .pred\\_chinese & .pred\\_indian & .pred\\_japanese & .pred\\_korean & .pred\\_thai\\\\\n", " & & & & & & \\\\\n", "\\hline\n", "\t indian & thai & 1.551259e-03 & 0.4587877 & 5.988039e-04 & 2.428503e-04 & 5.388194e-01\\\\\n", "\t indian & indian & 2.637133e-05 & 0.9999488 & 6.648651e-07 & 2.259993e-05 & 1.577948e-06\\\\\n", "\t indian & indian & 1.049433e-03 & 0.9909982 & 1.060937e-03 & 1.644947e-05 & 6.874989e-03\\\\\n", "\t indian & indian & 6.237482e-02 & 0.4763035 & 9.136702e-02 & 3.660913e-01 & 3.863391e-03\\\\\n", "\t indian & indian & 1.431745e-02 & 0.9418551 & 2.945239e-02 & 8.721782e-03 & 5.653283e-03\\\\\n", "\\end{tabular}\n" ], "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A tibble: 5 × 7
cuisine.pred_class.pred_chinese.pred_indian.pred_japanese.pred_korean.pred_thai
<fct><fct><dbl><dbl><dbl><dbl><dbl>
indianthai 1.551259e-030.45878775.988039e-042.428503e-045.388194e-01
indianindian2.637133e-050.99994886.648651e-072.259993e-051.577948e-06
indianindian1.049433e-030.99099821.060937e-031.644947e-056.874989e-03
indianindian6.237482e-020.47630359.136702e-023.660913e-013.863391e-03
indianindian1.431745e-020.94185512.945239e-028.721782e-035.653283e-03
\n" ] }, "metadata": {} } ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 248 }, "id": "xdKNs-ZPMTJL", "outputId": "68f6ac5a-725a-4eff-9ea6-481fef00e008" } }, { "cell_type": "markdown", "source": [ "✅ Pouvez-vous expliquer pourquoi le modèle est assez sûr que la première observation est en thaï ?\n", "\n", "## **🚀Défi**\n", "\n", "Dans cette leçon, vous avez utilisé vos données nettoyées pour construire un modèle d'apprentissage automatique capable de prédire une cuisine nationale en fonction d'une série d'ingrédients. Prenez le temps de parcourir les [nombreuses options](https://www.tidymodels.org/find/parsnip/#models) que Tidymodels propose pour classifier les données et [d'autres méthodes](https://parsnip.tidymodels.org/articles/articles/Examples.html#multinom_reg-models) pour ajuster une régression multinomiale.\n", "\n", "#### MERCI À :\n", "\n", "[`Allison Horst`](https://twitter.com/allison_horst/) pour avoir créé les illustrations incroyables qui rendent R plus accueillant et engageant. Retrouvez plus d'illustrations dans sa [galerie](https://www.google.com/url?q=https://github.com/allisonhorst/stats-illustrations&sa=D&source=editors&ust=1626380772530000&usg=AOvVaw3zcfyCizFQZpkSLzxiiQEM).\n", "\n", "[Cassie Breviu](https://www.twitter.com/cassieview) et [Jen Looper](https://www.twitter.com/jenlooper) pour avoir créé la version originale en Python de ce module ♥️\n", "\n", "
\n", "J'aurais bien ajouté quelques blagues, mais je ne comprends pas les jeux de mots culinaires 😅.\n", "\n", "
\n", "\n", "Bon apprentissage,\n", "\n", "[Eric](https://twitter.com/ericntay), Gold Microsoft Learn Student Ambassador.\n" ], "metadata": { "id": "2tWVHMeLMYdM" } }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**Avertissement** : \nCe document a été traduit à l'aide du service de traduction automatique [Co-op Translator](https://github.com/Azure/co-op-translator). Bien que nous nous efforcions d'assurer l'exactitude, veuillez noter que les traductions automatisées peuvent contenir des erreurs ou des inexactitudes. Le document original dans sa langue d'origine doit être considéré comme la source faisant autorité. Pour des informations critiques, il est recommandé de faire appel à une traduction humaine professionnelle. Nous déclinons toute responsabilité en cas de malentendus ou d'interprétations erronées résultant de l'utilisation de cette traduction.\n" ] } ] }