{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "lesson_12-R.ipynb", "provenance": [], "collapsed_sections": [] }, "kernelspec": { "name": "ir", "display_name": "R" }, "language_info": { "name": "R" }, "coopTranslator": { "original_hash": "fab50046ca413a38939d579f8432274f", "translation_date": "2025-09-06T15:38:31+00:00", "source_file": "4-Classification/3-Classifiers-2/solution/R/lesson_12-R.ipynb", "language_code": "en" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "jsFutf_ygqSx" }, "source": [ "# Build a classification model: Delicious Asian and Indian Cuisines\n" ] }, { "cell_type": "markdown", "metadata": { "id": "HD54bEefgtNO" }, "source": [ "## Cuisine classifiers 2\n", "\n", "In this second classification lesson, we will explore `additional methods` for classifying categorical data. We will also discuss the implications of choosing one classifier over another.\n", "\n", "### [**Pre-lecture quiz**](https://gray-sand-07a10f403.1.azurestaticapps.net/quiz/23/)\n", "\n", "### **Prerequisite**\n", "\n", "We assume that you have completed the previous lessons since we will be building on concepts introduced earlier.\n", "\n", "For this lesson, the following packages will be required:\n", "\n", "- `tidyverse`: The [tidyverse](https://www.tidyverse.org/) is a [set of R packages](https://www.tidyverse.org/packages) designed to make data science faster, easier, and more enjoyable!\n", "\n", "- `tidymodels`: The [tidymodels](https://www.tidymodels.org/) framework is a [collection of packages](https://www.tidymodels.org/packages/) for modeling and machine learning.\n", "\n", "- `themis`: The [themis package](https://themis.tidymodels.org/) provides additional recipe steps for handling imbalanced data.\n", "\n", "You can install them using the following command:\n", "\n", "`install.packages(c(\"tidyverse\", \"tidymodels\", \"kernlab\", \"themis\", \"ranger\", \"xgboost\", \"kknn\"))`\n", "\n", "Alternatively, the script below 
checks whether the required packages for this module are installed and installs any missing ones for you.\n" ] }, { "cell_type": "code", "metadata": { "id": "vZ57IuUxgyQt" }, "source": [ "# Install pacman if it is missing, then use it to load (and install, if needed) the lesson's packages\n", "suppressWarnings(if (!require(\"pacman\")) install.packages(\"pacman\"))\n", "\n", "pacman::p_load(tidyverse, tidymodels, themis, kernlab, ranger, xgboost, kknn)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "z22M-pj4g07x" }, "source": [ "## **1. A classification map**\n", "\n", "In our [previous lesson](https://github.com/microsoft/ML-For-Beginners/tree/main/4-Classification/2-Classifiers-1), we explored the question: how do we decide between different models? To a large extent, the choice depends on the characteristics of the data and the type of problem we aim to solve (e.g., classification or regression).\n", "\n", "Earlier, we used Microsoft's cheat sheet to survey the options available for classifying data. Python's Machine Learning framework, Scikit-learn, provides a similar but more detailed cheat sheet that can help further refine your choice of estimators (Scikit-learn's term for models such as classifiers):\n", "\n", "
\n",
"