diff --git a/2-Regression/2-Data/solution/R/lesson_2.html b/2-Regression/2-Data/solution/R/lesson_2.html new file mode 100644 index 00000000..97af866c --- /dev/null +++ b/2-Regression/2-Data/solution/R/lesson_2.html @@ -0,0 +1,3534 @@ + + + + + + + + + + + + + +Build a regression model: prepare and visualize data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + +
+

Linear Regression for Pumpkins - Lesson 2

+
+

Introduction

+

Now that you are set up with the tools you need to start tackling machine learning model building with Tidymodels and the Tidyverse, you are ready to start asking questions of your data. As you work with data and apply ML solutions, it’s very important to understand how to ask the right question to properly unlock the potential of your dataset.

+

In this lesson, you will learn:

+
    +
  • How to prepare your data for model-building.

  • How to use ggplot2 for data visualization.
+

The question you need answered will determine which type of ML algorithm you use, and the quality of the answer you get back will depend heavily on the nature of your data.

+

Let’s see this by working through a practical exercise.

+
+ +

Artwork by @allison_horst

+
+
+
+
+

1. Importing pumpkins data and summoning the Tidyverse

+

We’ll require the following packages to slice and dice this lesson:

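  • tidyverse: The tidyverse is a collection of R packages designed to make data science faster, easier and more fun!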

You can install them with:

+

install.packages(c("tidyverse"))

+

The script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.

+
if (!require("pacman")) install.packages("pacman")
+pacman::p_load(tidyverse)
+

Now, let’s fire up some packages and load the data provided for this lesson!

+
# Load the core Tidyverse packages
+library(tidyverse)
+
+# Import the pumpkins data
+pumpkins <- read_csv(file = "https://raw.githubusercontent.com/microsoft/ML-For-Beginners/main/2-Regression/data/US-pumpkins.csv")
+
+
+# Get a glimpse and dimensions of the data
+glimpse(pumpkins)
+
## Rows: 1,757
+## Columns: 26
+## $ `City Name`       <chr> "BALTIMORE", "BALTIMORE", "BALTIMORE", "BALTIMORE", ~
+## $ Type              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Package           <chr> "24 inch bins", "24 inch bins", "24 inch bins", "24 ~
+## $ Variety           <chr> NA, NA, "HOWDEN TYPE", "HOWDEN TYPE", "HOWDEN TYPE",~
+## $ `Sub Variety`     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Grade             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Date              <chr> "4/29/17", "5/6/17", "9/24/16", "9/24/16", "11/5/16"~
+## $ `Low Price`       <dbl> 270, 270, 160, 160, 90, 90, 160, 160, 160, 160, 160,~
+## $ `High Price`      <dbl> 280, 280, 160, 160, 100, 100, 170, 160, 170, 160, 17~
+## $ `Mostly Low`      <dbl> 270, 270, 160, 160, 90, 90, 160, 160, 160, 160, 160,~
+## $ `Mostly High`     <dbl> 280, 280, 160, 160, 100, 100, 170, 160, 170, 160, 17~
+## $ Origin            <chr> "MARYLAND", "MARYLAND", "DELAWARE", "VIRGINIA", "MAR~
+## $ `Origin District` <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ `Item Size`       <chr> "lge", "lge", "med", "med", "lge", "lge", "med", "lg~
+## $ Color             <chr> NA, NA, "ORANGE", "ORANGE", "ORANGE", "ORANGE", "ORA~
+## $ Environment       <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ `Unit of Sale`    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Quality           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Condition         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Appearance        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Storage           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Crop              <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ Repack            <chr> "E", "E", "N", "N", "N", "N", "N", "N", "N", "N", "N~
+## $ `Trans Mode`      <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ ...25             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+## $ ...26             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
+
# Print the first 50 rows of the data set
+pumpkins %>% 
+  slice_head(n = 50)
+
+ +
+

A quick glimpse() immediately shows that there are blanks and a mix of strings (chr) and numeric data (dbl). The Date column is of type character, and there’s also a strange column called Package where the data is a mix of sacks, bins and other values. The data, in fact, is a bit of a mess 😤.

+

In fact, it is not very common to be gifted a dataset that is completely ready to use to create an ML model out of the box. But worry not: in this lesson, you will learn how to prepare a raw dataset using standard R libraries 🧑‍🔧. You will also learn various techniques to visualize the data. 📈📊

+
+

A refresher: The pipe operator (%>%) performs operations in logical sequence by passing an object forward into a function or call expression. You can think of the pipe operator as saying “and then” in your code.
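To make the “and then” reading concrete, here is a minimal sketch comparing a nested call with its piped equivalent. The vector `values` is made up for illustration and is not part of the lesson data.

```r
# Sketch only: `values` is a hypothetical vector, not part of the pumpkins data.
library(dplyr)  # attaching dplyr (or the tidyverse) makes %>% available

values <- c(1, 4, 9)

# Nested call: has to be read from the inside out
nested <- sum(sqrt(values))

# Piped call: take values, and then take square roots, and then add them up
piped <- values %>%
  sqrt() %>%
  sum()

# Both spellings compute the same thing
identical(nested, piped)
```

The piped version tends to pay off as the number of steps grows, because each step sits on its own line in the order it runs.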

+
+
+
+

2. Check for missing data

+

One of the most common issues data scientists need to deal with is incomplete or missing data. R represents missing or unknown values with a special sentinel value: NA (Not Available).

+

So how would we know that the data frame contains missing values?

+
    +
  • One straightforward way would be to use the base R function anyNA(), which returns a single logical value: TRUE if any value is missing, FALSE otherwise.
+
pumpkins %>% 
+  anyNA()
+
## [1] TRUE
+

Great, there seems to be some missing data! That’s a good place to start.

+
    +
  • Another way would be to use the function is.na(), which marks each missing element with a logical TRUE.
+
pumpkins %>% 
+  is.na() %>% 
+  head(n = 7)
+
##      City Name Type Package Variety Sub Variety Grade  Date Low Price
+## [1,]     FALSE TRUE   FALSE    TRUE        TRUE  TRUE FALSE     FALSE
+## [2,]     FALSE TRUE   FALSE    TRUE        TRUE  TRUE FALSE     FALSE
+## [3,]     FALSE TRUE   FALSE   FALSE        TRUE  TRUE FALSE     FALSE
+## [4,]     FALSE TRUE   FALSE   FALSE        TRUE  TRUE FALSE     FALSE
+## [5,]     FALSE TRUE   FALSE   FALSE        TRUE  TRUE FALSE     FALSE
+## [6,]     FALSE TRUE   FALSE   FALSE        TRUE  TRUE FALSE     FALSE
+## [7,]     FALSE TRUE   FALSE   FALSE        TRUE  TRUE FALSE     FALSE
+##      High Price Mostly Low Mostly High Origin Origin District Item Size Color
+## [1,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE  TRUE
+## [2,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE  TRUE
+## [3,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE FALSE
+## [4,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE FALSE
+## [5,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE FALSE
+## [6,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE FALSE
+## [7,]      FALSE      FALSE       FALSE  FALSE            TRUE     FALSE FALSE
+##      Environment Unit of Sale Quality Condition Appearance Storage Crop Repack
+## [1,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [2,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [3,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [4,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [5,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [6,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+## [7,]        TRUE         TRUE    TRUE      TRUE       TRUE    TRUE TRUE  FALSE
+##      Trans Mode ...25 ...26
+## [1,]       TRUE  TRUE  TRUE
+## [2,]       TRUE  TRUE  TRUE
+## [3,]       TRUE  TRUE  TRUE
+## [4,]       TRUE  TRUE  TRUE
+## [5,]       TRUE  TRUE  TRUE
+## [6,]       TRUE  TRUE  TRUE
+## [7,]       TRUE  TRUE  TRUE
+

Okay, that got the job done, but with a large data frame such as this, it would be inefficient and practically impossible to review all of the rows and columns individually 😴.

+
    +
  • A more intuitive way would be to calculate the sum of the missing values for each column:
+
pumpkins %>% 
+  is.na() %>% 
+  colSums()
+
##       City Name            Type         Package         Variety     Sub Variety 
+##               0            1712               0               5            1461 
+##           Grade            Date       Low Price      High Price      Mostly Low 
+##            1757               0               0               0             103 
+##     Mostly High          Origin Origin District       Item Size           Color 
+##             103               3            1626             279             616 
+##     Environment    Unit of Sale         Quality       Condition      Appearance 
+##            1757            1595            1757            1757            1757 
+##         Storage            Crop          Repack      Trans Mode           ...25 
+##            1757            1757               0            1757            1757 
+##           ...26 
+##            1654
+

Much better! There is missing data, but maybe it won’t matter for the task at hand. Let’s see what further analysis brings forth.
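The three checks above can be tried side by side on a tiny data frame where the answers are easy to verify by eye. This is only a sketch: `toy` and its columns are invented for illustration, and the `colMeans()` line is an optional extra that reports the share of missing values rather than the count.

```r
# Sketch only: `toy` is a hypothetical data frame, not the pumpkins data.
toy <- data.frame(
  city  = c("BALTIMORE", "ATLANTA", NA),
  price = c(270, NA, NA),
  size  = c("lge", "med", "sml")
)

# Is anything missing at all? Returns a single TRUE or FALSE
has_missing <- anyNA(toy)

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))

# Share of missing values in each column (between 0 and 1)
missing_share <- colMeans(is.na(toy))

missing_per_column
```

On a real data set, the share is often easier to act on than the raw count: a column that is almost entirely NA is usually a candidate for dropping.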

+
+

Along with its awesome set of packages and functions, R has very good documentation. For instance, use help(colSums) or ?colSums to find out more about the function.

+
+
+
+

3. Dplyr: A Grammar of Data Manipulation

+
+ +

Artwork by @allison_horst

+
+

dplyr, a package in the Tidyverse, is a grammar of data manipulation that provides a consistent set of verbs that help you solve the most common data manipulation challenges. In this section, we’ll explore some of dplyr’s verbs!

+
+

dplyr::select()

+

select() is a function in the package dplyr which helps you pick columns to keep or exclude.
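As a quick illustration of keeping versus excluding, the sketch below runs select() three ways on a small hand-made tibble. The tibble `toy` is hypothetical; only its column names are borrowed from the pumpkins data.

```r
# Sketch only: `toy` is a hypothetical tibble with a few pumpkin-like columns.
library(dplyr)

toy <- tibble(
  Package      = c("24 inch bins", "1/2 bushel cartons"),
  `Low Price`  = c(270, 15),
  `High Price` = c(280, 18),
  Grade        = c(NA, NA)
)

# Keep columns by naming them (backticks are needed for names with spaces)
kept <- toy %>% select(Package, `Low Price`)

# Exclude a column by prefixing it with a minus sign
dropped <- toy %>% select(-Grade)

# Keep every column whose name ends with "Price"
prices <- toy %>% select(ends_with("Price"))

names(prices)
```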

+

To make your data frame easier to work with, use select() to drop the columns you don’t need and keep only the ones you do.

+

For instance, in this exercise, our analysis will involve the columns Package, Low Price, High Price and Date. Let’s select these columns.

+
# Select desired columns
+pumpkins <- pumpkins %>% 
+  select(Package, `Low Price`, `High Price`, Date)
+
+
+# Print data set
+pumpkins %>% 
+  slice_head(n = 5)
+
+ +
+
+
+

dplyr::mutate()

+

mutate() is a function in the package dplyr which helps you create or modify columns, while keeping the existing columns.

+

The general structure of mutate is:

+

data %>% mutate(new_column_name = what_it_contains)
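Here is a minimal sketch of that structure on made-up data, showing both uses of mutate(): adding a new column and overwriting an existing one. The tibble `toy` and the column names `low`, `high` and `spread` are assumptions for illustration only.

```r
# Sketch only: `toy`, `low`, `high` and `spread` are hypothetical names.
library(dplyr)

toy <- tibble(low = c(10, 20), high = c(14, 30))

result <- toy %>%
  mutate(
    spread = high - low,  # new name on the left: a column is added
    low    = low / 2      # existing name on the left: the column is overwritten
  )

result
```

Expressions inside a single mutate() are evaluated in order, so `spread` here is computed from the original `low` values before `low` is halved.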

+

Let’s take mutate out for a spin using the Date column by doing the following operations:

+
    +
  1. Convert the dates (currently of type character) to date objects (these are US dates, so the order is month/day/year, for example 4/29/17).

  2. Extract the month from the dates to a new column.
+

In R, the package lubridate makes it easier to work with date-time data. So, let’s use dplyr::mutate(), lubridate::mdy() and lubridate::month() to achieve the above objectives. We can then drop the Date column, since we won’t need it in subsequent operations.

+
# Load lubridate
+library(lubridate)
+
+pumpkins <- pumpkins %>% 
+  # Convert the Date column to a date object
+  mutate(Date = mdy(Date)) %>% 
+  # Extract month from Date
+  mutate(Month = month(Date)) %>% 
+  # Drop Date column
+  select(-Date)
+
+# View the first few rows
+pumpkins %>% 
+  slice_head(n = 7)
+
+ +
+

Woohoo! 🤩
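To see what the two lubridate calls do on individual values, the sketch below parses three date strings copied from the glimpse() output earlier in this lesson. The vector name `raw_dates` is an assumption for illustration.

```r
# Sketch only: `raw_dates` holds three strings taken from the Date column preview.
library(lubridate)

raw_dates <- c("4/29/17", "5/6/17", "9/24/16")

# mdy() reads month, then day, then year, and returns Date objects
parsed <- mdy(raw_dates)

# month() extracts the month as a number from 1 to 12
months <- month(parsed)

# With label = TRUE, month() returns abbreviated month names in the current locale
month_names <- month(parsed, label = TRUE)

parsed
months
```

If the strings had been in day/month/year order, `dmy()` would be the matching parser; choosing the wrong one silently swaps days and months whenever both readings are valid dates.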

+

Next, let’s create a new column Price, which represents the average price of a pumpkin. We’ll populate it with the average of the Low Price and High Price columns.

+
# Create a new column Price
+pumpkins <- pumpkins %>% 
+  mutate(Price = (`Low Price` + `High Price`)/2)
+
+# View the first few rows of the data
+pumpkins %>% 
+  slice_head(n = 5)
+
+ +
+

Yeees! 💪

+

“But wait!”, you’ll say after skimming through the whole data set with View(pumpkins), “There’s something odd here!” 🤔

+

If you look at the Package column, pumpkins are sold in many different configurations. Some are sold in 1 1/9 bushel measures, some in 1/2 bushel measures, some per pumpkin, some per pound, and some in big boxes with varying widths.

+

Let’s verify this:

+
# Verify the distinct observations in Package column
+pumpkins %>% 
+  distinct(Package)
+
+ +
+

Amazing! 👏

+

Pumpkins seem to be very hard to weigh consistently, so let’s filter them by selecting only pumpkins with the string bushel in the Package column and put the result in a new data frame, new_pumpkins.

+
+
+

dplyr::filter() and stringr::str_detect()

+

dplyr::filter(): creates a subset of the data containing only the rows that satisfy your conditions, in this case, pumpkins with the string bushel in the Package column.

+

stringr::str_detect(): detects the presence or absence of a pattern in a string.

+

The stringr package provides simple functions for common string operations.

+
# Retain only pumpkins with "bushel"
+new_pumpkins <- pumpkins %>% 
+  filter(str_detect(Package, "bushel"))
+
+# Get the dimensions of the new data
+dim(new_pumpkins)
+
## [1] 415   5
+
# View a few rows of the new data
+new_pumpkins %>% 
+  slice_head(n = 5)
+
+ +
+

You can see that we have narrowed the data down to 415 rows containing pumpkins sold by the bushel. 🤩
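The way filter() and str_detect() cooperate is easier to see on a handful of rows. In this sketch the tibble `toy` is hypothetical, with Package values modelled on the kinds of entries described above plus one missing value.

```r
# Sketch only: `toy` is a hypothetical tibble; the Package values are illustrative.
library(dplyr)
library(stringr)

toy <- tibble(
  Package = c("24 inch bins", "1 1/9 bushel cartons", "1/2 bushel cartons", "each", NA),
  Price   = c(275, 15, 18, 3, 10)
)

# str_detect() returns one TRUE, FALSE or NA per element
matches <- str_detect(toy$Package, "bushel")

# filter() keeps only the rows where the condition is TRUE;
# rows where it is FALSE or NA are dropped
bushels <- toy %>%
  filter(str_detect(Package, "bushel"))

matches
bushels
```

Note that the match is case-sensitive, so a value such as "Bushel" would not be kept by this pattern.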

+
+
+

dplyr::case_when()

+

But wait! There’s one more thing to do.

+

Did you notice that the bushel amount varies per row? You need to normalize the pricing so that you show the pricing per bushel, not per 1 1/9 or 1/2 bushel. Time to do some math to standardize it.

+

We’ll use the function case_when() to mutate the Price column depending on some conditions. case_when() allows you to vectorise multiple if_else() statements.

+
# Convert the price if the Package contains fractional bushel values
+new_pumpkins <- new_pumpkins %>% 
+  mutate(Price = case_when(
+    str_detect(Package, "1 1/9") ~ Price/(1 + 1/9),
+    str_detect(Package, "1/2") ~ Price/(1/2),
+    TRUE ~ Price))
+
+# View the first 30 rows of the data
+new_pumpkins %>% 
+  slice_head(n = 30)
+
+ +
+

Now, we can analyze the pricing per unit based on their bushel measurement. All this study of bushels of pumpkins, however, goes to show how very important it is to understand the nature of your data!
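The arithmetic behind the conversion can be checked by hand on three rows. This is a sketch with invented prices: a 1 1/9 bushel carton holds 10/9 of a bushel, so its price is divided by 10/9, and a 1/2 bushel carton holds half a bushel, so its price is doubled. The tibble `toy` is hypothetical.

```r
# Sketch only: `toy` and its prices are made up to keep the arithmetic easy to follow.
library(dplyr)
library(stringr)

toy <- tibble(
  Package = c("1 1/9 bushel cartons", "1/2 bushel cartons", "bushel baskets"),
  Price   = c(15, 18, 20)
)

per_bushel <- toy %>%
  mutate(Price = case_when(
    str_detect(Package, "1 1/9") ~ Price / (1 + 1/9),  # 15 / (10/9) = 13.5
    str_detect(Package, "1/2")   ~ Price / (1/2),      # 18 / 0.5   = 36
    TRUE                         ~ Price               # already priced per bushel
  ))

per_bushel
```

case_when() uses the first condition that is TRUE for each row, so the order of the conditions matters whenever more than one pattern could match the same Package value.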

+
+

✅ According to The Spruce Eats, a bushel’s weight depends on the type of produce, as it’s a volume measurement. “A bushel of tomatoes, for example, is supposed to weigh 56 pounds… Leaves and greens take up more space with less weight, so a bushel of spinach is only 20 pounds.” It’s all pretty complicated! Let’s not bother with making a bushel-to-pound conversion, and instead price by the bushel.

+

✅ Did you notice that pumpkins sold by the half-bushel are very expensive? Can you figure out why? Hint: little pumpkins are way pricier than big ones, probably because there are so many more of them per bushel, given the unused space taken by one big hollow pie pumpkin.

+
+

Now lastly, for the sheer sake of adventure 💁‍♀️, let’s also move the Month column to the first position, i.e., before the Package column.

+

dplyr::relocate() is used to change column positions.

+
# Move the Month column to the first position, before Package
+new_pumpkins <- new_pumpkins %>% 
+  relocate(Month, .before = Package)
+
+new_pumpkins %>% 
+  slice_head(n = 7)
+
+ +
+

Good job! 👌 You now have a clean, tidy dataset on which you can build your new regression model!

+
+
+
+

4. Data visualization with ggplot2

+
+ +

Infographic by Dasani Madipalli

+
+

There is a wise saying that goes like this:

+
+

“The simple graph has brought more information to the data analyst’s mind than any other device.” - John Tukey

+
+

Part of the data scientist’s role is to demonstrate the quality and nature of the data they are working with. To do this, they often create interesting visualizations (plots, graphs, and charts) showing different aspects of the data. In this way, they can visually show relationships and gaps that are otherwise hard to uncover.

+

Visualizations can also help determine the machine learning technique most appropriate for the data. A scatterplot that seems to follow a line, for example, indicates that the data is a good candidate for a linear regression exercise.

+

R offers several systems for making graphs, but ggplot2 is one of the most elegant and versatile. ggplot2 allows you to compose graphs by combining independent components.

+

Let’s start with a simple scatter plot for the Price and Month columns.

+

So in this case, we’ll start with ggplot(), supply a dataset and an aesthetic mapping (with aes()), then add layers (like geom_point() for scatter plots).

+
# Set a theme for the plots
+theme_set(theme_light())
+
+# Create a scatter plot
+p <- ggplot(data = new_pumpkins, aes(x = Price, y = Month))
+p + geom_point()
+

+

Is this a useful plot 🤷? Does anything about it surprise you?

+

It’s not particularly useful, as all it does is display your data as a spread of points for each month.

+
+

How do we make it useful?

+

To get charts to display useful data, you usually need to group the data somehow. In our case, finding the average price of pumpkins for each month would provide more insight into the underlying patterns in our data. This leads us to one more dplyr flyby:

+
+

dplyr::group_by() %>% summarize()

+

Grouped aggregation in R can be easily computed using

+

dplyr::group_by() %>% summarize()

+
    +
  • dplyr::group_by() changes the unit of analysis from the complete dataset to individual groups such as per month.

  • dplyr::summarize() creates a new data frame with one column for each grouping variable and one column for each of the summary statistics that you have specified.
+

For example, we can use dplyr::group_by() %>% summarize() to group the pumpkins by the Month column and then find the mean price for each month.

+
# Find the average price of pumpkins per month
+new_pumpkins %>%
+  group_by(Month) %>% 
+  summarise(mean_price = mean(Price))
+
+ +
+

Succinct!✨
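To make the grouping step concrete, here is a sketch on five hand-made rows where the group means can be checked mentally. The tibble `toy` and the extra summary column `n_rows` are assumptions added for illustration.

```r
# Sketch only: `toy` is a hypothetical tibble; `n_rows` is an extra summary for illustration.
library(dplyr)

toy <- tibble(
  Month = c(9, 9, 10, 10, 10),
  Price = c(10, 20, 30, 40, 50)
)

monthly <- toy %>%
  group_by(Month) %>%
  summarize(
    mean_price = mean(Price),  # one mean per month
    n_rows     = n()           # how many rows went into each mean
  )

monthly
```

Reporting the row count next to the mean is a cheap sanity check: a mean based on very few rows deserves less trust than one based on many.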

+

Categorical features such as months are better represented using a bar plot 📊. The layers responsible for bar charts are geom_bar() and geom_col(). Consult ?geom_bar to find out more.

+

Let’s whip up one!

+
# Find the average price of pumpkins per month then plot a bar chart
+new_pumpkins %>%
+  group_by(Month) %>% 
+  summarise(mean_price = mean(Price)) %>% 
+  ggplot(aes(x = Month, y = mean_price)) +
+  geom_col(fill = "midnightblue", alpha = 0.7) +
+  ylab("Pumpkin Price")
+

+

🤩🤩 This is a more useful data visualization! It seems to indicate that the highest price for pumpkins occurs in September and October. Does that meet your expectation? Why or why not?
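The two bar layers mentioned earlier differ in where the bar heights come from, which the sketch below illustrates on made-up data: geom_bar() counts the rows for each x value, while geom_col() draws the y values you supply. The tibble `toy` is hypothetical, and `layer_data()` is used here only to read back the heights that ggplot2 computed.

```r
# Sketch only: `toy` is a hypothetical tibble used to compare the two bar layers.
library(dplyr)
library(ggplot2)

toy <- tibble(
  Month = c(9, 9, 10),
  Price = c(10, 20, 30)
)

# geom_bar(): bar height is the number of rows for each Month
p_bar <- ggplot(toy, aes(x = Month)) +
  geom_bar()

# geom_col(): bar height is the y value supplied, here the mean price per month
monthly <- toy %>%
  group_by(Month) %>%
  summarize(mean_price = mean(Price))

p_col <- ggplot(monthly, aes(x = Month, y = mean_price)) +
  geom_col()

# Read back the bar heights computed for each plot
bar_heights <- layer_data(p_bar)$y
col_heights <- layer_data(p_col)$y
```

This is why the chart above summarizes first and then uses geom_col(): the heights should be average prices, not row counts.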

+

Congratulations on finishing the second lesson 👏! You prepared your data for model building, then uncovered more insights using visualizations!

+
+
+
+ +
aXphdGlvbnMsIG9yIHBsb3RzLCBncmFwaHMsIGFuZCBjaGFydHMsIHNob3dpbmcgZGlmZmVyZW50IGFzcGVjdHMgb2YgZGF0YS4gSW4gdGhpcyB3YXksIHRoZXkgYXJlIGFibGUgdG8gdmlzdWFsbHkgc2hvdyByZWxhdGlvbnNoaXBzIGFuZCBnYXBzIHRoYXQgYXJlIG90aGVyd2lzZSBoYXJkIHRvIHVuY292ZXIuDQoNClZpc3VhbGl6YXRpb25zIGNhbiBhbHNvIGhlbHAgZGV0ZXJtaW5lIHRoZSBtYWNoaW5lIGxlYXJuaW5nIHRlY2huaXF1ZSBtb3N0IGFwcHJvcHJpYXRlIGZvciB0aGUgZGF0YS4gQSBzY2F0dGVycGxvdCB0aGF0IHNlZW1zIHRvIGZvbGxvdyBhIGxpbmUsIGZvciBleGFtcGxlLCBpbmRpY2F0ZXMgdGhhdCB0aGUgZGF0YSBpcyBhIGdvb2QgY2FuZGlkYXRlIGZvciBhIGxpbmVhciByZWdyZXNzaW9uIGV4ZXJjaXNlLg0KDQpSIG9mZmVycyBhIG51bWJlciBvZiBzZXZlcmFsIHN5c3RlbXMgZm9yIG1ha2luZyBncmFwaHMsIGJ1dCBbYGdncGxvdDJgXShodHRwczovL2dncGxvdDIudGlkeXZlcnNlLm9yZy9pbmRleC5odG1sKSBpcyBvbmUgb2YgdGhlIG1vc3QgZWxlZ2FudCBhbmQgbW9zdCB2ZXJzYXRpbGUuIGBnZ3Bsb3QyYCBhbGxvd3MgeW91IHRvIGNvbXBvc2UgZ3JhcGhzIGJ5ICoqY29tYmluaW5nIGluZGVwZW5kZW50IGNvbXBvbmVudHMqKi4NCg0KTGV0J3Mgc3RhcnQgd2l0aCBhIHNpbXBsZSBzY2F0dGVyIHBsb3QgZm9yIHRoZSBQcmljZSBhbmQgTW9udGggY29sdW1ucy4NCg0KU28gaW4gdGhpcyBjYXNlLCB3ZSdsbCBzdGFydCB3aXRoIFtgZ2dwbG90KClgXShodHRwczovL2dncGxvdDIudGlkeXZlcnNlLm9yZy9yZWZlcmVuY2UvZ2dwbG90Lmh0bWwpLCBzdXBwbHkgYSBkYXRhc2V0IGFuZCBhZXN0aGV0aWMgbWFwcGluZyAod2l0aCBbYGFlcygpYF0oaHR0cHM6Ly9nZ3Bsb3QyLnRpZHl2ZXJzZS5vcmcvcmVmZXJlbmNlL2Flcy5odG1sKSkgdGhlbiBhZGQgYSBsYXllcnMgKGxpa2UgW2BnZW9tX3BvaW50KClgXShodHRwczovL2dncGxvdDIudGlkeXZlcnNlLm9yZy9yZWZlcmVuY2UvZ2VvbV9wb2ludC5odG1sKSkgZm9yIHNjYXR0ZXIgcGxvdHMuDQoNCmBgYHtyIHNjYXR0ZXJfcGx0LCBtZXNzYWdlPUYsIHdhcm5pbmc9Rn0NCiMgU2V0IGEgdGhlbWUgZm9yIHRoZSBwbG90cw0KdGhlbWVfc2V0KHRoZW1lX2xpZ2h0KCkpDQoNCiMgQ3JlYXRlIGEgc2NhdHRlciBwbG90DQpwIDwtIGdncGxvdChkYXRhID0gbmV3X3B1bXBraW5zLCBhZXMoeCA9IFByaWNlLCB5ID0gTW9udGgpKQ0KcCArIGdlb21fcG9pbnQoKQ0KYGBgDQoNCklzIHRoaXMgYSB1c2VmdWwgcGxvdCDwn6S3PyBEb2VzIGFueXRoaW5nIGFib3V0IGl0IHN1cnByaXNlIHlvdT8NCg0KSXQncyBub3QgcGFydGljdWxhcmx5IHVzZWZ1bCBhcyBhbGwgaXQgZG9lcyBpcyBkaXNwbGF5IGluIHlvdXIgZGF0YSBhcyBhIHNwcmVhZCBvZiBwb2ludHMgaW4gYSBnaXZlbiBtb250aC4NCg0KIyMjICoqSG93IGRvIHdlIG1ha2UgaXQgdXNlZnVsPyoqDQoNClRvIGdldCBj
aGFydHMgdG8gZGlzcGxheSB1c2VmdWwgZGF0YSwgeW91IHVzdWFsbHkgbmVlZCB0byBncm91cCB0aGUgZGF0YSBzb21laG93LiBGb3IgaW5zdGFuY2UgaW4gb3VyIGNhc2UsIGZpbmRpbmcgdGhlIGF2ZXJhZ2UgcHJpY2Ugb2YgcHVtcGtpbnMgZm9yIGVhY2ggbW9udGggd291bGQgcHJvdmlkZSBtb3JlIGluc2lnaHRzIHRvIHRoZSB1bmRlcmx5aW5nIHBhdHRlcm5zIGluIG91ciBkYXRhLiBUaGlzIGxlYWRzIHVzIHRvIG9uZSBtb3JlICoqZHBseXIqKiBmbHlieToNCg0KIyMjIyBgZHBseXI6Omdyb3VwX2J5KCkgJT4lIHN1bW1hcml6ZSgpYA0KDQpHcm91cGVkIGFnZ3JlZ2F0aW9uIGluIFIgY2FuIGJlIGVhc2lseSBjb21wdXRlZCB1c2luZw0KDQpgZHBseXI6Omdyb3VwX2J5KCkgJT4lIHN1bW1hcml6ZSgpYA0KDQotICAgYGRwbHlyOjpncm91cF9ieSgpYCBjaGFuZ2VzIHRoZSB1bml0IG9mIGFuYWx5c2lzIGZyb20gdGhlIGNvbXBsZXRlIGRhdGFzZXQgdG8gaW5kaXZpZHVhbCBncm91cHMgc3VjaCBhcyBwZXIgbW9udGguDQoNCi0gICBgZHBseXI6OnN1bW1hcml6ZSgpYCBjcmVhdGVzIGEgbmV3IGRhdGEgZnJhbWUgd2l0aCBvbmUgY29sdW1uIGZvciBlYWNoIGdyb3VwaW5nIHZhcmlhYmxlIGFuZCBvbmUgY29sdW1uIGZvciBlYWNoIG9mIHRoZSBzdW1tYXJ5IHN0YXRpc3RpY3MgdGhhdCB5b3UgaGF2ZSBzcGVjaWZpZWQuDQoNCkZvciBleGFtcGxlLCB3ZSBjYW4gdXNlIHRoZSBgZHBseXI6Omdyb3VwX2J5KCkgJT4lIHN1bW1hcml6ZSgpYCB0byBncm91cCB0aGUgcHVtcGtpbnMgaW50byBncm91cHMgYmFzZWQgb24gdGhlICoqTW9udGgqKiBjb2x1bW5zIGFuZCB0aGVuIGZpbmQgdGhlICoqbWVhbiBwcmljZSoqIGZvciBlYWNoIG1vbnRoLg0KDQpgYGB7ciBncnBfc3VtcnksIG1lc3NhZ2U9Riwgd2FybmluZz1GfQ0KIyBGaW5kIHRoZSBhdmVyYWdlIHByaWNlIG9mIHB1bXBraW5zIHBlciBtb250aA0KbmV3X3B1bXBraW5zICU+JQ0KICBncm91cF9ieShNb250aCkgJT4lIA0KICBzdW1tYXJpc2UobWVhbl9wcmljZSA9IG1lYW4oUHJpY2UpKQ0KYGBgDQoNClN1Y2NpbmN0IeKcqA0KDQpDYXRlZ29yaWNhbCBmZWF0dXJlcyBzdWNoIGFzIG1vbnRocyBhcmUgYmV0dGVyIHJlcHJlc2VudGVkIHVzaW5nIGEgYmFyIHBsb3Qg8J+Tii4gVGhlIGxheWVycyByZXNwb25zaWJsZSBmb3IgYmFyIGNoYXJ0cyBhcmUgYGdlb21fYmFyKClgIGFuZCBgZ2VvbV9jb2woKWAuIENvbnN1bHQNCg0KYD9nZW9tX2JhcmAgdG8gZmluZCBvdXQgbW9yZS4NCg0KTGV0J3Mgd2hpcCB1cCBvbmUhDQoNCmBgYHtyIGJhcl9wbHQsIG1lc3NhZ2U9Riwgd2FybmluZz1GfQ0KIyBGaW5kIHRoZSBhdmVyYWdlIHByaWNlIG9mIHB1bXBraW5zIHBlciBtb250aCB0aGVuIHBsb3QgYSBiYXIgY2hhcnQNCm5ld19wdW1wa2lucyAlPiUNCiAgZ3JvdXBfYnkoTW9udGgpICU+JSANCiAgc3VtbWFyaXNlKG1lYW5fcHJpY2UgPSBtZWFuKFByaWNlKSkgJT4lIA0KICBnZ3Bsb3QoYWVz
KHggPSBNb250aCwgeSA9IG1lYW5fcHJpY2UpKSArDQogIGdlb21fY29sKGZpbGwgPSAibWlkbmlnaHRibHVlIiwgYWxwaGEgPSAwLjcpICsNCiAgeWxhYigiUHVtcGtpbiBQcmljZSIpDQpgYGANCg0K8J+kqfCfpKlUaGlzIGlzIGEgbW9yZSB1c2VmdWwgZGF0YSB2aXN1YWxpemF0aW9uISBJdCBzZWVtcyB0byBpbmRpY2F0ZSB0aGF0IHRoZSBoaWdoZXN0IHByaWNlIGZvciBwdW1wa2lucyBvY2N1cnMgaW4gU2VwdGVtYmVyIGFuZCBPY3RvYmVyLiBEb2VzIHRoYXQgbWVldCB5b3VyIGV4cGVjdGF0aW9uPyBXaHkgb3Igd2h5IG5vdD8NCg0KQ29uZ3JhdHVsYXRpb25zIG9uIGZpbmlzaGluZyB0aGUgc2Vjb25kIGxlc3NvbiDwn5GPISBZb3UgcHJlcGFyZWQgeW91ciBkYXRhIGZvciBtb2RlbCBidWlsZGluZywgdGhlbiB1bmNvdmVyZWQgbW9yZSBpbnNpZ2h0cyB1c2luZyB2aXN1YWxpemF0aW9ucyENCg==
+ + +
+
+ +
+ + + + + + + + + + + + + + + + +
diff --git a/README.md b/README.md
index 9e6622ef..8fbc5176 100644
--- a/README.md
+++ b/README.md
@@ -95,7 +95,7 @@ By ensuring that the content aligns with projects, the process is made more enga
 | 03 | Fairness and machine learning | [Introduction](1-Introduction/README.md) | What are the important philosophical issues around fairness that students should consider when building and applying ML models? | [Lesson](1-Introduction/3-fairness/README.md) | Tomomi |
 | 04 | Techniques for machine learning | [Introduction](1-Introduction/README.md) | What techniques do ML researchers use to build ML models? | [Lesson](1-Introduction/4-techniques-of-ML/README.md) | Chris and Jen |
 | 05 | Introduction to regression | [Regression](2-Regression/README.md) | Get started with Python and Scikit-learn for regression models | | |
-| 06 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Visualize and clean data in preparation for ML | | |
+| 06 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Visualize and clean data in preparation for ML | | |
 | 07 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build linear and polynomial regression models | | |
 | 08 | North American pumpkin prices 🎃 | [Regression](2-Regression/README.md) | Build a logistic regression model | | |
 | 09 | A Web App 🔌 | [Web App](3-Web-App/README.md) | Build a web app to use your trained model | [Python](3-Web-App/1-Web-App/README.md) | Jen |