From cda8274a740d0f7903a69113ae9c870a28955f66 Mon Sep 17 00:00:00 2001 From: ArthurSrZ <55806298+ArthurSrz@users.noreply.github.com> Date: Thu, 5 Oct 2023 10:13:46 +0000 Subject: [PATCH] Pending changes exported from your codespace --- .../01-defining-data-science/notebook.ipynb | 336 ++--- .../README.md | 0 .../assignment.md | 0 .../translations/README.es.md | 0 .../translations/README.hi.md | 0 .../translations/README.ko.md | 0 .../translations/README.pt-br.md | 0 .../translations/README.ru.md | 0 .../translations/README.tr.md | 0 .../translations/assignment.hi.md | 0 .../translations/assignment.ko.md | 0 .../translations/assignment.pt-br.md | 0 .../translations/assignment.ru.md | 0 1-Introduction/02-ethics/README.md | 263 ---- 1-Introduction/02-ethics/assignment.md | 21 - .../02-ethics/translations/README.hi.md | 259 ---- .../02-ethics/translations/README.ko.md | 263 ---- .../02-ethics/translations/README.nl.md | 259 ---- .../02-ethics/translations/README.pt-br.md | 262 ---- .../02-ethics/translations/README.ru.md | 273 ---- .../02-ethics/translations/assignment.hi.md | 19 - .../02-ethics/translations/assignment.ko.md | 21 - .../02-ethics/translations/assignment.nl.md | 21 - .../translations/assignment.pt-br.md | 21 - .../02-ethics/translations/assignment.ru.md | 21 - .../README.md | 0 .../assignment.md | 0 .../images/boxplot_byrole.png | Bin .../images/boxplot_explanation.png | Bin .../images/height-boxplot.png | Bin .../images/normal-histogram.png | Bin .../images/probability-density.png | Bin .../images/video-prob-and-stats.png | Bin .../images/weight-boxplot.png | Bin .../images/weight-height-relationship.png | Bin .../images/weight-histogram.png | Bin .../translations/README.hi.md | 0 .../translations/README.ko.md | 0 .../translations/README.pt-br.md | 0 .../translations/README.ru.md | 0 .../translations/assignment.hi.md | 0 .../translations/assignment.ko.md | 0 .../translations/assignment.pt-br.md | 0 .../translations/assignment.ru.md | 0 .../04-stats-and-probability/assignment.ipynb | 252 ---- .../04-stats-and-probability/notebook.ipynb | 1122 ----------------- .../solution/assignment.ipynb | 945 -------------- 47 files changed, 168 insertions(+), 4190 deletions(-) rename 1-Introduction/{03-defining-data => 02-defining-data}/README.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/assignment.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.es.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.hi.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.ko.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.pt-br.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.ru.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/README.tr.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/assignment.hi.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/assignment.ko.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/assignment.pt-br.md (100%) rename 1-Introduction/{03-defining-data => 02-defining-data}/translations/assignment.ru.md (100%) delete mode 100644 1-Introduction/02-ethics/README.md delete mode 100644 1-Introduction/02-ethics/assignment.md delete mode 100644 1-Introduction/02-ethics/translations/README.hi.md delete mode 100644 1-Introduction/02-ethics/translations/README.ko.md delete mode 100644 1-Introduction/02-ethics/translations/README.nl.md delete mode 100644 1-Introduction/02-ethics/translations/README.pt-br.md delete mode 100644 1-Introduction/02-ethics/translations/README.ru.md delete mode 100644 1-Introduction/02-ethics/translations/assignment.hi.md delete mode 100644 1-Introduction/02-ethics/translations/assignment.ko.md delete mode 100644 1-Introduction/02-ethics/translations/assignment.nl.md delete mode 100644 1-Introduction/02-ethics/translations/assignment.pt-br.md delete mode 100644 1-Introduction/02-ethics/translations/assignment.ru.md rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/README.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/assignment.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/boxplot_byrole.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/boxplot_explanation.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/height-boxplot.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/normal-histogram.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/probability-density.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/video-prob-and-stats.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/weight-boxplot.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/weight-height-relationship.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/images/weight-histogram.png (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/README.hi.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/README.ko.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/README.pt-br.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/README.ru.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/assignment.hi.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/assignment.ko.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/assignment.pt-br.md (100%) rename 1-Introduction/{04-stats-and-probability => 03-stats-and-probability}/translations/assignment.ru.md (100%) delete mode 100644 1-Introduction/04-stats-and-probability/assignment.ipynb delete mode 100644 1-Introduction/04-stats-and-probability/notebook.ipynb delete mode 100644 1-Introduction/04-stats-and-probability/solution/assignment.ipynb diff --git a/1-Introduction/01-defining-data-science/notebook.ipynb b/1-Introduction/01-defining-data-science/notebook.ipynb index cf3988e8..c7740cb8 100644 --- a/1-Introduction/01-defining-data-science/notebook.ipynb +++ b/1-Introduction/01-defining-data-science/notebook.ipynb @@ -2,55 +2,50 @@ "cells": [ { "cell_type": "markdown", + "metadata": {}, "source": [ - "# Challenge: Analyzing Text about Data Science\r\n", - "\r\n", - "In this example, let's do a simple exercise that covers all steps of a traditional data science process. You do not have to write any code, you can just click on the cells below to execute them and observe the result. As a challenge, you are encouraged to try this code out with different data. \r\n", - "\r\n", - "## Goal\r\n", - "\r\n", - "In this lesson, we have been discussing different concepts related to Data Science. Let's try to discover more related concepts by doing some **text mining**. We will start with a text about Data Science, extract keywords from it, and then try to visualize the result.\r\n", - "\r\n", + "# Challenge: Analyzing Text about Data Science\n", + "\n", + "In this example, let's do a simple exercise that covers all steps of a traditional data science process. You do not have to write any code, you can just click on the cells below to execute them and observe the result. As a challenge, you are encouraged to try this code out with different data. \n", + "\n", + "## Goal\n", + "\n", + "In this lesson, we have been discussing different concepts related to Data Science. Let's try to discover more related concepts by doing some **text mining**. We will start with a text about Data Science, extract keywords from it, and then try to visualize the result.\n", + "\n", "As a text, I will use the page on Data Science from Wikipedia:" - ], - "metadata": {} + ] }, { "cell_type": "markdown", - "source": [], - "metadata": {} + "metadata": {}, + "source": [] }, { "cell_type": "code", "execution_count": 62, + "metadata": {}, + "outputs": [], "source": [ "url = 'https://en.wikipedia.org/wiki/Data_science'" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "## Step 1: Getting the Data\r\n", - "\r\n", + "## Step 1: Getting the Data\n", + "\n", "First step in every data science process is getting the data. We will use `requests` library to do that:" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 63, - "source": [ - "import requests\r\n", - "\r\n", - "text = requests.get(url).content.decode('utf-8')\r\n", - "print(text[:1000])" - ], + "metadata": {}, "outputs": [ { - "output_type": "stream", "name": "stdout", + "output_type": "stream", "text": [ "\n", "\n", @@ -61,77 +56,79 @@ ] } ], - "metadata": {} + "source": [ + "import requests\n", + "\n", + "text = requests.get(url).content.decode('utf-8')\n", + "print(text[:1000])" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "## Step 2: Transforming the Data\r\n", - "\r\n", - "The next step is to convert the data into the form suitable for processing. In our case, we have downloaded HTML source code from the page, and we need to convert it into plain text.\r\n", - "\r\n", + "## Step 2: Transforming the Data\n", + "\n", + "The next step is to convert the data into the form suitable for processing. In our case, we have downloaded HTML source code from the page, and we need to convert it into plain text.\n", + "\n", "There are many ways this can be done. We will use the simplest built-in [HTMLParser](https://docs.python.org/3/library/html.parser.html) object from Python. We need to subclass the `HTMLParser` class and define the code that will collect all text inside HTML tags, except `