diff --git a/1-Introduction/01-defining-data-science/images/ds_wordcloud.png b/1-Introduction/01-defining-data-science/images/ds_wordcloud.png
index b4b9b242..cbdbd623 100644
Binary files a/1-Introduction/01-defining-data-science/images/ds_wordcloud.png and b/1-Introduction/01-defining-data-science/images/ds_wordcloud.png differ
diff --git a/1-Introduction/01-defining-data-science/notebook.ipynb b/1-Introduction/01-defining-data-science/notebook.ipynb
index cf3988e8..02b9cab5 100644
--- a/1-Introduction/01-defining-data-science/notebook.ipynb
+++ b/1-Introduction/01-defining-data-science/notebook.ipynb
@@ -2,199 +2,186 @@
 "cells": [
 {
 "cell_type": "markdown",
+ "metadata": {},
 "source": [
- "# Challenge: Analyzing Text about Data Science\r\n",
- "\r\n",
- "In this example, let's do a simple exercise that covers all steps of a traditional data science process. You do not have to write any code, you can just click on the cells below to execute them and observe the result. As a challenge, you are encouraged to try this code out with different data. \r\n",
- "\r\n",
- "## Goal\r\n",
- "\r\n",
- "In this lesson, we have been discussing different concepts related to Data Science. Let's try to discover more related concepts by doing some **text mining**. We will start with a text about Data Science, extract keywords from it, and then try to visualize the result.\r\n",
- "\r\n",
+ "# Challenge: Analyzing Text about Data Science\n",
+ "\n",
+ "In this example, let's do a simple exercise that covers all steps of a traditional data science process. You do not have to write any code, you can just click on the cells below to execute them and observe the result. As a challenge, you are encouraged to try this code out with different data. \n",
+ "\n",
+ "## Goal\n",
+ "\n",
+ "In this lesson, we have been discussing different concepts related to Data Science. Let's try to discover more related concepts by doing some **text mining**. We will start with a text about Data Science, extract keywords from it, and then try to visualize the result.\n",
+ "\n",
 "As a text, I will use the page on Data Science from Wikipedia:"
- ],
- "metadata": {}
+ ]
 },
 {
 "cell_type": "markdown",
- "source": [],
- "metadata": {}
+ "metadata": {},
+ "source": []
 },
 {
 "cell_type": "code",
- "execution_count": 62,
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
 "source": [
 "url = 'https://en.wikipedia.org/wiki/Data_science'"
- ],
- "outputs": [],
- "metadata": {}
+ ]
 },
 {
 "cell_type": "markdown",
+ "metadata": {},
 "source": [
- "## Step 1: Getting the Data\r\n",
- "\r\n",
+ "## Step 1: Getting the Data\n",
+ "\n",
 "First step in every data science process is getting the data. We will use `requests` library to do that:"
- ],
- "metadata": {}
+ ]
 },
 {
 "cell_type": "code",
- "execution_count": 63,
- "source": [
- "import requests\r\n",
- "\r\n",
- "text = requests.get(url).content.decode('utf-8')\r\n",
- "print(text[:1000])"
- ],
+ "execution_count": 2,
+ "metadata": {},
 "outputs": [
 {
- "output_type": "stream",
 "name": "stdout",
+ "output_type": "stream",
 "text": [
 "\n",
 "\n",
 "\n",
 "\n",
 "Data science - Wikipedia\n",
- "