diff --git a/1-Introduction/01-defining-data-science/.ipynb_checkpoints/notebook-checkpoint.ipynb b/1-Introduction/01-defining-data-science/.ipynb_checkpoints/notebook-checkpoint.ipynb new file mode 100644 index 00000000..a7bf29f2 --- /dev/null +++ b/1-Introduction/01-defining-data-science/.ipynb_checkpoints/notebook-checkpoint.ipynb @@ -0,0 +1,419 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge: Analyzing Text about Data Science\n", + "\n", + "In this example, let's do a simple exercise that covers all the steps of a traditional data science process. You do not have to write any code; you can just click on the cells below to execute them and observe the result. As a challenge, you are encouraged to try this code out with different data. \n", + "\n", + "## Goal\n", + "\n", + "In this lesson, we have been discussing different concepts related to Data Science. Let's try to discover more related concepts by doing some **text mining**. We will start with a text about Data Science, extract keywords from it, and then try to visualize the result.\n", + "\n", + "For the text, we will use the Wikipedia page on Data Science:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": {}, + "outputs": [], + "source": [ + "url = 'https://en.wikipedia.org/wiki/Data_science'" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1: Getting the Data\n", + "\n", + "The first step in every data science process is getting the data. We will use the `requests` library to do that:" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "
\n", + "\n", + "