From 0634419bceec7ff8285a54e0698afbee098a603c Mon Sep 17 00:00:00 2001 From: Lee Stott Date: Tue, 17 Feb 2026 07:53:54 +0000 Subject: [PATCH] Update 1-Introduction/01-defining-data-science/solution/notebook.ipynb Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --- 1-Introduction/01-defining-data-science/solution/notebook.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Introduction/01-defining-data-science/solution/notebook.ipynb b/1-Introduction/01-defining-data-science/solution/notebook.ipynb index 4a13c9ee..a7853ecb 100644 --- a/1-Introduction/01-defining-data-science/solution/notebook.ipynb +++ b/1-Introduction/01-defining-data-science/solution/notebook.ipynb @@ -69,7 +69,7 @@ { "cell_type": "markdown", "source": [ - "## Step 2: Transforming the Data\r\n\r\nThe next step is to convert the data into the form suitable for processing. In our case, we have downloaded HTML source code from the page, and we need to convert it into plain text.\r\n\r\nThere are many ways this can be done. We will use [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/), a popular Python library for parsing HTML. BeautifulSoup allows us to target specific HTML elements, so we can extract only the main article content from Wikipedia, avoiding navigation menus, sidebars, footers, and other irrelevant content." + "## Step 2: Transforming the Data\r\n\r\nThe next step is to convert the data into the form suitable for processing. In our case, we have downloaded HTML source code from the page, and we need to convert it into plain text.\r\n\r\nThere are many ways this can be done. We will use [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/), a popular Python library for parsing HTML. 
BeautifulSoup allows us to target specific HTML elements, so we can focus on the main article content from Wikipedia and filter out most navigation menus, sidebars, footers, and other irrelevant content (though some boilerplate text may still remain)." ], "metadata": {} },
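The updated cell describes targeting specific HTML elements with BeautifulSoup. A minimal sketch of that idea follows; the inline sample HTML and the `mw-content-text` id are assumptions about Wikipedia's current page layout, not part of the patch:

```python
from bs4 import BeautifulSoup

# Stand-in for the downloaded page source; a real run would fetch the page first.
text = (
    "<html><body>"
    "<nav>Main menu</nav>"
    "<div id='mw-content-text'><p>Data science is a field.</p></div>"
    "<footer>Privacy policy</footer>"
    "</body></html>"
)

soup = BeautifulSoup(text, "html.parser")

# Wikipedia typically wraps the article body in a div with id "mw-content-text"
# (an assumption about the page layout); fall back to the whole page if absent.
content = soup.find("div", id="mw-content-text")
plain = content.get_text(separator=" ", strip=True) if content else soup.get_text()
```

Scoping the extraction to one container is what drops the navigation and footer text; `get_text(separator=" ", strip=True)` then flattens the remaining markup into plain text.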