From 380929ff547e8df6132ef40c56bb5d9e93830cf0 Mon Sep 17 00:00:00 2001
From: "Stephen Howell (MSFT)" <38020233+stephen-howell@users.noreply.github.com>
Date: Wed, 23 Jun 2021 15:11:17 +0100
Subject: [PATCH] Update README.md

---
 6-NLP/4-Hotel-Reviews-1/README.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/6-NLP/4-Hotel-Reviews-1/README.md b/6-NLP/4-Hotel-Reviews-1/README.md
index fc7b916f3..e1d4d7f0c 100644
--- a/6-NLP/4-Hotel-Reviews-1/README.md
+++ b/6-NLP/4-Hotel-Reviews-1/README.md
@@ -292,14 +292,14 @@ Here are the questions on their own, followed by the code and explanations:
 hotel_freq_df = hotel_freq_df.drop_duplicates(subset = ["Hotel_Name"])
 display(hotel_freq_df)
 
- Hotel_Name Total_Number_of_Reviews Total_Reviews_Found
- Britannia International Hotel Canary Wharf 9086 4789
- Park Plaza Westminster Bridge London 12158 4169
- Copthorne Tara Hotel London Kensington 7105 3578
+ Hotel_Name Total_Number_of_Reviews Total_Reviews_Found
+ Britannia International Hotel Canary Wharf 9086 4789
+ Park Plaza Westminster Bridge London 12158 4169
+ Copthorne Tara Hotel London Kensington 7105 3578
 ...
- Mercure Paris Porte d Orleans 110 10
- Hotel Wagner 135 10
- Hotel Gallitzinberg 173 8
+ Mercure Paris Porte d Orleans 110 10
+ Hotel Wagner 135 10
+ Hotel Gallitzinberg 173 8
 ```
 
 You may notice that the *counted in the dataset* results do not match the value in `Total_Number_of_Reviews`. It is unclear if this value in the dataset represented the total number of reviews the hotel had, but not all were scraped, or some other calculation. `Total_Number_of_Reviews` is not used in the model because of this unclarity.
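
The mismatch described in the last paragraph can be checked by counting the rows found for each hotel and comparing that count with the reported `Total_Number_of_Reviews`. The following is a minimal sketch of that comparison in plain Python, not the pandas code used in the lesson. The rows are toy values invented for illustration, and `compare_review_counts` is a hypothetical helper name. Only the column names come from the dataset.

```python
from collections import Counter

# Toy rows invented for illustration: one dict per review, as in the dataset,
# where every review row repeats the hotel's reported total.
rows = [
    {"Hotel_Name": "Hotel A", "Total_Number_of_Reviews": 5},
    {"Hotel_Name": "Hotel A", "Total_Number_of_Reviews": 5},
    {"Hotel_Name": "Hotel A", "Total_Number_of_Reviews": 5},
    {"Hotel_Name": "Hotel B", "Total_Number_of_Reviews": 2},
    {"Hotel_Name": "Hotel B", "Total_Number_of_Reviews": 2},
]


def compare_review_counts(rows):
    """Compare the reported review total with the number of rows found per hotel."""
    # Number of review rows actually present for each hotel.
    found = Counter(row["Hotel_Name"] for row in rows)

    # Reported total, taken from the first row seen for each hotel.
    reported = {}
    for row in rows:
        reported.setdefault(row["Hotel_Name"], row["Total_Number_of_Reviews"])

    return {
        name: {
            "reported": reported[name],
            "found": found[name],
            "matches": reported[name] == found[name],
        }
        for name in found
    }


summary = compare_review_counts(rows)
for name, info in summary.items():
    print(name, info)
```

In this toy data, "Hotel A" reports more reviews than are present while "Hotel B" matches. This is the same kind of discrepancy the table shows between `Total_Number_of_Reviews` and `Total_Reviews_Found`.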