Exploring and Assessing a Dataset

A client has approached your team for help investigating taxi customers' seasonal spending habits in New York City.

They want to know: Do yellow taxi passengers in New York City tip drivers more in the winter or summer?

Your team is in the Capturing stage of the Data Science Lifecycle, and you are in charge of exploring the dataset. You have been provided with a notebook and data from Azure Open Datasets to explore and to assess whether the data can answer the client's question. You have decided to select a small sample: one summer month and one winter month from 2019.

Instructions

In this directory is a notebook that uses Python to load yellow taxi trip data from the NYC Taxi & Limousine Commission for the months of January and July 2019. These datasets have been joined together in a Pandas dataframe.
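As a rough illustration of what "joined together" can look like, the sketch below stacks two small, made-up monthly tables into one Pandas dataframe. It does not reproduce the notebook's loading code: the column names (`tpep_pickup_datetime`, `fare_amount`, `tip_amount`) and every value are assumptions chosen for the example, and the real dataset may name or type its columns differently.

```python
import pandas as pd

# Toy stand-ins for the two monthly extracts.
# Column names and values are illustrative assumptions, not real trip data.
january = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(["2019-01-05 08:15", "2019-01-20 18:40"]),
    "fare_amount": [12.5, 30.0],
    "tip_amount": [2.0, 6.0],
})
july = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(["2019-07-04 21:05", "2019-07-18 09:30"]),
    "fare_amount": [9.0, 22.0],
    "tip_amount": [1.5, 4.0],
})

# Stack the two months row-wise into a single dataframe with a fresh index.
trips = pd.concat([january, july], ignore_index=True)

# Derive the pickup month so each row can later be attributed to a season.
trips["month"] = trips["tpep_pickup_datetime"].dt.month

print(trips)
```

Keeping a `month` column alongside the stacked rows is one simple way to preserve which extract each trip came from.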

Your task is to identify the columns most likely required to answer this question, then reorganize the joined dataset so that these columns are displayed first.
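One possible way to move chosen columns to the front of a Pandas dataframe is sketched below. The helper `move_columns_first` is hypothetical (it is not part of Pandas or the notebook), and the column names and the `priority` list are assumptions made for the example; deciding which columns actually matter is the point of the assignment.

```python
import pandas as pd


def move_columns_first(df, first):
    """Return a copy of df's columns reordered so those in `first` come first.

    Hypothetical helper for illustration; the remaining columns keep
    their original relative order.
    """
    missing = [col for col in first if col not in df.columns]
    if missing:
        raise KeyError(f"Columns not found in dataframe: {missing}")
    rest = [col for col in df.columns if col not in first]
    return df[list(first) + rest]


# Toy dataframe; column names and values are illustrative assumptions.
trips = pd.DataFrame({
    "vendor_id": [1, 2],
    "tpep_pickup_datetime": pd.to_datetime(["2019-01-05 08:15", "2019-07-04 21:05"]),
    "passenger_count": [1, 3],
    "fare_amount": [12.5, 9.0],
    "tip_amount": [2.0, 1.5],
    "payment_type": [1, 2],
})

# An example priority list only; your own analysis may select other columns.
priority = ["tpep_pickup_datetime", "tip_amount", "fare_amount", "payment_type"]
reordered = move_columns_first(trips, priority)

print(list(reordered.columns))
```

Selecting with a list of column names returns a new dataframe, so the original column order in `trips` is left unchanged.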

Finally, write three questions that you would ask the client to clarify and better understand the problem.

Rubric

Exemplary | Adequate | Needs Improvement