# Assessing a Dataset
A client has approached your team for help in investigating taxi customers' seasonal spending habits in New York City.
They want to know: **Do yellow taxi passengers in New York City tip drivers more in the winter or summer?**
Your team is in the [Capturing](Readme.md#Capturing) stage of the Data Science Lifecycle and you are in charge of handling the dataset. You have been provided a notebook and [data](../../data/taxi.csv) to explore.
In this directory is a [notebook](notebook.ipynb) that uses Python to load yellow taxi trip data from the [NYC Taxi & Limousine Commission](https://docs.microsoft.com/en-us/azure/open-datasets/dataset-taxi-yellow?tabs=azureml-opendatasets).
You can also open the taxi data file in a text editor or in spreadsheet software such as Excel.
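
If you would rather take a quick look from Python before opening the notebook, the following is a minimal sketch that uses only the standard library. `preview_csv` is a hypothetical helper and is not part of the provided notebook. The path in the usage comment is taken from the data link above and may need adjusting depending on where you run it.

```python
import csv
from itertools import islice


def preview_csv(source, n_rows=5):
    """Return (column_names, first n_rows rows) from an open CSV file object.

    Hypothetical helper for a quick first look at the data.
    """
    reader = csv.DictReader(source)
    # Read only the first n_rows so large files are not loaded in full.
    rows = list(islice(reader, n_rows))
    return reader.fieldnames, rows


# Usage (the path is an assumption based on the data link above):
# with open("../../data/taxi.csv", newline="") as f:
#     columns, sample = preview_csv(f)
#     print(columns)
#     for row in sample:
#         print(row)
```

Checking the column names first makes it easier to match them against the data dictionary later.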
## Instructions
- Assess whether or not the data in this dataset can help answer the question.
- Explore the [NYC Open Data catalog](https://data.cityofnewyork.us/browse?sortBy=most_accessed&utf8=%E2%9C%93). Identify an additional dataset that could potentially be helpful in answering the client's question.
- Write 3 questions that you would ask the client to clarify the problem and understand it better.
Refer to the [dataset's dictionary](https://www1.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf) and [user guide](https://www1.nyc.gov/assets/tlc/downloads/pdf/trip_record_user_guide.pdf) for more information about the data.
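
One way to start assessing the data is to check how many trips fall in each season and what the average recorded tip looks like. The sketch below is an illustration under several assumptions, not a prescribed solution. It assumes the file uses the `tpep_pickup_datetime` and `tip_amount` columns described in the data dictionary, and that timestamps look like `2019-01-05 10:00:00`. It also treats winter as December to February and summer as June to August. That definition is itself something to confirm with the client. `tips_by_season` is a hypothetical helper name.

```python
import csv
from collections import defaultdict
from datetime import datetime

# Assumed season definitions (meteorological seasons); confirm with the client.
WINTER_MONTHS = {12, 1, 2}
SUMMER_MONTHS = {6, 7, 8}


def tips_by_season(source,
                   date_col="tpep_pickup_datetime",   # assumed column name
                   tip_col="tip_amount",              # assumed column name
                   date_format="%Y-%m-%d %H:%M:%S"):  # assumed timestamp format
    """Summarize trip count and mean tip for winter and summer rows.

    `source` is an open CSV file object. Rows with unparseable dates or
    tips are skipped, as are rows outside the two seasons.
    """
    totals = defaultdict(lambda: [0, 0.0])  # season -> [trip count, tip sum]
    for row in csv.DictReader(source):
        try:
            month = datetime.strptime(row[date_col], date_format).month
            tip = float(row[tip_col])
        except (ValueError, TypeError):
            continue  # skip malformed or incomplete rows
        if month in WINTER_MONTHS:
            season = "winter"
        elif month in SUMMER_MONTHS:
            season = "summer"
        else:
            continue
        totals[season][0] += 1
        totals[season][1] += tip
    return {
        season: {"trips": count, "mean_tip": tip_sum / count}
        for season, (count, tip_sum) in totals.items()
    }


# Usage (the path is an assumption based on the data link above):
# with open("../../data/taxi.csv", newline="") as f:
#     print(tips_by_season(f))
```

If one season is missing from the result or has very few trips, that is a sign the dataset alone may not answer the question. The data dictionary also notes that `tip_amount` covers credit card tips only, so cash tips are not captured.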
## Rubric
Exemplary | Adequate | Needs Improvement
--- | --- | ---