# Exploring for answers
This assignment continues the previous lesson's [assignment](../14-Introduction/assignment.md), where we took a brief look at the dataset. Now we will take a deeper look at the data.

Again, the client's question is: **Do yellow taxi passengers in New York City tip drivers more in the winter or summer?**

Your team is in the [Analyzing](Readme.md) stage of the Data Science Lifecycle, where you are responsible for performing exploratory data analysis (EDA) on the dataset. You have been provided a notebook and a dataset containing 200 taxi transactions from January and July 2019.
## Instructions
This directory contains a [notebook](assignment.ipynb) and data from the [Taxi & Limousine Commission](https://docs.microsoft.com/en-us/azure/open-datasets/dataset-taxi-yellow?tabs=azureml-opendatasets). Refer to the [dataset's dictionary](https://www1.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf) and [user guide](https://www1.nyc.gov/assets/tlc/downloads/pdf/trip_record_user_guide.pdf) for more information about the data.

Use some of the techniques in this lesson to do your own EDA in the notebook (add cells if you'd like) and answer the following questions:

- What other influences in the data could affect the tip amount?
- Which columns will most likely not be needed to answer the client's question?
- Based on what has been provided so far, does the data seem to provide any evidence of seasonal tipping behavior?
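
As a starting point for the last question, one possible first step is to group trips by pickup month and compare tips. The sketch below is only an illustration: the sample rows are invented, the helper `tip_summary_by_month` is hypothetical, and the column names (`tpep_pickup_datetime`, `fare_amount`, `tip_amount`) are assumed to follow the TLC data dictionary, so check them against the file you were given.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical sample rows, invented for illustration only. The real file
# and its column names may differ (these follow the TLC data dictionary).
SAMPLE_CSV = """tpep_pickup_datetime,fare_amount,tip_amount
2019-01-05 08:15:00,10.0,2.0
2019-01-12 17:40:00,20.0,3.0
2019-07-03 09:05:00,10.0,1.0
2019-07-21 22:30:00,30.0,6.0
"""


def tip_summary_by_month(csv_file):
    """Return {month: (mean tip, mean tip as a fraction of the fare)}."""
    tips = defaultdict(list)
    rates = defaultdict(list)
    for row in csv.DictReader(csv_file):
        pickup = datetime.strptime(row["tpep_pickup_datetime"], "%Y-%m-%d %H:%M:%S")
        fare = float(row["fare_amount"])
        tip = float(row["tip_amount"])
        tips[pickup.month].append(tip)
        if fare > 0:  # skip zero or negative fares to avoid dividing by zero
            rates[pickup.month].append(tip / fare)
    return {
        month: (mean(tips[month]), mean(rates[month]) if rates[month] else None)
        for month in tips
    }


summary = tip_summary_by_month(io.StringIO(SAMPLE_CSV))
for month, (avg_tip, avg_rate) in sorted(summary.items()):
    print(f"month={month} mean_tip={avg_tip} mean_tip_rate={avg_rate}")
```

Looking at the tip as a fraction of the fare alongside the raw tip amount is a design choice, not a requirement: it helps separate "people tipped more" from "trips were more expensive". To run this on the provided data, pass an open file handle in place of `io.StringIO(SAMPLE_CSV)`.
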
## Rubric
| Exemplary | Adequate | Needs Improvement |
| --- | --- | --- |