42.28. NYC taxi data in winter and summer

A client has approached your team for help investigating taxi customers' seasonal spending habits in New York City. They want to know: do yellow taxi passengers in New York City tip drivers more in the winter or in the summer?

Your team is in the capturing stage of the Data Science Lifecycle and you are in charge of handling the dataset. You have been provided a notebook and data to explore.

We will use Python to load yellow taxi trip data from the NYC Taxi & Limousine Commission. You can also open the taxi data file in a text editor or spreadsheet software like Excel.

42.28.1. Instructions

  • Assess whether or not the data in this dataset can help answer the question.

  • Explore the NYC Open Data catalog. Identify an additional dataset that could potentially be helpful in answering the client’s question.

  • Write 3 questions that you would ask the client for more clarification and a better understanding of the problem.

Refer to the dataset’s dictionary and user guide for more information about the data.

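import pandas as pd

# Relative path to the yellow taxi trip data; adjust it to where the file lives
path = '../../assets/data/taxi.csv'

# Load the CSV file into a DataFrame
df = pd.read_csv(path)

# Print the DataFrame (pandas abbreviates long output)
print(df)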
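One way to begin assessing whether the data can answer the client's question is to compare tip percentages by season. The sketch below is illustrative only and makes several assumptions: it uses a handful of made-up rows rather than the provided file, it assumes the column names `tpep_pickup_datetime`, `payment_type`, `fare_amount`, and `tip_amount` from the TLC yellow taxi data dictionary, and it defines winter as December to February and summer as June to August. The helper names `season_of` and `tip_summary_by_season` are hypothetical. It also restricts the comparison to credit card trips (`payment_type == 1`), because the data dictionary states that cash tips are not recorded.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT real TLC records.
# Column names are assumed to match the TLC yellow taxi data dictionary.
sample = pd.DataFrame({
    "tpep_pickup_datetime": [
        "2019-01-15 08:30:00",
        "2019-01-20 18:05:00",
        "2019-07-04 12:00:00",
        "2019-07-10 22:45:00",
        "2019-07-11 09:15:00",
    ],
    "payment_type": [1, 2, 1, 1, 1],  # 1 = credit card, 2 = cash
    "fare_amount": [10.0, 8.0, 20.0, 15.0, 12.0],
    "tip_amount": [2.0, 0.0, 5.0, 3.0, 0.0],
})


def season_of(month):
    """Map a month number to a season label (assumed definition)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (6, 7, 8):
        return "summer"
    return "other"


def tip_summary_by_season(trips):
    """Return trip count and mean tip percentage per season for card trips."""
    trips = trips.copy()
    pickup = pd.to_datetime(trips["tpep_pickup_datetime"])
    trips["season"] = pickup.dt.month.map(season_of)

    # Keep credit card trips with a positive fare: cash tips are not
    # recorded, and a zero fare would make the percentage undefined.
    card = trips[(trips["payment_type"] == 1) & (trips["fare_amount"] > 0)].copy()
    card["tip_pct"] = card["tip_amount"] / card["fare_amount"] * 100

    return card.groupby("season")["tip_pct"].agg(["count", "mean"])


summary = tip_summary_by_season(sample)
print(summary)
```

To try this on the provided data, pass `df` instead of `sample`, after checking `df.columns` to confirm the column names match. A result like this also suggests questions for the client, such as whether "tip more" means a larger dollar amount or a larger share of the fare, and how trips paid in cash should be treated.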

42.28.2. Rubric

Exemplary

Adequate

Needs Improvement

—

—

–

42.28.3. Acknowledgments

Thanks to Microsoft for creating the open-source course Data Science for Beginners. It inspires the majority of the content in this chapter.