Project 2

Tips in trips in the city of Chicago

Important

Due date: Sunday, March 16

Use the corresponding invite link in this google doc (accessible with your EPFL account) to accept the project, and either join an existing team or create a new one.

You are required to hand in a PDF version of your report report.pdf (generated from the quarto file report.qmd, if applicable) and the quarto file itself. The report.qmd should contain all the code necessary to reproduce your results: you should not show the actual code in the PDF report, unless you want to point out something specific.

Your README.md should contain instructions on reproducing the PDF report from the quarto file. This can be useful if you have issues with the automatic generation of the PDF report right before the deadline.

An aternative to quarto (or Rmarkdown), is to use LaTeX to produce the report. In that case, you will also need to submit the source code where chunks should be well commented and references to the figures in the report should be clear.

Checklist:

Data

Transportation Network Providers (TNPs) such as Uber and Lyft operate in Chicago. As part of its licensing process, Chicago requires the companies to report on their activities monthly

  • The dataset provides detailed trip-level information on reported trips
  • This dataset covers trips during October 2024
  • Source: City of Chicago Open Data Portal
  • Key features:
    • Temporal trends
    • Spatial patterns
    • Fare and cost analysis
    • Shared ride behaviour

Data: Trip information

  • trip_id: unique identifier for each trip
  • trip_start_timestamp: timestamp when the trip started
  • trip_end_timestamp: timestamp when the trip ended
  • trip_seconds: total duration of the trip in seconds
  • trip_miles: Total distance of the trip in miles
  • fare: base fare of the trip
  • tip: amount tipped by the passenger
  • additional_charges: extra fees (e.g., service fees)
  • trip_total: total cost of the trip (sum of fare, tip, and additional charges)
  • shared_trip_authorized: whether the passenger opted for a shared ride
  • shared_trip_match: whether the ride was actually matched with another passenger
  • trips_pooled: number of passengers pooled in a shared trip

Note: times are rounded to the nearest 15 minutes, fares are rounded to the nearest 2.50, and tips are rounded to the nearest 1.00

The Goal

The dataset provides valuable insights into:

  • temporal trends (peak ride-sharing hours and seasonal variations)
  • spatial mobility (pickup/dropoff areas)
  • fare & cost analysis (fare structures, tip patterns, and additional charges)
  • ride-sharing trends (e.g., how many trips were pooled vs. individual rides)

We will focus on one specific aspect: the tipping behaviour

\(\rightarrow\) Understanding tipping patterns can help companies adjust pricing models or researchers study socio-economic factors influencing tips

Data: Trip information

  • trip_id: unique identifier for each trip
  • trip_start_timestamp: timestamp when the trip started
  • trip_end_timestamp: timestamp when the trip ended
  • trip_seconds: total duration of the trip in seconds
  • trip_miles: Total distance of the trip in miles
  • fare: base fare of the trip
  • tip: amount tipped by the passenger
  • additional_charges: extra fees (e.g., service fees)
  • trip_total: total cost of the trip (sum of fare, tip, and additional charges)
  • shared_trip_authorized: whether the passenger opted for a shared ride
  • shared_trip_match: whether the ride was actually matched with another passenger
  • trips_pooled: number of passengers pooled in a shared trip

Note: times are rounded to the nearest 15 minutes, fares are rounded to the nearest 2.50, and tips are rounded to the nearest 1.00

The Goal

The dataset provides valuable insights into:

  • temporal trends (peak ride-sharing hours and seasonal variations)
  • spatial mobility (pickup/dropoff areas)
  • fare & cost analysis (fare structures, tip patterns, and additional charges)
  • ride-sharing trends (e.g., how many trips were pooled vs. individual rides)

We will focus on one specific aspect: the tipping behaviour

\(\rightarrow\) Understanding tipping patterns can help companies adjust pricing models or researchers study socio-economic factors influencing tips