
Report_NYC_Taxi_Operations_Starter23

The report focuses on optimizing NYC taxi operations through data preparation, cleaning, and exploratory data analysis (EDA). It includes methodologies for handling missing values, identifying patterns in trip data, and generating insights on revenue trends and passenger behavior. The conclusions provide recommendations for improving routing, dispatching, and pricing strategies based on the analysis of operational inefficiencies and demand patterns.

Uploaded by

zeeshanrnk786
Copyright
© All Rights Reserved

Report: Optimising NYC Taxi Operations
Include your visualisations, analysis, results, insights, and outcomes. Explain your methodology
and approach to the tasks. Add your conclusions to the sections.

1. Data Preparation
1.1. Loading the dataset

1.1.1. Sample the data and combine the files

2. Data Cleaning
2.1. Fixing Columns
2.1.1. Fix the index

2.1.2. Combine the two airport_fee columns
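One way this step could look is sketched below. It assumes the data is held in a pandas DataFrame and that the two columns are named `airport_fee` and `Airport_fee`; the report does not state the names, so treat both as placeholders.

```python
import numpy as np
import pandas as pd


def combine_airport_fee(df, primary="airport_fee", secondary="Airport_fee"):
    """Merge two airport-fee columns into one, preferring the primary value.

    The column names are assumptions; adjust them to match the actual data.
    """
    out = df.copy()
    # combine_first keeps the primary value and fills its NaNs from the secondary
    out[primary] = out[primary].combine_first(out[secondary])
    return out.drop(columns=[secondary])


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "airport_fee": [1.25, np.nan, np.nan],
    "Airport_fee": [np.nan, 1.75, np.nan],
})
combined = combine_airport_fee(trips)
```

Rows where both columns are missing stay missing, so they can be handled together with the other missing values in section 2.2.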

2.2. Handling Missing Values


2.2.1. Find the proportion of missing values in each column
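A minimal sketch of this calculation, assuming a pandas DataFrame; the sample columns and values are invented for illustration only.

```python
import numpy as np
import pandas as pd


def missing_proportion(df):
    """Return the fraction of missing values per column, largest first."""
    # isna() gives a boolean frame; the column mean of booleans is the NaN share
    return df.isna().mean().sort_values(ascending=False)


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "passenger_count": [1.0, np.nan, 2.0, np.nan],
    "RatecodeID": [1.0, 1.0, np.nan, 1.0],
    "fare_amount": [10.0, 12.5, 8.0, 30.0],
})
missing = missing_proportion(trips)
```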

2.2.2. Handling missing values in passenger_count

2.2.3. Handle missing values in RatecodeID

2.2.4. Impute NaN in congestion_surcharge
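The three imputation steps in this section (passenger_count, RatecodeID, congestion_surcharge) might be sketched as below. The report does not say which fill rules were used, so the choices here (most frequent value for the two code-like columns, zero for the surcharge) are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def impute_missing(df):
    """Fill NaNs using simple, assumed rules (not taken from the report)."""
    out = df.copy()
    # Assumption: code-like columns get their most frequent value
    for col in ["passenger_count", "RatecodeID"]:
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    # Assumption: a missing surcharge means no surcharge was charged
    out["congestion_surcharge"] = out["congestion_surcharge"].fillna(0.0)
    return out


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "passenger_count": [1.0, 1.0, np.nan, 2.0],
    "RatecodeID": [1.0, np.nan, 1.0, 2.0],
    "congestion_surcharge": [2.5, np.nan, 2.5, 0.0],
})
clean = impute_missing(trips)
```

If the missing share found in 2.2.1 is large, dropping rows or using a group-wise fill may be more defensible than a single global value.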


2.3. Handling Outliers and Standardising Values
2.3.1. Check outliers in payment type, trip distance and tip amount
columns
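One common way to check numeric columns such as trip distance and tip amount is the interquartile-range (IQR) rule, sketched below. The rule and the multiplier `k=1.5` are assumptions, not something the report prescribes; a categorical column such as payment type would instead be checked against its list of valid codes.

```python
import pandas as pd


def iqr_outlier_mask(series, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Tiny illustrative frame (made-up values); 250 miles is the planted outlier
trips = pd.DataFrame({"trip_distance": [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 250.0]})
mask = iqr_outlier_mask(trips["trip_distance"])
outliers = trips[mask]
```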

3. Exploratory Data Analysis


3.1. General EDA: Finding Patterns and Trends
3.1.1. Classify variables into categorical and numerical

3.1.2. Analyse the distribution of taxi pickups by hours, days of the week,
and months
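A sketch of how the pickup counts could be derived, assuming the pickup timestamp lives in a column named `tpep_pickup_datetime` (an assumed name; the report does not list its columns).

```python
import pandas as pd


def pickup_counts(df, col="tpep_pickup_datetime"):
    """Count pickups by hour of day, day of week, and month."""
    ts = pd.to_datetime(df[col])
    return {
        "hour": ts.dt.hour.value_counts().sort_index(),
        "weekday": ts.dt.day_name().value_counts(),
        "month": ts.dt.month.value_counts().sort_index(),
    }


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "tpep_pickup_datetime": [
        "2023-01-02 08:15:00",  # a Monday
        "2023-01-02 08:45:00",  # a Monday
        "2023-02-04 18:05:00",  # a Saturday
    ]
})
counts = pickup_counts(trips)
```

Each of the three Series can be passed straight to a bar plot to visualise the distribution.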

3.1.3. Filter out the zero/negative values in fares, distance and tips

3.1.4. Analyse the monthly revenue trends

3.1.5. Find the proportion of each quarter’s revenue in the yearly revenue
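The quarterly share could be computed roughly as follows. The column names `tpep_pickup_datetime` and `total_amount` are assumptions, and revenue is taken to mean the sum of `total_amount`.

```python
import pandas as pd


def quarterly_revenue_share(df, time_col="tpep_pickup_datetime",
                            revenue_col="total_amount"):
    """Return each quarter's share of the total revenue (shares sum to 1)."""
    quarter = pd.to_datetime(df[time_col]).dt.quarter.rename("quarter")
    per_quarter = df.groupby(quarter)[revenue_col].sum()
    return per_quarter / per_quarter.sum()


# Tiny illustrative frame (made-up values), one trip per quarter
trips = pd.DataFrame({
    "tpep_pickup_datetime": ["2023-01-15", "2023-05-10", "2023-08-20", "2023-11-05"],
    "total_amount": [30.0, 20.0, 40.0, 10.0],
})
shares = quarterly_revenue_share(trips)
```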

3.1.6. Analyse and visualise the relationship between distance and fare
amount

3.1.7. Analyse the relationship between fare/tips and trips/passengers

3.1.8. Analyse the distribution of different payment types

3.1.9. Load the taxi zones shapefile and display it


3.1.10. Merge the zone data with trips data

3.1.11. Find the number of trips for each zone/location ID

3.1.12. Add the number of trips for each zone to the zones dataframe

3.1.13. Plot a map of the zones showing number of trips

3.1.14. Conclude with results

3.2. Detailed EDA: Insights and Strategies


3.2.1. Identify slow routes by comparing average speeds on different
routes
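A possible sketch of the speed comparison, treating a route as a (pickup zone, dropoff zone) pair. The column names (`PULocationID`, `DOLocationID`, `trip_distance`, `tpep_pickup_datetime`, `tpep_dropoff_datetime`) and the choice of miles per hour are assumptions.

```python
import pandas as pd


def average_route_speed(df):
    """Average speed in mph per (pickup zone, dropoff zone) pair, slowest first."""
    pickup = pd.to_datetime(df["tpep_pickup_datetime"])
    dropoff = pd.to_datetime(df["tpep_dropoff_datetime"])
    hours = (dropoff - pickup).dt.total_seconds() / 3600
    # Keep only positive durations; zero or negative ones give meaningless speeds
    valid = df.loc[hours > 0].copy()
    valid["speed_mph"] = valid["trip_distance"] / hours[hours > 0]
    return (
        valid.groupby(["PULocationID", "DOLocationID"])["speed_mph"]
        .mean()
        .sort_values()
    )


# Tiny illustrative frame (made-up values); the third trip has zero duration
trips = pd.DataFrame({
    "PULocationID": [1, 1, 1, 3],
    "DOLocationID": [2, 2, 2, 4],
    "trip_distance": [2.0, 3.0, 1.0, 10.0],
    "tpep_pickup_datetime": ["2023-03-01 09:00", "2023-03-01 10:00",
                             "2023-03-01 11:00", "2023-03-01 12:00"],
    "tpep_dropoff_datetime": ["2023-03-01 09:30", "2023-03-01 10:30",
                              "2023-03-01 11:00", "2023-03-01 12:30"],
})
speeds = average_route_speed(trips)
```

Adding the pickup hour to the grouping keys would show when a given route is slow, not just which routes are slow.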

3.2.2. Calculate the hourly number of trips and identify the busy hours

3.2.3. Scale up the number of trips from above to find the actual number of
trips
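If the data was sampled at a known fraction in step 1.1.1, the hourly counts can be scaled back up by dividing by that fraction. The sketch below assumes a uniform random sample; the fraction `0.05` and the column name `tpep_pickup_datetime` are hypothetical.

```python
import pandas as pd


def scaled_hourly_trips(df, sample_fraction, col="tpep_pickup_datetime"):
    """Estimate full-data hourly trip counts from a uniformly sampled subset."""
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    hours = pd.to_datetime(df[col]).dt.hour
    sampled = hours.value_counts().sort_index()
    # Dividing by the sampling fraction inverts the sampling step
    return (sampled / sample_fraction).round().astype(int)


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "tpep_pickup_datetime": [
        "2023-06-01 08:05", "2023-06-01 08:20", "2023-06-02 08:40",
        "2023-06-01 17:10",
    ]
})
estimated = scaled_hourly_trips(trips, sample_fraction=0.05)
```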

3.2.4. Compare hourly traffic on weekdays and weekends

3.2.5. Identify the top 10 zones with high hourly pickups and drops

3.2.6. Find the ratio of pickups and dropoffs in each zone
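A sketch of the pickup-to-dropoff ratio per zone, assuming zone identifiers are stored in columns named `PULocationID` and `DOLocationID` (assumed names).

```python
import pandas as pd


def pickup_dropoff_ratio(df):
    """Return pickups, dropoffs and their ratio per zone, highest ratio first."""
    pickups = df["PULocationID"].value_counts()
    dropoffs = df["DOLocationID"].value_counts()
    # Align the two counts on zone ID; zones missing on one side count as 0
    counts = pd.DataFrame({"pickups": pickups, "dropoffs": dropoffs}).fillna(0)
    # A zone with pickups but no dropoffs yields inf; handle that case as needed
    counts["ratio"] = counts["pickups"] / counts["dropoffs"]
    return counts.sort_values("ratio", ascending=False)


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "PULocationID": [1, 1, 1, 2],
    "DOLocationID": [2, 2, 1, 1],
})
ratios = pickup_dropoff_ratio(trips)
```

A ratio well above 1 suggests a zone that exports passengers, and one well below 1 a zone that mostly receives them.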


3.2.7. Identify the top zones with high traffic during night hours

3.2.8. Find the revenue share for nighttime and daytime hours
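The revenue split could be sketched as below. The night window (23:00 up to, but not including, 05:00) is an assumed definition, as are the column names `tpep_pickup_datetime` and `total_amount`.

```python
import pandas as pd


def night_day_revenue_share(df, night_start=23, night_end=5):
    """Share of revenue from night vs day trips, by pickup hour."""
    hours = pd.to_datetime(df["tpep_pickup_datetime"]).dt.hour
    # The night window wraps past midnight, hence the OR of two conditions
    is_night = (hours >= night_start) | (hours < night_end)
    period = is_night.map({True: "night", False: "day"}).rename("period")
    revenue = df.groupby(period)["total_amount"].sum()
    return revenue / revenue.sum()


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "tpep_pickup_datetime": ["2023-04-01 23:30", "2023-04-02 02:00",
                             "2023-04-02 12:00"],
    "total_amount": [25.0, 15.0, 60.0],
})
share = night_day_revenue_share(trips)
```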

3.2.9. For the different passenger counts, find the average fare per mile
per passenger

3.2.10. Find the average fare per mile by hours of the day and by days of
the week

3.2.11. Analyse the average fare per mile for the different vendors

3.2.12. Compare the fare rates of different vendors in a distance-tiered
fashion
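A distance-tiered comparison can be built by binning trip distance and averaging fare per mile within each bin. In this sketch the tier boundaries (up to 2, 2 to 5, over 5 miles) and the column names `VendorID`, `fare_amount` and `trip_distance` are assumptions.

```python
import numpy as np
import pandas as pd


def fare_per_mile_by_tier(df, bins=(0, 2, 5, np.inf),
                          labels=("0-2 mi", "2-5 mi", "5+ mi")):
    """Average fare per mile for each vendor within each distance tier."""
    out = df[df["trip_distance"] > 0].copy()
    out["fare_per_mile"] = out["fare_amount"] / out["trip_distance"]
    # pd.cut uses right-closed bins by default: (0, 2], (2, 5], (5, inf]
    out["tier"] = pd.cut(out["trip_distance"], bins=list(bins), labels=list(labels))
    return (
        out.groupby(["VendorID", "tier"], observed=True)["fare_per_mile"]
        .mean()
        .unstack("tier")
    )


# Tiny illustrative frame (made-up values)
trips = pd.DataFrame({
    "VendorID": [1, 1, 2, 2],
    "trip_distance": [1.0, 4.0, 2.0, 10.0],
    "fare_amount": [6.0, 16.0, 10.0, 30.0],
})
table = fare_per_mile_by_tier(trips)
```

The result has one row per vendor and one column per tier, with NaN where a vendor has no trips in a tier.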

3.2.13. Analyse the tip percentages
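Tip percentage is sketched here as tip divided by fare, times 100, using only trips with a positive fare. The base (fare rather than total amount) and the column names `tip_amount` and `fare_amount` are assumptions.

```python
import pandas as pd


def tip_percentage(df):
    """Add a tip_pct column (tip as % of fare) for trips with a positive fare."""
    valid = df[df["fare_amount"] > 0].copy()
    valid["tip_pct"] = 100 * valid["tip_amount"] / valid["fare_amount"]
    return valid


# Tiny illustrative frame (made-up values); the zero-fare row is dropped
trips = pd.DataFrame({
    "fare_amount": [10.0, 20.0, 0.0],
    "tip_amount": [2.0, 3.0, 0.0],
})
tipped = tip_percentage(trips)
```

Grouping `tip_pct` by distance band, passenger count or pickup hour then shows which factors go with lower or higher tips.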

3.2.14. Analyse the trends in passenger count

3.2.15. Analyse the variation of passenger counts across zones

3.2.16. Analyse the pickup/dropoff zones or times when extra charges are
applied more frequently

4. Conclusions
4.1. Final Insights and Recommendations
4.1.1. Recommendations to optimise routing and dispatching based on
demand patterns and operational inefficiencies.

4.1.2. Suggestions on strategically positioning cabs across different
zones to make best use of insights uncovered by analysing trip
trends across time, days and months.

4.1.3. Propose data-driven adjustments to the pricing strategy to maximise
revenue while maintaining competitive rates with other vendors.
