__STYLES__
Overview
The goal of the dashboard was to identify key business drivers and opportunities for improvement, including:
Key Insights
One quick note on how the dataset impacts this analysis: it seems probable that it's a (hopefully representative) sample of the total data during the period covered, because it shows less than 2 passengers per scheduled trip. If it did include all of the ticket sales, the system would be losing tremendous amounts of money. I'll ignore this issue for the rest of this analysis.
The most popular route was Manchester Piccadilly to Liverpool Lime Street, with over 4600 tickets sold. However, because of the low average ticket price (£4) it only generated 2.3% of total revenue.
The peak travel times coincided with typical commute times--6 to 8 a.m. and 4-6 p.m.-- and those peaks were consistent across every day of the week, even weekends.
Standard tickets generated almost 4 times as much revenue as first class tickets. The "advance purchase" ticket type generated about 42% of overall revenue, while "off peak" generated approximately 30% and "anytime" generated the remainder.
The overall on-time performance was 90.7%, with weather as the most common delay/cancellation cause at over 25%. The other factors, in descending order, were staffing, signal failure, technical issue (all a bit more than 20%), with only about 8% of the delays and cancellations due to traffic.
5.3% of all trips were delayed, and 4% were cancelled. Almost 2300 travelers were impacted by delays, and just under 1900 by cancellations. Assuming all travelers who requested refunds received them, refunds cost the train system £38,700 in lost revenue. Travelers with "off-peak" tickets were much more likely to request refunds compared to holders of other ticket types.
Recommendations/Next Steps
The Dataset
The dataset consisted of a single file containing mock train ticket sales for National Rail in the UK, from January to April 2024, including details on the type of ticket, the date & time for each journey, the departure & arrival stations, the ticket price, and more. Each observation was for a single ticket sale.
The only data cleaning required was to correct capitalization and naming conventions in the "Reason for Delay" column, which reduced the number of categories.
Transformations
I created several calculated columns using Power Query and DAX to better analyze the data, including:
I also built a calendar table and linked it with the railway table to create the data model, and created a number of DAX measures to support the visualizations I planned to create. Among the more critical measures were ones that combined the single ticket sale observations into groups of train trips, so on-time rates could be determined.
The Dashboard
The dashboard I built contains 4 pages plus 1 drill-down page:
The Overview page shows big picture details, including KPIs for total revenue, on-time trip percentage, and total passenger (ticket count). Also included on this page are bar chars showing the routes that generated the most revenue and the most sales, and which ticket categories and types generated the most tickets and sales.
The Peak Times and Routes page shows peak times by departure hour and day, as well as providing bar charts to show numbers of passengers by departure and destination locations. A day of week filter is included so users can explore differences by day.
The Performance page shows KPIs for on-time trip performance, the numbers of trips delayed, cancelled and scheduled, and the number of passengers/tickets effected by the delays and cancellations. I've also included a chart to view these details for individual departure stations (or arrival stations), and users can drill-down into details for individual departure stations from there. There's also a donut chart displaying the delay/cancellation causes and a area chart showing on-time trip performance by day.
The Destination Details drilldown page shows KPIs for on-time trip performance, percentage delayed, and percentage cancelled, and compares those values to the overall values for all locations. There's also donut chart showing cancellation/delay causes for the station, and a table that includes duration/delays for routes served by the destination.
The Revenue page allows uses to explore details about revenue and refund information. The revenue route column shows total revenue and net revenue (total-refunds) for the highest revenue routes. A slider allows users to estimate the revenue gains that might result from a ticket price increase.
There are 2 donut charts: one to break down the revenue generated by railcard holders vs. non-card holders, and another that shows the breakdown of revenue by ticket type.
There's also a KPI card showing detailed refund information, including the total amount refunded, and the number of delays and cancellations, refund requests, and percentage of delayed/cancelled ticket holders who requested refunds. Finally, there's a 100% bar chart showing the percentage of refund requests by ticket type and class.