__STYLES__

Tickets, Please - National Rail Data Challenge Project

Tools used in this project
Tickets, Please - National Rail Data Challenge Project

Power BI Dashboard

About this project

PROJECT GOALS:

For the Maven Rail Challenge, you'll play the role of a BI Developer for National Rail, a company that provides business services to passenger train operators in England, Scotland, and Wales.

You've been asked by your manager to create an exploratory dashboard that helps them:

  • Identify the most popular routes
  • Determine peak travel times
  • Analyse revenue from different ticket types & classes
  • Diagnose on-time performance and contributing factors

PROCESS:

In order to meet the brief above I decided to create a dashboard with 4 tabs, as follows:

The first page shows a general overview of key KPIs, and allow for an exploration the routes and trends over time, both able to be viewed by revenue or ticket sales.

The second focuses on the question of ticket types and classes, allowing the user to see the effect on both total and average revenue, sales and refund rates, based on different ticket types and classes. This page also includes some time series and composition data, broken down by ticket types and classes.

The third page displays the most common travel times, which can be viewed by both revenue and sales, as well as being able to be broken down into Weekends and Weekdays and times of day in 6-hour intervals.

The fourth and final page allows for a look into the performance of trains and the reasons for delays and cancellations and the effects that they have had in terms of driving refund requests.

In addition to the information readily available on each page, there is also a slicer panel on each that allows for further filtering and exploration of the data by the end user.

In order to prepare the data a few steps were taken:

Removed Duplicates in transaction ID in Excel: No duplicate values found

Used filter to check all columns for blank values, only blanks found in Actual Arrival and Reason for delay, both of these to be expected

Created a calendar lookup table to refer to in Power BI

Changed format of all time related fields to remove the seconds, created a time lookup table to refer to in Power BI

Created lookups for the station names and the cities they serve

Loaded rail data, calendar data and time data into Power BI and created a star-schema data model

Created additional columns in calendar data to provide the start of year, quarter, month and week and also the day of the week for each lookup value.

Changed the initial format of the time data column to be time only, rather than date/time.

Created additional columns in the time data to provide the start of hour and a calculated column using DAX to track if the given time was in the morning, afternoon, evening or night – based on 6 hour intervals.

Created additional column in the rail data table to repeat the price where a refund was requested and return blank if no refund requested in order to calculate the cost of refunds.

In the original data held in excel, used nested IF statements to group the delay reasons Weather and Weather Conditions into Weather, and Staffing and Staff Shortage into Staffing Issues and then applied the results to the delay column and refreshed the data into PowerBI.

Created Calculated Measures for:

· Refunds

· Average Refunds

· Average Revenue

· Refund Rate

· Sales Quantity

· Total Refunds

· Total Revenue

ANALYSIS:

Using the dashboard allows for the following conclusions to be drawn about the current state of National Rail based on the four factors in the project brief.

Identify the most popular routes

In terms of revenue, the most popular routes are:

London Kings Cross-York

Liverpool Lime Street-London Euston

London Paddington-Reading

In terms of tickets sold, the most popular routes are:

Manchester Piccadilly-Liverpool Lime Street

London Euston-Birmingham New Street

London Kings Cross-York

This shows that the London Stations and Liverpool Lime Street are both significant drivers for the national rail network and efforts should focus on ensuring these stations remain well serviced with enough trains to service the high

Determine peak travel times

In terms of revenue the most popular time to travel is 08:00, regardless of the day of week. In terms of ticket sales, the most popular time to travel is 18:45, regardless of the day of week. Overall, weekdays make up a significant portion of both journeys and revenue, and we should continue to ensure that weekday journeys are as free from issue as possible by maintaining the technology and staffing levels. Additionally we should consider extending the peak travel window to include 18:45 as this is the most common travel time despite not being included in the peak

Analyse revenue from different ticket types & classes

Standard class tickets make up a whopping 80% of our revenue, although the average revenue per ticket is about half that of a first class ticket. Convincing more customers to switch to first class could result in an uptick in revenue as long as we are able to maintain high levels of service for first class customers as they are also more likely to request a refund when there is a delay or cancellation.

Advance tickets make up the majority of our total revenue, however the average revenue on an advance ticket is nearly £6 lower than our overall average. This could indicate that we need to increase pricing on advance tickets, as long as this stays in line with demand. Additionally we make a significant average return on our Anytime tickets. The majority of this comes from trains departing at 08:00 which is during peak time, suggesting that for commuters the extra pricing on these tickets is worth paying. This demonstrates that extending the peak travel time to include 18:45 as suggested above may prove a worthwhile

Diagnose on-time performance and contributing factors

The vast majority of our journeys are considered to be on-time, which is a win that can be shared with our customers and shareholders.

Weather is the biggest factor in causing delays and cancellations, however has the lowest rate of returns of all. This is likely attributed to the fact that Weather is the only of one of the issues seen to be outside of our control.

Cancellations are more likely to result in refund requests overall, although delays tend to result in higher losses both on average and in total.

Traffic is the lowest overall driver of delays and cancellations, but when it does the cost is disproportionally high to the business meaning that traffic related delays need to be avoided at all costs.

Overall the majority of our losses through refunds come through Staffing and Technical issues, these are factors entirely within our control and suggest that we should work harder to ensure adequate investment in these areas to prevent such issues in the future.

Additional project images

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.