__STYLES__
Tools used in this project
National Rail Summary

National Rail Summary - UK

About this project

For the preparation of the dashboard I carried out following key steps

  1. Loading the source file and investigate them with Column profile, quality and distribution options to get am summary idea
    
  2. Added necessary transformations to create query columns

  3. Data Cleaning ( And I noticed when actual arrival and planned time is same, but the Journey status was marked as “Delayed”. I believe it was a data entry error since delayed reasons were recorded for those 18 rows that means there actual arrival time should be a later time than its recorded. which was in 18 rows out of 31k+ rows hence its bit hard to notice).

  4. Upon creation of the UI/ report view I used DAX measures, field parameters, standard Power BI visuals and 2 custom visuals (Sankey Chart and Flow chart) to analyze and visualize the pattern between rides from stations.

And via the analysis I met following findings

♦ While Manchester Piccadilly has highest departures and Birmingham New Street station has highest arrivals mostly used train route is from Manchester Piccadilly to Liverpool Lime Street station

♦ Regarding Peak times Wednesdays and Thursdays seems to be the busiest days while 6am and 5pm-6pm are the peak times for maximum train rides in overall and when it comes to monthly levels Jan and March had the highest rides in overall

♦ Regarding revenue analysis it seems in first 2 weeks revenue increases and then decreases in latter weeks in general. And out of total revenue made about 5% was requested for refunds. And based on ticket type 50% have used Advance ticket type and that amounts to 309k euros which is about 41% of total revenue while 90% of rides were on standard class tickets (which amounted to 80% of the revenue as well)

♦ Highest Ontime arrivals count were recorded in rides departed from London Euston station which is a 90% while Edinburgh Waverly has the lowest performance which 100% delayed/cancelled rides out of 51 total rides which has been assigned as staffing issue in the delay reason and on average Friday’s have highest on time performance which is 88.1%. And accumulated delay time has added up to 67 days and 4.5hrs

Main Measures

Late Time formatted = 
VAR gap = [Late Time Gap in minutes] 
VAR days =  INT( DIVIDE([Late Time Gap in minutes] ,60 * 24) )
VAR bal_hrs =  INT([Late Time Gap in minutes]/60 - days * 24 )   //INT(ROUNDDOWN( DIVIDE([Time Gap in minutes] ,60) , 0))
VAR bal_minutes = int(MOD( gap, 60 ))
RETURN
SWITCH( TRUE(),
SELECTEDVALUE(railway[Journey Status]) = "Cancelled", BLANK(),
 days = 0 && bal_hrs = 0, bal_minutes & " Mins",
 days = 0 && bal_hrs <> 0 , bal_hrs & " hrs " & bal_minutes & " Mins",
 days > 0, days & " days " & bal_hrs & " hrs " & bal_minutes & " Mins"
 )

Highest Revenue Week = 
VAR Tab1 = TOPN(1, VALUES('Dim Date'[Week of Month]), [Total Revenue],DESC)
VAR cR = COUNTROWS( Tab1 )
VAR concatX = CONCATENATEX (
        SELECTCOLUMNS (
             TOPN ( 1, Tab1, [Total Revenue] ),
            "Week of Month", [Week of Month]
        ),
        [Week of Month],
         ", Week " ) 
RETURN
SWITCH( TRUE(),
cR = 1, "Week " & Tab1,
cR > 1, "Week " &  concatX,
BLANK()
)

Highest revenue Monthweek = 
VAR Tab1 = TOPN(1, VALUES('Dim Date'[MonthWeek]), [Total Revenue],DESC)
VAR cR = COUNTROWS( Tab1 )
VAR concatX = CONCATENATEX (
        SELECTCOLUMNS (
            TOPN ( 1, Tab1, [Total Revenue] ),
            "MonthWeek", [MonthWeek]
        ),
        [MonthWeek],
         ", " ) 
RETURN
SWITCH( TRUE(),
cR = 1, Tab1,
cR > 1, concatX,
BLANK()
)

Additional project images

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.