As a BI Analyst for US Department of Energy I am accountable for cleaning, prepping & visualizing the data to find out trends, patterns, causes, residents in states affected due to power outrage. Since the data is imported from an E-filing system, there are certain protocols that needs to be followed to:
Sub CombineSheets()
Dim ws As Worksheet
Dim combinedWs As Worksheet
' Create a new worksheet for combined data
Set combinedWs = Sheets.Add
combinedWs.Name = "CombinedData"
' Loop through all sheets except the new one
For Each ws In Worksheets
If ws.Name <> "CombinedData" Then
' Find the last row in the combined sheet
Dim lastRow As Long
lastRow = combinedWs.Cells(combinedWs.Rows.Count, "A").End(xlUp).Row
' Copy data from the current sheet to the combined sheet
ws.UsedRange.Copy combinedWs.Cells(lastRow + 1, 1)
End If
Next ws
End Sub
Formatting important columns(Event date & time): Which is a tough task for excel hence executing it in python with the help some some libraries using pandas & numpy. Jupyter Notebook --> Link
Table Addition: Creating separate USA_states table and denoting Event_id to each state to have all affected area tagged to each event id in the fact table so that we get the exact number of event count pertaining to a particular state/region.
Data Visualization: The idea is to make a robust dashboard with 2 KPI Cards to show current year to date metrics with year over year change, followed by a bar chart and tree map nested in a parameter which changes upon selection. Filled Map to show the affected states, which is further sliced into 3 horizontally stacked trend chart to show voltage loss, event count and event count:voltage loss ratio
Metrics: Display of metrics or aggregated values will be divided into 2 parts: One showing year-to-date values with % change, second will be an timeline values interactable with slicers & filters.
Dimensions: As the prime focus on power outrage hence the dimensions are State, Causes & Year