__STYLES__
This dataset comes from Maven Analytics’ monthly challenges. It contains data all the way from 2002-2023 on power outages in the US (provided by the Department of Energy). Some of the information included are demand (electricity) loss, region affected, number of people affected, cause of outage, and restoration time. The task was to clean the data, create a dashboard, and then talk about the trends while also finding the weak points in the grid.
Some of the key metrics I wanted to look at were # of outages, demand loss, downtime, and # of people affected. For my analysis, I decided to focus on the year 2022, comparing it to the rolling average of the past 5 years (2018-2022). Nonetheless the dashboard is fully interactive including a “year” filter and being able to filter by clicking on the visuals. Other areas I wanted to compare were the causes of these outages as well as the line graphs for # of people affected and downtime. Total downtime by region is also on the dashboard. The last thing I wanted to analyze was the total people affected, on a state level with the map, and on a monthly level with the line graph.
Looking at the # of outages, the average for the past 5 years has been 285.4. In 2022, there were 335 total power outages which is above average. The demand loss however, was significantly lower at a total of 63,110 megawatts. Comparing that to 2018-2022 which had an average demand loss of 93,362 MW annually, it shows that more outages doesn't necessarily mean more demand loss.
The same thing goes for downtime, despite having more outages, 2022 experienced 3,565 hours of downtime compared to the 2018-2022 average of 5,281.6 hours. This is the main reason I chose to put the stacked bar and line graph on my dashboard. To show that more electricity lost doesn’t correlate strongly with more downtime (more loss doesn’t automatically mean it takes longer to fix). Then there is the # of people affected which in 2022 was 11,285,742 compared to the 2018-2022 average of 14,125,518. Clearly there were less people affected than usual which is a positive thing.
Moving on to the causes, the top causes for outages between 2018-2022 were severe weather at #1, followed by system operations, and vandalism. In the year 2022 it was slightly different. Vandalism was the top cause, then system operations, severe weather, and the last big cause was interruption. There is a pattern here with the top 3 causes, just not in the same order. It experienced more vandalism than it normally has in the past.
Looking at the # of people affected from 2018-2022, there are peaks in February and September with low points in May and November. Comparing that to just 2022, the peak and fall off is similar in September-October but with a sharper decrease. However, there is no peak in February for the year 2022 unlike the past 5 years which suggests something must’ve changed and it worked. On a state level, the 5-year period and 2022 were similar for # of people affected. In both, states like FL, CA, TX, and NV were affected the hardest (logically given their larger population). But what stands out is Michigan, it was one of the most affected areas in that 5 year period as well as overall. This could be due to its location which after further examination seems to be the case because most of the state’s outages were from severe weather.
Lastly I wanted to look at the dashboard and analyze the results for all the years. What we see is that August is definitely the worst month for outages. Severe weather is the #1 cause for power outages (1,313) followed by vandalism (796). The RFC region, which is in the midwest, had the most downtime which is surprising. I would've thought a coastal region like the south (SERC) region would’ve been the most affected but it turns out the midwest is more affected. Maybe the support for outages could be less advanced in the midwest causing restoration time to take longer.
From this analysis, we can glean numerous key takeaways. The first being a need for better infrastructure. Looking at the top 3 causes, measures need to be taken to make the infrastructure more weather proof, vandal proof, and reliable. By making the infrastructure more resistant to weather, it can reduce the number of outages. By adding more surveillance or increasing the security of the infrastructure, it will deter vandalism. Then, to minimize system operation failures, the Department of Energy can invest in newer technology or frequently do maintenance to make the system less prone to failures.
Moreover, given that August is the worst month for outages, it would be beneficial to increase support during that time. Whether it be monitoring power plants, generators, or whatever it may be more closely. Allocating more staff during that month would also be a good idea for fixing outages more efficiently. In regards to the “area” portion of each record, using zip code instead of city/state could be a better way to streamline the process. The last takeaway from this is that the data collection process needs to be improved. There should be just one template used to record all outages because in the original dataset, the sheets varied significantly. And there was a lot of incomplete information so getting as much detail from each power outage occurrence would help improve the integrity of the data.