When analyzing the power outage data from 2002-2023, I first dealt with issues of data quality and integrity by cleaning up the messy data, using both Microsoft Power Query Editor (in Excel) and the Pandas library in Python.
Once complete, I asked the following questions:
What is the trend of power outages in the different states? Which states are most affected by power outages?
What is the trend of power outages in the different geographic regions? Which geographic regions are most affected by power outages?
What is causing these power outages? What trends should we be concerned about?
First, let's discuss the data cleaning.
1. Chose my questions for analysis & narrowed down the number of columns I would use: I analyzed the columns in the different Excel tabs and decided which columns to focus on for my analysis - Year, Area Affected, Event Type, Demand Loss (MW), and Number of Customers Affected.
2. Consolidated the Excel tabs: I appended the Excel tabs into one big Excel worksheet using Power Query Editor.
3. Created a column for "Consolidated Event Type" for data cleaning: I realized that the number of categories for Event Type was far too unwieldy for clean data visualizations (for example, there were separate categories for "wind", "heavy wind", "wind storm", and "rain and wind"). I added a column in Excel called "Consolidated Event Type", where I consolidated the event types into a few major categories.
For example, I included all the following event types under "Weather": "Cold Weather Event", "Heat Storm", "Heavy Rain and Wind Storm", "Heavy Snow Storm", "High Winds", "Hurricane", etc.
Similarly, I created an overarching category for "Physical Attack/Vandalism" where I put the event types "Physical Attack", "Suspected Physical Attack", and "Vandalism."
4. Extracted state names from the "Area Affected" column: I decided that most readers of this report would care more about the state names than the county names.
I used this Python code to do the following:
If the U.S. state appeared in the "Area Affected" column, I set the "State" field for this row in the output data frame.
This way, if a state was listed twice in a particular row (e.g. in the "Area Affected" column, both "Eastern Ohio" and "Western Ohio" were listed), that state would only be listed once for that power outage.
5. Added U.S. Census data to separate the states by geographic region: I decided it would be helpful to not just look at power outages by state, but also by geographic region. I added the U.S. Census geographic regions (e.g. West is California, Oregon, Washington, etc.) to a new column in my Excel spreadsheet called "Geographic Region."
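The consolidation, state-extraction, and region-mapping steps above can be sketched in Pandas roughly as follows. This is a minimal illustration, not the code actually used in the project: the lookup tables are deliberately abbreviated, the function names `extract_states` and `tidy_outages` are hypothetical, and labeling unmapped event types as "Other" and emitting one output row per outage and state are assumptions.

```python
import re
import pandas as pd

# Abbreviated, illustrative lookup tables (assumption: the real ones cover
# every event type in the data and all U.S. states).
EVENT_TYPE_MAP = {
    "Cold Weather Event": "Weather",
    "Heat Storm": "Weather",
    "High Winds": "Weather",
    "Hurricane": "Weather",
    "Physical Attack": "Physical Attack/Vandalism",
    "Suspected Physical Attack": "Physical Attack/Vandalism",
    "Vandalism": "Physical Attack/Vandalism",
}
STATE_TO_REGION = {
    "California": "West",
    "Oregon": "West",
    "Washington": "West",
    "Texas": "South",
    "Florida": "South",
    "Ohio": "Midwest",
    "Michigan": "Midwest",
    "New York": "Northeast",
}

def extract_states(area_affected):
    """Return each state named in the text exactly once."""
    found = []
    for state in STATE_TO_REGION:
        # Whole-word match, so "Eastern Ohio" and "Western Ohio" both hit "Ohio".
        if re.search(rf"\b{re.escape(state)}\b", area_affected, flags=re.IGNORECASE):
            found.append(state)
    return found

def tidy_outages(raw):
    """Add consolidated event type, state, and region columns."""
    out = raw.copy()
    # Assumption: event types missing from the map are labeled "Other".
    out["Consolidated Event Type"] = out["Event Type"].map(EVENT_TYPE_MAP).fillna("Other")
    out["State"] = out["Area Affected"].fillna("").apply(extract_states)
    # One output row per (outage, state); rows with no recognized state are dropped.
    out = out.explode("State").dropna(subset=["State"])
    out["Geographic Region"] = out["State"].map(STATE_TO_REGION)
    return out.reset_index(drop=True)

# Made-up sample rows, for illustration only.
raw = pd.DataFrame({
    "Year": [2021, 2022],
    "Area Affected": ["Eastern Ohio; Western Ohio", "Northern California, Oregon"],
    "Event Type": ["High Winds", "Vandalism"],
})
tidy = tidy_outages(raw)
```

A plain whole-word match has known limits: with a full state list, "Virginia" would also match inside "West Virginia", and "Washington" would match "Washington, D.C.", so a real version would need extra handling for those cases.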
First, I created a line chart to see the number of outages by year, from the year data collection began (2002) through the last year of full data (2022). I added a slicer on the far left side of the screen so that users can select the state they're most interested in.
My change to the data: I excluded the year 2023 for this line chart because it only had data through May. If I charted that data as-is, it would look like there was a decrease in outages in 2023, which would be misleading. To do this, I added a filter on that specific line chart visual for all years except 2023. I kept the 2023 data in the analysis when calculating OVERALL outages by state.
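The same exclusion logic can be expressed in Pandas, as a rough sketch rather than the report's actual implementation (the filtering described above was done with a visual-level filter). The column names `Year` and `State`, the function names, and the sample rows are assumptions made for illustration.

```python
import pandas as pd

PARTIAL_YEARS = (2023,)  # per the text, 2023 only has data through May

def outages_per_year(df, exclude=PARTIAL_YEARS):
    """Yearly outage counts, leaving out partial years so the trend is not distorted."""
    full_years = df[~df["Year"].isin(list(exclude))]
    return full_years.groupby("Year").size().rename("Outages")

def outages_per_state(df):
    """Overall outage counts per state, keeping every year (including 2023)."""
    return df.groupby("State").size().rename("Outages")

# Made-up sample rows, for illustration only.
sample = pd.DataFrame({
    "Year": [2021, 2021, 2022, 2022, 2022, 2023],
    "State": ["Texas", "California", "California", "California", "Texas", "California"],
})
yearly = outages_per_year(sample)
by_state = outages_per_state(sample)
```

Here `yearly` has entries only for 2021 and 2022, while `by_state` still counts the 2023 row toward California's overall total.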
My findings: The number of outages reported per year is increasing overall.
Top 5 states with most outages over time:
Top 5 states with most outages in 2022:
States with the most outages per year: Interestingly, I found that the states with the most outages in a given year were remarkably consistent. California, Texas, or Washington ranked #1 in the number of outages in most years.
California had the most outages for the following years: 2002, 2005, 2007-2008, 2010, 2013, 2015, 2017-2020 (2019 tied with Texas), and 2022.
Texas had the most outages for the following years: 2014, 2016, 2019 (tied with California), and 2021.
Washington had the most outages for the following years: 2006 and 2011-2012.
There were a few other states that ranked #1 for outages in certain years: Kentucky (2009), Florida (2004), and Michigan (2003). But for the most part, California & Texas ruled the day.
In fact, both California AND Texas were in the top 5 states for outages in the following years: 2004, 2006-2008, 2011-2012, and 2014-2022. This means that for the last 9 full years, California and Texas have both ranked in the top 5 for outages. They're also on pace to be in the top 5 for 2023.
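A ranking like the one above, including ties for first place, could be derived with a short Pandas sketch along these lines. The function names, the `Year` and `State` column names, and the sample counts are all hypothetical; ties are handled with `rank(method="min")`, which is one reasonable choice rather than the report's confirmed method.

```python
import pandas as pd

def count_by_year_state(df):
    """Number of outages for each (Year, State) pair."""
    return df.groupby(["Year", "State"]).size().rename("Outages").reset_index()

def top_states_by_year(df):
    """Map each year to the state(s) with the most outages; ties list every leader."""
    counts = count_by_year_state(df)
    yearly_max = counts.groupby("Year")["Outages"].transform("max")
    leaders = counts[counts["Outages"] == yearly_max]
    return {int(year): sorted(group["State"]) for year, group in leaders.groupby("Year")}

def years_all_in_top_n(df, states, n=5):
    """Years in which every state in `states` ranks within the top n."""
    counts = count_by_year_state(df)
    # method="min" gives tied states the same (best) rank.
    counts["Rank"] = counts.groupby("Year")["Outages"].rank(method="min", ascending=False)
    top = counts[counts["Rank"] <= n]
    return [int(year) for year, group in top.groupby("Year")
            if set(states) <= set(group["State"])]

# Made-up counts, for illustration only.
rows = ([(2019, "California")] * 2 + [(2019, "Texas")] * 2 + [(2019, "Washington")]
        + [(2021, "Texas")] * 3 + [(2021, "California")])
sample = pd.DataFrame(rows, columns=["Year", "State"])

leaders = top_states_by_year(sample)
both_first = years_all_in_top_n(sample, ["California", "Texas"], n=1)
both_top5 = years_all_in_top_n(sample, ["California", "Texas"], n=5)
```

In this toy data, California and Texas tie for first in 2019 and Texas leads alone in 2021, mirroring the kind of tie noted for 2019 in the findings.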
To analyze the trends of outages based on geographic regions, I categorized the states according to the geographic regions used by the U.S. Census: South, West, Midwest, and Northeast.
In this pie chart showing # of outages by geographic region, the regions rank in this order (over all time) for # of power outages:
1. South: 1,920 outages total (~40%)
2. West: 1,225 outages total (~25%)
3. Midwest: 865 outages total (~18%)
4. Northeast: 834 outages total (~17%)
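As a quick arithmetic check, the percentages follow directly from the four regional totals listed above, which sum to 4,844. The snippet below is only a verification sketch; the variable names are illustrative.

```python
# Regional outage totals as reported in the list above.
region_totals = {"South": 1920, "West": 1225, "Midwest": 865, "Northeast": 834}

total = sum(region_totals.values())

# Share of all outages per region, as a percentage rounded to one decimal place.
shares = {region: round(100 * count / total, 1) for region, count in region_totals.items()}

for region, share in shares.items():
    print(f"{region}: {region_totals[region]:,} outages ({share}%)")
```

Rounded to whole percentages, these shares match the approximate figures shown in the pie chart.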
This line chart for # of outages by year & geographic region shows that there is a general upward trend in power outages for all geographic regions:
I also included a table with the top 5 states with the most outages per geographic region, so the user can click on the geographic region they want to examine and see the top 5 states. For example, if I choose "South" in the slicer, I can see that the worst offender is Texas, and that the general trend of outages is increasing for the region.
To identify the trends of power outage causes and the top reasons for concern, I created a line chart with years across the X axis and # of outages on the Y axis. Users can filter for the event type they want to examine. For example, if I want to look at the trend for power outages caused by weather, I can select the weather event type from the slicer:
When looking at trends over time, I saw that there is a dramatic increase in power outages due to the following causes: weather, cyber events/cyber attacks, physical attack/vandalism, suspicious activity, and system operations.
Top 5 causes of power outages (all time):
Top 5 causes of power outages (2022):
To shore up the weak points in the power grid, we need to address the dramatic increase in power outages in California, Florida, Michigan, North Carolina, Texas, and Washington.
California is of particular concern - it not only has the most outages over the whole length of the study, but also had the most outages in 2022 and in most of the years studied. The power outages are increasing with no sign of stopping - 2021 had 42 outages and 2022 had 69 outages, an increase of roughly 64% in a single year.
Why am I especially concerned? According to the U.S. Census, California is the most populous state in the U.S. (39 million), followed by Texas (30 million) and Florida (22 million). All of these states are experiencing a dramatic rise in power outages that will only continue with the devastation of climate change.
California is already experiencing the impact of climate change, including an increase in extreme weather events, more frequent & severe heat waves and wildfires, more variable precipitation, and an increase in droughts.
Texas is one of the U.S. states most vulnerable to climate change. Between 1980 and 2022, 44% of all billion-dollar climate and weather disasters impacted Texas, including Winter Storm Uri (2021), Hurricane Harvey (2017), and the severe drought of 2011. Hurricanes, wildfires, flooding, and droughts are all expected to increase in Texas.
Florida is also expected to see negative effects of climate change, including a rise in floods, higher temperatures, and more frequent and severe hurricanes. All of these are listed as causes of power outages.
To prepare for climate change, the DOE needs to work directly with these states to secure the power stations against increasingly harsh weather conditions. In its article "Building a Better Grid: Addressing Climate Change and Bolstering Electric Grid Security through Planning & Innovation," the DOE reports that power outages from extreme weather have doubled over the past two decades across the U.S., highlighting our aging grid and infrastructure.
Through the Grid Modernization Initiative, DOE and national labs are working together to provide technical assistance to grid operators.
DOE should also consider the following recommendations: