__STYLES__

Big City Collisions and Big Elephants in the Room

Tools used in this project
Big City Collisions and Big Elephants in the Room

About this project

Introduction

"Big City Collisions and Big Elephants in the Room" is a data-driven portfolio project that delves into the intricate dynamics of traffic collisions within an urban environment. Through skilled analysis and visualization techniques, this project aims to uncover trends, patterns, and contributing factors that shape collision occurrences in this urban area.

With a focus on data from New York City, the project showcases a combination of interactive dashboards, insightful charts, and compelling narratives to provide a holistic understanding of urban collision dynamics. From the temporal distribution of collisions to the identification of top contributing causes, each visualization offers unique insights into the complexities of urban traffic safety.

Let's go for a drive!

Temporal Analysis

An examination of temporal trends or seasonal variations in collision patterns. Are there more accidents during certain months or seasons?

The line chart is showing collision data for 236,538 collisions in 2021, 2022, and part of 2023 (1,883 April 2023 collisions are not included.) Organizing the years combined in this configuration allows us to clearly see the seasonality of collisions year over year (literally year over year!)

undefinedBasically, more collisions in warmer weather months and less collisions in colder weather months. (One would think this was solely pedestrian collisions!)

+----------------------------------------+
|      2-Year Seasonal Difference        |
+----------------------------------------+
| Warmer Weather Months (e.g., Apr-Sep)  |
| Total Collisions:              113,569 |
+----------------------------------------+
| Colder Weather Months (e.g., Oct-Mar)  |
| Total Collisions:              100,757 |
+========================================+
| Difference:                     12,812 |
| Percentage Difference:           12.7% |
+----------------------------------------+

One would also naturally suspect the slippery surfaces during winter months would increase collisions during the winter, but it seems collisions increase in the nice-weather months. This begs the question, "Why?" Well, we'll come back to this after we delve into data surrounding contributing factors irrespective of seasonality.

PivotTable

The line chart and data table calculations were created using the following PivotTable. The PivotTable shows April collisions, specific changes year to year, and the totals. Note: 2-year December low of just 16,948 collisions and a calculated 2-year high in June of 20,079 collisions—a 15.6% swing.

undefinedTime and Day Analysis

Identification of trends in collisions based on the time of day and day of the week. For example, are there more collisions during rush hours? Are weekends more prone to accidents?

The following heatmap containing all 238,421 collisions is configured so that darker red indicates a higher number of collisions than lighter. 1 indicates Monday and 7 indicates Sunday in order to better represent weekday/weekend contrasts. 0 indicates midnight and 12 indicates noon.

undefined

Turns out, yes, there are identifiable trends. Surprise!

Friday and Saturday midnights show a spike in collisions.

Collisions increase throughout the workweek, peaking at five o'clock on Friday, which is a 56.7% increase from the Monday ten o'clock spot!

Which, by the way, the Friday five o'clock spot is 15.6% more than Monday at five o'clock and 25.9% more than Friday morning rush hour.

Contributing Factors and Severity Factors

  1. An analysis of the contributing factors leading to collisions. Are certain factors like driver distraction or unsafe speed more common than others?
  2. An exploration of the severity of collisions by examining injuries (Dangerous) and fatalities (Deadly.) Injuries are bad enough, but what about fatalities. They're a bit more worrisome than injury, yes?

The following PivotChart represents the top ten contributing causes of injuries—accounting for nearly 83% of collisions!

The chart shows the percentage of collisions which were reported as causing injury or death. The difference in totals (e.g. 39% plus 0.13 % doesn't equal 100%) is due to lack of information or the collisions did not result in injuries or fatalities.

undefinedAs the starkest contrast, deaths pale in comparison to injuries. However, this begs the question, "Are deaths ever negligible?"

In another comparison, 'failure to yield the right-of-way' and 'traffic control disregarded' dominate the injuries, but 'unsafe speed' dominates the fatalities by a mile.

Location and Correlation Analysis

  1. A determination of which boroughs or streets have the highest frequency of collisions. This could be useful for city planning and implementing safety measures in specific areas.
  2. An observation of correlations between contributing factors and other variables such as location or time. For instance, do certain factors tend to occur more frequently in specific areas or at certain times?

Are there locations that are more prone than others to collisions? Bang—yep!

One of set circumstances accounted for 552 collisions holding for these four matching factors: in Brooklyn, on Belt Parkway, exhibiting 'driver inattention/distraction', and in a 'passenger vehicle'.

Are there locations and times that are more prone than others to collisions? Bam! You guessed it. Yep.

Believe it or not, there were 13 accidents at the same time of day and the same day of the week (see six matching factors) on Staten Island. These occurred on Tuesdays at 6:00 p.m. on the Staten Island Expressway where people were 'following too closely' in a 'passenger vehicle'. Stay off that road on Tuesdays at 6:00!

Are there intersections that are more prone than others to collisions? I'm getting yep-lash already.

In Queens, on Cross Island Parkway & Hempstead Avenue, there were 30 'unspecified', 'passenger vehicle' accidents.

Location and Correlation Identification Process

Those figures were discovered using the following formulas several times over, used in basically the same process sequence (with specific columns and worksheets varying, of course.)

Add a weekday column.

=WEEKDAY(B2,2)

Add an hour column.

=HOUR(D2)

Concatenate Weekday, Hour, Borough, Street Name, Cross Street, Contributing Factor, and Vehicle Type.

=CONCATENATE($C2,$E2,$F2,$G2,$H2,$K2,$L2)

Identify intersections (the record has a Street Name and Cross Street.)

=IF(AND($A2<>"",$B2<>""),1,0)

Concatenate the unique combinations with the 1 or 0. Remove records starting with 0.

=B2&A2

Remove the 1s.

=RIGHT(E3, LEN(E3) - 1)

Identify and list unique intersection combinations.

=UNIQUE(G2:G111293)

Count occurrences of unique combinations.

=COUNTIF($G$2:$G$111291,$I2)

Get the count of the unique combination occurring most often.

=MAX($K$2:$K$91166)

Identify the most occurring unique combination.

=INDEX($I$2:$I$238422, MATCH(MAX($K$2:$K$238422), $K$2:$K$238422, 0))

For further EDA, a discovery of how many occurrences of each matching factor count. i.e. There were 8461 unique intersection combinations that had 2 matching factors, 2159 that had 3 matching factors, on down to the single most-occurring unique combination (30).

=SORT(UNIQUE($K$2:$K$91166))
=COUNTIF($K$2:$K$91166,$P7)

A Return to Seasonality

Returning to the seasonality elephant in the room from the line chart at the beginning of this project, we now have a better understanding of the contributing factors.

There was unexplained seasonality. Why?

This stacked bar chart stacks the contributing factors inside each month. The chart only includes 2021 and 2022 collisions (year over year) so as not to skew January-March.

undefinedDisappointing. Nothing stands out, visually, as the proximate cause of the 2-year December low June high differential of 15.6% (stated earlier.) All months' contributing factors increase pretty proportionately and uniformly.

So that elephant is still in the room.

If you would learn how to be disappointed without being discouraged, you need never be defeated. ~Robert Schuller

Analytical Thinking Begets Regrets

I could have taken analysis further with Excel, aggregating and grouping by season—winter, spring, summer, and fall—but I don't think the juice will be worth the squeeze.

That could have also involved calculating the frequency of collisions for each contributing factor within each season, but again, the stacked chart already tells this much of the story.

Statistical analysis? Now that's super juicy! Performing statistical tests to identify correlations between contributing factors and seasonality is promising. For example, using correlation coefficients or regression analysis to quantify the strength and direction of the relationship between each contributing factor and seasonal variations in collision frequencies.

And considering time series analysis techniques such as seasonal decomposition or Fourier analysis to decompose the collision data into seasonal components and identify patterns is definitely called for.

So, stay tuned. I'll circle back when I get statistical analysis firmly under my belt.

Conclusion

"Big City Collisions and Big Elephants in the Room" represents a sophisticated dive into urban traffic collision dynamics using data-driven analysis and visualization techniques. This portfolio project showcases an array of insights gleaned from the analysis of collision data from New York City, providing a comprehensive understanding of the factors influencing collision occurrences.

The project demonstrates proficiency in Excel skills, including data manipulation, PivotTable and PivotChart creation, and advanced formula usage. Through the meticulous construction of interactive dashboards, insightful charts, and compelling narratives, the project effectively communicates complex findings to stakeholders.

While the analysis sheds light on temporal trends, contributing factors, severity factors, and location correlations, it also highlights areas for further exploration. The project acknowledges the presence of unexplained seasonality in collision occurrences, hinting at the potential for deeper statistical analysis to uncover underlying patterns and correlations.

Overall, "Big City Collisions and Big Elephants in the Room" presents a compelling narrative of urban traffic safety, showcasing analytical thinking, technical proficiency, and a commitment to continuous improvement. As the project evolves, further exploration through statistical analysis promises to unlock deeper insights into the complex dynamics of urban traffic collisions.

Final Thought

Through this project, we've only scratched the surface of understanding the intricate dynamics of urban traffic collisions. Further analysis holds the promise of uncovering hidden patterns and correlations, ultimately paving the way for targeted interventions to enhance urban traffic safety.

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.