__STYLES__

When Do Traffic Collisions Occur?

Tools used in this project
When Do Traffic Collisions Occur?

About this project

Goals

My goals were to:

  1. Identify seasonal patterns,
  2. Visualize weekly trends, and
  3. Determine the percentage of dangerous collisions that resulted in injury or death.

Methods

  1. In order to identify seasonal patterns, I began by creating a Pivot Table.
    • I used the 'Collision ID' field in my Values field, summarized by COUNT.
    • The 'Date' field I put into my Rows field, which was broken up by Excel into 'Months' and 'Years'.
      • 'Years' was dragged to my Columns field in order to visualize it by month and year.
    • I created a Line Pivot Chart to visualize the seasonal trends over time.
      • April 2023 was incomplete and made the line appear as if traffic collisions decreased dramatically. The data for that month was removed so as to not give misleading information.
  2. I created another Pivot Table to determine the number of collisions by day of the week as well as hour of day in which they occurred. To visualize, I created a heat map highlighting the highest number of incidents.
    • I needed to extract the day of the week and the hour when collisions took place.
      • I used WEEKDAY and HOUR functions to create new columns in my source data.
    • From here, I used 'Collision ID' summarized by COUNT in my values, 'Weekday' as my column, and 'Hour' for my rows.
    • In order to best visualize the table, I added a heat map to it.
      • Under Conditional Formatting, I selected a 3-color scale in order to emphasize the greatest number of traffic collisions by day and time.
  3. To determine the percentage of dangerous collisions, I added two more columns to my source data and created a third Pivot Table.
    • The first column I labeled 'Dangerous'. This column's purpose is to identify every collision where a person involved was injured or killed with a '1', and all others with a '0'. I used =IF(SUM('Persons Injured'+'Persons Killed")>0,1,0) to achieve this.
    • The second column I simply labeled 'Ones'. In this column, every row entry has a '1'.
      • This is to get a "count" of collision IDs without actually using COUNT OF 'Collision ID' because the calculated field only calculates using sums.
    • Once again, I used 'Collision ID' summarized by COUNT and my newly created 'Dangerous' field summarized by SUM in my Values field.
    • I filtered my results to only the Top 10 contributing factors and sorted them in descending order.
    • Then I added a calculated field to the end of my Pivot Table to calculate the percentage of collisions that resulted in injury or death.
      • I divided my 'Dangerous' field by my 'Ones' field to get the percentage of dangerous incidents.
    • Finally, I added data bars to the '% of Dangerous' entries to further illustrate the percentage.
      • Under Conditional Formatting, I selected the Solid Red data bar and edited the rules so the bars are accurately scaled to reflect the percentage (0-100%).

Insights

Seasonality

  1. From looking at the line graph, all three years begin January with just under 8,000 collisions for the month. Thankfully, the numbers have decreased just slightly on a year-by-year basis but staying mostly consistent.

    • This could potentially be explained by New Year's Day celebrations. Many people go to parties to celebrate. Some make the mistake of drinking too much or being tired and then driving home. I would suspect upon a more granular analysis, most of the traffic incidents in January are on this day.
    • This is also immediately following the hectic holiday season. People may be out returning or exchanging gifts they may have received.
  2. Collisions dip going into February, then rise significantly in March. This is seen across the three years of data provided.

    • February is the lowest data point across all three years. It is still winter when the weather is cold. It is likely less people are out driving or walking the streets than at any other time of the year.
    • February is also the shortest month of the year, which could lead to slightly misleading figures. Upon further analysis, February may very well have the same number of collisions per day on average when compared to the rest of the year.
    • As the season shifts into spring and then summer, more drivers are likely out on the roads. This explains why the occurrence of traffic collisions increase overall.

Weekly Trends

  1. Regardless of the day, most collisions are occurring between 2-7 PM.

    • These are of course times of rush hour when most people are leaving work and going home for the day. The 8 AM hour is also when a high number of auto accidents occur.
  2. Fridays and Saturdays at midnight are outliers.

    • These data points are completely separate from the cluster of red cells located in the middle of the chart.
    • These can most likely be explained by the active night life within the city. People like to spend their weekends out and unfortunately, they may spend it drinking, which would result in impaired drivers.

Percentage of Dangerous Collisions

  • Top 10 Contributing Factors (dangerous collisions/total collisions)

    1. Driver Inattention/Distraction (22,680/58,308) - At first glance, this one seems to reflect the way people can tend to be distracted while driving and using their phones at the same time. What is surprising is how many of these collisions result in injury or death. Deeper analysis would show if these incidents are occurring on highways at higher speeds or if pedestrians are being harmed while walking.
    2. Unspecified (15,562/58,262) - These statistics are more challenging to analyze as there are no specific details provided. Perhaps the parties involved didn't want to divulge information or accept blame.
    3. Failure to Yield Right-of-Way (10,542/16,555) - At 64%, this is the factor that resulted in the highest number of injuries and deaths. If I were to look at the locations where these incidents took place, I would guess I would find they occurred at intersections where one party made a turn with oncoming traffic, resulting in a t-bone crash, where the driver's side door was the spot of impact.
    4. Following Too Closely (6,269/15,519) - When following too closely, the vehicle in front is most likely to suffer an injury - such as whiplash - when being hit from behind.
    5. Passing or Lane Usage Improper (2,779/10,733) - This statistic likely applies to bike lanes. Automobiles may use the bike lane to make an oncoming right turn and fail to see a cyclist in the lane.
    6. Passing Too Closely (988/9,132) - At 11%, this factor resulted in the smallest number of injuries and deaths. These kinds of incidents - if not happening at high speeds or during heavy traffic - should generally result in bumper-to-bumper collisions and not cause much harm to the passengers.
    7. Unsafe Speed (3,850/8,429) - As the third highest percentage of dangerous collisions, unsafe speeds can - and evidently do - lead to serious injury. The higher the speed, the greater the impact, the more severe consequences.
    8. Backing Unsafely (1,387/7,473) - This factor also has a low proportion of dangerous collisions. Most of these incidents likely occurred in parking lots at lower speeds. Vehicle-on-vehicle collisions shouldn't cause much harm. Pedestrians were likely the victims.
    9. Traffic Control Disregarded (3,939/6,717) - Proportionally, the second most dangerous contributing factor at 59%. Presumably, traffic lights are out, and drivers are unaware of correct procedure.
    10. Other Vehicular (2,044/6,494)

Additional project images

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.