__STYLES__

LA Crime Watch : Dissecting Crime Patterns in Los Angeles |Pandas, Excel, SQL, Power BI, Tableau|

Tools used in this project
LA Crime Watch : Dissecting Crime Patterns in Los Angeles |Pandas, Excel, SQL, Power BI, Tableau|

City of Angels Crime Gender Victimization

About this project

Note: This project is currently in progress and is being updated. The third dashboard that is listed in Data Visualization section is a rough draft and is meant to give an idea of insights that would be gained from its use. It is in the process of being modified.

Introduction:

Just how dangerous is walking in LA? I decided to dive deep into this subject by analyzing a crime dataset that spans from Jan of 2020 through December 2022. A use case for this analysis could be to help communities come up with crime deterrent programs and in formulating programs to help crime victims of multiple ethnic backgrounds. A secondary use case may be to help the city allocate resources to help combat crime depending on when and where crime activity is at its highest. The use of these insights could help the city to set aside a budget in designing crime prevention programs and or allocating police present to help deter crime in areas and at times it may be most present. Below is a list of the steps in completing this project.

Methodology:

  1. Data Collection: The dataset is collected from the following website: https://data.lacity.org/. On the home page, the Public Safety tab is clicked and then can download the dataset as a CSV file. The downloaded dataset consists of data from 2020 through December 26th, 2022. The website is continually adding more data and so additional records will have been added if downloading the dataset now.
  2. Data Cleaning and transformation: The data cleaning process starts by uploading the CSV file to a Jupyter Notebook. Once uploaded, the CSV file is transposed into a Pandas DataFrame. Various functions and methods in the Pandas Python library are being used to clean the data. A list of all the functions and methods with syntax examples are provided in the additional images attached to this project. This process involved renaming columns, deleting columns that will not be used in the analysis, changing of data types, and handling missing values. Additional columns to include an age bracket column, gender column with the gender spelled out, and an ethnicity column with the actual ethnicity name are created. The age bracket column and ethnicity column were created in the Jupyter Notebook using the Pandas Python library. undefinedundefinedSome minor further data transformation and cleaning to drop unused columns and add a couple of derived columns is executed in Microsoft SQL server. This refined dataset is being used to create an interactive Tableau dashboard to enable deeper insights across the different variables.undefined
  3. Data Exploration: Preliminary data exploration is done using some of the Pandas functions to acquire information on the size of the dataset, measures of central tendency (mostly mean and median), and some other basic statistics of the data set. The age column is of special interests due to a few outliers. Power BI is used to further explore the data through charts and the use of various filters. This process also involves the use of running SQL queries to make sure the values in charts match those of the returned values from running various SQL queries.
  4. Data Visualization: Data visualization consists of three Power BI dashboards. Due to not having a professional Power BI account, a fourth dashboard that will be interactive will utilize Tableau. The goal of each Power BI dashboard is derive specific insights related to LA crime. The first dashboard gives insight into victim demographics. This includes information on victim age and ethnicity. This yields insight as to which age group and or ethnic groups are have had the greatest share of crime victims. undefinedThe second dashboard derives insights as to crime rate trends. Factors considered include visualizing the crime rate trend in relation to the day of the week, month of the year, and time of day. The insights gained from this dashboard help to visualize which day/s of the week, month/s of the year, and which times of the day are likely to see the highest crime rate. undefinedThe third dashboard looks into the details of the number of crimes by location. This enables the viewer to see which parts of LA have seen the greatest number of crime. Filters are put into place so that more granular information can be obtained. Filters include, age-bracket, ethnic background, and month. This would enable the user to see how how frequent each age-group and or ethnic background has been a victim crime based on location. There is also a month filter as to see how the number of crimes for each location differ by month. Other filters including year and day of the week will also be added. undefinedA fourth dashboard that will be interactive is also being created using Tableau to allow users to see various metrics by different variables including, year, month, gender, age, and other variables to get deeper insights.
  5. Communicating the insights:
  6. Recommendations:

Additional project images

Discussion and feedback(0 comments)
2000 characters remaining