Chicago Crime & Arrest Data Mining

Samantha Gonzalez

Co-Presenters: Miguel Soler

College: The Dorothy and George Hennings College of Science, Mathematics and Technology

Major: Computer Science

Faculty Research Mentor: Ching-yu Huang

Abstract:

This study examines the relationship between crime severity and the likelihood of an arrest, using 2024 data from the Chicago Crimes (2001–Present) and Chicago Arrests datasets. The research explores whether felonies lead to arrests more often than misdemeanors and lower-severity crimes, and whether felonies have a higher clearance rate in high-crime areas. By analyzing these crime and arrest patterns, this study aims to uncover potential disparities in law enforcement efforts and to inform resource allocation strategies.

To protect data integrity, we conduct an Extract, Transform, and Load (ETL) process in which raw data is cleaned, formatted, and structured for meaningful analysis. The Arrest Data and the Crime Data are stored in separate MySQL databases on a Linux server so that they can be compared against each other. The cleaned datasets serve as the foundation for our data mining approach, which includes clustering, correlation analysis, and classification models. Clustering techniques help us identify crime hotspots and the times when crime occurs most often, while classification models help us predict the probability of arrest based on crime attributes. Additionally, statistical modeling provides insights into whether certain crime characteristics, locations, or suspect demographics influence case resolution rates.

By applying these methodologies, our research aims to highlight trends in crime clearance rates so that we can offer insights into law enforcement priorities and potential biases. These findings can be used to assess the efficiency of current policing strategies and to identify areas where law enforcement efforts may require adjustment to improve overall crime resolution.
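As a rough illustration of the central comparison described above (arrest likelihood by crime severity), the following minimal Python sketch computes an arrest rate per severity class. The records, the severity labels, and the function name `arrest_rate_by_severity` are hypothetical and chosen only for illustration; they are not rows or code from the study, and the actual analysis draws on the cleaned MySQL data.

```python
from collections import defaultdict

# Hypothetical records: (severity, arrest_made). These values are
# illustrative assumptions, not rows from the Chicago datasets.
SAMPLE_RECORDS = [
    ("felony", True),
    ("felony", False),
    ("felony", True),
    ("misdemeanor", False),
    ("misdemeanor", True),
    ("misdemeanor", False),
    ("misdemeanor", False),
]


def arrest_rate_by_severity(records):
    """Return {severity: fraction of incidents that resulted in an arrest}."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for severity, arrested in records:
        totals[severity] += 1
        if arrested:
            arrests[severity] += 1
    return {severity: arrests[severity] / totals[severity] for severity in totals}


rates = arrest_rate_by_severity(SAMPLE_RECORDS)
for severity, rate in sorted(rates.items()):
    print(f"{severity}: {rate:.2f}")
```

With real data, the same grouping could be repeated per district or community area to compare clearance rates between high-crime and lower-crime areas.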
