
18 Best Crime Datasets for Machine Learning

July 23, 2021

Looking to build a text analysis model, analyze crime rates, or assess crime trends for a specific area or time period? This list of 18 crime datasets should help. It covers crime data for various locations, predominantly within the United States, that can be used for machine learning.

Top 18 Crime Datasets

  1. Crime in Vancouver: This dataset focuses on crime in Vancouver, Canada between the years of 2003 and 2017. The data focuses primarily on the type of crime, date, scene of the crime (street name), coordinates, and district.
  2. Crime in England and Wales: Originally published by the Home Office, this dataset covers crime that occurred in England and Wales between 2008 and 2009. It was compiled by the British Crime Survey and includes statistics on violent crime, property crime, and more in XLS format.
  3. London Crime: Featuring crime data from London, this dataset contains 13,000,000 rows covering the borough where each crime took place, the type of crime that occurred, and the date of the occurrence.
  4. Austin Crime Statistics: Revolving around crimes that took place in Austin between 2014 and 2016, this dataset features 159,000 rows of data within 18 columns including location info, district, crime description, date and time, and area. 
  5. Baton Rouge Crime: Featuring incidents of crime that were handled by the Baton Rouge Police Department, this dataset includes crimes relating to narcotics, theft, assault, nuisance, vice, battery, damage to property, homicide, and sexual assault. To preserve the privacy of the victims, the data isn’t geocoded.
  6. Crimes in Boston: This dataset follows crimes that were handled by the Boston Police Department from August 15th to the present day. It includes the type of crime that occurred, the date and time the crime took place, and the location of the crime. The CSV file features the following columns: incident number, offense code, offense code group, offense description, reporting area, shooting, date, year, district, month, day of the week, hour, street, latitude, and longitude.
  7. Crimes in Chicago: This constantly updated crime dataset features information regarding crimes taking place in Chicago, Illinois. It began in 2001 and is updated on a weekly basis with information such as location, incident type and description, year of the occurrence, and date the record was updated. 
  8. Denver Crime Data: This regularly updated dataset covers crime that took place in Denver, Colorado beginning in 2015. The data follows the National Incident Based Reporting System (NIBRS) and includes offense codes, offense types, date of crime, reported date, address, and location.
  9. FBI National Incident Based Reporting System (NIBRS): A fantastic resource for crime or policing analysis within the United States, this original dataset has been cleaned, curated, and organized to be convenient and simple to utilize.
  10. Los Angeles Crime and Arrest Data: Based on open data from the city of Los Angeles taken between 2010 and 2019, this dataset includes the report ID, arrest date, time, area, suspect data, type of charge, charge description, and location info.
  11. NYC Complaint Data: A New York City crime dataset that features all crimes reported to the NYPD between 2006 and 2017. The dataset includes 6.5 million rows along with 35 columns including incident date, complaint number, location, coordinates, suspect info, victim info, and more.
  12. Oakland Crime Statistics: This crime dataset focuses on Oakland, California, and covers the years 2011 to 2016. Each year has its own CSV file, for a total of 1,000,000+ rows of data and 10-11 columns.
  13. Phoenix Crime Data: One of the few crime datasets on this list to be updated daily, the Phoenix Crime Data dataset covers crimes from November 2015 to the present day. The data focuses on homicides, rapes, robberies, aggravated assaults, burglaries, thefts, motor vehicle thefts, arson, and drug offenses.
  14. San Francisco Crime Classification: This dataset features San Francisco, CA crimes taking place between 2003 and 2015, and focuses on the timestamp of the incident, category, description of the incident, day of the week, district, resolution, address, and coordinates. 
  15. US Mass Shootings: The US Mass Shootings dataset is a collection of indiscriminate public rampages between 1982 and 2019, each of which resulted in four or more victims.
  16. Kansas City Crime Data: This crime dataset focuses on crimes that took place in Kansas City, Missouri between 2009 and 2016. Data files are separated by year and include columns for the date and time of the crime, location information (with coordinates), crime type, and demographic information of the people involved.
  17. Crime in Atlanta: This collection of crime data from the Atlanta Police Department’s open data portal includes all crimes taking place between 1/1/2009 and 2/28/2017. It’s an aggregation of city-wide monthly data from January 2009 to February 2017.
  18. Marijuana Related Crime: This dataset features marijuana-related crimes reported to the Denver Police Department. It is based on the National Incident Based Reporting System (NIBRS) and includes information on all victims and all offenses within each incident.
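
Many of the datasets above are distributed as CSV files with an offense category, a time field, and a district or coordinates. As a rough first look at such a file, the sketch below counts incidents per offense group and per hour using only the Python standard library. The column names (OFFENSE_CODE_GROUP, HOUR, DISTRICT) and the inline sample rows are hypothetical stand-ins loosely modeled on the Crimes in Boston column list; real headers differ between datasets and should be checked before use.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a downloaded file.
# For a real dataset, replace io.StringIO(SAMPLE_CSV) with
# open(path, newline="") and adjust the column names.
SAMPLE_CSV = """INCIDENT_NUMBER,OFFENSE_CODE_GROUP,HOUR,DISTRICT
I001,Larceny,13,D4
I002,Vandalism,22,B2
I003,Larceny,13,D4
I004,Robbery,22,C11
I005,Larceny,9,A1
"""


def count_by_column(csv_file, column):
    """Count how many rows share each value of the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Incidents per offense group and per hour of day.
by_group = count_by_column(io.StringIO(SAMPLE_CSV), "OFFENSE_CODE_GROUP")
by_hour = count_by_column(io.StringIO(SAMPLE_CSV), "HOUR")

print(by_group.most_common(3))
print(by_hour.most_common(3))
```

Counts like these are a quick sanity check before modeling: they show which categories dominate and whether the time field is populated as expected.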