The Crime Distribution of New York City

Dashboard created through Tableau Public


Crime data analysis is an integral part of the fight against crime, especially in determining the efficacy of departments and where crime occurs. To this end, visualizations are very useful for communicating findings, so it is important that they are well made and understandable. Crime data is also very rich in information, so it is easy to overdo a visualization by including so much detail that it detracts from the overall message.


As someone who enjoys working with crime data and hopes to work in the field as a crime analyst, it made sense for me to do this report on crime data. This led me to look through the City of New York’s open data portal, where I found a summary dashboard for the NYPD year-to-date arrests data. The first interesting plot I came across on the summary dashboard was the number of arrests in NYC plotted by date, and as we can see below there are a lot of spikes and dips. This raises the question of whether this is a weekly cycle in which more crime occurs on certain days, or whether it is due to when officers are on duty.

The other interesting plot on the summary dashboard was a choropleth map of where the arrests occurred, based on the police precincts. While aesthetically pleasing, it makes the relative counts of two non-adjacent precincts hard to discern because their colors are so similar. At least to me, this makes a fair majority of the precincts indistinguishable, as it is difficult to tell which one has the higher count.

The last interesting plot showed the counts of arrests by the perpetrator’s race. It has a lot of empty space, and most of the race categories are unlabeled, so you don’t know what they are unless you are already familiar with the data.


The NYPD YTD Arrests data was acquired from the City of New York’s open data portal as a CSV file and then imported into Tableau Public for visualization. For most of the visualizations it was as simple as creating a worksheet, placing the variables into their respective locations, and adding appropriate labels and color where needed. Others required a bit more work and finesse to create a visually appealing and non-deceptive graph. The most difficult of these was the arrests-by-weekday bar chart. It required creating a new calculated field that used a function to determine the day of the week on which the arrest occurred, based on the arrest date, and then creating aliases for the numeric output so that the chart is human readable.
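For readers who want to reproduce the weekday step outside Tableau, the same idea can be sketched in Python. This is only an illustration and not the calculated field used in the dashboard: the function names are hypothetical, and the `MM/DD/YYYY` date format is an assumption about how the arrest date is stored in the CSV export.

```python
from collections import Counter
from datetime import datetime

# Aliases for the numeric weekday, playing the same role as the Tableau aliases.
# datetime.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def weekday_name(date_text, fmt="%m/%d/%Y"):
    """Return the weekday name for a date string (format is an assumption)."""
    return WEEKDAY_NAMES[datetime.strptime(date_text, fmt).weekday()]


def arrests_by_weekday(date_texts):
    """Count arrests per weekday, always returning all seven days in order."""
    counts = Counter(weekday_name(text) for text in date_texts)
    return {name: counts.get(name, 0) for name in WEEKDAY_NAMES}
```

Passing the arrest-date column of the CSV to `arrests_by_weekday` gives the seven counts that a weekday bar chart would display.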

The next two plots looked at arrests over time, one showing the racial makeup of the data over time and the other showing the change in crime over time. Both required converting the original numerical counts into percentages: the change-over-time graph shows each month's change relative to the first month, and the race-over-time graph shows what percentage of each month's arrests belonged to each racial group. The grouping used here is not the original one. The original data set had 8 different racial groups, several of them overlapping or similar, so a new variable was created in Tableau to group the data.
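The two percentage calculations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Tableau table calculations themselves: the function names are hypothetical, and the `mapping` argument stands in for the regrouping variable, whose actual categories are not reproduced here.

```python
from collections import Counter, defaultdict


def percent_change_from_first(monthly_counts):
    """Percent change of each month's count relative to the first month."""
    base = monthly_counts[0]
    if base == 0:
        raise ValueError("the first month must have a non-zero count")
    return [100.0 * (count - base) / base for count in monthly_counts]


def monthly_shares(records, mapping=None):
    """Percentage of each month's arrests that falls in each group.

    records: iterable of (month, group) pairs, one pair per arrest.
    mapping: optional dict that merges original labels into broader groups
             (hypothetical; labels missing from it are kept unchanged).
    """
    mapping = mapping or {}
    totals = Counter()
    by_month = defaultdict(Counter)
    for month, group in records:
        group = mapping.get(group, group)
        totals[month] += 1
        by_month[month][group] += 1
    return {
        month: {g: 100.0 * n / totals[month] for g, n in groups.items()}
        for month, groups in by_month.items()
    }
```

For example, `percent_change_from_first([200, 150, 250])` returns `[0.0, -25.0, 25.0]`, so the first month always sits at zero and later months are read against it.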

The final main visualization was the map of where the arrests for December 2021 robberies took place. Originally I had intended to create a heat map of the data, but the limitations of Tableau Public made that impossible, so I decided to just plot the points. This caused a new issue: there were so many arrests that all of NYC turned blue. To make the map legible I filtered the data down to a single type of crime, robberies, but even this wasn’t enough, so I added a second filter restricting the data to the most recent month, December 2021. With the map now legible, I added a grouping to show the distribution of robberies by race.
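The two filters used for the map can be approximated in Python along these lines. This is a rough sketch rather than what Tableau does internally: `filter_arrests` is a hypothetical helper, and the field names `OFNS_DESC` and `ARREST_DATE`, the `"ROBBERY"` label, and the `MM/DD/YYYY` date format are assumptions about the CSV export that should be checked against the actual file.

```python
from datetime import datetime


def filter_arrests(rows, offense, year, month,
                   offense_field="OFNS_DESC",   # assumed column name
                   date_field="ARREST_DATE",    # assumed column name
                   fmt="%m/%d/%Y"):             # assumed date format
    """Keep only the rows for one offense type in one calendar month."""
    kept = []
    for row in rows:
        when = datetime.strptime(row[date_field], fmt)
        if (row[offense_field] == offense
                and when.year == year
                and when.month == month):
            kept.append(row)
    return kept
```

The rows could come from `csv.DictReader`, and a call such as `filter_arrests(rows, "ROBBERY", 2021, 12)` would leave only the points that need to be plotted.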


There are several key takeaways from the data. First, those classified as Black make up the overwhelming majority of all arrests in NYC, as shown by all three visualizations involving race, which was to be expected given previous studies on this subject. The second main takeaway is that arrests usually peak at the beginning of the week before tapering off and hitting a low over the weekend. To answer the earlier question about the spikes and dips, this is most likely due to more officers being on duty during the week, and the increase in arrests at the start of the week is probably due to the backlog created over the weekend. The last key takeaway is that crime overall is down.

Future Analysis

There are several avenues for future analysis that would be interesting to delve into. The first would be to get more data, meaning several years' worth, and see whether the trends found here are consistent across time or whether this year was an anomaly. The second would be to load this data into a GIS program, both to recreate the choropleth map with a better color palette and to create crime-specific heat maps, as it would be interesting to see where the various types of crime occur within NYC.