I have long been interested in crime data, and to this end I plan to join law enforcement as a crime analyst. Crime data and its analysis are an integral part of the fight against crime, especially in determining what crime occurs, where, and when. This led me to study New York City’s crime data both spatially and temporally as part of one larger project made up of three separate finals using charts, graphs, and maps. To do this I used arrest data, as it is the best available approximation of actual crime levels that also has a spatial component. This portion explores and analyzes the data using charts and graphs to look for trends over time and to show the distribution of arrests across the police precincts.
The visualizations for this project were created using Tableau Public. I began by creating a time series graph of all of the data to learn the overall trend. The next visualization was another time series graph, this time with the arrest data separated into eleven crime categories that together encompass all forty-four offense descriptions. To accompany this graph and give it context, I added a bar chart showing the totals for each category as well as a breakdown of which offense descriptions make up each category. I then created a bar chart of the arrests per police precinct, a third time series graph showing only the trends of the ten precincts with the highest arrest counts, and a fourth graph that included the trend line for the 14th precinct, as it had the highest count. The last chart created prior to any user research was a recreation of the arrests per police precinct bar chart in which the user could filter by arrest type to see where each type was concentrated. After conducting the research it was apparent that a map of the precincts should be included to aid comprehension, so I added one using the same coloration as the other precinct charts. Furthermore, annotations were added to give context and descriptions to the visualizations.
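The per-precinct counts, the top-ten selection, and the arrest-type filter described above were all built in Tableau, but the underlying aggregation can be sketched in a few lines of Python. This is only an illustrative sketch: the sample records, the category names, and the function names `arrests_per_precinct` and `top_precincts` are assumptions, not part of the original workbook.

```python
from collections import Counter

# Hypothetical (precinct, crime category) records; the real data has
# 77 precincts and eleven categories.
records = [
    (14, "Theft"), (14, "Assault"), (14, "Theft"),
    (75, "Theft"), (75, "Drugs"),
    (40, "Assault"),
]


def arrests_per_precinct(rows, category=None):
    """Count arrests per precinct, optionally keeping only one category."""
    return Counter(
        precinct
        for precinct, cat in rows
        if category is None or cat == category
    )


def top_precincts(rows, n=10, category=None):
    """Return the n precincts with the most arrests as (precinct, count) pairs."""
    return arrests_per_precinct(rows, category).most_common(n)


# All arrests, highest counts first.
print(top_precincts(records, n=2))
# Same ranking restricted to one (hypothetical) category, as in the filtered chart.
print(top_precincts(records, n=2, category="Theft"))
```

Passing `category` mirrors the interactive filter: the ranking is recomputed on only the rows of the selected arrest type.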
While making these visualizations I referenced two visualizations made by Knoema. The first of these was referenced in the making of the time series graphs.
This visualization helped me choose to display crime by month, as the daily and weekly displays both contained so much noise that they became hard to understand. The second graph is a horizontal bar chart.
In this situation the horizontal chart was perfect, as it allowed all the country names to be easily legible while also keeping the entire data set on a single screen. Furthermore, they also organized the data in descending order. In my case the height of my bar chart would have been too great with the 77 precincts in New York City, making it harder to read. This issue would also have been exacerbated on the interactive chart, as users would be constantly scrolling to see the whole thing.
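To illustrate why monthly grouping is easier to read than daily grouping, the sketch below counts the same records at both levels. It is a minimal example using made-up dates; the function names `count_by_month` and `count_by_day` are hypothetical, and the actual charts were produced with Tableau's date settings rather than with code.

```python
from collections import Counter
from datetime import date

# Hypothetical arrest dates standing in for the real records.
arrest_dates = [
    date(2021, 1, 3), date(2021, 1, 17), date(2021, 1, 29),
    date(2021, 2, 2), date(2021, 2, 20),
    date(2021, 3, 11),
]


def count_by_month(dates):
    """Return {(year, month): count} in chronological order."""
    counts = Counter((d.year, d.month) for d in dates)
    return dict(sorted(counts.items()))


def count_by_day(dates):
    """Return {date: count} in chronological order."""
    return dict(sorted(Counter(dates).items()))


monthly = count_by_month(arrest_dates)
daily = count_by_day(arrest_dates)

print(monthly)
print(len(daily), "daily points versus", len(monthly), "monthly points")
```

Over a full year the daily series has up to 365 points with large day-to-day swings, while the monthly series has only 12, which is what makes the overall trend visible.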
The user research was conducted in two sessions, with revisions made after each session. Each session consisted of the participant thinking aloud as they went through the visualizations. During this time the participants were asked to keep the following questions in mind as they thought aloud.
- Can you understand the visualizations that you are looking at?
- Does it make sense contextually?
- Is it aesthetically pleasing?
- Do you have any additional critiques and/or recommendations?
The participants were selected because they are New York City residents with an interest in crime data and policing.
The visualization of this data led to several key findings. The first was that the overall trend was fairly stable despite some minor fluctuations at the beginning of the year. This trend is also reflected in each of the individual arrest types. This indicates that, following the overall decrease in crime that occurred due to COVID, levels have yet to return to where they were prior to COVID. Furthermore, this also indicates that people have not entirely returned to their previous way of life.
The second major finding is that although the 14th precinct makes up only 0.24% of the city’s landmass, it accounts for 3.22% of all arrests and is in fact the precinct with the highest arrest count. Furthermore, this precinct also has an upward trend in arrests, while the others in the top ten show no real increase or decrease. The trend line shows an increase of 16.5 arrests per month with a strong correlation. These findings, while significant, are not entirely unexpected given the area that the 14th precinct covers and the relaxing of COVID restrictions. The 14th precinct services Grand Central Station and Penn Station, two massive transit stations that bring many people in and out of the city daily, as well as Times Square, a very popular tourist attraction. As COVID restrictions relax, the number of people passing through these areas will start to increase, and with more people in a given location the amount of crime that occurs there will tend to increase as well. Unlike the overall trend, this could indicate the start of a slow return to normalcy.
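For readers who want to see how figures like a precinct's share of arrests and a per-month trend are derived, here is a minimal sketch. It assumes an ordinary least-squares line fitted to monthly counts, which is a common way to build a linear trend line; the monthly values below are invented for illustration and are not the 14th precinct's actual data, and the function names are hypothetical.

```python
from math import sqrt


def share_percent(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total


def linear_trend(values):
    """Fit y = slope * x + intercept by least squares, with x = 0, 1, 2, ...

    Returns (slope, intercept, r), where r is the Pearson correlation
    between the month index and the monthly count.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two monthly values")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    if syy == 0:
        raise ValueError("values are constant; correlation is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Invented monthly arrest counts for a single precinct (not real data).
monthly_arrests = [310, 335, 328, 360, 371, 390]

slope, intercept, r = linear_trend(monthly_arrests)
print(f"change per month: {slope:.1f} arrests, correlation r = {r:.2f}")
```

The slope is the "arrests per month" figure, and a value of `r` close to 1 is what is meant by a strong positive correlation.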
The user research provided many key insights that were initially overlooked, with the most interesting and informative coming from the first session and the classroom critique. The biggest request made by the first user was to add a map showing where the various precincts are located, which was then echoed by several members of the class. The next most transformative feedback was a request for more context. The first user only referred to adding the city’s name to the title, but several critiques from the class recommended that the added context be a bit broader and include direct annotations on the charts and graphs themselves, along with the year that the data came from.
The second user’s feedback had a much more subtle effect on the overall report. The most noticeable was their recommendation that some annotations be reworded for clarity, as they did not understand the points the annotations were trying to convey. The other finding from the second session was that the Y-axis label used the file’s name, including the .csv extension, as this is what Tableau defaults to and I had overlooked it in the design process. To rectify this I gave all charts and graphs new axis labels so that those unfamiliar with the data and/or Tableau would be able to easily understand what the axes mean.
Moving forward, there are two short-term steps and one long-term step to improve the findings of, and the context surrounding, this project. The first short-term goal is to bring in the other studies, analyses, and visualizations that focused on mapping the data, to improve the understanding of the data, especially where arrests occur spatially. The other short-term goal is to acquire the past several years of this data to expand the time series analyses and to allow for the comparison of arrests between years, mainly before, during, and after COVID restrictions. This brings me to the sole long-term goal for this project, which is to continue adding yearly data to see how arrests are distributed across time and space in the years to come.