Mapping Lab: 2016 Stop-and-Frisk Data in Carto



Introduction

The NYPD’s stop-and-frisk program, the practice of spontaneously and temporarily detaining, questioning, and potentially searching civilians, led to over 100,000 stops per year between 2003 and 2013, with almost 700,000 stops at its peak in 2011. Communities of color have been consistently and disproportionately targeted, with Black and Latino males between the ages of 14 and 24 comprising the majority of stops. Furthermore, research suggests that there is no significant correlation between the program and the reduction of crime in New York City. Despite the recent decline of the program, the issues it raises regarding racial profiling and privacy rights remain relevant, particularly given the current climate of rampant police brutality (or at least a heightened awareness of it) and continuing disparities in who is targeted.

For this visualization, I was therefore interested in exploring at a broad level how stop-and-frisk targeting breaks down by neighborhood/precinct (and how that might reflect the demographics being targeted), as well as in showing racial and gender disparities at a more granular level.

 

Example Visualizations

Below are four example visualizations I chose as inspiration before creating my own in Carto. They all come from a Columbia student’s project that used the same NYPD data used for this lab and created visualizations with ArcGIS. The project includes several maps that focus on particular aspects of the data: total number of stops per precinct in a year, although the year is not specified (Figure 1); a breakdown of total stops during certain time intervals by precinct (Figure 2); a breakdown of individual stops and totals per precinct by race, categorized as white or “minority” (Figure 3); and individual stops in Manhattan, also broken down by race (Figure 4). While not perfect, these examples demonstrate the richness of this data and the many possible directions for visualizations.

 

Figure 1                                                                       Figure 2

 

Figure 3                                                                       Figure 4

 

 

Materials & Methods

Yearly stop-and-frisk datasets created by the NYPD from 2003 to 2016 are available for download on NYC.gov. Although 2016 has the fewest stops, I chose this dataset because it is the most recent and a more manageable size for this lab. The data did not need much cleaning, but I removed several columns that were not relevant to what I wanted to do. I also initially planned to change the letter codes used for sex and race to words (e.g. M to Male and F to Female), but the meaning of some codes was not readily apparent (e.g. Z for sex and race [perhaps “Other” or “Undetermined”?], and Q for race, which is most likely Latino given the statistics). Although I could make educated guesses about these codes, my search for an official key was unsuccessful, so for the sake of consistency I left them as letters for this lab. In the future I would do more thorough research to establish these, as I mention in my discussion of possible future directions for this project.
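The column trimming and the cautious recoding described above could be sketched as follows. This is only an illustration: the column names (pct, sex, race, age, xcoord, ycoord) and the sample rows are assumptions, not a confirmed description of the NYPD file, and only the two sex codes mentioned above are relabeled. Any code without a confirmed meaning is left untouched.

```python
import csv
import io

# Hypothetical subset of columns to keep; the real file has many more.
KEEP = ["pct", "sex", "race", "age", "xcoord", "ycoord"]

# Only the codes the text is confident about; no official key was found,
# so every other code is deliberately left as its original letter.
SEX_LABELS = {"M": "Male", "F": "Female"}


def trim_and_recode(rows):
    """Keep selected columns and relabel known sex codes, leaving unknown codes as-is."""
    cleaned_rows = []
    for row in rows:
        slim = {col: row[col] for col in KEEP if col in row}
        if "sex" in slim:
            slim["sex"] = SEX_LABELS.get(slim["sex"], slim["sex"])
        cleaned_rows.append(slim)
    return cleaned_rows


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "pct,sex,race,age,xcoord,ycoord,extra\n"
    "1,M,B,22,987000,200000,x\n"
    "1,Z,Q,30,987100,200100,y\n"
)
cleaned = trim_and_recode(csv.DictReader(sample))
```

Leaving unmapped codes alone keeps the category widgets consistent with the source data until an official key can be found.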

As it stood, this data was not yet formatted in a way that would be particularly useful in Carto, the open-source mapping tool used for this lab. I wanted to visualize the total number of stops per precinct but could not do so directly because each stop was listed as a separate row. I therefore created a separate dataset with totals per precinct. To visualize this, I downloaded an NYPD precinct shapefile, which I used to create my first layer in Carto. I then pulled in the totals data as a second layer by choosing “Analysis,” then “Join columns from 2nd Layer”; the precinct column from the source data (shapefile) was matched with the precinct column from the target data (total numbers per precinct).
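As a rough sketch of the two steps above (counting stops per precinct, then joining the totals onto the precinct layer), the logic might look like this in Python. The join itself was done in Carto; the function names, the "pct" and "precinct" column names, and the sample rows here are all assumptions for illustration.

```python
import csv
import io
from collections import Counter


def stops_per_precinct(rows, precinct_col="pct"):
    """Count one stop per input row, grouped by precinct number."""
    return Counter(int(row[precinct_col]) for row in rows)


def join_totals(precincts, totals):
    """Left join: every precinct keeps its record; precincts with no stops get 0."""
    return [{**p, "total_stops": totals.get(p["precinct"], 0)} for p in precincts]


# Made-up stop records and precinct records, for illustration only.
sample = io.StringIO("pct,sex,race\n1,M,B\n1,F,W\n5,M,Q\n")
totals = stops_per_precinct(csv.DictReader(sample))

precinct_shapes = [{"precinct": 1}, {"precinct": 5}, {"precinct": 6}]
joined = join_totals(precinct_shapes, totals)
```

Keeping every precinct in the result, even one with zero stops, means no polygon disappears from the map after the join.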

I styled this layer by applying color by value (totals) in seven buckets to (in theory) reflect the range in total stops per precinct. I ultimately chose to keep the “color” in grayscale (consistent with the background) so that it would not clash with the colors of the final layer of points or make the map too busy visually. I also made the “color” slightly transparent so that a user could still see the map beneath when zoomed in, which would be relevant for the final layer. I included pop-ups showing the precinct number and the total number of stops per precinct on both hover and click (the former more for laptop/desktop use and the latter in case a user is viewing the map on a phone). I did this instead of permanently labeling the precincts on the map primarily because the labels ended up being obscured by the points in the final layer. Finally, I included a legend to indicate the range of total numbers for each precinct, although I am still not completely happy with it: I do not think the number of buckets accurately reflects the range (e.g. the most saturated precincts still range from around 300 to around 1300).
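The concern about the top bucket spanning roughly 300 to 1300 comes down to how the bucket boundaries are chosen. The sketch below compares two common classification schemes, quantile and equal-interval, on made-up precinct totals. It does not reproduce Carto’s own classification, which is not specified here; the helper names and numbers are hypothetical.

```python
import statistics


def equal_interval_breaks(values, k):
    """k buckets of equal width between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def quantile_breaks(values, k):
    """k buckets holding roughly the same number of precincts each."""
    return statistics.quantiles(values, n=k, method="inclusive")


def bucket_index(value, breaks):
    """0-based bucket for a value: the number of break points it exceeds."""
    return sum(value > b for b in breaks)


# Made-up precinct totals with a long right tail, for illustration only.
totals = [12, 20, 25, 31, 40, 48, 55, 63, 70, 85, 110, 150, 300, 1300]

eq = equal_interval_breaks(totals, 7)
qt = quantile_breaks(totals, 7)

# With quantile breaks, 300 and 1300 share the darkest bucket;
# with equal-interval breaks they land in different buckets.
same_quantile_bucket = bucket_index(300, qt) == bucket_index(1300, qt)
same_equal_bucket = bucket_index(300, eq) == bucket_index(1300, eq)
```

With skewed totals like these, quantile buckets lump the few very high precincts together, while equal-interval buckets separate them at the cost of putting most precincts in the lightest shade.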

I also wanted to include a layer that showed the individual stops on the map, with the ability to filter by sex and race. However, the NYPD uses the New York State Plane Coordinate System (Long Island Zone, NAD 83, FIPS 3104), which is not recognized by Carto. I therefore had to convert these coordinates into latitude and longitude using QGIS, a free and open-source GIS application for viewing, editing, and analyzing geospatial data. With the latitude and longitude for each stop, I uploaded the dataset (which also included information about sex, race, and age) as a final layer in Carto and ran a georeference analysis to plot all of the points on the map. I styled the points using a fixed (bright pinkish-red) color and made them slightly transparent so that users can see which points are in fact multiple stops (i.e. when a group of individuals was stopped at the same time/place); these appear more opaque than a single stop at one time/place, although I wish there were a more effective way to show this. I also added two category widgets to allow users to filter the final layer by sex and race.
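The conversion itself was done in QGIS. Purely to illustrate what that step computes, here is a minimal Python sketch of the inverse Lambert Conformal Conic projection, assuming the commonly published parameters for this zone (EPSG:2263, GRS80 ellipsoid, US survey feet). The function name and sample coordinates are hypothetical, and the result is NAD 83 latitude/longitude, treated here as close enough to WGS84 for web mapping. In practice QGIS or a dedicated projection library would be the tool to use.

```python
import math

# Assumed EPSG:2263 parameters (NAD83 / New York Long Island, US survey feet).
A = 6378137.0                    # GRS80 semi-major axis, metres
INV_F = 298.257222101            # GRS80 inverse flattening
US_FT = 1200.0 / 3937.0          # metres per US survey foot
LAT1 = math.radians(41 + 2 / 60)   # first standard parallel
LAT2 = math.radians(40 + 40 / 60)  # second standard parallel
LAT0 = math.radians(40 + 10 / 60)  # latitude of origin
LON0 = math.radians(-74.0)         # central meridian
X0_M, Y0_M = 300000.0, 0.0       # false easting / northing, metres

_f = 1.0 / INV_F
E = math.sqrt(2 * _f - _f * _f)  # first eccentricity


def _m(phi):
    return math.cos(phi) / math.sqrt(1 - (E * math.sin(phi)) ** 2)


def _t(phi):
    s = E * math.sin(phi)
    return math.tan(math.pi / 4 - phi / 2) / ((1 - s) / (1 + s)) ** (E / 2)


# Projection constants derived from the two standard parallels.
N = (math.log(_m(LAT1)) - math.log(_m(LAT2))) / (math.log(_t(LAT1)) - math.log(_t(LAT2)))
F = _m(LAT1) / (N * _t(LAT1) ** N)
RHO0 = A * F * _t(LAT0) ** N


def state_plane_to_latlon(x_ft, y_ft):
    """Convert State Plane feet (Long Island zone) to (latitude, longitude) in degrees."""
    x = x_ft * US_FT - X0_M
    y = y_ft * US_FT - Y0_M
    rho = math.hypot(x, RHO0 - y)
    theta = math.atan2(x, RHO0 - y)
    t = (rho / (A * F)) ** (1 / N)
    phi = math.pi / 2 - 2 * math.atan(t)  # first guess, refined below
    for _ in range(10):
        s = E * math.sin(phi)
        phi = math.pi / 2 - 2 * math.atan(t * ((1 - s) / (1 + s)) ** (E / 2))
    return math.degrees(phi), math.degrees(theta / N + LON0)


# Hypothetical stop coordinates in feet, for illustration only.
lat, lon = state_plane_to_latlon(987000.0, 200000.0)
```

A quick sanity check is that the false origin (x = 984,250 ft, y = 0) should map back to the latitude of origin on the central meridian.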

For my basemap layer I chose “Positron (labels below)” because the grayscale background allowed the colors of the other layers to stand out, but it still provided useful geographical reference points (e.g. neighborhoods and some street names), especially through the semi-transparent precinct layer.

 

Results/Discussion

Once my data was formatted in a way that Carto recognized, uploading and visualizing it was a rather straightforward process. Overall, the resulting visualization provides a lot of information in a digestible way. The bottom layer (precincts with total numbers) lays the foundation with the bigger picture, and the top layer (individual stops that can be filtered by sex and race) both narrows the focus and adds complexity to that picture. These two layers complement each other. Without the individual points, the precinct layer would not provide any information about the individuals or the locations of the stops. Without the precinct layer, it would be difficult (if not impossible) to get a sense of the total numbers; this is especially true because in some cases there are several individual stops at one point, and it is not possible to see all of their individual characteristics (e.g. when a user clicks on the point).
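One possible way to make the stacked stops explicit, rather than relying on opacity alone, would be to count how many stops share a location before uploading the data. The sketch below is only an illustration of that idea and was not part of this lab; the function name, the rounding precision, and the sample coordinates are assumptions.

```python
from collections import Counter


def stops_by_location(points, ndigits=5):
    """Count stops that share the same (rounded) latitude/longitude pair."""
    return Counter((round(lat, ndigits), round(lon, ndigits)) for lat, lon in points)


# Made-up coordinates: two stops at one location, one stop elsewhere.
points = [(40.71570, -73.99010), (40.71570, -73.99010), (40.80000, -73.95000)]
counts = stops_by_location(points)
busiest_location, busiest_count = counts.most_common(1)[0]
```

A count column like this could then drive point size or appear in a pop-up, so a user could tell a single stop from a group stop at a glance.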

With this visualization, users can look at everything all at once (e.g. Figure 5), or isolate and analyze particular groups (e.g. Figures 6 and 7) or particular precincts/neighborhoods (by zooming in). As it is, the use of code letters (instead of words that have clear meaning) is not ideal. The individual points also admittedly overwhelm the map a bit when fully zoomed out; however, they become much smaller when zoomed in, and this size seemed to be the best compromise so that they did not become too small or too large when zoomed in or out, respectively.

 

Figure 5

Figure 6

 

Figure 7

 

 

Future Directions

This is a very rich area for further exploration; I have only begun to scratch the surface with this visualization, and there are many potential future directions. In addition to the above-mentioned changes I would make to the labels (i.e. the race and sex codes), it would be interesting to look at other aspects of this data that were removed for this lab, as the examples above begin to do, such as the number of stops during particular time intervals, how often stops resulted in searches, and whether a weapon was found during a search. It might also be worthwhile to create a similar visualization for each year beginning with 2003 for comparative purposes. However, individual points would likely become overwhelming, and therefore less effective, for years with far more stops.

It would also be interesting to complement these kinds of visualizations with others (e.g. graphs) to show trends over time not only in this data (e.g. Figure 8) but also compared with other data (e.g. of violent crime, as in Figure 9).

Figure 8 (from Bump, 2016)

 

Figure 9 (from Bump, 2016). Red = murders; yellow = violent crime; blue = stop-and-frisk

 

References

New York Civil Liberties Union. (2017). Stop-and-frisk data. Retrieved from https://www.nyclu.org/en/stop-and-frisk-data

Bump, P. (2016). The facts about stop-and-frisk in New York City. The Washington Post. Retrieved from https://www.washingtonpost.com/news/the-fix/wp/2016/09/21/it-looks-like-rudy-giuliani-convinced-donald-trump-that-stop-and-frisk-actually-works/?utm_term=.e9078659a1cf