Cycling is great as exercise, leisure, or a mode of transportation. It’s one of the most efficient and environmentally friendly ways to get around, and it offers great health benefits. More and more people in New York City have taken up cycling in the past two decades, according to the city’s 2019 report. About 24% of adult New Yorkers reported riding a bike at least once in 2017, and 53% of those riders (793,000) reported riding regularly, that is, several times a month. The city estimates that about 490,000 cycling trips are made per day.
With these rising numbers, the city has also rapidly increased the number of bike lanes: in the past seven years, NYC DOT has expanded the on-street bike network by more than 330 miles, including more than 82 protected lane miles. So how safe is it to bike around in New York today?
Cycling appears to be generally safe in the city, though incidents (including fatalities) are more frequent than they should be. Where are these incidents occurring and what are the causes? Are there any patterns that can be established to better inform the city’s planning to make cycling safer? This project provides a snapshot of cycling incidents (collisions with pedestrians, vehicles, or other cyclists) in New York from 2012-2021. It aims to uncover insights into how the city could potentially improve bike infrastructure, visualizing different factors that may impact safety.
Data has shown that motorists are significantly more likely to be at fault in crashes with cyclists. This comes as no surprise; we all share the same road, and those of us on bikes are far more vulnerable than those in cars. One obvious approach to improving safety is to build more bike lanes: the city reports that, between 2006 and 2016, only 11% of cyclist fatalities occurred on streets with a bicycle facility (the vast majority occurred on streets with no bike lanes). Many would also agree that a more robust bike infrastructure makes streets safer for everyone: cyclists, drivers, and pedestrians alike. This report is intended for anyone interested in making New York streets safer for all.
To get an initial sense of overall trends in cycling incidents in the city, I used the Motor Vehicle Collisions data from NYC OpenData, which tracks all collisions involving vehicles and bicycles from 2012-2021. I added a column that summed the number of injuries and fatalities involving a cyclist, and plotted the data by type, time, and location (borough and zip code) in Tableau.
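The added column can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow actually used for the project: the sample rows are made up, the column names are assumed to match the Motor Vehicle Collisions export and should be checked against the real file, and `CYCLIST CASUALTIES` is a hypothetical name for the new column.

```python
# Made-up sample rows; column names are assumptions about the export format.
collisions = [
    {"BOROUGH": "BROOKLYN", "ZIP CODE": "11211",
     "NUMBER OF CYCLIST INJURED": 1, "NUMBER OF CYCLIST KILLED": 0},
    {"BOROUGH": "MANHATTAN", "ZIP CODE": "10002",
     "NUMBER OF CYCLIST INJURED": 0, "NUMBER OF CYCLIST KILLED": 1},
    {"BOROUGH": "QUEENS", "ZIP CODE": "11101",
     "NUMBER OF CYCLIST INJURED": "", "NUMBER OF CYCLIST KILLED": 0},
]


def add_cyclist_casualties(rows):
    """Return copies of the rows with a combined injured + killed column."""
    result = []
    for row in rows:
        new_row = dict(row)
        # Blank or missing values are treated as zero.
        injured = int(row.get("NUMBER OF CYCLIST INJURED") or 0)
        killed = int(row.get("NUMBER OF CYCLIST KILLED") or 0)
        new_row["CYCLIST CASUALTIES"] = injured + killed  # hypothetical column
        result.append(new_row)
    return result


# Keep only the collisions in which a cyclist was injured or killed.
cyclist_rows = [
    row for row in add_cyclist_casualties(collisions)
    if row["CYCLIST CASUALTIES"] > 0
]
```

Filtering on the combined column leaves only the cyclist-related collisions, which can then be broken down by type, time, and location.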
This provided a good bird’s eye view, but in order to make this information useful and applicable, I had to drill down further. Using Carto, I mapped out all incidents by their coordinates, to see if any major clusters or hotspots could easily be identified on certain streets or intersections. The sheer number of incidents, however, created an overly cluttered visualization, with too many dots (representing incidents) densely packed on the map. I tried to separate them out by year, but couldn’t identify any clear trends year to year either, at least not at a citywide scale.
To understand the spread of incidents more easily, I grouped them by zip code, summing up the total number per area over 2012-2021. To create zip code polygons for the city, I used this dataset, also from NYC OpenData. A color gradient immediately calls out the “hot” areas with high numbers of incidents.
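As a rough sketch of the grouping step, the per-area totals amount to a sum keyed by zip code. The records and field names below (`zip_code`, `year`, `casualties`) are hypothetical stand-ins for the real data, and skipping records without a zip code is an assumption about how such rows might be handled.

```python
from collections import Counter

# Hypothetical incident records, one per collision involving a cyclist.
incidents = [
    {"zip_code": "11211", "year": 2015, "casualties": 1},
    {"zip_code": "11211", "year": 2019, "casualties": 2},
    {"zip_code": "10002", "year": 2019, "casualties": 1},
    {"zip_code": "", "year": 2020, "casualties": 1},  # zip code missing
    {"zip_code": "11211", "year": 2020, "casualties": 1},
]


def total_by_zip(rows):
    """Sum cyclist casualties per zip code across all years."""
    totals = Counter()
    for row in rows:
        if row["zip_code"]:  # assumption: records with no zip code are skipped
            totals[row["zip_code"]] += row["casualties"]
    return totals


totals = total_by_zip(incidents)
# Areas ordered from most to fewest incidents, ready to join to the polygons.
ranked = totals.most_common()
```

Each total can then be joined to its zip code polygon and mapped to a color gradient.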
I also mapped out current bike lanes to see if the incidents might be occurring in areas lacking infrastructure, and if the number of incidents might differ between types of lanes. The geospatial data on bike lanes comes from this dataset, also from NYC OpenData. The dataset includes the type of “facility” for each lane: protected (physically separated from vehicles), conventional (marked lanes on pavement), or shared (pavement markings of arrows and bicycle signs only, to indicate that the road is shared). I color coded the lanes by type, matching the colors used in the official city map.
I also found a dataset published by NYC DOT that highlights “Priority Bicycle Districts”: 10 neighborhoods the city has prioritized for future bicycle network expansion. The city reports that these districts contain only 14% of the city’s existing bike network but account for 23% of incidents in which cyclists were killed or severely injured (KSI).
Lastly, I uploaded the incident data into Carto’s “Predict Trends and Volatility” tool. This tool predicts whether a future value at a given location will increase, decrease, or remain the same, based on that location’s previous values, which must be equally spaced in time. To prepare the input, I had to transform my original dataset: first I counted total incidents per zip code per year, then I spread each year into its own column (all transformations were done in R).
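The reshaping was done in R; the sketch below shows the same two steps (count, then spread) in Python purely for illustration, and is not the original script. The sample records and the function name `incidents_wide` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (zip code, year) pairs, one per cyclist incident.
records = [
    ("11211", 2019), ("11211", 2019), ("11211", 2020),
    ("10002", 2020), ("10002", 2021),
]


def incidents_wide(rows, years):
    """Count incidents per zip code per year, then give each year a column."""
    counts = defaultdict(lambda: defaultdict(int))
    for zip_code, year in rows:
        counts[zip_code][year] += 1
    # One row per zip code; years with no incidents are filled with 0 so
    # that every row has the same equally spaced columns.
    return {
        zip_code: [by_year.get(year, 0) for year in years]
        for zip_code, by_year in sorted(counts.items())
    }


years = [2019, 2020, 2021]
wide = incidents_wide(records, years)
# wide == {"10002": [0, 1, 1], "11211": [2, 1, 0]}
```

Filling missing years with zero matters here: without it, zip codes with a quiet year would have gaps in the series rather than equally spaced values.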
I tested my maps with two participants, to gauge overall clarity of the visualization and to see how they would interpret the data. Both users commented on how the incidents seemed to be “everywhere,” even on streets and intersections with protected bike lanes, and how they “all look the same” from year to year.
User 1 (Eric) assumed that the incidents are “in more densely populated places,” and was curious to see a map showing population per zip code.
User 2 (Susan) observed that it seemed like there are “more incidents where there are bike lanes,” which surprised her, before she rationalized that “more traffic means more accidents.” She then wondered if the areas with a higher number of incidents are also areas with higher traffic in general, and thought it would be helpful to see a layer with car traffic data.
I was curious about both factors, population and vehicle traffic. Population was easy to plot out; the data was already part of the city’s Zip Code geospatial dataset. Data on vehicle traffic, however, was tricky: I couldn’t find any dataset that tabulated number of vehicles per street or neighborhood, but I figured that, in place of traffic data, I could use the data I already had on vehicular incidents (the same Motor Vehicle Collisions dataset I used for cyclist incidents).
As for the trends map, User 1 (Eric) understood it immediately, especially once he noticed the trends widget. User 2 (Susan), however, found the map unclear, since it didn’t correlate with the original incidents map. More context is definitely needed for the trends map, even if just in the form of an accompanying description. Unfortunately, Carto limits the amount of descriptive text allowed for a map (and the title isn’t large or noticeable enough). Susan understood the trends map once I described it to her, and wondered why the outer areas of Queens and Staten Island were trending up, assuming that Covid must have had an impact since “everyone is biking now.”
The main conclusion to be drawn from all this data is that cyclist safety involves many factors: sheer volume of bike ridership, infrastructure needs based on ridership, and also volume of vehicle traffic in a given area (generally, how busy the streets are). The city’s prioritized areas for future bicycle facilities seem mostly correct in addressing opportunity areas, but the focus is perhaps too much on expansion alone. Resources could also be directed to improving existing infrastructure, especially investigating the impact of converting a shared or conventional lane into a protected lane.
I debated whether to remove the layer on individual incidents, since it was difficult to analyze and interpret, even when separated out into different years. The animation was interesting but not helpful: there was no discernible pattern in clusters of incidents from year to year. Perhaps a zoomed-in view at the street level could reveal patterns for a specific street, but not at the city scale. Both users I spoke with were interested in seeing data by year, even if their takeaway was that the number of incidents didn’t change much over time.
The overarching question then becomes: If the city has already invested so much in building more bike lanes, especially strategically in opportunity areas, then why aren’t these numbers going down? It should be noted that these numbers are absolute and do not represent a percentage of total ridership. This may explain why Manhattan still has so many incidents while its streets seem to all be equipped with bike lanes. Type of bike lane also likely makes a difference; they shouldn’t all be treated the same.
When comparing the number of cycling incidents vs. population vs. number of car accidents, all by zip code, a few correlations emerge. There’s a strong overlap between areas with high cycling incidents and areas with high car accidents; there are only two or three areas where cycling incidents are high while car accidents are relatively low. There’s less of a correlation between the population map and the other two, however. Many midtown Manhattan and central Brooklyn zip codes have high incidents (both cycling and car) but low population, likely because these are business areas with high traffic, where many people commute in for work but few reside.
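The comparison above was made visually, by looking at the three maps side by side. One way to put a number on it would be a Pearson correlation coefficient computed across zip codes. The sketch below is a possible follow-up rather than part of the original analysis, and the per-zip values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented per-zip values, listed in the same zip code order in each list.
cycling_incidents = [120, 340, 90, 410, 60]
car_accidents = [1500, 3900, 1100, 4300, 900]
population = [80000, 30000, 75000, 25000, 60000]

r_cars = pearson(cycling_incidents, car_accidents)
r_population = pearson(cycling_incidents, population)
```

A value near 1 would support the overlap seen between the cycling and car accident maps, while a value near 0 or below for population would match the weaker relationship seen there.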
As mentioned, the trends map doesn’t quite stand on its own; it requires some descriptive or introductory text so the user can better understand how to interpret it. Once explained, however, it’s clear what the color gradient means, and why this map doesn’t correlate directly with the incidents map.
The trends map highlights areas where incidents have been ticking up in recent years (from 2012 to early 2021), so it captures large percentage growth over previous numbers rather than absolute counts. This information seems helpful, but I’m assuming the outer areas of Queens and Staten Island still have lower overall numbers than Manhattan and central Brooklyn, even though trends in those central areas are staying about the same. As User 2 (Susan) mentioned, the upward trend is likely because people in the outer areas have picked up cycling recently, and with higher ridership come more incidents. This might explain why the areas marked as priorities for future bike infrastructure aren’t necessarily the areas that have been trending up in recent years. About half of the marked areas (mostly in central Brooklyn and central Queens) appear to be trending up, while the other half remain the same.
These maps help capture the complexity of the problem: ensuring cycling safety in the city requires taking into account several different factors. One major factor at play is the volume of vehicle traffic, which can be high even in areas with smaller populations. Another is the number of cyclists, and whether current bicycle infrastructure really meets the needs of riders on different streets and intersections. It would be helpful to get accurate information on ridership overall, to understand which bike lanes are busiest and require either conversion to protected lanes or additional lanes on neighboring streets (so one street’s bike lane isn’t overwhelmed with cyclists).
More research is definitely needed on the different types of bike lanes, starting from the assumption that protected bike lanes are the safest, conventional bike lanes come second, and shared lanes perhaps offer no measure of protection at all. With this data in place, the city can weight the lanes differently, rather than applying a blanket count of bike lanes in a given area.
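To make the weighting idea concrete, here is a minimal sketch of a weighted lane-mile score per zip code. The weights are placeholders chosen only to reflect the assumed ordering (protected, then conventional, then shared); they are not derived from any data, and real values would have to come from the research described above. The lane segments are likewise made up.

```python
# Placeholder weights reflecting only the assumed safety ordering.
LANE_WEIGHTS = {"protected": 1.0, "conventional": 0.5, "shared": 0.1}

# Made-up lane segments: (zip code, facility type, length in miles).
lanes = [
    ("11211", "protected", 2.0),
    ("11211", "conventional", 1.0),
    ("10002", "shared", 4.0),
]


def weighted_lane_miles(segments, weights=LANE_WEIGHTS):
    """Sum lane miles per zip code, scaled by the weight of each lane type."""
    scores = {}
    for zip_code, facility, miles in segments:
        scores[zip_code] = scores.get(zip_code, 0.0) + weights[facility] * miles
    return scores


scores = weighted_lane_miles(lanes)
```

With these placeholder weights, the four miles of shared lane in the second area count for much less than the three miles of protected and conventional lane in the first, which a plain count of lane miles would hide.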
Lastly, data on types of injuries would be helpful. This is where the city has already done some research, in order to calculate cyclists KSI (killed or severely injured), which factored into their decision making on the priority areas for planned bike lanes. In the dataset that I worked with, injuries are all equally counted, obscuring any insight on areas with higher incidence of severe injuries (if not fatalities).
- NYC DOT Safer Cycling: Bicycle Ridership and Safety in New York City (2017) (link)
- NYC DOT Cycling in the City (2019) (link)
- Roe, D. & Jackson, D. (2020, Feb. 28). What’s Really Killing New York’s Cyclists. Bicycling. (link)
- Walljasper, J. (2016, Sep.) 10 Ways Bicycle-Friendly Streets Are Good for People Who Don’t Ride Bikes. AARP. (link)