Mapping the Density of NYC Subway Access


Image from article in NYC GO by Brian Sloan, November 17, 2015. Photo taken by Joe Buglewicz

Introduction

As many people will (hopefully) start moving back into NYC and its neighboring boroughs or states next year after long-lasting work-from-home policies, a surge of people looking for new places to live is expected. One decisive factor in choosing housing is transportation, namely the ease of getting from one place to another. With rents alarmingly high in most areas of Manhattan, and now even in Long Island City and Brooklyn, many who work in NYC are looking further out into other boroughs in search of more affordable housing with convenient access to the MTA. Coming across an NYC subway dataset as I myself browsed StreetEasy for housing, I wondered: is there truly a difference in subway access within Manhattan? What about compared with other boroughs? According to a 2019 article in U.S. News, NYC ranked as the best city for public transportation, with over 1.3 million jobs accessible within a 30-minute trip and a majority of people using transit to commute to work (Leins, 2019). If such discrepancies exist among districts despite that ranking, are there districts the city government should expand access to? This visualization of NYC community districts and subway entrances hopes to provide some insight into these questions.

Visualization References

Three different spatial visualizations were referenced in mapping the NYC subway entrance data.

The first reference was a map of the prevalence of pollution particles from an article in QNS, a news platform based in Queens, NY (QNS, 2019). The visualization’s fading of the background map (base map layer) to light grey helped users focus on the geography of interest. Positioning the legend in the top left corner made good use of otherwise unused space (with non-important data ink) while explaining the color coding. However, the visualization could have benefited from labeling pollution levels by community district or neighborhood so that users unfamiliar with NYC geography could identify the names of different areas.

Map of the annual average fine particles, i.e. pollutant, in NYC from a QNS article in May 2019

The second visualization referenced was from an article on NYC’s population density in 6sqft, an online architecture media platform (Colman, 2019). Similar to the previous reference, the visualization’s greying out of the base map allowed users to focus specifically on the red-coloured area. This visualization went a step further by coloring the rest of the base map, including the water, completely grey, which created the visual effect of a grey base panel and helped users understand that the visualization is a 3D model. Although the use of bars allowed viewers to see where density peaks and where it does not, it was difficult to see the entirety of NYC’s districts. Moreover, the color gradient included a tone of grey similar to the base, which was confusing because taller bars outside of NYC were colored the same way.

A map of NYC population density from an article in 6sqft in March 2019

The final visualization referenced was an NYC street tree density map in an article by NYC Parks (NYC Gov, 2015). The visualization’s use of a green gradient scale suited the topic of trees. I thought the visualization could have benefited from using the same color as the NYC Parks logo and the visualization title to unify the overall map. Although the visualization contained a short paragraph instructing users on how to read the map, users could benefit from seeing the numerical ranges that low, medium, and high represent.

Materials

I used CARTO, a web-based Location Intelligence platform that allows for easy mapping, to visualize my dataset. According to its website, the platform allows users to draw meaningful insights from location data for uses ranging from data monetization and catastrophe modeling to supply chain management (CARTO).

Data on NYC community districts were found via the NYC.gov website. Data on subway entrances and subway lines were found via NYC Open Data. Metadata on the community districts file can be found here. NYC Open Data partners with the NYC Mayor’s Office of Data Analytics, the Department of Information Technology and Telecommunications, and other agencies to collect and manage the databases and make them available to the public as free downloads of city-related data, ranging from district geographies to business operations. All three datasets could be downloaded as GeoJSON files, allowing for quick and easy ingestion into CARTO.

Research and Visualization Methodology

I first collected the dataset on NYC subway entrances from NYC Open Data and the community district dataset from the NYC.gov website, then uploaded the GeoJSON zip files directly into CARTO using its upload function. Once that was done, I created a map based on the subway entrances dataset and loaded the community district dataset as a second layer.

To understand the degree of difference among districts, I used the ‘Intersect and Aggregate’ function in CARTO’s analysis tools to count the subway entrances within each district. This allowed for an easy join of the two datasets to create a density map.
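The counting step can be illustrated outside CARTO. The sketch below is not CARTO's implementation; it is a minimal pure-Python version of the same idea (test each entrance point against each district polygon, then count), using a standard ray-casting test. The district names, polygon coordinates, and entrance points are hypothetical toy values, not rows from the NYC datasets.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_entrances_per_district(districts, entrances):
    """Count how many entrance points fall inside each district polygon."""
    counts = {name: 0 for name in districts}
    for x, y in entrances:
        for name, ring in districts.items():
            if point_in_ring(x, y, ring):
                counts[name] += 1
                break  # assume districts do not overlap
    return counts


# Hypothetical toy data: two adjacent square "districts" and four "entrances".
DISTRICTS = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
ENTRANCES = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]  # the last point is outside both

print(count_entrances_per_district(DISTRICTS, ENTRANCES))
```

Districts with no matching points keep a count of zero, which is what lets a choropleth shade them at the bottom of the scale rather than dropping them from the map.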

After seeing the choropleth density map, I wanted to see if there was merit in showing the physical locations of the subway entrances. However, overlaying the subway entrances appeared too messy, and upon peer review and discussion, the consensus was that the overall shape of the subway entrances resembled the NYC subway line map. Thus, I searched NYC Open Data for the subway line dataset, found it, and uploaded it into the map as a new layer. This dataset was also in GeoJSON format, which allowed for an efficient upload with data types detected automatically. Adding the layer automatically rendered the subway lines as lines. This was not only a much cleaner representation of the transportation system than the dots of entrances, but also a dataset that adds detail about transportation access on top of the underlying density map of subway entrances.

I then moved on to styling the map. As the focus of the visualization is only the NYC community districts as provided by the government website, I muted the other areas of the map to a solid color (white). This allows users to focus on the data itself and on the gradient scale in the areas of interest instead of being distracted by the extra data ink of outlying districts. The density of subway entrances was colored based on the hex code of the Metropolitan Transportation Authority (MTA) logo, with tones adjusted so that the steps of the gradient are distinguishable. I chose a single-color scale (light to dark, rather than warm to cold) to maximize accessibility for users with Color Vision Deficiency (CVD). After testing various solid tones, I colored the subway lines dark grey to maximize visibility without being too bold against the underlying choropleth map (i.e. neither completely black nor a complementary color). Borders between the community districts were reduced to the minimum (0.5 pt). Transparency was also adjusted so that users can distinguish the districts without the borders clashing with the subway lines.
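A single-hue, light-to-dark ramp of this kind can be generated by mixing a base color with white in decreasing proportions. The sketch below is one way to do that, not the procedure CARTO uses internally. The base hex value is a placeholder assumption (the article does not state the exact code taken from the MTA logo), and the `lightest` parameter and function names are hypothetical.

```python
def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of ints to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def single_hue_ramp(base_hex, steps, lightest=0.85):
    """Return `steps` colors running from a light tint of base_hex to base_hex."""
    base = hex_to_rgb(base_hex)
    ramp = []
    for i in range(steps):
        # t is the share of white mixed in: `lightest` at the first step, 0 at the last.
        t = lightest * (1 - i / (steps - 1)) if steps > 1 else 0.0
        mixed = tuple(round(c + (255 - c) * t) for c in base)
        ramp.append(rgb_to_hex(mixed))
    return ramp


# Placeholder base color, assumed for illustration only.
BASE_HEX = "#0039a6"
print(single_hue_ramp(BASE_HEX, 5))
```

Because only lightness changes from step to step, the ordering of the classes does not depend on telling hues apart, which is the property the single-color choice relies on for CVD accessibility.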

Metropolitan Transportation Authority (MTA) logo color

Several steps were taken to make additional detail available to users and to assist their navigation of the visualization. In the NYC subway entrance density layer, a hover-over ‘pop-up’ was added to show numerical, non-normalized values for subway access. To let users understand the visualization at a glance instead of hovering over the entire map, I added a legend explaining what the gradient scale indicates. Numerical expressions of the density (shown in exponential notation, on the order of 10^-7) were replaced with ‘less dense’ and ‘more dense’, more accessible labels for users who are not quantitatively oriented. Although a geographic name may have been a helpful addition to the pop-up, I was unable to easily locate a precise mapping of district codes to text descriptions. In Map Options, I enabled the layer selector so that users can toggle off the subway lines should they find them distracting. The layers and the map were also renamed to be more user-friendly.
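The difference between the raw count in the pop-up and the normalized value behind the legend can be made concrete. The sketch below assumes density means entrances divided by district area in square meters; the article does not state the unit CARTO used, so this is an assumption, as are the district names, counts, and areas.

```python
def entrance_density(count, area_m2):
    """Entrances per square meter (unit assumed for illustration)."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return count / area_m2


def legend_endpoints(densities):
    """Map the plain-language legend labels to the lowest and highest density."""
    values = list(densities)
    return {"less dense": min(values), "more dense": max(values)}


# Hypothetical districts: (raw entrance count, area in square meters).
DISTRICT_STATS = {
    "District X": (12, 4.0e6),
    "District Y": (3, 2.5e7),
}

DENSITIES = {
    name: entrance_density(count, area)
    for name, (count, area) in DISTRICT_STATS.items()
}

for name, density in DENSITIES.items():
    print(f"{name}: {density:.1e} entrances per square meter")
print(legend_endpoints(DENSITIES.values()))
```

Values this small are hard to read on a legend, which is one reason to label only the two ends of the scale in words while leaving the raw counts in the pop-up.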

After adjusting the stylistic elements, I published the final map so that it is publicly viewable via a link to the visualization.

Results

The resulting CARTO visualization of the density of NYC subway entrances by community district can be found below:

The map shows that there is, in fact, a difference in ease of transportation access within Manhattan. Although most districts within Manhattan are denser than those in Brooklyn and LIC, as shown by the darker blue hues, some areas near the Upper East Side and East Harlem have less access to subways. The leading reason could be that fewer subway lines are available in those districts, a phenomenon one can see when overlaying the map of the NYC subway lines on top of the density map.

A wider discrepancy in subway access exists when comparing Manhattan to other boroughs. The Staten Island Railway (SIR) is the only line serving Staten Island, which explains the lower density of subway access there. In Brooklyn, there is more access to subway lines in areas close to Manhattan. Generally, the further a district is from Manhattan, the lower the concentration of lines and thus of entrances.

Reflection

In reviewing the CARTO map, it’s interesting to circle back to the main question of whether there are areas the city government should focus on in expanding access to transportation. I believe the answer is a strong yes. Additional analysis of other MTA services and/or bike rental stations could prove helpful in determining the degree to which access to transportation is lacking. However, the initial map of NYC subway entrance density shows the need to better include outer-borough areas (e.g. deep Brooklyn, LIC, the Bronx) and parts of Manhattan (e.g. East Harlem, the Upper East Side). An improved subway infrastructure/architecture can allow people to commute efficiently from a wider range of neighborhoods and open up potential for economic development in previously hard-to-reach communities.

In reflecting upon the usability of the visualization, I believe it could have benefited from further cleaning of the data so that districts with no subway entrances because no subway line reaches them at all could be removed from the analysis. This would allow for a more accurate scaling of the choropleth map, distinguishing an area in which subway lines exist yet there are 0-1 entrances from one that has no line at all. Moreover, had a different platform been used, the map could have been made much more interactive, so that users could filter views by subway line or district and drill down into a district to see more granular details such as the population, the historical trend of new line construction, etc.
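The cleaning step described above could look something like the following. This is a sketch of one possible approach, not something that was done for the published map: districts with no subway line are set aside as their own category, and equal-interval classes are computed only over the districts a line actually reaches. All district names, counts, and the `served` set are hypothetical.

```python
def classify_served_districts(counts, served, classes=4):
    """Assign a class index (0 = lightest) to served districts; None to unserved ones.

    counts: dict of district -> entrance count
    served: set of districts reached by at least one subway line
    """
    values = [counts[d] for d in counts if d in served]
    lo, hi = min(values), max(values)
    width = ((hi - lo) / classes) or 1  # avoid dividing by zero if all values match
    result = {}
    for district, count in counts.items():
        if district not in served:
            result[district] = None  # drawn separately, e.g. as "no subway line"
        else:
            result[district] = min(int((count - lo) / width), classes - 1)
    return result


# Hypothetical data: "A" has no line; "B" has a line but no entrances inside it.
COUNTS = {"A": 0, "B": 0, "C": 1, "D": 8, "E": 40}
SERVED = {"B", "C", "D", "E"}

print(classify_served_districts(COUNTS, SERVED))
```

In this toy example districts A and B both have zero entrances, but only B is placed on the color scale, which is the distinction the paragraph above is after.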

In the future, I think this dataset and map can be very useful when overlaid with other NYC datasets. For instance, the pollutant particle dataset from the QNS article could be overlaid with this map to show whether there is a relationship between the level of pollution and the density of subway entrances. Is there a higher concentration of particles in areas with more public transportation? Or is it only related to transportation better known for producing methane, such as buses and cabs? The subway density map could also be overlaid with the population of NYC residents or the number of operating businesses by community district. Is accessibility positively or negatively correlated with how many people reside in an area? How about with the number of businesses? Is it more strongly related to where people work than to where they live? Various multidisciplinary questions and insights into city transportation can arise upon further examination of this visualization.

Sources

Calgary, O. (n.d.). Community Districts. Retrieved November 10, 2020, from https://data.cityofnewyork.us/City-Government/Community-Districts/yfnk-k7r4

Calgary, O. (n.d.). Subway Entrances. Retrieved November 10, 2020, from https://data.cityofnewyork.us/Transportation/Subway-Entrances/drex-xx56

Calgary, O. (n.d.). Subway Lines. Retrieved November 10, 2020, from https://data.cityofnewyork.us/Transportation/Subway-Lines/3qz8-muuu

Colman, M. S. (2019, March 25). See how NYC’s urban density stacks up against other major cities. Retrieved November 10, 2020, from https://www.6sqft.com/see-how-nycs-urban-density-stacks-up-against-other-major-cities/

Leins, C. (2019, April 3). These Are the 10 Best Cities for Public Transportation. Retrieved November 10, 2020, from https://www.usnews.com/news/cities/slideshows/10-best-cities-for-transportation?slide=10

Parry, B. (2019, May 01). City’s air is getting cleaner, but Sunnyside and Woodside are experiencing more pollution, report finds. Retrieved November 10, 2020, from https://qns.com/2019/05/new-study-shows-sunnyside-woodside-have-highest-level-of-air-pollutants-in-borough/

Sloan, B. (2017, September 12). 15 Secret Subway Tips. Retrieved November 10, 2020, from https://www.nycgo.com/articles/15-secret-subway-tips

TreesCount! 2015-2016 Street Tree Census. (2016). Retrieved November 10, 2020, from https://www.nycgovparks.org/trees/treescount