Mapping NYC Residential Construction: Racially-Correlated Spatial Trends in Current Building Densities and Remaining Room for Growth



Imagine a city where everybody could construct a building as big as they wanted. Mammoth constructions would blend one into another. The city would be stuffy. Not a healthy urban environment. Now imagine a second city where government officials regulate the shape and size of every building. There’s lots of room for the sunshine and breeze to pass, but every building looks the same in order to comply with tight regulations. This monotony is also not ideal.

Seeking to strike a balance between these extremes, urban planners have employed a creative metric: the floor area ratio (FAR). By regulating volume instead of shape, FAR elegantly enables city authorities to create pleasant urban environments without prescribing the specifics of a building’s aesthetics. FAR is most easily understood by example. Take a perfectly square tax lot and a building where all the floors are the same size. A one-story building that takes up the full extent of this tax lot, a two-story building that takes up half the lot, and a four-story building that takes up a quarter of the lot all have the same FAR. Defined as the sum of all the floor areas in a building divided by the total area of the tax lot that the building is constructed on, FAR presents architects with creative license and a simple tradeoff: to build higher you must build thinner.
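The worked example above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `floor_area_ratio` and the 10,000 square foot lot are made up for the example and do not come from the project notebook.

```python
def floor_area_ratio(floor_areas, lot_area):
    """Sum of all floor areas divided by the area of the tax lot."""
    if lot_area <= 0:
        raise ValueError("lot_area must be positive")
    return sum(floor_areas) / lot_area


LOT_AREA = 10_000.0  # hypothetical square tax lot, in square feet

# One story covering the full lot, two stories covering half of it,
# and four stories covering a quarter of it.
one_story = floor_area_ratio([LOT_AREA], LOT_AREA)
two_story = floor_area_ratio([LOT_AREA / 2] * 2, LOT_AREA)
four_story = floor_area_ratio([LOT_AREA / 4] * 4, LOT_AREA)

print(one_story, two_story, four_story)  # prints 1.0 1.0 1.0
```

All three buildings come out at a FAR of 1.0, which is the "build higher, build thinner" tradeoff in numbers.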

In many cities across the US, officials set a maximum allowable FAR in each type of land use zone. Unless granted an exception, architects must make sure a building’s FAR complies. In this project, I analyze how the FAR of each of the ~800,000 residential buildings in NYC compares to the maximum allowable FAR of its tax lot. I combine maps of the allowable and utilized FAR with race data from the Census Bureau’s American Community Survey (ACS) and examine whether there are any racially-correlated spatial patterns.


In particular, I’m interested in examining racially-correlated spatial patterns related to the following factors:

  1. Neighborhoods with many high-volume buildings (i.e., high built FARs) that have the potential to create a crowded and unpleasant urban environment
  2. Neighborhoods where there is limited room for building expansion (i.e., a high percentage of the allowable FAR has already been used)

The interaction of these factors is also important. For example, non-white neighborhoods that have already used a large percentage of the allowable FAR but still have low built FARs overall are potentially being artificially stunted by inappropriate FAR maximums. Given municipal zoning’s roots in segregationist movements, it is important to flag these neighborhoods for further examination. In this project I build a series of maps that help do this.

For the analysis, I used Python’s GeoPandas package to combine and analyze data from two sources:

  1. MapPluto: This is a dataset published by the NYC government containing data for each tax lot in the city. I downloaded the file as a geodatabase that associates each tax lot polygon with its attributes (e.g., assigned land use, lot area, allowable FAR, built FAR). I used these attributes to subset out the residential tax lots and calculate both the built residential FAR (i.e., the FAR of the buildings in a tax lot built for residential purposes) and the percent of the allowable residential FAR used. It is important to note that the percent of allowable FAR used can exceed 100% without indicating an underlying zoning violation. This happens because the department that grants FAR exceptions (i.e., allows builders to exceed the maximum allowable FAR) is different from the department that sets the initial allowable FARs, and MapPluto only reports the allowable FAR before exceptions. I tested whether treating these over-100% lots as having used exactly 100% of their FAR (my best guess, since there is no way of knowing the true allowable FAR) made a difference to the analysis, and then discarded them. The other cases I discarded, along with my analysis rationale and my methodology for aggregating the Pluto data from tax lots to census tracts, are outlined in the Markdown of my Jupyter Notebook.
  2. Census Bureau’s American Community Survey: I use the most recent 5-year estimates, which are from 2018. (The 5-year estimates that include the 2019 data are still pending from the Census Bureau, so this is the most recent high-resolution data available.) The 5-year estimates are published down to the block group level, but I pulled the data at the tract level (one level above block group) because I found block groups too small to discern cross-borough trends in the map. I used the censusdata package, a third-party Python wrapper for the Census Bureau’s official API, to pull the data. The relevant race table that I pulled tract-level results for is B02001. I used this data to calculate whether each census tract is primarily white or non-white, and then merged it with NYC census tract shapefiles from the NYC Planning Department’s Open Data portal. I chose the shapefile with a clipped shoreline (i.e., the tract polygons stop at the shoreline) because this improves the readability of the map.
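The two data preparation steps above can be sketched in plain Python. This is not the notebook's code. The field names (`res_floor_area`, `allowable_res_far`, and so on) and the sample records are hypothetical, the unweighted per-tract mean is an assumption about how lots are aggregated, and "primarily white" is assumed to mean that the white-alone count is more than half of the tract total (to my understanding, variables B02001_002E and B02001_001E in the ACS table).

```python
from collections import defaultdict

# Hypothetical tax lot records; field names are illustrative stand-ins.
lots = [
    {"tract": "A", "res_floor_area": 2000.0, "lot_area": 4000.0, "allowable_res_far": 1.0},
    {"tract": "A", "res_floor_area": 3000.0, "lot_area": 2000.0, "allowable_res_far": 2.0},
    {"tract": "B", "res_floor_area": 5000.0, "lot_area": 2000.0, "allowable_res_far": 2.0},
    {"tract": "B", "res_floor_area": 1000.0, "lot_area": 2000.0, "allowable_res_far": 1.0},
]

# Hypothetical tract-level race counts.
race_counts = {
    "A": {"total": 1000, "white_alone": 300},
    "B": {"total": 800, "white_alone": 600},
}


def lot_metrics(lot):
    """Return (built residential FAR, percent of allowable FAR used) for one lot."""
    built_far = lot["res_floor_area"] / lot["lot_area"]
    pct_used = 100.0 * built_far / lot["allowable_res_far"]
    return built_far, pct_used


def aggregate_by_tract(lots):
    """Average the lot metrics per tract, skipping unusable and over-100% lots."""
    grouped = defaultdict(list)
    for lot in lots:
        if lot["lot_area"] <= 0 or lot["allowable_res_far"] <= 0:
            continue  # cannot compute a ratio for these lots
        built_far, pct_used = lot_metrics(lot)
        if pct_used > 100.0:
            continue  # discard lots that exceed the pre-exception allowable FAR
        grouped[lot["tract"]].append((built_far, pct_used))
    return {
        tract: {
            "mean_built_far": sum(b for b, _ in vals) / len(vals),
            "mean_pct_used": sum(p for _, p in vals) / len(vals),
        }
        for tract, vals in grouped.items()
    }


def classify_tract(total, white_alone):
    """Label a tract 'white' or 'non-white' by majority share (assumed rule)."""
    if total == 0:
        return None
    return "white" if white_alone / total > 0.5 else "non-white"


# Merge the FAR summary with the race classification on the tract key.
summary = aggregate_by_tract(lots)
for tract, row in summary.items():
    counts = race_counts[tract]
    row["majority"] = classify_tract(counts["total"], counts["white_alone"])

for tract, row in sorted(summary.items()):
    print(tract, row)
```

In the real pipeline the same steps would operate on GeoPandas data frames and tract polygons rather than dictionaries, but the arithmetic and the filtering logic would be the same under these assumptions.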

For the actual map making I used CARTO Builder, a tool that facilitates the creation of customizable maps in the browser. To simplify my data pipeline, I also used the Python CARTOframes package and a CARTO API key to publish the cleaned and analyzed data directly to my CARTO account from the Jupyter notebook. Once published, the datasets could be added to a CARTO map in the browser just like any manually loaded dataset.


I created two final maps, and two intermediate maps:

Final maps

  • Map 1: Dual-layer map with semi-transparent grayscale highlighting the relationship between built FAR and race across NYC
  • Map 2: Dual-layer map with semi-transparent grayscale highlighting the relationship between percent of allowable FAR used and race across NYC

Intermediate maps

  • Map 1: Single-layer map visualizing built FAR across NYC in a non-grayscale color scheme
  • Map 2: Single-layer map visualizing percent of allowable FAR used across NYC in a non-grayscale color scheme

I made a number of deliberate design decisions when making these maps:

  1. Visualizing non-race layer as semi-transparent grayscale: I chose to capture the interaction between race and the FAR trends by plotting race as a categorical hue and then plotting the continuous FAR variables on top as a semi-transparent grayscale. This has the effect of lightening the underlying hue and showing how the two layers interact. I could have instead plotted the FAR layer as some sort of point geometry sized according to its value (e.g., plotting the average built FAR in a census tract as a circle at the center of the tract). I opted to keep everything represented by color, though, because studies of pre-attentive processing find that encoding data with multiple visual attributes slows cognition significantly.
  2. Binarizing race categories: I chose to bin the ACS responses into non-white and white categories. While this does discard potentially valuable information about non-white populations, it also makes discriminatory trends much easier to spot, especially with the grayscale overlaid. Ideally, I would use the binarized map to identify potentially problematic trends in non-white populations, and then create a more granular map of non-binarized races in only these areas to investigate further.
  3. Selecting colors that are distinguishable even for those with colorblindness: It was important to me that these maps be accessible to as many people as possible. Drawing from this site, I tried to choose colors to represent non-white and white populations that would be distinguishable for someone with colorblindness. This was tricky because I also needed the colors to both be dark in order for the semi-transparent grayscale to work as a second layer. A dark brown and blue worked well though, especially when lightened by the grayscale layer against the black background.
  4. Dark basemap: CARTO Builder defaults to a light basemap with administrative boundaries, but I found this distracting for my purposes. The maps I’ve created are designed to show coarse cross-borough trends, not local patterns that rely on roads, parks, or topography for contextualization. I found the dark map to be effectively minimalistic. I thought it provided enough contrast to easily distinguish the ocean, the spatial element most important to discerning the NYC boroughs, while also making the trends pop.
  5. Reversing the colormap: The directionality of the grayscale colormap in CARTO seems to be optimized for the light basemaps. I reversed this, setting white as the upper bound of the colormap. With the dark basemap this makes the less interesting areas (e.g., low built FAR or low percentage of FAR used) fade into the background and the more interesting areas pop.
  6. Binning colors using quantiles: I chose to bin colors using quantiles, rather than linearly or using Jenks, because I found it to be a helpful middle ground for handling outliers. The cleaned FAR data is slightly left skewed. While I don’t want to emphasize the outliers significantly, as Jenks would do, I do want to maximize the grayscale distinction where the majority of the data is. I found quantiles to be an effective mechanism to do this.
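The effect of the last two design decisions can be illustrated outside of CARTO with a small standard-library sketch. It is not how CARTO Builder computes its bins internally. The helper names and the sample values (made-up numbers with one extreme outlier) are assumptions chosen to show why quantile bins spread the grayscale across the bulk of the data while equal-width ("linear") bins do not.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def quantile_breaks(values, n_bins):
    """Interior cut points that split the values into roughly equal-count bins."""
    return quantiles(values, n=n_bins, method="inclusive")


def equal_interval_breaks(values, n_bins):
    """Interior cut points that split the value range into equal-width bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


def assign_bin(value, breaks):
    """Index of the bin a value falls into, from 0 up to len(breaks)."""
    return bisect_right(breaks, value)


def gray_ramp(n_bins):
    """Hex grays running from black (lowest bin) to white (highest bin)."""
    levels = [round(255 * i / (n_bins - 1)) for i in range(n_bins)]
    return [f"#{v:02x}{v:02x}{v:02x}" for v in levels]


# Hypothetical tract-level FAR values with a single outlier.
sample_far = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 1.5, 9.0]
N_BINS = 5

q_breaks = quantile_breaks(sample_far, N_BINS)
e_breaks = equal_interval_breaks(sample_far, N_BINS)

q_counts = Counter(assign_bin(v, q_breaks) for v in sample_far)
e_counts = Counter(assign_bin(v, e_breaks) for v in sample_far)

ramp = gray_ramp(N_BINS)  # white is the upper bound, suited to a dark basemap
for b in range(N_BINS):
    print(b, ramp[b], "quantile:", q_counts[b], "equal-width:", e_counts[b])
```

With this sample, the quantile scheme places two values in each of the five bins, while the equal-width scheme puts nine of the ten values in the darkest bin and leaves the middle grays unused.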

While a full analysis of the outlined research questions is beyond the scope of this project, I am optimistic about the utility of these maps for sparking investigation into discriminatory trends in residential FARs and current building densities. A brief analysis reveals a few interesting observations:

  1. The eastern half of Queens does not have a high built FAR (i.e., is not built up very densely), but is composed of tax lots that have utilized a high percentage of their permitted FAR. This suggests that the allowable FAR of eastern Queens may be set prohibitively low and could be increased without creating a building density that would block light and air. An unnecessarily tight limit on building expansion can impede economic and housing development, and in eastern Queens it would primarily impact non-white neighborhoods.
  2. The central portion of the Bronx faces a different issue. This predominantly non-white area has a high average built FAR (comparable to the densest parts of downtown Manhattan) and a low percent of allowable FAR used. This suggests that the city has potentially set the allowable FAR in this area too high, putting the area at risk of unhealthily dense conditions. Notably, the southeastern tip of the Bronx, the one predominantly white area in this borough, appears to have a lower allowable FAR. This lower FAR would enforce less dense, more livable conditions.

These observations need further investigation before any firm conclusions can be drawn.
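As a starting point for that investigation, the two patterns could be flagged programmatically rather than by eye. The sketch below is hypothetical: the function `flag_tract`, the cutoffs of 2.0 for built FAR and 75% for FAR used, and the example tracts are all assumptions for illustration, not values derived from the MapPluto data.

```python
def flag_tract(mean_built_far, mean_pct_used, majority,
               far_cutoff=2.0, pct_cutoff=75.0):
    """Flag non-white tracts that match one of the two patterns of concern.

    The cutoffs are arbitrary placeholders; real values would need to be
    chosen from the distribution of the data.
    """
    if majority != "non-white":
        return None
    dense = mean_built_far >= far_cutoff
    nearly_full = mean_pct_used >= pct_cutoff
    if not dense and nearly_full:
        # Low density, but little room left to build.
        return "allowable FAR possibly too low"
    if dense and not nearly_full:
        # Already dense, with a lot of room left to build.
        return "allowable FAR possibly too high"
    return None


# Hypothetical tract summaries: (label, mean built FAR, mean % used, majority).
example_tracts = [
    ("low density, nearly full", 0.6, 90.0, "non-white"),
    ("high density, room left", 3.5, 40.0, "non-white"),
    ("low density, room left", 0.6, 40.0, "non-white"),
    ("white tract", 0.6, 90.0, "white"),
]

flags = {
    label: flag_tract(far, pct, majority)
    for label, far, pct, majority in example_tracts
}
for label, flag in flags.items():
    print(label, "->", flag)
```

A flag from a rule like this would only mark a tract for closer review; it would not by itself show that a FAR maximum is inappropriate.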


I found CARTO Builder to have an intuitive interface and helpful customizability, and I appreciated how simple CARTO made it to build a data pipeline from Python. I did wish the legend were easier to adjust, either in the interface or via CartoCSS. Unfortunately, there did not seem to be a way to adjust the dimensions of the legend or the size of the font, which limits the explanatory text that can be included.

I’m curious to explore the trends identified in this project and in the MapPluto dataset further. I believe that my analysis could benefit significantly from field research. I have not spent a significant amount of time in New York City, so my understanding of the patterns in the data is contextually limited. If I were to take this analysis further, I also think it would be important to resolve some of the data anomalies in the MapPluto dataset more rigorously. This would require input from the NYC departments directly. I hope to have the opportunity to explore this further.