For this mapping lab, I chose to work with data on energy use: a 2010 dataset on natural gas use in New York City by zip code, broken down by building class and including the gas provider for each zip code. I sourced this dataset through NYC OpenData. It was provided by the Mayor’s Office of Climate and Sustainability. I wanted to create a visualization that showed how natural gas use changed depending on building class, across the categories of Commercial, Residential, Industrial, and Institutional.
Knowing that the locational data included in this dataset was zip codes, I decided to make a choropleth map. To prepare for the design of the visualization, I decided to look up examples of choropleth maps showing energy use in New York City. First, I found one created by Modi Research Group’s New York City Energy Mapping Project (Estimated Total Annual Building Energy Consumption at the Block and Lot Level for New York City, n.d.), through Columbia University, which presents “total annual building energy consumption at the block level and at the taxlot level for New York City, and is expressed in kilowatt hours (kWh) per square meter of land area” (Mapping New York City Energy Consumption | Quadracci Sustainable Engineering Lab @ Columbia University, n.d.). It’s an interactive and highly detailed feature, and includes data for all boroughs.
The second inspiration I located was the NYC Energy & Water Performance Map, created by the Mayor’s Office of Sustainability. Using six years of data collected by the city, “the NYC Energy & Water Performance Map provides an interactive data analysis and query platform to better understand the energy and water efficiency of more than 20,000 of the largest buildings across New York’s five boroughs” (NYC Energy & Water Performance Map, n.d.). Again, this is interactive and detailed, showing building-by-building data, an even further refinement of the by-block data presented in the previous example.
To edit and clean up the dataset I downloaded from NYC OpenData, I used OpenRefine, a free, open-source tool for cleaning and editing large datasets efficiently.
I created my visualization using Tableau Public’s mapping functions, following Tableau’s instructions for creating choropleth maps (Mapping in Tableau, n.d.). It was great to work with Tableau again, which we used for Lab 2, and to build further on previously learned skills. Tableau Public is also available for free, although it is proprietary rather than open-source software.
Method and Process
Before beginning work in Tableau Public, I loaded the CSV file I downloaded from NYC OpenData into OpenRefine to clean it up. I first had to split the zip code, latitude, and longitude into separate columns, as they had been dropped into one column without spaces, which would have blocked Tableau from recognizing the zip codes. After splitting this one column into three (zip code, latitude, and longitude), I trimmed the leading and trailing white space from all columns and used text faceting to make sure building types and utility sources were properly consolidated. I then exported the edits to a new CSV, which I loaded into Tableau Public.
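For readers who would rather script this cleanup than use OpenRefine, the split-and-trim step can be sketched in Python. This is only an illustration, not what I ran for the lab: the column name "Zip Code" and the cell layout (a five-digit zip code followed by "(latitude,longitude)") are assumptions about the export, and the real file may be laid out differently.

```python
import re

# ASSUMPTION: the combined cell looks like a 5-digit zip code followed by
# "(lat,lon)". The pattern tolerates optional whitespace around each part.
COMBINED = re.compile(
    r"^\s*(\d{5})\s*\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def split_location(cell):
    """Split one combined cell into (zip_code, latitude, longitude)."""
    match = COMBINED.match(cell)
    if match is None:
        raise ValueError(f"unrecognized location cell: {cell!r}")
    zip_code, lat, lon = match.groups()
    # The zip code stays a string so it is treated as a label, not a number.
    return zip_code, float(lat), float(lon)


def clean_row(row):
    """Trim whitespace from every key and value, then split the location.

    ASSUMPTION: `row` is a dict of strings (as produced by csv.DictReader)
    with the combined location stored under a hypothetical "Zip Code" column.
    """
    cleaned = {key.strip(): value.strip() for key, value in row.items()}
    zip_code, lat, lon = split_location(cleaned.pop("Zip Code"))
    cleaned.update({"Zip Code": zip_code, "Latitude": lat, "Longitude": lon})
    return cleaned
```

Applying clean_row to each row from csv.DictReader and writing the results back out with csv.DictWriter would give a file with separate zip code, latitude, and longitude columns, similar to the one I exported from OpenRefine.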
I started by loading the zip code data into the map function, and then set color to reflect the amount of energy used in each zip code in gigajoules (GJ), across all building classes. I then duplicated this map and filtered for each main category of building class to see how the map changed. In the original dataset, residential-class buildings were further divided into Small Residential, Large Residential, and plain Residential. As I didn’t know what criteria were used during data collection to distinguish between small, large, and regular residential buildings, I decided to group all three into a single category of Residential.
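The grouping and filtering described here were done in Tableau, but the underlying logic is simple enough to sketch in Python. The label spellings below follow this report and are assumptions about the CSV; the function names are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: these spellings match the building class labels in the CSV.
RESIDENTIAL_LABELS = {"Small Residential", "Large Residential", "Residential"}


def normalize_class(label):
    """Fold the three residential labels into a single 'Residential' class."""
    return "Residential" if label in RESIDENTIAL_LABELS else label


def total_by_zip(rows, building_class=None):
    """Sum consumption in GJ per zip code.

    `rows` is an iterable of (zip_code, building_class_label, gigajoules).
    With building_class=None every class is counted (the unfiltered map);
    otherwise only rows whose merged class matches are counted.
    """
    totals = defaultdict(float)
    for zip_code, label, gigajoules in rows:
        if building_class is None or normalize_class(label) == building_class:
            totals[zip_code] += gigajoules
    return dict(totals)
```

Calling total_by_zip(rows) corresponds to the unfiltered map, while total_by_zip(rows, "Residential") corresponds to the map filtered for the merged Residential category.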
I also created a map that depicted the natural gas provider by zip code, but since there are only two providers, ConEd and National Grid, I didn’t feel there was much information conveyed by this visualization.
After these initial explorations with the data, I decided I wanted my final visualization to be an interactive map showing natural gas use by zip code, where the viewer could filter the map by building class without the need for multiple maps within one visualization. To do this, I created a pie chart that could serve as the filter, and which could also show at a glance how much natural gas was used by each building class category.
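Tableau computes the pie chart slices automatically, but the arithmetic behind them is just each class total divided by the overall total. A minimal sketch in Python, with a hypothetical function name and no real figures from the dataset:

```python
def class_shares(totals_by_class):
    """Return each building class's share of total consumption (0.0 to 1.0).

    `totals_by_class` maps a class name to its total consumption in GJ.
    """
    grand_total = sum(totals_by_class.values())
    if grand_total == 0:
        # Avoid dividing by zero when there is no consumption at all.
        return {name: 0.0 for name in totals_by_class}
    return {name: value / grand_total for name, value in totals_by_class.items()}
```

Multiplying each share by 360 gives the angle of the corresponding slice in degrees.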
The finished visualization is a straightforward single map, with a small accompanying pie chart that serves as a filter. I used a temperature-style diverging color scheme for the map, with red/orange showing the highest gas usage and green showing lower use. This was influenced by choropleth maps depicting energy use, such as Columbia’s map described above, which use the same scheme and so make it familiar to viewers. I left the pie chart in gray scale so that it would not compete visually with the map. I used simple sans serif fonts to keep the visual presentation straightforward and the focus on the map itself.
I am happy with the resulting visualization, which does a good job of showing the differing natural gas use across various building classes throughout the boroughs. By allowing the user to click through the filters, or remove the building class filters, it’s easy to see the changes occur.
By far, the building class that uses the most natural gas is residential. I’m not sure whether this is because commercial, industrial, and institutional buildings do not include gas lines among their utility needs, or because there are simply many more residential buildings accounting for the higher use. Some zip codes showed as clear outliers even when no filters were applied. The clearest example is 10314, in Staten Island, the only zip code that shows a value in the red in the unfiltered map; applying the filters shows that this is accounted for by the commercial buildings in that zone. Another outlier is 11434, in Queens, which is orange in the unfiltered map but a clear red zone when filtered for residential buildings. These examples demonstrate the utility of being able to easily filter this data using my visualization.
I was pleased with my resulting visualization for this lab, and with how easy it was to create a choropleth map in Tableau. In the future, I would like to work with a dataset that includes latitude/longitude beyond the average location within a zip code, to have the opportunity to work with more precise data such as that used in my two inspirations, and possibly to explore how to make a proportional symbol or point distribution map. Overall, the only downside to my lab was the simplicity of the dataset I selected. While it made it easy to create the visualization I ended up with, it also meant there wasn’t much more I could do with the data afterwards.
Estimated Total Annual Building Energy Consumption at the Block and Lot Level for New York City. (n.d.). Retrieved April 15, 2022, from https://qsel.columbia.edu/nycenergy/
Mapping in Tableau. (n.d.). Retrieved April 15, 2022, from https://help.tableau.com/current/pro/desktop/en-us/maps_build.htm
Mapping New York City Energy Consumption | Quadracci Sustainable Engineering Lab @ Columbia University. (n.d.). Retrieved April 15, 2022, from https://qsel.columbia.edu/project-nyc-energy-mapping/
NYC Energy & Water Performance Map. (n.d.). Retrieved April 15, 2022, from http://energy.cusp.nyu.edu/