ENERGY CONSUMPTION OF NEW YORK CITY HOUSING AUTHORITY (NYCHA) DEVELOPMENTS, 2010-2017
In a city that never sleeps, all those flashing lights and air-conditioned buildings are bound to add up to one large bill.
I wanted to analyze NYC’s energy consumption and found a dataset, made available by NYC Open Data, detailing the energy and power consumption of the New York City Housing Authority (NYCHA) developments located across the five boroughs.
NYCHA developments are residences provided for low- and moderate-income New Yorkers by the NYCHA, which is the largest public housing authority in the country. The NYCHA currently owns 326 developments and provides subsidized rental assistance to private homes through the Section 8 housing program.
I compiled my findings into a dashboard that can be found on my Tableau page and within this report.
Data: CSV file titled, ‘Electric Consumption And Cost (2010 – March 2019),’ which was produced by the NYCHA and updated in May 2019.
To clean the data: OpenRefine, which is an open source program created for data cleanup and transformation (a.k.a. data wrangling).
To visualize the data: The desktop version of Tableau (2019.2.1) to create my graphs and Tableau Public to publish them online. Tableau is a modular GUI program that allows you to drag and drop different categories (based on your imported dataset) into columns and rows, as well as apply various filters, colors, and calculations to your data.
I did not find any particularly inspiring visualizations to base my graphs off of, but the majority I came across were bar graphs or stacked shapes that demonstrated a type of progression over discrete or continuous time frames.
I began by evaluating which dataset categories would be useful for parsing out information. The categories and their descriptions can be found in an Excel file called ‘Data Dictionary Electric Consumption Cost,’ which is linked on the same page as the dataset.
I imagined having a single comprehensive graph surrounded by smaller graphs that each focused on specific categories. For the comprehensive graph, I chose to focus on the five boroughs and their energy consumption (kWh) over time (years and months).
An issue from the beginning was the inconsistent coverage of the values recorded from 2018 to 2019. The dataset jumps from June 2018 to January 2019, which makes 2017 the last year to be fully recorded. Due to this gap, I chose to filter my visualizations down to the data recorded from 2010 through 2017.
I also decided to filter out locations whose borough was listed as “NON DEVELOPMENT FACILITY,” because they were typically NYCHA offices, not residences.
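Outside Tableau, the same two filters could be sketched in Python. This is only a minimal illustration of the logic, not the Tableau filter itself: the field names ("borough", "year", "kwh") and the sample rows are made up for the example and are not the dataset's real headers or values.

```python
# Made-up sample rows; field names are assumptions for illustration only.
rows = [
    {"borough": "BROOKLYN", "year": 2012, "kwh": 1000},
    {"borough": "NON DEVELOPMENT FACILITY", "year": 2012, "kwh": 500},
    {"borough": "BRONX", "year": 2018, "kwh": 800},
    {"borough": "QUEENS", "year": 2017, "kwh": 700},
]


def keep_row(row, first_year=2010, last_year=2017):
    """Return True for residential rows inside the fully recorded years."""
    in_range = first_year <= row["year"] <= last_year
    is_residence = row["borough"] != "NON DEVELOPMENT FACILITY"
    return in_range and is_residence


# Keep only rows that pass both filters.
filtered = [row for row in rows if keep_row(row)]
print([row["borough"] for row in filtered])
```

Keeping both conditions in one small function makes it easy to widen the year range later if the 2018 gap is ever filled in.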
I created my largest visualization by applying boroughs as my main row values. The years were further drilled down into months, which is how each bar is separated. Originally, I graphed the values as lines; however, I changed the look in favor of bars so that the differences between each month would be more evident and I could apply color to further emphasize them. The original dataset had the revenue date (the year the energy bill was issued) formatted as “3/1/2019 0:00”, which I updated to “2019-03-01T12:00:00Z” in OpenRefine. Afterwards, I split the column into year, month, and date/time columns. I ended up removing the date/time column completely since there were no actual times recorded.
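The date cleanup was done in OpenRefine, but an equivalent step can be sketched in Python for comparison. This is an assumption-laden sketch, not the OpenRefine transformation: the function name `split_revenue_date` is hypothetical, and it keeps only the date portion since no real times were recorded.

```python
from datetime import datetime


def split_revenue_date(raw):
    """Parse a value like '3/1/2019 0:00' and split it into date parts."""
    # %m, %d and %H accept values without leading zeros when parsing.
    parsed = datetime.strptime(raw, "%m/%d/%Y %H:%M")
    return {
        "iso_date": parsed.date().isoformat(),  # time portion is dropped
        "year": parsed.year,
        "month": parsed.month,
    }


parts = split_revenue_date("3/1/2019 0:00")
print(parts)
```

Splitting year and month into their own fields is what allows a chart to drill down from years to months.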
I chose to apply color to the bars based on their kWh value on the range from 0 to the maximum, which was over 48M. Tableau automatically selected the minimum value for the opposite end, but I changed it to the value zero so that there could be more depth to the colors on the graph.
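To see what changing the start of the color range does, here is a small Python sketch of how a value can be mapped to a position on a sequential color ramp. It illustrates the general idea of linear scaling and is not Tableau's internal implementation; the function name `color_position` and the three kWh values are made up for the example.

```python
def color_position(value, range_start, range_end):
    """Map a value to a 0.0-1.0 position along a sequential color ramp."""
    if range_end == range_start:
        return 0.0
    fraction = (value - range_start) / (range_end - range_start)
    # Clamp so values outside the range still land on the ramp.
    return min(1.0, max(0.0, fraction))


# Made-up monthly totals, for illustration only.
monthly_kwh = [20_000_000, 30_000_000, 40_000_000]

# Range starting at the smallest value (the automatic choice described above).
auto_positions = [
    color_position(v, min(monthly_kwh), max(monthly_kwh)) for v in monthly_kwh
]
# Range starting at zero (the manual choice described above).
zero_positions = [color_position(v, 0, max(monthly_kwh)) for v in monthly_kwh]

print(auto_positions)
print(zero_positions)
```

With a zero start, each bar's position on the ramp reflects its share of the maximum rather than its rank between the smallest and largest months.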
For the next visualization I knew I wanted to incorporate the quantity of developments in each borough since they were not evenly distributed. For example, Brooklyn doesn’t necessarily use the most energy because it is wasteful, but rather because from 2010 to 2017 it had the greatest number of developments out of the five boroughs. I also changed Tableau’s automatic data type for the Borough category to Count, which turned the category into a quantitative value that I could apply calculations to.
Rather than displaying the literal amounts, I chose to change the quantities to percentages of the total amount (of developments), so that it would be clear which borough had more or less in relation to the others. I then added labels to each borough, changed the color of the lines based on their count, and sorted them in descending order.
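The percent-of-total calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the (borough, development) pairs are invented placeholders rather than real NYCHA developments, and counting each development once is my assumption about how the count should work.

```python
from collections import Counter

# Invented (borough, development) pairs; the repeated row stands in for a
# development that appears on more than one billing record.
records = [
    ("BROOKLYN", "Development A"),
    ("BROOKLYN", "Development B"),
    ("BROOKLYN", "Development A"),
    ("BRONX", "Development C"),
    ("QUEENS", "Development D"),
]

# Count each development once per borough.
counts = Counter(borough for borough, _ in set(records))
total = sum(counts.values())

# Percent of all developments, sorted in descending order (name breaks ties).
shares = sorted(
    ((borough, 100 * n / total) for borough, n in counts.items()),
    key=lambda item: (-item[1], item[0]),
)
print(shares)
```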
Another category that seemed interesting was the distribution of funding for each borough. The values varied between federal types, mixed finance/LLCs, and Section 8 housing. I decided to create new Funding Source groups in Tableau and combine similar funding sources into either a “Federal” or “Mixed Finance/LLC” group.
Since the bar graphs are different heights, I added labels detailing the percentage of each group (federal or mixed/LLC) so that a clearer ratio of funding could be seen across the boroughs.
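The grouping and the percentage labels can be sketched together in Python. Everything specific here is an assumption for illustration: the raw funding labels in `GROUPS`, which group each label maps to, and the sample records are all hypothetical, and any label missing from the mapping is simply left unchanged.

```python
from collections import defaultdict

# Hypothetical raw labels mapped to the two combined groups.
GROUPS = {
    "FEDERAL": "Federal",
    "FEDERAL-COOP": "Federal",
    "MIXED FINANCE/LLC1": "Mixed Finance/LLC",
    "MIXED FINANCE/LLC2": "Mixed Finance/LLC",
}

# Made-up (borough, raw funding label) records.
records = [
    ("BRONX", "FEDERAL"),
    ("BRONX", "MIXED FINANCE/LLC1"),
    ("BRONX", "FEDERAL-COOP"),
    ("BRONX", "FEDERAL"),
    ("QUEENS", "FEDERAL"),
]


def funding_shares(rows):
    """Return each funding group's percentage of records within a borough."""
    counts = defaultdict(lambda: defaultdict(int))
    for borough, raw_label in rows:
        group = GROUPS.get(raw_label, raw_label)  # unmapped labels pass through
        counts[borough][group] += 1
    shares = {}
    for borough, groups in counts.items():
        total = sum(groups.values())
        shares[borough] = {g: round(100 * n / total, 1) for g, n in groups.items()}
    return shares


shares = funding_shares(records)
print(shares)
```

Because the percentages are computed within each borough, bars of different heights can still be compared by their labels.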
It was clear from the main comprehensive graph that energy consumption fluctuated widely from month to month, and with some variation each year. I wanted to measure the rate of difference from one year to the next, although I’m not sure whether this or measuring each year’s change relative to 2010 would have been better.
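The two options can be compared with a short Python sketch. The annual totals below are made-up round numbers chosen only to show how the two measures diverge, and the function names are hypothetical.

```python
def year_over_year(totals):
    """Percent change of each year relative to the year before it."""
    years = sorted(totals)
    return {
        cur: 100 * (totals[cur] - totals[prev]) / totals[prev]
        for prev, cur in zip(years, years[1:])
    }


def change_from_baseline(totals, baseline_year):
    """Percent change of each year relative to one fixed baseline year."""
    base = totals[baseline_year]
    return {
        year: 100 * (value - base) / base
        for year, value in totals.items()
        if year != baseline_year
    }


# Made-up annual totals, for illustration only.
annual_kwh = {2010: 100.0, 2011: 110.0, 2012: 99.0}

yoy = year_over_year(annual_kwh)
vs_2010 = change_from_baseline(annual_kwh, 2010)
print(yoy)
print(vs_2010)
```

In this example the year-over-year figure for 2012 is a 10% drop, while the baseline figure shows 2012 sitting only 1% below 2010. The first measure highlights sudden swings; the second shows cumulative drift from the starting point.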
I made the positive rates red because increased energy use is not automatically a good thing, so I wanted them to stand out. I didn’t want to potentially confuse users by making positive change red and negative change green or blue, so I made the higher percentages red but the lower values (which were the decreased consumption rates) gray. Using red like this also aligned this visualization with my primary graph, where high energy consumption was shown in red while lower use would dim down and de-saturate to gray. I chose to further divide the graph into boroughs since my use of color for the consumption measure prevented me from using colors for boroughs in a stacked bar chart style.
I originally placed this chart on the right beneath the filters I provided. However, it had important information regarding energy consumption change, and it seemed disjointed from the other bar graph, which detailed consumption over time. I moved it to the left to align with our natural Western left-to-right reading pattern, and allowed the filters to have their own space on the right.
I used gridlines where it was harder to compare some categories’ values to others without some type of reference line. Otherwise, I tried to remove the gridlines in order to avoid clutter or to emphasize a trend over the values themselves.
I added annotations (my findings) in separate boxes beside the various graphs because I didn’t have space to place annotations on the graphs themselves, and I wanted to add more white space and breathing room between each visualization. They unfortunately don’t translate well once you download the dashboard PNG from Tableau Public.
Eventually, I found the “Automatic” size option under the Dashboard menu, which performs slightly better under the pressure of various screen sizes. As mentioned previously, I moved the rate of consumption graph to the left of the dashboard, extended the filters into the now vacant blank space, and reorganized my annotations and titles.
Each borough (most evident in Brooklyn, Manhattan, and the Bronx) shows a decrease in energy consumption after the year 2013.
As expected, the boroughs that contain the greatest number of NYCHA developments receive the most funding and reflect the highest energy (kWh) consumption.
Although Brooklyn and Manhattan exhibit the greatest energy consumption, Staten Island’s rate of consumption has the greatest percentage increase from 2014 to 2015.
Having a firm grasp of what data was recorded in my dataset allowed me to accurately identify information and trends. At first glance, I mistook the dataset for a general log of NYC energy consumption; however, the dictionary provided and the details regarding funding helped to further distinguish the NYCHA housing from other types of developments in the city.
Color was an interesting series of choices for my visualizations because I started out using a palette that defined each borough throughout my dashboard. However, when I got to making graphs that were not homing in on the boroughs themselves (e.g. the funding source bar graph), the colors seemed to add too much traffic on the dashboard and were not enhancing a message. They were merely operating as labels. After I realized that, I stripped the graphs of their color and started experimenting with accent colors until I got the palette I wanted. When creating certain gradients, I found Tableau to be difficult and wished it would allow greater freedom to define colors at specific points in a color spectrum.
Compared to the specificity that Adobe products have when creating gradients, I was somewhat flustered by Tableau’s limited options. I ended up creating my own “Custom Sequential” gradient by selecting the red-black divergent gradient, reversing the sides, then changing the black color to a light gray that would create a gradient from the mid gray to the end.
The hardest part of this project was fitting my visualizations properly on the dashboard and having the proportions translate properly to a web browser in Tableau Public. I used three different computers to work on this project and found it difficult to resize graphs so that they looked fine on a different computer. This was most evident in my final dashboard (particularly in version 1), where the graphs overlap and are squeezed together, even though it looked fine when I viewed it on an iMac desktop in the Pratt computer lab. Eventually, I found the “Automatic” size option under the Dashboard menu, which performs slightly better under the pressure of various screen sizes. Next time I will be more conscientious and start with small graphs so that if worse comes to worst, I can resize my graphs in the online beta editor on Tableau Public.