Introduction
In 1980, Congress passed the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) as a way to manage abandoned or unsupervised toxic waste storage facilities. CERCLA established a national program to standardize emergency response, data collection, liability of responsible organizations, and cleanup procedures. CERCLA also created a trust fund, now referred to as the Superfund, through which resources could be allocated for cleanup of contaminated sites. Today, the term “Superfund Site” refers to any site on the National Priorities List (NPL) that receives funding for remediation through the Superfund. The goal of this visualization was to map the location of all current Superfund sites in New York State, perhaps prompting a deeper examination of which communities are more likely to harbor toxic waste facilities and what factors contribute to this phenomenon.
Influential Visualizations
I began my research by analyzing three different visualizations of Superfund locations around the country. The first visualization that I engaged with was created by the ToxicSites project.
The visualization begins with a map of the United States, overlaid with points for each Superfund location. Users can click on a given point to see the name of the Superfund site. Clicking on the name of the site loads a separate page with detailed information about that specific site: a brief overview of the company’s activities; a timeline of their involvement in the Superfund program; a list of the types of contaminants present and their health effects (carcinogen, endocrine disruptor, etc.); census information, including the number of people living within one mile of the site and the ethnic and economic breakdown of the region; and a list of potentially responsible parties. I was particularly attracted to the supplemental information that accompanied this visualization, especially the related census information. I was also drawn to the interactivity of the map, including the variety of filters available to users, such as media type (soil, surface water, etc.), hazardous ranking score, population, and race. I was disappointed by the lack of clarity on what each site’s hazardous ranking score meant and what the full scale was. Overall, though, I found this visualization to be extremely influential as I created my own map.
The second visualization that I explored was the ToxMap, a joint project of the U.S. Department of Health and Human Services and the U.S. National Library of Medicine.
Visually, I found this map to be a bit overwhelming. The data points are extremely small, meaning that users are forced to zoom in to the street level in order to discern unique points on the map. The map displays information on Toxics Release Inventory (TRI) facilities, as well as NPL final sites, deleted sites, and proposed sites, which are shown in two colors and four different shapes. I found that this mix of colors and shapes made for a rather muddy visualization, and I think the map would have been far stronger if it had used only one variable (either shape or color) to distinguish between sites, rather than mixing the two. It also seemed that this visualization was intended for researchers who have a high degree of familiarity with Superfund sites and are looking for detailed information on types of locations, rather than general users who may simply be curious about the prevalence of Superfund sites in their community.
The third visualization that I examined was featured on Caliper’s Maptitude site as an example of the type of visualization that one can create using their Maptitude GIS software.
It is unclear who created the visualization, but its data is cited as coming from the EPA’s 2015 National Priorities List. The major difference between this map and the previous two is that it is a static image, meaning users are unable to zoom or to click on individual points to identify each Superfund site. The static nature of this map also makes it nearly impossible to discern the Superfund site density in the New York City/New Jersey/Delaware/Philadelphia region. The map uses stepped coloring in seven gradations, each denoting a specific range of Superfund sites per 1 million people. The map also contains gray triangles, which appear to represent actual Superfund sites, though there is no key. I found that coloring states by Superfund sites per 1 million residents results in a somewhat misleading map. Montana, which hosts only 19 Superfund sites, is shaded the darkest red in this map due to the state’s extremely low population. California, on the other hand, appears several shades lighter than Montana because its 112 Superfund sites are spread across a much larger population. I made sure not to use any per-capita-style evaluations in my visualization.
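To make the Montana and California comparison concrete, here is a rough sketch of the sites-per-million calculation. The site counts are the ones quoted above; the population figures are rounded approximations that I am assuming for illustration, and the function name is hypothetical.

```python
# Site counts come from the text above. The populations are rounded,
# approximate 2015 figures assumed for illustration only.
states = {
    "Montana": {"sites": 19, "population": 1_030_000},
    "California": {"sites": 112, "population": 39_000_000},
}


def sites_per_million(sites, population):
    """Return the number of Superfund sites per 1 million residents."""
    return sites / (population / 1_000_000)


rates = {
    name: sites_per_million(info["sites"], info["population"])
    for name, info in states.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} sites per million residents")
```

Under these assumed populations, Montana's rate comes out several times higher than California's even though California has nearly six times as many sites, which is why a per-capita shading can invert the impression given by raw counts.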
Materials
The data for this visualization came from the EPA’s Superfund Enterprise Management System (SEMS) database within their Envirofacts system. This dataset featured information on 500 facilities in New York State that are currently active in the EPA’s Superfund database, including sites that currently are or are not on the NPL, or have been deleted from the NPL. The dataset contained information on sites’ registered names, the CERCLIS (CERCLA Information System) EPA ID Number, the site’s address, county, whether or not the facility is federally run, records of decision, the facility’s EPA regional relationships, if any, and the latitude and longitude information for each site. Since the data was already quite clean and included latitude and longitude information, there was very little that I needed to do before importing it into CartoDB. I also imported shapefiles for New York State counties and towns, which I collected from data.ny.gov, in order to add different layers of meaning to the map.
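Because each record carries a county field, a per-county tally of sites can be computed directly from the exported data. The sketch below shows one way this might look in Python. The rows are made up, and the column names (SITE_NAME, COUNTY, LATITUDE, LONGITUDE) are assumptions that may not match the actual SEMS export.

```python
import csv
import io
from collections import Counter

# Made-up sample rows. The column names are hypothetical and may not
# match the headers in a real SEMS/Envirofacts download.
SAMPLE_CSV = """SITE_NAME,COUNTY,LATITUDE,LONGITUDE
Example Site A,ERIE,42.90,-78.85
Example Site B,ERIE,42.75,-78.80
Example Site C,NIAGARA,43.10,-79.00
Example Site D,,40.70,-73.90
"""


def count_sites_by_county(csv_text):
    """Tally rows per county, skipping rows whose county is blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tally = Counter()
    for row in reader:
        county = row["COUNTY"].strip().title()
        if county:
            tally[county] += 1
    return tally


site_counts = count_sites_by_county(SAMPLE_CSV)
print(site_counts.most_common())
```

A tally like this could then be joined to a county shapefile by county name to drive a fill color.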
Methods & Results
I began by importing the CERCLIS dataset into CartoDB and superimposing the Superfund site locations onto a Positron-style map of New York State, supplied by CartoDB. I experimented with different aggregation styles, but ultimately decided that the simple dot style of site marking was the most effective for this visualization. I added a pop-up for the “Site Name” field, using the hover style rather than the click style. This allowed users to merely hover over a given Superfund site and see its name. While this map was successful in showing each Superfund site as a point on a map of New York, I felt that it could be far more successful with additional geographic information. I then imported the shapefile for town boundaries. I added a hover-style pop-up for town name and switched the Superfund site name to a click-style pop-up.
While I definitely felt that this map was more successful than the first, I really wanted some way to convey the number of Superfund sites in a given region. Since there are a large number of towns in the north-central part of the state without any Superfund sites, it seemed unnecessary to try to calculate site density per town. I instead chose to create a new map using the county shapefile and overlaying the Superfund site data. I styled the fill color for each county based on its count of Superfund sites. I initially hoped that a fill style based upon 5 buckets of color would work well, since the county with the largest number of Superfund sites has 53 sites. Unfortunately, this number of buckets resulted in some very strange groupings, notably Niagara County, with 30 Superfund sites, and Erie County, with 53 sites, being lumped into the same color block. I believe this had to do with the large number of counties with fewer than 10 sites and the relatively small number of counties with more than 15 sites. I increased the number of buckets to 7 and found that this resulted in a delineation that felt more appropriate. I also added a click-style pop-up for each county that listed the name of the county and the number of Superfund sites.
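One possible explanation for the Niagara and Erie grouping is a quantile-style classification, which puts roughly the same number of counties in each bucket regardless of how far apart their values are. I do not know which classification method was actually applied, so the sketch below is only an illustration under that assumption. Apart from the values 30 and 53, the counts are invented, and the helper names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical per-county site counts, skewed toward small values.
# Only 30 and 53 come from the text; every other value is invented.
counts = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 9, 12, 16, 22, 30, 53]


def quantile_breaks(values, k):
    """Upper bounds of k buckets holding roughly equal numbers of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, (i * n) // k - 1)] for i in range(1, k + 1)]


def equal_interval_breaks(values, k):
    """Upper bounds of k buckets of equal width between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    # Use the true maximum as the last bound to avoid float rounding.
    return [lo + step * i for i in range(1, k)] + [hi]


def bucket_of(value, breaks):
    """Index of the first bucket whose upper bound is >= value."""
    return min(bisect_left(breaks, value), len(breaks) - 1)


q_breaks = quantile_breaks(counts, 5)
e_breaks = equal_interval_breaks(counts, 5)

for label, breaks in (("quantile", q_breaks), ("equal interval", e_breaks)):
    print(label, breaks, "30 ->", bucket_of(30, breaks), "53 ->", bucket_of(53, breaks))
```

With this invented data, the five quantile buckets place 30 and 53 in the same top bucket because most of the buckets are used up by the many small counts, while five equal-width buckets separate them. Whether adding buckets or switching methods works better depends on the real distribution of counts.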
I also included a layer selector in this map, to let users determine whether they wanted to view only county data, only Superfund site data, or both.
Ultimately, I felt that this visualization was fairly successful. I definitely felt that this map became stronger with each iteration, and I was especially pleased with the added interactivity of the layer filters.
Future Directions
In the future, I would love to add additional demographic information to this visualization, especially income level data. I have recently acquired a dataset with median household income data per zip code in New York City. I would love to integrate this data into my visualization, perhaps adding another filter layer. I would also love to add additional information on other types of facilities that are proven health hazards, such as Department of Sanitation storage facilities and “special waste” sites, all compared to demographic information, particularly income level.
It is worth noting that I also encountered a fair number of bugs with CartoDB. I was unable to update certain aspects of the map legends, which I ultimately deleted because they were so distracting. I also encountered issues with my pop-up features, occasionally receiving “no data available” notices, even when the data had been working properly only moments before. However, I still felt that this was a successful first attempt at mapping Superfund data, and I look forward to enhancing this visualization in the future.