Trash Pandamonium: Discord and Distemper in NYC Parks



This week, I’m using a tool called Tableau to tell a story through data visualization. 

A raccoon on its hind legs on the forest floor, paws raised in mischief.

While browsing New York City Open Data for a dataset to explore, one in particular stood out to me, titled Urban Park Ranger Animal Condition Response. It held data on animal assistance calls in parks citywide that the Urban Park Rangers responded to between May 2018 and June 2019. My affinity for animals and the natural environment made this a particularly interesting topic for me. I went into this without knowing what questions I would ask—I hoped that as I familiarized myself with the data, a story would emerge from its patterns.

I looked to visualizations from the New York City 2015 Street Tree Census for inspiration for this project. Like the Animal Condition Response dataset, the Street Tree Census includes location data for the five boroughs and collects qualitative data on species and health. The resulting visualizations are simple and accessible, using basic color palettes and stacked bar graphs to illustrate census findings. Users may filter results by borough for a localized scope and access tooltips by hovering over different areas of each chart.

  • A horizontal stacked bar graph showing tree health by species, with tooltips
  • Horizontal and stacked bar graphs showing the most prevalent street tree species in New York City. Tooltips show additional data.

To get started on my own visualizations, I downloaded my dataset, provided by the New York City Department of Parks and Recreation (NYC DPR), from NYC Open Data. Reviewing the table preview, I could see that the data in certain columns was reported inconsistently, so I imported it into OpenRefine, a powerful open source tool for cleaning messy data. 

It appeared to me that all of the columns pulled information from standardized fields except for Species, Property, and Location. I employed OpenRefine’s cluster function to consolidate these values to the best of my ability, merging the names of Parks and Recreation properties such as “Riverside Park South” with “Riverside Park,” and eliminating variations in spelling and capitalization.
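For readers curious what this kind of clustering does under the hood, here is a minimal Python sketch of a key-collision approach, similar in spirit to OpenRefine's fingerprint method. It is not OpenRefine's actual code, and the sample values are made up for illustration. Note that it only catches variations in spelling, spacing, punctuation, and capitalization; a merge like “Riverside Park South” into “Riverside Park” is a judgment call that still has to be made by hand.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a key that ignores case, punctuation, accents, and word order."""
    text = value.strip().lower()
    # Decompose accented characters and drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Remove ASCII punctuation, then split on whitespace.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group raw values sharing a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical raw entries, invented for this example (not from the dataset).
raw = [
    "Central Park",
    "central park ",
    "Central  Park.",
    "Prospect Park",
    "Park, Prospect",
    "Riverside Park South",
    "Riverside Park",
]
clusters = cluster(raw)
for group in clusters:
    print(group)
```

Each printed group is a set of variants that would be candidates for merging into one canonical name.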

Once I imported my data into Tableau, I created a basic line graph showing the number of calls per month. A jump in calls during the late summer months was immediately apparent. Perhaps this was normal, I thought, considering that people and animals are more active in park spaces in the summer. When broken down by borough, however, the data seemed to suggest discrete events in Manhattan in August 2018 and Brooklyn in November 2018.
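The same monthly breakdown can be approximated outside Tableau. The sketch below counts calls per month and borough using only Python's standard library. The field names ("date", "borough"), the date format, and the sample rows are assumptions made for illustration; the real export may be laid out differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows, invented for this example (not real records).
rows = [
    {"date": "08/03/2018", "borough": "Manhattan"},
    {"date": "08/17/2018", "borough": "Manhattan"},
    {"date": "08/20/2018", "borough": "Queens"},
    {"date": "11/05/2018", "borough": "Brooklyn"},
]


def calls_per_month(records):
    """Count calls for each (year-month, borough) pair."""
    counts = Counter()
    for record in records:
        # Assumed date format: MM/DD/YYYY.
        month = datetime.strptime(record["date"], "%m/%d/%Y").strftime("%Y-%m")
        counts[(month, record["borough"])] += 1
    return counts


monthly = calls_per_month(rows)
for (month, borough), n in sorted(monthly.items()):
    print(month, borough, n)
```

Plotting one line per borough from these counts is what makes a spike in a single borough stand out from a citywide seasonal rise.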

This dataset included a huge variety of species (nearly 150!), so I created a new graph to reexamine my data from this angle. I saw that one species was responsible for a disproportionate number of calls: raccoons. The shape of the raccoon line resembled the calls-by-borough plot. Was this coincidence, or a correlation?

Multi-line graph showing number of calls per month, with each line representing a different species. They are all mostly flat except for the one labeled Raccoon, which has one huge peak and two smaller ones.
Tableau: Number of calls per month, broken down by species.

To rule out the possibility that fragmentation of other major species was contributing to raccoons being overrepresented, I returned to OpenRefine to further consolidate the data. The contributors to this dataset, being nature experts, often (but not always) provided very detailed species data in this field. I used simplified terms such as “gull,” “bat,” and “turtle” as stand-ins for “herring gull,” “silver-haired bat,” and “red-eared slider turtle” in case patterns emerged for any other group of species. (They did not.)
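The consolidation described above amounts to mapping detailed species names onto broader groups. A minimal Python sketch of that idea follows; the group list contains only the three examples named in this post, and the last-word matching rule is my own simplification, not something OpenRefine does automatically.

```python
# Broad groups taken from the examples in the post; a real pass would need more.
GROUPS = ("gull", "bat", "turtle")


def simplify_species(name: str) -> str:
    """Return the broad group if the name ends with one, else the cleaned name."""
    cleaned = " ".join(name.lower().split())
    words = cleaned.split()
    # Match on the final whole word so "wombat" is not mistaken for "bat".
    if words and words[-1] in GROUPS:
        return words[-1]
    return cleaned


for species in ["Herring Gull", "silver-haired bat", "Red-eared Slider Turtle", "Raccoon"]:
    print(species, "->", simplify_species(species))
```

Names that do not end in a known group pass through unchanged (apart from lowercasing), so nothing is silently lumped together.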

What could possibly account for these calls? A Google search for “manhattan raccoons august 2018” led me to local news reporting from that summer of an outbreak of canine distemper virus in Central Park raccoons. This checked out, according to the data, which showed that nearly 90% of calls about raccoons in Manhattan were linked to Central Park.

Could a similar story have played out in Brooklyn, I wondered? The data showed me that in Brooklyn, just over 80% of calls about raccoons were indeed linked to its largest green space, Prospect Park. Further internet research confirmed the existence of a canine distemper virus outbreak in this borough, too. 

Two frequently cited properties emerged in Queens, Forest Park and Alley Pond Park, but even combined they did not account for a majority of raccoon calls in this borough. Additional research revealed that distemper was reported in Queens in the spring of 2019, but there were not enough cases to constitute an outbreak.

Three pie charts show that 89.78% of raccoon calls in Manhattan came from Central Park and 82.55% of raccoon calls in Brooklyn came from Prospect Park. In Queens, no single park accounted for a majority.
Tableau: A location breakdown for the top three boroughs with reported raccoon cases.
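The percentages in these pie charts are simple shares: calls from one property divided by all raccoon calls in that borough. Here is a hedged Python sketch of that arithmetic. The field names and sample rows are invented for illustration, so the output does not reproduce the figures above.

```python
from collections import Counter

# Hypothetical rows, invented for this example (not real records).
rows = [
    {"borough": "Manhattan", "species": "raccoon", "property": "Central Park"},
    {"borough": "Manhattan", "species": "raccoon", "property": "Central Park"},
    {"borough": "Manhattan", "species": "raccoon", "property": "Central Park"},
    {"borough": "Manhattan", "species": "raccoon", "property": "Riverside Park"},
    {"borough": "Manhattan", "species": "gull", "property": "Central Park"},
]


def top_property_share(records, borough, species="raccoon"):
    """Return (property, percent of calls) for the most-cited property."""
    properties = Counter(
        r["property"]
        for r in records
        if r["borough"] == borough and r["species"] == species
    )
    total = sum(properties.values())
    if total == 0:
        return None  # No matching calls in this borough.
    name, count = properties.most_common(1)[0]
    return name, round(100 * count / total, 2)


print(top_property_share(rows, "Manhattan"))
```

A top share well above 50% points to a single hot spot, while a low top share suggests calls scattered across many properties.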

Canine distemper virus causes strange and sometimes aggressive behavior in raccoons and is fatal if left untreated. As the name suggests, the virus also presents a serious health risk to dogs. The data revealed a correlation between declining raccoon health and outbreak peaks. I chose a stacked graph to highlight the part-to-whole relationship of unhealthy raccoons to all reported raccoons. By showing it on an interval time scale, I could identify patterns around the time of reported outbreaks.

Tableau: Raccoon health over time.
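The part-to-whole relationship behind a stacked graph like this one can be computed as the share of each condition within each month. The Python sketch below shows one way to do that. The condition labels ("Healthy", "Unhealthy", "DOA"), the field names, and the rows are assumptions for illustration and may not match the dataset's actual values.

```python
from collections import Counter, defaultdict

# Hypothetical raccoon rows, invented for this example (not real records).
rows = [
    {"month": "2018-08", "condition": "Unhealthy"},
    {"month": "2018-08", "condition": "Unhealthy"},
    {"month": "2018-08", "condition": "DOA"},
    {"month": "2018-08", "condition": "Healthy"},
    {"month": "2018-09", "condition": "Healthy"},
]


def health_shares(records):
    """For each month, return the fraction of calls in each condition."""
    by_month = defaultdict(Counter)
    for record in records:
        by_month[record["month"]][record["condition"]] += 1
    shares = {}
    for month, counts in sorted(by_month.items()):
        total = sum(counts.values())
        shares[month] = {cond: n / total for cond, n in counts.items()}
    return shares


shares = health_shares(rows)
for month, parts in shares.items():
    print(month, parts)
```

Looking at shares alongside raw counts helps separate a month with more calls overall from a month in which a larger fraction of animals were sick.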

I found it interesting that while unhealthy animals were reported year-round, with an increase in the months surrounding the outbreaks, a significant number of raccoons in Manhattan and Brooklyn were reported dead on arrival for about one month around the respective peaks.

This graph shows a third, slight rise in raccoons reported unhealthy and dead on arrival in March 2019. Isolating these values, I saw that many of these calls originated in Queens, around various NYC DPR properties. However, the numbers are relatively low, and I suspect that these reports were made out of an abundance of caution.

I created several visualizations while exploring the Animal Condition Response dataset, but chose only three to tell this story via my interactive Tableau dashboard, linked below. The first graph starts with a broad view of the data (calls over time), the second notes a striking characteristic of this dataset (many calls about raccoons), and the third strives to attribute that to something (declining raccoon health).

I’m curious about other data released by the Urban Park Rangers or NYC DPR, but I did not find any on NYC Open Data. This particular dataset likely made it here because of public concern about the distemper outbreaks; it had been such an eventful season for raccoons. I’m pleased to have arrived at the distemper outbreak conclusion, given that there were no clues provided in the metadata or data dictionary.

If I could take this research further, I would love to examine the action taken for each call. How do Urban Park Rangers educate others? What impact does animal monitoring have on incident outcomes? Do responses differ depending on whether the animal is native or invasive, domesticated or a known pest? With any luck, I will get to explore this further with future data on New York City’s wildlife.

The author uses “trash panda” as an affectionate term for these resilient, brilliant, and beautiful creatures.