Monuments in New York City



Introduction

For this assignment I used statistical visualizations to understand when and where monuments were constructed in New York City. I wanted to better understand the rate at which they were built. For example, were more built earlier in New York’s history than today? I was also curious whether there were monument “hot spots” in the city. Did certain boroughs have more monuments than others, and if so, did some districts within those boroughs have more than others? Finally, since the dataset I used (more information on it below) records whether each monument is still extant, I wanted to know whether looking only at extant monuments would change the picture significantly compared with looking at all monuments erected.

While I didn’t identify a specific visualization as my inspiration, in this assignment I wanted to focus on how I used color to clarify the data I was presenting.

Materials

I obtained my dataset from New York City’s repository for open data. The specific dataset I used was the NYC Parks Monument dataset, in conjunction with its corresponding data dictionary. The dataset included information on 1,878 monuments, tracking 34 variables. However, the data was not entered consistently for certain categories, which meant I had to spend a significant amount of time cleaning it. Since I was working on my work computer, on which I couldn’t install OpenRefine, I made these changes manually. Eventually I narrowed the dataset down to 9 columns for ease of use: name, fileorder, number, parkprop, borough, commboard, council, extant, and year.

Snapshot of the dataset. Notice in column L the many different ways the data is formatted. I added column M to make the data more consistent.
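I did this cleanup by hand, but a scripted alternative might look like the sketch below. It pulls the first plausible four-digit year out of a free-text value. The sample strings are hypothetical illustrations of inconsistent formatting, not entries copied from the dataset, and the 1600 to 2099 range is an assumption based on the earliest and latest years mentioned in this report.

```python
import re

# Matches a four-digit year from 1600 to 2099 that is not part of a longer number.
# The range is an assumption chosen to cover the years discussed in this report.
YEAR_PATTERN = re.compile(r"(?<!\d)(1[6-9]\d{2}|20\d{2})(?!\d)")

def extract_year(raw):
    """Return the first plausible four-digit year in a free-text value, or None."""
    if raw is None:
        return None
    match = YEAR_PATTERN.search(str(raw))
    return int(match.group(1)) if match else None

# Hypothetical examples of inconsistently formatted entries.
samples = ["1923", "ca. 1923", "dedicated May 30, 1923", "1923-1925", "unknown", ""]
cleaned = [extract_year(s) for s in samples]
```

A rule like this takes the first year in a range such as "1923-1925", which is a design choice rather than the only reasonable one. Rows that come back as None would still need manual review.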


Once the dataset was in order, or as much order as I could manage, I used Tableau Public to create the visualizations themselves, as well as the dashboard that presents them grouped together.

Steps

The first thing I wanted to visualize was how many monuments were built by year. However there were over 300 rows that didn’t have definitive years for when the monument was built. Also the earliest monument was constructed in 1600 and the most recent 2018. With so many years being represented individually it was hard to see any specific trends. I decided to narrow the data to a hundred year period, 1900-1999, in order to do a deeper dive in a smaller window. There were still 1,300 monuments built in that time period, so there was still enough material to analyze.

Once I limited the data, I grouped the monuments into 10-year and then 25-year bins to see trends in construction. This showed a clearer picture of the overall trend in construction than the initial visualizations.
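I did the filtering and grouping in Tableau Public, but the same idea can be sketched in a few lines of Python. The function names and the sample years below are hypothetical. The sketch assumes the year column has already been cleaned to integers, with None for rows that have no definitive year.

```python
from collections import Counter

def bin_start(year, width, origin=1900):
    """Return the first year of the width-year bin that contains `year`."""
    return origin + ((year - origin) // width) * width

def count_by_bin(years, width, start=1900, end=1999):
    """Count the years that fall in [start, end], grouped into width-year bins."""
    in_range = [y for y in years if y is not None and start <= y <= end]
    counts = Counter(bin_start(y, width, start) for y in in_range)
    return dict(sorted(counts.items()))

# Hypothetical sample: two values fall outside 1900-1999 and one has no year.
years = [1898, 1901, 1909, 1910, 1924, 1925, 1999, None, 2005]
by_decade = count_by_bin(years, 10)
by_quarter_century = count_by_bin(years, 25)
```

Each key in the result is the first year of a bin, so 1920 stands for 1920 through 1929 in the 10-year grouping and 1925 stands for 1925 through 1949 in the 25-year grouping.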

Then I was curious to see where the monuments were being built. I compared how many monuments were in each borough, and then what percentage of monuments were still extant in each borough. Manhattan had the most monuments built overall, as well as the highest percentage of monuments that are still extant. (The five boroughs differed much less in the latter category than in the former, though.) I then wanted to take a closer look at where in Manhattan the monuments were.
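As a rough illustration of the borough comparison, the sketch below computes the total number of monuments and the percentage still extant for each borough. The borough and extant column names come from the dataset, but the "Y"/"N" coding of extant and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

def extant_share_by_borough(rows):
    """Return {borough: (total monuments, percent extant)} for a list of row dicts."""
    totals = defaultdict(int)
    extant = defaultdict(int)
    for row in rows:
        borough = row["borough"]
        totals[borough] += 1
        # Assumption: extant monuments are coded "Y"; the real coding may differ.
        if row["extant"] == "Y":
            extant[borough] += 1
    return {b: (totals[b], round(100 * extant[b] / totals[b], 1)) for b in totals}

# Hypothetical rows, not taken from the dataset.
rows = [
    {"borough": "Manhattan", "extant": "Y"},
    {"borough": "Manhattan", "extant": "Y"},
    {"borough": "Manhattan", "extant": "N"},
    {"borough": "Brooklyn", "extant": "Y"},
    {"borough": "Brooklyn", "extant": "N"},
]
summary = extant_share_by_borough(rows)
```

Keeping the total next to the percentage matters here, because a borough with few monuments can show a large swing in percentage from only one or two missing monuments.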

The second line of inquiry I pursued was how many of the monuments were owned by the Parks Department. I looked at how many were owned by the Parks Dept. over the whole city, as well as by community district in Manhattan. Since I had already looked at monuments by community district in the previous visualizations, I thought this would be an interesting point of comparison.

Then, I returned to my original question of when the monuments were being built. Inspired by one of the exercises we did as a class, I wanted to see the change in how many monuments were being built as a percentage of the previous year instead of just how many were being built. I then compared the city as a whole to Manhattan alone. However since I was still in the mindset of monuments that were owned by the Parks Department, I broke the visualization in two, one line for those owned by Parks, and one for those owned by others.
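To make the "percentage of the previous year" idea concrete, here is a minimal sketch of the calculation. It assumes the counts are keyed by consecutive years and uses hypothetical numbers. It illustrates the arithmetic only and is not a description of how Tableau computes it internally.

```python
def percent_change(counts_by_year):
    """Return {year: percent change from the preceding listed year}.

    The first year has no predecessor and is omitted. A year that follows
    a zero count maps to None, because the change is undefined.
    """
    years = sorted(counts_by_year)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        before = counts_by_year[prev]
        if before == 0:
            changes[curr] = None
        else:
            changes[curr] = round(100 * (counts_by_year[curr] - before) / before, 1)
    return changes

# Hypothetical yearly counts for one ownership group.
parks_counts = {1920: 4, 1921: 6, 1922: 3, 1923: 0, 1924: 2}
parks_change = percent_change(parks_counts)
```

Running the same function separately on the Parks-owned counts and on the counts for monuments owned by others gives the two lines to compare. Small yearly counts make the percentages jump sharply, which is worth keeping in mind when reading the peaks and valleys.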

Results

I presented my final visualizations in two dashboards (below). The first dashboard collected the visualizations I made when looking at the monuments by borough. Manhattan had over 50% of the monuments, and within Manhattan the Upper West Side Community Board district had the most monuments, while the Lower West Side had the fewest. I had wanted to look at the Community Board data for other boroughs, but I had too many questions about the accuracy of that data to trust the results. Many of the records were coded as being in Community Boards that don’t currently exist. I found it interesting that Manhattan’s Community Board 7 had the most extant monuments, 153, as well as the most “missing” monuments, 28. This tracks in my eyes: if the most monuments were built there, that district would also probably have the most that were broken, removed, or lost.

The second dashboard collected the visualizations I made about the number of monuments owned by the Parks Department. The visualization that I feel presented the most interesting information was the year-over-year percentage change in monuments being built. The Parks Department had a more consistent rate of building monuments. While there were still peaks and valleys, as a whole there were smaller variations than for monuments owned by others. This is even more apparent when you look only at monuments built in Manhattan.

Reflection

While I had set out to really pay attention to the color I was using in my visualizations, I think I could have used color more thoughtfully. In my first dashboard, for instance, orange tracks different variables in each graph, as does the grey. While it looks cohesive, I am not sure it is meaningful. I think the blue is more successful in the second dashboard, as it tracks the same information in the first two charts/graphs.

Additionally, I think that because I was experimenting with a dataset I wasn’t very familiar with, I wasn’t sure what meaningful questions I could ask of the data. As a result, I ultimately made a number of bar charts. I think if I were to continue this project I would need to ask different types of questions. For example, a bar chart showing how many monuments are in each district does tell you some information, but would it be more meaningful to map them? That way, instead of grouping them by district number, whose geographical locations most people don’t know off the top of their heads, you could see where they are and draw your own conclusions.

I also didn’t add any explanatory text on the dashboards themselves. If I were to continue this project I would also want to take more advantage of the different types of media you can include on them.