Public Library Usage Data in New York City


Visualization

The Mayor’s Management Report (MMR) of New York City, released twice a year, documents the performance of the city’s public services. It measures the operations of every department, from 311 to the Taxi and Limousine Commission. The statistics published in these reports from 2003 to 2012 are available online at NYC Open Data.

 

The data set is arranged in a crosstab format, so I first had to convert it to a normalized format using Data Wrangler before I could load it into Tableau. Once the data was properly formatted, I tried arranging the categories along the axes in the most useful way. I ended up placing time across the X axis and agency name, indicator, and value along the Y axis. This was the most logical arrangement for spotting potential trends.
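To illustrate what the crosstab-to-normalized conversion involves, here is a minimal sketch in Python rather than Data Wrangler, which is the tool I actually used. The column names (`Agency`, `Indicator`, and one column per year) are assumptions about the layout, not the real NYC Open Data headers. The sample values are the library card holder figures quoted later in this post.

```python
import csv
import io

# Hypothetical crosstab layout: one row per agency/indicator, one column per year.
# The column names are assumptions; the values are the card holder figures
# quoted later in this post.
CROSSTAB = """Agency,Indicator,2010,2011,2012
BPL,Library card holders,1306,741,915
NYPL,Library card holders,3120,2215,1985
"""

ID_COLUMNS = ("Agency", "Indicator")


def unpivot(rows, id_columns):
    """Turn every year column into its own row with Year and Value fields."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue  # identifier columns are copied, not unpivoted
            record = {key: row[key] for key in id_columns}
            record["Year"] = int(column)
            record["Value"] = int(value)
            long_rows.append(record)
    return long_rows


long_rows = unpivot(csv.DictReader(io.StringIO(CROSSTAB)), ID_COLUMNS)
for record in long_rows:
    print(record)
```

Each output row holds a single observation (agency, indicator, year, value), which is the shape Tableau needs in order to place time on one axis and values on the other.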

 

This data was best presented on a line graph demonstrating changes over time. As Stephen Few points out, “if your objective is to see how quantitative values have changed during a continuous period of time, nothing works better than a line graph” (Few, 150). These graphs allow us to view trends, variation, rate of change, co-variation, cycles, and exceptions (Few, 143). I was primarily interested in looking for trends and variation in the data, but was curious to see if anything else would appear as I made the graphs.

 

The data included several indicators for each agency and, consequently, had huge ranges in values for them. These large ranges made it difficult to get a meaningful picture of the data, so it was necessary to filter down which indicators were displayed. I looked through the data to identify agencies which would allow for interesting comparisons, such as the Administration for Children’s Services and the Department of Education. After looking through different possibilities, I chose to compare the New York Public Library (NYPL) and Brooklyn Public Library (BPL) systems.

 

Both library systems reported on several of the same indicators, such as program sessions, visits to their website, and reference questions. I chose eight categories to compare, each on a separate sheet in Tableau: weekly hours, library card holders, program sessions, program attendance, total library attendance, reference queries, circulation, and visits to website. Each of these presented interesting comparisons. In the end, I chose three to present on my final dashboard: total library attendance, library card holders, and circulation.

 

Because the data was reported yearly, the graphs are aggregated by year. Each graph has a different scale along the Y axis, as the range of values differs widely between categories. A single scale could not have displayed all three graphs clearly.

 

There were large technological changes in the years 2003-2012 and these presented challenges for libraries. With the increasing prevalence of electronic texts and digitally available information, I was curious to see if this would be reflected in library usage and circulation.

 

I initially expected to see some correlation between the three categories I chose to focus on. I thought that if attendance rose, it would be logical to think that circulation and the number of library card holders would also be rising, though not necessarily at the same rates. At the very least, I expected a closer correlation between library card holders and circulation. However, the data did not support my initial assumptions.

 

NYPL and BPL Usage Data, 2003-2012

Total library attendance peaked in 2009 for both the NYPL and the BPL, though attendance was higher at the NYPL. Overall, the NYPL had a steadier increase in attendance leading up to this peak. The BPL, in contrast, saw a big jump in attendance between 2006 and 2007, followed by a more gradual increase in the following years. Since 2009, the NYPL has seen a slow but steady decline in attendance, while the BPL has declined overall, with a slight increase in 2011 before falling again the following year.

 

Next, between 2010 and 2011, both systems saw a considerable decrease in the number of library card holders. The BPL fell from 1,306 to 741, while the NYPL fell from 3,120 to 2,215. The following year, however, the number of BPL card holders increased to 915 while the NYPL continued to fall, ending at 1,985 in 2012.
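To put these drops on a comparable footing, the year-over-year percent change can be worked out from the figures above. This is a small sketch using only the numbers quoted in this post, in the units reported in the source. The helper name `percent_change` is my own.

```python
# Library card holder figures as quoted in this post (units as reported).
CARD_HOLDERS = {
    "BPL": {2010: 1306, 2011: 741, 2012: 915},
    "NYPL": {2010: 3120, 2011: 2215, 2012: 1985},
}


def percent_change(old, new):
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100


changes = {}
for system, by_year in CARD_HOLDERS.items():
    years = sorted(by_year)
    # Pair each year with the one that follows it.
    for previous, current in zip(years, years[1:]):
        change = percent_change(by_year[previous], by_year[current])
        changes[(system, current)] = change
        print(f"{system} {previous}-{current}: {change:+.1f}%")
```

In relative terms the BPL's 2010 to 2011 drop (about 43%) was steeper than the NYPL's (about 29%), even though the NYPL lost more card holders in absolute numbers.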

 

Finally, I looked at library circulation. Both library systems saw increased circulation over this period, but between 2011 and 2012 circulation at the BPL decreased. As there is not yet data for 2013 and 2014, it isn’t possible to say whether this year was an exception or the beginning of a downward trend. Circulation at the NYPL increased steadily, with its biggest increase between 2010 and 2011. I found this interesting, as the same year saw a decrease in library attendance and the largest decrease in library card holders at the NYPL.


The trends that begin to appear on these graphs offer many starting places for further research about public library usage and the habits of library users in New York. After studying them, it was clear that my initial assumptions about library usage trends were incorrect. In most cases, there did not appear to be correlations between these categories. Circulation continued to rise while other factors declined. One possible reason is that both the NYPL and BPL offer ebook downloads, which could increase circulation without necessarily affecting library attendance. While this data is not comprehensive enough to draw many conclusions with certainty, it is helpful for thinking more accurately about public library usage in New York City. For further research, it would be helpful to have data for the Queens Library, which would extend the scope to all of New York.

 

References:

Few, S. (2009). Now You See It: Simple Visualization Techniques for Quantitative Analysis. Oakland: Analytics Press.