Grants from the National Historical Publications and Records Commission, 1964-2013


Introduction

The National Historical Publications and Records Commission (NHPRC) is the grantmaking arm of the National Archives and Records Administration (NARA). The NHPRC supports research, preservation, publishing and other activities that make America’s historical records available. The NHPRC has provided a public dataset of all grants commissioned between 1964 and 2013, which contains information on the amount granted, the type of grant made, and the grantee institution name and location. As a fundraiser, I was curious to see how NHPRC’s funding had changed over the years. I was particularly interested in answering the following questions:

  1. Are more grants given under certain programs?
  2. How has the size of grants changed?
  3. How have grants been distributed across the United States?

Inspiration

To answer my question about whether more grants are given under certain programs, a heat map made the most sense: the dataset has so many grants (well over four thousand) and program types that other visualization types would result in over-plotting (Few, 2009, p. 155). I liked the heat map below, posted to a blog called The Daily Viz, for its simplicity and its color scheme, which does not draw undue attention to any one box, as can happen with heat maps that use a variety of colors.

Source: http://thedailyviz.com/2012/05/12/how-common-is-your-birthday/

To explore the range of grant values over time, I wanted to see not only the largest and smallest grants, but also their distribution. This was a perfect opportunity to create a box-and-whisker plot. The one seen below tracks the historical performance of women in the biking component of the Ironman World Championship. Though the colors in this case are decorative rather than functional, the wide range of colors helps to set apart the plot for each year. I appreciate this type of visualization for its ability to display a large amount of information in a small area.

Source: https://plot.ly/~AaronWebstey/131/ironman-world-championship-trend-in-bike-time-top-10-women-1982-2014/

My third question related to the distribution of grants across the United States. There are many examples of geographical maps that display this type of information. The "Visualizing the Shelter Problem" map is one of the best that I have seen. It uses size and color (both preattentive attributes) to quickly paint a picture of both the size of homeless populations and the percentage of each population that is sheltered.

Source: http://mmviz.blogspot.co.uk/2015/03/visualizing-shelter-problem.html

Materials

I began by opening the above-mentioned grant dataset from NHPRC in Microsoft Excel. The data was already tidy: each row corresponded to a single grant and each column was a variable related to that grant. I did make three changes. I aggregated a few very similar grant program types into larger categories (NHPRC makes grants under different programs, such as Publishing, Digitizing, and Professional Development). I used the LEFT function to retrieve only the first five digits of 9-digit entries in the Zip Code field. I also deleted the Last Name, Middle Name, and First Name variables (presumably the names of the Principal Investigator for each grant), as many of them were blank.
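For readers who would rather script these cleaning steps than do them in Excel, here is a minimal Python sketch of the same three changes. The column names and the program mapping are hypothetical stand-ins; the real headers in the NHPRC file and the exact categories I merged may differ.

```python
# Hypothetical mapping of similar program names to one category.
# These labels are illustrative only, not the dataset's actual values.
PROGRAM_GROUPS = {
    "Publishing Historical Records": "Publishing",
    "Publishing Subvention": "Publishing",
}

# Columns dropped because many of their values were blank.
NAME_COLUMNS = ("Last Name", "Middle Name", "First Name")


def clean_row(row):
    """Return a cleaned copy of one grant record (a dict of column -> value)."""
    # Drop the investigator name columns.
    cleaned = {key: value for key, value in row.items() if key not in NAME_COLUMNS}
    # Equivalent of Excel's LEFT(zip, 5): keep only the first five characters.
    cleaned["Zip Code"] = str(row.get("Zip Code", "")).strip()[:5]
    # Collapse similar program types; unknown programs pass through unchanged.
    program = row.get("Program", "")
    cleaned["Program"] = PROGRAM_GROUPS.get(program, program)
    return cleaned
```

Each row could be read with `csv.DictReader` and passed through `clean_row` before being written back out for Tableau.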

Next, I opened the modified dataset in Tableau Public, freely available software that allows users to create interactive visualizations. I created three visualizations and experimented briefly with pulling them into a dashboard.

Methods and Discussion

Question 1: Are more grants given under certain programs?

To address this question, I first dragged the Program field into Columns and Commission Date into Rows. I then swapped the two fields, as I realized that placing the dates in Columns was the appropriate arrangement for looking at grant types across time. As there were so many years included in the dataset, I used the Create Group function to create 5-year increments to produce an interval scale and a narrower matrix.

At this point, only text appeared in the individual cells of the matrix, so I dragged the Number of Records measure over the data points in order to display the number of grants for each cell. Number of Records was also added to the Color property of the Marks card. Finally, I changed the type of mark to Square, and I had the heat map I wanted. The heat map clearly shows a strong preference for projects under the Publishing and Archives and Records programs, though there is a sudden decline for the latter program in the last time interval.

grants by type
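The numbers behind a heat map like this are just counts of grants per program and time interval. As a rough sketch of that calculation outside Tableau, the Python below bins years into 5-year intervals and counts records per cell. The bin boundaries (starting at 1964) and the `(program, year)` input shape are my assumptions for illustration, not an export of the Tableau groups.

```python
from collections import Counter


def five_year_bin(year, start=1964, width=5):
    """Label the 5-year interval containing `year`, e.g. 1966 -> '1964-1968'."""
    low = start + ((year - start) // width) * width
    return f"{low}-{low + width - 1}"


def heatmap_counts(grants):
    """Count grants per (program, 5-year bin).

    `grants` is an iterable of (program, year) pairs.
    """
    return Counter((program, five_year_bin(year)) for program, year in grants)
```

Each key of the resulting counter corresponds to one square in the heat map, and its count is what drives the color.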

Question 2: How has the size of grants changed?

I used the grouped version of the Commission Date variable (created for the first visualization) in Columns and Final Grant Award in Rows. I then attempted to create a box plot with the main features in Tableau, but was unable to figure out the correct technique. I instead used the Show Me feature’s option for a box-and-whisker plot. The visualization still didn’t look right, so I added Final Grant Award to the Detail property of the Marks card, turned it into a Dimension instead of a Measure, and changed the type of mark to Square. This produced the following box plot.

box plot original

There are a few large outliers in this box plot, so I added the name of the grantee institution and the grant program to the Detail property of the Marks card. Two of these extra-large grants were given to the University of Virginia for the same publishing project, so I felt that these additional variables could be important in understanding why such outliers exist. I also copied this worksheet and removed the outliers from that copy in order to see a close-up version of the boxes.

boxplot without outliers
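To make the idea of an outlier concrete, here is a small Python sketch of the statistics a box-and-whisker plot summarizes. It assumes the common convention that whiskers extend to the most extreme values within 1.5 times the interquartile range; the rule Tableau applied to my plot depends on its settings, and its quartile method may differ slightly from the one used here.

```python
import statistics


def box_stats(values, k=1.5):
    """Quartiles, whisker ends, and outliers using the k * IQR rule."""
    data = sorted(values)
    # Quartiles by linear interpolation over the data (one of several methods).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Running this once per 5-year group of Final Grant Award values would give one box per interval. Dropping the values listed under `outliers` corresponds to the close-up copy of the worksheet.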

Question 3: How have grants been distributed across the United States?

I also used the Show Me feature for my last visualization. I created a symbol map, which used the latitude and longitude values generated by Tableau from the location variables of the grantee institutions. I added the Number of Records to the Size property of the Marks card so that concentrations of grants could be easily identified. I also added the Institution name to the Detail property, as a concentration of grants may be related to a preferred grantee. The addition of a Filter for the Commission Date was my final step – I made this a slider bar that shows a range of values, so that users can adjust the bar to show any period of time they desire.

map of grants
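The symbol sizes on the map come from counting grants per location within whatever date range the slider selects. A minimal sketch of that aggregation in Python follows. The field names `institution`, `zip`, and `year` are hypothetical, and the geocoding that Tableau performs is not reproduced here.

```python
from collections import Counter


def grants_per_institution(grants, first_year, last_year):
    """Count grants per (institution, zip) commissioned in [first_year, last_year].

    `grants` is an iterable of dicts with 'institution', 'zip', and 'year' keys.
    """
    return Counter(
        (g["institution"], g["zip"])
        for g in grants
        if first_year <= g["year"] <= last_year
    )
```

Changing `first_year` and `last_year` plays the role of moving the slider: the counts, and therefore the symbol sizes, are recomputed for the chosen period.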

Future Directions

One thing that I failed to account for in the box-and-whisker plot is inflation. Ideally, I would have used grant values adjusted to present-day dollars, thereby eliminating inflation as a cause of the increase in grant size. My heat map could be improved through further investigation into the types of grant programs and how much overlap there is between the types of projects funded by similar programs. It is possible that further examination would result in some categories being collapsed into larger buckets. The map of grants could also be made into a Story in Tableau, if there are distinct differences between decades or other measures of time that can be explained by changes in funding priorities or other events.
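The inflation adjustment mentioned above is a simple ratio: multiply each nominal amount by the price index for the target year divided by the index for the grant year. The sketch below shows the arithmetic in Python. The index values are placeholders chosen to make the math easy to follow, not real CPI figures; a real analysis would load a published price index.

```python
def to_present_dollars(amount, grant_year, target_year, price_index):
    """Scale a nominal amount by the ratio of target-year to grant-year index."""
    return amount * price_index[target_year] / price_index[grant_year]


# Placeholder index values for illustration only; these are NOT real CPI figures.
placeholder_index = {1964: 50.0, 2013: 200.0}

# With these placeholders, a 10,000-dollar grant from 1964 scales by 200 / 50.
adjusted = to_present_dollars(10_000, 1964, 2013, placeholder_index)
```

Applying this to the Final Grant Award column before building the box plot would put every interval on the same scale.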

References

Few, S. (2009). Now you see it: Simple visualization techniques for quantitative analysis. Oakland, CA: Analytics Press.

 

The live version of my final dashboard can be seen here: https://public.tableau.com/views/NHPRCGrants/Dashboard1?:embed=y&:display_count=yes&:showTabs=y