American Experience Feature Documentaries, 2015 to 2017



Last semester (Spring 2017), I conducted a survey and wrote a paper about archival producers in the documentary film industry. I surveyed twelve archival producers connected through an informal, largely New York-based network, asking for long-form responses about how they perceived the industry and their companies to value their work in terms of esteem, crediting, and monetary compensation. I also asked whether they thought gender played a role in how their professional work was valued. The responses indicate that, at least in the group I surveyed, archival producers feel they are not given enough time or resources to do their work, and that they see the field as predominantly female and neglected because of that. Two archival producers echoed an idea I had heard before in casual conversation: that archival producers are valued much less than people in technical post-production roles that they considered comparable in effort and skill.

Unfortunately, because there is little existing research on documentary cast and crew demographics, it was difficult to generalize these feelings more broadly. I wanted to look at a broader selection of documentary credits to examine whether cast and crew roles showed gender imbalances, and to visualize a network of cast and crew members showing the connections between people in different roles, as a portal for exploring the issue of credits. My intended audience is both adults interested in issues of gender and media and professionals in the documentary field.


I decided to begin looking at the broader issue of gender in documentary casts and crews by gathering data from the last three seasons of the PBS affiliate WGBH’s documentary series, American Experience. In total I gathered cast and crew lists from twenty-six films starting in 2015, then cleaned and arranged the data into two tables. In the first, each credit is a unit, which means that a single person can be listed several times if they have several credits. In the second, individual people are the units, and their credits and films are gathered into strings in a single descriptive column. I used Tableau Public 10.4, Gephi 0.9.2, and the sigma.js Gephi plugin to create two complementary visualizations.
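The relationship between the two tables can be sketched in a few lines of Python. This is only an illustration, not the cleaning process actually used for the project: the column names, the sample rows, the helper `people_table`, and the choice to keep credits and films as two separate joined strings are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical credit-level rows: one row per credit, so a person can repeat.
# Names, credits, and films are invented for illustration.
credits = [
    {"person": "Pat Doe", "credit": "Archival Researcher", "film": "Film A"},
    {"person": "Pat Doe", "credit": "Associate Producer", "film": "Film B"},
    {"person": "Sam Roe", "credit": "Sound Design", "film": "Film A"},
]


def people_table(credit_rows):
    """Collapse credit-level rows into one row per person.

    Each person's distinct credits and films are joined into strings,
    in the order they first appear.
    """
    grouped = defaultdict(lambda: {"credits": [], "films": []})
    for row in credit_rows:
        entry = grouped[row["person"]]
        if row["credit"] not in entry["credits"]:
            entry["credits"].append(row["credit"])
        if row["film"] not in entry["films"]:
            entry["films"].append(row["film"])
    return [
        {
            "person": person,
            "credits": "; ".join(entry["credits"]),
            "films": "; ".join(entry["films"]),
        }
        for person, entry in sorted(grouped.items())
    ]


people = people_table(credits)
```

The credit-level table suits counting credits by gender in a dashboard, while the person-level table suits a network in which each person is a single node.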

To add gender data, I manually assigned gender based on first names, leaving gender unassigned where the name was ambiguous. Where possible, I also looked up the gender of people with ambiguous names, especially those weighted heavily for connectedness. Unfortunately, one of the greatest flaws of this method of assigning gender is that it favors Western names, which matters particularly for the documentaries with reenactments shot internationally. I am also unable to account for non-binary identities and may have misgendered a crew member.
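The assignment for this project was done by hand, but the same decision rule can be written down as a short sketch. Everything named here is hypothetical: the `NAME_GENDER` lookup, the `MANUAL_OVERRIDES` table standing in for individual look-ups, and the sample names. A lookup like this inherits the Western-name bias and the binary limitation described above.

```python
# Hypothetical first-name lookup; a real list would be far larger.
NAME_GENDER = {"maria": "F", "anne": "F", "john": "M", "eric": "M"}

# Hypothetical results of looking up specific people with ambiguous names.
MANUAL_OVERRIDES = {"Jordan Example": "F"}

UNASSIGNED = "Unassigned"


def assign_gender(full_name):
    """Return "F", "M", or "Unassigned" for a credited name.

    A manual look-up wins over the first-name guess; any name that is
    not in the lookup stays unassigned rather than being guessed.
    """
    if full_name in MANUAL_OVERRIDES:
        return MANUAL_OVERRIDES[full_name]
    parts = full_name.split()
    if not parts:
        return UNASSIGNED
    return NAME_GENDER.get(parts[0].lower(), UNASSIGNED)
```

Keeping the overrides in their own table also makes it easy to record corrections if someone credited asks for a change.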

UX Methods and Design Choices

I brought in a UX tester early in my network design after I realized that I was unsure how best to balance the large amounts of data I wanted to convey with a clear interface. I had a network that had clearly separated into clusters that seemed to represent different films, with clear indications of degree of connectedness, but I was unsure whether the clusters were important or whether there was a good way to introduce gender and other metadata to the visualization. My friend Anne, a data specialist at the city Department of Education and a student in CUNY’s business analytics program, agreed to look it over. Because I was not yet sure of the visualization’s strengths, I had her perform a think-aloud. Examining the network, she immediately wanted to know the credit and film details of specific nodes when she clicked in, and she also thought a brighter color should be used to distinguish nodes from edges. She understood that the clusters represented groups of highly connected people, probably representing one film, and thought that node colors would be better used to express gender. However, when we tested red for women and blue for men, the visual effect suggested gender parity in crediting when the numbers actually reflected closer to a 60/30 split for men/women.

My second tester, Alex, an information professional, examined both the network and the dashboard. I had him complete tasks that involved narrowing in on particular people and credits in the sigma.js interface. He was able to locate my lone credit in the network and find all of the credited researchers. He was not able to tell me, by looking at the dashboard, the total number of various credits by gender, because it turned out that the sum function was not working properly when totals were expressed visually as percentages in bar charts. In a think-aloud he was concerned that expressing the totals in the dashboard as percentages was unclear. However, after testing percentages versus sum totals, he agreed that the percentages were best for comparing. I ended up displaying the overall credits and production company charts as percentages, with the sum totals of the credits displayed on top for reference. I then showed the subset of archival producers versus post-production as totals, with the percentages on top as labels. Alex thought that the flipped percentage/total displays could be confusing, but he understood them quickly enough and found that the overall effect was helpful and told the story of gender balance on the crews well.
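The pairing of percentages with totals can be illustrated outside Tableau. The following is a minimal sketch, assuming a credit-level list of (credit type, gender) pairs; the sample rows and the helper `gender_breakdown` are invented for the example and do not reproduce the dashboard's calculations.

```python
from collections import Counter

# Hypothetical credit-level rows: (credit type, assigned gender).
rows = [
    ("Editor", "M"), ("Editor", "M"), ("Editor", "F"),
    ("Researcher", "F"), ("Researcher", "F"),
    ("Researcher", "M"), ("Researcher", "Unassigned"),
]


def gender_breakdown(credit_rows):
    """Return {credit: {gender: (count, percent of that credit's total)}}."""
    counts = {}
    for credit, gender in credit_rows:
        counts.setdefault(credit, Counter())[gender] += 1
    breakdown = {}
    for credit, by_gender in counts.items():
        total = sum(by_gender.values())
        breakdown[credit] = {
            gender: (n, round(100 * n / total, 1))
            for gender, n in by_gender.items()
        }
    return breakdown


breakdown = gender_breakdown(rows)
for credit, by_gender in breakdown.items():
    for gender, (n, pct) in by_gender.items():
        print(f"{credit:<12}{gender:<12}{n:>3}  {pct:>5}%")
```

Percentages make credit types of very different sizes comparable, while the raw counts show how many credits each percentage actually rests on.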

My third tester, Eric, is a professional archival producer. We had an open discussion about the visualization, covering some of the findings it revealed and the best ways to tweak the visualizations to convey those findings. He was especially curious to filter the dashboard on credits, so I added a universal filter feature to enable this. He suggested I add totals for the post-production and archival subsets, both for the subgroups and for the whole, and suggested I change the title “archival” to “research and clearances” to better encompass the extent of the work included in that term.

In the end I wanted to choose consistent colors across the Tableau dashboard and the network to represent gender, while considering potentially color-blind users, the aesthetics of the visualization, and the balance between communicative colors and gender stereotypes. I chose yellow for women, blue for men, and gray for unassigned genders.


The dashboard provides an overview of the gender breakdown across the production companies, the credit types, and highlighted roles. Looking at the visualization, I was struck by how quickly the gender imbalance in different roles became clear. Information and communication roles were often female-dominated, and technological roles like sound design were male-dominated; I had expected both. However, I was glad that I left some of the credits ungrouped instead of putting them into broader categories, because I was very surprised to see that the post-production supervisor credit, specifically, was female-dominated. I was also surprised to see how few credits there were for “Archival Producer.” It seems instead that many people are credited simply as “Archival Researcher.”

The dashboard can be seen here:!/vizhome/AmericanExperienceFeatureDocs/Dashboard1?publish=yes

Here is a static screenshot:

One additional revelation from the visualization was that, aside from “Special Thanks”, one of the largest numbers of credits for any one group on these documentaries goes to “Interns”. This was not the focus of my research to start with, but I think it could be something to explore further in the future. I was also struck by the evidence that while most of the companies commissioned to make documentaries for American Experience were male-dominated, the core staff at American Experience is predominantly female.

This ties into the network of credited people, which can be seen here:

Here is a static screenshot without the labels that can be seen in the dynamic view and are critical to experiencing it:

Blue: male. Yellow: female. Gray: unassigned gender.

Though the network is dominated by blue dots, each film’s cast and crew appears similarly gender-diverse. It seems that rather than a diversity problem between companies, there is more of a problem in the roles that people take on within films. I was surprised to see the predominance of yellow dots, for women, among the nodes that were larger because of their connectivity; these were all American Experience core staff who worked on almost every project. While American Experience is still less gender-balanced than other industries, it actually does better than the industry as a whole; perhaps the gender equity at the top contributes to gender equity further down.

The network itself was highly connected, thanks to the central crew members. It had a diameter of 3, an average path length of 1.895, and a density of 0.109.
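For readers unfamiliar with these measures, here is a minimal sketch of how they are commonly defined for an undirected, connected network: density is the share of possible edges that exist, diameter is the longest shortest path between any two nodes, and average path length is the mean shortest path over all pairs. The figures above came from Gephi; the toy edge list and the helper names below are hypothetical and only illustrate the definitions.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def network_summary(edges):
    """Density, diameter, and average path length.

    Assumes an undirected, connected graph with at least two nodes.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    density = 2 * m / (n * (n - 1))
    all_dist = {node: bfs_distances(adj, node) for node in adj}
    lengths = [all_dist[a][b] for a, b in combinations(adj, 2)]
    return {
        "density": density,
        "diameter": max(lengths),
        "average_path_length": sum(lengths) / len(lengths),
    }


# Toy network: one hub (like a core staff member) linked to four people,
# two of whom are also linked to each other.
toy_edges = [("Hub", "A"), ("Hub", "B"), ("Hub", "C"), ("Hub", "D"), ("A", "B")]
summary = network_summary(toy_edges)
```

In the toy network a single hub keeps every pair of people within two steps of each other, which is the same effect the central crew members have on the real network's short path lengths.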

Visualization Future Directions
Going forward, if I post this to my network of archival producers, I will encourage them to reach out to me if they think I should change any gender assignments. The British Film Institute did something similar in their breakdown of British films, when they also had to assign gender based on first names. I would also like to expand this project to documentaries from other companies to get a broader picture of the industry. Hopefully I can work with the IMDb interface for this; while I was working on this project, the best tool for accessing the API was not working for bulk queries. I would also like to display the network and the dashboard together, perhaps on a WordPress site. Though I would like to include some form of data about race, that would require finding official data sets from professional associations that collect such data; this is doable but would not be easy.