American Horror Story is an anthology television series that premiered in October 2011 and is currently airing its fifth season. With four years of characters and storylines that fluctuate between the past and present, all centering on the horror genre, I thought it would be interesting to examine what connections exist between these stories. More specifically, I want to examine actor prevalence in relation to character connectedness. Because the show maintains relatively the same cast across the series, I hypothesized about how prevalence would change from season to season, whether there would be trends, and who would have the most prevalence across the show thus far.
Choosing to limit my research to the four seasons of the show that have finished airing, I used a dedicated Wiki as well as the show’s IMDb pages to pull actor and character information. My parameters for including an actor in my data sets were that they had to appear in more than one episode and/or more than one season. Using a low level of interaction (appearing in the same episode together), I prepared spreadsheets for Gephi by pairing characters as undirected source-target pairs in edges tables; weight was determined by how many episodes they were in together. Since this is an anthology series, each season has a different set of characters despite having recurring cast members. Therefore, to analyze actor prevalence, each character in my nodes table was assigned an ID in addition to the season they are in. Nodes tables had characters’ names as IDs, actor names as labels, and a column for season name (Murder House, Asylum, Coven, or Freak Show). As a result I had a total of 2,999 rows of data for edges and 183 rows for nodes.
My goal with this data was to create ten infographics: two for each season using modularity and degree, as well as two comprehensive, multi-season images using the same parameters. For all visualizations I wanted node and edge size to represent degree and weight respectively, so within the Ranking tab I used the Size/Weight diamond to resize nodes by degree; edges were resized by simply selecting “Weight” as the rank parameter. Everything was arranged using the ForceAtlas 2 layout to spread out the data points, with the Dissuade Hubs, Prevent Overlap, and LinLog mode options selected. Once these aspects of my visualizations were in place, I used the Partition tab to color my nodes by modularity class, to detect communities, and by degree, which, in addition to node size, shows which nodes have the most edges (connections).
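For anyone who would rather script the pairing step than build it by hand in a spreadsheet, here is a minimal Python sketch of the idea. It is not how I built my tables, and everything in it is an assumption for illustration: the episode list is made-up toy data with placeholder character names, and the function names are hypothetical. It pairs every two characters who share an episode, counts shared episodes as the weight, and writes rows with the Source, Target, Type, and Weight columns that Gephi reads from an edges table.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Hypothetical toy data: episode -> characters appearing in it.
# Placeholder names only; these are not real episode casts.
EPISODES = {
    "ep1": ["A", "B", "C"],
    "ep2": ["A", "B"],
    "ep3": ["B", "C", "D"],
}


def build_edges(episodes):
    """Return {(source, target): weight}, where weight = shared episodes."""
    weights = Counter()
    for cast in episodes.values():
        # Sort so each undirected pair is always stored in the same order.
        for pair in combinations(sorted(set(cast)), 2):
            weights[pair] += 1
    return weights


def node_degrees(edges):
    """Degree = number of distinct characters a node shares an edge with."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def edges_csv(edges):
    """Render the edges as CSV text with Gephi-style column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, "Undirected", weight])
    return buffer.getvalue()


edges = build_edges(EPISODES)
degrees = node_degrees(edges)
print(edges_csv(edges))
```

Note that weight and degree measure different things here: weight counts how many episodes a pair shares, while degree counts how many different characters a node is connected to, regardless of weight.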
Instead of having a different color for each degree, I color coded them from highest to lowest by tens (50s, 40s, and so on), as one will see in the following visualizations. For an additional level of interaction I added images to the nodes with the highest degrees. While the identifying label for each node is the name of the actor, the representative image is of the character they played during that season. For example, there is a picture of Evan Peters as Tate Langdon in the “Murder House” graphic and as Kit Walker in the “Asylum” graphic.
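The grouping by tens can be expressed in a few lines of Python. This is only a sketch of the binning idea, not something I ran inside Gephi; the function names and the sample degree values are hypothetical.

```python
from collections import defaultdict


def degree_band(degree, width=10):
    """Return the band a degree falls in, e.g. 57 -> '50-59'."""
    low = (degree // width) * width
    return f"{low}-{low + width - 1}"


def group_by_band(degrees, width=10):
    """Group node names by degree band so each band can share one color."""
    groups = defaultdict(list)
    for name, degree in degrees.items():
        groups[degree_band(degree, width)].append(name)
    return {band: sorted(names) for band, names in groups.items()}


# Hypothetical degrees for placeholder nodes.
example = {"node_a": 57, "node_b": 52, "node_c": 41, "node_d": 8}
bands = group_by_band(example)
print(bands)
```

Each band then gets a single color, so two nodes with degrees of 52 and 57 look the same in color and differ only in size.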
Communities and degree groups were extremely similar. Despite my using a low level of interaction, the main characters for each season were in communities together and had the highest degrees. My smallest data set was for Asylum; however, Murder House (season one) had the lowest degree numbers. This may be because it has only twelve episodes while the other seasons have thirteen, since the number of characters in its nodes table is comparable to that of Coven, which is third highest behind Freak Show (the highest).
To expand upon that point, Freak Show had the largest amount of data, with sixty characters and 1,223 interactions in the edges table. This makes sense to me considering the theme and setting of this season and the number of actors needed to transport viewers into its world. It is not as self-contained as a season like Murder House, where very little occurred outside of the Harmon home.
All seasons examined consistently had the same five actors: Evan Peters, Frances Conroy, Jessica Lange, Lily Rabe, and Sarah Paulson, so this analysis of prevalence became an interesting look at how they fluctuate throughout them all. Looking at each season individually, I was not surprised by the most prevalent actors. Although these five appear in all seasons, they were not always in roles that permitted a high number of edges. Consider Sarah Paulson, for example, and her jump from Billie Dean Howard to Lana Winters.
Then, as more main actors are added as seasons go on – Kathy Bates, Angela Bassett, Emma Roberts – and actors from previous seasons return – Taissa Farmiga, Denis O’Hare – interesting connections and trends begin to emerge.
Once I compiled all of my data into one visualization, I was able to fully answer some of the hypotheses I had about actor prevalence in American Horror Story. Communities are grouped by season. As one can see from the visualizations below, partitioning by season and partitioning by modularity give nearly identical results.
I believed that I would see Jessica Lange and Evan Peters reign supreme, as they consistently play main characters in seasons one through four, with actors like Sarah Paulson, Angela Bassett, and Denis O’Hare following close behind. This was undoubtedly confirmed. What I was not expecting was that the actress with the highest degree would be Naomi Grossman, who plays Pepper in both Asylum and Freak Show. As one can see from the visualization below, her node is by far the largest with a degree of eighty-four, twenty-five points above the next-highest degree of fifty-nine, which is shared by Sarah Paulson, Jessica Lange, Evan Peters, Erika Ervin, and Rose Siggins in season four. I believe this is because Naomi Grossman is the only actor to play the same character in a primary role in two different seasons, and the edges for her node support this.
As mentioned in the methodology section, I color coded degrees from largest to smallest. Below Naomi Grossman, degrees are colored from highest to lowest: red, blue, yellow, green, purple. Comparing this along with the size of the seasons against each other, Freak Show holds the most prominence, something that was hinted at when creating the visualizations for each season individually. Looking at the series as a whole, prominence really changed for someone like Sarah Paulson if you compare her playing Dot and Bette to the psychic Billie Dean Howard in Murder House. I also found it interesting that actors like Angela Bassett, Kathy Bates, and Emma Roberts maintained prevalence after being introduced into the series, compared to an actress like Jamie Brewer, whose roles have varied in degree. A final observation I made was that although Sister Mary Eunice (Lily Rabe’s season two character) only appears in one episode in season four, her circle is red, again showing that being in more than one season as the same character increases prominence more than being a lead in every single season.
As time goes on I would like to keep up with these data sets and create them for season five (Hotel) and the recently announced season six. I really enjoy making these visualizations using data from television shows I watch, and collecting the data is actually pretty enjoyable when the topic is one you have a lot of interest in. I would also be interested in examining deeper levels of interaction, like characters being in the same scene or speaking to each other. I would also be curious to see how these visualizations would change if the edges table had actor names as source and target as opposed to character names, which would in turn affect the label and ID in the nodes table.
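As a rough idea of what that last change might look like, here is a small Python sketch that re-keys a character-level edges table by actor. I have not tried this on my data. The function name is hypothetical and the edge weights are made up; only the actor-to-character pairings (Evan Peters as Tate Langdon and Kit Walker, Sarah Paulson as Billie Dean Howard and Lana Winters) come from the show.

```python
from collections import Counter


def collapse_to_actors(character_edges, actor_of):
    """Re-key character edges by actor, summing weights across roles."""
    actor_edges = Counter()
    for (source, target), weight in character_edges.items():
        a, b = actor_of[source], actor_of[target]
        if a == b:
            # Two roles played by the same actor would form a self-loop.
            continue
        # Sort so each undirected actor pair has a single key.
        actor_edges[tuple(sorted((a, b)))] += weight
    return actor_edges


# Character -> actor pairings taken from the show.
ACTOR_OF = {
    "Tate Langdon": "Evan Peters",
    "Kit Walker": "Evan Peters",
    "Billie Dean Howard": "Sarah Paulson",
    "Lana Winters": "Sarah Paulson",
}

# Hypothetical weights, made up for illustration only.
CHARACTER_EDGES = {
    ("Tate Langdon", "Billie Dean Howard"): 1,
    ("Kit Walker", "Lana Winters"): 5,
}

actor_edges = collapse_to_actors(CHARACTER_EDGES, ACTOR_OF)
print(actor_edges)
```

With actors as nodes, each performer would appear once in the combined graph instead of once per role, so degree would reflect every character they have played.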
Data and Visualization Images: