Visualizing the Game of Thrones Social Network


Networks, Visualization

Introduction 

Game of Thrones (GoT) is a widely popular series based on George R.R. Martin's novel series “A Song of Ice and Fire.” It is the most popular show in HBO history, averaging 44.2 million viewers per episode in gross audience for its eighth and final season (Mitovich, 2019). In short, it is a medieval fantasy story in which members of powerful families compete for the Iron Throne, with the intention of ruling all of Westeros. Even though the plot may sound simple, it is incredibly captivating and will have anyone hooked from the first episode. I have invested a massive amount of time and emotion in this show, so I thought this would be an excellent chance to examine the relationships between some of the characters more closely.

Inspiration

I was inspired to do a GoT network analysis after browsing some of the earlier network visualizations on the information visualization student work website. I came across many different visualizations of the Marvel universe heroes. Marvel is another set of series and movies that I have invested a massive amount of time and emotion in. I would have loved to do a network analysis of the Marvel universe with a different interpretation, but I noticed that several already existed. So I decided to create a visualization of another extremely popular fantasy universe: GoT.

Data Collection

Unsurprisingly, it was not too difficult to find a dataset for visualizing the GoT universe. There is a vast amount of data publicly available about nearly every aspect of the show. The dataset that I chose was a CSV file from GitHub, built expressly for network visualization. It contains the number of cooccurrences of characters within 15 words of each other in the book “A Storm of Swords,” which is the third book in the series. Each cooccurrence adds one to the weight of the relationship between the two characters, so the more often two characters cooccur, the higher the weight of their relationship.
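To make the idea of a cooccurrence weight concrete, here is a minimal sketch of how such counts could be produced. It is not the method used to build the dataset, which I did not create; the function name `count_cooccurrences`, the single-word character names, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def count_cooccurrences(words, characters, window=15):
    """Count how often two different character names appear within
    `window` words of each other (hypothetical helper)."""
    # Positions of every character mention in the list of words
    mentions = [(i, w) for i, w in enumerate(words) if w in characters]
    weights = Counter()
    for (i, a), (j, b) in combinations(mentions, 2):
        # j > i always holds because mentions are in reading order
        if a != b and j - i <= window:
            # Sort the pair so (A, B) and (B, A) share one undirected edge
            weights[tuple(sorted((a, b)))] += 1
    return weights

# Made-up sentence and a small window so the effect is easy to see
text = "Jon spoke to Sam while Arya trained far away in Braavos".split()
weights = count_cooccurrences(text, {"Jon", "Sam", "Arya"}, window=3)
for pair, weight in sorted(weights.items()):
    print(pair, weight)
```

With a window of 3, Jon and Sam count as a cooccurrence and so do Sam and Arya, but Jon and Arya are too far apart. Each counted pair becomes one row of an edge list, with the count as its weight.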

Screenshot of the Data from CSV file

Methods and Materials

To create this visualization, I used Gephi, an open-source network visualization tool. Gephi expects the imported data to have at least three columns: a source, a target, and a type. The source and target were already labeled as the first two columns in my file. The source column referenced one character and the target column referenced another. I had to create an additional column labeled “type” and fill in every row with “Undirected.” The final imported CSV file had four columns: source, target, type, and weight.
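I added the type column by hand, but the same step could be scripted. The sketch below is one way to do it with Python's standard `csv` module. The header names `Source`, `Target`, and `Weight`, the function name `add_undirected_type`, and the sample rows are assumptions for illustration, so adjust them to match the real file.

```python
import csv
import io

def add_undirected_type(src, dst):
    """Copy an edge list from src to dst, adding a Type column
    filled with 'Undirected' (hypothetical helper)."""
    reader = csv.DictReader(src)
    # Column order matches the final file described above
    fieldnames = ["Source", "Target", "Type", "Weight"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({
            "Source": row["Source"],
            "Target": row["Target"],
            "Type": "Undirected",
            "Weight": row["Weight"],
        })

# Tiny in-memory example; these rows are made up, not taken from the dataset
sample = io.StringIO("Source,Target,Weight\nA,B,3\nB,C,1\n")
out = io.StringIO()
add_undirected_type(sample, out)
result = out.getvalue()
print(result)
```

For a real file, the two `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`, one for reading and one for writing.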

After I uploaded the file, Gephi displayed the number of nodes and edges it contained, and I checked these against the file to confirm they were correct: 107 nodes and 352 edges. Then I inspected my data in the preview tab, but I could not make sense of what I was looking at. It was simply a jumble of dots and lines, or rather nodes and edges.

Screenshot of Gephi with unformatted data

I ran some statistics on the network, then used them to refine the visualization. The statistics are as follows:

  • Average Degree – 6.579
  • Average Weighted Degree – 80.822
  • Network Diameter – 6
  • Graph Density – 0.062
  • Modularity – 0.6
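Two of these figures can be checked by hand from the node and edge counts alone. The short sketch below uses the standard formulas for an undirected graph (average degree = 2E / N, density = E / (N(N - 1) / 2)); the variable names are my own.

```python
# Node and edge counts reported by Gephi for this dataset
nodes = 107
edges = 352

# Each undirected edge contributes to the degree of two nodes
average_degree = 2 * edges / nodes

# Density is the share of all possible undirected edges that exist
possible_edges = nodes * (nodes - 1) / 2
density = edges / possible_edges

print(f"Average degree: {average_degree:.3f}")  # 6.579
print(f"Graph density: {density:.3f}")          # 0.062
```

Both results match the values above, which is a useful check that the file was imported as an undirected graph with no duplicated edges.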

Results

Based on the statistics, I created two visualizations. The first visualization shows the relationships of the characters by geographic location. Characters that are geographically closer had more cooccurrences. To show this, I colored the nodes by modularity class, so that the clusters that formed naturally are easy to tell apart. Additionally, I sized each node in relation to its degree, which makes it clear which characters have the most connections, since they have the largest nodes.
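Gephi calculates degree internally, but the idea behind the node sizes is simple enough to sketch. In the example below, degree is the number of edges touching a node and weighted degree is the sum of their weights. The function name `node_degrees` and the edge list are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

def node_degrees(edges):
    """Return (degree, weighted_degree) counters for an undirected
    edge list of (source, target, weight) tuples (hypothetical helper)."""
    degree = Counter()
    weighted = Counter()
    for source, target, weight in edges:
        # An undirected edge counts once for each of its two endpoints
        for name in (source, target):
            degree[name] += 1
            weighted[name] += weight
    return degree, weighted

# Made-up edges: A is connected to three other nodes
sample_edges = [("A", "B", 3), ("A", "C", 1), ("A", "D", 2)]
degree, weighted = node_degrees(sample_edges)
print(degree.most_common(1))    # the node that would be drawn largest
print(weighted["A"])            # total cooccurrences involving A
```

Sizing by degree highlights characters who interact with many different characters, while sizing by weighted degree would highlight characters with the most cooccurrences overall.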

Visualization 1: GoT Communities by Geographic Location

After looking at this visualization for some time, I realized that it didn’t show the relationships between the characters, or at least it wasn’t very easy to interpret. In the first visualization, the edges are weighted according to the number of cooccurrences. To make these relationships easier to see, I simply changed the edge colors and chose curved edges instead of straight edges.

Visualization 2: Relationships between GoT Characters

This second visualization shows the stronger relationships between characters much more clearly than the first.

Reflection

I enjoyed this project because I got to create a network visualization of a show that I have spent many hours watching. However, there are a few aspects that I wish I could have improved. The first major one is the dataset. It only contains cooccurrences from the third novel in the series, which corresponds only to seasons three and four of the HBO series. I think it would be very interesting to recreate these visualizations with a dataset that contained the cooccurrences from every novel in the series. I also wonder how different these visualizations would be if the edges were directed. Maybe we’d learn more about the relationships between characters, rather than just which pairs had more cooccurrences. Finally, I would have really liked to place the node labels somewhere other than directly on top of each node. I think they would be much easier to read if they were placed slightly below or above each node.

Reference:

Mitovich, M. W. (2019, May 20). Ratings: Game of Thrones Series Finale Breaks Records, Tops 19 Million Viewers. Retrieved November 12, 2019, from https://tvline.com/2019/05/20/game-of-thrones-finale-ratings-season-8-iron-throne/.