About the Dataset
The dataset was originally compiled by Andrew Beveridge and Jie Shan, a Math professor and graduate student, respectively, at Macalester College. (Fun fact: NPR’s Weekend Edition featured this project in a story on April 16, 2016. There is a corresponding piece on the NPR website here.) The nodes represent the characters in George R.R. Martin’s book A Storm of Swords, the third novel in A Song of Ice and Fire, the series upon which the HBO show Game of Thrones is based. There are 107 characters (nodes) in this dataset, and each edge is weighted by the number of times two characters appear within 15 words of one another in the novel. (One question I have, unresolved through research, is how directionality is established. I would expect each edge to be bidirectional, but this is not the case.)
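To make the 15-word rule concrete, here is a minimal sketch of how such co-occurrence weights could be counted. It is not the dataset authors' actual code: the function name `cooccurrence_weights`, the single-word name matching, and the choice to treat each pair as one undirected edge are all assumptions made for illustration.

```python
import re
from collections import Counter

def cooccurrence_weights(text, names, window=15):
    """Count how often two different names appear within `window` words of each other."""
    words = re.findall(r"[A-Za-z']+", text)
    # Positions of every word that exactly matches a character name
    # (assumption: each character is identified by a single word).
    mentions = [(i, w) for i, w in enumerate(words) if w in names]
    weights = Counter()
    for a in range(len(mentions)):
        pos_a, name_a = mentions[a]
        for b in range(a + 1, len(mentions)):
            pos_b, name_b = mentions[b]
            if pos_b - pos_a > window:
                break  # mentions are in text order, so later ones are even farther away
            if name_a != name_b:
                # Sort the pair so (Jon, Sam) and (Sam, Jon) count toward one undirected edge.
                weights[tuple(sorted((name_a, name_b)))] += 1
    return weights

# Made-up sample sentence, with a small window so the cut-off is visible.
sample = "Jon spoke with Sam at the Wall. Far away, Daenerys marched on Meereen."
edges = cooccurrence_weights(sample, {"Jon", "Sam", "Daenerys"}, window=5)
print(dict(edges))
```

With a window of 5, only Jon and Sam are close enough to be linked. Widening the window to 15 would also connect both of them to Daenerys, which shows how sensitive the resulting network is to that one parameter.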
Configuring the Initial Visualization Approach
After importing the dataset into Gephi, I began by playing around with various layouts, starting with ForceAtlas 2. The one that seemed most promising, particularly in terms of revealing node outliers, was OpenOrd, so I went with that approach.

Fig 1: Various data visualization approaches within Gephi
First Iteration of Visualization
I then clarified the visualization by adjusting the size of the node bubbles by degree (min = 1, max = 50) and the weight of the edges, also by degree. Next, I partitioned the nodes by modularity class, and ended up with the following:

The value of this visualization is that it lets users quickly see two key things: (1) the centrality of specific characters, in terms of their connections to other characters, and (2) which characters exist in a network “island,” i.e. their community is relatively unintegrated with other character networks within the system. Daenerys and her network (orange) particularly stood out in this regard.
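For readers who want to see what sizing nodes by degree amounts to, here is a rough sketch outside Gephi. It assumes a simple linear mapping from degree onto the 1 to 50 size range. Gephi's own interpolation may differ, and the function names and sample edges are illustrative only.

```python
from collections import defaultdict

def degrees(edges):
    """Return each node's degree (number of distinct neighbors) from an edge list."""
    neighbors = defaultdict(set)
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return {node: len(adj) for node, adj in neighbors.items()}

def scale_sizes(deg, min_size=1.0, max_size=50.0):
    """Linearly map degrees onto [min_size, max_size] (assumed interpolation)."""
    lo, hi = min(deg.values()), max(deg.values())
    span = (hi - lo) or 1  # avoid dividing by zero when every degree is equal
    return {n: min_size + (d - lo) / span * (max_size - min_size) for n, d in deg.items()}

# Illustrative edges only, not rows from the real dataset.
edges = [("Tyrion", "Jaime"), ("Tyrion", "Cersei"), ("Tyrion", "Sansa"), ("Daenerys", "Jorah")]
deg = degrees(edges)
sizes = scale_sizes(deg)
print(deg["Tyrion"], sizes["Tyrion"], sizes["Jorah"])
```

In this toy example Tyrion has the highest degree and gets the largest bubble, while Daenerys and Jorah form exactly the kind of disconnected “island” described above.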
By simply looking at the static visualization (Fig 2), it’s hard to discern how diverse – in terms of being connected with various modularity classes – any given character is. However, by hovering over each of the main characters, you can more easily visualize the unique connections. The following screenshots were taken based on this approach:

Fig 3: GOT characters and their connections (intra- and inter-modularity class)
Feedback on First Iteration of Visualization
I showed both figures to a GOT enthusiast (User #1) for feedback, and he said he definitely preferred Fig 3. When I asked him to explain, he said that he liked being able to isolate each character, and that looking at the dataset in its entirety (Fig 2) was too overwhelming. Not only was it too graphically dense, but he claimed he had little interest in knowing anything about the minor characters. He recommended condensing the dataset into only the major characters, and also suggested color coding the character nodes not by modularity class (which was an algorithmic construct irrelevant to him) but by which major House (i.e., family) each character belonged to. Much of the plot of Game of Thrones revolves around rivalries among nine noble houses, so he encouraged me to re-run the data through that prism. I gathered House affiliation data and added a column in the dataset for this purpose.
Second Iteration of Visualization
In the dataset, I sorted the nodes by degree, then removed more than half to arrive at a set of 50 characters. I then re-ran the dataset in Gephi with OpenOrd, this time using House as the partition. Finally, I manually repositioned the nodes to be clustered by House, pulling the main characters (those with the most edges) to the outermost periphery. Just for fun, I re-ran the analysis using the Fruchterman Reingold layout, which yielded an interesting perspective. Rather than encouraging the viewer to prioritize House affiliation, this version focused the viewer’s attention on the largest nodes, which were centrally located, while still letting them trace the House affiliations.
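The trimming step above was done by hand, but it can be sketched in a few lines. This is only an approximation of the manual process: it assumes each character pair appears once in the edge list, ranks nodes by their degree in the full graph, and breaks ties alphabetically. The function name and sample rows are hypothetical.

```python
def top_nodes_subgraph(edges, keep=50):
    """Keep the `keep` highest-degree nodes and only the edges between them."""
    degree = {}
    for source, target, _weight in edges:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    # Highest degree first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    kept = set(ranked[:keep])
    kept_edges = [e for e in edges if e[0] in kept and e[1] in kept]
    return kept, kept_edges

# Illustrative rows (source, target, weight); the weights are placeholders.
edges = [
    ("Tyrion", "Jaime", 10),
    ("Tyrion", "Cersei", 8),
    ("Jaime", "Cersei", 5),
    ("Tyrion", "Podrick", 3),
]
kept, kept_edges = top_nodes_subgraph(edges, keep=3)
print(sorted(kept), len(kept_edges))
```

Note that removing a node also removes its edges, so the degrees of the remaining characters shrink after trimming. Ranking on the full graph, as this sketch does, keeps the selection tied to each character's prominence in the whole novel.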

Feedback on Second Iteration of Visualization
I shared the visualizations with User #2, who happened to be much more conversant with the Game of Thrones franchise. (In addition to having watched the HBO series, he’d read the George R.R. Martin books in high school.) While viewing the visualizations, he made an interesting observation: while in the broadest sense the visualizations captured the full network of relationships, they obscured the evolution of these relationships, ignoring the dimension of time in the unfolding story. He suggested creating a panel of separate visualizations based on time periods, or better yet, developing a dynamic visualization that tracks how the network changes over time. I loved the idea, but unfortunately my dataset lacked the parameter of time, so I ran into a dead end. However, User #2 also suggested that it would be interesting to overlay the network onto the geography of Game of Thrones, since each of the Houses was physically located within a particular region of the (fictional) Westeros continent. That was something I could try, with a bit more research and many more hours of design work.
Third Iteration of Visualizations
This time around, I used Adobe Illustrator to modify a map of Westeros, labeling different regions by House dominance, and using low-opacity color to define the regions. I then went into the Gephi OpenOrd visualization and changed the color scheme of the nodes to coordinate with the colors of the regions. Using trial and error, and toggling between Gephi and Illustrator, I nudged the nodes to conform to the geographic confines of the regions. I decided to move all the nodes without House affiliation to the outskirts of the map. I worry that this may be a bit confusing to the viewer, but the map got too cluttered to include them in the interior, and there was also no logical way to position them without knowing more about their character histories.

Feedback on Third Iteration of Visualization & Next Steps
When I showed this visualization to User #3, another avid Game of Thrones fan, he liked the geographic overlay but echoed User #2’s wish to see how the network evolved over time. He also mentioned that it would be interesting to somehow visualize the nature of the relationships: not just whether two characters were connected, but how. Were they friends, rivals, lovers, relatives or just acquaintances? We discussed that one way to do that would be to differentiate the edges, possibly by color-coding the lines depending on relationship category. Another option might be to render this information as a text box when hovering over the edge in question. He recommended a filtering option, so that users could drill down into versions based on dimensions like relationship, time period, and even House. While that’s beyond the scope of this dataset (i.e., it does not contain time periods or the nature of relationships), here is a mockup of what he conceptualized:

Conclusion
Gephi is an incredible tool that enables a unique form of network visualization, but I found it fairly cumbersome and wonky to use; some features, such as filtering and the preview function, were unreliable, only working occasionally (?!). My broader insight from this project is that so much of the potential of a visualization is based on the source and nature of the original dataset. This Game of Thrones dataset was based on an algorithm that I found somewhat limiting: an edge was created between two nodes if the characters were found within 15 words of one another in the text. (Why not 20? Why not pronouns? Was directionality established based on the order of the mentions in the text?) In addition, the dataset omitted any reference to the nature of the relationship and the time period of the connection. If I had more time to construct the dataset myself, I would include these and other elements and use them to build a more dynamic, interactive visualization approach.
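As a starting point for that richer dataset, here is one possible shape for an edge record that carries relationship type and time period, plus the kind of filter my users asked for. Everything here is hypothetical: the `Interaction` fields, the relationship categories, and the use of an integer period are design assumptions, and the sample rows are placeholders rather than real data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    relationship: str  # hypothetical category, e.g. "relative", "friend", "rival"
    period: int        # hypothetical time index, e.g. a chapter or act number

def filter_interactions(rows, relationship=None, period=None):
    """Return the rows matching every filter that was supplied (None means 'any')."""
    return [
        r for r in rows
        if (relationship is None or r.relationship == relationship)
        and (period is None or r.period == period)
    ]

# Placeholder rows for illustration only.
rows = [
    Interaction("Jaime", "Cersei", "relative", 1),
    Interaction("Jon", "Sam", "friend", 1),
    Interaction("Jon", "Sam", "friend", 2),
]
friends = filter_interactions(rows, relationship="friend")
first_period = filter_interactions(rows, period=1)
print(len(friends), len(first_period))
```

Storing one row per interaction per period, rather than one aggregated weight per pair, is what would make both the time-based panels and the relationship color-coding possible.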