Exploring the Dynamic Network of Harry Potter Characters


Lab Reports, Networks, Visualization

Introduction

Written by British author J.K. Rowling, the Harry Potter novels follow the adventures of Harry Potter, a young wizard, and his friends Hermione Granger and Ron Weasley, who are all students at Hogwarts School of Witchcraft and Wizardry. The main plot centers on Harry’s battle with Lord Voldemort, an evil wizard who wants to become immortal, topple the Ministry of Magic, and take control of all wizards and Muggles (non-magical people). Spanning 7 books and 8 movies, the series is made up of interwoven narratives that can be challenging to understand in their entirety. To start making sense of this complexity, I conducted a network analysis that maps the relationships between the characters.

Inspiration

To find inspiration for my network, I searched the web to see whether others had also tried to make sense of the relationships in the Harry Potter narrative. The first visualization I took inspiration from was created by Minnie Bell. Her network analysis found 65 characters connected in 356 ways. Minnie made it easy to see which characters belong to which network group. Her analysis showed that group 1 was the largest group, which naturally included Harry Potter.

Minnie Bell’s visualization of all Harry Potter characters

Duval Alexandre created the next visualization from which I took inspiration. He built a dynamic social network of characters in order to draw insights from the novels. His analysis covered one book, “Harry Potter and the Philosopher’s Stone”, and discovered four communities. These included communities such as Harry’s family and Dumbledore’s friends, which sit a little outside the main plot of the novel. It was interesting to see how he captured the main character communities as well as the communities that were mentioned less often.

Duval Alexandre’s visualization of “Harry Potter and the Philosopher’s Stone” characters

Materials and Methods

Materials

To conduct my network analysis, I used Microsoft Excel to clean up the data I had found on GitHub. Once the data was cleaner, I imported the Excel sheet into Gephi to start creating my visualization.

Data in Excel
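The cleanup described above was done by hand in Excel, but the same kind of tidying could also be scripted. The following is only a minimal sketch under stated assumptions: the sample rows and the helper names clean_edges and to_gephi_csv are hypothetical, not taken from the dataset or from my actual workflow. It trims whitespace, drops incomplete rows and self-loops, removes duplicate undirected edges, and writes an edge list with the "Source" and "Target" column headers that Gephi's spreadsheet import looks for.

```python
import csv
import io

# Hypothetical raw rows standing in for the downloaded data (not the real file).
raw_rows = [
    (" Harry Potter", "Ron Weasley"),
    ("Ron Weasley", "Harry Potter "),      # same edge, reversed and with stray spaces
    ("Harry Potter", "Hermione Granger"),
    ("", "Hermione Granger"),              # missing endpoint, should be dropped
]

def clean_edges(rows):
    """Return unique undirected edges with trimmed names, in first-seen order."""
    seen = set()
    cleaned = []
    for source, target in rows:
        source, target = source.strip(), target.strip()
        # Skip rows with a missing endpoint and edges from a node to itself.
        if not source or not target or source == target:
            continue
        key = frozenset((source, target))  # direction is ignored
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned

def to_gephi_csv(edge_list):
    """Serialize edges as CSV text with Source/Target headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edge_list)
    return buffer.getvalue()

cleaned = clean_edges(raw_rows)
csv_text = to_gephi_csv(cleaned)
print(csv_text)
```

Treating the edges as undirected is an assumption here; if direction matters in the data, the duplicate check would need to compare ordered pairs instead.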

Methods

Once my data was in Gephi, I ran some statistics to analyze my graph and visualize it in different ways: average degree, network diameter, graph density, and modularity. The average degree is the average number of edges connected to a node. Network diameter is the longest of the shortest paths between any two nodes in the graph. Graph density measures how close the graph is to complete, that is, how many of the possible edges are actually present. Finally, modularity measures how well the network divides into groups, and Gephi uses it to detect the different communities that are present in the data (Padilla, n.d.).

Statistics used to visualize the data
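To make the four statistics more concrete, here is a minimal sketch that computes them by hand on a tiny toy graph. It rests on several assumptions: the five-node edge list is a made-up placeholder rather than the real dataset, the graph is treated as undirected and connected, and the modularity function only scores a partition that is supplied to it. It does not perform the community detection that Gephi runs.

```python
from collections import deque

# Hypothetical toy edge list (placeholder names, not rows from the dataset).
edges = [("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione"),
         ("Harry", "Voldemort"), ("Voldemort", "Bellatrix")]

def build_adjacency(edge_list):
    """Map each node to the set of its neighbours (undirected)."""
    adj = {}
    for a, b in edge_list:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def average_degree(adj):
    """Average number of edges attached to a node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def density(adj):
    """Share of all possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))

def bfs_distances(adj, start):
    """Shortest-path length from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

def modularity(adj, communities):
    """Modularity score of a given partition (no detection is done here)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    score = 0.0
    for community in communities:
        # Each internal edge is counted from both ends, hence the division by 2.
        internal = sum(1 for a in community for b in adj[a] if b in community) / 2
        degree_sum = sum(len(adj[a]) for a in community)
        score += internal / m - (degree_sum / (2 * m)) ** 2
    return score

adj = build_adjacency(edges)
partition = [{"Harry", "Ron", "Hermione"}, {"Voldemort", "Bellatrix"}]
print(average_degree(adj), density(adj), diameter(adj))
print(round(modularity(adj, partition), 2))
```

In this toy graph the longest shortest path runs from Ron to Bellatrix through Harry and Voldemort, which is why the diameter is 3 even though most characters are direct neighbours.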

Structure

After the statistics were available, I experimented with the layouts and found that Expansion was the best one for my visualization because it made the network easier to interpret. The other layouts, like ForceAtlas 2 and Fruchterman Reingold, didn’t capture the data in the way I had hoped. I ran the Expansion layout a couple of times so that everything was spaced out, which made it easier to view the labels of each node later on.

Expansion layout available in Gephi

Color and Text

As for the color, I wanted my network to include a lot of vibrant colors in order to capture the dynamic nature of the relationships in Harry Potter. Placing the colors on a black background made them pop even more and gave the visualization something of a “dark mode” look. Because of the many colors and the black background, the labels for the nodes had to be white. It took me a while to figure out the best size for the labels, but the smallest font ended up working best because it didn’t suffocate the rest of the network. For the label font, I decided to go with Lato because it has a larger x-height than other fonts. The x-height is the height of the body of the lowercase letters, and a larger one allows for better readability at small sizes, which in my case was 1px.

Final visualization without labels

Results

With 333 edges and 65 nodes, it was great to see the visualization come together. To discover who the most important characters were, I measured the degree centrality (as node degree is sometimes called) of each node. Doing so helps reveal which characters play a pivotal role, since those characters end up with high degree centrality. Because centrality is used to determine how important certain nodes are in a network, it was important to decide what exactly importance means in this case. Based on the data, I took the characters with the greatest degree centrality to be the key characters of the story. Harry Potter had the highest degree centrality, followed by Lord Voldemort, his enemy; Ron Weasley and Hermione Granger, both Harry’s friends; and Albus Dumbledore.
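As an illustration of how such a ranking can be produced, here is a minimal sketch of degree centrality on a toy edge list. The edges are hypothetical placeholders, not the real dataset, and the code assumes an undirected graph with no duplicate edges or self-loops. Each node's degree is divided by the number of other nodes, which is one common normalization; Gephi reports the raw degree instead.

```python
from collections import Counter

# Hypothetical toy edge list (placeholder names, not rows from the dataset).
edges = [("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione"),
         ("Harry", "Voldemort"), ("Voldemort", "Bellatrix")]

def degree_centrality(edge_list):
    """Degree of each node divided by the number of other nodes."""
    degree = Counter()
    for a, b in edge_list:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}

centrality = degree_centrality(edges)

# Sort by centrality (highest first), breaking ties alphabetically.
ranking = sorted(centrality.items(), key=lambda item: (-item[1], item[0]))
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy example Harry comes out on top with three of the four possible connections, mirroring the pattern seen in the full network.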

It was great to see how the lesser-known characters were connected with each other. These are connections that are hard to see through the narrative’s complexity. The visualization made it easier to understand how single characters act as go-betweens for several other characters in the story. Although the network made it easier to see certain relationships that developed over the course of the narrative, it is still quite hard to see the weaker connections outside the main hub. This is because the labels were sized in proportion to the nodes they correspond to. If the font size were increased, the labels would overlap one another and become a nightmare to read. Reading this visualization comfortably therefore depends on having a full-resolution image, which I have provided through the button link below the image. The button leads to an SVG version of the image, so you can zoom in as much as you want to see the characters that are less connected.

Final visualization with labels

Reflection

I had a really fun time creating this visualization. One thing I noticed is that it is very important to have a good dataset that is not too big. I first started off with a dataset that had so many nodes and edges that Gephi crashed every time I tried to run a statistic. I had to abandon that dataset, and I found a much more interesting one to work with, based on a topic that I knew a lot about. My network allowed me to see the connections beyond the main ones that are clearly portrayed in the books and movies.

A future direction for this experiment would be to compare the Harry Potter narrative to another famous narrative in order to explore the similarities and differences between them.

References

Alexandre, D. (2020, August 31). Explore Harry Potter via a dynamic social network of characters. Retrieved from https://towardsdatascience.com/explore-harry-potter-via-a-dynamic-social-network-of-characters-f5bed9a39f01

Bell, M. (n.d.). Network of Harry Potter Characters. Retrieved from https://rpubs.com/MinnieBell/728320

Efekarakus. (n.d.). Harry Potter Character Network Visualization. Retrieved from https://github.com/efekarakus/potter-network

Padilla, T. (n.d.). Introduction to Network Analysis. Retrieved from http://thomaspadilla.org/na2014/