Les Misérables: Coappearance of characters

Lab Reports, Networks


I’ve always been interested in creative writing, and this project was a perfect opportunity to analyze the structure of storytelling through data visualization. Every story is made up of different characters, and each of them helps define the journey of the protagonist. Inevitably, some characters have a heavier influence on the overall flow of the story than others, and I wondered whether I could demonstrate their connections and their level of influence on the story using network visualization.

I will be analyzing the degree of connectivity between the characters and their potential role in bringing different communities of characters together. With this network visualization, I hope that you can identify important characters and their respective roles at a glance.


The “Marvel social Network” visualization by Yuan Ma had a significant influence on how my network analysis project developed. Looking at the Marvel social network visualizations, I realized that one of the strongest features of network visualization is that it lets the user understand the connections between characters and how those characters have helped build up other characters within the Marvel universe. This made me look for more story-driven data to structure my own research around.


I started by looking for more creative storylines that have not been analyzed as much as other stories. Unfortunately, the only data I could find was in Excel format, and my copy of Excel could not export the files as .csv without including a byte order mark (BOM), which ultimately resulted in Gephi not being able to read the .csv file.
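A possible workaround for the BOM problem is a short script that removes the marker before the file is imported. The following is a minimal sketch, assuming the exported file is UTF-8 encoded; the function names and file paths are hypothetical, and it has not been tested against the files described here.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(src_path: str, dst_path: str) -> None:
    """Copy a file to dst_path, dropping a leading UTF-8 BOM (paths are hypothetical)."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_bom(data))
```

For example, `rewrite_without_bom("edges_from_excel.csv", "edges_clean.csv")` would write a copy of the file that starts directly with the header row.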

I tried using OpenRefine to export the Excel sheets as .csv, but that did not work either. After that, I tried to use the Marvel comics data set and narrate a new network conclusion, but Gephi would crash as soon as I clicked “open” because the data set was far larger than my laptop's processor could handle. I then tried to open about 8 more data sets from the CASOS or Gephi websites, but none of them would open: some had formatting issues, and some were too large, like the Marvel comics data.

I finally surrendered to the will of technology and chose the only data set I could find that would open without any issues and that my computer could manage to process: the “Les Misérables” data on weighted coappearance of characters.

I then proceeded to design my network visualization in Gephi and used InDesign to create additional visualizations for my report.


I started by loading my data into Gephi. I then ran the Force Atlas and Force Atlas 2 layouts to see which would work best, and quickly realized that Force Atlas was the better choice for my data set. Next, I ran some statistics: modularity, average weighted degree, average degree, and average clustering coefficient. From there, I set the size of each node to represent its degree and the color of each node to represent its betweenness centrality.
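To make a few of these statistics more concrete, here is a minimal sketch of how average degree, average weighted degree, and average clustering coefficient can be computed for a small undirected graph. The edge list and its weights are made up for illustration and are not the real data set, the function names are hypothetical, and Gephi's own implementation may differ in details, for example in how nodes with fewer than two neighbours are counted (here they count as 0). Modularity and betweenness centrality are left out to keep the example short.

```python
from itertools import combinations

# Illustrative weighted coappearance edges (made-up weights, not the real data).
edges = [
    ("Valjean", "Javert", 3),
    ("Valjean", "Cosette", 5),
    ("Valjean", "Marius", 2),
    ("Cosette", "Marius", 4),
    ("Javert", "Thenardier", 1),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> {neighbour: weight}."""
    adj = {}
    for u, v, w in edge_list:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_weighted_degree(adj):
    """Mean of the summed edge weights per node."""
    return sum(sum(nbrs.values()) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: nodes with fewer than two neighbours count as 0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


adj = build_adjacency(edges)
print(average_degree(adj), average_weighted_degree(adj), average_clustering(adj))
```

In this toy graph, Cosette and Marius each have a clustering coefficient of 1 because their two neighbours also appear together, while Javert bridges to Thenardier without closing any triangle.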

After following those steps, I realized that I myself could not get a proper grasp of the network in its current state, so I started editing the positions of nodes in relation to their communities and to the central characters.

Using the degree range filter, I tried to group individuals who seemed to know each other, individuals who connect different communities, and the 3 main characters in a visually cohesive manner to better communicate their roles in the progression of the story.

I also positioned individuals who seem to be minor characters outside their respective communities, to prevent unnecessary clusters and overlaps and to make their connection to the story easier to find.
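The idea behind a degree range filter can be sketched in a few lines: count how many connections each node has, then keep only the nodes whose count falls inside a chosen range. This is a simplified illustration, not Gephi's implementation; the function name and the edge list are hypothetical, and it assumes an undirected graph with no duplicate edges.

```python
def degree_range_filter(edge_list, min_degree, max_degree):
    """Return the set of nodes whose degree lies in [min_degree, max_degree]."""
    degree = {}
    for u, v in edge_list:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {node for node, d in degree.items() if min_degree <= d <= max_degree}


# Illustrative edges (not the real data set).
sample_edges = [
    ("Valjean", "Javert"),
    ("Valjean", "Cosette"),
    ("Valjean", "Marius"),
    ("Cosette", "Marius"),
]

# Well-connected characters: degree between 2 and 3.
central = degree_range_filter(sample_edges, 2, 3)
# Peripheral characters: exactly one connection.
minor = degree_range_filter(sample_edges, 1, 1)
```

Raising the lower bound isolates the most connected characters, while a narrow low range picks out minor characters that touch the story at a single point.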

After I was done with the fundamentals of my network visualization, I clicked to check the preview, which was not working. I spent a couple of days uninstalling and reinstalling Gephi and trying every possible way to fix this. I finally found a solution on GitHub suggesting that I click on “preview” in the software menu rather than the button on the interface, which surprisingly fixed the issue.

After fixing the preview page, I changed the preset to “Text outline”, enabled “show labels”, and refreshed the preview to get the final outcome.

I then exported my network visualization and went back to the preview, this time with only degree enabled as the color ranking, in order to generate a graph that solely portrays the degree of connectivity between characters.


I started out not wanting to follow a simple data set, and tried my best to create a more attractive story and research something that had more personal significance to me. However, the complications that arose, from not having a stronger laptop to not being able to export Excel files as .csv without a BOM, did not allow that to happen. I also tried importing the nodes and edges separately, nodes first and then edges, but that did not work either.

I think my final outcome for this network visualization is cohesive, fine-tuned, and visually pleasant. However, for future experiments, I will try to find better tools to help me deal with Excel files and their conversion to .csv, and potentially use a computer that has better processing capabilities than my own laptop.

During this exploration I learned many things and grew as an individual who is involved with technology.