Visualizing Les Misérables Social Network, 1862


Visualization
Network Graph of Les Misérables social network, 1862 novel version

Intro

Coming from an art museum background, I was interested in exploring a social network in a creative context. I initially wanted to visualize a network derived from a museum collection; however, I was unable to find open-access datasets that I could easily restructure as a single-mode network within the scope of this lab. I did come across some really interesting networks visualizing artist-to-artist relationships based on shared exhibition histories, which I reference in the next section. With this goal in mind, and given the datasets I was able to run in Gephi (the visualization tool), I decided to use the Les Misérables character coappearance dataset from D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, Addison-Wesley, Reading, MA (1993), available from the Gephi GitHub datasets page, as a way to explore visualizing a social network. The dataset was derived from Victor Hugo’s 1862 novel. By creating a network visualization, I hope to explore potential social interaction in the narrative by visualizing relationship weight, central characters, and social clusters.

Inspiration

  1. Artist-info.com via Visualizing Art Networks 
A network graph with dots connected by curved lines in purple, light blue, orange and pink. There is a larger pink dot with the label Chris Newman on the bottom left.
Fig 1. Screen capture of artist-info network graph

I came across a network visualization published by artist-info.com that builds an ego network from the exhibition history of one artist, Chris Newman: 40 solo and 32 group exhibitions from 1982 to 2015. The team further researched all the accompanying artists in the group exhibitions to create artist-to-artist relationships, with edges based on exhibiting together. The visualization uses color to differentiate between clusters of artists, and edge weight is determined by how many times artists showed together.

More information on the process behind the graph:
https://visualizingartnetworks.com/artist-artists-networks/

Direct link to the graph:
https://www.artist-info.com/visualizations/ARTIST-artists/#Michael%20Voges

  2. Artist Collector Network via Graph Commons 
Network graph with pink nodes labeled with art collector names and gray links connecting to small black dots with artist names
Fig 2. Screen capture of artist collector network

Another reference I looked at was a multimodal graph by Burak Arikan and Ahmet Kizilay, made with Graph Commons, that visualizes a network of art collectors and the artists represented in their collections. The collectors are distinguished by color, and the edge weight is based on the number of works by a given artist in a collection.

https://graphcommons.com/graphs/64c77f7f-ff34-4316-83f8-3009f0c54309?show=analysis-cluster

Method

Tools:

For this lab, I used Gephi, free, open-source graph and network visualization software, which also influenced my selection of the Les Misérables dataset. According to Scott B. Weingart’s article Demystifying Networks, Gephi is more fine-tuned “to be used with single-mode networks.”

Process:

After downloading the .GML file from Gephi’s wiki page, I opened it in Gephi. I first checked the Nodes (characters: 77) and Edges (character relationships: 254) tables in the Data Laboratory tab to make sure the data had been imported correctly. I then went to the Overview tab to run statistics:
Average Degree (average number of connections per node): 3.299
Network Diameter (longest shortest path between any two nodes): 5
Graph Density (present edges / all possible edge connections): 0.043
Modularity (community clustering, resolution 1.0): 0.565
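The average degree and density figures can be sanity-checked from the node and edge counts alone. The sketch below is plain arithmetic, not Gephi's own code, and the function names are hypothetical. It assumes the standard definitions: for a directed graph, average degree is edges / nodes and density is edges / (nodes × (nodes − 1)); for an undirected graph, each edge counts toward two nodes and there are half as many possible pairs.

```python
def average_degree(nodes, edges, directed=True):
    """Average number of edge endpoints per node.

    Directed: each edge contributes one out-degree, so edges / nodes.
    Undirected: each edge touches two nodes, so 2 * edges / nodes.
    """
    return edges / nodes if directed else 2 * edges / nodes


def density(nodes, edges, directed=True):
    """Share of possible edges that are actually present."""
    possible = nodes * (nodes - 1)  # ordered pairs of distinct nodes
    if not directed:
        possible //= 2  # unordered pairs
    return edges / possible


# Counts reported in the Data Laboratory tab.
N_NODES, N_EDGES = 77, 254

directed_stats = (
    round(average_degree(N_NODES, N_EDGES), 3),
    round(density(N_NODES, N_EDGES), 3),
)
undirected_stats = (
    round(average_degree(N_NODES, N_EDGES, directed=False), 3),
    round(density(N_NODES, N_EDGES, directed=False), 3),
)

print("directed:  ", directed_stats)
print("undirected:", undirected_stats)
```

The directed values match the 3.299 and 0.043 reported above, which suggests the statistics were computed with the edges treated as directed. Treated as undirected, the same counts would give roughly double both values.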

Layout:

I initially tried to model the network in both ForceAtlas and ForceAtlas 2 (fig 3), but after applying a color range to the edges by weight and viewing the result in the Preview pane, it was quite hard to discern from this layout which characters (with the name labels present) were more central to the narrative (higher degree). So after going through the other layouts, I settled on Fruchterman Reingold (fig 4). This force-directed layout spaces the nodes out evenly, which makes it easier to see all the characters. Though the edge lengths in this layout don’t have a specific meaning, I set the edge width to reflect the weight of the relationships (thicker or thinner). Lastly, I set the node color based on the modularity statistic to highlight social clusters amongst the characters.
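To give a sense of why this layout spaces nodes evenly, here is a minimal sketch of the classic Fruchterman-Reingold idea: every pair of nodes repels, connected nodes attract, and a cooling "temperature" caps how far a node can move per step. This is an illustration under simplifying assumptions (unit square, edge weights ignored, linear cooling), not Gephi's implementation, and the toy character list is hypothetical rather than taken from the dataset.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=50, size=1.0, seed=0):
    """Return {node: [x, y]} positions inside a size-by-size square."""
    nodes = list(nodes)
    rng = random.Random(seed)  # seeded so the layout is repeatable
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = math.sqrt(size * size / len(nodes))  # ideal distance between nodes
    temperature = size / 10
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Attraction: connected nodes pull together with force d^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node, capped by the temperature and kept in the frame.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temperature)
            pos[n][0] = min(size, max(0.0, pos[n][0] + dx / length * step))
            pos[n][1] = min(size, max(0.0, pos[n][1] + dy / length * step))

        temperature -= cooling

    return pos


# Hypothetical toy network for illustration only.
toy_nodes = ["Valjean", "Javert", "Cosette", "Marius"]
toy_edges = [("Valjean", "Javert"), ("Valjean", "Cosette"), ("Cosette", "Marius")]
layout = fruchterman_reingold(toy_nodes, toy_edges)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Because repulsion acts between all pairs regardless of edges, nodes end up spread across the frame, which is consistent with the even, grid-like spacing seen in fig 4.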

Fig 3. ForceAtlas 2
Fig 4. Fruchterman Reingold

Results 

Upon importing the dataset into Gephi, I noticed that the edge types were listed as directed. I wasn’t able to find more details explaining why the coappearance of characters in a novel would be directed rather than undirected. Does in-degree indicate who was in the scene first? This highlights the importance of datasheets for datasets and of clearer documentation behind dataset creation! Edge type aside, I was surprised that I settled on the Fruchterman Reingold layout given the matrix-like order of the nodes. I wrongly assumed that a grid-like node distribution would be less effective at visualizing character ‘importance’ (higher degree). I also think this layout better visualizes where Valjean, the character with the highest degree and a central figure in the narrative, sits in relation to the social clusters, the most prominent grouping being in lime green (top right).
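One way to think about the directed question: if coappearance is symmetric, the split between in-degree and out-degree would depend only on which character happens to be listed first in each edge, while total degree would not. The sketch below illustrates this with a hypothetical toy edge list. These are not rows from the actual file, and I have not confirmed how the .GML orders its sources and targets.

```python
from collections import Counter

# Hypothetical (source, target) pairs for illustration only.
edges = [
    ("Valjean", "Javert"),
    ("Cosette", "Valjean"),
    ("Marius", "Valjean"),
    ("Marius", "Cosette"),
]

# Directed reading: which side of the pair a name appears on matters.
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)

# Undirected reading: every appearance counts the same.
degree = Counter()
for source, target in edges:
    degree[source] += 1
    degree[target] += 1

for name in sorted(degree):
    print(f"{name}: in={in_degree[name]} out={out_degree[name]} total={degree[name]}")
```

Swapping the order of names within any pair changes the in and out counts but leaves the total untouched, so for a symmetric relationship the total degree is the safer measure of how connected a character is.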

Reflection

The most challenging part of this lab was finding a ‘usable’ dataset for the scope of this project. From what I read about Gephi, I wanted to use a single-mode dataset, but I also encountered issues when trying to process really large networks on my personal computer. Given more time, I would like to focus on transforming open-access datasets from museums to create a network graph visualization of, say, a social network of contemporary Asian artists in major museum collections in the United States. I won’t know until I try it, however, whether the network graph would be meaningless without the larger context of the demographic breakdown in each museum collection.

References 

https://www.sciencedirect.com/topics/computer-science/reingold-layout

http://journalofdigitalhumanities.org/1-1/demystifying-networks-by-scott-weingart/

Dataset:
https://github.com/gephi/gephi/wiki/Datasets