Lab Reports, Networks, Visualization


Les Misérables is a French historical novel by Victor Hugo, first published in 1862. It is considered one of the greatest novels of the 19th century. Hugo began writing it twenty years before its eventual publication. It is primarily a great humanitarian work that encourages compassion and hope in the face of adversity and injustice. Over the years, the novel has been adapted into plays, dramas, musicals and more. By analysing this data set of Les Misérables, I have created visualizations that help us understand the character relationships, who the main characters are and who dominates the plot.


While exploring the data sets provided, I found that the Les Misérables data would be interesting to work with. It caught my attention, and I looked for other visualizations of the same data. I then came across an article by Mithun Sridharan in which he performs a Social Network Analysis (SNA) of the characters in the novel. His visualization is shown below:

In his article, Mithun Sridharan discusses how to render a network ineffective. To gain more perspective on the matter, I turned to Gene Dan’s blog, where he analyses the graph from a different angle. His article discusses centrality and how it may affect our interpretation of the data set and the graphs. Gene Dan’s graph is depicted below:


1. Finding A Data Set

To work with Gephi and create a visualization, I needed a data set. Among the data sources provided, I found the Les Misérables data on GitHub. Of the data sets I browsed there, this one interested me the most, so I chose to work with it.

The Les Misérables data set was available as a GML file, which I downloaded to work with in Gephi.

2. Working With Gephi

The downloaded GML file is in a format that Gephi supports and can open directly. To understand Gephi, one must understand the purpose of the tool, which is to visualize and analyse large network graphs. Gephi is open-source software that uses a 3D render engine to display graphs in real time and speed up exploration. You can use it to explore, analyse, spatialise, filter, cluster, manipulate and export all types of graphs. It is important to understand the basic terminology and concepts: nodes and edges, source and target, undirected and directed graphs, and so forth.
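To make the terminology concrete, here is a minimal Python sketch of how an edge list with source and target columns can be read as either an undirected or a directed graph. The three edges and the helper `build_adjacency` are hypothetical and are not taken from the actual data set or from Gephi.

```python
from collections import defaultdict

# Hypothetical toy edges as (source, target) pairs; not the real data set.
edges = [("Valjean", "Javert"), ("Valjean", "Cosette"), ("Cosette", "Marius")]


def build_adjacency(edge_list, directed=False):
    """Map each node to the set of nodes it links to."""
    adjacency = defaultdict(set)
    for source, target in edge_list:
        adjacency[source].add(target)
        # Touch the target so it exists as a node even without outgoing edges.
        adjacency[target]
        if not directed:
            # In an undirected graph the link counts in both directions.
            adjacency[target].add(source)
    return dict(adjacency)


undirected = build_adjacency(edges)
directed = build_adjacency(edges, directed=True)

print(sorted(undirected["Cosette"]))  # ['Marius', 'Valjean']
print(sorted(directed["Cosette"]))    # ['Marius']
```

In the undirected reading Cosette is linked to both Valjean and Marius, while in the directed reading only the edge that starts at Cosette is kept.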

When the data set is first imported, Gephi renders an initial visualization.

This graph needs to be refined further in order to understand the relation between nodes and edges and how it applies to the Les Misérables data set.

3. Choosing The Colour Palette

Les Misérables can be thought of as a musical, play or drama, so there is a wide range of imagery associated with the novel. I decided to stick with the original theme and setting and chose a colour palette accordingly. I picked an image that I thought represented the theme and setting of Les Misérables and, with the help of Adobe Color, extracted a few colour combinations that essentially describe the setting, i.e. France.

Results And Discussion

The networked graph consists of 77 nodes and 254 edges. After implementing the colour palette and synthesizing the graph further, the results were as follows:

Force Atlas 2, repulsion strength: 20,000

The visualization is a network graph consisting of circles, known as nodes, and lines connecting these nodes, known as edges. Each node represents a character that appears in the novel. Each edge represents an association between two characters. The size of each node and of its label is directly related to the number of connections the character has. As you can see here, Jean Valjean, the main character, has the greatest number of connections. From the graph above, the major characters in this novel appear to be Valjean, Javert, Fantine, Cosette, Marius, Gavroche and Myriel.
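The link between connections and node size can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edges are made up, and the helper names `degree_counts` and `node_size` and the size range of 10 to 50 are hypothetical rather than Gephi's own settings.

```python
from collections import Counter

# Hypothetical co-appearance edges for illustration only.
edges = [
    ("Valjean", "Javert"),
    ("Valjean", "Cosette"),
    ("Valjean", "Fantine"),
    ("Cosette", "Marius"),
]


def degree_counts(edge_list):
    """Count how many edges touch each node (its degree)."""
    degree = Counter()
    for a, b in edge_list:
        degree[a] += 1
        degree[b] += 1
    return degree


def node_size(degree, min_degree, max_degree, min_size=10.0, max_size=50.0):
    """Linearly rescale a degree into a drawing size."""
    if max_degree == min_degree:
        return min_size
    fraction = (degree - min_degree) / (max_degree - min_degree)
    return min_size + fraction * (max_size - min_size)


degree = degree_counts(edges)
low, high = min(degree.values()), max(degree.values())
sizes = {name: node_size(d, low, high) for name, d in degree.items()}

# Print the characters from most to least connected.
for name, d in degree.most_common():
    print(name, d, sizes[name])
```

With these toy edges Valjean has three connections and receives the largest size, which mirrors how the most connected character stands out in the graph.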

However, a greater number of connections to a node does not conclusively show that the character is the most dominant. Therefore, to understand which character is the most dominant, I ran the Fruchterman Reingold layout, which takes into account the forces between nodes or, in this context, the characters. The results are as follows:

Fruchterman Reingold Layout
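To give a rough idea of the forces such a layout considers, here is a simplified Python sketch of one iteration. It assumes the force formulas from the original description of the algorithm (attraction d²/k along edges, repulsion k²/d between all pairs) and replaces the cooling schedule with a fixed cap `step`. The function names are hypothetical, and Gephi's implementation has its own parameters, so this should not be read as what Gephi does internally.

```python
import math


def attraction(distance, k):
    """Pull between two connected nodes; grows with distance."""
    return distance * distance / k


def repulsion(distance, k):
    """Push between any two nodes; grows as they get closer."""
    return k * k / distance


def layout_step(positions, edges, k, step=0.1):
    """Move each node along its net force, by at most `step`."""
    displacement = {node: [0.0, 0.0] for node in positions}
    nodes = list(positions)

    # Every pair of nodes repels each other.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = positions[a][0] - positions[b][0]
            dy = positions[a][1] - positions[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = repulsion(dist, k)
            displacement[a][0] += dx / dist * force
            displacement[a][1] += dy / dist * force
            displacement[b][0] -= dx / dist * force
            displacement[b][1] -= dy / dist * force

    # Nodes joined by an edge attract each other.
    for a, b in edges:
        dx = positions[a][0] - positions[b][0]
        dy = positions[a][1] - positions[b][1]
        dist = math.hypot(dx, dy) or 1e-9
        force = attraction(dist, k)
        displacement[a][0] -= dx / dist * force
        displacement[a][1] -= dy / dist * force
        displacement[b][0] += dx / dist * force
        displacement[b][1] += dy / dist * force

    # Apply the capped displacement to get the new positions.
    new_positions = {}
    for node, (x, y) in positions.items():
        dx, dy = displacement[node]
        length = math.hypot(dx, dy)
        if length > 0:
            scale = min(length, step) / length
            x, y = x + dx * scale, y + dy * scale
        new_positions[node] = (x, y)
    return new_positions


# Two connected nodes that start far apart are pulled together.
before = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
after = layout_step(before, [("A", "B")], k=1.0)
print(math.dist(after["A"], after["B"]) < math.dist(before["A"], before["B"]))
```

Repeating such steps many times lets connected characters settle near each other while unconnected ones drift apart, which is why the layout can reveal how tightly a character is tied to the rest of the network.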


With the help of Gephi, I was able to distinguish the different parameters by which one can understand data. It helped me visualize, compare and contrast different parameters of a whole data set.

However, finding a data set was the initial hindrance in this task. Among the data sources provided, I found the Les Misérables data on GitHub. Before that, I browsed through a couple of options in order to understand what exactly I was looking at, the relation between nodes and edges, and what the image rendered in Gephi conveys.

After a few tries with various data sets, I was able to understand what the different layouts mean and convey. Which layouts and statistics apply depends on what one is looking for, since different layouts and statistics convey different information and lead to different interpretations.

However, I further had to read about what the different layouts and statistics mean and how they apply.

As for future scope, I would like to understand the software and its algorithms in more detail, so that I know what each algorithm is, what it tells us and when to use it, and can apply it to any given data set in Gephi.