The Drama within the Dots


Seeing the connections among elements in a dataset, and gauging the depth of those connections, has long been a passion of mine. As a writer, I often make sense of facts by creating rudimentary visualizations so I can find connections that are not obvious on the surface. Gephi and other social network analysis tools offer far more potential.

I knew Les Miserables was loaded with connections, but I wondered if a Gephi visualization could show me new ways to discover and express connections and connectedness. Without relying on the details of relationships, I wanted to find out if it would be possible to narrow in on key figures within a network, and then use tools such as edge weighting to identify relationship values.

Shakespeare’s Universe
I’m a great fan of the Bard and have a special affinity for the vast number of characters who fill nearly every play. When I encountered an article that analyzes the plot of Hamlet as a network of connected nodes, I was eager to learn more. “Network Theory, Plot Analysis” (Moretti, 2011) explores the potential and some weaknesses of approaching literature from a quantitative perspective. Moretti’s piece is one of several created in the Stanford Literary Lab, a student and faculty collective that “applies computational criticism to the study of literature.”

Visualization of Hamlet

Figure 1. Stanford LitLab Pamphlet 2: Network Theory, Plot Analysis.

In the visualization (Figure 1), Moretti breaks down Hamlet and shows that being friends with Hamlet or Claudius means certain death. It’s one of 57 visualizations he uses to discuss plot and relationships, mostly in Hamlet. None of the visualizations stand alone, and that allows Moretti to explain the queries he pursued and some of the insights he gained.

Centers of Influence
Network visualizations are most effective when reflecting actual connected groups. This is clear in Harvard University’s Economists of Cambridge visualization. The visualization is web-based and offers excellent interactivity to enable the user to make discoveries. The project looks at 136 economists at Cambridge University. Instead of expecting a single visualization to do all the work, the project is broken into several visualization groups. Visualizations are easy to see and interact with because sets of the data are visible only when moused over. That means that users don’t experience fatigue at the sight of more information than can be taken in. Additionally, the consistent color assignment adds to readability. Red is always “other institutions” and green (two different hues) is always “Cambridge Institutions.” A good example of that readability is evident in Figure 2, showing where faculty fall in the Keynesian influence sphere. One can see that Cambridge economists are largely Keynesians, but not strictly devoted.

Economists visualized at Harvard

Figure 2. Harvard University Visualization: Economists of Cambridge

Visualization Structures Story
In “Connecting the Dots Behind the 2016 Presidential Candidates,” the New York Times profiled candidates based on a social network-type analysis. Here again, the visualization is interactive, which enhances readability. The Times relies on graphic design rather than color alone. The visualization is high-level and, unlike my visualization and the Cambridge example, filters out most of the extraneous relationships so that the connections are easy to see.


NYTimes candidate visualization

Figure 3. The New York Times visualization: “Connecting the Dots Behind the 2016 Presidential Candidates.”

As seen in Figure 3, the Times found a solution to presenting a complex analysis. It followed a consistent style, giving each candidate three or four visualization approaches. Clinton and Bush both have breakouts for advisors and loyalists, along with connections via their PACs and links to previous presidents. Other candidates are viewed in appropriate networks, but also through a major-party lens with a twist: Rubio can be examined through connections to the Bush family. That adds more narrative but, like the Cambridge economists project, it lets data drive the story.

Les Miserables: Rich Tapestry of Many Threads
Les Miserables is an ideal universe for network analysis. Not only does it have an abundance of individuals (nodes), but those individuals also possess myriad properties that can be, as discussed in class, “attribute-ized” by emotion, location, and political values. Sorting it all out is nearly impossible without some sort of visualization.

I pulled the dataset from GitHub > Social Networks. I imported it into Gephi 0.9.1, quickly learning that this version is full of bugs and has few tutorials. I did, however, rely heavily on the Gephi Tutorial Visualization.

I started by increasing the image size and adding labels and color. Initially, I pulled the main characters out into a circle of vertices, leaving the other characters they link to in the center, essentially creating a “ring” and “mesh” structure. That revealed little. I then pulled Javert and Valjean to opposite ends (see Figure 4), and began to discover levels of connectedness and hierarchy.

Gephi Visualization Les Miserables

Figure 4. Javert and Valjean

Adding a filter that weights the connections revealed the importance of letting the data drive the visualization. From Figure 4, one might conclude that the primary connection is between Javert and Valjean. However, weighting shifted the connection strength to Valjean and Cosette (see Figure 5). What I like about this visualization is that while the Valjean and Cosette connection stands out, other strong connections are revealed.
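
For readers who prefer code to the Gephi interface, the weight filter can be sketched in a few lines of Python. This is only a rough illustration of the idea, not what Gephi does internally. The edge list is a hand-picked handful of character pairs, and the weights are placeholders I made up for the example rather than values read from the dataset. The function names are my own as well.

```python
# Illustrative only: a few character pairs with PLACEHOLDER weights.
# These numbers are assumptions for the sketch, not dataset values.
EDGES = [
    ("Valjean", "Cosette", 31),
    ("Valjean", "Javert", 17),
    ("Marius", "Cosette", 21),
    ("Valjean", "Marius", 19),
    ("Valjean", "Fantine", 9),
    ("Javert", "Fantine", 5),
    ("Valjean", "Myriel", 5),
]


def filter_by_weight(edges, min_weight):
    """Keep edges whose weight is at least min_weight, strongest first."""
    kept = [edge for edge in edges if edge[2] >= min_weight]
    return sorted(kept, key=lambda edge: edge[2], reverse=True)


def weighted_degree(edges):
    """Sum the weights of all edges touching each character."""
    totals = {}
    for source, target, weight in edges:
        totals[source] = totals.get(source, 0) + weight
        totals[target] = totals.get(target, 0) + weight
    return totals


# Hide the weak ties, then list what remains from strongest to weakest.
strong = filter_by_weight(EDGES, min_weight=15)
for source, target, weight in strong:
    print(f"{source} - {target}: {weight}")

# Rank characters by the total weight of their connections.
ranking = sorted(weighted_degree(EDGES).items(), key=lambda item: item[1], reverse=True)
print(ranking[0])
```

Raising or lowering min_weight plays the same role as dragging the range slider on an edge-weight filter: the weaker ties drop away and the strongest pairings are left standing.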

Gephi Visualization of Les Miserables

Figure 5. Cosette and Valjean

Future Directions

I’m curious to explore further, especially in the way Lothar Krempel suggests, by creating visual layers (Krempel, 2011, p. 562). In particular, I’d like to define various subsets based on a series of attributes, including political affinity, class, out-law or in-law, lovers, hatred, and wealth. Separating those out would then allow me to layer attributes that seem aligned. For example, I’m curious if characters separated by class would have the same connections if a wealth attribute were added.
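
One way I might prototype that layering before building it in Gephi is sketched below. Everything here is an assumption for illustration: the dataset stores only co-appearance counts, so the class and wealth labels are invented placeholders, and the layer function is my own helper, not a Gephi feature.

```python
# HYPOTHETICAL attribute labels, invented for this sketch.
# The dataset itself carries no class or wealth information.
ATTRIBUTES = {
    "Valjean": {"class": "bourgeois", "wealth": "rich"},
    "Cosette": {"class": "bourgeois", "wealth": "rich"},
    "Marius": {"class": "bourgeois", "wealth": "poor"},
    "Javert": {"class": "official", "wealth": "poor"},
    "Fantine": {"class": "worker", "wealth": "poor"},
}

EDGES = [
    ("Valjean", "Cosette"),
    ("Valjean", "Javert"),
    ("Valjean", "Marius"),
    ("Marius", "Cosette"),
    ("Valjean", "Fantine"),
    ("Javert", "Fantine"),
]


def layer(edges, attributes, keys):
    """Return the edges whose two endpoints agree on every attribute in keys."""
    return {
        (a, b)
        for a, b in edges
        if all(attributes[a][key] == attributes[b][key] for key in keys)
    }


# First layer: connections within the same class.
by_class = layer(EDGES, ATTRIBUTES, ["class"])

# Second layer: add wealth and see which connections survive.
by_class_and_wealth = layer(EDGES, ATTRIBUTES, ["class", "wealth"])

# Connections that hold within a class but break once wealth is considered.
lost = by_class - by_class_and_wealth
print(sorted(lost))
```

Comparing the two sets addresses the question directly: any edge in lost is a connection that holds within a class but disappears once wealth is layered on.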


I’m also interested to see how connections change over time, similar to the Occupy Wall Street example, where we saw Project List values shift over time. As Les Miserables is an epic loaded with characters who enter, exit, and transform over time, I believe it would make for several interesting visualizations.



Krempel, Lothar (2011). “Network Visualization,” in Sage Handbook of Social Network Analysis, eds. John Scott and Peter J. Carrington. London: Sage Publications, 558–77.