The purpose of Gephi is to match actors and events in a way that establishes or uncovers the strength and value of the relationships between them. Creating a network diagram with Gephi is a step-by-step process: capturing and organizing the data, importing it correctly, and then learning the manipulations within Gephi that yield a polished, interpretable network map that ties directly to your research goals.
Actors and events are called nodes and edges, respectively, in network mapping. The topic I chose to explore involved the social network of a group of karate club members in 1977. Each member of the club became a node, and the directed relationships between them became the edges. This dataset was already compiled at https://github.com/gephi/gephi/wiki/Datasets so I did not use a raw dataset. If I had, I would have configured the edge table (prior to importing into Gephi) to include Source, Target, and Type columns. Source and Target refer to the interacting nodes, and Type states whether the edge is directed or undirected. My dataset was of type directed, and I wanted to explore why and what characteristics of the group might become apparent.
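As a rough illustration of the edge table described above, here is a minimal Python sketch that writes a CSV with Source, Target and Type columns. The member IDs and ties are made up for illustration and are not taken from the karate club dataset; the helper name write_edge_table is also hypothetical.

```python
import csv
import io

# Hypothetical ties between club members; the IDs are invented for illustration.
edges = [
    ("1", "2", "Directed"),
    ("1", "3", "Directed"),
    ("2", "3", "Directed"),
]


def write_edge_table(rows, handle):
    """Write rows as a CSV edge table with Source, Target, Type headers."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    writer.writerows(rows)


# Write to an in-memory buffer; swap in open("edges.csv", "w", newline="")
# to produce a file that could be imported as an edge table.
buffer = io.StringIO()
write_edge_table(edges, buffer)
edge_csv = buffer.getvalue()
print(edge_csv)
```

Writing to a buffer keeps the sketch free of side effects; for a real import the same function can be pointed at a file on disk.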
I found these three visualizations by browsing Manuel Lima’s page http://www.visualcomplexity.com/vc/ and by searching for Gephi files. Manuel Lima has a great laid-back style and a knack for explaining visualization in a direct manner.
1. The politics of systems, The Guardian’s FB page
I particularly like this visualization because it is not overwhelming and presents the data completely in a visually intuitive manner. Without even reading the explanation, you get a sense that there are many, many contributors but only a few with concentrated activity. The groupings and labelling are accomplished with professionalism and style. This visualization would be my benchmark for my own presentation of complicated data.
2. Gephi Abstract art, FB data and Photoshop upgrade of the visualization
https://www.behance.net/gallery/19961887/Gelphi-Abstract-Art-Facebook-Network-Data
I chose this set of images as inspiration because the idea of turning visualizations away from tools of interpretation into candidates for digital art intrigues me. This trend already exists, and I believe it will grow in the future. It stems from the need, if you have the expertise, desire and creative skills, to reinterpret information and data personally.
3. Visualizing the database; The aesthetics of databases and visualizations
https://dhs.stanford.edu/spatial-humanities/visualizing-databases/
I chose the visualization entitled ‘Top Contributors to the Catalogue of Life and their associated species, references and databases,’ mainly because I also believe, as the author states, there is an “importance of network visualization for database aesthetics.” The value of mapping information only shines when that information can be interpreted properly. One way to gain interest in mapping and visualization is to present the final options in a manner that is both intriguing and underwhelming (by underwhelming here, I mean the very opposite of overwhelming). I find many network maps are too overloaded for the novice. Why make a map so difficult to interpret, and why not use aesthetics to engage the audience and garner interest?
I also like the idea of presenting databases in visual form. I believe databases, like their parent, the spreadsheet, are simple and basic tools to help tackle and interpret data relationships.
We used Gephi in the lab to conduct our data manipulations. My first set of files, downloaded from http://snap.stanford.edu/data/email-EuAll.html, had thousands of nodes and edges and stalled Gephi on import. After a few false starts with other datasets provided at https://github.com/gephi/gephi/wiki/Datasets, I settled on a very small dataset about a karate club, technically below the threshold of the lab requirements but manageable for me. As a rookie on Gephi, I find huge network maps in blazing color uncomfortable and impenetrable. My small karate club yielded 34 nodes and 78 edges.
Until you are familiar with the key manipulations and commands in Gephi, producing a visual that is easily understood presents a challenge. I find it overwhelming when the nodes and edges are all clumped together, particularly with large datasets. I applied the ForceAtlas2 algorithm to the data, which, when stopped, produced a layout shape not unlike the final shape of my preview. Still, I didn’t become comfortable with the view until I applied the Fruchterman-Reingold algorithm. This view brings the nodes outward and establishes clear space between the nodes for the edges “to breathe,” and that helps me understand what I am seeing.
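To give a feel for why a force-directed layout opens up space between nodes, here is a simplified Python sketch of the Fruchterman-Reingold idea: every pair of nodes repels, connected nodes attract, and a cooling "temperature" limits how far a node moves per step. This is a teaching sketch under my own assumptions (a unit square frame, 50 iterations, linear cooling, a four-node toy graph); it is not Gephi's implementation, and Gephi's layout will differ in its details.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=50, size=1.0, seed=42):
    """Return {node: (x, y)} after a simplified force-directed layout."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # ideal distance between nodes
    temperature = size / 10           # maximum move per iteration
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = max(math.hypot(dx, dy), 1e-9)
                force = k * k / d
                disp[a][0] += dx / d * force
                disp[a][1] += dy / d * force
                disp[b][0] -= dx / d * force
                disp[b][1] -= dy / d * force
        # Attraction: connected nodes pull together with force d^2 / k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = max(math.hypot(dx, dy), 1e-9)
            force = d * d / k
            disp[a][0] -= dx / d * force
            disp[a][1] -= dy / d * force
            disp[b][0] += dx / d * force
            disp[b][1] += dy / d * force
        # Move each node, capped by the temperature, and keep it in the frame.
        for n in nodes:
            dx, dy = disp[n]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[n][0] = min(size, max(0.0, pos[n][0] + dx / d * step))
            pos[n][1] = min(size, max(0.0, pos[n][1] + dy / d * step))
        temperature -= cooling
    return {n: tuple(p) for n, p in pos.items()}


# Toy chain graph a-b-c-d, invented for illustration.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
layout = fruchterman_reingold(nodes, edges)
for name, (x, y) in layout.items():
    print(name, round(x, 3), round(y, 3))
```

The repulsion term is what gives the edges room "to breathe": even unconnected nodes push each other away, so clumps spread out until attraction along the edges balances them.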
I did not use any of the plug-ins that are available within Gephi, as most of my effort was centered on understanding the role of the map. Saving often, I created four versions and spent a good deal of time relabeling the nodes, which was not necessary. Finally, I began running the statistical reports, which measure the strength and gravity of the pull between nodes, as well as the varying distances and the node that attracts the most interaction.
Running reports serves several purposes: it adds the report results to your node table as columns and produces printable graphs of the statistics for further analysis. I’ve summarized the reports in the results section. You can also partition and color-code your data based on the boundaries created by the statistical run. Modularity is a common method of partitioning, and I found my data segregated into three main groups.
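The idea of a report adding columns to the node table can be sketched in a few lines of Python. The example below computes in-degree, out-degree and total degree from a small directed edge list and stores them as columns on each node row. The edges are hypothetical, and the column names are my own choice for illustration rather than a copy of Gephi's output.

```python
from collections import Counter

# Hypothetical directed edges (source, target); not the real karate club data.
edges = [("1", "2"), ("3", "1"), ("2", "1"), ("3", "2")]

# Count how often each node appears as a source and as a target.
out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)

# Build one row per node, with the statistics attached as extra columns.
node_ids = sorted({node for edge in edges for node in edge})
node_table = [
    {
        "Id": node,
        "In-Degree": in_degree[node],
        "Out-Degree": out_degree[node],
        "Degree": in_degree[node] + out_degree[node],
    }
    for node in node_ids
]

for row in node_table:
    print(row)
```

Once the statistics live in the node table as columns, they can drive partitioning, coloring or sizing in the same way any other node attribute would.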
One caveat with Gephi: you can change things as you like, but there is no undo, so saving versions is the best way to build and layer explorations. Export to .pdf for record keeping, but use .svg if you want to further enhance the map or overlay it onto a traditional map.
Fully connected network with single mode (club members). Because it is a small network, we try to analyze the whole network, not really focusing on particular nodes, though there are a few key nodes.
• Graph Density: 0.070
• Network Diameter: 3
• Radius: 0
• Average Path length: 1.2735849056603774
• Number of shortest paths: 106
• Modularity: 0.399 Modularity with resolution: 0.399 Number of Communities: 3
• Average Degree: 2.294 (both in- and out-degree). One person (1.0) has multiple in-degree edges, as do persons (2.0) and (3.0).
• Average Clustering Coefficient: 0.285 (the Average Clustering Coefficient is the mean value of individual coefficients.)
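Two of the figures above can be checked by hand from the node and edge counts. The short Python sketch below assumes the standard directed-graph formulas, density = edges / (nodes × (nodes − 1)) and average degree = edges / nodes; with 34 nodes and 78 edges these reproduce the reported 0.070 and 2.294.

```python
nodes = 34
edges = 78

# In a directed graph, every ordered pair of distinct nodes is a possible edge.
possible_edges = nodes * (nodes - 1)
density = edges / possible_edges

# Each directed edge adds one out-degree and one in-degree,
# so the average in-degree and average out-degree are both edges / nodes.
average_degree = edges / nodes

print(f"Graph density: {density:.3f}")
print(f"Average degree: {average_degree:.3f}")
```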
Future directions for me are to continue exploring Gephi with small social networks and to begin experimenting with increasingly larger datasets. I would like to be able to understand and interpret seemingly complicated network diagrams and critically examine how they were constructed.
I believe Gephi is one of the many clunky but valuable programs for cracking the code inherent within information patterning and I would aspire to be able to create digital artwork as a secondary goal.
The karate club social network diagram and its node table of statistics offer the opportunity for further number crunching, perhaps with a different tool. I also begin to see how network mapping with Gephi might help with other current research projects, such as a survey of pre-K children and visual literacy.