NETWORKS OF 1000 INDIVIDUALS SENDING LETTERS ALL OVER EUROPE
Writing letters is a traditional way of communicating. As a lover of the art of letter writing, I have a particular soft spot for sending and receiving letters, so for my network analysis and visualization project I set out to explore the networks through which people exchanged letters.
Gephi is free, open-source software for network analysis and visualization. It accepts data formatted as “nodes” (points in a network) and “edges” (lines connecting those points) and lays them out as a two-dimensional graph that visualizes the relationships among data points. To prepare, I reviewed Gephi’s quick start guide and an online tutorial on visualizing networks. The datasets describe a geographical network of 1000 individuals sending letters all over Europe. Before analyzing the data, I downloaded a few Gephi plugins: GeoLayout, NoverlapLayout, and Multimode Networks Transformation.
I imported the nodes CSV file into Gephi with “create missing nodes” checked. Then I imported the edges CSV file, appending it to the workspace that already held the nodes and unchecking “create missing nodes.” After importing the datasets, I set the node size range to a minimum of 10 and a maximum of 100.
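The import step can be pictured with a small Python sketch. It is only an illustration, not what Gephi does internally: the column names `Id`, `Label`, `Source`, and `Target` and the miniature data are assumptions, and the `load_network` helper is hypothetical. Dropping edges that point at unknown people is meant to mimic, roughly, importing edges without creating missing nodes.

```python
import csv
import io

# Hypothetical miniature files; the real project used two CSV files for ~1000 people.
NODES_CSV = """Id,Label
1,name1
2,name2
3,name3
"""
EDGES_CSV = """Source,Target
1,2
2,3
3,1
1,4
"""

def load_network(nodes_text, edges_text):
    """Return (nodes, edges, skipped); keep only edges whose endpoints are known nodes."""
    nodes = {row["Id"]: row["Label"] for row in csv.DictReader(io.StringIO(nodes_text))}
    edges, skipped = [], []
    for row in csv.DictReader(io.StringIO(edges_text)):
        pair = (row["Source"], row["Target"])
        if pair[0] in nodes and pair[1] in nodes:
            edges.append(pair)
        else:
            # Endpoint missing from the nodes table: do not create it, just set it aside.
            skipped.append(pair)
    return nodes, edges, skipped

nodes, edges, skipped = load_network(NODES_CSV, EDGES_CSV)
print(len(nodes), "nodes,", len(edges), "edges kept,", len(skipped), "skipped")
```

In this toy data the edge from 1 to 4 is set aside because person 4 never appears in the nodes table.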
I ran three spatializations: Fruchterman Reingold, Force Atlas 2, and GeoLayout. I began with Fruchterman Reingold, which spreads the graph out. Its layout is more geometric and fills an even area, so it works well in a small space and makes each individual easier to see.
Then I applied Force Atlas 2, a layout algorithm driven by the connections between nodes, to pull groups apart and give the structure more room. I checked “prevent overlap” and changed “Scaling” to 50.
In the Degree Report, the average degree is 14.166, meaning that each person sent and received about 14 letters on average. The In-Degree Distribution shows how many letters each person received from others, while the Out-Degree Distribution shows how many letters each person sent to others.
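To make these three numbers concrete, here is a minimal sketch that computes them for a toy directed edge list. The data and the `degree_stats` helper are made up for illustration, and the average is taken as edges per person, which is one common convention for directed graphs and may not match Gephi's report in every detail.

```python
from collections import Counter

# Toy directed edge list of (sender, recipient) pairs; not the project's data.
letters = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "a"), ("c", "a"), ("d", "b")]

def degree_stats(edges):
    """Return (in_degree, out_degree, average_degree) for a directed edge list."""
    out_degree = Counter(sender for sender, _ in edges)
    in_degree = Counter(recipient for _, recipient in edges)
    # Only people who appear in at least one edge are counted here.
    people = set(out_degree) | set(in_degree)
    average = len(edges) / len(people)
    return in_degree, out_degree, average

in_degree, out_degree, average = degree_stats(letters)
print("letters received by a:", in_degree["a"])
print("letters sent by a:", out_degree["a"])
print("average degree:", average)
```

Person `a` sends three letters but receives only two, the same kind of imbalance the two distributions reveal in the full network.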
Then I added labels and colored the nodes by size, with the biggest nodes white and the smallest dark blue.
Each label, a name plus a number, stands for one person. Because the many labels made the diagram hard to read, I dropped the name part and kept only the numbers, using the Excel formula =RIGHT(C2,(LEN(C2)-4)) to extract the numbers into a new column.
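The same clean-up can be sketched in Python. The formula keeps everything after the first four characters of the cell, so the slice below does the same; the label format `name540` and the `strip_prefix` helper are assumptions made for the example, not taken from the dataset.

```python
def strip_prefix(label, prefix_length=4):
    """Mirror =RIGHT(C2,(LEN(C2)-4)): keep everything after the first four characters."""
    return label[prefix_length:]

labels = ["name540", "name7", "name1000"]  # hypothetical label format
numbers = [strip_prefix(label) for label in labels]
print(numbers)
```

Like the spreadsheet formula, this assumes every label has the same four-character prefix; labels with a longer or shorter prefix would need a different rule.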
Finally, I adjusted the size of the nodes based on the statistics: in-degree ranges from 1 to 110 and out-degree from 1 to 139.
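One way to picture this sizing is as a linear mapping from the degree range onto the node size range of 10 to 100 used in this project. The sketch below assumes a straight linear interpolation; Gephi's own ranking may apply a different curve, and the `scale_size` helper is hypothetical.

```python
def scale_size(degree, min_degree, max_degree, min_size=10.0, max_size=100.0):
    """Map a degree onto [min_size, max_size] by linear interpolation."""
    if max_degree == min_degree:
        # Every node has the same degree, so give them all the minimum size.
        return min_size
    fraction = (degree - min_degree) / (max_degree - min_degree)
    return min_size + fraction * (max_size - min_size)

# In-degree runs from 1 to 110 and out-degree from 1 to 139 in this project.
print(scale_size(1, 1, 110))    # smallest in-degree
print(scale_size(110, 1, 110))  # largest in-degree
print(scale_size(70, 1, 139))   # mid-range out-degree
```

Under this assumption a person in the middle of the out-degree range gets a node about halfway between the smallest and largest sizes.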
Result A – Force Atlas 2 layout
Comparing the in-degree and out-degree views shows how many letters each person sent and received. Those who write a lot are not necessarily those who receive a lot. For example, node 540 is barely noticeable in the in-degree view but stands out clearly in the out-degree view. In other words, person 540 sent far more letters than they received.
Then I ran the Modularity statistic with a resolution setting of 1.25. The Modularity Report shows four modularity classes, which is also the number of communities in the graph. The nodes in a single cluster are closely related to each other. In the Partition menu of the Appearance panel, I selected Modularity Class under Nodes and modified the colors assigned to the different communities. Finally, I ran GeoLayout to display the communities and changed the labels to city names. The clusters are geographically connected, representing Southern Europe, Western Europe, Northern Europe, and Central Europe. People in each community wrote to each other more frequently than to people outside it.
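The idea behind a modularity score can be illustrated with a short sketch. It only scores a partition that is already given, using the standard undirected modularity formula without a resolution parameter; Gephi searches for the partition itself and applies the resolution setting, so this is a simplification. The toy graph and the `modularity` helper are assumptions for the example.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Score a partition: sum over communities of (internal edge share - expected share)."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same community
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Toy graph: two triangles joined by a single edge, treated as undirected.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(edges, two_groups), 4))
print(round(modularity(edges, one_group), 4))
```

Splitting the toy graph into its two triangles scores higher than lumping everyone together, which is the sense in which people inside one community are more closely tied to each other than to the rest.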
Gephi is a powerful tool for visualizing networks. I also tried other datasets, but when a dataset was too large, Gephi took a long time to process it and ended up shutting down. It is therefore important to refine the dataset before importing it. One possible improvement is to place a map underneath the GeoLayout network.