A network is a more complicated concept than other data vis genres. In a network, the edges carry meaning as well as the nodes, and the data needs a particular structure, so there were not many datasets for me to choose from. I used the 1994 World Trade data from CASOS Public Datasets. Through the visualization of this data, I wanted to discover the economic status of 80 countries in 1994 and find the most active trading country at that time.
The first graph that inspired me is a gene graph from Seed Magazine. The graph draws the eye to its top-right part, and the colors are very beautiful. Looking at this graph got me really interested in making a circular network. It was created with Circos, a powerful DataViz program, particularly for chord diagrams. However, Circos does not have a graphical interface, which means every command needs to be typed in manually. It also doesn't support popular data file formats, which made it impossible for me to learn Circos in a short time.
I found this example, made by Jill Marie Hacket, among the final projects on this blog. This beautiful graph drew my attention as soon as I saw it. The whole poster looks like a console from a Sci-Fi film. There are only a few big labels on the poster, which makes it look clean. There is also almost no useless ink on the poster, which inspired my visual design a lot. I was surprised that the networks on the poster were done in Gephi.
Material, Tools & Process
I used the World Trade data from CASOS datasets. The original file is in XML format. First, I used OpenRefine to clean the data. The XML file contains a node table (countries) and an edge table (the country-to-country network). Fortunately, the country names were included in the edges, so I only needed to export the edge table. Before exporting, I deleted unnecessary columns and renamed the source, target and weight columns so that Gephi could detect them.
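I did this step in OpenRefine, but the same cleanup can be sketched in a few lines of Python. This is only an illustration: the input column names (`from_country`, `to_country`, `trade_value`) and the sample rows are made up, and the real export may look different. The output headers `Source`, `Target` and `Weight` are the ones Gephi looks for in an edge table.

```python
import csv
import io

# Hypothetical input column names mapped to the headers Gephi detects.
RENAME = {"from_country": "Source", "to_country": "Target", "trade_value": "Weight"}


def to_gephi_edges(rows):
    """Keep only the three edge columns and rename them; other columns are dropped."""
    return [{new: row[old] for old, new in RENAME.items()} for row in rows]


def write_edges(rows, handle):
    """Write the renamed rows as a CSV edge table."""
    writer = csv.DictWriter(handle, fieldnames=["Source", "Target", "Weight"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample rows, for illustration only (not the CASOS data).
sample = [
    {"from_country": "A", "to_country": "B", "trade_value": "10", "note": "x"},
    {"from_country": "B", "to_country": "C", "trade_value": "4", "note": "y"},
]

cleaned = to_gephi_edges(sample)
buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_edges(cleaned, buffer)
print(buffer.getvalue())
```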
Gephi is a data viz tool used particularly for networks. Compared to other data vis tools such as Tableau and Carto, Gephi is not really user-friendly since it is still in its beta stage. I encountered some bugs while using it. However, it is the only option I can use for free for network visualization projects, and the bugs were not fatal to my workflow. In Gephi, I imported the CSV file I generated with OpenRefine and selected the undirected option. In the beginning, I used modularity to color and cluster the nodes and used weighted degree to resize them.
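To make the sizing step less of a black box, here is a minimal sketch of what weighted degree means for an undirected network: for each node, add up the weights of all edges that touch it. Gephi calculates this for me, so the function name and the sample edges below are only illustrative assumptions, not Gephi's own code.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum the weights of all edges touching each node (undirected network)."""
    totals = defaultdict(float)
    for source, target, weight in edges:
        totals[source] += weight
        totals[target] += weight
    return dict(totals)


# Made-up edges (source, target, weight), for illustration only.
edges = [("A", "B", 3.0), ("A", "C", 1.5), ("B", "C", 2.0)]

degrees = weighted_degree(edges)
most_active = max(degrees, key=degrees.get)
print(degrees, most_active)
```

In a trade network, the node with the highest weighted degree is the country with the largest total trade weight across all its partners, which is why I used it for node size.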
For the network layout, I tried every preset layout, such as Force Atlas and Yifan Hu, but none of them looked really good. I downloaded other layouts from the Gephi plugins and finally decided to use the Circle Pack layout, which can group the nodes accurately by modularity class.
I used Ubuntu for the label font and added a font stroke to make the text more readable.
Result & Reflection
This is the final network. According to the network, the most active trading country is Slovenia, which is odd. Also, the labels of some countries, such as “Rep.”, are wrong. Unfortunately, I didn’t realize the data was wrong until I finished the project. I can still see which countries are closely connected to each other. Countries on the same continent tend to show a stronger relationship. For example, Singapore and Malaysia are in the same color. Since this is a network practice, I don’t plan to find new data and redo the visualization. But I should definitely check the accuracy of any data I work with next time.
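A quick check like the one below could have caught the broken labels before I started designing. It is only a rough heuristic I am assuming for illustration (flag labels that end with a period or are very short), and it does not prove the data is wrong; anything it flags still needs to be reviewed by hand.

```python
def suspicious_labels(labels, min_length=3):
    """Return labels that end with a period or are shorter than min_length."""
    flagged = []
    for label in labels:
        text = label.strip()
        if text.endswith(".") or len(text) < min_length:
            flagged.append(label)
    return flagged


# Labels mentioned in this post; a real check would read them from the edge table.
labels = ["Singapore", "Malaysia", "Slovenia", "Rep."]

flagged = suspicious_labels(labels)
print(flagged)
```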
I think the most difficult part of this project was understanding what the network means in a particular scenario, as well as what the clusters calculated by Gephi mean. I started to understand them by actually using Gephi. For the next step, I want to try making my own network data for the final project and use Illustrator to make the network look better.