This analysis aims to give a visual overview of trade relationships around the globe. Which countries have the broadest spread of trade relationships, and which have the narrowest? Which countries are connected through trade, and with whom do they trade the most?
Process + Tools
The dataset used for this study, Major Trading Partners, was obtained from UNdata. It shows the three major trading partners for individual countries around the world. The quantitative data covers three years: 2005, 2010 and 2018. For this study, however, I chose to work only with the data from 2018.
The data was visualized using Gephi, a tool that allows users to create network visualizations. Google Sheets was used to format the dataset so that Gephi could read it. The original dataset from UNdata was already in CSV, a format Gephi can import; however, the data had to be reorganized and its columns retitled before Gephi would recognize it as Nodes and Edges. The dataset was mostly clean and organized, with the exception of a few inconsistencies in country names, which were fixed in Google Sheets. Any rows labeled “NES” (not elsewhere specified) or “Bunker” were removed, as they did not specify a recognizable country or region.
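The same reshaping could also be scripted rather than done by hand. The following is a minimal Python sketch, not the process used for this study. It assumes a simplified input layout with the hypothetical columns country, year, flow, partner and value (the real UNdata export is laid out differently and would need to be mapped first), and the sample rows are made-up placeholders. The output uses the column titles Gephi looks for when importing spreadsheets: Id and Label for nodes, and Source, Target, Weight and Type for edges.

```python
import csv
import io
import re

# Made-up placeholder rows in an ASSUMED layout; not real UNdata values.
SAMPLE = """country,year,flow,partner,value
Country A,2018,exports,Country B,60.0
Country A,2018,imports,Country C,25.0
Country A,2005,exports,Country B,55.0
Country B,2018,exports,Areas NES,10.0
Country C,2018,exports,Bunkers,5.0
"""


def is_excluded(partner):
    """Return True for partners such as "NES" or "Bunker" that name no country."""
    # Match whole words so that "Indonesia" (which contains "nes") is kept.
    words = re.findall(r"[a-z]+", partner.lower())
    return "nes" in words or any(w.startswith("bunker") for w in words)


def build_tables(rows, year="2018"):
    """Turn raw rows into a node table and an edge table for one year."""
    nodes, edges = set(), []
    for row in rows:
        if row["year"] != year or is_excluded(row["partner"]):
            continue
        edges.append({
            "Source": row["country"],      # reporting country
            "Target": row["partner"],      # trading partner
            "Weight": float(row["value"]), # percentage of imports/exports
            "Type": "Directed",
            "Flow": row["flow"],           # extra attribute: exports or imports
        })
        nodes.update((row["country"], row["partner"]))
    node_rows = [{"Id": name, "Label": name} for name in sorted(nodes)]
    return node_rows, edges


def to_csv_text(rows, fieldnames):
    """Serialize a list of dicts as CSV text ready to save and import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


nodes, edges = build_tables(csv.DictReader(io.StringIO(SAMPLE)))
print(to_csv_text(nodes, ["Id", "Label"]))
print(to_csv_text(edges, ["Source", "Target", "Weight", "Type", "Flow"]))
```

Keeping the flow as an extra column, instead of reversing the direction of import rows, is only one possible choice; it leaves the decision about how to draw imports to a later step.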
The following network visualization was initially created from the edited dataset and exported as a PNG file. The network type was set to “Directed” so that exports could be distinguished from imports, and arrows were added to the edges to show the trade path. However, after some trial and error the trade direction was still not clear, and therefore the arrows were removed.
The size of each node shows how many trade relationships that country has in the dataset. The weight of each edge represents how high or low the percentage of imports/exports is. A modularity statistic was run in Gephi and the countries were color-coded according to their modularity class. In some cases it may seem as though the modularity classes group the countries by region or continent; upon closer inspection, however, no such categorization was found. The color-coding was not altered, as it still made the visualization easier to read than having all the countries represented by the same color. The ForceAtlas 2 and Noverlap layouts were used, along with some manual rearrangement, to make the labels somewhat legible.
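To make the node-size measure concrete, here is a small Python sketch that counts how many edges touch each node, which is the degree of the node. It assumes node size was ranked by degree, and the country names and percentages are made-up placeholders rather than values from the dataset.

```python
from collections import Counter


def degree_counts(edges):
    """Count the edges touching each node (incoming plus outgoing)."""
    degree = Counter()
    for source, target, _weight in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


# Placeholder edges: (reporting country, trading partner, percentage).
edges = [
    ("Country A", "Hub X", 40.0),
    ("Country B", "Hub X", 35.0),
    ("Country C", "Hub X", 20.0),
    ("Country C", "Hub Y", 15.0),
]

degree = degree_counts(edges)
for name, count in degree.most_common():
    print(name, count)
```

In this toy example "Hub X" would be drawn largest, since three edges touch it. The percentages do not affect the count at all; they only control how thick each edge is drawn.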
From Visualization 1 it is easy to see that China and the United States have the most trade relationships, answering one of my research questions. However, the large number of connections, along with the inability to interact with the visualization after exporting it, makes it difficult to pinpoint which countries are connected to which.
To test whether the network visualization could show trade direction accurately with this dataset, a second visualization was created. This time the dataset was reduced to exports only, instead of both exports and imports, and the “value” column was retitled “weight”, the column name Gephi reads as edge weight, to see if there would be any major difference.
A major difference in Visualization 2 is that the United States is represented by a larger circle than China, unlike in Visualization 1. This indicates that the US appears as a major export destination for more countries than China does. The weight of each edge represents the percentage of exports from the country of origin. For example, if we look at India and Bhutan (top right of Visualization 2), it’s clear that most of Bhutan’s exports go to India, since that edge is much thicker than the two very faint edges also coming out of Bhutan. This visualization used the same layouts as the previous one, along with manual rearrangement to separate the nodes further and make the country names more legible.
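The two readings above (a larger circle means more incoming export edges, a thicker edge means a larger share of exports) can be sketched in a few lines of Python. This is an illustration under assumptions, not the computation Gephi performs internally: each edge is assumed to run from exporter to destination, and all names and percentages are made-up placeholders.

```python
from collections import Counter


def export_summary(edges):
    """Return incoming-edge counts and each exporter's largest destination."""
    in_degree = Counter()
    top_destination = {}
    for exporter, destination, share in edges:
        in_degree[destination] += 1
        best = top_destination.get(exporter)
        if best is None or share > best[1]:
            top_destination[exporter] = (destination, share)
    return in_degree, top_destination


# Placeholder export edges: (exporter, destination, percentage of exports).
edges = [
    ("Small S", "Neighbor N", 80.0),
    ("Small S", "Hub X", 5.0),
    ("Small S", "Hub Y", 3.0),
    ("Country A", "Hub X", 30.0),
    ("Country B", "Hub X", 25.0),
    ("Country B", "Hub Y", 20.0),
]

in_degree, top_destination = export_summary(edges)
print(in_degree.most_common())
print(top_destination["Small S"])
```

Here "Hub X" receives the most edges and would get the largest circle, while "Small S" sends by far its thickest edge to "Neighbor N", the same pattern described for Bhutan and India.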
Although challenging, this analysis was interesting to me, and the information in this dataset could be used to dig further into the topic. Separate network visualizations could be created for exports and imports to see how trade paths differ between countries. Smaller networks could be created using only the countries with the most and the fewest trade connections as nodes, in order to look at each connection more clearly. I also think it would improve the visualizations to group the nodes by continent, and to color-code and cluster them on that basis instead of by modularity class.
As a tool, Gephi required some trial and error before I arrived at a clear visual outcome. It was challenging to restructure the dataset and match the information to the network visualization, as the dataset contained both import and export paths. Because of the large number of nodes and edges, it was difficult to make all the labels legible in the visualizations. One limitation I found in Gephi was the inability to customize labels other than to make them all the same size or “proportional.” The latter setting was not an option for this particular dataset, as some of the text became larger than the entire visualization and nothing could be viewed properly. Ideally, I would have liked to label the nodes selectively or make the visualization interactive so that labels appear on hover. Another major limitation was the inability to “undo” any action. The familiar “Ctrl+Z” did not seem to work, nor was I able to locate an undo button. Overall, I would like to use Gephi again to analyze this topic further with a smaller, more focused dataset, in order to make the result easier to read.