Introduction
The United States may be one of the world’s biggest economic engines. This gorgeous map of US flight paths, designed by colleagues Scott Hessels and Gabriel Dunne at UCLA, traces some 60,000 routes on major and regional airlines around the country. I was fascinated by the relatively faint purple lines marking each individual route and the bright white nodes where multiple routes intersect or overlap. Altogether, they form a piece of art, and that inspired me to create something of my own.
Data collection
Since I was curious to see the connections between airlines and airports, I looked for a suitable dataset and came across the Airlines dataset, listed under Infrastructure Networks on the Gephi Wiki, and downloaded it to my computer. When I opened the file and checked the node table, I found that the label, longitude, and latitude of each airport were all stored in one column, which Gephi could not process. Therefore, I exported the table and used OpenRefine to organize it, separating the data into three independent columns: airport, longitude, and latitude. After this cleanup, I had a CSV file ready to import into Gephi for the next step.
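The column split above was done in OpenRefine, but the same idea can be sketched in a few lines of Python. This is only an illustration: the separator (`;`), the column name `Label`, the helper names, and the sample rows are all assumptions, since the exact layout of the exported table is not shown here, and the coordinates are approximate.

```python
import csv
import io


def split_airport_field(raw, sep=";"):
    """Split a combined 'label;longitude;latitude' string into typed parts."""
    label, lon, lat = (part.strip() for part in raw.split(sep))
    return {"airport": label, "longitude": float(lon), "latitude": float(lat)}


def split_node_table(rows, combined_key="Label", sep=";"):
    """Replace the combined column with airport, longitude and latitude."""
    cleaned = []
    for row in rows:
        parts = split_airport_field(row[combined_key], sep)
        rest = {k: v for k, v in row.items() if k != combined_key}
        cleaned.append({**rest, **parts})
    return cleaned


# Hypothetical export: the real file's separator and column names may differ.
SAMPLE = "Id,Label\n0,MSP;-93.22;44.88\n1,SEA;-122.31;47.45\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
nodes = split_node_table(rows)
for node in nodes:
    print(node)
```

Writing `nodes` back out with `csv.DictWriter` would give a table with one value per column, which is the shape the node import expects.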
Visualize in Gephi
1. Statistics: degree, diameter, density
I imported the CSV file into Gephi, and the software created a basic visualization for me in the Overview tab. I then ran the statistics, which reported an average degree of 11.038, a network diameter of 4, a graph density of 0.047, and a modularity of 0.249.
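To make these statistics less of a black box, here is a minimal sketch of how average degree, density, and diameter can be computed for a small undirected graph. It is not Gephi's implementation; the toy edge list and function names are made up for illustration, and the code assumes the graph is undirected and connected.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def density(adj):
    """Share of possible undirected edges that are actually present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def eccentricity(adj, source):
    """Longest shortest-path distance from source, found with BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all nodes (assumes a connected graph)."""
    return max(eccentricity(adj, node) for node in adj)


# Toy graph with placeholder node names, not real routes.
TOY_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
adj = build_adjacency(TOY_EDGES)
print(average_degree(adj), density(adj), diameter(adj))
```

For an undirected graph the density equals the average degree divided by (n - 1), so if the network was treated as undirected, the reported values (11.038 and 0.047) would suggest a graph of roughly 235 airports.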
2. Layout and styling: Geo
I then played with the layout and chose Force Atlas, one of the force-directed algorithms, to generate the first graph (Figure 1).
As Figure 1 shows, a plethora of connections converge on MSP (Minneapolis-Saint Paul International Airport) in the upper part of the graph. However, the connections are still tangled, and a lot of detailed information is missing. After consulting with Prof. Sula, I downloaded the Geo Layout plugin to arrange the layout by geolocation, which only required assigning the longitude and latitude columns. The Preview tab then gave me a final display of the visualization, where I set the font size, node size, and edge thickness to make the network more visually appealing (Figure 2).
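The idea behind a geographic layout can be sketched as a simple mapping from longitude and latitude to canvas coordinates. The example below uses a plain equirectangular mapping as a stand-in; it is not the plugin's actual code, and the function name, canvas size, and sample coordinates (approximate) are assumptions for illustration.

```python
def equirectangular_layout(nodes, width=1000.0, height=600.0):
    """Map longitude/latitude linearly onto a width x height canvas.

    Longitude grows to the right; latitude is flipped so that north
    ends up at the top, as on a screen where y grows downward.
    """
    lons = [n["longitude"] for n in nodes]
    lats = [n["latitude"] for n in nodes]
    lon_min, lon_max = min(lons), max(lons)
    lat_min, lat_max = min(lats), max(lats)
    # Guard against a zero span when all nodes share a coordinate.
    lon_span = (lon_max - lon_min) or 1.0
    lat_span = (lat_max - lat_min) or 1.0

    positions = {}
    for n in nodes:
        x = (n["longitude"] - lon_min) / lon_span * width
        y = (lat_max - n["latitude"]) / lat_span * height
        positions[n["airport"]] = (x, y)
    return positions


# Approximate coordinates, for illustration only.
SAMPLE_NODES = [
    {"airport": "MSP", "longitude": -93.22, "latitude": 44.88},
    {"airport": "SEA", "longitude": -122.31, "latitude": 47.45},
    {"airport": "PDX", "longitude": -122.60, "latitude": 45.59},
]

positions = equirectangular_layout(SAMPLE_NODES)
for airport, (x, y) in positions.items():
    print(airport, round(x, 1), round(y, 1))
```

Because every node's position is fixed by its coordinates rather than by its connections, the tangle of a force-directed layout disappears and the familiar outline of the country emerges from the routes themselves.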
Reflection
One of the things that leaps out at first glance is the plethora of connections at MSP (Minneapolis-Saint Paul International Airport) in the Upper Midwest, which makes me guess that the data probably comes from Delta Air Lines, one of the largest airlines at MSP. By contrast, almost the entire West Coast is dim, save for a few bright spots at airports such as SEA, PDX, and SLC. The relative brightness of a populous city such as Seattle compared with nearby airports highlights the persistent economic disparities in air travel patterns.
I was surprised that the final output created by Gephi turned out to be so clear and informative, given that the raw dataset came with so little description of its origin or time period on the website. Sometimes we have to experiment to see what works and trust our gut.
Software
OpenRefine: An open-source tool for data cleanup.
Gephi: Free, open-source software for network analysis and visualization.
Datasets
Gephi Wiki: Infrastructure Networks — Airlines
References