Busiest Airports of North America in 1997


Visualization

INTRODUCTION

Airports connect one place to another. Airports in North America have always been busy, since some of the most important trade and technology cities, such as NYC, San Francisco, Los Angeles, and Toronto, are located in the USA and Canada. In 1997, a study was carried out to understand the top 300 international and domestic airports and their connections. The study included data only for the busiest airports within North America.

From the perspective of 2021, I would like to form a few hypotheses about the data gathered in 1997 and try to prove or disprove them with compelling visualizations. It is an interesting experiment to investigate whether today's assumptions already held more than 20 years ago. The hypotheses are:

1) I would mark several North American airports as “prominent,” and they would rank busier than the “non-prominent” ones.

2) The airports in New York City (JFK), LA, and San Francisco would rank among the Top 5 airports for 1997.

TOOLS & PROCESS

First of all, the US Air97 dataset was chosen from the Gephi Wiki. The data was user-friendly and required little to no cleaning or synthesizing. The dataset was fetched in .net format (the Pajek network file format, unrelated to Microsoft .NET), one of the formats that Gephi can import directly. As you can see below, the software detected no issues, since the format of the dataset is compatible with it. In this case, the “Mixed” graph type was chosen to have some direction but also some freedom to go in whatever direction best suits the requirements of the experiment.
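To give a feel for what a .net file contains, here is a minimal sketch of a Pajek reader in plain Python. It assumes the common layout of a `*Vertices` section followed by `*Arcs` (directed) and/or `*Edges` (undirected) sections, which is also why a file can end up as a “Mixed” graph. The `parse_pajek` function and the three-airport sample are hypothetical illustrations, not the actual US Air97 data and not how Gephi implements its importer.

```python
import shlex


def parse_pajek(text):
    """Parse a minimal Pajek .net file into (labels, edges)."""
    labels = {}  # vertex id -> label
    edges = []   # (source id, target id, weight, is_directed)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comment lines
        if line.startswith("*"):
            # Section header such as "*Vertices 3", "*Arcs", or "*Edges"
            section = line.split()[0].lower()
            continue
        parts = shlex.split(line)  # keeps quoted labels in one piece
        if section == "*vertices":
            labels[int(parts[0])] = parts[1]
        elif section in ("*arcs", "*edges"):
            weight = float(parts[2]) if len(parts) > 2 else 1.0
            is_directed = section == "*arcs"
            edges.append((int(parts[0]), int(parts[1]), weight, is_directed))
    return labels, edges


# Made-up sample in the assumed layout (placeholder airports, not real data)
SAMPLE = '''*Vertices 3
1 "Airport A" 0.1 0.2 0.5
2 "Airport B" 0.3 0.4 0.5
3 "Airport C" 0.5 0.6 0.5
*Edges
1 2 0.05
2 3 0.10
'''

labels, edges = parse_pajek(SAMPLE)
print(len(labels), "nodes,", len(edges), "edges")
```

The sketch ignores less common sections such as `*Arcslist`, so it is only meant for inspecting a file before loading it into Gephi.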

For this experiment, Gephi was chosen for its specialty in turning data into nodes (destinations) and edges (journeys). The nodes represent the airports, and the edges show their connectivity to other airports. Once the file is open, three tabs at the top are used to navigate:

1) Overview: Designing the layout and presentation.

2) Data Laboratory: Node and edge data in tabular form.

3) Preview: Rendered Visualization.

The “Data Laboratory” is the right place for elementary data cleaning and synthesis. As shown in the picture below, the data for nodes and edges are shown in separate sheets. The software shows this data by default, yet there is room to add, delete, or edit any data at one's discretion.

To compare the international airports, I got rid of the domestic, local, and private airports. For better clarity in the visualization, I also got rid of the obscure international airports. In the end, there were 118 airports left to compare. In addition, famous airports were marked “High” for their prominence in North America and across the globe, as shown in the picture below.

Regional and other obscure airports were removed from the dataset, and a “High” mark was given to the famous airports in North America. The value for every other airport is “Null” by default.

After cleaning up the data, I was left with 118 nodes and 913 edges. This achieved one of the purposes of the experiment: not overwhelming viewers with tons of trivial data.
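The cleanup itself was done by hand in Gephi, but the effect of removing nodes can be sketched in a few lines: once an airport is dropped, every edge touching it disappears too, which is why the edge count shrinks along with the node count. The `induced_subgraph` helper, the `prominence` attribute name, and the four placeholder airports below are assumptions made for illustration only.

```python
def induced_subgraph(nodes, edges, keep):
    """Keep only the chosen nodes and the edges whose two ends both survive."""
    kept_nodes = {n: attrs for n, attrs in nodes.items() if n in keep}
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


# Toy graph with placeholder airports; None stands in for Gephi's "Null"
nodes = {
    "A": {"prominence": "High"},
    "B": {"prominence": None},
    "C": {"prominence": None},
    "D": {"prominence": "High"},
}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

# Suppose airport C is one of the obscure airports being removed
kept_nodes, kept_edges = induced_subgraph(nodes, edges, keep={"A", "B", "D"})
print(len(kept_nodes), "nodes and", len(kept_edges), "edges remain")
```

In this toy case removing one airport also removes three of the four edges, leaving D with no connections at all.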

RESULTS

Let’s recap the hypotheses so we can compare them with different facets of the results:

1) The “prominent” airports in North America would rank busier than the “not-so-prominent” ones.

As you can see in the preliminary visualization, blue depicts a highly prominent airport and red depicts a not-so-prominent airport.

But how do we know which airports are busier than others? The size of the node tells the tale. In the picture below, one of the assumptions is busted: that all the non-prominent airports (red) would rank less busy than the prominent ones (blue). There are “Red” airports whose nodes are indeed larger than those of many “Blue” airports.

Not all red nodes (non-prominent airports) rank less busy than their blue counterparts.
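The same check can be sketched outside Gephi. The snippet below assumes that node size tracks degree, meaning the number of connections an airport has; that is an assumption for this sketch, since the sizing metric could also be a weighted measure. It then looks for non-prominent airports that beat at least one prominent airport. The graph and labels are made up.

```python
from collections import Counter


def degree(edges):
    """Count how many edges touch each node (assumed proxy for 'busy')."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Placeholder airports: "High" is blue (prominent), None is red (not prominent)
prominence = {"A": "High", "B": None, "C": "High", "D": None}
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("B", "C")]

deg = degree(edges)
high = [n for n, p in prominence.items() if p == "High"]
low = [n for n, p in prominence.items() if p != "High"]

# A single red node larger than the smallest blue node breaks hypothesis 1
weakest_high = min(deg[n] for n in high)
counterexamples = sorted(n for n in low if deg[n] > weakest_high)
print("Non-prominent airports busier than some prominent one:", counterexamples)
```

One counterexample is enough to disprove the strict form of the hypothesis, even if prominent airports are busier on average.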

To look further into the data, the airports are grouped into 4 clusters:

  1. Big Nodes: Busiest airports
  2. Medium Nodes: Moderately busy airports
  3. Small Nodes: Not-so-busy airports
  4. Very small Nodes: Least busy airports
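The article does not spell out how the four groups were cut, so the following is only one possible reading: rank the airports by a busyness score and split the ranking into four equally sized tiers. The `size_tiers` function, the tier names, and the scores are hypothetical.

```python
def size_tiers(scores, names=("big", "medium", "small", "very small")):
    """Rank nodes by score (highest first) and split them into equal tiers."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    tiers = {}
    for i, node in enumerate(ranked):
        # Position in the ranking decides the tier: first quarter, second, ...
        tiers[node] = names[i * len(names) // len(ranked)]
    return tiers


# Made-up busyness scores for eight placeholder airports
scores = {"A": 40, "B": 30, "C": 25, "D": 12, "E": 9, "F": 5, "G": 2, "H": 1}
tiers = size_tiers(scores)
for node in sorted(tiers):
    print(node, scores[node], tiers[node])
```

A rank-based split always produces groups of similar size, but it can place two airports with nearly identical scores in different tiers; cutting on score thresholds instead would avoid that at the cost of uneven group sizes.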

This visualization shows that in the top three clusters, most of the airports are the prominent ones. In the last cluster, the group of the least busy airports, most of the airports are the ones hypothesized to be “not-so-prominent.” You’d still see some tiny blue nodes: prominent airports that were among the least busy in North America in 1997.

The second hypothesis was that the airports of NYC (JFK), San Francisco, and Los Angeles, given their prominence today, would be among the Top 5 busiest airports in 1997. Below is the chart for the top 5 busiest airports in North America in 1997.

As you can see, the NYC, SF, and LA airports are absent; instead, the chart includes a red node, Atlanta, which I had marked as a not-so-prominent airport for 1997 despite its significance today. To dig further, the visualization below shows the top 20 busiest airports in North America in 1997.

Top 20 busiest airports in 1997 in North America.

The airports of SF and LA secured their places in the top 20, but JFK is nowhere to be found. For the record, the Queens airport made the list, but JFK did not.
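The top 5 and top 20 checks boil down to sorting airports by a busyness score and testing membership. Here is a minimal sketch of that logic; `top_k`, `missing_from_top`, and the scores are hypothetical, and the placeholder names stand in for real airports so that no actual 1997 figures are implied.

```python
def top_k(scores, k):
    """Return the k highest-scoring nodes, ties broken alphabetically."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


def missing_from_top(scores, expected, k):
    """List the expected nodes that did not make the top k."""
    return sorted(set(expected) - set(top_k(scores, k)))


# Made-up busyness scores for placeholder airports
scores = {"A": 90, "B": 75, "C": 60, "D": 55, "E": 40, "F": 35, "G": 10}
expected = ["C", "F", "G"]  # airports assumed to be among the busiest

print("Top 3:", top_k(scores, 3))
print("Missing from top 3:", missing_from_top(scores, expected, 3))
print("Missing from top 6:", missing_from_top(scores, expected, 6))
```

Widening the cutoff, as the move from the top 5 to the top 20 does above, shows whether an expected airport just missed the list or is far from it.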

REFLECTION AND FUTURE WORK

The data disproved both hypotheses. Although in the minority, some “not-so-prominent” airports were indeed busier than many “prominent” airports. JFK not only failed to make the list of the Top 20 busiest airports, it was even less active than “not-so-prominent” airports like Atlanta, Queens, and Cincinnati.

It’s surprising to discover that the airports of SF, LA, and NYC (JFK) were not among the top 5 busiest airports in North America for flights taking place within the continent. This dataset does not cover flights that took place globally, so it would be interesting to find out whether JFK was one of the busiest North American airports in 1997 for flights to destinations outside North America.
