Marvel Cinematic Universe Phase 1: Most Influential Characters



The Marvel Universe is a fictional universe with more than 10,000 characters, all connected in some way. This visualization focuses on Phase 1 of the Marvel Universe (2008–2012). The goal is to find the most connected characters in this time period.

Inspiration

My inspiration for this graph is the article ‘Shakespearean tragedies visualized through character interactions’. Its visualizations have a clear goal: to identify whether characters are closely connected and whether there is a pattern in the structure of each network.

Fig. 1 Inspiration: Shakespearean tragedies visualized through character interactions

Materials Used

Tools

Gephi: an open-source tool used to create and analyze network data visualizations

Microsoft Excel: used to format the data

Datasets

The dataset used was taken from http://www.casos.cs.cmu.edu/tools/datasets/internal/index.php#marvel

Methodology

To understand the data, I opened it in a spreadsheet. The data was neatly organized and described a directed network (Fig 2).

I removed multiple columns with unnecessary details. The data included the following connections:

  • Agent to agent
  • Agent to location
  • Agent to company
  • Company to company
Fig 2. Initial Data

I only wanted agent-to-agent connections, so I removed the rows with other types of connections. Since only agent-to-agent connections remained, I also removed the columns that identified the type of connection. After deleting the columns with data about locations and companies, I ended up with a very simple dataset (Fig 3).

Fig 3. After formatting data
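The filtering step can also be scripted instead of done by hand. The sketch below is only an illustration: the column names (`source`, `source_type`, `target`, `target_type`) and the sample rows are assumptions made up for this example, not the actual layout or contents of the CASOS file.

```python
import csv
import io

# Made-up rows with hypothetical column names; the real export may differ.
SAMPLE = """source,source_type,target,target_type
Tony Stark,agent,Pepper Potts,agent
Tony Stark,agent,Stark Industries,company
Steve Rogers,agent,New York,location
Stark Industries,company,Hammer Industries,company
Steve Rogers,agent,Tony Stark,agent
"""


def agent_to_agent_edges(csv_text):
    """Return (source, target) pairs where both endpoints are agents."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["source"], row["target"])
        for row in reader
        if row["source_type"] == "agent" and row["target_type"] == "agent"
    ]


edges = agent_to_agent_edges(SAMPLE)
for source, target in edges:
    # Two columns, source and target, are enough for an edge list.
    print(f"{source},{target}")
```

Dropping the type columns after filtering leaves a two-column edge list, which is the kind of simple table described above.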

I imported my data into Gephi. The first view of the network seemed chaotic, and I was not able to interpret anything. I realized I needed to add proper labels and change the layout.

Fig. 4 Initial Graph

After adjusting the ranking, showing labels, adding color, and changing the layout, the network looked much better. It was now easy to interpret the information and find the most connected characters.

Fig 5
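Ranking in Gephi sizes or colors nodes by a chosen measure. I assume here that the measure is degree, the number of connections a node has. The idea can be sketched outside the tool as follows; the single-letter node names and the edges are placeholders, not values from the dataset.

```python
from collections import Counter

# Placeholder directed edges (source, target); not taken from the dataset.
edges = [
    ("A", "B"),
    ("A", "C"),
    ("B", "A"),
    ("C", "A"),
    ("D", "A"),
]


def degree_ranking(edge_list):
    """Rank nodes by total degree (in-degree plus out-degree)."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1  # outgoing connection
        degree[target] += 1  # incoming connection
    # Highest degree first; ties are broken alphabetically.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


ranking = degree_ranking(edges)
for name, total in ranking:
    print(name, total)
```

The node at the top of this list is the one that would appear largest when node size is mapped to degree.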

To visually improve the graph, I wanted to make the character names more readable. I tried removing the edges, but that made the connections hard to understand. I also tried the ‘Tag Cloud’ preset, but it added ambiguity to the network. In the end, I removed the stroke around the nodes and changed the colors to lighter shades.

Fig 6. The final network graph

Results & Interpretations

Some important stats from the visualization are:

  • Average Degree: 5.891
  • Diameter: 5
  • Density: 0.109
  • Modularity: 0.536
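As a rough sanity check, the average degree and density can be related to each other. The sketch assumes the usual definitions for a directed graph with n nodes and m edges: average degree = m / n and density = m / (n(n - 1)). Dividing one by the other gives n - 1. The node and edge counts computed below are inferred from the two reported values under that assumption; they are not figures read from Gephi.

```python
def average_degree(num_nodes, num_edges):
    """Average out-degree (equal to average in-degree) of a directed graph."""
    return num_edges / num_nodes


def density(num_nodes, num_edges):
    """Share of possible directed edges (no self-loops) that are present."""
    return num_edges / (num_nodes * (num_nodes - 1))


# Values reported above.
reported_avg_degree = 5.891
reported_density = 0.109

# Inferred, not reported: average degree / density = n - 1.
estimated_nodes = round(reported_avg_degree / reported_density) + 1
estimated_edges = round(reported_avg_degree * estimated_nodes)

print(estimated_nodes, estimated_edges)
print(round(average_degree(estimated_nodes, estimated_edges), 3))
print(round(density(estimated_nodes, estimated_edges), 3))
```

Under these assumptions, the two statistics are consistent with a network of roughly 55 characters and about 324 directed connections.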

The results were close to what I expected. Tony Stark and Steve Rogers are the most influential characters in the universe across all phases. I did not expect Loki to be one of the most connected characters in Phase 1. Overall, it was a nice, simple network for showing the connections between characters in the universe.

Reflection

I spent half of the time learning the tool, so I was not able to work more extensively with the data. By the time I had a good command of the tool, the lab time was over and I didn’t get a chance to work on the graph further. The tool can be frustrating, but I believe we can do amazing things with networks. I am happy that I was able to create a nice, meaningful graph. I achieved my goal: the most connected characters are easy to identify.

Future Direction

I enjoyed working with Marvel’s data. In the future, I would spend more time on this graph to make it more readable and visually appealing. Since I also have data about the locations and companies of the agents, I want to explore the connections between locations and agents. It would be interesting to see how many agents have been to the same location.