Raiders of the Lost Arc


Visualisations

Visualisation One

Visualisation Two

Visualisation Three

Introduction

For this lab, I began by looking at visualisation examples since I did not have a specific topic of interest. My attention was drawn to a world map visualisation. I searched on CASOS, SNAP, and the Gephi wiki for datasets with location data. I settled on a dataset from CASOS on the movie Raiders of the Lost Ark (1981), directed by Steven Spielberg. The data encodes the location of the characters for each time interval (where the location of characters is known).

Visualisation Examples

As stated in the introduction, I was drawn to the visualisation using a world map and used that inspiration to find a dataset with location data. Since the data has character and location names, I felt an arc diagram would display the information in a format that is easier for the user to interpret. I would be able to order the character nodes on the left and the location nodes on the right, with arcs connecting them. Aesthetically I like radial diagrams; however, I do not think a radial diagram would display this data in a useful way.

Materials

Software:

  • OpenRefine
  • Gephi

Research resources used in this lab were:

Methods

  1. Use OpenRefine to clean up the data
  2. Import data to Gephi
    1. Import CSV file as edges table
    2. Import Nodes (if applicable)
  3. “Data Laboratory” tab
    1. “Nodes” tab
      1. Shows the nodes created by Gephi
      2. Users can merge nodes here if necessary
    2. “Edges” tab
      1. User can copy data from one column to another (e.g. copying data from the ID column to make a Label column)
      2. User can make new columns
      3. Statistical data results are located here
  4. “Overview” tab
    1. “Statistics” window
      1. Run desired statistics
    2. “Layout” window
      1. Run the layout of choice
      2. Adjust settings
      3. Stop when satisfied with the visualisation
  5. “Appearance” window
    1. Adjust node size or colour, and node label colour or size
    2. Adjust edge colour, and edge label colour or size
  6. “Preview” tab
    1. To preview the visualisation click “refresh”
    2. Edit style of nodes, nodes labels, edges, edge arrows, and edge labels
  7. Export
    1. Make sure the margins are large enough to incorporate labels
    2. Export as PNG
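
As a rough illustration of step 2.1, the sketch below writes character and location pairs to a CSV with "Source" and "Target" header columns, which Gephi's spreadsheet importer reads as an edges table. The function name edges_to_csv and the single example pair are hypothetical and are not part of the CASOS dataset export.

```python
import csv
import io


def edges_to_csv(pairs):
    """Return CSV text with a Source,Target header and one row per pair.

    `pairs` is an iterable of (character, location) tuples. The helper
    name and its input shape are assumptions made for this sketch.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # header columns for an edges table
    for character, location in pairs:
        writer.writerow([character, location])
    return buffer.getvalue()


if __name__ == "__main__":
    # One illustrative pair only; the real data has one row per
    # character, location, and time interval.
    print(edges_to_csv([("Indiana Jones", "Peru")]))
```

The resulting text can be saved to a .csv file and imported as in step 2.1.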

Discussion and Results

The data was in XML format, so I had to use OpenRefine to restructure it into columns and rows. Character names were entered in lowercase with underscores instead of spaces (e.g. indiana_jones), and locations were identified by an ID (e.g. location_1). Using OpenRefine I was able to batch edit both columns, changing indiana_jones to Indiana Jones and location_1 to Peru.
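
I did this clean-up in OpenRefine, but the same two transformations can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the lookup table contains just the one location mapping mentioned above (location_1 to Peru); any other ID is left unchanged rather than guessed.

```python
# Only the mapping stated in the text is included; other IDs are unknown here.
LOCATION_NAMES = {"location_1": "Peru"}


def prettify_character(raw):
    """Turn an identifier such as 'indiana_jones' into 'Indiana Jones'."""
    return " ".join(part.capitalize() for part in raw.split("_"))


def prettify_location(raw, names=LOCATION_NAMES):
    """Replace a location ID with its name, leaving unknown IDs as they are."""
    return names.get(raw, raw)


if __name__ == "__main__":
    print(prettify_character("indiana_jones"))
    print(prettify_location("location_1"))
```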

I created three visualisations for this lab: two force-directed and one radial. All three display the same information (the location of the characters for each time interval where their location is known). For the two force-directed visualisations I first ran the Force Atlas 2 layout, but I wanted to adjust the repulsion strength manually, so I switched to Force Atlas. I ran the Force Atlas layout with a repulsion strength of 900 and with the auto stabilize function, attraction distribution, and adjust by sizes options turned on. After running the layout I ran the statistical calculations: average degree, diameter, density, and modularity.
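
To make two of these statistics concrete, here is a minimal sketch of average degree and density computed from an edge list. It assumes the graph is treated as undirected and that repeated character and location pairs count once; Gephi's own results may differ if the graph was imported as directed or with weighted edges. The function names and the toy edges are made up for the example.

```python
from collections import defaultdict


def degree_table(edges):
    """Map each node to its number of distinct neighbours (undirected)."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


def average_degree(edges):
    """Sum of degrees divided by node count, i.e. 2E / N."""
    degrees = degree_table(edges)
    if not degrees:
        return 0.0
    return sum(degrees.values()) / len(degrees)


def density(edges):
    """Share of possible undirected edges that are present: 2E / (N(N-1))."""
    degrees = degree_table(edges)
    n = len(degrees)
    if n < 2:
        return 0.0
    edge_count = sum(degrees.values()) / 2
    return 2 * edge_count / (n * (n - 1))


if __name__ == "__main__":
    # Toy edges for illustration only, not taken from the dataset.
    toy = [("char_a", "loc_x"), ("char_a", "loc_y"), ("char_b", "loc_y")]
    print(average_degree(toy), density(toy))
```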

Visualisation One is a force-directed visualisation: nodes are sized by degree (minimum size: 1, maximum size: 26), coloured by modularity class, and labelled. The edges are curved with a thickness of 1.0. The background is black for readability and aesthetics: it provides better contrast for the colours and anchors the visualisation.
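
Sizing nodes by degree amounts to mapping each node's degree onto the chosen size range. The sketch below assumes a simple linear interpolation between the minimum size of 1 and the maximum size of 26; the function name and the example degrees are hypothetical, and Gephi offers other interpolation curves that this does not cover.

```python
def scale_sizes(degrees, min_size=1.0, max_size=26.0):
    """Linearly map each node's degree onto the range [min_size, max_size].

    `degrees` maps node name to degree. If every node has the same degree,
    all nodes receive min_size (an arbitrary choice for this sketch).
    """
    if not degrees:
        return {}
    lowest = min(degrees.values())
    highest = max(degrees.values())
    if highest == lowest:
        return {node: float(min_size) for node in degrees}
    span = highest - lowest
    return {
        node: min_size + (degree - lowest) / span * (max_size - min_size)
        for node, degree in degrees.items()
    }


if __name__ == "__main__":
    # Made-up degrees, for illustration only.
    print(scale_sizes({"a": 1, "b": 3, "c": 5}))
```

With this mapping the lowest-degree node is drawn at size 1, the highest-degree node at size 26, and everything else falls proportionally in between.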

Visualisation Two is a force-directed visualisation: nodes are sized by degree (minimum size: 1, maximum size: 26), coloured by type (pink for characters, green for locations), and labelled. In the “Edges” table of the “Data Laboratory” tab I created a “Colour” column by copying the ID column. This allowed me to select “Colour” as a partition in the “Appearance” window. The edges are curved with a thickness of 1.0. The background is black for readability and aesthetics.

After creating the two force-directed visualisations I started browsing the plugins to create other visualisations. Visualisation Three is a radial diagram: nodes are sized by degree (minimum size: 1, maximum size: 26), coloured by modularity class, and labelled. The background is black for readability and aesthetics. As I expected, this visualisation is not useful for this type of data.

Future Directions

During this lab, I experienced numerous technical difficulties with Gephi and was only able to create force-directed and radial diagram visualisations. I would like to continue exploring Gephi plugins and working with this dataset to create an arc diagram and a world map.