Doubtful Sound Dolphin Community


Visualization

 

Introduction

For the Gephi Lab, I chose an undirected social network dataset from the website for the Center for Computational Analysis of Social and Organizational Systems (CASOS). The dataset, compiled by Lusseau et al. (2003), documents interactions between 62 dolphins.  

I was interested in seeing how many clusters formed among the dolphins, as well as the density of the network as a whole.

 

Informative Visualizations

The Dolphin dataset is not large, which I kept in mind when I began looking for visualizations to inform my design. I sought out visualizations that used color to distinguish clusters and size to indicate degree. I found “Envisioning the Near Future of Technology”, the “Flavor Network”, and a visualization of character interactions in Game of Thrones to be the most informative for my purposes.

 

Materials

The data were available as two XML files, which I reformatted in Excel. For the file that would become my Edge table, I deleted unnecessary information and added the appropriate column headers for Source, Target, and Direction. I did the same thing for the file that would become my Node table, adding the appropriate column headers for ID and Name.  I saved both as CSV files, and imported them into Gephi.   
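The same reshaping could also be scripted instead of done by hand in Excel. The following is a minimal sketch, assuming the associations have already been pulled out of the XML as pairs of IDs. The function names and sample rows are hypothetical, and the column headers simply mirror the ones described above.

```python
import csv
import io

def write_edge_table(pairs, out):
    """Write (source, target) pairs as an undirected edge table."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Direction"])
    for source, target in pairs:
        writer.writerow([source, target, "Undirected"])

def write_node_table(names, out):
    """Write an {id: name} mapping as a node table, sorted by ID."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "Name"])
    for node_id, name in sorted(names.items()):
        writer.writerow([node_id, name])

# Hypothetical sample rows, not taken from the dolphin dataset.
sample_names = {1: "A", 2: "B", 3: "C"}
sample_pairs = [(1, 2), (2, 3)]

edge_buffer = io.StringIO()
node_buffer = io.StringIO()
write_edge_table(sample_pairs, edge_buffer)
write_node_table(sample_names, node_buffer)
print(edge_buffer.getvalue())
print(node_buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; passing a file opened with `open(path, "w", newline="")` would produce the CSV files on disk.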

 

Methods and Results

I began my design by using the Force Atlas 2 Layout, and running statistics on Degree, Diameter, Density, and Modularity. I changed the appearance of the nodes, sizing based on degree and coloring based on modularity. On the Preview screen, I selected “Show Labels” and “Proportional Size.”
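To make the modularity statistic less of a black box, here is a rough sketch of how a modularity score can be computed for a partition that is already known. It assumes an undirected, unweighted edge list; the function name and the toy graph are hypothetical. Gephi also has to find the partition itself (its modularity tool is based on the Louvain method), which this sketch does not attempt.

```python
from collections import Counter

def modularity(edges, community_of):
    """Score a given partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = Counter()    # edges with both endpoints in the community
    degree_sum = Counter()  # total degree of the community's nodes
    for source, target in edges:
        cs, ct = community_of[source], community_of[target]
        degree_sum[cs] += 1
        degree_sum[ct] += 1
        if cs == ct:
            internal[cs] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )

# Toy graph: two triangles joined by a single edge (not the dolphin data).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
toy_q = modularity(toy_edges, toy_partition)
print(round(toy_q, 3))  # 0.357
```

A higher score means more edges fall inside communities than would be expected if edges were placed at random with the same node degrees.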

dolphin_1

At this point, I had answers to my two questions: the network had a density of 0.084, and I could identify four distinct clusters based on color. I could also see that certain nodes (SN4, Grin, Scabs, Topless, and Trigger) were well connected. Node size helped make this clear, but the proportional label sizes made these nodes even easier to pick out.
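For reference, the density and degree figures can be reproduced outside Gephi. This is a minimal sketch assuming an undirected graph with no self-loops or duplicate edges; the function names and toy edge list are hypothetical. With 62 nodes there are 62 × 61 / 2 = 1,891 possible pairs, so a density of 0.084 corresponds to roughly 159 associations (a figure derived here from the reported density, not stated in the dataset description above).

```python
from collections import Counter

def density(node_count, edge_count):
    """Density of an undirected simple graph: 2m / (n * (n - 1))."""
    if node_count < 2:
        return 0.0
    return 2 * edge_count / (node_count * (node_count - 1))

def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts

# Toy edge list, not the dolphin data.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
toy_degrees = degrees(toy_edges)
print(round(density(4, len(toy_edges)), 3))  # 4 of 6 possible edges: 0.667
print(toy_degrees.most_common(1))            # the best-connected node: [('a', 3)]
```

Sorting the degree counts in this way is one route to spotting the well-connected individuals that the node and label sizes highlight visually.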

Next, I tried to explore how other layouts affected the network. Most layouts, such as Expansion and Yifan Hu, looked essentially the same as Force Atlas 2, which I assumed had to do with the small size of my dataset. Fruchterman Reingold looked very different, but didn’t show the clusters well, so I did not explore it further.

My next step was to explore a hypergraph. Using my existing Edge table, I replaced the original source and target IDs with their corresponding cluster IDs. In the Node table, I replaced the IDs with their corresponding cluster IDs, and created a new column, Size, which included the number of nodes in each cluster.
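The ID replacement described above amounts to collapsing the node-level graph into a cluster-level one. A minimal sketch of that aggregation follows, assuming each node has already been assigned a cluster ID; the function name and toy data are hypothetical. Dropping edges that fall inside a single cluster is an assumption made here, since the text does not say how those were handled.

```python
from collections import Counter

def collapse_to_clusters(edges, cluster_of):
    """Aggregate a node-level edge list into a cluster-level graph.

    Returns (sizes, weights): nodes per cluster, and the number of
    original edges running between each pair of clusters.
    """
    sizes = Counter(cluster_of.values())
    weights = Counter()
    for source, target in edges:
        a, b = cluster_of[source], cluster_of[target]
        if a == b:
            continue  # assumption: ignore edges inside one cluster
        weights[tuple(sorted((a, b)))] += 1
    return sizes, weights

# Toy data, not the dolphin clusters.
toy_clusters = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]
sizes, weights = collapse_to_clusters(toy_edges, toy_clusters)
print(dict(sizes))    # {0: 2, 1: 2, 2: 1}
print(dict(weights))  # {(0, 1): 2, (1, 2): 1}
```

The `sizes` counts play the role of the Size column, and the `weights` counts are the quantity that edge thickness would represent in the cluster-level view.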

I began the design by running Force Atlas 2 with “Prevent Overlap” selected. I then ran the same statistics I had used for my original visualization. I sized the nodes based on the Size attribute, and colored them using the same palette I had used to show clusters in my original visualization. On the Preview screen, I checked “Rescale Weight” for the edges, so the thickness of each edge indicated the number of connections between the nodes it joined.

hypergraph_3

 

Discussion

This is a small dataset, which made it difficult to explore alternative visual possibilities. This also makes it difficult to imagine what other directions one could take in exploring this dataset further. I would be interested in going back to the Fruchterman Reingold layout, manipulating aspects of the layout and appearance, and seeing if anything different could be learned about the network.  I would also enjoy exploring the hypergraph further, to see what more I could learn from it.

References

  D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations, Behavioral Ecology and Sociobiology 54, 396-405 (2003).