Network Analysis of Zachary’s karate club



Introduction

Zachary’s karate club is a classic example of social relationships within a small group. The dataset records the interactions among club members outside of the club. It also documents a conflict over the price of lessons between the instructor, Mr. Hi, and the club president, John. In the end, half of the members formed a new club around Mr. Hi, and the other half either stayed at the old karate club or gave up karate.

Inspiration

This is the graph that the original author created in his article to show the social relationships between club members. I like this graph because it is very neat. It is easy to see a symmetrical divide between two sub-groups (around member 34 and member 1). However, I decided to go with an asymmetrical divide by keeping the edges directed. I was hoping that this shift would give a different perspective on the picture.

This is a graph made by a classmate last semester showing the relationships between Marvel heroes. I like this graph because it makes it easy to spot the most popular Marvel heroes. It highlights the dominant members at a glance. The sub-groups are also relatively easy to tell apart by their distinctive colors. However, it did take me a while to figure out what the colors stand for, because the graph has no legend explaining them.

Materials

  1. Gephi 0.9.2 – free, open-source software for visualizing and analyzing networks and graphs
  2. Network data – a collection of freely available datasets
  3. Zachary’s karate club – a social network of friendships between 34 members of a karate club at a US university in the 1970s
  4. An information flow model for conflict and fission in small groups – article & analysis written by the author of the datasets. It provided me with a better understanding of the context.

Process

The dataset came in GML format, which Gephi can read directly. First, I checked that the dataset was clean. It has 34 unique nodes and 78 directed edges.
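
A check like this could also be scripted outside Gephi. The following is a minimal Python sketch, assuming the edges have already been parsed into (source, target) pairs. The function name `summarize_edges` and the tiny edge list are hypothetical and stand in for the real 78-edge file; this is not the procedure that was run in Gephi.

```python
def summarize_edges(edges):
    """Return basic cleanliness statistics for a directed edge list."""
    nodes = set()
    seen = set()
    duplicates = 0
    self_loops = 0
    for source, target in edges:
        nodes.update((source, target))
        if source == target:
            self_loops += 1          # an edge from a node to itself
        if (source, target) in seen:
            duplicates += 1          # the same directed edge listed twice
        seen.add((source, target))
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "duplicates": duplicates,
        "self_loops": self_loops,
    }


# Hypothetical toy data, NOT the karate club edges.
toy_edges = [(2, 1), (3, 1), (3, 2), (4, 1)]
summary = summarize_edges(toy_edges)
print(summary)
```

For the real file, a clean result would report 34 nodes and 78 edges with no duplicates or self-loops.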

Node Sizes Varied by Degree (non-directional)
Node Sizes Varied by In-Degree
Node Sizes Varied by Out-Degree

Then I tried different layouts and settled on the Fruchterman-Reingold layout because it shows the small sub-groups most clearly. After that, I increased the visual weight of the edges so that the arrows showing the direction of each relationship are visible. I added labels to the nodes for easy reference. Finally, I varied the size of the nodes by degree, in-degree and out-degree, and highlighted the largest nodes in each graph.
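
The three sizing measures can be illustrated with a short Python sketch. It assumes a directed edge list of (source, target) pairs; `degree_tables` and the toy edges are hypothetical names and data chosen for illustration, not the karate club file or Gephi's own implementation.

```python
from collections import Counter


def degree_tables(edges):
    """Count in-degree, out-degree and total degree for each node."""
    in_degree = Counter()
    out_degree = Counter()
    for source, target in edges:
        out_degree[source] += 1   # edge leaves the source node
        in_degree[target] += 1    # edge arrives at the target node
    nodes = set(in_degree) | set(out_degree)
    total = {n: in_degree[n] + out_degree[n] for n in nodes}
    return in_degree, out_degree, total


# Hypothetical toy data, NOT the karate club edges.
toy_edges = [(2, 1), (3, 1), (3, 2), (4, 1), (4, 2), (4, 3)]
in_deg, out_deg, total_deg = degree_tables(toy_edges)

most_received = max(in_deg, key=in_deg.get)   # node with the most incoming edges
most_sent = max(out_deg, key=out_deg.get)     # node with the most outgoing edges
print(most_received, most_sent)
```

In this toy graph every node has the same total degree, yet the in-degree and out-degree rankings differ, which is why the three sizing options can highlight different members.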

Yifan Hu Layout

Next, I created another graph using the Yifan Hu layout and expanded it several times. I also rotated the graph from vertical to horizontal so that viewers can easily tell the two parties apart.

Results

  • Average Degree: 2.294
  • Network Diameter: 3
  • Graph Density: 0.07
  • Modularity: 0.42
  • Average Clustering Coefficient: 0.285
  • Average Path Length: 1.274
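
Two of these figures can be checked by hand from the node and edge counts alone. The sketch below assumes the usual directed-graph definitions (average degree as edges per node, density as edges divided by N(N-1)); these definitions are an assumption, chosen because they reproduce the reported values.

```python
nodes = 34   # unique nodes in the dataset
edges = 78   # directed edges in the dataset

# Average degree for a directed graph: edges per node.
average_degree = edges / nodes

# Density for a directed graph: edges present / ordered node pairs possible.
density = edges / (nodes * (nodes - 1))

print(round(average_degree, 3))  # 2.294
print(round(density, 2))         # 0.07
```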

The average path length is short and the graph density is relatively high, which suggests a small, close-knit group in which information travels fast. An argument could therefore heat up quickly, and people needed to take a side. In the graph, two major groups appear to have formed around member 34 and member 1, and a smaller social group formed around member 3. The group around member 34 is denser and more spread out, while the group around member 1 is looser. Judging by the direction of the edges, member 34 was a very active member of the social group (many outgoing edges), while member 1 was a more popular and likeable figure (many incoming edges).

Reflections

If a dataset existed that recorded the relationships between club members before the conflict, I would like to compare the two and see how the argument changed the state of the group.

I had difficulty interpreting the results, especially the statistics that Gephi calculates in its right-hand panel. I also don’t like that the software automatically curves the edges when exporting the graph; the exported image ended up looking different from what I expected.

Reference:

W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977).