Social Network in Zachary’s Karate Club

Background

Zachary’s karate club is a social network of a university karate club, described in the paper “An Information Flow Model for Conflict and Fission in Small Groups” by Wayne W. Zachary. The network became a popular example of community structure in networks after Michelle Girvan and Mark Newman used it in 2002. Zachary studied the club for three years, from 1970 to 1972. The network captures 34 members of the club and documents links between pairs of members who interacted outside the club. During the study, a conflict arose between the administrator “John A” and the instructor “Mr. Hi” (pseudonyms), which split the club in two. Half of the members formed a new club around Mr. Hi; the others found a new instructor or gave up karate. Based on the collected data, Zachary correctly assigned all but one member of the club to the group they actually joined after the split.

This is the graphic representation of the social relationships among the 34 individuals in the karate club. A line is drawn between two points when the two individuals being represented consistently interacted in contexts outside those of karate classes, workouts, and club meetings. Each such line drawn is referred to as an edge.

Dataset

The standard 78-edge network data set for Zachary’s karate club is publicly available on the internet. The data can be summarized as a list of integer pairs. Each integer represents one karate club member, and a pair indicates that the two members interacted. The data set is summarized below and also in the adjoining image. Node 1 stands for the instructor “Mr. Hi”, node 34 for the club administrator/president “John A”. Although this version of the network is considered standard, the connection between nodes 34 and 23 is ambiguously reported in Zachary’s original paper. A 77-edge version, which omits this edge, is also publicly available.
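To make the "list of integer pairs" format concrete, here is a minimal Python sketch of how such a file could be read into a set of undirected edges, and how the ambiguous 23–34 tie could be dropped to get the smaller variant. The `SAMPLE` text is a short illustrative excerpt I typed by hand, not the full 78-edge file, and the helper names `parse_edge_list` and `without_edge` are my own.

```python
Edge = tuple[int, int]


def parse_edge_list(text: str) -> set[Edge]:
    """Parse lines such as '1 2' into undirected edges (smaller id first)."""
    edges: set[Edge] = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        a, b = int(parts[0]), int(parts[1])
        # Store each pair in a canonical order so '23 34' and '34 23'
        # count as the same interaction.
        edges.add((min(a, b), max(a, b)))
    return edges


def without_edge(edges: set[Edge], a: int, b: int) -> set[Edge]:
    """Return a copy of the edge set with the tie between a and b removed."""
    return edges - {(min(a, b), max(a, b))}


# Illustrative excerpt only (assumption): NOT the real data file.
SAMPLE = """\
1 2
1 3
2 3
23 34
34 23
"""

edges = parse_edge_list(SAMPLE)
members = {member for edge in edges for member in edge}
reduced = without_edge(edges, 34, 23)

print(len(members), "members,", len(edges), "edges")
print(len(reduced), "edges after dropping the 23-34 tie")
```

Run on the full file, the same two calls should give the 78-edge and 77-edge versions described above.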

Process

My analysis was driven by a simple but compelling question: Who interacts the most, and how might this influence their role in the group? I used Gephi to visualize the network with a focus on interaction patterns. Specifically:

Node size represents the total number of interactions a member has (both incoming and outgoing).

Node color indicates the intensity of interaction—darker shades for higher interaction counts.

Version 1. Focused on in-degree interaction

Version 2. Focused on out-degree interaction

One notable finding was that node 1 and node 34, the club instructor and the club president respectively, were among the largest and darkest nodes, which reflects their critical roles in the group. They also stood at the center of the two opposing clusters in the final visualization, mirroring the real-world division between the followers of each leader.

Another observation was the asymmetry in some interactions. Certain nodes (node 2 and node 3) had high in-degrees but relatively low out-degrees, suggesting members who were reached out to often but were not necessarily central in the group’s social life. This prompted further questions about influence versus effort in social settings.
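The degree measures behind node size and the in/out comparison can be sketched in a few lines of Python. This is only an approximation of what a tool like Gephi computes, under the assumption that each pair is read as (source, target), so the direction comes purely from the order in which a pair is listed. The `pairs` list is a made-up toy example, not the karate club data.

```python
from collections import Counter


def degree_counts(pairs):
    """Return (in_degree, out_degree, total_degree) Counters for directed pairs."""
    in_degree = Counter()
    out_degree = Counter()
    for source, target in pairs:
        out_degree[source] += 1  # the first member of the pair reaches out
        in_degree[target] += 1   # the second member is reached
    total_degree = in_degree + out_degree
    return in_degree, out_degree, total_degree


# Hypothetical toy pairs (assumption), chosen so node 2 is reached often
# but reaches out rarely.
pairs = [(1, 2), (1, 3), (4, 2), (5, 2), (2, 3)]

in_degree, out_degree, total_degree = degree_counts(pairs)

# Rank members by total interactions, the quantity mapped to node size.
for node, total in total_degree.most_common():
    print(node, "in:", in_degree[node], "out:", out_degree[node], "total:", total)
```

In this toy example node 2 has an in-degree of 3 and an out-degree of 1, the same kind of asymmetry described above for nodes 2 and 3.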

Reflection

One of the most fascinating aspects of this project is how accurately network analysis can reflect real-world group dynamics. Zachary predicted the club’s split with near-perfect accuracy, misjudging only member #9, who chose Mr. Hi over John A. It’s striking how much context can be uncovered from just numerical data. Zachary’s use of the Ford–Fulkerson maximum flow–minimum cut algorithm was especially compelling. By modeling Mr. Hi as the “source” and John A as the “sink,” he identified a network “cut” that closely matched the actual division. This highlights how communication flow plays a vital role, with members on the network’s border most likely to shift sides.

While my visualization effectively shows key influencers and interaction volume, there’s room for refinement. Separating in- and out-degree more clearly or adding contextual background could deepen the analysis. This project reinforced how powerful network data can be, not just for mapping relationships but for revealing the stories behind them.
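To see how a source/sink cut divides a network, here is a small Python sketch of maximum flow and minimum cut. It uses the breadth-first (Edmonds–Karp) variant of Ford–Fulkerson, which is my choice for the sketch and not necessarily the exact procedure Zachary followed. The six-node network and its capacities are invented for illustration; they are not the karate club data.

```python
from collections import defaultdict, deque


def min_cut(ties, source, sink):
    """Return (max_flow, source_side) for undirected ties (u, v, capacity)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, capacity in ties:
        residual[u][v] += capacity
        residual[v][u] += capacity  # an undirected tie carries flow both ways

    def reachable_parents():
        """Breadth-first search over edges that still have spare capacity."""
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor, spare in residual[node].items():
                if spare > 0 and neighbor not in parents:
                    parents[neighbor] = node
                    queue.append(neighbor)
        return parents

    flow = 0
    while True:
        parents = reachable_parents()
        if sink not in parents:
            # No augmenting path is left: the nodes still reachable from
            # the source form the source side of a minimum cut.
            return flow, set(parents)
        # Find the smallest spare capacity along the path found.
        bottleneck = float("inf")
        node = sink
        while parents[node] is not None:
            bottleneck = min(bottleneck, residual[parents[node]][node])
            node = parents[node]
        # Push that much flow along the path.
        node = sink
        while parents[node] is not None:
            previous = parents[node]
            residual[previous][node] -= bottleneck
            residual[node][previous] += bottleneck
            node = previous
        flow += bottleneck


# Hypothetical toy network (assumption): node 1 plays the "source" and
# node 6 the "sink"; capacities stand in for tie strength.
ties = [(1, 2, 3), (1, 3, 2), (2, 3, 1), (2, 4, 1),
        (3, 5, 1), (4, 5, 1), (4, 6, 2), (5, 6, 3)]

flow, source_side = min_cut(ties, source=1, sink=6)
print("max flow:", flow)
print("source side of the cut:", sorted(source_side))
```

In the toy network the weakest link between the two groups is the pair of ties 2–4 and 3–5, so the cut places nodes 1, 2, and 3 with the source and the rest with the sink.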

Resources

https://en.wikipedia.org/wiki/Zachary%27s_karate_club#cite_note-Data-3

https://networkkarate.tumblr.com

http://www.communicationcache.com/uploads/1/0/8/8/10887248/an_information_flow_model_for_conflict_and_fission_in_small_groups.pdf

Information Visualization

Student work at the School of Information, Pratt Institute
