Prisoner Connections: A Gephi Presentation


Visualization

Introduction

In the 1950s, sociology professor John Gagnon conducted a survey of 67 prisoners.  The survey asked a single question: “Who do you consider yourself to be closest friends with?”  According to political scientist Duncan MacRae Jr., the data was extremely difficult to interpret.  No concrete groups appeared among the inmates, and no strong factor separated them into distinct categories.  The study was conducted fifty to sixty years ago, when sociologists did not have the resources that we have today.  So, using modern tools, can we detect patterns and communities in this prison population?  And if so, what further information could we infer about the prison system at the time?

Materials

The main tool used to process this data was Gephi, an open-source network visualization program.  The dataset on the prison population was retrieved from the CASOS network dataset repository.  Once the graph was completed, it became apparent that additional research was needed, so the original article was retrieved to ensure an accurate report.

Methods

The dataset used for this presentation contains only numbered nodes and undirected connections; none of the prisoners in the study is identified by name.  Because of this, the data was easy to work with.  Nodes were color-coded by modularity class and sized by degree, that is, by the number of connections each node has.  In theory, the graph would neatly divide the sixty-seven prisoners into distinct groups based on how densely they were connected to each other.
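To make the two measures concrete, here is a minimal sketch in plain Python of how degree and a modularity score can be computed for an undirected edge list.  The edge list and the partition are hypothetical toy values, not the prison data, and the function names are illustrative.  Gephi's modularity tool searches for a partition that scores well; this sketch only scores a partition that is already given.

```python
from collections import defaultdict

# Hypothetical toy edge list (NOT the prison data):
# two triangles joined by a single bridging edge.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]


def degrees(edges):
    """Count how many undirected edges touch each node."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  e_c / m - (d_c / (2m))^2
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    inside = defaultdict(int)
    total_degree = defaultdict(int)
    for u, v in edges:
        if community[u] == community[v]:
            inside[community[u]] += 1
    for node, d in degrees(edges).items():
        total_degree[community[node]] += d
    return sum(
        inside[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Hypothetical partition: one community per triangle.
partition = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}

node_degree = degrees(edges)       # would drive node size
q = modularity(edges, partition)   # higher means more clear-cut groups
```

A score near zero means the grouping is no better than chance, which is the kind of outcome to expect when communities are loosely defined.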

Results

The completed graph is shown below.  It both matched and defied expectations.

 

Just as Professor MacRae stated in his original article, the communities within the prison are much less clear-cut than communities in a school system.  While the clustering process carved out some distinct groups, several inmates shared multiple connections outside of their own groups.  This left the boundaries of certain groups loosely defined to the point of being nonsensical.  The orange and green groups in particular are so loose that they appear snakelike, cutting through multiple other groups while winding all over the map.

Since there is an element of randomness in the construction of the graph, some of the results appear unintuitive.  For instance, Inmate 7 shares connections with Inmate 5, Inmate 8, Inmate 28, and Inmate 40.  However, only Inmate 40 is in the same color group as Inmate 7.  What’s more, Inmates 5 and 8 are both in the purple color group, so logic would dictate that Inmate 7 belongs to that group as well.
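The intuition about Inmate 7 amounts to a majority vote among neighbors, which can be sketched as follows.  The connections and the purple label for Inmates 5 and 8 come from the description above; the names "group_of_7" and "other" are placeholders, and placing Inmate 28 in a third group is an assumption, since the text does not name those colors.

```python
from collections import Counter

# Inmate 7's connections, as described in the text.
neighbors_of_7 = [5, 8, 28, 40]

# "purple" for 5 and 8 is from the text. "group_of_7" and "other" are
# placeholder names; Inmate 28's group is an assumption for illustration.
group = {
    7: "group_of_7",
    40: "group_of_7",
    5: "purple",
    8: "purple",
    28: "other",
}

# Tally how many of Inmate 7's neighbors fall into each group.
tally = Counter(group[n] for n in neighbors_of_7)
majority_group, majority_count = tally.most_common(1)[0]

# True when the assigned group differs from the neighbor majority.
disagrees_with_majority = group[7] != majority_group
```

A modularity-based grouping scores the partition as a whole rather than voting node by node, so an assignment that disagrees with a node's local majority is possible and is not necessarily an error.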

The most intriguing finding in this presentation was that while the dataset marked all of the edges as undirected, the original study did not suggest that they were.  In Professor MacRae’s article, there is a table listing each prisoner’s choices regarding whom they considered to be friends.  These choices were laid out along x and y axes: x represented the choosers, and y represented the people they chose.  That terminology implies that the connections should be directed.  While every prisoner in x considered the corresponding prisoners in y to be friends, the prisoners in y did not always feel the same way about the prisoners in x.  This is suggested by subjects who are otherwise isolated from the group, such as Inmate 11 and Inmate 19.  Nevertheless, the dataset itself marked all of the edges as undirected.
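The difference between the two readings can be illustrated with a short sketch that treats each choice as a (chooser, chosen) pair.  The pairs below are hypothetical and are not values from MacRae's table; the function names are likewise illustrative.

```python
# Hypothetical (chooser, chosen) pairs; NOT values from the original table.
choices = [(1, 2), (2, 1), (1, 3), (4, 1), (3, 4)]


def split_by_reciprocity(choices):
    """Separate mutual friendships from one-way choices."""
    chosen = set(choices)
    mutual = {frozenset((x, y)) for x, y in chosen if (y, x) in chosen}
    one_way = {(x, y) for x, y in chosen if (y, x) not in chosen}
    return mutual, one_way


def symmetrize(choices):
    """Collapse directed choices into undirected edges.

    This is the information-losing step: afterwards a one-way choice
    and a mutual friendship look identical.
    """
    return {frozenset(pair) for pair in choices}


mutual, one_way = split_by_reciprocity(choices)
undirected = symmetrize(choices)

# Share of directed choices that are returned by the other person.
reciprocity = 2 * len(mutual) / len(set(choices))
```

In this toy example only one of the four undirected edges reflects a mutual choice, which is exactly the distinction an undirected dataset cannot show.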

Moving Forward

Since the study focused mainly on sociometric calculations and formulas, there was not a lot of data to work with for this project.  The article states that factors such as cell block location were not taken into consideration, and other factors might also have influenced the connections these people made.  Further research beyond the initial study might be necessary to paint an accurate picture of how social groups formed in prisons during the 1950s.  It might also be beneficial to rewrite the dataset to include directed connections.  This would likely change the overall design of the graph and show which relationships are reciprocal and which are not.  If Professor MacRae’s research is to be believed, though, it would not make much of a difference.