Women Nobel Peace Prize Winners – A Network Analysis



Introduction

A professional modus operandi of mine has always been, and will always be, highlighting feminist work. Whether that means applying critical feminist analysis or simply using gender-sensitive data, these dimensions are my starting point. This is my first foray into network analysis, so naturally the first thing that came to mind was my personal experience with feminist networks.

I have always wanted to find novel ways to visualize one of the datasets I started years ago, in part because it’s the only one of its kind: the database of Civil Society Briefers at the United Nations Security Council. However, I ultimately decided against using it because country names are repetitive; both the civil society representative and the UN Security Council President who invited them to speak need country labels, which can be confusing to non-experts. I resolved to save that analysis for future iterations of this experiment.

The ultimate questions I asked were: “How many women have won Nobel Peace Prizes, in what category, and when?” and “What can network analysis show me about the women who have won?”

Visualization Research

One of the visualizations I particularly liked during research for this experiment is Cosponsorship networks in the Swedish Parliament, years 2014-2018. The range of colors for nodes and edges is visually appealing, as are the background color and legend. The interactive element is crucial to the effectiveness of this network visualization. The software I used for this experiment does not lend itself to interaction, but this visualization prompted me to think about which information was important to display front and center.

Materials, Software & Datasets

The primary software used for data visualization was Gephi. Additional software included Google Sheets and Microsoft Excel for data manipulation, and Google Chrome and Firefox for web browsing. As detailed below, I had hoped to use OpenRefine but had no luck. I’m running 64-bit Windows 10 on a Dell XPS 15 9500. Peer review was conducted via the Zoom desktop application.

I explored several datasets from CASOS, SNAP, SocioPatterns, and the Gephi Wiki without finding any adequate gendered data. I moved on to Moviegalaxies, and while that was fun to explore, none of its datasets were large enough for this experiment. Then I landed on Kaggle (at the suggestion of my peer reviewer) and searched “gender” and “women” before finally settling on Women in Nobel Prize 1901-2019. I experimented with creating my own network by manipulating the data into separate sheets of nodes and edges.

Methods & Processes 

I attempted to work with OpenRefine, but after considerable effort and troubleshooting I was unable to run the program in any browser. A promising GitHub thread documented the issue I repeatedly experienced, but following those instructions (among others) did not resolve it. I tried the following: (re)installing different release versions of OpenRefine and Java, running both openrefine.exe and refine.bat, adjusting the .ini files for both, forcing port and IP changes from the Command Prompt, working in different browsers, updating browser and network proxy settings, whitelisting http://127.0.0.1:3333/ and its variants on my firewalls, disabling my firewalls, and going through http://localhost:3333.

Having exhausted all my troubleshooting options, I set OpenRefine aside and opted instead for my old friend Google Sheets to clean and process the data, making it ready for Gephi and for network analysis more broadly. Once the data was ready, I downloaded it as a CSV and imported it into Gephi.

Using the CONCATENATE function, I easily defined my nodes and edges in separate sheets. Or so I thought. 
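The same split into a node sheet and an edge sheet can be sketched in Python. This is only an illustration, not the spreadsheet formulas used for this experiment. The input column names (year, category, gender) and the helper names build_tables and to_csv are assumptions made for the example, and the sample is a handful of illustrative rows rather than the Kaggle file. The output headers follow the column names Gephi's spreadsheet import recognizes: Id and Label for nodes, Source, Target and Type for edges.

```python
import csv
import io

# Illustrative rows only. The column names (year, category, gender) are
# assumptions, not the actual headers of the Kaggle dataset.
SAMPLE = """year,category,gender
1901,Physics,male
1903,Physics,female
1905,Peace,female
1911,Chemistry,female
1917,Peace,organization
2014,Peace,female
"""


def build_tables(rows):
    """Return (nodes, edges) as lists of dicts with Gephi-style column names."""
    node_ids = []
    edges = []
    for row in rows:
        source, target = row["gender"], row["category"]
        # Each distinct gender or category becomes one node.
        for node in (source, target):
            if node not in node_ids:
                node_ids.append(node)
        # Each prize becomes one edge, with the year kept as an attribute.
        edges.append({"Source": source, "Target": target,
                      "Type": "Undirected", "Year": row["year"]})
    nodes = [{"Id": n, "Label": n} for n in node_ids]
    return nodes, edges


def to_csv(records):
    """Serialize a list of dicts to CSV text, header first."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
nodes, edges = build_tables(rows)
print(to_csv(nodes))
print(to_csv(edges))
```

Writing each table to its own file would give one CSV to import as the nodes table and one as the edges table.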

Results

My results varied. I ran several iterations of the experiment but did not create a satisfactory result.

Noverlap

The edge sizes represent the years when men, women or organizations won a prize, corresponding to the category of prize. The node color scale corresponds to how many men, women or organizations won.

Force Atlas iteration of the same data structure, this time coloring the nodes by year density in addition to the edges, as in the example above. The screenshot includes the statistics and settings for the Force Atlas graph.

In both the above examples, I arranged the nodes to show gender (and organization) on the periphery in order to show that they were different from the category of prize nodes.
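If the goal is for edge thickness and node color to reflect how many prizes fall into each pairing, one option is to collapse repeated gender and category pairs into a single weighted edge before importing. The sketch below is an assumption about how that could be done, not the method used for the graphs above. The pairs are illustrative, the function names weighted_edges and winners_per_group are hypothetical, and Weight is the edge column Gephi reads as edge weight on import.

```python
from collections import Counter

# Illustrative (gender, category) pairs, one per prize; not the full dataset.
pairs = [
    ("male", "Physics"),
    ("female", "Physics"),
    ("female", "Peace"),
    ("female", "Chemistry"),
    ("organization", "Peace"),
    ("female", "Peace"),
]


def weighted_edges(pairs):
    """Collapse duplicate pairs into one edge whose Weight is the pair count."""
    counts = Counter(pairs)
    return [{"Source": s, "Target": t, "Weight": w}
            for (s, t), w in counts.items()]


def winners_per_group(pairs):
    """Count prizes per gender/organization group, usable as a node attribute."""
    return Counter(source for source, _ in pairs)


for edge in weighted_edges(pairs):
    print(edge)
print(winners_per_group(pairs))
```

With weights in place, a single thick edge between two nodes replaces many overlapping thin ones, which may be easier to read in a small bipartite graph like this.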

Force Atlas iteration of the data in a different structure, where the years are nodes and gender is represented by edge color. This is closer to the result I was hoping for, but the network is so large that it doesn’t adequately show the gender of the Nobel Prize winners, which is my main goal.

Close-up of the graph above, showing the detail that is lost when the larger network is exported as a PNG.
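The second structure described above, with years as nodes and gender carried on the edges, can be sketched as follows. The field names and the helpers year_category_edges and only_gender are hypothetical, and the rows are illustrative. Filtering to a single gender before import is one possible way to shrink the network so that the women winners stand out; it is offered as an assumption to test, not as a fix I have verified in Gephi.

```python
# Illustrative rows; field names are assumptions, not the Kaggle headers.
rows = [
    {"year": 1901, "category": "Physics", "gender": "male"},
    {"year": 1903, "category": "Physics", "gender": "female"},
    {"year": 1905, "category": "Peace", "gender": "female"},
    {"year": 1911, "category": "Chemistry", "gender": "female"},
    {"year": 1917, "category": "Peace", "gender": "organization"},
    {"year": 2014, "category": "Peace", "gender": "female"},
]


def year_category_edges(rows):
    """Link each year node to a category node; gender rides along as an edge attribute."""
    return [{"Source": str(r["year"]), "Target": r["category"],
             "Type": "Undirected", "Gender": r["gender"]}
            for r in rows]


def only_gender(edges, gender):
    """Keep only the edges for one gender to reduce the size of the graph."""
    return [e for e in edges if e["Gender"] == gender]


edges = year_category_edges(rows)
women = only_gender(edges, "female")
print(len(edges), "edges in total;", len(women), "for women")
```

An attribute column such as Gender can then be used to partition edge colors in Gephi's Appearance panel.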

Reflection & Future Directions

Whether it is my conceptual understanding of networks, my technical work and code, or both that needs adjusting, my results in Gephi were far from satisfactory.

In terms of data for the future, I would like to do a network analysis of social media networks and women politicians, in line with my growing body of work on the topic. I would also like to find novel ways to represent the civil society speaker data mentioned at the beginning of this piece.

For formatting, I would like more customization than Gephi affords and will likely export the graph as an .svg, then import it into Adobe Illustrator. I also intend to create interactive network visualizations, which I find far more effective than static images.

Before I can work with these larger datasets, which are more complex and orders of magnitude bigger than what I worked with here, I will need to solidify my conceptual and technical mastery of network analysis.