Visualizing Artists' Connections with Gephi


Visualization

Introduction

Art historical research and collecting tend to focus overwhelmingly on the creative work of white, male artists. In response, many artists and activists have turned to crowd-sourced platforms, such as Wikipedia, to create records for artists not always included in the mainstream historical narrative. One such project, the Black Lunch Table, organizes Wikipedia edit-a-thons to increase the representation of Black artists in the online encyclopedia.

As information is added to Wikipedia, art history researchers can use Wikidata, the backend database for Wikipedia, to conduct sophisticated searches that draw out new, surprising connections within a collection. This lab uses network visualizations to explore some of those potential connections and ask how thinking through networks can challenge dominant narratives in art history. It will look at data for one particular group, female African American visual artists.

This lab leans on three examples: the Black Lunch Table archive, a visualization of contemporary artists' connections, and the “Tudor Networks.”

The Black Lunch Table’s archive visualization highlights the topics addressed in the collection. Through the use of size and color, major themes in Black art jump out at the viewer. In the ARTIST-artist network, exhibition and catalog information connects contemporary artists directly to their collaborators.

In my network visualizations, I wanted to bring together elements from both of these graphs by showing the connections of artists to other artists within a certain area of interest that would reflect the themes and concerns of the Black Lunch Table.

I also took inspiration from the “Tudor Networks,” which shows connections based on correspondence over time. Although I knew my visualization would not look similar to this one, I felt inspired by the way the element of time added richness to the connections.

Materials

For this lab I used Gephi, an open-source network analysis and visualization tool, to create my graphs, and OpenRefine to clean and format my data.

I extracted my dataset from Wikidata using the Wikidata Query Service. In my query, I searched for African American female visual artists and paired each of them with earlier-born visual artists who went to the same school, had the same occupation, and had works in the same museum collections. My full query is linked here and the template, which will work for any group, is pasted below.

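SELECT ?source ?target
WHERE {
  # ?source is the later-born artist
  ?source wdt:P106 ?occu ;          # occupation
          wdt:P69 ?school ;         # educated at
          wdt:P6379 ?collection ;   # has works in the collection
          wdt:P569 ?sourcebirth .   # date of birth
  # ?target shares the same occupation, school, and collection
  ?target wdt:P106 ?occu ;
          wdt:P69 ?school ;
          wdt:P6379 ?collection ;
          wdt:P569 ?targetbirth .

  FILTER (?source != ?target)            # no self-pairs
  FILTER (?targetbirth < ?sourcebirth)   # target was born before source
}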
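
For readers who would rather run the template from a script than from the query service's web page, here is a minimal Python sketch. It assumes the public endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results format. The function names, the User-Agent string, and the sample response (with placeholder identifiers Q1 and Q2) are hypothetical, and this is not the workflow I used for the lab.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query):
    """Prepare (but do not send) a GET request for a SPARQL query."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    headers = {
        # Hypothetical User-Agent; replace with one that identifies your project.
        "User-Agent": "artist-network-lab-sketch/0.1",
        "Accept": "application/sparql-results+json",
    }
    return urllib.request.Request(f"{ENDPOINT}?{params}", headers=headers)


def parse_pairs(response_text):
    """Turn a SPARQL JSON response into rows of source/target entity IDs."""
    bindings = json.loads(response_text)["results"]["bindings"]
    rows = []
    for b in bindings:
        rows.append({
            # Entity URIs end in the Q-identifier, e.g. .../entity/Q1
            "source": b["source"]["value"].rsplit("/", 1)[-1],
            "target": b["target"]["value"].rsplit("/", 1)[-1],
        })
    return rows


# Hypothetical response with placeholder identifiers, used to show the parsing step.
sample_response = json.dumps({
    "head": {"vars": ["source", "target"]},
    "results": {"bindings": [{
        "source": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2"},
        "target": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
    }]},
})
rows = parse_pairs(sample_response)
request = build_request("SELECT ?source ?target WHERE { }")
```

To fetch live results, pass the request to urllib.request.urlopen, decode the response body as UTF-8, and hand the text to parse_pairs.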

In order to visualize my data in Gephi, I had to format it appropriately. In addition to renaming columns, I collapsed every relationship that recurred multiple times into a single row. My original query returned duplicate results when artists had multiple relationships. For example, if an artist both went to the same school and shared work in a collection, that pair appeared twice. Rather than discard the repeats, I counted them: if an artist pair appeared together five times, I recorded a weight of five in a new column, titled “weight.” I chose to keep that information because I felt it spoke accurately to the strength of connections, one of my key areas of interest. Finally, I created a column to indicate that the dataset is directed, meaning that each relationship runs in one direction only, from the later-born artist to the earlier-born one. This reflects the relative ages of the artists, introducing an element of time that I thought might enrich my visualization.
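
I did this cleanup in OpenRefine, but the same steps can be sketched in a few lines of Python. The sketch below assumes the query results are already loaded as rows with "source" and "target" keys. The artist names are placeholders, and the Source, Target, Type, and Weight headers follow the column names that Gephi's spreadsheet importer recognizes for edge tables.

```python
import csv
import io
from collections import Counter


def build_edge_table(rows):
    """Collapse repeated (source, target) pairs into one weighted, directed edge."""
    counts = Counter((r["source"], r["target"]) for r in rows)
    return [
        {"Source": s, "Target": t, "Type": "Directed", "Weight": n}
        for (s, t), n in sorted(counts.items())
    ]


def to_csv(edges):
    """Serialize the edge table as CSV text that can be saved and imported."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["Source", "Target", "Type", "Weight"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(edges)
    return buf.getvalue()


# Placeholder rows: the B -> A pair comes back twice from the query.
rows = [
    {"source": "artist_B", "target": "artist_A"},
    {"source": "artist_B", "target": "artist_A"},
    {"source": "artist_C", "target": "artist_A"},
]
edges = build_edge_table(rows)
edge_csv = to_csv(edges)
print(edge_csv)
```

Counting the duplicates instead of dropping them is what preserves the strength of each connection as an edge weight.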

When I uploaded this data to Gephi, the software automatically extracted unique identifiers for each individual person in my dataset and stored that information in a separate table. After running some summary statistics, I now had the freedom to experiment with layouts, appearance, and further analysis.

I chose to keep the graph fairly simple visually, in order to emphasize what I saw as its most valuable insights. Both size and darkness of color in this graph indicate in-degree (the number of incoming relationships), so the most important relationship, that of an artist to a predecessor, has a double encoding. Since the graph only uses one color for nodes, it is possible to compare all of the nodes on the graph visually, even though they fall into relatively separate communities. Edges are colored inversely by weight, to reduce the visual clutter caused by a few very strong relationships.
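
Gephi calculates in-degree and applies the size ranking itself, but the idea is simple enough to sketch by hand. The Python below counts incoming edges for each node and maps the counts linearly onto a node size. The size range of 10 to 50, the function names, and the three-node example are assumptions for illustration, and Gephi's own ranking may interpolate differently.

```python
def in_degree(edges):
    """Count incoming edges per node; nodes with none are kept with a count of 0."""
    degree = {}
    for source, target in edges:
        degree.setdefault(source, 0)
        degree[target] = degree.get(target, 0) + 1
    return degree


def scale(value, lo, hi, out_min=10.0, out_max=50.0):
    """Linearly map value from the range [lo, hi] to [out_min, out_max]."""
    if hi == lo:
        return out_min
    return out_min + (value - lo) / (hi - lo) * (out_max - out_min)


# Placeholder edges, each running from a later-born artist to a predecessor.
edges = [("B", "A"), ("C", "A"), ("C", "B")]
degree = in_degree(edges)
lo, hi = min(degree.values()), max(degree.values())
sizes = {node: scale(d, lo, hi) for node, d in degree.items()}
print(sizes)
```

In this toy example, A has two incoming edges and so gets the largest node, while C has none and gets the smallest.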

I think this visualization raises very compelling questions for art history and the push to diversify exhibitions and collection holdings. On the one hand, it holds perceptions to account. How are these relationships treated within and across collections? Do exhibiting strategies reflect, obscure, or celebrate these connections? Do some of the stronger relationships speak to the hyper-visibility of highly prominent artists in the digital record? It also proposes ideas, offering a way of thinking of art history as the exploration of the once-removed connection. And it invites intervention into the segregation of these artists from one another, because of place, perception, or occupation, as well as their displacement from mainstream digital and physical collecting.

Perhaps most excitingly, this graph could evolve in a number of ways. As projects like the Black Lunch Table add information about these and other artists to Wikipedia, this network could grow to include more places, more disciplines, and more history. It serves as a model, both conceptually and in its coding, for performing this analysis with other communities as well.

It should also be possible to reproduce this graph as an interactive and interoperable artifact. In addition to supplying the data for edge tables, Wikidata has all the information necessary to make details about these artists available right on the graph. Researchers could compare exhibitions or artworks by clicking on a node. The graph could even link to the individual Wikipedia articles of the artists, allowing the network to serve as another kind of exploratory tool. The way that Wikidata handles data means that an open-source, reproducible way of creating this kind of network graph is very possible.