Soccer Teams From The World Championship, 1998


Visualization

The dataset I chose to work with was a network of 22 soccer teams that participated in the World Championship in Paris, 1998. The description of the dataset suggests that the data shows players with allegiances to multiple teams across different countries, but the given XML files seem to provide information only on teams from American universities. Having tried to understand the network I wound up creating, I believe it shows the rate at which teams trade their players to other teams, though I am not entirely certain.

 

I started out by attempting to get the data into Gephi, a process that required me to convert the XML files using OpenRefine. Following the given instructions, I was able to format the data for Gephi, though I found that process even less intuitive than my experience with Tableau. I also found the visualization tools within the program very difficult to intuit, and did not quite understand how they worked (or even whether they did). Ultimately one of the tools was able to recreate the network of the original data, which I assumed to be the correct one.

The generated network contained several different clusters, which I then color-coded. Even with the clearly colored clusters I was unsure how to characterize each group, due to my limited knowledge of either the data or university sports. I feel that it would’ve been more helpful if I had been able to provide a clear meaning behind the colors, because with the level of distinction I created they appear to be coded to suggest something specific. Without knowing more about the clusters, I think it would have been more responsible to use a palette of more closely related colors that still kept the clusters distinguishable.

Visually I was happy to have the nodes scale based on their supposed weight, and I liked that I was able to get the labels to scale accordingly. Unfortunately I didn’t feel like I handled the lower end of the spectrum well, and some of the labels are illegible because their nodes are so small. I also struggled to label the nodes effectively, as I wanted something that would be legible but also visually similar to the respective nodes. Ideally I would’ve liked to use a darker version of each cluster’s color for the text, but I couldn’t figure out how to achieve that. The box labels help by allowing the white text to stand out, but I think that they obscure the nodes too much.
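The darker label color described above can be worked out outside of Gephi. The following is a minimal Python sketch, not a Gephi feature: it assumes each cluster color is available as a hex string, and the function name `darken_hex` and the example color are hypothetical. It lowers the lightness of the color while keeping its hue, so the text stays visually tied to its cluster.

```python
import colorsys


def darken_hex(hex_color: str, factor: float = 0.5) -> str:
    """Return a darker shade of a '#rrggbb' color.

    The color is converted to HLS, its lightness is multiplied by
    `factor` (clamped to the range 0..1), and it is converted back.
    """
    factor = max(0.0, min(1.0, factor))
    digits = hex_color.lstrip("#")
    # Split 'rrggbb' into three channels scaled to 0..1.
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb(h, l * factor, s)
    return "#{:02x}{:02x}{:02x}".format(
        round(r2 * 255), round(g2 * 255), round(b2 * 255)
    )


# Hypothetical cluster color; the result could be typed in as the label color.
label_color = darken_hex("#3366cc", 0.5)
```

The resulting hex value would still have to be entered by hand for each cluster, so this only covers the color arithmetic, not the labeling workflow itself.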

I had hoped to be able to scale the edges similarly to the nodes, but several attempts kept rendering lines that were barely visible. I assume that this was due to the weak relationships between the respective nodes, with a lot of the teams having little to no relationship with a majority of the other teams. Raising the lower end of the scale so that the thinnest lines remained clearly visible would likely have helped, but I was unable to figure out how to achieve that effect. That’s how I wound up keeping the uniformly sized lines, even though they can be a little messy and hard to follow in places.
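The idea of raising the lower end of the edge scale can be illustrated with a simple linear rescaling. This is a rough Python sketch under stated assumptions, not a description of how Gephi scales edges: the function name `scale_widths`, the width limits, and the sample weights are all hypothetical. The weakest edge is mapped to a minimum width that is still visible, and the strongest edge to a chosen maximum.

```python
def scale_widths(weights, min_width=1.0, max_width=6.0):
    """Linearly map edge weights onto the range min_width..max_width.

    The smallest weight gets min_width, so no edge is drawn thinner
    than that floor; the largest weight gets max_width.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All edges are equally strong: use one middle width for all.
        return [(min_width + max_width) / 2 for _ in weights]
    span = hi - lo
    return [
        min_width + (w - lo) / span * (max_width - min_width)
        for w in weights
    ]


# Hypothetical edge weights, not taken from the dataset.
widths = scale_widths([1, 2, 3], min_width=1.0, max_width=5.0)
```

A rescaled column like this could in principle replace the raw weights in the edge table before import, though whether that produces the intended look in Gephi is untested here.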

Overall I found working with Gephi challenging, mostly due to my unfamiliarity with the interface. While I think that my network is visually interesting to a degree, I still do not clearly understand the data on several levels, and I feel that the description did not match the actual data. With this mismatch (or my misunderstanding) between data and description, it is hard for me to assess the successes or failures of my graph. Assuming that I have mapped the trades across university soccer teams, I think that resolving many of my legibility concerns would’ve helped in communicating the data.