Introduction
I am always interested in consumption patterns and top-selling products across different fields, viewed statistically, because they tell you a lot about what is going on around the world and what may happen next. Book consumption trends are especially informative about what people are thinking and how the messages circulating in society might shape what people come to believe, because books spread ideas, messages, and influence very quickly.
When I was first introduced to network data and analysis, I felt its power and wanted to build my own network around an exciting topic, so I chose books for my first network data project. I'm not picky about which types of books to start with, because what I want to know is how books connect to each other or form groups, and what those connections tell us.
Inspiration
I took inspiration from Wajih Shafiq's network analysis of international cricket, whose visualizations are strong at providing context and conveying findings. For example, Figure 1 highlights the disparity in the number of matches played by the Big Three, the Test-playing countries, and the Associate Teams by using three colors and similar node sizes. Figure 2 uses many more colors and dramatically different node sizes to show the modularity class of all parties. These two visualizations show different perspectives and facts from the same dataset, and I aim to use similar methods to present my book network data.
Process
The Dataset
The dataset I chose is a network of books about US politics published around the 2004 presidential election and sold by the online bookseller Amazon.com. According to KONECT (the second website where I found this dataset), this network was compiled by V. Krebs and is unpublished, but the original data can be found on Krebs' website. This is an excellent dataset because it comes with details (Figure 3) that tell you what you can do with it. For example, it describes the edges as unweighted with no multiple edges, and gives their meaning as co-purchase: two books are connected by an edge when the same buyers purchased both.
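As a rough illustration of what "unweighted, no multiple edges" means in practice, here is a minimal sketch that reads the edges out of a GML-style text and checks that no pair of nodes is connected twice. The sample text, the labels "Book A" to "Book C", and the helper names `read_edges` and `is_simple_undirected` are all hypothetical placeholders, not taken from the real dataset, and the regular expression only handles very simple GML. For the real file, a full GML reader such as `read_gml` in the networkx library would be the safer choice.

```python
import re

# Hypothetical GML fragment in the same general shape as a GML graph file.
# The labels are placeholders, not real titles from the dataset.
SAMPLE_GML = """
graph [
  directed 0
  node [ id 0 label "Book A" ]
  node [ id 1 label "Book B" ]
  node [ id 2 label "Book C" ]
  edge [ source 0 target 1 ]
  edge [ source 1 target 2 ]
  edge [ source 0 target 2 ]
]
"""


def read_edges(gml_text):
    """Return the (source, target) pairs found in simple GML text."""
    pattern = re.compile(r"edge\s*\[\s*source\s+(\d+)\s+target\s+(\d+)")
    return [(int(s), int(t)) for s, t in pattern.findall(gml_text)]


def is_simple_undirected(edges):
    """True if there are no self-loops and no repeated node pairs."""
    seen = set()
    for s, t in edges:
        pair = frozenset((s, t))  # (0, 1) and (1, 0) count as the same edge
        if s == t or pair in seen:
            return False
        seen.add(pair)
    return True


edges = read_edges(SAMPLE_GML)
print(len(edges), is_simple_undirected(edges))
```

Treating each edge as an unordered pair is what makes the check match an undirected co-purchase network, where "A bought with B" and "B bought with A" are the same relationship.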
Tools
The dataset I downloaded is a GML file, so I could open it directly in Gephi without any conversion. Gephi is open-source network analysis and visualization software that helps you explore, analyze, spatialize, filter, manipulate, and export all types of graphs, and it lets you view the graph interactively in a real-time 3D display. The interface might not look modern or feel intuitive, but the software is very robust and produces solid, diverse visualizations with its built-in features and plugins.
Methods
I first opened the dataset in Gephi and set the graph type to undirected, which matches the nature of my dataset. Gephi generated a tangle of black dots and lines that I would need to work on to make it user-friendly and informative.
Figure 4.1 Applied ForceAtlas 2 Figure 4.2 Result after running average degree Figure 4.3 Set node size with degree min & max
I then selected a classic layout, ForceAtlas 2, to create a typical spatial visualization of the network. It looked much better, and I started to see some structure. After selecting the basic layout, I ran several network statistics to see how I could organize the nodes: average degree, network diameter, graph density, and modularity.
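To make three of these statistics concrete, here is a minimal sketch that computes average degree, graph density, and network diameter for an undirected graph using the standard definitions. The edge list is a hypothetical toy graph (two triangles joined by one edge), not the book data, and the function names are my own. It assumes the graph is connected, and Gephi's own implementation may differ in details.

```python
from collections import deque

# Toy undirected edge list (hypothetical, not the book data):
# two triangles joined by a single bridge edge between nodes 2 and 3.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node, i.e. 2m / n."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def density(adj):
    """Share of all possible node pairs that are actually connected."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def eccentricity(adj, start):
    """Distance from start to the farthest node, found by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return max(dist.values())


def diameter(adj):
    """Longest shortest path in the graph; assumes the graph is connected."""
    return max(eccentricity(adj, node) for node in adj)


adj = build_adjacency(EDGES)
print(round(average_degree(adj), 3), round(density(adj), 3), diameter(adj))
```

In the toy graph the diameter is 3, because getting from one triangle's far corner to the other's takes three steps across the bridge.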
Figure 5.1 Applied color based on degree size Figure 5.2 Adjusted node size Figure 5.3 Adjusted labels
To make the network graph more readable, I experimented with node sizes and colors and made further design adjustments in the Preview section to generate my final results.
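The idea behind sizing nodes by degree with a minimum and maximum size can be sketched as a simple linear rescaling. This is only an assumption about the general approach, not a description of how Gephi computes sizes internally. The function name `scale_sizes`, the size range of 10 to 50, and the degrees below are hypothetical.

```python
def scale_sizes(degrees, min_size=10.0, max_size=50.0):
    """Linearly map each node's degree onto the range [min_size, max_size]."""
    lo, hi = min(degrees.values()), max(degrees.values())
    if hi == lo:
        # Every node has the same degree, so give them all the same size.
        return {node: min_size for node in degrees}
    span = hi - lo
    return {
        node: min_size + (deg - lo) / span * (max_size - min_size)
        for node, deg in degrees.items()
    }


# Hypothetical degrees for three placeholder books.
sizes = scale_sizes({"Book A": 2, "Book B": 5, "Book C": 8})
print(sizes)
```

The lowest-degree node gets the minimum size, the highest-degree node gets the maximum, and everything else falls proportionally in between.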
Result
Check out The Visualizations in High Resolution Here
The final results show the network of books about US politics, in the context described in the dataset section, in three different visualization layouts. The first graph was designed with ForceAtlas 2, the second with the Yifan Hu Proportional layout, and the last with the Fruchterman Reingold layout. In all three, the nodes are organized by modularity class, so it is easy to see the 4 modules and how they connect to each other spatially. A larger node indicates a higher degree, and the color of each edge indicates the node it comes from. To better understand each module, I imported the titles of the books from each module into word clouds and created additional visualizations so that users who have context on political topics can easily find information.
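For readers curious about what stands behind a modularity class, here is a minimal sketch of the standard modularity score Q for an unweighted, undirected graph: for each module, the share of edges that fall inside it minus the share expected from its total degree. The sketch only scores a grouping that is already given; it does not search for the grouping, which is the part Gephi's modularity tool performs. The toy edge list, the module names "left" and "right", and the function name are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity Q of a given partition of an unweighted, undirected graph."""
    m = len(edges)
    internal = {}    # number of edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph (hypothetical): two triangles joined by one bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
TWO_MODULES = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}

q = modularity(EDGES, TWO_MODULES)
print(round(q, 3))
```

Putting every node into a single module gives a score of 0, so a clearly positive score suggests the grouping captures more internal connection than chance alone would.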
Reflection
The network visualizations show the 4 modules, their connections, and how they compare, for people who know the political topics. I am not a huge fan of politics and have only a general sense of it, so I can only tell that there is something going on between the books that are connected to each other, and that the higher the degree of a book's node, the more influential the book was at that specific time, and vice versa. However, I realized that network visualization is a type of visualization that requires designers to include a lot of context to help users understand what everything is, and I am not an expert at creating that context other than by writing it all down in a report. For future work, I aim to find a way to build more context into the network visualization itself.
References
Newman, Mark. (2013, April 19). Network Data. http://www-personal.umich.edu/~mejn/netdata/
KONECT. Political Books. KONECT. http://konect.cc/networks/dimacs10-polbooks/
Krebs, Valdis. (2008). New Political Patterns. Orgnet. http://www.orgnet.com/divided.html
Shafiq, Wajih. (2020, Oct 30). A Network Analysis of The Monopolies on International Cricket. Studentwork.prattsi.org. https://studentwork.prattsi.org/infovis/visualization/a-network-analysis-of-the-monopolies-on-international-cricket/