Gephi Lab: Online Social Network by Chinaedu Maduagwu


Visualization

Introduction to topic

This visualization project centered on networks. The data describes an online social network, and the research behind it is the work of Tore Opsahl and Pietro Panzarasa. Their research aimed to understand weighted networks, including the strength of ties in large-scale networks. As they write in their article, “The strength of a tie is generally operationalized into a weight that is attached to the tie, thereby creating a weighted network” (Opsahl and Panzarasa, 2009). Because this project is about network connectivity, I considered it important to study and explore the ties and their weights graphically. The research used to inform the visualization also notes: “Exploring the information that weights hold allows us to further our understanding of networks. In social networks, strong ties are often found among socially embedded individuals” (Granovetter, 1973; Panzarasa et al., 2009).

Visualizations that informed design.

The first visualization that informed the project came from mathematical finance. Its nodes represent market participants and their relations to subsets of other market participants, giving a snapshot of a financial network in a single time slice. The networks represent complex systems of interacting entities and are organized around the idea of “community detection”.

The second visualization that informed the project drew on datasets from the human brain and from financial markets. Although the two datasets are very different, graphical analysis of both showed commonalities in their global network topological properties (Petra E. Vértes et al., 2011).

The final visualization that informed the project was a correlation network visualization: “Correlation networks are useful when you have a large number of variables measured over a period of time and you want to learn about which of the pair-wise relations among the variables are the strongest, most important or most interesting. In finance, the variables in question are often asset prices, and the ultimate goal is to learn which assets’ returns are most related in order to better understand and monitor financial markets” (Cook, 2012). While this project is not a replica of the correlation network visualization, the size of its data was useful for anticipating the possible structure of the resulting visualization.

Materials including software and datasets used in lab.

The software used to create this visualization was Gephi. The dataset was retrieved as a DL file from a collection of free social network datasets on GitHub. It describes an online social network with 1899 nodes (Opsahl and Panzarasa, 2009).

Methods used to create visualization.

The data was already clean and imported into Gephi as a DL file without any problems, so the next step was to compute the degree of the nodes from the Statistics panel.
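As a rough illustration of what the degree statistic measures, here is a minimal Python sketch. The edge list is a small hypothetical example, not the actual dataset, and it treats every tie as undirected, which is a simplifying assumption. It counts the ties attached to each node (degree) and also sums their weights (sometimes called node strength), since this dataset is weighted.

```python
from collections import defaultdict

# Hypothetical weighted edge list (source, target, weight); not the real dataset.
edges = [
    ("a", "b", 3),
    ("a", "c", 1),
    ("b", "c", 2),
    ("c", "d", 5),
]

def degree_and_strength(edge_list):
    """Return (degree, strength) dicts, treating every tie as undirected."""
    degree = defaultdict(int)    # number of ties attached to each node
    strength = defaultdict(int)  # sum of the weights of those ties
    for source, target, weight in edge_list:
        for node in (source, target):
            degree[node] += 1
            strength[node] += weight
    return dict(degree), dict(strength)

degree, strength = degree_and_strength(edges)
average_degree = sum(degree.values()) / len(degree)
print(degree, strength, average_degree)
```

In this toy network, node "c" has the most ties and the highest total weight, which is the kind of node that ends up largest once size is mapped to degree.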

 

After the degree was computed, the next step was to run the program's Network Diameter statistic to find the distances between nodes, and finally the graph diameter was run as well.
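To make the idea of diameter concrete, the sketch below computes it for a tiny hypothetical graph using breadth-first search. It assumes an unweighted, undirected, connected network, and it is only an illustration of the concept, not a description of how Gephi computes the statistic internally.

```python
from collections import deque

# Hypothetical undirected network stored as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

def shortest_path_lengths(adjacency, start):
    """Breadth-first search: hop counts from start to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def diameter(adjacency):
    """Longest shortest path between any two mutually reachable nodes."""
    return max(
        max(shortest_path_lengths(adjacency, node).values())
        for node in adjacency
    )

print(diameter(graph))
```

Here the longest shortest path runs from "a" (or "b") to "e" through "c" and "d", so the two most distant nodes are three hops apart.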

Data Laboratory was then opened, followed by Overview. In Overview, the Layout panel was opened and Force Atlas 2 was selected. The nodes were used for both layout and styling. Under Appearance, the size icon was clicked to adjust the size of the nodes. Back in the Data Laboratory, the range of the degree column was looked up to find the minimum and maximum values. Under Appearance, the attribute option was selected, then degree, and the range was entered as the minimum and maximum values. Then, under Statistics, Modularity was run.
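The size step above maps each node's degree onto a size between a minimum and a maximum. A minimal sketch of that idea, assuming a simple linear (min-max) interpolation, is shown below. The function name `scale_sizes` and the size limits of 10 and 50 are hypothetical choices for illustration, and Gephi's own interpolation may differ.

```python
def scale_sizes(degree, min_size=10.0, max_size=50.0):
    """Linearly map each node's degree onto the range [min_size, max_size]."""
    low, high = min(degree.values()), max(degree.values())
    if high == low:
        # Every node has the same degree, so every node gets the same size.
        return {node: min_size for node in degree}
    span = high - low
    return {
        node: min_size + (value - low) / span * (max_size - min_size)
        for node, value in degree.items()
    }

# Hypothetical degree values; not taken from the real dataset.
degree = {"a": 2, "b": 2, "c": 3, "d": 1}
sizes = scale_sizes(degree)
print(sizes)
```

The lowest-degree node gets the minimum size and the highest-degree node gets the maximum, which is why a wide degree range can leave many small nodes hard to see.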

 

Once the values were entered, the color was added back in Overview. Node labels were then added in Preview under Node Labels.
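For readers curious about what the modularity statistic scores, the sketch below computes the standard modularity value Q for a partition that is already given. It does not search for communities the way a community detection algorithm does. It only evaluates one grouping, on a hypothetical unweighted, undirected toy network of two triangles joined by a single tie.

```python
def modularity(edge_list, community):
    """Modularity Q of a given partition of an unweighted, undirected graph."""
    m = len(edge_list)
    internal = {}      # number of ties that fall inside each community
    total_degree = {}  # sum of node degrees within each community
    for u, v in edge_list:
        cu, cv = community[u], community[v]
        total_degree[cu] = total_degree.get(cu, 0) + 1
        total_degree[cv] = total_degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal ties / m) - (degree sum / 2m)^2
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in total_degree.items()
    )

# Hypothetical toy network: two triangles joined by the tie ("c", "d").
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the toy network into its two triangles scores higher than lumping every node into one group, which is the sense in which a higher score indicates a clearer community structure.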

 

Results/discussions, including images of embedded visualizations.

The visualization revealed a dynamic layout between nodes, as the weights of the network's ties varied in size and length across the plot. A total of 1899 nodes were imported, and they vary in size.

 

At this scale the layout does not show very strong clusters, and a few of the nodes get lost because of their tiny size.

 

These ties contribute to color-clustered segments that reveal concentrations of nodes in certain areas of the layout.

 

 

Overall, the layout reveals a social network in which a good concentration of nodes is connected.

 

Discussion of future directions.

As mentioned above, the final results revealed high levels of connectivity within the visualization. Moving forward, a few changes would make the data more understandable. Adding names to the nodes would help in identifying the networks. I would also reduce the number of nodes that appear redundant in the dataset to get a clearer final plot.
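One possible way to thin out the plot is to drop nodes with very few ties. The sketch below assumes that "redundant" can be approximated by low degree, which is my own simplification and may not match every case. The function name `filter_by_degree`, the threshold, and the edge list are all hypothetical.

```python
from collections import Counter

def filter_by_degree(edge_list, min_degree):
    """Keep only ties whose two endpoints both have at least min_degree ties."""
    degree = Counter()
    for source, target, _weight in edge_list:
        degree[source] += 1
        degree[target] += 1
    keep = {node for node, count in degree.items() if count >= min_degree}
    return [
        (source, target, weight)
        for source, target, weight in edge_list
        if source in keep and target in keep
    ]

# Hypothetical weighted edge list; not the real dataset.
edges = [
    ("a", "b", 3),
    ("a", "c", 1),
    ("b", "c", 2),
    ("c", "d", 5),
]
filtered = filter_by_degree(edges, min_degree=2)
print(filtered)
```

This is a single pass based on the original degrees, so a node can end up with fewer ties than the threshold after its neighbors are removed. Repeating the filter until nothing changes would be a stricter variant.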

Network visualization results can be intriguing because you put in data and never know what it will reveal. One can only predict, and the true insights unfurl as you manipulate the inputs. The idea of bringing two datasets from entirely different fields together and finding the correlations, if any, is very interesting to me. I would like to explore correlations in networks further and see what else is revealed. I would also like to look deeper into the clusters that form, search for insights, and learn from what is discovered. While this may seem pointless, I believe it is only the beginning of a different way of analyzing data, and we are yet to fully uncover the various uses of this approach to information exploration and the benefits we

 

References

Granovetter, M., 1973. The strength of weak ties. American Journal of Sociology 78, 1360-1380.

Opsahl, T., Panzarasa, P., 2009. Clustering in weighted networks. Social Networks 31 (2), 155-163.

 

Snapshot of a financial network in a timeslice. Retrieved from https://www.maths.ox.ac.uk/groups/ociam/research/mathematical-finance

 

Vértes, Petra E., Nicol, Ruth M., Chapman, Sandra C., Watkins, Nicholas W., Robertson, Duncan A., Bullmore, Edward T., 2011. Topological isomorphisms of human brain and financial market networks. Frontiers in Systems Neuroscience, 15 September 2011. https://doi.org/10.3389/fnsys.2011.00075

 

A visualization of the growth of financial and brain graphs as cost is increased. Retrieved from https://www.frontiersin.org/articles/10.3389/fnsys.2011.00075/full

 

Cook, S., 2012. Correlation Networks. Retrieved from http://www.fna.fi/blog/2012/11/23/tutorial-7-correlation-networks/