{"id":5500,"date":"2016-11-05T19:42:32","date_gmt":"2016-11-05T23:42:32","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=5500"},"modified":"2016-11-05T19:42:32","modified_gmt":"2016-11-05T23:42:32","slug":"exploring-gephi","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/exploring-gephi\/","title":{"rendered":"Exploring Gephi"},"content":{"rendered":"<p>For the second lab, my initial question was how networks of people could be represented visually.<\/p>\n<p>I first thought of using my Facebook network as an example, or alternatively my LinkedIn network, to see how people were connected.<\/p>\n<p>Specifically, I wanted to see (visually) how people in my \u2018networks\u2019 are connected. I ultimately did not go in that direction and shifted focus to a different subject: word adjacencies in the novel David Copperfield, not business (or friend) connections.<\/p>\n<p>Visualization #1 (Figure 1, below) was what I was initially considering. I looked at my LinkedIn network primarily as a matter of curiosity. 
The final output somewhat resembled this first pass at visualization, if only in that items were linked and clustered.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture1-2.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5504\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture1-2-620x411.png?resize=620%2C411\" alt=\"Figure 1, My LinkedIn Network, as visualized using Socilab (www.socilab.com\/home)\" width=\"620\" height=\"411\" \/><\/a><\/p>\n<p style=\"text-align: center\">Figure 1, My LinkedIn Network, as visualized using Socilab (<a href=\"http:\/\/www.socilab.com\/home\">www.socilab.com\/home<\/a>)<\/p>\n<p>Socilab also provided network analytics (Figure 2), showing absolute size, effective network size, and other measures, to highlight areas where more connections can be made or a network can be strengthened.<\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture2.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5505\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture2-620x306.png?resize=620%2C306\" alt=\"Figure 2, Socilab LinkedIn network analytics (partial)\" width=\"620\" height=\"306\" \/><\/a>Figure 2, Socilab LinkedIn network analytics (partial)<\/p>\n<p>Visualization #2 was more along the lines of what I was producing once I started working with Gephi datasets. This graph came from networkrepository.com, a site with multiple data sets for exploration and analysis. 
It seemed to be more in line with the examples in the readings and in class, as it had more connections and stronger ties among the nodes.<\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture3.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5506\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture3-620x371.png?resize=620%2C371\" alt=\"Figure 3, Visualization #2, from www.networkrepository.com\/web-pollblogs.php\" width=\"620\" height=\"371\" \/><\/a>Figure 3, Visualization #2, from <a href=\"http:\/\/www.networkrepository.com\/web-pollblogs.php\">www.networkrepository.com\/web-pollblogs.php<\/a><\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture4.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5507\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture4-620x494.png?resize=620%2C494\" alt=\"Figure 4, Visualization #3, from https:\/\/flowingdata.com\/2010\/02\/26\/news-topics-as-social-network\" width=\"620\" height=\"494\" \/><\/a>Figure 4, Visualization #3, from <a href=\"https:\/\/flowingdata.com\/2010\/02\/26\/news-topics-as-social-network\">https:\/\/flowingdata.com\/2010\/02\/26\/news-topics-as-social-network<\/a><\/p>\n<p>Visualization #3 was the closest to the final product from the Gephi lab session. 
Again, though the focus of Figure 4 (Visualization #3) is different from the end result, it provided a representation of what I was looking to do.<\/p>\n<p>To create the final product, I used a dataset from <a href=\"https:\/\/github.com\/gephi\/gephi\/wiki\/Datasets\">GitHub<\/a> focusing on word adjacencies in the novel David Copperfield. I settled on that dataset after trying various datasets from other sources (such as Networkrepository.com) that did not work out. One file looked like properly formatted CSV, but when I tried to import it into Gephi, it lacked the appropriate setup (nodes and edges), so it was not usable for visualization. I had tried locating alternative sources in advance of the lab, but ran into issues such as incompatible file formats and dead links. In the interest of time and efficiency, the readily available and importable data worked best for creating a visualization.<\/p>\n<p>I then imported the GitHub dataset into Gephi. I had planned to look at other datasets but, as mentioned above, they were not usable for this exercise. The set I used seemed like it would present well and was not overly large (it had fewer nodes and edges) or overly complex once visualized in Gephi, unlike, for example, the airline data set. The end result did look different from what was in the program workspace, but after adjustments to layout, statistics, and other factors, it was a usable product.<\/p>\n<p>Working with the program had its challenges, though. The output would not show up in the program, making it necessary to save the current version, export it, then open the result. 
Had that not happened, I might have been able to adjust certain things, such as color, with a bit more ease.<\/p>\n<p>Using the GitHub dataset, Figure 5 (below) shows that the word \u201clittle\u201d appears at the center of the \u2018universe\u2019 in the visualization and is by far the largest node. Other words also appear frequently (\u201cother\u201d, \u201cold\u201d, \u201cgood\u201d) and are connected to multiple other words throughout the diagram. Though it was not the dataset I intended to use, the end result succeeded in visually presenting the network concept.<\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture5.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5508\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Picture5-620x639.png?resize=620%2C639\" alt=\"Figure 5, Output from Gephi lab session (reduced size)\" width=\"620\" height=\"639\" \/><\/a>Figure 5, Output from Gephi lab session (reduced size)<\/p>\n<p>While I had a workable dataset and a final product, a brief discussion of the ideas I explored but did not use follows.<\/p>\n<p>Facebook did not pan out because of changes made to its application programming interface (API). Previously, users could map their own networks and their friends would appear as well; after the <a href=\"http:\/\/www.kdnuggets.com\/2015\/06\/visualize-facebook-network.html\">change<\/a> in 2015, users could show only their own activity. For a user to show their friends on a network, they (the friends) would need to allow Facebook to access their connections as well. 
I did find instructions for building such a visualization, but the process seemed rather time-consuming when other datasets were readily available from other sources.<\/p>\n<p>I was able to plot my LinkedIn network using Socilab (Figure 1). In terms of answering the \u201cwhat would this look like\u201d question, it was useful, as I had a diagram to see and evaluate. The output had its limits, as it reflected only my connections, not the connections of the people to whom I am linked. From the output, I saw that many of my connections are isolated. Additionally, the clusters of people that appeared corresponded to various parts of my life (work, college, other organizations of which I am or was a part).<\/p>\n<p>As to future directions, relationships between certain words in the novel could be explored and graphically represented. The layout could be changed to emphasize or de-emphasize certain nodes, among other adjustments.<\/p>\n<p>Now that I have a somewhat better handle on Gephi, I could explore and visualize more complex data sets. 
Also, with a somewhat firmer understanding of how to format data, I could create other visualizations more in line with my interests, such as a network of characters from Absolutely Fabulous.<\/p>\n<p>Sources and Figures<\/p>\n<p>Facebook API change, <a href=\"http:\/\/www.kdnuggets.com\/2015\/06\/visualize-facebook-network.html\">http:\/\/www.kdnuggets.com\/2015\/06\/visualize-facebook-network.html<\/a><\/p>\n<p>Figure 1, My LinkedIn Network, as visualized using Socilab (<a href=\"http:\/\/www.socilab.com\/home\">www.socilab.com\/home<\/a>)<\/p>\n<p>Figure 2, Socilab LinkedIn network analytics (partial)<\/p>\n<p>Figure 3, Visualization #2, from <a href=\"http:\/\/www.networkrepository.com\/web-pollblogs.php\">www.networkrepository.com\/web-pollblogs.php<\/a><\/p>\n<p>Figure 4, Visualization #3, from <a href=\"https:\/\/flowingdata.com\/2010\/02\/26\/news-topics-as-social-network\">https:\/\/flowingdata.com\/2010\/02\/26\/news-topics-as-social-network<\/a><\/p>\n<p>Figure 5, Output from Gephi lab session<\/p>\n<p>Dataset source, https:\/\/github.com\/gephi\/gephi\/wiki\/Datasets<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For the second lab, my initial question pertained to how networks of people could be represented in a visual manner. Initially, I thought of using my Facebook network to use as an example, or as an alternative, my LinkedIn network, to see how people were connected. 
In terms of questions, I was looking to see&hellip;<\/p>\n","protected":false},"author":432,"featured_media":5504,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"image","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"coauthors":[],"class_list":["post-5500","post","type-post","status-publish","format-image","has-post-thumbnail","hentry","category-visualization","post_format-post-format-image"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1qI","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5500","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/432"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=5500"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5500\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=5500"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=5500"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=5500"},{"taxonomy":"author","embeddable":tr
ue,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=5500"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}