{"id":11299,"date":"2018-11-07T14:38:56","date_gmt":"2018-11-07T19:38:56","guid":{"rendered":"http:\/\/studentwork.prattsi.org\/infovis\/?p=11299"},"modified":"2019-01-10T23:53:43","modified_gmt":"2019-01-11T04:53:43","slug":"using-gephi-to-dive-into-a-novel-david-copperfield-by-charles-dickens","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/labs\/using-gephi-to-dive-into-a-novel-david-copperfield-by-charles-dickens\/","title":{"rendered":"Using Gephi to Dive into a Novel: David Copperfield by Charles Dickens"},"content":{"rendered":"<h3 style=\"text-align: center\"><\/h3>\n<hr \/>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Introduction<\/span><\/em><\/h2>\n<p>On this page, I use Gephi as a visualization tool to explore the common words (adjectives and nouns) in the novel &#8220;David Copperfield&#8221;.<\/p>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Inspiration<\/span><\/em><\/h2>\n<p>I was delighted that what I learned in Course 601: Foundation of Information connects to what I am doing in this project for Course 658: Data Visualization. Dr. Rabina recently clarified that &#8220;Content Analysis&#8221; does not mean researching the message a paper conveys; rather, it is research that treats units such as word counts or paragraphs as its objects.<\/p>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Material<\/span><\/em><\/h2>\n<p>After browsing <strong><span style=\"color: #7fe070\"><a style=\"color: #7fe070\" href=\"https:\/\/github.com\/gephi\/gephi\/wiki\/Datasets\">this page<\/a><\/span>\u00a0<\/strong>of datasets, I found one that interested me. The page lists datasets for <strong>Web and internet<\/strong>, <strong>Social network<\/strong>, <strong>Biological networks<\/strong>, <strong>Infrastructure networks<\/strong> and <strong>Other works<\/strong>. 
I chose to look more closely at Word adjacencies, which is an &#8220;adjacency network of common adjectives and nouns in the novel David Copperfield by Charles Dickens. Please cite M.E.J.Newman, Phys.Rev.E 74, 036104 (2006)&#8221;.<\/p>\n<p>The source file is named &#8220;word_adjacencies.gml&#8221;. Since Gephi can open files in <span style=\"color: #7fe070\"><a style=\"color: #7fe070\" href=\"https:\/\/gephi.org\/users\/supported-graph-formats\/\"><strong>many formats<\/strong><\/a><\/span>, including .gml, I only needed to open the program and import the file.<\/p>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Methods<\/span><\/em><\/h2>\n<p>The goal of Gephi is to &#8220;help data analysts to make hypothesis, intuitively discover patterns, isolate structure singularities or faults during data sourcing&#8221;. To form reasonable hypotheses, I reviewed <span style=\"color: #7fe070\"><a style=\"color: #7fe070\" href=\"https:\/\/en.wikipedia.org\/wiki\/David_Copperfield\">Wikipedia<\/a><\/span>\u00a0for a general idea of the novel. I then had the following questions or hypotheses:<\/p>\n<ol>\n<li>Charles Dickens regarded this novel as his &#8220;favorite child&#8221;. Can I see this in the network?<\/li>\n<li>This novel contains descriptions based on Dickens&#8217; own childhood experience. Can I see that in the network?<\/li>\n<\/ol>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Results<\/span><\/em><\/h2>\n<p>In the &#8220;Data Laboratory&#8221; tab, there are 112 nodes with IDs numbered from 0 to 111. As the dataset description says, these 112 words are all common adjectives and nouns, such as agreeable, man, old, person, anything, short&#8230;<\/p>\n<p>The Data Table in the Data Laboratory also shows a value for each word: adjectives are coded as 0 and nouns as 1. 
In addition, the Degree value is the sum of In-Degree and Out-Degree.<\/p>\n<p>The highest In-Degree is 40, for the word &#8220;little&#8221;. The highest Out-Degree is 20, shared by &#8220;same&#8221; and &#8220;good&#8221;. Overall, the five highest Degree values are little (49), old (33), other (28), good (28) and same (21).<\/p>\n<h2 style=\"text-align: center\"><em><span style=\"color: #7fe070\">Interpretation<\/span><\/em><\/h2>\n<pre style=\"text-align: center\"><strong>Filter: Degree &gt; 5\r\nA general map of common words can be seen.\r\nThe word \"little\" has the highest degree.<\/strong><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-11397 size-full aligncenter\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?resize=840%2C840\" alt=\"\" width=\"840\" height=\"840\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-11404\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/in-degree5-1.png?resize=512%2C512\" alt=\"\" width=\"512\" height=\"512\" 
srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/in-degree5-1.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/in-degree5-1.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/in-degree5-1.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/in-degree5-1.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-11408\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree5.png?resize=512%2C512\" alt=\"\" width=\"512\" height=\"512\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree5.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree5.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree5.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree5.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><\/p>\n<p>&nbsp;<\/p>\n<pre><strong>Filter:In-degree&gt;5\r\nCommon Words:\r\nlittle, other, old, good<\/strong><\/pre>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<pre style=\"text-align: right\"><strong>Filter:Out-degree&gt;5\r\nCommon Words:\r\nLittle, other, good, same<\/strong><\/pre>\n<p>Compared with the last graph, the word &#8220;old&#8221; is missing, and the word 
&#8220;same&#8221; appears instead, with a higher out-degree. This suggests that fewer words come before &#8220;old&#8221;, as in &#8220;little old&#8221;, than come after it, as in &#8220;old friend&#8221; or &#8220;old man&#8221;.<\/p>\n<p>&nbsp;<\/p>\n<h2><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-11410 alignleft\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/indegree8.png?resize=512%2C512\" alt=\"\" width=\"512\" height=\"512\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/indegree8.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/indegree8.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/indegree8.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/indegree8.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><\/h2>\n<p>&nbsp;<\/p>\n<pre><strong>In-degree &gt; 8<\/strong><\/pre>\n<p>Next, I focused on nodes with higher degree (in &gt; 8 and out &gt; 6). 
This makes it easier to study the direction of the connections.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-11411 alignright\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree6.png?resize=512%2C512\" alt=\"\" width=\"512\" height=\"512\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree6.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree6.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree6.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/outdegree6.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><\/h2>\n<p>&nbsp;<\/p>\n<pre><strong>Out-degree &gt; 6<\/strong><\/pre>\n<p>All of these layouts use &#8220;Fruchterman Reingold&#8221;, which arranges the network in a circular display. This is a simple and direct way to show a network with a low modularity class (&lt;1).<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2><\/h2>\n<pre><strong>Out-degree &gt; 6<\/strong><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-11424 alignleft\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/colored-degree.png?resize=512%2C512\" alt=\"\" width=\"512\" height=\"512\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/colored-degree.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/colored-degree.png?resize=150%2C150&amp;ssl=1 150w, 
https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/colored-degree.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/colored-degree.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 512px) 100vw, 512px\" \/><\/p>\n<p>I adjusted the color and size of the nodes so that the image conveys more information. I also switched the layout to &#8220;ForceAtlas 2&#8221;.<\/p>\n<p>In this layout, nodes (words) with the same degree share the same color and size.<\/p>\n<p>Below is a video\/GIF of the display changing under the &#8220;ForceAtlas 2&#8221; layout.<\/p>\n<div class=\"video-wrapper\"><div style=\"width: 510px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-11299-1\" width=\"510\" height=\"480\" loop autoplay preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"http:\/\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/Untitled-Project.mp4?_=1\" \/><a href=\"http:\/\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/Untitled-Project.mp4\">http:\/\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/Untitled-Project.mp4<\/a><\/video><\/div><\/div>\n<h2><span style=\"color: #7fe070\">Reflection<\/span><\/h2>\n<p>I wondered how the creators of the source data selected these adjectives and nouns. I guess they needed a collection or dictionary of all adjectives and nouns to compare against, and then generated lists of the adjectives and nouns used in this novel. They also needed to set a threshold for how many appearances count as &#8220;common&#8221;. It might also be necessary to keep a balance between nouns and adjectives. 
Since this is a network analysis, a dataset with too many or too few adjectives may not be suitable for network analysis and visualization.<\/p>\n<p>Gephi is open-source network analysis and visualization software. I ran it on Windows. When I tried to open the working file (a file with the .gephi extension) in Gephi on Mac OS, it opened but with a blank layout, so I switched back to Windows and continued my work.<\/p>\n<p>I am also interested in comparing the result with other works by Charles Dickens; maybe we can see the difference between the &#8220;most loved child&#8221; and &#8220;the other children&#8221;. I am also looking into grouping the nodes by degree. There is simply no end to playing with Gephi.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction On this page, I use Gephi as a visualization tool to explore the common words (adjectives and nouns) in the novel &#8220;David Copperfield&#8221;. Inspiration I was delighted that what I learned in Course 601: Foundation of Information connects to what I am doing in this project for Course 658: Data Visualization. 
Dr.&hellip;<\/p>\n","protected":false},"author":559,"featured_media":11397,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[149,342],"tags":[],"coauthors":[321],"class_list":["post-11299","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-labs","category-networks"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/11\/screenshot_121005.png?fit=1024%2C1024&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-2Wf","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/11299","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/559"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=11299"}],"version-history":[{"count":9,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/11299\/revisions"}],"predecessor-version":[{"id":11452,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/11299\/revisions\/11452"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media\/11397"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=11299"}],"wp:term":[{"taxonomy":"category","embeddable":tru
e,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=11299"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=11299"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=11299"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}