{"id":9172,"date":"2018-04-19T07:57:18","date_gmt":"2018-04-19T11:57:18","guid":{"rendered":"http:\/\/studentwork.prattsi.org\/infovis\/?p=9172"},"modified":"2018-05-02T01:45:45","modified_gmt":"2018-05-02T05:45:45","slug":"visualizing-the-content-of-science-fiction-with-gephi","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/labs\/visualizing-the-content-of-science-fiction-with-gephi\/","title":{"rendered":"Visualizing the Content of Science Fiction with Gephi"},"content":{"rendered":"<p>According to <a href=\"https:\/\/www.goodreads.com\/genres\/science-fiction\" target=\"_blank\" rel=\"noopener\"><em>Goodreads<\/em><\/a>, \u201c<em>Science fiction (abbreviated SF or sci-fi with varying punctuation and capitalization) is a broad genre of fiction that often involves speculations based on current or future science or technology.<\/em>\u201d It is a very popular genre of fiction which often includes some imaginative concepts and explores the potential consequences of scientific and other innovations.<\/p>\n<p>Within the genre, there are many recurring topics, such as spaceflight, time travel, and robots. It would be interesting to see which topics are the most popular and how they relate to or overlap with each other.<\/p>\n<h2><strong>Data Sources and Inspirations<\/strong><\/h2>\n<p>I found a dataset from <a href=\"http:\/\/www.casos.cs.cmu.edu\/tools\/datasets\/internal\/index.php\" target=\"_blank\" rel=\"noopener\"><em>CASOS<\/em><\/a> collected by Dr. Kathleen M. Carley, which includes the date, author, century, author gender, and story content of 157 sci-fi books. I chose to use the \u201ccontent of story\u201d data and <a href=\"https:\/\/gephi.org\/\" target=\"_blank\" rel=\"noopener\"><em>Gephi<\/em><\/a> to create a network visualization of the content of science fiction.<\/p>\n<p>Before creating the visualization, I collected three Gephi examples for inspiration. 
I like how they use different sizes and colors to visualize connections and grouping, and annotated labels for readability.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-9174\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?resize=840%2C223\" alt=\"\" width=\"840\" height=\"223\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?w=2653&amp;ssl=1 2653w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?resize=300%2C80&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?resize=768%2C204&amp;ssl=1 768w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?resize=1024%2C272&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?w=1680 1680w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Gephi-Inspirations.jpg?w=2520 2520w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<h2><strong>Tools and Process<\/strong><\/h2>\n<p>To create the network visualization for the content of science fiction, I used <em>Excel<\/em> and <em><a href=\"https:\/\/www.r-project.org\/\" target=\"_blank\" rel=\"noopener\">R<\/a><\/em> to clean the dataset, and then used <em><a href=\"https:\/\/gephi.org\/\" target=\"_blank\" rel=\"noopener\">Gephi<\/a><\/em> to create the visualization.<\/p>\n<p>First, I used <em>Excel<\/em> to remove the irrelevant data from the original dataset, leaving only the \u201ccontent of story\u201d data. 
There are 11 kinds of content identified in the dataset: <em>robots, androids or AI computers \/ battles \/ romance \/ magic \/ time travel \/ interplanetary \/ multi-species, sentient species \/ beasts \/ psychic powers \/ novel technology (not AIish, e.g., steam-based technology is considered novel) \/ after catastrophe &#8211; often post-apocalyptic<\/em>. Each book is rated from 0 to 4 on each kind of content.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-9175\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.53.39-PM.png?resize=840%2C392\" alt=\"\" width=\"840\" height=\"392\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.53.39-PM.png?w=1646&amp;ssl=1 1646w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.53.39-PM.png?resize=300%2C140&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.53.39-PM.png?resize=768%2C358&amp;ssl=1 768w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.53.39-PM.png?resize=1024%2C478&amp;ssl=1 1024w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<p>For this project, I set aside the book titles and the rating values: I replaced every nonzero rating (1-4) with the content label and saved the file as a CSV file in Excel.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-9176\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.55.35-PM.png?resize=840%2C429\" alt=\"\" width=\"840\" height=\"429\" 
srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.55.35-PM.png?w=1728&amp;ssl=1 1728w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.55.35-PM.png?resize=300%2C153&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.55.35-PM.png?resize=768%2C392&amp;ssl=1 768w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-6.55.35-PM.png?resize=1024%2C523&amp;ssl=1 1024w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<p>Then I used <em><a href=\"https:\/\/www.r-project.org\/\" target=\"_blank\" rel=\"noopener\">R<\/a><\/em> to permute the network data and write out the weighted edge list as a new CSV file. The dataset was then ready to import into <em><a href=\"https:\/\/gephi.org\/\" target=\"_blank\" rel=\"noopener\">Gephi<\/a><\/em> to create the network visualization.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-9177\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-7.04.28-PM.png?resize=524%2C290\" alt=\"\" width=\"524\" height=\"290\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-7.04.28-PM.png?w=524&amp;ssl=1 524w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-7.04.28-PM.png?resize=300%2C166&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-7.04.28-PM.png?resize=360%2C200&amp;ssl=1 360w, 
https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Screen-Shot-2018-04-18-at-7.04.28-PM.png?resize=450%2C250&amp;ssl=1 450w\" sizes=\"auto, (max-width: 524px) 100vw, 524px\" \/><\/p>\n<h2><strong>Results and Analysis<\/strong><\/h2>\n<p>In <em><a href=\"https:\/\/gephi.org\/\" target=\"_blank\" rel=\"noopener\">Gephi<\/a><\/em>, I ran <em>Average Degree<\/em> and <em>Avg. Weighted Degree<\/em> and used the expansion layout to create the visualization, in which the larger nodes show the most common content: <em>Battles, Novel Technology, Interplanetary<\/em>, and <em>Romance<\/em>, followed by <em>Multi-species, Psychic Powers, After Catastrophe, Beasts<\/em>, and <em>Robots, Androids or AI Computers<\/em>.<\/p>\n<p>To further identify how the different kinds of content relate to each other, I ran \u201c<em>Modularity<\/em>\u201d to see which kinds tend to appear together. The result shows three main groups:<\/p>\n<p>1. (<em>Orange Nodes<\/em>) Novel Technology, Interplanetary, Multi-species, and Robots, Androids or AI Computers;<\/p>\n<p>2. (<em>Purple Nodes<\/em>) Battles, Psychic Powers, Beasts, Magic, and Time Travel;<\/p>\n<p>3. 
(<em>Green Nodes<\/em>) Romance and After Catastrophe.<\/p>\n<p>This means that content within a group often appears alongside other content from the same group.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-9178\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?resize=840%2C840\" alt=\"\" width=\"840\" height=\"840\" srcset=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?w=1024&amp;ssl=1 1024w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?resize=150%2C150&amp;ssl=1 150w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?resize=300%2C300&amp;ssl=1 300w, https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?resize=768%2C768&amp;ssl=1 768w\" sizes=\"auto, (max-width: 840px) 100vw, 840px\" \/><\/p>\n<h2><strong>Future Direction<\/strong><\/h2>\n<p>Besides the content data, the dataset also includes \u201cdate\u201d data, which refers to the date the work was first published (for a series, the date listed is when the first book was published). I believe it would be meaningful to split the dataset by century and run the analysis again to see the popularity and relationships of content in different centuries.<\/p>\n<p>Network visualization is an interesting but relatively complicated kind of information visualization. It requires a solid understanding of both the data and the tools. I would like to explore more features of Gephi in the future so that I can create more in-depth visualizations of the topic, for example by trying more of the layout and metrics features. 
Creating real-time and interactive visualizations is another direction I want to explore, because it gives the audience the flexibility to find the information they want. It&#8217;s also important to learn more about data cleanup using tools like R and OpenRefine, because the structure of the data files directly affects the visualization result.<\/p>\n<p>Going forward, I will try different kinds of datasets to explore network visualization further, for example social media data and character relationships in fiction or TV shows. Furthermore, after learning different kinds of visualization, I hope to use different tools together to gain more insights from one dataset or project.<\/p>\n<p><strong>References:<\/strong><\/p>\n<ol>\n<li>Sci-Fi Books Dataset:\u00a0<a href=\"http:\/\/www.casos.cs.cmu.edu\/tools\/datasets\/internal\/index.php\" target=\"_blank\" rel=\"noopener\">http:\/\/www.casos.cs.cmu.edu\/tools\/datasets\/internal\/index.php<\/a><\/li>\n<li>Science Fiction Intro:\u00a0<a href=\"https:\/\/www.goodreads.com\/genres\/science-fiction\" target=\"_blank\" rel=\"noopener\">https:\/\/www.goodreads.com\/genres\/science-fiction<\/a><\/li>\n<li>First Design Example:\u00a0<a href=\"http:\/\/www.martingrandjean.ch\/network-visualization-shakespeare\/\" target=\"_blank\" rel=\"noopener\">http:\/\/www.martingrandjean.ch\/network-visualization-shakespeare\/<\/a><\/li>\n<li>Second Design Example:\u00a0<a href=\"https:\/\/noduslabs.com\/courses\/network-visualization-and-analysis-with-gephi\/units\/section-3-network-visualization-and-analysis-case-study\/page\/8\/?try\" target=\"_blank\" rel=\"noopener\">https:\/\/noduslabs.com\/courses\/network-visualization-and-analysis-with-gephi\/units\/section-3-network-visualization-and-analysis-case-study\/page\/8\/?try<\/a><\/li>\n<li>Third Design Example:\u00a0<a href=\"https:\/\/health-policy-systems.biomedcentral.com\/articles\/10.1186\/s12961-016-0104-5\" target=\"_blank\" 
rel=\"noopener\">https:\/\/health-policy-systems.biomedcentral.com\/articles\/10.1186\/s12961-016-0104-5<\/a><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>According to Goodreads, \u201cScience fiction (abbreviated SF or sci-fi with varying punctuation and capitalization) is a broad genre of fiction that often involves speculations based on current or future science or technology.\u201d It is a very popular genre of fiction which often includes some imaginative concepts and explores the potential consequences of scientific and other&hellip;<\/p>\n","protected":false},"author":257,"featured_media":9178,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[149],"tags":[],"coauthors":[],"class_list":["post-9172","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-labs"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infovis\/wp-content\/uploads\/sites\/3\/2018\/04\/Sci-Fi11.png?fit=1024%2C1024&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-2nW","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/9172","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/257"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=9172"}],"version-history":[{"count":10,"href
":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/9172\/revisions"}],"predecessor-version":[{"id":9191,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/9172\/revisions\/9191"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media\/9178"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=9172"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=9172"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=9172"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=9172"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}