{"id":6784,"date":"2017-06-19T13:38:51","date_gmt":"2017-06-19T17:38:51","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=6784"},"modified":"2017-06-19T13:38:51","modified_gmt":"2017-06-19T17:38:51","slug":"superheroes-social-cliques","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/superheroes-social-cliques\/","title":{"rendered":"Superheroes and Social Cliques"},"content":{"rendered":"<p><strong>Tyler Dennis<\/strong><\/p>\n<p><strong>Dr. Sula<\/strong><\/p>\n<p><strong>Information Visualization &#8211; Gephi Lab<\/strong><\/p>\n<p><strong>\u201cMarvel Chronology Project\u201d<\/strong><\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_6785\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/gehpi-1.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6785\" class=\"size-medium wp-image-6785\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/gehpi-1-620x523.png?resize=620%2C523\" alt=\"\" width=\"620\" height=\"523\" \/><\/a><p id=\"caption-attachment-6785\" class=\"wp-caption-text\">Final Gephi Visual<\/p><\/div>\n<p><strong>\u00a0<\/strong><\/p>\n<p><strong>Introduction:<\/strong><\/p>\n<p>The dataset used for this, the Gephi lab, was exhaustive and super ambitious in size and scope. It was curated by the <em>Marvel Chronology Project<\/em>, which aims to \u201ccatalog every\u00a0<strong>actual<\/strong>\u00a0appearance by every significant character in the Marvel universe, and place them in their proper chronological order.\u201d Characters that co-exist closely and most often, for example, are linked together. Those with the most occurrences together are closest. Those with the most links to other characters are more central and prominent within clusters. 
Characters that are frequent co-stars in a series are grouped together, with the most popular characters in each cluster forming its nucleus. Characters with the most connections are biggest, and pull lesser-networked bubbles into their cluster. Obscure characters, or versions of popular characters that only existed in alternate universes (the dataset is <em>that<\/em> ambitious), appear along the outskirts of the visual.<\/p>\n<p><strong>Inspiration:<\/strong><\/p>\n<p>Looking at examples of linked social data, and knowing the large amount of data I had to work with, made it clear that I&#8217;d probably end up with a graph of entities clustered by how connected they are. This is an effective way of representing large datasets like the Marvel one. I wanted my visualization to look something like this one, found on Wikimedia. Nothing too flashy, something monochromatic and simple. Having the bubbles be unlabelled, with a graph as expansive as mine, felt cruel. No one wants to hover over as many bubbles as are in my visualization.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Social_Network_Analysis_Visualization.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-6799\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Social_Network_Analysis_Visualization-620x462.png?resize=620%2C462\" alt=\"\" width=\"620\" height=\"462\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Materials and Methods:<\/strong><\/p>\n<p>The original data had 10,469 nodes (the number of characters) and 178,115 edges (the number of linkages between characters). The dataset aims to capture not just major characters but also not-so-major ones, I found. The amount of data in this set made it hard for me to actually work with it. 
The data didn\u2019t do so well on my Windows computer and it didn\u2019t work too well on the desktop computers at Pratt. I knew then that I had to edit the data down significantly.<\/p>\n<p>This led me to my roommate\u2019s much newer, very lovely Apple computer. At first I used it to actually open the dataset. After that, I whittled the nodes and edges down in the Data Laboratory tab. This was easy, since the data felt intuitively ranked, from characters with the most connections at the top to those with the fewest at the bottom. By the end of this, the node count had gone from 10,469 to 553. The edges were now at 11,472. This felt much more manageable and less stress-inducing. I found it hard to care much about the deleted data&#8211;much of which was one-time characters from the seventies with names like \u201cRubber Man\u201d or \u201cNeon Ant.\u201d<\/p>\n<p>The strategy for whittling down this incredibly expansive dataset was to delete characters with few linkages to others. I equated a character\u2019s number of linkages with relevance and deleted those that ranked low. These were obscure characters, and I didn\u2019t feel bad deleting them. There is such a wealth of data in the Marvel Chronology Project; I knew any amount I left, no matter how much had been deleted, would tell a good story that I could work with.<\/p>\n<p>By the end of my deletions, important characters and sensible connections between them were overwhelmingly intact. Characters who coexist in supergroups are close to one another, for example, in the visual. Members of groups like the Avengers and the X-Men are clustered together, naturally. On the other hand, characters who are larger than a single comic title (think heavy-hitters like Iron Man or Wolverine) are shown as being crossover characters who frequently interact with people outside of the comic they primarily appear in. 
The visual makes a lot of assumptions about the way these characters relate to one another, and it gets a lot of things right despite me butchering the dataset.<\/p>\n<p>After I deleted that data, there were still issues, this time in the form of occlusion: all my bubbles were clustered around one super-popular character until I tweaked the scaling and the gravity. I still hated how the graph looked after this. Aiding in the occlusion were intrusively popular characters who seemed to have all of the characters drawn to them. It created a \u201ccenter-of-the-universe\u201d graph in which the most-connected character was the over-crowded nucleus in a very confusing, bubble-manic graph.<\/p>\n<p>Deleting nodes with the most interconnectedness made the graph more dynamic. When there wasn\u2019t such a clear front-runner with regards to interconnectedness, the graph became more interesting. Sacrificing a larger, obvious piece of data felt justified so that more-interesting connections could be shown more clearly. When you have a few characters that are overwhelmingly well-connected in these visuals, they end up being represented by huge bubbles which make all the other characters appear super small.<\/p>\n<p>When the number of superheroes felt right (meaning it was no longer crashing my old computer), I plugged the data sheet into Gephi and got something I wasn\u2019t initially pleased with (early incarnations of my unsightly first drafts conclude this lab report). The data was still very occluded. The character bubbles all gravitated to the most-linked character. To change this, I played around with statistical settings (functions like modularity and density), but they only drew the character bubbles closer together, making it harder to make sense of an already large and potentially very confusing graph. 
I found that the simplest settings, with minimal tweaks in layout and attributes, created the simplest, most cohesive graph. It felt like Fruchterman-Reingold did most of the work, honestly. At the conclusion of this lab, I felt like Gephi was the master of me and not the other way around.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>There should be a key with this graph. As it stands, you would have to have a human guide giving some exposition on what it all means, at least briefly. This is especially true if the viewer doesn\u2019t know comics. Or maybe the bubbles speak for themselves. Who knows? Fruchterman-Reingold seems very no-frills. The settings under that layout presented my data in the way that made the most sense.<\/p>\n<p>I always start by making my graphs look crazy with bizarre color schemes. Even though I did that this time as well, I ultimately found that simple is best when you are representing something this large. It\u2019s best to keep color innovation at a minimum when you are working with more than 11,000 edges among more than 500 nodes. But maybe I could\u2019ve used a bolder color than the purples.<\/p>\n<p>I had trouble with Gephi, as mentioned, because the dataset was so large. At first, I didn\u2019t really enjoy working with datasets or Gephi because I didn\u2019t feel like I could customize my graph beyond really superficial things like color or size or tinkering with line widths. And since the dataset <em>was<\/em> so large, I kind of felt like it had more control over the final result of my lab than I did, in a way. By the end of working on this lab report, I had discovered that you can download other applications that make Gephi more customizable. That aside, I also realize now that Gephi is essential for large datasets like this. It would be very hard to map huge mines of social data without graphs like these. 
<\/p>\n<p>On a final, superficial note: I saw many relational network databases online at the time of writing this report. It struck me that the ones with bold colors representing the \u201cedge\u201d lines looked best against black backgrounds. Seeing these images makes mine feel kind of boring by comparison. 
In a future revision, I would make mine look sleeker and more eye-catching by copying that.<\/p>\n<p><strong>Evolution of the Visual<\/strong><\/p>\n<p><em>A: The Data<\/em><\/p>\n<div id=\"attachment_6786\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/gephi-2.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6786\" class=\"size-medium wp-image-6786\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/gephi-2-620x285.png?resize=620%2C285\" alt=\"\" width=\"620\" height=\"285\" \/><\/a><p id=\"caption-attachment-6786\" class=\"wp-caption-text\">Characters matched with ID number.<\/p><\/div>\n<p><strong>\u00a0<\/strong><\/p>\n<div id=\"attachment_6787\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Gephi-data.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6787\" class=\"size-medium wp-image-6787\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Gephi-data-620x285.png?resize=620%2C285\" alt=\"\" width=\"620\" height=\"285\" \/><\/a><p id=\"caption-attachment-6787\" class=\"wp-caption-text\">Source, Target, and Type!<\/p><\/div>\n<p><strong>\u00a0<\/strong><\/p>\n<p><em>B: Early Draft<\/em><\/p>\n<p><em>\u00a0<a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Gephi-3.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-6788\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Gephi-3-620x377.png?resize=620%2C377\" alt=\"\" width=\"620\" height=\"377\" 
\/><\/a><\/em><\/p>\n<p><strong>Source for dataset:<\/strong>\u00a0<a href=\"http:\/\/www.chronologyproject.com\/\">http:\/\/www.chronologyproject.com\/<\/a><\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Tyler Dennis Dr. Sula Information Visualization &#8211; Gephi Lab \u201cMarvel Chronology Project\u201d &nbsp; \u00a0 Introduction: The dataset used for this, the Gephi lab, was exhaustive and super ambitious in size and scope. It was curated by the Marvel Chronology Project, which aims to \u201ccatalog every\u00a0actual\u00a0appearance by every significant character in the Marvel universe, and place&hellip;<\/p>\n","protected":false},"author":450,"featured_media":6785,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[102,39,5,48,103],"coauthors":[],"class_list":["post-6784","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization","tag-comic-books","tag-gephi","tag-information-visualization","tag-networks","tag-superheroes"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1Lq","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6784","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/450"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\
/v2\/comments?post=6784"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6784\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=6784"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=6784"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=6784"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=6784"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}