{"id":6795,"date":"2017-06-19T19:17:07","date_gmt":"2017-06-19T23:17:07","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=6795"},"modified":"2017-06-19T19:17:07","modified_gmt":"2017-06-19T23:17:07","slug":"screamintothevoid-far-twitter-network-reach","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/screamintothevoid-far-twitter-network-reach\/","title":{"rendered":"#ScreamIntoTheVoid: How Far Does Your Twitter Network Reach?"},"content":{"rendered":"<h4>Introduction<\/h4>\n<p>Digital technology (DT) has revolutionized communications. Coupled with the boundless potential of the internet, DT promises to continue redefining the standards for engagement in the public sphere. In particular, social networking sites like Twitter, Reddit, Facebook, Instagram, and Tumblr offer users an unprecedented level of access to the world around them. Twitter, being the site I use most frequently, is the data source for this lab.<\/p>\n<p>The ability to transcend distance (Kadushin, 2012, p. 18) and easily locate users with similar interests (Kadushin, 2012, pp. 19-20) is key in the formation of online communities on Twitter. The nature and functionality of these groupings, which are of increasing interest to social scientists and communications scholars, are ripe for analysis should the necessary data be collected (Kadushin, 2012, p. 4). Fortunately, software exists that can mine, manipulate, and analyze the data, while network visualizations provide useful graphics to facilitate the formation of key conclusions (Krempel, 2011, p. 560).<\/p>\n<h4>Discussion<\/h4>\n<p>True to my individualist Western socialization, I was interested to see a visualization of my own Twitter network. The wealth of data available from social networking sites is vast, so to start, I needed a finer way to comb through the site. 
In his blog post \u201cJust Landed: Processing, Twitter, MetaCarta &amp; Hidden Data,\u201d artist and educator Jer Thorp shares a thought which greatly informed my data collection method. In \u201cthinking about the data that is hidden in various social network information streams,\u201d Thorp \u201cwondered if it would be possible to extract\u2026 information from people\u2019s public Twitter streams by searching for [a specific] term\u201d (Thorp, 2009). Thorp rightly postulates that (perhaps) unbeknownst to themselves, users of social networking sites share pertinent information in the text of their posts that remains \u201chidden\u201d during more precise searches. For example, if one wanted to plot the distance certain Twitter users travel when they fly, querying the phrase \u201cJust landed in\u201d and collecting those tweets\u2019 geospatial data could produce more records more easily than if one were to rely on a more specific approach (like searching for tweets whose location is within a certain distance from the known location of an airport) (Thorp, 2009).<\/p>\n<div id=\"attachment_6997\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/blog.blprnt.com\/blog\/blprnt\/just-landed-processing-twitter-metacarta-hidden-data\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6997\" class=\"size-medium wp-image-6997\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/3521509776_7c62509b1b_o-620x375.png?resize=620%2C375\" alt=\"Visualization of Thorp's Experiment\" width=\"620\" height=\"375\" \/><\/a><p id=\"caption-attachment-6997\" class=\"wp-caption-text\">(Thorp, 2009)<\/p><\/div>\n<p>Following Thorp\u2019s example, I thought about the phrases my followers, the users I follow, and I use most frequently in our tweets. 
I decided to use the slang term \u201csis,\u201d whose meaning in AAVE is expanded to encompass and surpass the commonly known usage as an abbreviation of \u201csister.\u201d<\/p>\n<p>When visualizing Twitter conversations, the common layout is a series of concentric circles. The outer circles are often more populated, but the nodes have fewer edges between them. Moving towards the center, the nodes increase their average number of edges but the area between them is usually greater. This cluster pattern reveals the largest or most extensive interactions happening around a certain subject or query over a given period of time; it also makes plain the users who are most popular, most active, or most commonly referenced within a given scope. The following visualizations, though representing a diverse array of subjects, all follow the same general spatial arrangement of nodes and edges.<\/p>\n<div id=\"attachment_6998\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/mattkushin.com\/2015\/10\/04\/hokies-tweets-network-visualization-how-i-extracted-tweets-via-tags-6-and-visualized-them-in-gephi\/\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6998\" class=\"size-medium wp-image-6998\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/hashtaghokies-tweets-620x620.png?resize=620%2C620\" alt=\"Kushin visualization\" width=\"620\" height=\"620\" \/><\/a><p id=\"caption-attachment-6998\" class=\"wp-caption-text\">A visualization of tweets containing the hashtag #Hokies (Kushin, 2015)<\/p><\/div>\n<p>&nbsp;<\/p>\n<div id=\"attachment_6999\" style=\"width: 605px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/carnegieendowment.org\/2015\/12\/20\/sectarian-twitter-wars-sunni-shia-conflict-and-cooperation-in-digital-age-pub-62299\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6999\" 
class=\"size-full wp-image-6999\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/siegel-vis.jpg?resize=605%2C425\" alt=\"siegel vis\" width=\"605\" height=\"425\" \/><\/a><p id=\"caption-attachment-6999\" class=\"wp-caption-text\">A visualization of retweets reflecting anti-Sunni sentiments (Siegel, 2015)<\/p><\/div>\n<p>&nbsp;<\/p>\n<h4>Materials<\/h4>\n<p>For this lab I relied on a computer with an internet connection and the following programs: Microsoft Excel 2013, NodeXL, Twitter, OpenRefine, and Gephi 0.9.1. To recreate this lab you must have your own Twitter account.<\/p>\n<h4>Methods<\/h4>\n<p>The Social Media Research Foundation created a tool called NodeXL to collect, analyze, and visualize data from Twitter and Facebook. I used NodeXL as a data collection tool only.<\/p>\n<p>NodeXL operates as a plugin to Microsoft Excel. Once Excel was running, I selected the NodeXL tab in the control bar to view the program options. The main feature available in the free version of NodeXL is importing data from a personal, public Twitter account. I followed the program prompts and logged in to my Twitter account. Next, I entered my query parameters. I wanted all tweets in my network that contained \u201csis\u201d and entered that string in the given field, but NodeXL allows searches for specific users and hashtags as well.<\/p>\n<p>The results are automatically opened in multiple sheets of a blank workbook, along with NodeXL\u2019s own visualization and metric calculations displayed in a sidebar. I intended to visualize my data with Gephi, so I saved the sheet of edges as a .csv file and imported it to OpenRefine. The data cleaning process involved deleting columns containing extraneous data; NodeXL collects the tweet URL, geospatial data, the full text of the tweet, a separate column with any hyperlinks found in the text, and several other fields. 
To be compatible with Gephi, the table needed at least three fields in this order: Source, the sender of the tweet; Target, the user(s) mentioned in the tweet; and Type, a field limited to Directed or Undirected that tells Gephi how to treat each edge when calculating network metrics. I chose to keep the NodeXL field titled \u201cRelationship.\u201d The values in this field are \u201cMentions,\u201d \u201cReplies To,\u201d and \u201cTweet.\u201d These terms tell what kind of interaction the users in that tweet had. I exported the clean data as a .csv and imported the file to Gephi.<\/p>\n<p>I followed the Gephi import wizard protocol for an edges file, then rearranged columns where necessary. In the Overview tab, I calculated Average Degree, Avg. Weighted Degree, Network Diameter, Graph Density, Modularity, and Avg. Clustering Coefficient. I ran a ForceAtlas 2 layout and then a Fruchterman Reingold layout. The nodes were in their final relative locations but were too spread out, so I increased gravity to draw them closer together. I color-coded the edges by the \u201cRelationship\u201d type and tied node size to degree.<\/p>\n<p>In Preview I added node labels, replacing the circles with the text of the corresponding user\u2019s Twitter handle. I changed the background to black so that the color-coded edges were easier to see and adjusted the relative sizing of the font to improve readability.<\/p>\n<p>I attempted to use the SigmaJS plugin for Gephi to export an interactive version of the visualization, but the vis did not appear in the folder containing the relevant files. This could be due to my version of Gephi being incompatible with the latest iteration of the plugin. 
Instead, I downloaded the vis as an image file.<\/p>\n<p>An important note: Gephi crashes often, particularly with larger datasets, so I saved a copy of the project after every step.<\/p>\n<h4>Results<\/h4>\n<div id=\"attachment_7002\" style=\"width: 620px\" class=\"wp-caption alignnone\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7002\" class=\"size-medium wp-image-7002\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Sis-620x877.png?resize=620%2C877\" alt=\"my final visualization\" width=\"620\" height=\"877\" \/><p id=\"caption-attachment-7002\" class=\"wp-caption-text\">A visualization of all the tweets in my network containing the string &#8220;sis&#8221; between June 5, 2017 and June 12, 2017<\/p><\/div>\n<p>\u201cEncoding numerical information into visual layers\u201d (Krempel, 2011, p. 562) helps viewers process information. In my visualization, the size of the node, represented by the Twitter user\u2019s handle, corresponds to the number of tweets containing \u201csis\u201d in which that handle appears: more mentions mean a larger font size. This encoded representation of the quantitative qualities of the data is easily decoded by viewers because there exists a directly proportional relationship between the \u201cmagnitude of a physical stimulus\u201d and \u201cits perceived intensity or strength\u201d in human perception (Krempel, 2011, p. 563). Accordingly, the nodes with greater centrality are also written in larger font, in addition to being located in the middle of the vis.<\/p>\n<p>Lines represent the edges of the network, or the tweets themselves. The number of lines connecting to a particular node conveys that node\u2019s degree, while nodes that link otherwise separate clusters have a high measure of betweenness and act as bridges (Opsahl, 2011). The color of the line tells what kind of Twitter action occurred between the users. 
\u201cMentions\u201d are purple. These tweets involve the sender interacting with users who have replied to the tweet of an original poster (OP). A mention can also include the OP and any other users who have not replied to the OP and are instead included at the sender\u2019s discretion. \u201cReplies To\u201d are orange. A reply results when a sender selects the option to tweet directly to an OP. A \u201cTweet\u201d is green. This category contains tweets which are not a part of any \u201cthread,\u201d or series of related tweets. They represent a single, unidirectional interaction.<\/p>\n<p>A complication for my visualization is the many smaller conversations, mostly dyads, which create confusion for viewers who have no context for the graphic (Kadushin, 2012, p. 27). These brief, isolated interactions within my Twitter network form a thick ring around the clusters in the center. A related concern is the ethics of pulling this user data without consent. NodeXL only searched through public accounts connected to mine, but if these accounts are tweeting the queried term with users outside that scope, those handles appear in the dataset as well. A more refined data collection method would address both concerns.<\/p>\n<h4>Future Directions<\/h4>\n<p>Narrowing my scope with a more specific search term would decrease the size of my dataset and consequently produce a more manageable visualization. This end could also be achieved by searching for my own handle in my network\u2019s tweets or querying viral hashtags that are common on my timeline. 
Additionally, time series data could be used to track the most active members in my Twitter community.<\/p>\n<p>Kadushin asserts that \u201cthe social statuses, positions, and social institutions\u201d linking nodes outside of the visualization\u2019s intended scope \u201ccan themselves be regarded as connected networks\u201d which \u201care constantly emerging and as a result affect and change\u201d the original network (Kadushin, 2012, p. 11). This statement supports color-coding the edges according to a larger network structure to provide wider context for the visualization. Color could also be used to communicate multiplex relationships between nodes (Kadushin, 2012, p. 26). Understanding how users are related to each other outside of the Twitterverse would also inform how these users interact on the platform.<\/p>\n<h4>References<\/h4>\n<p>Kadushin, C. (2012). <em>Understanding Social Networks: Theories, Concepts, and Findings<\/em>. Oxford University Press, USA.<\/p>\n<p>Krempel, L. (2011). Network Visualization. In <em>The SAGE Handbook of Social Network Analysis<\/em> (1st ed., pp. 558\u2013577). London; Thousand Oaks, Calif: SAGE Publications Ltd.<\/p>\n<p>Kushin, M. (2015, October 4). #Hokies Tweets Network Visualization: How I Extracted Tweets Via Tags 6 and Visualized Them in Gephi. Retrieved July 2, 2017, from http:\/\/mattkushin.com\/tag\/twitter\/<\/p>\n<p>Opsahl, T. (2011, June 9). Node Centrality in Weighted Networks. Retrieved July 2, 2017, from https:\/\/toreopsahl.com\/tnet\/weighted-networks\/node-centrality\/<\/p>\n<p>Siegel, A. (2015, December 20). Sectarian Twitter Wars: Sunni-Shia Conflict and Cooperation in the Digital Age. Retrieved July 2, 2017, from http:\/\/carnegieendowment.org\/2015\/12\/20\/sectarian-twitter-wars-sunni-shia-conflict-and-cooperation-in-digital-age-pub-62299<\/p>\n<p>Thorp, J. (2009). Just Landed: Processing, Twitter, MetaCarta &amp; Hidden Data. 
Retrieved July 2, 2017, from http:\/\/blog.blprnt.com\/blog\/blprnt\/just-landed-processing-twitter-metacarta-hidden-data<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Digital technology (DT) has revolutionized communications. Coupled with the boundless potential of the internet, DT promises to continue redefining the standards for engagement in the public sphere. In particular, social networking sites like Twitter, Reddit, Facebook, Instagram, and Tumblr offer users an unprecedented level of access to the world around them. Twitter, being the&hellip;<\/p>\n","protected":false},"author":225,"featured_media":6997,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[39,105,106,107,108,109,110],"coauthors":[],"class_list":["post-6795","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization","tag-gephi","tag-network","tag-network-visualization","tag-nodexl","tag-social-media","tag-social-media-research","tag-twitter"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1LB","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6795","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/225"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/
v2\/comments?post=6795"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6795\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=6795"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=6795"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=6795"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=6795"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}