{"id":5696,"date":"2016-11-08T17:55:32","date_gmt":"2016-11-08T22:55:32","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=5696"},"modified":"2016-11-08T17:55:32","modified_gmt":"2016-11-08T22:55:32","slug":"amazon-product-co-purchasing-network-information-visualization","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/amazon-product-co-purchasing-network-information-visualization\/","title":{"rendered":"Amazon Product Co-Purchasing Network: An Information Visualization"},"content":{"rendered":"<p>&nbsp;<\/p>\n<p><strong>INTRODUCTION<\/strong><\/p>\n<p>For this report, Gephi and Excel (Numbers) were used to prepare the data, apply network analysis, and create a visualization based on the \u201cAmazon Product Co-Purchasing Network\u201d.\u00a0 This course provides the guidelines students need to use the open resources available for information visualization.<\/p>\n<p>Getting started with information visualization can be overwhelming, and it is easy to end up working in isolation while exploring ways to present a network, which is composed of nodes (the items) and edges (the links between them).\u00a0 Data can be presented in many forms, from the typical chart to graphs and maps, all intended to help make sense of it.\u00a0 Once a network is drawn, its functional characteristics become visible, such as groups of nodes that represent clusters.<\/p>\n<p>Asking questions about a graph clarifies why it is worth making in the first place.\u00a0 What is the graph trying to convey?\u00a0 How does the Amazon Product Co-Purchasing Network work? 
After reviewing several datasets, I realized that understanding the mechanics of a graph had to be a priority.\u00a0 The graph depicts a network of lines (edges), which are an essential component of any graph.\u00a0 The chosen dataset is treated as directed in this graph, which means that the edges have a direction.<\/p>\n<p>According to the dataset description, the Amazon graph is built from items purchased directly from the online store rather than from department stores.\u00a0 Understanding online customer behavior shapes how the data is interpreted.\u00a0 \u201cIt is based on Customers Who Bought This Item Also Bought feature of the Amazon website.\u00a0 If a product i is frequently co-purchased with product j, the graph contains an undirected edge from i to j.\u00a0 Each product category provided by Amazon defines each ground-truth community.\u00a0 We have removed the ground-truth communities which have less than 3 nodes.\u00a0 As for the network, we provide the largest connected component.\u201d Source: J. Yang and J. Leskovec.\u00a0<a href=\"http:\/\/arxiv.org\/abs\/1205.6233\">Defining and Evaluating Network Communities based on Ground-truth<\/a>. ICDM, 2012.\u00a0 The graph reveals what customers focus on, highlighting product strengths and offering other insights.<\/p>\n<p><strong>VISUALIZATION DESIGNS THAT INSPIRE<\/strong><\/p>\n<p>The following graph (Figure 1) is a wonderfully crafted design made with TVNViewer, an interactive visualization tool for exploring how networks change over time and space.\u00a0 The graph shows the edges in color and uses labels to reveal what they represent.\u00a0 Hovering over the edges will highlight the nodes and reveal the source. 
\u00a0In terms of format, in-edges are red, out-edges are green, and bi-directional edges are cyan.\u00a0 The graph can also be manipulated; for example, the user can change the color of a node.\u00a0 Another fascinating function lets the user view multi-faceted nodes by using the \u201cSelection Depth to show slider\u201d.<\/p>\n<p><strong>\u00a0Figure 1<\/strong><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Graph-Documentation.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-5700 aligncenter\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Graph-Documentation-620x620.png?resize=260%2C260\" alt=\"graph-documentation\" width=\"260\" height=\"260\" \/><\/a><\/p>\n<p>Among the many graphs available, this animated graph of floating nodes\u00a0(Figure 2) caught my attention.\u00a0 Its author titled the graph \u201cArt meets Computer Science\u201d.\u00a0 The graph consists of 70 nodes, 20 extra edges, and a balanced network style.\u00a0 The color scheme, a blue background with light blue nodes, clearly defines the relationship between the nodes and edges.\u00a0 Although this image is static, in the animated version the nodes drift around the rectangular canvas at random speeds.\u00a0 The edges are what make the graph a network, and the result resembles a mesh, as covered in the class lecture.<\/p>\n<p><strong>Figure 2<\/strong><\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Animated-Floating-Graph-Nodes.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-5701\" 
src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Animated-Floating-Graph-Nodes.jpg?resize=406%2C253\" alt=\"animated-floating-graph-nodes\" width=\"406\" height=\"253\" \/><\/a><\/p>\n<p>A graph's structure can be either directed or undirected.\u00a0 I felt compelled to include a figure (Figure 3) that shows the difference between the two.\u00a0 Can you guess which of the three graphs are directed and which are undirected?<\/p>\n<p><strong>Figure 3<\/strong><\/p>\n<p style=\"text-align: center\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Graph-Directed-and-Undirected.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-5702\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Graph-Directed-and-Undirected.jpg?resize=502%2C378\" alt=\"graph-directed-and-undirected\" width=\"502\" height=\"378\" \/><\/a><\/p>\n<p><strong>Dataset, Software, and Materials<\/strong><\/p>\n<p>Creating the visualization in lab involved carefully selecting a dataset that I felt comfortable working with.\u00a0 The SNAP network datasets offer multiple categories to choose from.\u00a0 I chose the Amazon Product Co-Purchasing Network dataset (<a href=\"http:\/\/snap.stanford.edu\/data\/bigdata\/communities\/com-amazon.ungraph.txt.gz\">com-amazon.ungraph.txt.gz<\/a>).\u00a0 The related files are broken down into categories such as Amazon Communities and Amazon Communities (top 5,000).<\/p>\n<p>For this project, we are using Gephi 0.9.1, a free, open-source software platform for layout, metrics, network analysis, and real-time visualization.\u00a0 The first step is to download the dataset and open it in Excel (Numbers).\u00a0 Make sure the dataset is clean before importing it into Gephi.\u00a0 After the import, Gephi will alert you to any additional errors 
or issues found.\u00a0 The columns are organized under three headings, ID, Source, and Target, which replace the existing column names so that the graph can be created and organized.\u00a0 Eliminate any rows that do not contain data.\u00a0 Now you are ready to bring the data into Gephi; make sure the file is saved in CSV format with a .csv extension.\u00a0 If the data is intact, validate the import to view the graph.\u00a0 The first adjustment I made was to regulate the \u201cEdge Thickness\u201d using its slider.\u00a0 The shape of the graph is determined by the layout algorithm.\u00a0 Locate the Layout module on the left-hand side of the panel, choose \u201cForce Atlas\u201d, and run the algorithm.<\/p>\n<p><strong>Results\/Discussion<\/strong><\/p>\n<p>The overall structure of my design is a full circle made up of nodes and edges, emphasizing a network that is \u201cdensely linked\u201d within communities, with the edges concentrated among the members of each community.\u00a0 The degree report, which includes In-Degree, Out-Degree, and Degree, gives a combined average degree of 1.968.\u00a0 There are a total of 66,562 nodes and 65,499 edges.\u00a0 The modularity is 0.996, with 6,144 communities.\u00a0 The graph was marked as directed in Gephi.\u00a0 However, the dataset information at this link describes it as an undirected graph: <a href=\"http:\/\/snap.stanford.edu\/data\/com-Amazon.html\">http:\/\/snap.stanford.edu\/data\/com-Amazon.html<\/a>.\u00a0 An undirected graph has no direction, as shown in Figure 3.\u00a0 My assessment is that this graph is directed; in addition, the combined \u201cDegree\u201d shown in Gephi also indicates that it is directed.<\/p>\n<p><strong>Figure 4<\/strong><\/p>\n<p><a 
href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Amazon.graph1_.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"wp-image-5703 aligncenter\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Amazon.graph1_-620x878.jpg?resize=240%2C340\" alt=\"amazon-graph1\" width=\"240\" height=\"340\" \/><\/a><\/p>\n<p><strong>Future Direction<\/strong><\/p>\n<p>Using a dataset that includes labels is one way to reveal the network in its entirety.\u00a0 However, this dataset did not have any labels.\u00a0 Exploring labels further to bring out the relationships between the nodes is a possibility.\u00a0 For example, a value of one would show direct relationships, whereas a value of two would show the nodes connected to the selected node as well as their neighbors.\u00a0 Creating a hypergraph is another foreseeable endeavor for this dataset; I did not succeed in creating one, as the process was quite daunting.\u00a0 Expanding your reach to unexplored datasets and graphs will unleash your creativity.<\/p>\n<p>Using other platforms for large-scale graphs is an option.\u00a0 Working with large-scale graphs, such as the ones provided, to solve graph problems is the wave of the future.\u00a0 While researching the topic, I came across a very interesting article that asks how well graph-processing platforms perform; see: <a href=\"http:\/\/www.ds.ewi.tudelft.nl\/~iosup\/perf-eval-graph-proc14ipdps.pdf\">http:\/\/www.ds.ewi.tudelft.nl\/~iosup\/perf-eval-graph-proc14ipdps.pdf<\/a>.\u00a0 Exploring other platforms offers greater variety and helps avoid obstacles or limitations when designing a graph to your specifications.\u00a0 Visualizations go a long way toward presenting data in charts, graphs, and maps in an understandable and 
actionable way.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; INTRODUCTION For the purposes of this graphic report, Gephi and Excel (Numbers) were used to manipulate data applying network analysis and creating a visualization based on the \u201cAmazon Product Co-Purchasing Network\u201d.\u00a0 This course provides the necessary guidelines for students to use the available open resources in the age of information visualization. When it comes&hellip;<\/p>\n","protected":false},"author":436,"featured_media":5703,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[95,93,94],"coauthors":[],"class_list":["post-5696","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization","tag-graphs","tag-information","tag-visualization"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1tS","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5696","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/436"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=5696"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5696\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.pr
attsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=5696"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=5696"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=5696"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=5696"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}