{"id":5731,"date":"2016-11-22T23:27:36","date_gmt":"2016-11-23T04:27:36","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=5731"},"modified":"2016-11-22T23:27:36","modified_gmt":"2016-11-23T04:27:36","slug":"nyc-food-retail-stores","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/nyc-food-retail-stores\/","title":{"rendered":"NYC Food Retail"},"content":{"rendered":"<p>For my CartoDB project, I am interested in the distribution and characteristics of food retail stores in New York City. I find a dataset on <a href=\"https:\/\/data.ny.gov\/Economic-Development\/Retail-Food-Stores\/9a8c-vfzj\/data\">data.ny.gov<\/a> that has the location of all licensed food retail stores in New York State, and decide to use it for my analysis. For inspiration I look at three examples of data visualizations, all maps that deal with food retail in the United States.<\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/pulse\/20140626105859-276474-data-visualization-where-are-the-bars-vs-grocery-stores-in-the-us\">The first example <\/a>is a project that analyzes the ratio of bars to grocery stores across the country. This is achieved with a cartogram consisting of even-sized circles colored on a 7-step, dichromatic scale; the two-tone scale works well, because it shows a clear distinction between bar-dominated (more brown) and grocery store-dominated (more blue-green) areas, and uses a neutral white where the balance is even. 
The choice of white could be reconsidered, however, as it seems to suggest a lower density of stores or a lack of data, neither of which is the case; the effect is stronger because the visualization is displayed on a white background.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-5733\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/IV-Carto-Ex-01.jpg?resize=723%2C476\" alt=\"iv-carto-ex-01\" width=\"723\" height=\"476\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"http:\/\/gizmodo.com\/which-americans-have-the-longest-drive-to-the-grocery-s-1221441087\">The second example<\/a> visualizes the travel distance to the nearest grocery store from different starting points across the U.S. Line segments span from each starting point (distributed at regular intervals) to the nearest store, and I find the result visually very effective. Metropolitan areas on the East and West coasts with high population density are, not surprisingly, characterized by short distances to the nearest store. 
Less populated areas with desert and mountains have a noticeably longer travel distance, suggesting a much lower population density.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignleft size-large wp-image-5734\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/IV-Carto-Ex-02-940x529.png?resize=720%2C405\" alt=\"iv-carto-ex-02\" width=\"720\" height=\"405\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>As <a href=\"https:\/\/flowingdata.com\/2013\/06\/26\/grocery-store-geography\/\">a third example<\/a>, I look at a set of small multiples showing \u201cGrocery Store Geography\u201d, that is, the distribution of some of America\u2019s leading food retail chains. The use of small multiples is very effective: had all the data been shown in one map, it would have been cluttered to the point of being unreadable. Instead, each map shows only one retail chain, making its presence across the country easy to assess.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignleft size-large wp-image-5735\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/IV-Carto-Ex-03-940x662.png?resize=720%2C507\" alt=\"iv-carto-ex-03\" width=\"720\" height=\"507\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>For my visualization, my first approach is to create a map showing the density of grocery stores across NYC\u2019s five boroughs. 
Opening the dataset in OpenRefine shows that there are 14,172 entries for the counties I am interested in. I could have exported only these rows, but instead I import a CSV of the statewide data into Carto, so that I can look at statewide data as well if needed. I add a shapefile with data on the five NYC boroughs. To eliminate the data points that I don\u2019t need, and to be able to style the data in relation to the boroughs, I use the \u201cIntersect Second Layer\u201d analysis, which eliminates all data points not located within the counties of the shapefile. Since a choropleth map doesn\u2019t make a lot of sense when there are only five areas, I try using hexbins colored with a five-step, monochrome gradient by the attribute \u201ccount_vals_density\u201d that Carto created as a result of the analysis. The result is successful in that it shows high-density areas very clearly. However, the visualization is very unstable in Carto, and trying to manipulate it further results only in error messages, until eventually I can\u2019t reopen the map. Having tried it a couple of times, and also realizing that this visualization doesn\u2019t provide many insights (the hexbins make it impossible to get information about individual stores), I decide on a different approach: since store square footage is also a field in my dataset, I focus on store sizes. To do so, I go back to plotting individual stores as dots, but this time sized by square footage. The values in the dataset range from 0 (which I will also investigate further) to 230,000 sq ft, with an average of 2,500. I try out different sizing options, and end up finding equal intervals from size 4 to size 45 most effective in showing the difference between smaller store locations and really big ones. 
The approximate 1:10 ratio between the smallest and largest dot sizes mimics the roughly 1:100 ratio between the average (which is relatively close to the minimum) and the maximum value.<\/p>\n<p>I use a stepped color scale to enhance the difference, and also to tone down the visual impact of the many, many small dots representing small food retail stores.<\/p>\n<p>Finally, I add a legend and a pop-up feature that shows the store name and the square footage of each location. For the sizing legend, I decide to use a neutral gray because there is no option to use stepped color; using one solid color would potentially be a source of confusion because it makes the smallest dots look darkest (the opposite of my color assignment) due to the way Carto assigns transparency.<\/p>\n<p><a href=\"https:\/\/sandraatakora.carto.com\/builder\/5ec456f6-b10f-11e6-9cd4-0e3a376473ab\/embed\">The final visualization<\/a> shows a striking difference between a few very big retail locations, such as Target (notably, the one in downtown Brooklyn doesn\u2019t show up), and many, many small ones. It is also still possible to get a sense of the density of the retail stores in different parts of the city.<\/p>\n<p><a href=\"https:\/\/sandraatakora.carto.com\/builder\/5ec456f6-b10f-11e6-9cd4-0e3a376473ab\/embed\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignleft size-large wp-image-5732\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/11\/Screen-Shot-2016-11-22-at-11.18.59-PM-940x517.png?resize=720%2C396\" alt=\"NYC Food Retail\" width=\"720\" height=\"396\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>For a further development of this project, I would work more with the data set. 
Upon investigation, I realize that there are more than 1,000 entries with a 0 value for square footage, which obviously cannot be true. I choose to assume that these stores are indeed small, and that the overall impression of the visualization is thus still valid. However, the average value for square footage is probably not very reliable. Another concern is the very small number of food retail\u00a0stores in Queens, which doesn&#8217;t seem accurate.<\/p>\n<p>With a more complete data set, I would consider expanding the visualization to cover the entire state, to investigate whether this store distribution is characteristic only of NYC.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For my CartoDB project, I am interested in the distribution and characteristics of food retail stores in New York City. I find a dataset on data.ny.gov that has the location of all licensed food retail stores in New York State, and decide to use it for my analysis. For inspiration I look at three examples&hellip;<\/p>\n","protected":false},"author":433,"featured_media":5732,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"coauthors":[],"class_list":["post-5731","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1ur","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5731","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.pratt
si.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/433"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=5731"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/5731\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=5731"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=5731"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=5731"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=5731"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}