{"id":6855,"date":"2017-06-26T14:00:24","date_gmt":"2017-06-26T18:00:24","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=6855"},"modified":"2017-06-26T14:00:24","modified_gmt":"2017-06-26T18:00:24","slug":"lab4-mapping-nyc-311-complaint-data","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/lab4-mapping-nyc-311-complaint-data\/","title":{"rendered":"NYC 311 Service Requests by Zipcode"},"content":{"rendered":"<h3><strong>Introduction<\/strong><\/h3>\n<p><span style=\"font-weight: 400\">Calls to New York City\u2019s non-emergency complaint hotline (311), launched in 2003, have been made available as a dataset on the NYC OpenData portal. \u00a0The portal provides useful intelligence and the ability to detect patterns which would be difficult to understand without geospatial data. \u00a0For the Carto\/mapping lab I explored this dataset to gain insight into the volume and type of complaints reported across neighborhoods in all five boroughs. \u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h3><strong>Inspiration<\/strong><\/h3>\n<p><span style=\"font-weight: 400\">The following images exemplify maps that highlight key insights from the data in easy-to-understand visualizations.<\/span><\/p>\n<p><span style=\"font-weight: 400\">\u00a0<\/span><a href=\"https:\/\/www.wired.com\/category\/magazine\/\"><span style=\"font-weight: 400\">Wired Magazine<\/span><\/a><span style=\"font-weight: 400\"> mapped a week of complaint data from September 2010 in a piece called \u201c<\/span><a href=\"https:\/\/www.wired.com\/2010\/11\/ff_311_new_york\/\"><span style=\"font-weight: 400\">What\u2019s Your Problem<\/span><\/a><span style=\"font-weight: 400\">\u201d <\/span><b>(Fig 1)<\/b><span style=\"font-weight: 400\">. \u00a0This fun map communicates the number of complaints by zip code using colorful graphics and a simple key. 
\u00a0<\/span><\/p>\n<div id=\"attachment_6851\" style=\"width: 485px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/research.prattsils.org\/blog\/coursework\/information-visualization\/lab4-mapping-nyc-311-complaint-data\/attachment\/whats-your-problem\/\" rel=\"attachment wp-att-6851\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6851\" class=\" wp-image-6851\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/06\/Whats-Your-Problem-620x566.png?resize=485%2C443\" alt=\"\" width=\"485\" height=\"443\" \/><\/a><p id=\"caption-attachment-6851\" class=\"wp-caption-text\"><strong>Fig 1<\/strong> 2010 Mapping of NYC Complaint Data from <a href=\"https:\/\/www.wired.com\/2010\/11\/ff_311_new_york\/\" target=\"_blank\" rel=\"noopener noreferrer\">Wired Magazine<\/a><\/p><\/div>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">(<\/span><b>Fig 2)<\/b><span style=\"font-weight: 400\"> is from <\/span><a href=\"https:\/\/carto.com\/learn\/guides\/widgets\/exploring-widgets\"><span style=\"font-weight: 400\">Carto.com\u2019s<\/span><\/a><span style=\"font-weight: 400\"> documentation pages and shows analysis from a multi-layer map combining polygons and point data. 
\u00a0Again, visually interesting with clear analysis.<\/span><\/p>\n<div id=\"attachment_7010\" style=\"width: 620px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/research.prattsils.org\/blog\/coursework\/information-visualization\/lab4-mapping-nyc-311-complaint-data\/attachment\/widgets\/\" rel=\"attachment wp-att-7010\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7010\" class=\"size-medium wp-image-7010\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/07\/widgets-620x350.png?resize=620%2C350\" alt=\"\" width=\"620\" height=\"350\" \/><\/a><p id=\"caption-attachment-7010\" class=\"wp-caption-text\"><strong>Fig 2<\/strong> Applying interactivity on a multi-layer map from <a href=\"https:\/\/carto.com\/learn\/guides\/widgets\/exploring-widgets\" target=\"_blank\" rel=\"noopener noreferrer\">Carto.com<\/a>.<\/p><\/div>\n<p><span style=\"font-weight: 400\">\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/fivethirtyeight.com\/\">FiveThirtyEight<\/a> presented a choropleth map <b>(Fig 3)<\/b> depicting mortality rates for leading causes of death in every U.S. county from 1980 to 2014. 
\u00a0I like the interactive feature of this map animating the changes over time.<\/p>\n<div id=\"attachment_6992\" style=\"width: 620px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/research.prattsils.org\/blog\/coursework\/information-visualization\/lab4-mapping-nyc-311-complaint-data\/attachment\/screen-shot-2017-06-23-at-5-34-08-pm\/\" rel=\"attachment wp-att-6992\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-6992\" class=\"size-medium wp-image-6992\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/07\/Screen-Shot-2017-06-23-at-5.34.08-PM-620x470.png?resize=620%2C470\" alt=\"\" width=\"620\" height=\"470\" \/><\/a><p id=\"caption-attachment-6992\" class=\"wp-caption-text\"><strong>Fig 3<\/strong> Interactive choropleth map from <a href=\"https:\/\/projects.fivethirtyeight.com\/mortality-rates-united-states\/\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>FiveThirtyEight<\/strong><\/a>.<\/p><\/div>\n<p>&nbsp;<\/p>\n<h3><strong>Materials<\/strong><\/h3>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">I used <\/span><a href=\"https:\/\/opendata.cityofnewyork.us\/\"><span style=\"font-weight: 400\">NYC Open Data<\/span><\/a><span style=\"font-weight: 400\"> 311 dataset records from 2010 to present, initially made available to the public in 2015 as part of the Open Data For All initiative. 
The complete dataset contains 15M rows and 53 variables (columns) describing the service request, such as complaint type, date received, incident address, and resolution description.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">To map the data I used Carto Builder, formerly CartoDB, a web-based analysis tool which enables users to gain key insights from location data through visualization and analysis.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Tableau Public is a business intelligence and analysis tool for visualizing quantitative and geospatial data, used in this lab as a comparison to Carto Builder.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">MS Excel pivot tables were employed to review and validate map results.<\/span><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3><strong>Methods<\/strong><\/h3>\n<p><span style=\"font-weight: 400\">To limit computational overhead, I first filtered the table to eight columns including lat\/lon, reducing it to 2M rows. I sorted the table to get the top 10 complaints for all years, then attempted to \u201cclean\u201d it using OpenRefine and R, with limited success given the large dataset. \u00a0Subsequent iterations of \u201ccleaning\u201d the data involved grouping and filtering directly in the portal. \u00a0By filtering year to 2016 and grouping by month, type, zip code, and count of unique id, I created datasets ranging from 160k rows down to 50k rows. \u00a0I used Excel pivot tables at each step to validate the numbers.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The FiveThirtyEight map looked like a Tableau choropleth, so I attempted to recreate it in Tableau for practice and as a comparison. \u00a0I imported the dataset and dragged in zip codes and then measures, which created a filled map <\/span><b>(Fig 4)<\/b><span style=\"font-weight: 400\">. \u00a0I then created filters based on the dimensions and added the date to Pages, which generated a time slider. 
\u00a0The map looked and behaved as expected so I then moved on to Carto Builder.<\/span><\/p>\n<div id=\"attachment_7011\" style=\"width: 642px\" class=\"wp-caption alignleft\"><a href=\"http:\/\/research.prattsils.org\/blog\/coursework\/information-visualization\/lab4-mapping-nyc-311-complaint-data\/attachment\/tableau-map-2\/\" rel=\"attachment wp-att-7011\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7011\" class=\" wp-image-7011\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/07\/Tableau-Map-620x344.png?resize=642%2C356\" alt=\"\" width=\"642\" height=\"356\" \/><\/a><p id=\"caption-attachment-7011\" class=\"wp-caption-text\"><strong>Fig 4<\/strong> NYC 311 by Zip visualized in <a href=\"https:\/\/public.tableau.com\/views\/311ComplaintsbyZipMonth\/311ServiceRequestsbyZip?:embed=y&amp;:display_count=yes\" target=\"_blank\" rel=\"noopener noreferrer\">Tableau Public<\/a>.<\/p><\/div>\n<p><span style=\"font-weight: 400\">Carto created point data from the dataset using the lat\/lon upon import. \u00a0I realized I needed polygon geom to create a filled map. \u00a0Referring back to Tableau I had not noticed that it used generated polygons rather than the lat\/lon in the data set. \u00a0I tried running the Georeference analysis in Carto but received an error message: too many rows. \u00a0I then imported a shapefile of the zip codes and created a join which appeared to work but still had difficulty processing the 50k rows. \u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">At an impasse, I finished the lab with a good looking map and incorrect measures. I later had the idea of sampling the dataset and researched some tools. \u00a0Fortuitously I discovered sampling is built in to Carto. \u00a0I ran the sampling analysis and georeferenced the result set which quickly generated the correct geometry but the numbers still looked \u201coff\u201d. 
\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">I revisited the dataset in the portal and, through further filtering, reduced the table to 20k rows and a 1MB file size. \u00a0I ran the georeference analysis again and got the same \u201ctoo many rows\u201d error. \u00a0\u00a0I resorted to joining this table to the shapefile and generated a filled map <strong>(Fig 5).<\/strong> \u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_7012\" style=\"width: 819px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/research.prattsils.org\/carto-lab-transformations\/\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7012\" class=\" wp-image-7012\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2017\/07\/Carto-Lab-Transformations-620x187.png?resize=819%2C247\" alt=\"\" width=\"819\" height=\"247\" \/><\/a><p id=\"caption-attachment-7012\" class=\"wp-caption-text\"><strong>Fig 5<\/strong> Map Iterations: point data, NYC zip code shapefile, joined tables<\/p><\/div>\n<p>&nbsp;<\/p>\n<h3><strong>Results<\/strong><\/h3>\n<p><span style=\"font-weight: 400\">The widgets for complaints by type and by zip reported accurately, but the legend and map represented the count of the complaints rather than the sum. \u00a0\u00a0The time-series widget inaccurately reflected the date range. \u00a0Given the amount of time spent on the data, analysis and widgets, I had to forgo time animation. I subsequently discovered that only point data can be animated. 
\u00a0There is likely a workaround for polygons but I did not find it.<\/span><\/p>\n<p>[iframe src=&#8221;100%&#8221; height=&#8221;520&#8243; frameborder=&#8221;0&#8243; src=&#8221;https:\/\/jpolanco.carto.com\/builder\/ab303325-35fe-4a80-b8a3-36299e4548d4\/embed&#8221; allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen]<\/p>\n<p>&nbsp;<\/p>\n<p><em><span style=\"text-decoration: underline\"><strong>Update July 3:<\/strong><\/span><\/em><\/p>\n<p><span style=\"font-weight: 400\">While doing research I found the Carto analysis that intersects a second layer to create an aggregate column of the complaint counts. \u00a0I joined the zip code shapefile to the complete (not sampled) dataset, ran the analysis to intersect a second layer and generate aggregates, and styled the map by that value. \u00a0The map below correctly reflects the table data.<\/span><\/p>\n<p>[iframe src=&#8221;100%&#8221; height=&#8221;520&#8243; frameborder=&#8221;0&#8243; src=&#8221;https:\/\/jpolanco.carto.com\/builder\/c0e54af2-1df6-40f2-874c-aeba43dbb1d5\/embed&#8221; allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen]<\/p>\n<h3><strong>Future Directions<\/strong><\/h3>\n<p><span style=\"font-weight: 400\">The Carto lab has been the most challenging and time-consuming due to issues with the dataset and a non-intuitive application. \u00a0In hindsight and with more experience, understanding the data first can reduce the time and effort spent cleaning and manipulating it. \u00a0Going forward I would try to work with smaller datasets or sampling options from the beginning. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Carto Builder is deceptively complex. \u00a0Though it presents as \u201cintuitive, logical and effortless\u201d, a non-geospatial analyst would need some command of statistics. Knowledge of CSS and HTML is very useful, but familiarity with SQL is necessary to create complex visualizations. 
\u00a0By comparison, Tableau automatically georeferenced the zip codes into polygons and had no issue with the larger datasets. \u00a0I was able to animate polygons and duplicate the FiveThirtyEight map functionality. \u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">Despite less than desired results, the mapping exercise still revealed key insights. \u00a0The zipcode shapefile increased my awareness of the neighborhood boundaries within the boroughs and their relative size. \u00a0The mapping of the highest volume of complaints by zip through the year gave clarity to the disparity of issues. \u00a0To make the most of the rich dataset, I would like to get the time-series working, keep addresses in the dataset to show point data, and drill down to specific issues.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Launched in 2003, New York City\u2019s non-emergency complaint hotline (311) calls have been made available as a dataset on NYC OpenData portal. \u00a0The portal provides useful intelligence and the ability to detect patterns which would be difficult to understand without geospatial data. 
\u00a0For the Carto\/mapping lab I explored this dataset to gain insight of&hellip;<\/p>\n","protected":false},"author":210,"featured_media":7012,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"coauthors":[],"class_list":["post-6855","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1Mz","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6855","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/210"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=6855"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/6855\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=6855"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=6855"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=6855"},{"taxonomy":"author","embeddab
le":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=6855"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}