{"id":3743,"date":"2015-10-05T16:13:09","date_gmt":"2015-10-05T20:13:09","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=3743"},"modified":"2015-10-05T16:13:09","modified_gmt":"2015-10-05T20:13:09","slug":"tenants-call-the-311-visualizing-nyc-housing-complaints","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/tenants-call-the-311-visualizing-nyc-housing-complaints\/","title":{"rendered":"Tenants Call the 311: Visualizing NYC Housing Complaints"},"content":{"rendered":"<p><strong>Goal<\/strong><\/p>\n<p>For the Tableau lab in our Information Visualization class, I decided to focus on 311 hotline data from the NYC Open Data portal, in particular the subset of complaint data published by the Department of Housing Preservation and Development (HPD). This dataset represents all calls by tenants to the city since 2010\u00a0 to report either issues in their apartments or\u00a0building-wide issues, with the exception of structural issues for buildings. These fall under the jurisdiction of the Department of Buildings.<\/p>\n<p>The tutorial videos from Tableau helped outline the possibilities with the software. I thought it would be interesting to visualize the data to allow analysis of the following:<\/p>\n<ul>\n<li>areas within NYC experiencing a high level of housing trouble;<\/li>\n<li>types of problems reported and the frequency of each type of complaint;<\/li>\n<li>the spread of complaints over the months of the year.<\/li>\n<\/ul>\n<p><strong>Inspiration<\/strong><\/p>\n<p>I remembered I had seen one really\u00a0good visualization for noise complaints in New York City in the past, but I was unable to find the exact one, though I was able to locate a few others. 
What I like about the first substitute I found, Example A,\u00a0is that it shows the entire city and that its rendering as a choropleth map allows you to clearly see which areas are the \u201cnoisiest\u201d.<\/p>\n<div id=\"attachment_3745\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_complaint_cartoDB.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-3745\" class=\"size-medium wp-image-3745\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_complaint_cartoDB-620x372.jpg?resize=620%2C372\" alt=\"Example A: Choropleth Map Visualization of Noise Complaint Calls Source: &quot;NYC, the City That Never Sleeps&quot;. http:\/\/blog.cartodb.com\/noisy-new-york\/\" width=\"620\" height=\"372\" \/><\/a><p id=\"caption-attachment-3745\" class=\"wp-caption-text\">Example A: Choropleth Map Visualization of Noise Complaint Calls in NYC<br \/>Source: &#8220;NYC, the City That Never Sleeps&#8221;. http:\/\/blog.cartodb.com\/noisy-new-york\/<\/p><\/div>\n<p>I also really like Example B, which visualizes noise complaint data as well. 
It\u2019s not entirely different from Example A, but this series of visualizations includes a breakdown of types of noise.\u00a0Knowing that HPD breaks down tenant complaints into categories (no heat, pests, and plumbing issues are all very separate issues), I wanted to be able to represent the variety of complaints occurring throughout the city.<\/p>\n<div id=\"attachment_3746\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_Sluis_01.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-3746\" class=\"size-medium wp-image-3746\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_Sluis_01-620x193.jpg?resize=620%2C193\" alt=\"Example B(1): Visualization of Noise Complaints in NYC. Source: &quot;Yo, I'm Trying to Sleep Here! New York's Wonderful Map of Noise&quot; by John Metcalfe. http:\/\/www.citylab.com\/tech\/2013\/04\/yo-im-trying-sleep-here-new-yorks-wonderful-map-noise\/5279\/\" width=\"620\" height=\"193\" \/><\/a><p id=\"caption-attachment-3746\" class=\"wp-caption-text\">Example B(1): Visualization of Noise Complaint Calls in NYC<br \/>Source: &#8220;Yo, I&#8217;m Trying to Sleep Here! New York&#8217;s Wonderful Map of Noise&#8221; by John Metcalfe. 
http:\/\/www.citylab.com\/tech\/2013\/04\/yo-im-trying-sleep-here-new-yorks-wonderful-map-noise\/5279\/<\/p><\/div>\n<div id=\"attachment_3747\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_Sluis_02.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-3747\" class=\"size-medium wp-image-3747\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_Sluis_02-620x384.jpg?resize=620%2C384\" alt=\"Example B(2): Visualization of Noise Complaints in NYC.\" width=\"620\" height=\"384\" \/><\/a><p id=\"caption-attachment-3747\" class=\"wp-caption-text\">Example B(2): Visualization of Noise Complaint Calls in NYC<\/p><\/div>\n<p>The last example that I found on WIRED really resonated with me: Example C.<\/p>\n<div id=\"attachment_3749\" style=\"width: 620px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_WIRED.jpg\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-3749\" class=\"size-medium wp-image-3749\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_example_WIRED-620x398.jpg?resize=620%2C398\" alt=\"Example C: Steam Graph of 311 Data Source: &quot;What a Hundred Million Calls to 311 Reveal about New York&quot; by Steven Johnson. http:\/\/www.wired.com\/2010\/11\/ff_311_new_york\/\" width=\"620\" height=\"398\" \/><\/a><p id=\"caption-attachment-3749\" class=\"wp-caption-text\">Example C: Steam Graph of 311 Data<br \/>Source: &#8220;What a Hundred Million Calls to 311 Reveal about New York&#8221; by Steven Johnson. 
http:\/\/www.wired.com\/2010\/11\/ff_311_new_york\/<\/p><\/div>\n<p>This steam graph illustrates that by choosing smart ways to visualize your data, you can help people understand the world that exists behind the data. By aggregating all 311 complaint data not only by type, but also by hour, the creator of the visualization has offered an inner portrait of New York City. According to the steam graph, noise bothers New Yorkers the most when it occurs from the evening into the wee hours of the morning, whereas problems like poor street conditions and illegal use of buildings are only report-worthy to New Yorkers during the day. Taxi complaints arguably reflect taxi usage: they come in throughout the day, with an obvious taper between late evening and morning.<\/p>\n<p>New York is infamous for bad housing conditions, and my experience has been no exception. I have had some really <strong>bad<\/strong> landlords, so I am very conscious of my complaint algorithm as a tenant. An hourly visualization such as the one on WIRED might not paint a revealing portrait of housing conditions in NYC, but I felt a monthly visualization could offer an interesting view of what New Yorkers endure in their apartments, season by season.<\/p>\n<p><strong>Downloading and preparing the data<\/strong><\/p>\n<p>The entire HPD dataset available on the NYC Open Data platform spans 2010 to the current date. The initial download, with minimal column filters applied, was ca. 2 GB. When I tried to open it with Excel, the program warned that it was not able to open the entire file. I decided to visit the site again and remove any columns that I was fairly certain I wouldn\u2019t need. The resulting file was about 800 MB, comprising\u00a0over 3.5 million complaint records. According to Wikipedia, there are ca. 8.5 million people living in NYC. 
As an admittedly crude statistic, that works out to roughly two housing complaints called in\u00a0since 2010 for every five people\u00a0living in New York. That&#8217;s pretty astonishing considering that a share of NYC residents\u00a0are homeowners.<\/p>\n<p>For this project, I didn\u2019t want to include partial years. Excel still wouldn\u2019t open this smaller file. I used the program Sublime Text to open the file and drop a marker to delete all complaints made in 2015. The downloaded data from HPD is reasonably normalized, so I did not need to clean or reformat the data any further with Google Refine or Python.<\/p>\n<p><strong>Tableau worksheets: Three views<\/strong><\/p>\n<p>I already knew that I wanted to provide three views of the data to accomplish my goals:<\/p>\n<ol>\n<li>A <strong>map view of New York City<\/strong> with tenant complaints and number of complaints mapped to addresses;<\/li>\n<li>A <strong>bar chart<\/strong> of the different <strong>types of complaints<\/strong> and the <strong>number of complaints<\/strong>. My intention is to allow this chart to also\u00a0serve as an interactive filter for the map from the previous worksheet when I design the\u00a0dashboard later. The bar chart should allow a person to view complaints by single year;<\/li>\n<li>A <strong>steam graph<\/strong> of tenant complaints aggregated <strong>by month across the years<\/strong>, inspired by the graph from WIRED<strong>.<\/strong><\/li>\n<\/ol>\n<p><strong>Preliminary assessment of the visualizations<\/strong><\/p>\n<p>Once I created these views, a few things became apparent to me. Both within the categories and within the descriptors there appeared to be a lot of overlap in the terms used. By <strong>grouping items<\/strong>, I was able to resolve some of these issues and keep the lists\u00a0shorter. 
One simple solution was\u00a0creating single groups for items that clearly belonged together, for example, the descriptors \u201cwater-leak\u201d and \u201cwater leak\u201d. Others were less evident, but reasonable, like collapsing the categories \u201cSTRUCTURAL\u201d and \u201cWATER LEAK\u201d, since the only descriptor under \u201cSTRUCTURAL\u201d was water leak. Water leak also appears under the category \u201cPLUMBING\u201d, but the descriptors under plumbing seem to refer to issues\u00a0with sink and bathtub plumbing, whereas the descriptors under the \u201cWATER LEAK\u201d category seem broader, more structural.<\/p>\n<p>At the same time, some category names were <strong>non-descriptive\u00a0as\u00a0labels<\/strong>, so I changed them to allow at least minimal understanding of the terms: for example, I collapsed \u201cNONCONST\u201d, \u201cSAFETY\u201d, and \u201cUNSANITARY CONDITION\u201d into one term because there was overlap between sub-category descriptors, and then renamed the group \u201cUNSANITARY CONDITION\/SAFETY\u201d. One problem that I could not resolve was the category used by HPD called \u201cGENERAL\u201d. Descriptors found within that category seemed to be spread across many other categories and overlapped considerably with \u201cGENERAL CONSTRUCTION\u201d, which I grouped with \u201cGENERAL\u201d.<\/p>\n<p>It\u2019s clear that over time, the names of categories and descriptors have changed. There may even be differences between how individual 311 operators classify housing complaints. For now, I let my created group \u201cGENERAL\u201d be.<\/p>\n<p>Other changes I made were along the lines of design aesthetics. I <strong>changed colors to be more intuitive, <\/strong>for example, \u201cheat and hot water\u201d to red, \u201cunsanitary condition\/safety\u201d to brown, and \u201cplumbing\u201d to blue. 
I also tried to conform complaint groups of a similar nature to a particular color palette, like using different blues for categories that pertain to water.<\/p>\n<p><strong>Notes on the Steam Graph<\/strong><\/p>\n<p>Unfortunately, Tableau does not offer a default way to render your data as a steam graph. For the particular steam graph I wanted to create showing aggregated monthly trends, the Tableau area chart came the closest. Googling Tableau and steam graph, I did find <a href=\"http:\/\/vizwiz.blogspot.com\/2013\/01\/creating-stream-graphs-in-tableau-8-in.html\" target=\"_blank\">one blog post\u00a0on VizWiz by Andy Kriebel<\/a>,\u00a0who attempted a steam graph with Tableau and came pretty close to representing one. I couldn\u2019t apply his method, since the setup of my data\u00a0was not quite the same, but it at least showed it was possible and directed me to the calculated field function. In the end, I was able to find a solution that created a steam graph similar to the one in the blog post.<\/p>\n<p>First, I put the \u201cCreated Date\u201d dimension from the original data on the Columns shelf of the area chart and set it to month. The normal way to create the area graph is to put the Number of Records measure on the Rows shelf. If you look into the editor for that measure, you will see that the calculated field is \u201c1\u201d. I copied that measure and named it \u201cSteam Graph Calculation\u201d. Following the calculated field syntax in the blog post, I made some of the complaint categories evaluate to \u201c-1\u201d so that they fall below the horizontal zero line, creating a more interesting steam graph-like shape similar to the one in Kriebel&#8217;s blog post. But the calculated field editor would not let me use the dimension I had created with grouped complaint types as the CASE expression. 
That meant I had to use the individual original complaint types (not the grouped complaint types)\u00a0in my script in order to designate which categories should be calculated with \u201c-1\u201d. The final calculated field script looks like this:<\/p>\n<p><code>CASE [Complaint Type (original)]<br \/>\nWHEN 'ELECTRIC' THEN -1<br \/>\nWHEN 'PLUMBING' THEN -1<br \/>\nWHEN 'HEATING' THEN -1<br \/>\nWHEN 'HEAT\/HOT WATER' THEN -1<br \/>\nWHEN 'STRUCTURAL' THEN -1<br \/>\nWHEN 'WATER LEAK' THEN -1<br \/>\nELSE 1<br \/>\nEND<br \/>\n<\/code><\/p>\n<p>The script also reflects my choice to set complaint categories primarily addressing problems with utilities beneath the zero line, with those above the line slowly moving to non-utility complaints within the apartment to common spaces and finally\u00a0outside the building. I then removed the marks along the y-axis to avoid confusion, especially since the y-axis now\u00a0showed negative numbers due to the calculations I applied.\u00a0I am sure there are more ways to manipulate the calculation to create a better graph, but this was a good stopping point for me as\u00a0someone who is still learning to use the calculated field.<\/p>\n<p>These are screenshots of the three preliminary visualizations I created:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Bar.gif\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-3753\" style=\"border: 1px solid #999999\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Bar-620x376.gif?resize=620%2C376\" alt=\"Tableau_Hwang_Bar\" width=\"620\" height=\"376\" \/><\/a><\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Map.gif\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone 
size-medium wp-image-3754\" style=\"border: 1px solid #999999\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Map-620x401.gif?resize=620%2C401\" alt=\"Tableau_Hwang_Map\" width=\"620\" height=\"401\" \/><\/a><\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Steam.gif\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-3755\" style=\"border: 1px solid #999999\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2015\/10\/Tableau_Hwang_Steam-620x397.gif?resize=620%2C397\" alt=\"Tableau_Hwang_Steam\" width=\"620\" height=\"397\" \/><\/a><\/p>\n<p><strong>Future Directions<\/strong><\/p>\n<p>My\u00a0Tableau worksheets will certainly change as I lay them out in a\u00a0dashboard. For example, legends currently appear for each one, though once they appear together, I hope the bar chart will serve as the filter for the city map and alleviate the need for a legend there. The legend will be removed from the steam graph, since I chose to explicitly label the color areas. Here\u2019s a laundry list of ways I would improve what I have so far:<\/p>\n<ul>\n<li>Create a dashboard for the worksheets;<\/li>\n<li>Further refine the descriptors for complaint categories to make the list more compact, which would require more diligent investigation of HPD\u2019s data documentation;<\/li>\n<li>Reconsider the placement of complaints in the steam graph. I like my imposition of a &#8220;logical&#8221; order, but similar colors might be too close;<\/li>\n<li>Research whether deleting more columns from my dataset will increase the response speed for the visualizations, especially the map. 
If not, I might consider reducing the\u00a0visualizations to only represent\u00a0&#8220;warranty of habitability issues&#8221;;<\/li>\n<li>Read more about the calculated field editor. One thought is if there is a way to edit the record count to a range instead of just a negative and positive count and still make an area graph, I will be able to adjust the primary areas around the center line to render more like a Rorschach inkblot (more like a steam graph) than just colored areas above and below a center line.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Goal For the Tableau lab in our Information Visualization class, I decided to focus on 311 hotline data from the NYC Open Data portal, in particular the subset of complaint data published by the Department of Housing Preservation and Development (HPD). This dataset represents all calls by tenants to the city since 2010\u00a0 to report&hellip;<\/p>\n","protected":false},"author":242,"featured_media":3874,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[37,44,23],"coauthors":[],"class_list":["post-3743","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization","tag-37","tag-lab","tag-tableau"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-Yn","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/3743","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/
post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/242"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=3743"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/3743\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=3743"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=3743"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=3743"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=3743"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}