- 1. Introduction
My second Information Visualization practice displays NYC rat incidents from 4 perspectives:
- Showing the geographical distribution of rat incidents across NYC.
- Illustrating the influence of temperature (change of month) on rat incidents.
- Displaying the yearly trend of rat incidents in the 5 main regions.
- Explaining how location type influences the number of incidents.
Background & Topic Reason
I became interested in NYC rat incidents because, before I came to New York, a friend who had lived in the city for 5 years gave me some tips on how to find an apartment without mice or bug problems. I also saw many videos and articles about the large number of rodents living in the city and about how much city leaders may spend on the issue.[1] A common claim is that there are 8 million rats in New York. Using 311 reports, however, Jonathan Auerbach[2] found that the city has a mere 2 million of the rodents, a ratio of about four humans to one rat. So I started looking for databases about rodents. Fortunately, I found a database with the specific location, issue number… collected by DOHMH (NYC Department of Health and Mental Hygiene), and started my work.
Goal & Target Audience & Deployed Format
My goal is to give an overall view of NYC’s rat distribution and to focus on some potential factors. The target audience is people who are choosing an apartment or who are interested in the situation. Because this user group is very broad, I believe this attempt can be seen as an exploratory project. It is displayed on screen as an episodic dashboard.
- 2. Methods & Rationale
The Tableau project consists of 4 visualization sheets. The data covers borough, zip code, location type, record sum… and comes from NYC Open Data.
Since the database contains a lot of blank rows and some information that is not pertinent to my work, I used OpenRefine and Excel to clean and reshape it.
I also used several Tableau techniques to create my visualizations. Below, I explain my methods and procedures for preparing the information and building the corresponding visualizations.
- 3. The Details
- The Heat Map of “Time Influence Incident Sum”
The objective of this sheet is to show the overall trend of rat incidents rising by year and how the month, or temperature, influences the number of issues. This is a time-series visualization with comparatively high volume. I wanted to show the two dimensions on one chart and create a clear visual impact, so I selected a heat map, which uses color to display the number of records (SUM) and shows what I need.
Use the OpenRefine “timeline facet” to select the time period 2010~2017 and remove the extra data rows.
Use “edit column” and “split several columns” to separate the specific creation time from MM/DD/YY.
Also, because the data has gaps where nothing was collected, I use “Analytics” to forecast or estimate the further number of incidents based on the current records.
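The cleaning steps above can also be sketched outside OpenRefine. The following Python is only a minimal sketch under assumptions: the column name `created`, the sample rows, and the helper names `clean` and `heat_map_counts` are hypothetical and are not taken from the actual dataset. It drops blank rows, keeps rows from 2010 to 2017, splits the MM/DD/YY creation date into year and month, and counts records per year and month, which is the value the heat map encodes with color.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; "created" mimics an MM/DD/YY creation date column.
rows = [
    {"created": "01/15/10", "zip": "10001"},
    {"created": "07/04/10", "zip": "10002"},
    {"created": "07/20/10", "zip": "10001"},
    {"created": "03/09/09", "zip": "11201"},  # before 2010, dropped
    {"created": "08/01/17", "zip": "10451"},
    {"created": "", "zip": "10003"},          # blank date, dropped
]


def clean(rows, first_year=2010, last_year=2017):
    """Drop blank or out-of-range rows and split the date into year and month."""
    cleaned = []
    for row in rows:
        if not row["created"]:
            continue
        created = datetime.strptime(row["created"], "%m/%d/%y")
        if first_year <= created.year <= last_year:
            cleaned.append({**row, "year": created.year, "month": created.month})
    return cleaned


def heat_map_counts(cleaned):
    """Count incidents per (year, month) cell of the heat map."""
    return Counter((row["year"], row["month"]) for row in cleaned)


cells = heat_map_counts(clean(rows))
for (year, month), count in sorted(cells.items()):
    print(year, month, count)
```

Each `(year, month)` key corresponds to one cell of the heat map, and its count is what the color intensity would represent.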
For the color, I first used blue, but it looked boring, and blue is not a good color for showing detail. Blue is also not in harmony with my main tone, red (chosen to convey the seriousness of the issue).
The inspiration came from the heat map below, in which it is very easy to track a specific time of interest. For example, we can see that the right side of the map, the weekend, has more fatal vehicle crashes, and at the same time spot that the day with the fewest accidents is a Monday… A few highlights and marks would make it easier to remember.
- The Choropleth Map of “Rat Incidents Location”
This sheet needed no data revision. Thanks to the clean and clear zip code numbers, I could use them directly to create a Choropleth Map that shows the different levels of rat incidents.
I wanted a geographical map in my dashboard, and I had the data for it, so I created one. However, a Choropleth Map can sometimes be quite misleading, for “One’s perception of shaded value can be affected by the underlying area of the geographic region.” The relative size of each area may also affect people’s perception of the numbers, and the saturation of the color may not easily offset this flaw.
I found a beautiful Choropleth Map on Pinterest. It is simple, beautiful, and easy to read, without any burden.
From the map, we can compare the shades of color to get a sense of the overall situation, pick out the region we want to know more about, and check its annotation.
- The Bumps Chart of “5 Regions’ Rat Incident Year Trend”
For this sheet, I reshaped the borough data, combining the specific locations within each region to get the 5 main regions of NYC (by using Tableau’s “Create Group”).
I wanted to show the trend over 8 years and at the same time compare the 5 regions. So I used a time series again, separated the chart into 5 parts, and used the value of the color to display the level of seriousness of the 5 groups. I also used “Analytics” to reveal the trend line, and added the numbers as text on the trend to enrich the chart.
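The grouping step can be illustrated in code as well. This is a rough sketch, not the Tableau “Create Group” feature itself: the `REGION_OF` mapping, the raw labels in it, and the sample `records` are all assumptions made for illustration, and the labels in the real data may differ. It maps each raw location label to one of the 5 regions and counts incidents per region per year, which gives the yearly series the chart compares.

```python
from collections import defaultdict

# Hypothetical mapping from raw location labels to the 5 regions.
REGION_OF = {
    "MANHATTAN": "Manhattan",
    "NEW YORK": "Manhattan",
    "BROOKLYN": "Brooklyn",
    "BRONX": "Bronx",
    "STATEN ISLAND": "Staten Island",
    "QUEENS": "Queens",
    "ASTORIA": "Queens",
    "FLUSHING": "Queens",
}

# Hypothetical (label, year) records, one per reported incident.
records = [
    ("ASTORIA", 2010),
    ("QUEENS", 2010),
    ("BROOKLYN", 2010),
    ("FLUSHING", 2011),
    ("NEW YORK", 2011),
    ("UNKNOWN", 2011),  # label with no group, skipped
]


def yearly_counts_by_region(records):
    """Group raw labels into regions and count incidents per region and year."""
    counts = defaultdict(lambda: defaultdict(int))
    for label, year in records:
        region = REGION_OF.get(label.upper())
        if region is None:
            continue
        counts[region][year] += 1
    # Sort each region's series by year so it reads as a trend.
    return {region: dict(sorted(by_year.items()))
            for region, by_year in counts.items()}


trend = yearly_counts_by_region(records)
for region, by_year in trend.items():
    print(region, by_year)
```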
I read Edward Tufte’s comments on the Bumps Chart, which compares and shows change across different stages. It uses simple colors so that the viewer can pick out each dimension and trace it. I wanted to show the change in numbers over 8 years and make a comparison. However, what I made is not a traditional Bumps Chart. It could be better with more measures to compare.
- The Bar Chart of “Location Type Influence Num of Rat Incident”
The bar chart reveals how location type influences the number of rat incidents. I use a simple bar chart to compare the different location types.
I had planned to use a more complex or elaborate way to display it, but that might lose the meaning of the dashboard, so I used the simplest one. I merged similar location types with “Create Group” and show the record counts in order. I use grey as the color of this sheet.
The inspiration came from an article posted on Eagereyes. I was considering a Stacked Bar chart to display many pieces of information, but after reading the article I gave up on using various colors because of the confusion they may cause. As we can see, similar colors are hard to tell apart, while contrasting colors may look messy.
- Other Attempts
I tried using a line to illustrate the monthly change and compare it with an average line, but it seemed plain and unpersuasive.
I used the “diff” function of OpenRefine to calculate the number of days between when a record was created and when it was closed, to get the 311 processing days.
I wanted to create a chart of the average 311 processing time for every street, to show people how long it takes after they report a rat issue. However, the Dot Plot is hard to read: it has too many rows, and unless you move your mouse over the dots for the fine details, you would never figure out what it means.
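The processing-time calculation described above can be approximated in a few lines of Python. This is only a sketch under assumptions: the street names, the dates, and the helper name `average_processing_days` are hypothetical, and it uses plain date subtraction in place of OpenRefine’s “diff”. It computes the days between creation and closing for each report and averages them per street.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical (street, created, closed) rows; closed is None if still open.
reports = [
    ("BROADWAY", date(2016, 3, 1), date(2016, 3, 11)),
    ("BROADWAY", date(2016, 5, 2), date(2016, 5, 6)),
    ("5 AVENUE", date(2016, 4, 10), date(2016, 4, 13)),
    ("5 AVENUE", date(2016, 6, 1), None),  # still open, skipped
]


def average_processing_days(reports):
    """Average the days between creation and closing for each street."""
    days = defaultdict(list)
    for street, created, closed in reports:
        if closed is None:
            continue
        days[street].append((closed - created).days)
    return {street: mean(values) for street, values in days.items()}


avg_days = average_processing_days(reports)
for street, value in avg_days.items():
    print(street, value)
```

With one averaged value per street, the result has as many rows as there are streets, which is the same volume problem that made the Dot Plot hard to read.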
- Results & Exploration
My Link: My First Tableau Dashboard
While creating the Tableau Dashboard, I learned the techniques and used basic knowledge of Information Visualization to decide which visualization fits which kind of data. I really like how the “Filter” section lets different sheets interact with each other. Future exploration would add more connections between sheets; for example, I may add another database about the cleanliness of different regions of NYC.
Some points of confusion I encountered:
- Similar colors across the sheets may blur the different data that the creator wants to display.
- How to make the Forecast more interactive and smart.
- The perception bias of the Choropleth Map.
- I am not sure about adding numbers to the Bumps Chart, because I did not see them in others’ work.
After class, I got some valuable suggestions and made some changes:
- Changed the Bar Chart’s color to coordinate with the whole dashboard;
- Removed the misleading trend line from the Bumps Chart;
- Revised the title of the Bumps Chart so that people know what the measured numbers mean;
- Adjusted the order of the Dashboard so that people can read it in sequence (from the overall situation to the details);
- Added minimal annotations to the Heat Map.
Works Cited:
Heer, J., Bostock, M., & Ogievetsky, V. (2010). A Tour Through the Visualization Zoo. Communications Of The ACM, 53(6), 59-67. doi:10.1145/1743546.1743567
Kosara, Robert, et al. “Stacked Bars Are the Worst.” Eagereyes, 25 Aug. 2016, eagereyes.org/techniques/stacked-bars-are-the-worst.
Tufte, Edward Rolfe. The Visual Display of Quantitative Information. Graphics Press, 2015.