Death rates from various causes pique many people’s curiosity for a number of reasons. They can reveal trends in a country’s overall health, highlight local health issues, or simply satisfy morbid curiosity. For this lab, I created visualizations using data about leading causes of death collected by the National Center for Health Statistics. Two of these visualizations were put into a dashboard, which users may interact with to learn more about the leading causes of death in their state. The visualizations and dashboard were created using Tableau Public.
Visualization Examples
I brought three line graphs into the Tableau Public lab as inspiration for my visualizations and dashboard.
The first (Fig. 2) is a simple line graph with a single line of data, short axis labels, and small tick marks along each axis. The graph does not use any color, but the 2-D placement of the line is an easy pre-attentive attribute to recognize. I liked the simplicity of this graph and wanted to emulate it in my own work by keeping gridlines and labels on the dashboard as minimal as possible.
The second graph (Fig. 3) consists of three color-coded lines indicating three series. It has labels 0 to 6 on the y-axis and categories 1-4 on the x-axis. There are horizontal gridlines that line up with the numbers on the y-axis. This visualization has both positive and negative aspects that I considered. I appreciated the use of three distinct colors to indicate different series; in my own line graph visualization, I color-coded the top ten causes of death in each state using colors that are typically readable by those who are color blind. Following Tufte’s argument about “chartjunk” (2001), I did not like the horizontal gridlines across the graph, as I thought they were unnecessary and did not make the graph easier to read. I am also unsure of what 0 through 6 are supposed to indicate; this axis needed a more descriptive label. Finally, the data appear to be categorical rather than longitudinal, which suggests that a line graph may not have been the best approach for this visualization. A bar graph with three colored bars to indicate each series may have been more appropriate. Because my data was longitudinal, however, I chose to emulate only the color-coding of different categories from this visualization.
The final visualization I brought in as an example (Fig. 4) had similar qualities to the second. I liked the use of multiple colors to indicate different categories. Otherwise, I felt it was at once too busy and not descriptive enough. The extensive gridlines and 3D look made it confusing to read. I could not tell if the lines were meant to be read as sitting at different depths or if the effect was simply stylistic. The complete lack of labels also made this graph difficult to read. It is unclear what either axis is supposed to represent or what each line color indicates. The visualization is labeled as a template, indicating that it is only an example that people can build on, but these poor design features are not ones I would want to emulate. Therefore, I used this visualization as an indication of what not to do in my own line graphs.
In addition to these three visualizations that I brought into the lab, I would also like to mention the inspiration I took from Hilary Baribeau’s dashboard. I initially envisioned creating a dashboard consisting mostly of line graphs or graphs using shapes such as circles to represent data. After seeing her work using a map as a filtering mechanism, I realized that a map would be a simple and effective way to filter my own data. I therefore included a map at the top of my dashboard so that users can click the state they are interested in and filter the results in the graph at the bottom.
Materials
These visualizations were completed using death rate data from the National Center for Health Statistics (NCHS), a branch of the Centers for Disease Control and Prevention (CDC). Using death certificates from each of the 50 states and the District of Columbia, this dataset describes the leading causes of death in the United States between 1999 and 2015. It includes the total number of deaths in each state for each leading cause of death for every year in the range. Additionally, it includes age-adjusted death rates based on census data and intercensal estimates. Because I am not familiar with the implications of age-adjusted rates, however, I did not use them in my final visualizations. The data was already normalized when downloaded, but I needed to remove the “All Causes” totals as well as the rows that listed “United States” as a state.
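As a rough illustration of this cleanup step outside of Tableau Public, the following Python sketch drops the aggregate rows. The field names and all values are hypothetical and are not taken from the NCHS file; only the two labels “All Causes” and “United States” come from the description above.

```python
# Hypothetical rows mimicking the layout described above.
# Field names and death counts are invented for illustration only.
rows = [
    {"year": 2015, "cause": "All Causes", "state": "New York", "deaths": 1000},
    {"year": 2015, "cause": "Cancer", "state": "New York", "deaths": 320},
    {"year": 2015, "cause": "Cancer", "state": "United States", "deaths": 5000},
    {"year": 2015, "cause": "Unintentional Injuries", "state": "New York", "deaths": 50},
]


def drop_aggregate_rows(rows):
    """Keep only rows that describe one specific cause in one specific state."""
    return [
        row
        for row in rows
        if row["cause"] != "All Causes" and row["state"] != "United States"
    ]


cleaned = drop_aggregate_rows(rows)
```

Leaving the aggregate rows in would double-count deaths whenever totals are summed across causes or across states, which is why they are removed before any graphing.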
The graphs and dashboard were created using Tableau Public, free visualization software that connects to online accounts and allows users to save and share their work. Tableau Public provides links to projects on its website as well as embed codes to insert visualizations into HTML webpages. It also keeps the workbooks for each project so they may be downloaded from its server and opened in local Tableau Public software.
Methods
Five steps were attempted during this lab, with three ultimately resulting in the final product. First, I was interested in a basic line graph of the totals for each cause of death in the data (Fig. 1). I wanted a plain background without gridlines and with minimal labels, only enough to identify the years and the total numbers a user was looking at. Each cause of death was color-coded and a legend was included to clarify which color corresponded to each line. I elected to use a color-coded line graph because colors and 2D placement are pre-attentive attributes that are common and easily recognizable (Few, 2009). If I had more visible control over the style of the legend I would have eliminated the title, but that option seemed limited in Tableau Public. This visualization was ultimately not included in the dashboard, but it gave me a taste of what to expect with this dataset in Tableau Public.
After creating this initial line graph, I decided to create a similar one that was filtered by state. It plots the percentage of deaths in the state that year rather than totals, to compensate for the extreme population differences between states. This graph consists of lines over the series of years, although I reduced the number of labels to minimize clutter on the x-axis. Each year has a point over which users can hover to get more information about the numbers for that cause of death in that state that year. This visualization was included in the final dashboard.
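The percentage calculation behind this graph can be sketched in Python as follows. This is only an approximation of what the Tableau Public view computes: the records are invented, and the denominator is assumed to be the sum of the listed causes for a state and year, since the “All Causes” rows were removed from the data.

```python
from collections import defaultdict

# Hypothetical (state, year, cause, deaths) records; counts are invented.
records = [
    ("New York", 2014, "Cancer", 300),
    ("New York", 2014, "Heart Disease", 500),
    ("New York", 2014, "Unintentional Injuries", 200),
    ("North Dakota", 2014, "Cancer", 30),
    ("North Dakota", 2014, "Heart Disease", 20),
]


def percent_of_state_year(records):
    """Return each cause's share of the deaths listed for its state and year."""
    # Assumption: the denominator is the sum over the causes present in the
    # data for that state and year, not a separate "All Causes" figure.
    totals = defaultdict(int)
    for state, year, _cause, deaths in records:
        totals[(state, year)] += deaths
    return {
        (state, year, cause): 100 * deaths / totals[(state, year)]
        for state, year, cause, deaths in records
    }


shares = percent_of_state_year(records)
```

Because each share is relative to its own state and year, a small state and a large state can be read on the same 0 to 100 scale.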
Once I created these line graphs, I worked on developing a map filter for my dashboard that would separate the data in two ways. First, I wanted users to be able to use the map to select which state the other visualizations displayed, as it does in my current product. Second, I wanted the map to be a choropleth that could be filtered by cause of death and would display the percentage of deaths resulting from the cause chosen in the filter. So if “Unintentional Injuries” was selected from a pulldown filter, all the states would change to a blue hue to match the cause of death, shaded by the percentage for each state. If 5% of all deaths in New York were caused by “Unintentional Injuries,” then New York would be a lighter hue of blue than the hue of green it would receive for “Cancer” at 32% of all deaths in New York. However, due to limitations in the Tableau Public filtering and calculations, I was unable to filter my data in the second way. I therefore ended up creating a choropleth map that indicates the percentage of total deaths that occur in each state, which users can click to pull up more information in the bottom half of the dashboard.
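For reference, the calculation I had in mind for the second kind of filtering can be sketched in Python. This is not how Tableau Public evaluates filters; it only shows the order of operations the map would have needed, with the state totals computed before the cause filter is applied. The records are invented so that New York matches the 5% and 32% figures in the example above.

```python
from collections import defaultdict

# Hypothetical (state, cause, deaths) records; counts are invented so that
# New York matches the 5% and 32% figures used in the example above.
records = [
    ("New York", "Cancer", 320),
    ("New York", "Unintentional Injuries", 50),
    ("New York", "Heart Disease", 630),
    ("California", "Cancer", 200),
    ("California", "Unintentional Injuries", 100),
    ("California", "Heart Disease", 700),
]


def cause_share_by_state(records, selected_cause):
    """Return the selected cause's share of all listed deaths in each state."""
    # Step 1: state totals over every cause, computed before any filtering.
    state_totals = defaultdict(int)
    for state, _cause, deaths in records:
        state_totals[state] += deaths
    # Step 2: only now restrict the rows to the selected cause.
    return {
        state: 100 * deaths / state_totals[state]
        for state, cause, deaths in records
        if cause == selected_cause
    }


injuries = cause_share_by_state(records, "Unintentional Injuries")
cancer = cause_share_by_state(records, "Cancer")
```

If the filter were applied first, each state's total would contain only the selected cause and every state would come out at 100%, which is the kind of result that makes the map uninformative.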
The last type of visualization I tried to create was a graph with circles representing the total number of deaths in a given state per year. This graph was going to sit next to what is now the only bottom visualization, to give users more information about the total number of deaths in the year they were looking at. I used circles for two reasons, the first being that my scale did not start at zero. I did not want to give the impression that these were absolute numbers, as a line or bar graph might indicate. Second, I wanted to prevent users from looking for direct patterns in the number of deaths per year. Another line graph might suggest that the numbers of deaths in different years were related to each other in a direct way, which is not necessarily the case. I decided not to include this visualization in my final dashboard because placing it within the dashboard made all the visualizations almost impossible to read.
The fifth step was the development of a dashboard, which was altered continuously throughout the creation process. Although I intended to have at least three visualizations within the dashboard, I decided that many might be overwhelming for users and difficult to read. To make the dashboard more legible and comprehensible, I limited it to the map filter and the percentage-of-deaths line graph below it. I do feel that the map could have been more powerful had the filtering and calculations I wanted to complete been possible.
Results & Discussion
The final dashboard produced during this lab has two main features. The top is a clickable choropleth map of the United States, with the hue of each state corresponding to that state’s share of total deaths in the country. For example, California, where approximately 9.9% of all deaths recorded in this dataset occur, is a dark blue, while North Dakota, with approximately 0.2% of the total deaths, is a much lighter blue. This top map acts as a filter for the bottom visualization, a line graph that depicts the percentage of deaths per year in each state for the top ten causes of death in the US. Users can hover over years on each line to see the exact percentage of deaths in that state in the given year that had that particular cause. Percentages relative to the yearly state totals were used in this bottom graph to account for population variation between states and across years, which would otherwise produce disproportionate death totals between states and over time.
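The map’s shading can be approximated with a short Python sketch. The state totals and the national total below are invented (chosen so that California and North Dakota land near the 9.9% and 0.2% figures above), and the linear min-max scaling is an assumption made for illustration; it is not a description of how Tableau Public assigns colors.

```python
# Hypothetical death totals; the national total is assumed to include
# states that are not listed here.
state_totals = {"California": 990, "North Dakota": 20, "New York": 600}
national_total = 10_000


def shares_and_intensity(state_totals, national_total):
    """Return each state's percentage of national deaths and a 0-1 shade."""
    shares = {
        state: 100 * deaths / national_total
        for state, deaths in state_totals.items()
    }
    # Assumed linear scaling: the smallest share maps to 0.0 (lightest)
    # and the largest share maps to 1.0 (darkest).
    low, high = min(shares.values()), max(shares.values())
    span = high - low
    intensity = {
        state: (share - low) / span if span else 0.0
        for state, share in shares.items()
    }
    return shares, intensity


shares, intensity = shares_and_intensity(state_totals, national_total)
```

Under this scaling the shade mostly tracks population, since more populous states record more deaths; that is the reason the bottom graph uses within-state percentages instead.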
I limited my dashboard to two items because more items seemed to be confusing. I tried adding a line graph of the total deaths for each cause of death each year next to the bottom line graph, but this made each of these graphs too narrow and caused distortion and unclear labels. A similar approach I attempted was to create a graph with circles that lined up the shapes to represent the total number of deaths in each state, but it also became too narrow to include. This issue relates to a complaint I have about Tableau Public: unadjustable dashboard size. I recognize that perhaps the dashboard is supposed to be limited in size, but the small width of the dashboard made it difficult to have even two visualizations across. I would like to have more freedom in changing the size of the dashboard, or if it is available and I did not see it, for more obvious icons or directions in changing the size.
This leads me to another issue I have with Tableau Public, which is that this program is not very intuitive. Although I can eventually get to the rough final product that I wanted and it can produce a fairly clean interface, reaching this endpoint can be difficult for the novice user. I have had experience with Tableau Public in a previous class, so I had some sense of what could be done to create the visualization that I wanted. However, to someone who is unfamiliar, knowing to drag and drop fields over the “color” element to color-code categories may not be intuitive. I personally was not sure how to create a geographic visualization initially and had to search to find out more about aligning the rows and columns with latitude and longitude. This tool can be powerful, but it is difficult to wield that power if the buttons and interface are unclear.
The final concern I would like to mention about Tableau Public is that some of the data calculations may be impossible due to the way it filters data. As noted above, I wanted to filter the map by cause of death and display a hue that represented the percentage of the total deaths in that state that corresponded to the cause of death selected. Tableau Public seemed incapable of performing the calculations necessary after having the cause of death filter applied to it. This limitation made it difficult for me to have my map convey the information I desired, although, admittedly, it would be difficult to explain the intricacies to users if I could.
Ultimately, Tableau Public worked for the bulk of this project. It is a powerful tool and allowed me to create interesting and comprehensible visualizations with my data. If I could make changes, I would like to see more space in the dashboard, a more intuitive interface, and more complex filtering options. However, it serves its purpose well when users know what it is capable of and know how to manipulate it.
Future Directions
In the future, I would be interested in finding a way to get more space on my dashboard to include more items for the user. A graph of circles corresponding to the total number of deaths in a given year per state could be a useful comparison when considering the percentages represented in the visualization. The lack of space made it difficult to successfully implement this vision. I also would try to focus on a specific cause of death to take advantage of the highlighting capabilities of visualizations and to make a particular point. For this lab, I went in thinking more about the visualizations I wanted to create with my data than about the point I wanted to get across. Next time, I will be sure to do this the other way around. Additionally, in terms of my data, I would include more data about the population of each state per year. This would require research into census data and intercensal estimates like the ones the NCHS uses for its age-adjusted death rates, but it could make the creation of the map I intended easier. Finally, it could be interesting to incorporate more age data into this dataset. Many factors go into causes of death, and I think users may be interested in seeing how age may affect the results from this data.
References
Few, S. (2009). Now you see it: Simple visualization techniques for quantitative analysis. Oakland, CA: Analytics Press.
Tufte, E. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.