Malaria Cases 2000-2012


Lab Reports

Introduction

While global health data is abundant, good visualizations of that data are severely lacking. This gap makes it difficult for policy makers to make the best decisions when distributing resources at both the global and the local level. This is an area that I wish to pursue further in the future, so practicing the creation of narratives through data visualization with Tableau Public was a valuable learning experience.

Tableau Public is data visualization software that lets users select specific variables and plot them in different visualizations. These visualizations can then be combined in a dashboard, which is useful for giving a wider view of a topic or for showing specific trends in relation to one another. The visualizations are interactive, so users can manipulate them to highlight a specific variable or trend.

Creating a dashboard with interactive data is an introduction to the ways visualizations can be displayed together to create a picture or tell a story. Dashboards of this kind are often used in reports and presentations.

Inspiration

I chose a data set from the World Health Organization about reported cases of malaria, a disease transmitted through mosquito bites. Malaria is a preventable and easily curable disease, so its presence is indicative of larger infrastructural problems within a country and a lack of basic resources. For inspiration, I looked at other visualizations of health, disease and epidemiological data, particularly ones that compared trends over time and incorporated maps. Because the data that I used was organized by country over multiple years, I wanted to create a small multiples visualization that would show the trend in malaria cases over time for each country. I also wanted to choose specific years in which to compare the number of malaria cases across all of the countries.

However, when looking for inspiration, I found few examples of small multiples that could serve as a model for showing this data. I did find a Tableau project made specifically to visualize malaria, its transmission, and trends in reported malaria cases.

I found that many of the visualizations in this Tableau project were quite cluttered and were useful mainly to someone already familiar with specific variables, such as government versus local reporting of cases and the factors which influence the spread of malaria. Therefore, I knew that I wanted my visualizations to focus on a small number of variables and to include little text.

 

 

This visualization is simple and straightforward. It uses color effectively and suggests some interesting trends about the reporting and confirmation of malaria cases over time.

This visualization attempted to show multiple trends at the same time. Interactivity on this visualization makes the information easier to read, but I would say that this visualization is an example of trying to do too many things, especially with the inclusion of two variables on the y-axis.

While this visualization incorporates the mapping element that I hoped to include in my visualization, it includes a lot of small text. Also, the connection between the regions and the trends per country, represented on the bottom of the visualization, is confusing. I think color was used effectively here and the element of representing countries with high rates of malaria and their distance from the equator is interesting.

Materials

This lab required a dataset which met four requirements:

  • 1000+ records/rows (i.e., at least three orders of magnitude)
  • 1+ quantitative dimensions
  • 1+ categorical dimensions
  • historical (has time-oriented data)
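The four requirements can be checked mechanically before committing to a dataset. The following is a minimal sketch in Python and was not part of the original lab workflow. The function name check_dataset and the column names Country, Year and Cases are assumptions made for illustration, and the sample rows are toy values rather than WHO figures.

```python
def is_number(value):
    """Return True if the value can be parsed as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def check_dataset(rows, quantitative, categorical, time_field, min_rows=1000):
    """Check a list of row dicts against the four lab requirements."""
    return {
        "enough_rows": len(rows) >= min_rows,
        # At least one value in the column parses as a number.
        "has_quantitative": any(is_number(r.get(quantitative)) for r in rows),
        # At least one value in the column is non-empty, non-numeric text.
        "has_categorical": any(
            bool(r.get(categorical)) and not is_number(r.get(categorical))
            for r in rows
        ),
        # More than one distinct time value is present.
        "has_time": len({r.get(time_field) for r in rows}) > 1,
    }


# Toy rows (hypothetical values, not taken from the WHO dataset).
sample = [
    {"Country": "A", "Year": "2000", "Cases": "10"},
    {"Country": "B", "Year": "2001", "Cases": "25"},
]
report = check_dataset(sample, "Cases", "Country", "Year")
print(report)
```

With only two toy rows the row-count check fails while the other three pass; a real export would be read from the CSV file first.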

For my dataset, I browsed the United Nations data portal and located a dataset from the World Health Organization entitled “Malaria- number of reported confirmed cases.” The dataset covers 99 countries and areas, with the number of cases collected over 13 years (2000-2012), for a total of 1,091 records.

OpenRefine (formerly Google Refine), an application for data cleanup, was used to check the data set and clarify any variables. Tableau Public, a free data visualization program, was used to ingest the data set and to create data visualizations.

Methods

My goal was to create three kinds of visualization: a line graph illustrating the trend in cases over time, an area map or choropleth map comparing the number of cases per country, and a small multiples visualization comparing malaria case counts over time across all countries.

After downloading the dataset as a CSV (Comma-Separated Values) file, I loaded it into OpenRefine. Here, I was able to see whether there were any unnecessary columns and to change any variable names to plain language.
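The same cleanup step, dropping unneeded columns and renaming the rest, can be sketched outside of the cleanup application. This is an illustrative Python version only; the header names in the sample text and the plain-language names they are mapped to are hypothetical, since the report does not list the actual columns.

```python
import csv
import io


def clean_rows(csv_text, rename, drop):
    """Parse CSV text, drop unwanted columns, and rename the rest."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        cleaned.append(
            {
                rename.get(name, name): value
                for name, value in row.items()
                if name not in drop
            }
        )
    return cleaned


# Hypothetical header and values, for illustration only.
raw = "Country or Area,Year(s),Value,Value Footnotes\nA,2000,10,\nB,2001,25,\n"
rows = clean_rows(
    raw,
    rename={"Country or Area": "Country", "Year(s)": "Year", "Value": "Cases"},
    drop={"Value Footnotes"},
)
print(rows[0])
```

Keeping the rename and drop choices as explicit arguments makes the cleanup repeatable if the dataset is downloaded again later.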

I then loaded the dataset into the Tableau Public application. The first visualization that I created was a variation of small multiples in which all of the countries and regions were represented and the number of cases over time was shown for each. This visualization was not included in the final dashboard because it was too large, and fitting it into the dashboard would have cluttered the information unnecessarily.

I created a choropleth map (sometimes called a heat map) which represented the regions within my dataset and the number of cases. This visualization includes an interactive element which allows users to change the year and see how the number of malaria cases in each region changes over time. I chose orange as an alternative to red. I wanted the color to convey a sense of urgency without becoming overwhelming. One issue that I ran into with this visualization was that the range of cases was very large, which means that the colors associated with different numbers of cases are sometimes hard to distinguish: the orange for one case is very close to the orange for 10,000 cases. However, the visualization does emphasize the increase in the number of cases, and it is possible to distinguish trends.
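The color problem follows from mapping a very wide range of counts onto a linear scale. One common remedy is a logarithmic scale. The sketch below is plain Python arithmetic, not a Tableau feature or the method used in this lab, and the range of 1 to 1,000,000 cases is a hypothetical example rather than the dataset's actual extremes.

```python
import math


def linear_intensity(cases, low, high):
    """Map a case count to a 0-1 color intensity on a linear scale."""
    return (cases - low) / (high - low)


def log_intensity(cases, low, high):
    """Map a case count to a 0-1 color intensity on a base-10 log scale."""
    # Clamp so counts below `low` (including zero) do not break log10.
    cases = max(cases, low)
    span = math.log10(high) - math.log10(low)
    return (math.log10(cases) - math.log10(low)) / span


LOW, HIGH = 1, 1_000_000  # hypothetical range, not the dataset's real extremes

for cases in (1, 10_000, 1_000_000):
    print(
        cases,
        round(linear_intensity(cases, LOW, HIGH), 3),
        round(log_intensity(cases, LOW, HIGH), 3),
    )
```

Under these assumed bounds, 10,000 cases sits at about 1% of the linear range but about two thirds of the log range, so low and middle counts would receive visibly different shades.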

The next visualization that I created was a line graph filled in with color. It shows the total number of cases for each of the years surveyed, which allows the overall trend to be represented clearly. I used the same orange as in the choropleth map in order to maintain continuity and readability.
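The aggregation behind this graph is a sum of cases per year across all countries. As a rough sketch of that calculation in Python (the field names Year and Cases and the toy rows are assumptions for illustration, not values from the dataset):

```python
from collections import defaultdict


def total_cases_by_year(rows):
    """Sum the case counts of all countries for each year."""
    totals = defaultdict(int)
    for row in rows:
        totals[int(row["Year"])] += int(row["Cases"])
    # Return a plain dict ordered by year for plotting.
    return dict(sorted(totals.items()))


# Toy rows (hypothetical values).
rows = [
    {"Country": "A", "Year": "2000", "Cases": "10"},
    {"Country": "B", "Year": "2000", "Cases": "5"},
    {"Country": "A", "Year": "2001", "Cases": "30"},
]
totals = total_cases_by_year(rows)
print(totals)
```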

The final visualization that I created drew from my original small multiples visualization, in which I was able to see that the number of malaria cases spiked in a specific year for certain countries. To highlight these extreme trends, I created a line graph showing the four countries with the most drastic change in reported malaria cases. I chose four distinct colors that also related to the orange used in the choropleth map.
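In the lab these four countries were picked out by eye from the small multiples. A programmatic version needs a definition of "most drastic change"; the sketch below assumes it means the largest absolute difference between consecutive recorded years, which is only one possible reading. The function name largest_jumps and the toy figures are hypothetical.

```python
def largest_jumps(cases_by_country, top_n=4):
    """Rank countries by their largest absolute change between recorded years."""
    biggest = {}
    for country, by_year in cases_by_country.items():
        years = sorted(by_year)
        # Compare each recorded year with the next recorded year.
        changes = [
            abs(by_year[later] - by_year[earlier])
            for earlier, later in zip(years, years[1:])
        ]
        if changes:
            biggest[country] = max(changes)
    return sorted(biggest, key=biggest.get, reverse=True)[:top_n]


# Toy figures (hypothetical, not WHO data).
toy = {
    "A": {2000: 10, 2001: 12, 2002: 500},
    "B": {2000: 40, 2001: 45, 2002: 50},
    "C": {2000: 5, 2001: 300, 2002: 310},
}
ranked = largest_jumps(toy, top_n=2)
print(ranked)
```

Because missing years are skipped rather than filled in, a gap in a country's reporting is treated as a single step, which can exaggerate a change.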

The goal of the dashboard was to use the visualizations to create a clear narrative or story. I made the map the largest element because place is a very powerful component of this narrative. The map is also the most interactive element, and it makes it easy to see which countries have the strongest trends in cases over time.

The filled-in line graph and the multi-country line graph were made equal in size, as both carry powerful pieces of the narrative that the map introduces. Depending on their interest, users can see both that malaria cases are rising overall and which particular countries have experienced the most drastic increase in cases. The narrative created by the dashboard allows specific questions and conjectures to be made about the causes of this trend, about why these particular countries had the most drastic increase in the number of cases, and about the years in which those cases spiked.

Because of what I learned from the visualizations that I used for inspiration, I wanted to avoid an abundance of text. My goal was to have the visualizations speak for themselves, so that a narrative and clarity of subject matter would arise from the visualizations and their interactive features.

Results/Discussion

My Dashboard

This was an instance in which trends that were unreadable and hard to interpret in the form of a dataset suddenly became very apparent in the form of a graph. Further research revealed that the countries with the most drastic trends were undergoing civil conflict, which decreases access to basic resources as fundamental infrastructure collapses.

Creating visualizations with Tableau Public requires a lot of tinkering, and not all of its capabilities are intuitive. Overall, however, I am satisfied that I met my goal of creating a narrative through data visualizations.

Future Directions

I think that adding more narrative elements in an expanded dashboard might make it more effective. I would also like to make the choropleth map more successful by adjusting the color range so that it is easier to distinguish between countries that have a lower number of malaria cases.

According to the usability testing that I conducted with this dashboard, the map was the strongest element. While the color opacity made it hard to distinguish values at a more granular level, the clear indicators of “a lot of cases” and “a low number of cases” were obvious and helped my user complete the usability task that I set.

I would also consider adding information about the potential causes of the rise in malaria cases so that the trends can be connected to specific events, such as civil conflicts. This would give a more complete narrative.

References:

World Health Organization. (Jul 2014). Malaria- number of reported confirmed cases. [data file]. Retrieved from http://data.un.org/Data.aspx?d=WHO&f=MEASURE_CODE%3aWHS3_48