Introduction
Infant mortality is the death of children under the age of one. The infant mortality rate (IMR) is the number of infant deaths per 1,000 live births. For a given region, it is the number of children who die before their first birthday during a year, divided by the number of live births during that year, multiplied by 1,000.
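As a rough sketch, the definition above can be written as a small Python function. The function name and the sample counts are made up for illustration and do not come from either dataset.

```python
def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """Return infant deaths per 1,000 live births for one region and year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    if infant_deaths < 0:
        raise ValueError("infant_deaths must be non-negative")
    return infant_deaths / live_births * 1000


# Made-up counts for illustration only: 27 deaths among 5,000 live births.
rate = infant_mortality_rate(27, 5000)
print(round(rate, 2))  # prints 5.4
```

A rate written as 5.4‰ therefore means about 5.4 infant deaths for every 1,000 live births.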
The infant mortality rate is an important indicator of the overall health of a society. It also reflects the social, economic, and environmental conditions in which children live. The causes of infant mortality vary from country to country and include poor sanitation, poor water quality, malnutrition of the mother and infant, and inadequate prenatal and medical care.
Inspiration
This Infant Mortality Rate dataset caught my attention while I was browsing Kaggle. It contains global IMR data from 2009 to 2019, and I noticed great differences among countries. I searched for news about the IMR in my own country and found that it has made great efforts to lower this number. The IMR in my country reached 200‰ before the nation was founded in 1949, and it had dropped to 5.4‰ by 2021. So I was quite interested in analyzing this dataset to see how conditions compare around the world.
Tools & Process
Datasets: Kaggle & CDC
My main dataset came from Kaggle, an online community where users can find and publish datasets. The data originally comes from UNICEF and was pre-processed by the publisher. While looking for material on this topic, I also found that the CDC publishes IMR data by state, so I downloaded that dataset as well to compare conditions among the U.S. states.
Links to the datasets: https://www.kaggle.com/komalkhetlani/infant-mortality
https://www.cdc.gov/nchs/pressroom/sosmap/infant_mortality_rates/infant_mortality.htm
Data Cleanup: OpenRefine
OpenRefine is a powerful tool for working with messy data. Although the dataset I downloaded from Kaggle was pre-processed, I still used “Facet” on the name column to group its values and double-check them. I found that this column contained not only country names but also regions and income groups. OpenRefine does not seem to be able to detect whether a value is a country name, and I had no idea how to find and filter out the other values quickly. This problem was later solved in Tableau.
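One way to separate such rows outside of either tool is a simple keyword check, sketched below in Python. This is only an illustration: the keyword list, the function names, and the sample names are assumptions, and the labels in the real dataset may differ.

```python
# Hypothetical keywords for aggregate rows; the real dataset's labels may differ.
AGGREGATE_KEYWORDS = ("income", "world", "region", "countries")


def looks_like_aggregate(name: str) -> bool:
    """Guess whether a row name is a region or income group, not a country."""
    lowered = name.lower()
    return any(keyword in lowered for keyword in AGGREGATE_KEYWORDS)


def split_names(names):
    """Split names into (probable countries, probable aggregates)."""
    countries, aggregates = [], []
    for name in names:
        (aggregates if looks_like_aggregate(name) else countries).append(name)
    return countries, aggregates


# Illustrative names only, not copied from the dataset.
sample = ["Cyprus", "San Marino", "Low income", "World", "Least developed countries"]
countries, aggregates = split_names(sample)
print(countries)
print(aggregates)
```

A keyword heuristic like this can misfire on unusual names, so checking each value against an explicit reference list of country names would likely be more reliable.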
Data Visualization: Tableau
Tableau is a visual analytics platform that helps people see and understand data. It is also pretty smart. When I loaded my dataset into Tableau and chose a map view, I was surprised that it flagged 35 unknown values that were not country names. I needed only one click to filter these out, which was very convenient. Tableau is very powerful. Although it took me some time to learn, I was able to use its basic functions after several tries.
Results
I used two datasets to make four visualizations, including a bar chart, a line chart, and maps.
1. Global Infant Mortality Rate (2009-2019)
This map shows the IMR of most countries around the world. It steps through the years using Tableau's Pages feature. The map shows that the global IMR decreased over this period, and the trend suggests it may continue to decrease in the future.
2. Top 5 & Bottom 5 countries in Infant Mortality Rate (2019)
This visualization shows the top 5 countries (lowest IMR) and the bottom 5 countries (highest IMR), along with their positions on the world map. There are wide disparities around the world. The bottom 5 countries are all in Africa, and all of them are developing countries, while the top 5 countries are mostly in Europe. To be honest, I had never heard of Cyprus or San Marino before. I looked them up and learned that both are highly developed countries with very small populations. Healthcare in these countries is of a very high standard, and that is why their IMR is very low. This fact really opened my eyes.
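The selection behind a chart like this can be sketched in a few lines of Python. The country names and IMR values below are placeholders invented for illustration, not figures from the dataset, and the function name is hypothetical.

```python
import heapq

# Placeholder (name, IMR per 1,000 live births) pairs; values are made up.
imr_2019 = [
    ("Country A", 1.5), ("Country B", 2.0), ("Country C", 2.4),
    ("Country D", 3.1), ("Country E", 4.8), ("Country F", 12.0),
    ("Country G", 35.5), ("Country H", 60.2), ("Country I", 71.9),
    ("Country J", 80.3), ("Country K", 81.0), ("Country L", 6.6),
]


def lowest_and_highest(rows, n=5):
    """Return the n lowest-IMR rows and the n highest-IMR rows."""
    lowest = heapq.nsmallest(n, rows, key=lambda row: row[1])
    highest = heapq.nlargest(n, rows, key=lambda row: row[1])
    return lowest, highest


top5, bottom5 = lowest_and_highest(imr_2019)
print([name for name, _ in top5])
print([name for name, _ in bottom5])
```

Here "top" means the lowest IMR and "bottom" means the highest, matching the way the two groups are described above.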
3. The U.S. Infant Mortality Rate by States (2019)
Overall, the IMR in the United States is lower than in most countries, but disparities in IMR exist among states. In 2019, New Hampshire had the lowest infant mortality rate (3.07 deaths per 1,000 live births), while Mississippi had the highest (9.07).
4. Infant Mortality Rate by Income Groups (2009-2019)
The World Bank classifies the world’s economies into four income groups: low, lower-middle, upper-middle, and high income. The line chart shows that IMR follows the same order: the higher a group's income, the lower its infant mortality rate.
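The pattern in the line chart can also be checked programmatically. The Python sketch below tests whether IMR falls at every step from the lowest to the highest income group. The group label spellings are an assumption about how the dataset names them, and the values are placeholders, not real figures.

```python
# Income groups from lowest to highest income. The label spelling is an assumption.
INCOME_ORDER = ["Low income", "Lower middle income", "Upper middle income", "High income"]


def follows_income_rank(imr_by_group):
    """Return True if IMR strictly falls as income rises across the four groups."""
    values = [imr_by_group[group] for group in INCOME_ORDER]
    return all(poorer > richer for poorer, richer in zip(values, values[1:]))


# Placeholder IMR values per 1,000 live births, keyed by year. Values are made up.
sample = {
    2009: {"Low income": 60.0, "Lower middle income": 45.0,
           "Upper middle income": 18.0, "High income": 6.0},
    2019: {"Low income": 48.0, "Lower middle income": 36.0,
           "Upper middle income": 11.0, "High income": 5.0},
}

for year, groups in sorted(sample.items()):
    print(year, follows_income_rank(groups))
```

Running the same check on the real income-group rows for each year from 2009 to 2019 would confirm whether the ordering holds throughout the period.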
Reflection
I got some good suggestions from my reviewer. For example, I had not realized that viewers might be confused about how the World Bank classifies countries by income. So I added an explanation to my report to make it more understandable.
As for the tools I used in this project, since the dataset I downloaded from Kaggle was already well processed, I did not have to use the advanced features of OpenRefine. Although I only used “Facet” to group the values of the name column, I still found that this column contained more than just country names. I think it would be better if OpenRefine were able to detect country names and let me filter out the unknown values, as Tableau does.
As for Tableau, I had never used it before. From my point of view as a UX designer, it is very hard to make software with tons of functions feel easy to use, and there is no doubt that Tableau managed it. Users only need to drag and drop data onto certain areas, which is pretty intuitive, and it produces good-looking visualizations automatically. But I did run into one problem when using Tableau. I initially wanted to show the IMR ranking by year as an animated bar chart and then highlight it on the world map. I thought it would look cool because I had seen similar examples presenting COVID-19 data. I found a tutorial on the internet that required me to write a formula. I followed the tutorial, but it failed for some reason, probably because the dataset needed further processing.
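The kind of further processing that might help is computing each country's rank within each year before loading the data into a charting tool. The Python sketch below shows one possible way to do that. It is not the formula from the tutorial, and the row layout, function name, and values are all assumptions made up for illustration.

```python
from collections import defaultdict


def rank_by_year(rows):
    """Add a per-year rank (1 = lowest IMR) to (country, year, imr) rows."""
    by_year = defaultdict(list)
    for country, year, imr in rows:
        by_year[year].append((country, imr))

    ranked = []
    for year in sorted(by_year):
        # Sort each year's countries from lowest to highest IMR.
        ordered = sorted(by_year[year], key=lambda pair: pair[1])
        for position, (country, imr) in enumerate(ordered, start=1):
            ranked.append((country, year, imr, position))
    return ranked


# Placeholder rows; names and values are made up.
rows = [
    ("Country A", 2018, 3.0), ("Country B", 2018, 2.5), ("Country C", 2018, 40.0),
    ("Country A", 2019, 2.4), ("Country B", 2019, 2.6), ("Country C", 2019, 38.0),
]
for row in rank_by_year(rows):
    print(row)
```

Tied values simply get consecutive positions in their input order here, so a real ranking would need an explicit rule for ties.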
Overall, OpenRefine and Tableau are both powerful tools. Since they are designed for professional users, I needed to spend some time learning and understanding their basic functions. If I want to use them to process more complicated datasets and make more interesting visualizations, I will need to spend more time digging into their advanced functions.
References
https://www.britannica.com/science/infant-mortality-rate
https://www.cdc.gov/reproductivehealth/maternalinfanthealth/infantmortality.htm