“Visualization with R Studio about world refugee displacement from 1979 – 2022”


Afghan refugees. Image source: UNHCR website

Introduction

For this lab, I analyzed a dataset from UNHCR (the United Nations Refugee Agency), which I found through the TidyTuesday website. I chose it to have a larger set of data that is clean, consistent, and compliant with the assignment requirements. The dataset is distributed as its own R package, which I installed in R Studio.

Social sciences have always been a personal passion of mine. I was born and raised in Venezuela, a country from which, unfortunately, a large part of the population is fleeing to seek refugee and asylum status in other countries in the Americas and Europe. It is important for me to understand (and visualize) how people from different countries were displaced in the past and how displacement is happening in the present.

I looked through the visualization options in the R Graph Gallery to get a sense of which graphs would suit the data, and decided to use basic bar plots, which easily combine one categorical dimension and one quantitative dimension in a single visualization.

Materials and Methodology

I approached this assignment with mixed methods. I focused on the data frame first, importing the R package into R Studio and reading the data dictionary provided by UNHCR.

The dataset has a total of 120,338 entries and 16 columns. I ran glimpse() in R to understand the data and information collected in this data frame.

year: Calendar year the data was registered
coo_name: Country of origin name
coo: Country of origin UNHCR code
coo_iso: Country of origin ISO code
coa_name: Country of asylum name
coa: Country of asylum UNHCR code
coa_iso: Country of asylum ISO code
refugees: The number of refugees
asylum_seekers: The number of asylum-seekers
returned_refugees: Refugees who have returned home within the previous year
idps: The number of internally displaced persons
returned_idps: The number of returned internally displaced persons
stateless: The number of stateless persons
ooc: The number of others of concern to UNHCR
oip: The number of other people in need of international protection
hst: The number of host community members
Understanding data dictionary
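
The loading and inspection step can be sketched as follows. This is a minimal sketch, not the exact code used for the lab: it assumes the data is the `population` table shipped with the `refugees` CRAN package, and it falls back to a tiny made-up data frame (not UNHCR figures) with a few of the same column names when that package is not installed, so the snippet still runs.

```r
library(dplyr)

# Assumption: the UNHCR data lives in refugees::population.
if (requireNamespace("refugees", quietly = TRUE)) {
  population <- refugees::population
} else {
  # Fallback: made-up placeholder rows, only to show the shape of the data.
  population <- tibble(
    year     = c(2021, 2022),
    coo_name = c("Country A", "Country B"),
    coa_name = c("Country C", "Country D"),
    refugees = c(100, 250)
  )
}

# glimpse() prints one line per column with its type and first values.
glimpse(population)

# Row and column counts, to compare against the figures quoted above.
dim(population)
```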

Second, I expanded my knowledge and understanding by reading articles about refugees around the world, including:

Refugee crisis in NYC by The New York Times

International Rescue Committee (NYC)

United Nations Migrants and Refugees News

After reviewing all these materials, I was able to come up with three research questions:

Question #1: What are the countries that have the highest refugee population in the present?

Question #2: What’s the year that records the most displacement of refugees?

Question #3: What are the social implications of having the highest population of refugees within a certain year?

Visualizations

For the first visualization, I filtered the refugee data frame to 2022 (the most recent year in the dataset) and kept the 20 countries with the most refugees. This lets me visualize the highest numbers of refugees per country around the world, including countries such as Venezuela, Ukraine, Syria, Sudan, and Afghanistan, and helped me answer Question #1.

Visualization #1: List of first 20 Countries with the most refugees in 2022
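
A chart like this one could be produced roughly as follows. This is a sketch under assumptions, not the original lab code: `top_origins()` is a hypothetical helper, the column names follow the data dictionary, and the small `toy` data frame holds made-up numbers so the snippet runs on its own. With the real data, the package's data frame would be passed in place of `toy`, with `n_countries = 20`.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: total refugees per country of origin in one year,
# keeping the n_countries largest totals.
top_origins <- function(data, target_year, n_countries = 20) {
  data %>%
    filter(year == target_year) %>%
    group_by(coo_name) %>%
    summarise(refugees = sum(refugees, na.rm = TRUE), .groups = "drop") %>%
    slice_max(refugees, n = n_countries, with_ties = FALSE) %>%
    arrange(desc(refugees))
}

# Made-up placeholder rows (one row per origin/asylum pair), not UNHCR data.
toy <- tibble(
  year     = c(2022, 2022, 2022, 2022, 1990),
  coo_name = c("Country A", "Country A", "Country B", "Country C", "Country A"),
  coa_name = c("Country X", "Country Y", "Country X", "Country Y", "Country X"),
  refugees = c(10, 5, 30, 20, 99)
)

top2 <- top_origins(toy, 2022, n_countries = 2)

# Horizontal bars, ordered so the largest total is at the top.
p1 <- ggplot(top2, aes(x = refugees, y = reorder(coo_name, refugees))) +
  geom_col() +
  labs(x = "Refugees", y = "Country of origin")
print(p1)
```

Summing within each country of origin matters because each row in the dataset describes one origin and asylum pair, so a single country of origin appears on many rows per year.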

I then made a second visualization with data collected only in 1990. I selected this year because the NY Times article I read stated that the refugee crisis started spiking that year. This helped me address Question #2: Afghanistan had the highest value, with approximately 650,000 refugees that year.

Visualization #2: Countries that registered the highest number of refugees in 1990 (Afghanistan)
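
Question #2 could also be checked against the data directly by totaling refugees per year and picking the largest total. The sketch below shows that alternative; it is not the approach used in this lab, which took 1990 from the article. `peak_year()` is a hypothetical helper and `toy` contains made-up numbers, with column names assumed from the data dictionary.

```r
library(dplyr)

# Hypothetical helper: the single year with the largest total refugee count.
peak_year <- function(data) {
  data %>%
    group_by(year) %>%
    summarise(refugees = sum(refugees, na.rm = TRUE), .groups = "drop") %>%
    slice_max(refugees, n = 1, with_ties = FALSE)
}

# Made-up placeholder rows, not UNHCR data.
toy <- tibble(
  year     = c(1989, 1989, 1990, 1990, 1991),
  coo_name = c("Country A", "Country B", "Country A", "Country B", "Country A"),
  refugees = c(100, 50, 120, 80, 90)
)

peak <- peak_year(toy)
print(peak)
```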

It is visible that Afghanistan is the country with the highest number of refugees. To understand more about the socio-political situation of Afghan refugees, I created a third bar chart, which helped me understand and answer my third question.

Visualization #3: Understanding refugee count from Afghanistan from 1979-2022

In this bar chart, the highest bar for refugees from Afghanistan falls around 1988, with more than 600,000 people. This visualization also helps me answer Question #2.
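
A per-year chart for a single country of origin could be built along these lines. This is a minimal sketch under assumptions: `origin_by_year()` is a hypothetical helper, the column names come from the data dictionary, and the counts in `toy` are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: yearly refugee totals for one country of origin.
origin_by_year <- function(data, origin) {
  data %>%
    filter(coo_name == origin) %>%
    group_by(year) %>%
    summarise(refugees = sum(refugees, na.rm = TRUE), .groups = "drop") %>%
    arrange(year)
}

# Made-up placeholder rows, not UNHCR data.
toy <- tibble(
  year     = c(1979, 1979, 1980, 1980),
  coo_name = c("Afghanistan", "Afghanistan", "Afghanistan", "Country B"),
  coa_name = c("Country X", "Country Y", "Country X", "Country X"),
  refugees = c(10, 20, 40, 99)
)

afg <- origin_by_year(toy, "Afghanistan")

# One bar per year; the tallest bar marks the peak year for this origin.
p3 <- ggplot(afg, aes(x = year, y = refugees)) +
  geom_col() +
  labs(x = "Year", y = "Refugees from Afghanistan")
print(p3)
```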


To support my visualization, I also looked for readings that would explain the Afghan refugee crisis and displacement history, to obtain more context on what has been happening in the past two years.

On UNHCR’s website I found an article stating that the “Taliban’s takeover of Kabul in August 2021 intensified instability and violence in Afghanistan – causing even more human suffering and displacement”. This supports the high bars that appear again in the dataset around 2022 as a social impact.

Reflection and Improvements

Creating visualizations in R Studio was complex. Next time, when framing the methodology, I will start by considering the “bigger picture” so that I can use more and different types of visualizations. I will also spend more time on the ggplot2 code and add more details such as spacing, color palette, labels, and text size.

It was very helpful to use a dataset that came from a data science-oriented non-profit, which included an R package for installation, a data dictionary, and visualization guidelines. I will consider using a data frame with pre-established guidelines like these for future assignments.
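
The kinds of details mentioned above (spacing, color, labels, text size) map onto a few standard ggplot2 layers. The sketch below shows one possible combination on made-up data; the color, sizes, and margins are arbitrary choices for illustration, not settings used in this lab.

```r
library(ggplot2)

# Made-up placeholder values, not UNHCR data.
toy <- data.frame(
  coo_name = c("Country A", "Country B", "Country C"),
  refugees = c(30000, 20000, 15000)
)

p <- ggplot(toy, aes(x = refugees, y = reorder(coo_name, refugees))) +
  geom_col(fill = "#0072BC", width = 0.7) +       # bar color and spacing
  scale_x_continuous(labels = scales::comma) +    # 30,000 instead of 3e+04
  labs(
    title = "Refugees by country of origin",
    x = "Refugees",
    y = NULL,
    caption = "Made-up values for illustration"
  ) +
  theme_minimal(base_size = 12) +                 # overall text size
  theme(
    plot.title  = element_text(size = 14, face = "bold"),
    axis.text.y = element_text(size = 10),
    plot.margin = margin(10, 20, 10, 10)          # top, right, bottom, left
  )
print(p)
```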

Resources

GitHub, Bar Charts ggplot2, (n.d.) ggplot2.tidyverse.org/reference/geom_bar.html.

UNHCR, The UN Refugee Agency. Refugee Data Finder 110 Million Forcibly Displaced People Worldwide, (n.d.) www.unhcr.org/refugee-statistics/.

Galal, Hisham, et al. UNHCR Refugee Population Statistics Database, (26 Oct. 2023) https://cran.r-project.org/web/packages/refugees/refugees.pdf

UNHCR, The UN Refugee Agency. About Afghanistan, (n.d.) www.unhcr.org/refugee-statistics/.