A Simple Cost-Benefit Analysis of Refugee Movements

April 12, 2018 - All

Visualizations here: https://evanvolow.carto.com/builder/e68ddab2-14f4-440b-8546-09206070ee29/embed

For our Carto lab, I vacillated a bit over which topic to explore; chalk it up to not having undertaken a geospatial project in a while. I narrowed it down to two datasets. WHO has data on the number of cases of multidrug-resistant tuberculosis by country, which I wanted to compare with their data on external health expenditure per capita. I hypothesized that there may be an “uncanny valley” in the amount of health aid countries receive, where inconsistent access to antibiotics contributes to drug resistance. I’m still curious about this, but I realized the analysis would be better served by tables and scatter plots than maps. I opted instead to look at UNHCR’s tables of refugee population by country of origin and country of asylum.
I was curious about how well one could predict refugee populations’ movements by weighing geographic distance against “developmental distance”, basically a cost/benefit analysis for migration. The UNHCR dataset is a matrix of sorts, and may lend itself well to visualization as a network. For this mapping exercise, I narrowed the results to 2016 populations of refugees from Afghanistan, due to their large number and my abiding interest in the country’s recent history.
Obviously the real-world determinants of a refugee’s destination are complex, involving cultural, linguistic, legal, and emotional factors. However, for the sake of my simple analysis, I stuck to the percentage difference in Human Development Index, or HDI, between country of asylum and country of origin. I sourced my HDI data from UNDP.

I was inspired by the concept of “cost distance analysis” in GIS, the practice of deriving the easiest path from one point to another. I first became aware of this concept when my GIS professor, Jeremiah Trinidad-Christensen, showed a map of habitat fragmentation that analyzed the difficulty an endangered animal faces in traversing unprotected areas between pockets of wilderness. The choropleth map I’ve created here is entirely different, but again, it’s the concept rather than the visualization itself that inspired me.

Upon loading my filtered table of refugee populations into Carto, I was caught off guard by the program’s automatic creation of shapefiles to correspond with the countries mentioned. I’m used to the more direct control a user has in programs like QGIS, where one must make the relationships between shapefiles and tabular datasets explicit. I decided just to roll with it. The only change I needed to make was editing a few formal country names to their common variants for Carto’s benefit (“Islamic Republic of Iran” to just “Iran”, for instance). Styling the results as a choropleth map by number of Afghan refugees worked as expected, with populations heaviest in neighboring Pakistan and Iran, and in wealthy western Europe. By creating centroids and drawing a line from Afghanistan to each other country (each process an “analysis” in Carto), I was able to derive a column of data describing the distance (as the crow flies) between each destination country and Afghanistan. Such a linear calculation is crude, sure, but it’ll have to do for now.
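For readers who want to reproduce the as-the-crow-flies step outside Carto, here is a minimal sketch using the haversine formula. It is not necessarily how Carto measures its lines, and the centroid coordinates are rough, hand-rounded assumptions for illustration rather than the values Carto generated.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate (lat, lon) centroids, rounded by hand: assumptions, not Carto output.
AFGHANISTAN = (33.9, 67.7)
CENTROIDS = {"Iran": (32.4, 53.7), "Pakistan": (30.4, 69.3)}

# One distance per destination country, measured from Afghanistan's centroid.
distances = {
    name: great_circle_km(*AFGHANISTAN, *point)
    for name, point in CENTROIDS.items()
}
```

Centroid-to-centroid distance ignores borders, roads, and terrain, so it inherits the same crudeness described above.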

Working with my HDI table in Excel, I created a column with a formula to calculate the percentage difference between Afghanistan’s HDI and the destination country’s. Loading the table into Carto, I joined it with the distance table, exported again, divided distance by HDI percentage difference (times 100, I think), and imported again. I had arrived at a single figure comparing the cost (linear distance) and benefit (increased HDI) of fleeing to a given country from war-torn Afghanistan. The lower this figure is, the more attractive the destination.
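The arithmetic can be sketched in a few lines. This is one reading of the calculation, assuming the percentage difference is taken relative to the origin country’s HDI and expressed in percent; the function names are hypothetical, and the numbers in the usage lines are placeholders, not UNDP figures.

```python
def hdi_pct_difference(origin_hdi, destination_hdi):
    """Percentage difference in HDI, relative to the origin country (assumed base)."""
    return (destination_hdi - origin_hdi) / origin_hdi * 100


def cost_benefit(distance_km, origin_hdi, destination_hdi):
    """Distance divided by HDI gain; lower positive values mean more attractive.

    Returns None when there is no HDI difference, since the ratio is undefined.
    """
    gain = hdi_pct_difference(origin_hdi, destination_hdi)
    if gain == 0:
        return None
    return distance_km / gain


# Placeholder inputs: 1000 km away, HDI rising from 0.5 to 0.75 (a 50% gain).
score = cost_benefit(1000, 0.5, 0.75)

# A destination with a lower HDI than the origin yields a negative score.
worse_off = cost_benefit(1000, 0.5, 0.4)
```

Because the score is a plain ratio, a destination with a lower HDI than the origin comes out negative, and a naive "lower is better" ranking would misread it as highly attractive.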
The choropleth map of this analytical figure demonstrates that my calculation places far too much weight on distance. Turkmenistan, Tajikistan, Uzbekistan, Kyrgyzstan, and Kazakhstan stand out as far more attractive destinations in my theoretical calculation than they seem to be in real life. The real top regional destinations for Afghan refugees, Iran and Pakistan, do still show up in the top tier of my map, as does prosperous Norway.
The most glaring aberrations may be Ethiopia and Chad, which skew the results because their HDI scores are lower than Afghanistan’s, making the percentage difference (and therefore my ratio) negative. With more time, or working with software I understood better, I would of course correct this error.
The map reveals a quirk in the UN’s dataset–a lack of Afghan refugees in wealthy Arab nations including Saudi Arabia, Kuwait, Oman, and the United Arab Emirates. It seems almost inconceivable that no Afghan refugees would have found their way to these countries. My classmate Robin Miller, who has lived in Qatar and traveled throughout the region, speculates that these governments do not share refugee data as a matter of public image-making.

Obviously there’s plenty of room for improvement on this analysis and map, but I think a better next step may be working with the entire UNHCR dataset (not just filtered for Afghan refugees) as a force-directed network graph, weighted by number of refugees. The dataset is basically ready-made for network analysis, and clustering of nations in such a visualization may elucidate relationships that are not apparent on a map.
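As a rough sketch of that next step, the origin-by-asylum matrix can be flattened into weighted edges before being handed to a graph tool. The row layout, the function names, and the counts below are hypothetical placeholders, not UNHCR figures.

```python
from collections import defaultdict

# Hypothetical rows as (origin, asylum, refugee_count); counts are placeholders.
ROWS = [
    ("A", "B", 300),
    ("A", "C", 200),
    ("D", "E", 500),
    ("D", "C", 0),  # empty matrix cell: no edge should be created
]


def build_weighted_graph(rows):
    """Return {origin: {asylum: count}}, skipping empty cells."""
    graph = defaultdict(dict)
    for origin, asylum, count in rows:
        if count > 0:
            graph[origin][asylum] = graph[origin].get(asylum, 0) + count
    return dict(graph)


def total_hosted(graph):
    """Sum incoming edge weights: refugees hosted per country of asylum."""
    totals = defaultdict(int)
    for edges in graph.values():
        for asylum, count in edges.items():
            totals[asylum] += count
    return dict(totals)


GRAPH = build_weighted_graph(ROWS)
HOSTED = total_hosted(GRAPH)
```

A nested dictionary like this maps directly onto a weighted directed graph, which a library with a force-directed layout (networkx’s `spring_layout`, for example) could then position so that heavily connected countries cluster together.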

The post A Simple Cost-Benefit Analysis of Refugee Movements appeared first on Information Visualization.
