What Influences the State of Languages?



Introduction

This project shows economy, geography, and recorded conflict in relation to mapped endangered and extinct languages.

Materials

The program used to create the visualization is Tableau, which is free software.

The endangered/extinct dataset is from Kaggle.

The country GDP dataset is from the World Bank.

The recorded conflict dataset is from The Uppsala Conflict Data Program (UCDP) at Uppsala University, Sweden.

A copy of the dataset for the visualization is available online.

One of the maps in the visualization is a custom map from Mapbox.

Methods

This project is an extension of the Map Lab. It was inspired by the visualizations from After Babylon and from the Uppsala Conflict Data Program.

The initial step was gathering datasets. I already had the dataset about languages. I found a dataset of recorded world conflicts in a few regions of the world. It was recommended that I combine the datasets into one sheet with data_type as a column. I used this column often for filtering throughout the process.
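The combined sheet can be pictured with a minimal Python sketch. The column names and placeholder rows here are assumptions made for illustration; only the data_type column comes from the project itself, and the actual combining was not necessarily done in code.

```python
import csv
import io

# Placeholder rows with hypothetical column names; the real datasets differ.
languages = [
    {"name": "Language A", "country": "Country X", "status": "Vulnerable"},
    {"name": "Language B", "country": "Country Y", "status": "Extinct"},
]
conflicts = [{"country": "Country X", "events": 3}]
incomes = [{"country": "Country Y", "income_level": "Low income"}]

# Shared header for the single combined sheet.
COLUMNS = ["data_type", "name", "country", "status", "events", "income_level"]


def tag(rows, data_type):
    """Copy each row onto the shared columns and label it with data_type."""
    tagged = []
    for row in rows:
        out = {col: "" for col in COLUMNS}  # blank where a column does not apply
        out.update(row)
        out["data_type"] = data_type
        tagged.append(out)
    return tagged


def only(rows, data_type):
    """Keep the rows of one kind, like filtering on data_type in a sheet."""
    return [row for row in rows if row["data_type"] == data_type]


combined = (
    tag(languages, "language")
    + tag(conflicts, "conflict")
    + tag(incomes, "income")
)

# Write the combined sheet out as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(combined)
combined_csv = buffer.getvalue()
```

Every row carries a label, so one file can hold several kinds of information and still be split apart again, for example with only(combined, "language").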

After creating a map with points for language, and a choropleth map of conflict by country, I layered them in a dashboard using transparency. However, I did not have a clear way to move forward with other potential visualizations on the dashboard. I experimented with creating a dual-axis map, but didn’t end up using it (see image below).

Dual-axis map experiment

I decided to switch to a global conflict dataset (1946-present). This made it easier to add the GDP dataset, which is also global and uses the most recent figures, because I no longer had to isolate data for a few regions. It also encouraged me to create a dashboard with multiple maps, rather than one map with multiple layers that were hard to control.

I created one map with the languages as points, one choropleth map of recorded conflict, and one choropleth map of income level. These were all grey or blue to contrast the warm colored language points.

Because the languages were the priority, the points were bright colors. I was initially using Tableau’s light map (see image below).

Because I wanted to have information about land formations, I created a map on Mapbox to highlight mountain regions, which seemed to align with where languages were often plotted. I didn’t want to make a new map just for geography, so I merged it with the point map. I wanted a serious, non-distracting color, so I went with a grey-blue. This became the main map on the dashboard (see below).

The conflict map and income map were blue because green looked Christmas-y, even though I initially wanted green to represent money. (I later went back to using green.) At this point, I also considered making the two choropleth maps different colors, but on one dashboard that was a lot of color. I had all the maps in a similar “starting” position, situated with Africa somewhat in the middle. Below are the income and violence maps.

Three graphs were created to provide numerical and comparison information about the languages and the regions/countries where they mostly exist(ed). Because the geography map was grey/blue (to match the blue choropleth maps), I made the graphs with regions and countries grey. The yellow-to-red language graph matched the colors of the language points on the map.

The visualizations with languages were interactive and filtered together. The two choropleth maps were separate but also interactive. There were two filterable legends.

I based the layout on the “Data Storytelling & Dashboard Design” lecture slide with the baseball players, with the “beginning,” “middle,” “end,” and “epilogue.” The main map was the beginning, the choropleths on the right were the middle and end, and the graphs at the bottom provided additional information.

Two individuals provided critique about the dashboard. My first expectation was that each individual be bilingual, so they would have an interest in the subject matter. My second expectation was that they have a background understanding of world events and world history. Finally, I wanted tech-literate individuals.

I communicated with one individual via email. They found that the graphs were squished, so I had to update the format to adapt to screen sizes. They provided positive feedback about the colors and the data displayed by the graphs.

I communicated with the other individual in person. This user tried to “break” the dashboard by excluding and searching for information, which did cause the visualizations to disappear. They noted that they should be able to bring back or reset the map, which was later solved by the storytelling format in Tableau.

The main takeaway was that the dashboard was too crowded and there were too many buttons. I went for a structured storytelling format, which informed a multi-dashboard “story” order. The beginning introduces the user to the idea of endangered languages; the following two “pages” provide a conflict comparison and an economic comparison. I also created more dynamic bar graphs so that they provide the breakdown of languages, not just a general count. I also used this idea for the graphs on the other pages.
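The difference between a general count and a breakdown can be sketched in a few lines of Python. The region and status labels are placeholders rather than values from the dataset, and the sketch only mirrors the idea behind the graphs; it does not show how Tableau computes them.

```python
from collections import Counter, defaultdict

# Placeholder (region, language state) pairs, not real data.
rows = [
    ("Region 1", "Vulnerable"),
    ("Region 1", "Extinct"),
    ("Region 1", "Vulnerable"),
    ("Region 2", "Critically endangered"),
]


def general_count(rows):
    """One number per region: how many languages are listed there."""
    return Counter(region for region, _ in rows)


def breakdown(rows):
    """Per region, a count for each language state."""
    result = defaultdict(Counter)
    for region, status in rows:
        result[region][status] += 1
    return result


totals = general_count(rows)
by_state = breakdown(rows)
```

The breakdown keeps the same totals (the states in each region add up to the general count) while also showing which states make up each bar.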

This version, partially completed, was presented to the class for critique. The class suggested color changes, more emphasis on geography, and the addition of human movement.

Many of these recommendations were taken into consideration, and more would have been if not for the time constraint. One big change is the map. I darkened the map to create contrast that would show the mountain ranges and desert areas more clearly. Because I was no longer using blue as a color for all my maps, I was able to move away from the greyish blue to a darker grey. This helped highlight the language points.

Another change is the color of extinct languages, which went from red to purple. Changing this allowed me to use red elsewhere; it was applied to the conflict map. 

I added annotations to the bottom of the conflict map and economy map for context. I attempted to do this on the main language map, but it became crowded, and some annotations were irrelevant when the map carried over onto different pages. I considered writing a conclusion page, but ultimately decided against it because of time.

I also enlarged the conflict map, and later the income map, to make comparison easier and to give visual weight, and thus value, to the new ideas introduced on those pages. Colors were chosen to match the content: red for violence, and green for the American dollar, which is a dominant currency.

There were small fixes, such as activating filters and removing the capacity to exclude information. This was to avoid having users “break” the map. The new story format also has a refresh button at the top of the page, which is useful. At this point, both the creator and Tableau were experiencing slower processing capacity.

There was also the additional step of designing the storyboard. Small edits were made so that existing dashboards would fit into the storyboard format. The storyboard was edited to adapt to screen sizes. The map is intended to be interactive and digital. However, in its full screen mode, it has the potential to be a printed resource.

Results

The final project can be found here (link). The first page of the story introduces the user to the idea of endangered languages. A custom geography map is provided to emphasize mountain ranges and deserts. Definitions of the different states of endangerment can be found on the left. Graphs about the languages and where they can be found are at the bottom.

The map is interactive, and colors will highlight across the visualization. Hovering over the bar graphs will provide numbers associated with the language state, country, or region. Interacting with one point on the map will give the user information about the language, its state, and, if available, the number of speakers.

While Asia has the highest number of endangered/extinct languages by far, the countries with the highest counts of endangered/extinct languages are in both Asia and the Americas. North America has the second highest count by region. The United States has the highest count for extinct and endangered languages. While there is a count for extinct languages, there is a much higher count for endangered, but living, languages.

The second page compares regions of conflict with endangered/extinct languages. An annotation provides context about the purpose of this page, as well as information that should be considered when viewing it, such as how events are recorded, and the time period before 1946, the first year listed.

Like with the first page, the languages map can be filtered and points can be interacted with. The map with recorded events of violence also allows for interaction. Interacting with a country will provide a number of recorded events for that country. The graph on the bottom, which will highlight the map, provides the countries with the most events.

Areas that show the most conflict after World War II are in Asia. Numerous countries are not represented in the dataset, and thus are grey.

The third page compares countries of various economic states with endangered/extinct languages. An annotation provides context about the purpose of this page, as well as information that should be considered when viewing it, such as historic economic information.

Like with the first two pages, the languages map can be filtered and interacted with. The map about income status has a key that reflects both the map and the chart. The map and chart are interactive and influence each other when countries or colors are interacted with. 

The darkest green, which is the lowest income level, is most present in Africa. Areas with the lightest green, the highest income level, are in North America and Western Europe.

Reflections

Having a clean, organized dataset is difficult, especially when there are numerous “kinds” of information in one CSV file. I wish to learn how to use multiple datasets on one Tableau project. However, being able to filter with data_type was useful, and I depended on it when creating the graphs. Being able to look at sections of the dataset with this filter on a spreadsheet was also helpful.

One struggle in the dataset search was finding a dataset that was appropriate for the project. The dataset showing recorded events of conflict excluded violence that may explain patterns in the Americas and elsewhere. How did the data collectors define conflict? Did they focus on one region more than another? What did they exclude? If I had more time, I would explore these questions. Perhaps I would have chosen a different dataset, or combined multiple datasets.

One dataset that I wish I had but didn’t was human movement. It would be cool to see how languages got to islands in Oceania. It would also have been interesting to show trade routes, ship movement, and movement from colonialism. Not shown in this project, but quite relevant, are the relationships between languages. It would have been nice to be able to include that information, after it has been researched.

For the income map, I wish there were historic data. Because the wealth of a nation can change over decades, it is hard to understand how a language existed through these changes with only the most recent information. Because languages change on a scale of “generations” rather than by fiscal year, income information would need to reflect long-term changes. While the map currently shows low income in Africa, which in modern history has been a steady state, there is less reflection of how income has changed in Asia, which could be an interesting story to consider.

In terms of using the program, I felt more comfortable with Tableau than I did in the first lab that used it. I was open to experimenting, and I was able to understand tutorials on Tableau’s website, as well as have the language to search for what I wanted. I also felt comfortable restarting sheets, rather than fearing loss of effort. I also had a better sense of how Tableau “wanted” the dataset to look.

I enjoyed being able to use map knowledge from Carto, even though it didn’t translate completely. One aspect I took away from using Carto that didn’t translate to Tableau was layering. As a result, I struggled to move away from the layering concept, and gave up only when users told me the dashboard was squished. However, I enjoyed learning about Mapbox in the Carto lab, as well as understanding color on a map, and I applied both to this project.

Seeing other projects use the storyboard structure encouraged me to apply it to mine; this solved the problem mentioned by one of the users about space. Because the inspirations I looked at had interactive pages, or were scrolling pages, I was trying too hard to be similar to them. I had to understand that Tableau had its own constraints and tools, and had to design with that in mind.

For the future, this project needs to connect more closely with geography and human movement. The history of human language is very physical.

A “part two” could be the most recent kind of “language”: online language. Technology has allowed for instant text and visual communication, and it would be interesting to see where certain online phrases or images are used, by whom, on what platform, and for how long.