The U.S. population is changing. Depending on where one lives (or possibly which channels one watches), the fact that the United States of America is becoming a more diverse nation may or may not be a surprise. According to a Brookings Institution article from last year, the nation will turn majority-minority or “minority white” in 2045, meaning that the combined minority populations will outnumber the white one.
However, fewer Americans are probably aware that the nation is also becoming an older one. The same underlying projections from the U.S. Census Bureau that were referenced by Brookings project that by 2030 (in Brookings’ words) “1 in every 5 residents will be retirement age.”
These trends are actually interlinked, due to the racial composition of our younger generations. I chose to examine both using a data slice of the recent past.
The links cited above were the primary inspiration for my explorations, and they brought with them a number of visualizations by the Census Bureau itself.
These are pretty standard. The “More Diverse Nation” visualization illustrates the challenge of displaying the full range of racial categories, several of which become unidentifiable slivers at the top of each bar.
I found the gender comparison more compelling, as setting the overall shapes side by side made the difference immediately apparent, though it is aided by a much longer time frame than the one with which I would eventually work.
I was more interested in age comparisons than in gender, and an online search turned up a similar viz, the following static image, about which more can be found at knoema.com.
This graphic stood out as clear and engaging, though there is perhaps a slight editorial comment in coloring the Silent and Greatest Generations grey. Without excessive labeling, it suggested the full range of each generation by using discrete bars for each year. I saw possibilities here for my own later visualizations, in which I looked at how a population advances into a new category.
Materials and Resources
- The data set comes from the Annual Estimates of the Resident Population, available at American FactFinder.
- Excel was used to combine and examine the separate data sets for the nine years, before cleaning the aggregate up in OpenRefine.
- Exploratory and final visualizations and the dashboard were created and assembled using Tableau Public.
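For readers who prefer a scripted route, the combining step could be sketched in Python instead of Excel. This is only an illustration, not the workflow used here: the column names (`age`, `race`, `population`) and the tiny inline tables are hypothetical placeholders, not the actual FactFinder layout or figures.

```python
import csv
import io

# Hypothetical stand-ins for two of the yearly downloads; the real files
# have a different layout and real counts.
YEARLY_FILES = {
    2010: "age,race,population\n17,White,100\n65,White,80\n",
    2011: "age,race,population\n17,White,98\n65,White,85\n",
}


def combine_years(files_by_year):
    """Stack the yearly tables into one long list of rows tagged with year."""
    combined = []
    for year in sorted(files_by_year):
        reader = csv.DictReader(io.StringIO(files_by_year[year]))
        for row in reader:
            combined.append({
                "year": year,
                "age": int(row["age"]),
                "race": row["race"].strip(),  # trim stray whitespace
                "population": int(row["population"]),
            })
    return combined


rows = combine_years(YEARLY_FILES)
print(len(rows), "rows combined")
```

Keeping one row per year, age, and race (a "long" layout) is an assumption made here because it tends to suit tools like Tableau, where the year can then drive a filter or an animation.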
Methods and Process
Based on my previous lab experience, I spent much of the initial period exploring and understanding Tableau Public, so I could be more efficient in using it. This worked well, as I was also, to some extent, looking to see what the data would reveal in different forms. This included initial explorations with tree maps and heat maps that quickly proved fruitless.
Since much of the analysis involved classifications nested within others (age within race, for example), I created a number of age groups and filters to build different racial and age comparisons (e.g. Under 18 vs. 65 and Older, White vs. Other). These were both necessary and helpful, since some of the shifts in data were relatively minor.
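The grouping logic itself is simple, and a rough Python equivalent may make it concrete. This is a sketch under assumptions, not the Tableau calculation actually used: the middle label "18 to 64" and the exact category string "White" are hypothetical.

```python
def age_group(age):
    """Bucket a single year of age into one of three ranges."""
    if age < 18:
        return "Under 18"
    if age >= 65:
        return "65 and Older"
    return "18 to 64"  # assumed label for the remaining ages


def race_group(race):
    """Collapse the race categories into a two-way White vs. Other split."""
    # Assumes the source label is exactly "White" (case-insensitive).
    return "White" if race.strip().lower() == "white" else "Other"


print(age_group(17), "|", age_group(70), "|", race_group("Asian"))
```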
The biggest break came when I explored recreating the Knoema graphic, which got me thinking about how the individual bars could tell a story across time and how color might be used to show changing populations, with individual years advancing into a new age category. (Some of the more relevant unused explorations remain as tabs in the final Tableau Public link below.)
Although I ultimately discarded this comparison of age groups across race in favor of a trellised line graph format, it provided great insight for the generational graph.
As several of the individual visualizations took form, I created and developed the dashboard, so that I could see how each viz informed the others and how they could work together to tell a story.
The result is a collection of four visualizations that show how the trends noted in the introduction are evident in the nine-year period for which I gathered data. They are paired: A/B for the age exploration and C/D for the race exploration. The explanatory text, titles, and captions then lead the viewer through the story.
Naturally, these results support the articles, but also allow one to investigate further. Chart A allows the viewer to advance the populations through the years and see how the younger population appears to remain static (actually dropping a little) as the older one increases in size.
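The kind of comparison described here comes down to each age group's share of the total in a given year, and how that share shifts between years. A minimal Python sketch of that arithmetic follows; the counts are made up purely for illustration and are not Census figures, and the group labels are assumptions.

```python
def group_shares(counts):
    """Return each group's share of the total as a percentage."""
    total = sum(counts.values())
    return {group: 100.0 * n / total for group, n in counts.items()}


# Made-up counts for illustration only, not Census figures.
first_year = {"Under 18": 25, "18 to 64": 60, "65 and Older": 15}
last_year = {"Under 18": 24, "18 to 64": 58, "65 and Older": 18}

start = group_shares(first_year)
end = group_shares(last_year)

# Change in share, in percentage points, between the two years.
change = {group: end[group] - start[group] for group in start}

for group, delta in change.items():
    print(f"{group}: {delta:+.1f} percentage points")
```

Working in percentage points instead of raw counts is a choice made for this sketch, since it keeps a growing total population from masking a shrinking share.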
I did make conscious decisions to have the separate visualizations work collectively as a compound figure. Color is intended to unify the different charts, with yellow generally denoting static or lower numbers while green suggests an area that is increasing. Similarly, the overall skew in the trellis display of C mimics the overall form of A, suggesting a relationship between the two.
Overall, I think the visualizations are effective in exploring and communicating the topic, factoring in certain limitations that I imposed.
It must be stated, for starters, that the data set only reflects answers to the Race categorization as seen on the Census and American Community Survey and uses only those categories, without factoring in the Ethnicity question asked by the Census Bureau: “Is this person of Hispanic, Latino, or Spanish origin?” (I made sure to at least note this on the dashboard.) While this population is included in the total population (within several racial categories), recent issues and discussions relative to the changing U.S. population dictate that any further explorations include this as a factor.
This additional data would of course create other possibilities for visualizations, and a more truthful portrait of the nation. Looking at what is there now, though, my main interests lie in cleaning up the experience and overcoming some of the idiosyncrasies of the Tableau Public format, my unfamiliarity with it, or both. (The legend in the middle of D is a prime example of something that just needs a better solution, though it works better than I imagined.) For instance, B exists because I wanted to make sure the viewer had a direct understanding of the percentage differences in the two highlighted age ranges of A, and because I wanted to make sure the change could be viewed in static form. With more time, I would like to explore how A itself could present the change in percentage for the whole age range collectively, while also showing the progression of a single year through the chart.
Looking forward, I am particularly interested in what user testing reveals about the trellis display and some of my decisions regarding color and labeling. For example, I removed axis headers and labels as often as possible, believing that the viewer could read the yellow-green combination in A and B as indicating that those bars are “Under 18 years old” and “65 years old and Older,” respectively. Similarly, not all the ages are displayed at the bottom of A, since it seems obvious (to me) that those bars are the individual elements that compose the age range in the label at the top.
We all know that a design is only ‘finished’ because the time frame for working on it has ended. This set of visualizations is a prime example of that and would benefit from a few hours of refinement — and a data set that included an important missing population.