Do city distribution and demographic analysis help first-time visitors to know more about Chicago?


Figure 1.  “map lying on a wooden table” – image credit to rawpixel.com on Freepik

1. Introduction

In a previous geographic visualization report, I chose one major US city and analyzed how its population distribution relates to land use and crowdedness. Building on that report, this project presents more detailed and rounded research, using data visualization to analyze the population distribution of Chicago more specifically. The focus of this analysis is the life-related demographics of the city: its racial and ethnic diversity, its age distribution, and its population density. When I visited the city, I was intrigued by its rich and diverse cultural heritage. This visualization aims to turn existing data into a narrative and a helpful tool by helping visitors appreciate and respect the city’s cultural traditions, making them aware of the population density of different areas so they can make informed decisions when planning a trip, and helping them tailor their experiences to their interests. To test the visualized demographics, I also interviewed 4 people who had never visited Chicago. The purpose of these interviews is both to check whether the visualization gives first-time visitors useful information and to explore how they feel about using that information to plan a later trip to Chicago.

The main questions this visualization report addresses are:

  • Is there a connection between population distribution and age?
  • Does the population distribution differ between genders?
  • How are different races spread across the city?
  • How has Chicago changed in the last several years, and are there any takeaways for the future?
  • Combined with other knowledge and information about the city, does this report/visualization help first-time visitors learn more about the city?
  • How did Chicago’s population distribution respond to the fluctuations of Covid-19?

2. Design Process

Visualization Design

This project originated from my Lab 4, which mainly discussed land use in Chicago in relation to the total population distribution by area (zip code). Feedback from that lab presentation suggested that the population per area could be divided into more specific and detailed units, both to fit research needs and to provide more practical suggestions for different purposes. I therefore expanded my population variables from “total population” to the populations of specific age groups and races, so that the analysis reflects the diversity of the city.

Data.Gov: Since this project digs deeper into the dataset that I used in the last lab, the main population dataset still comes from data.cityofchicago. When I expanded the population categories that I wanted to focus on, I found that some of the 24 age/racial groups overlap or are unclassified. The population data covers the years 2018 to 2021, and I used OpenRefine to split it into separate files by year and group, which made it much easier to work with data from different years. The original data lists all years in one column, so the same zip code appears many times. Separate files avoid a reading problem in QGIS and make it easier to select values from the dropdown menus.
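The split by year was done in OpenRefine, but the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names (`Zip Code`, `Year`, `Population - Total`) and the sample numbers are made up for the example and are not taken from the real file.

```python
import csv
from collections import defaultdict
from io import StringIO


def split_rows_by_year(rows, year_field="Year"):
    """Group row dictionaries by the value in their year column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[year_field]].append(row)
    return dict(groups)


# Made-up sample in the long format described above:
# every zip code is repeated once per year in a single table.
sample_csv = """Zip Code,Year,Population - Total
60601,2018,100
60601,2019,110
60602,2018,200
60602,2019,190
"""

rows = list(csv.DictReader(StringIO(sample_csv)))
by_year = split_rows_by_year(rows)

# Each year now has one row per zip code, which is the shape
# a per-year layer join expects.
for year, year_rows in sorted(by_year.items()):
    print(year, [r["Zip Code"] for r in year_rows])
```

Each group could then be written to its own CSV file with `csv.DictWriter`, giving one file per year.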

GitHub: Linking and combining the data was a little hard for me, and a file that I found on GitHub helped me a lot in locating geographic coordinates and binding them to their zip codes. This process is similar to what I did in the last lab, and with that familiarity I finished it easily, even though I ended up creating many more CSV files than last time.
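As a rough sketch of this binding step, the lookup below attaches a latitude and longitude to each population row by zip code and reports any zip codes that have no match. The field names (`zip`, `lat`, `lng`) and the coordinate values are assumptions for illustration, not the actual contents of the GitHub file.

```python
def attach_coordinates(population_rows, coordinates_by_zip):
    """Add lat/lng to each row whose zip code is in the lookup.

    Returns (joined_rows, missing_zips) so unmatched zip codes
    can be checked by hand instead of silently disappearing.
    """
    joined, missing = [], []
    for row in population_rows:
        coords = coordinates_by_zip.get(row["zip"])
        if coords is None:
            missing.append(row["zip"])
            continue
        joined.append({**row, "lat": coords[0], "lng": coords[1]})
    return joined, missing


# Hypothetical lookup table: zip code -> (latitude, longitude).
coordinates_by_zip = {
    "60601": (41.89, -87.62),
    "60602": (41.88, -87.63),
}

# Hypothetical population rows for one year.
population_rows = [
    {"zip": "60601", "population": 100},
    {"zip": "60602", "population": 200},
    {"zip": "99999", "population": 5},  # no coordinates available
]

joined, missing = attach_coordinates(population_rows, coordinates_by_zip)
print(len(joined), "rows joined; missing:", missing)
```

Keeping the list of unmatched zip codes makes it easier to spot typos or areas that the coordinate file does not cover.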

OpenRefine: As explained in the data gathering portion above, my main data file had difficulties, chiefly missing and overlapping information. Among the 26 columns of age groups in the original file, 5 categories only have a citywide total rather than values by area or zip code. Among the categories that do have values by area, some overlap with others, possibly because the criteria the government used when collecting and publishing the data cause some people to be counted several times. I ended up combining and subtracting columns from the master file to make a new file with more direct and less overlapping data, so that I could work more easily on my analysis and visualizations.

The five unspecified categories in the master file are the population from age 5-11, the population from age 12-17, the population of Asian Non-Latin, the population of Black Non-Latin, and the population of other races Non-Latin.

Tourism data gathering: One purpose of this data visualization is to see whether some of the information helps first-time visitors learn more about the city. Taking a visitor’s point of view, I researched mainstream rankings and websites and listed 30 attractions found with the search phrase “top ranked most popular tourist attractions in Chicago”. Using tour-guide posts and websites such as Planetware, Touropia, Vietravel, and GoChicago, plus the results from Google’s automatic search engine (whose disclaimer states that results are ranked mainly by popularity, based on the frequency of mentions across the web and proximity to the destination in your search), I re-ranked the attractions by the number of times these sources mentioned them. 17 out of 30 were mentioned 2 or more times, and I entered their coordinates into QGIS so that I could show them on my geographic displays.

Figure 2. Top visited attractions in relation to mentions in ranking
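The re-ranking by number of mentions can be sketched as a simple count across source lists. The attraction names and list contents below are placeholders, not the real rankings. Each source counts an attraction at most once, which is an assumption about how the tally was made.

```python
from collections import Counter


def rank_by_mentions(source_lists, min_mentions=2):
    """Count how many sources mention each attraction, most mentioned first.

    Attractions mentioned by fewer than min_mentions sources are dropped.
    Ties are broken alphabetically so the order is stable.
    """
    counts = Counter()
    for attractions in source_lists.values():
        counts.update(set(attractions))  # one vote per source
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, n) for name, n in ranked if n >= min_mentions]


# Placeholder lists standing in for the ranking pages.
source_lists = {
    "list_1": ["Attraction A", "Attraction B", "Attraction C"],
    "list_2": ["Attraction A", "Attraction C"],
    "list_3": ["Attraction A", "Attraction D"],
}

ranking = rank_by_mentions(source_lists)
for name, mentions in ranking:
    print(f"{name}: mentioned by {mentions} sources")
```

With the default threshold of 2, attractions that appear in only one list are left out, matching the cut-off described above.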

Tableau Public: I used Tableau for two main purposes in this report: to display the dataset that I made for tourists so they can learn more about the attractions, and to analyze how the population shifted from 2018 to 2021 for the race categories that are only given as citywide totals rather than by area.

QGIS 3.0: The reason for using QGIS in this project is obvious: it shows the geographic distributions of Chicago. With this tool I can easily read the information and findings I need from the maps, and it helps a lot with displaying and comparing changes over time.

ZOOM & Dovetail: All 4 of my interviews were conducted via Zoom, because I wanted to diversify my results and make taking part more convenient for interviewees. One of my interviewees is in Milan, so the data gathering covered a long distance. Dovetail was a life-saver in this short period, in which I conducted 4 interviews averaging 17 minutes, with the longest at 32 minutes. It is an online AI transcription system that automatically turns our conversations into transcripts, and its accuracy is impressive.

Interview Design

After making what felt like 100,000,000 illustrations for my data analysis, I chose some of the more informative visuals and consolidated them into a bundle called the “visitor’s package”. To test how much this bundle helps visitors, and in which areas and to what degree it may help real-life visitors, I compressed it into a Zip file and sent it to my interviewees before the interviews. I gave them at least 2 days to read through my visualizations and interview prompts so that they could get a sense of what my research is about. Due to the limited research time, I conducted only 4 interviews and gathered feedback and findings from them.

Interview Prompt

Hi, I’m Sean and I am a student researcher at Pratt Institute. First, thank you so much for your time today and for joining me for this interview. Before we start, do I have your permission to record your voice for research purposes? This audio won’t be shared outside the class and will be deleted after it’s analyzed.

  1. What motivated you to visit Chicago/plan to visit Chicago?
  2. How important is understanding the population distribution of a city to you when visiting a new place?
  3. How old are you?
  4. What kind of information would you like to learn about a city’s population distribution before visiting?
  5. What kind of information would you like to learn about a city’s age distribution before visiting?
  6. (Here is a list of the top-rated, most popular attractions that I gathered from the most viewed online ranking pages and posts.) Have you heard of any of these places?
  7. How familiar are you with the racial diversity of Chicago?
  8. How familiar are you with the ethnic diversity of Chicago?
  9. In your opinion, how does knowing the racial and ethnic diversity of a city enhance your travel experience?
  10. Have you ever visited a city with a diverse population before? If so, how did it impact your travel experience?
  11. Do you think the age distribution of a city matters to you when visiting a new place?
  12. Overall, how do you think this data visualization of Chicago’s population distribution can help you plan your trip and make the most of your visit?

3. Inspirations & Decision Making

My inspiration for this final project is still an extension of an unhappy memory: the crowdedness of the city almost stressed me out as a tourist. The waiting times were long, and we were often stuck in traffic or in unpleasant situations on public transportation. Chicago is a diverse and dynamic city with a population of over 2.7 million people, and understanding its demographics can help policymakers, businesses, and individuals make informed decisions. As a former visitor, I hope my research on population and distribution can help other visitors, especially first-time visitors, to be more engaged and better prepared instead of overwhelmed as I was.

Some of the design decisions are:

Geographic Properties – Graduated – 5 classes and equal interval: I chose graduated instead of categorized symbology because it is more controllable and makes the whole map more visually impactful. I limited it to 5 classes because I chose a vibrant color series and did not want the map to be so colorful that it distracts from the areas and boundaries.
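To make the equal-interval choice concrete, here is a minimal sketch of how 5 equal-width classes can be derived from a set of area populations. It illustrates the general method only: the sample values are made up, and how QGIS treats values that fall exactly on a class boundary is an assumption here.

```python
def equal_interval_breaks(values, n_classes=5):
    """Return n_classes + 1 break points that split the value range evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + i * width for i in range(n_classes + 1)]


def class_index(value, breaks):
    """Return the 0-based class of a value.

    Assumption: a value exactly on a boundary goes to the lower class.
    """
    n_classes = len(breaks) - 1
    for i in range(n_classes):
        if value <= breaks[i + 1]:
            return i
    return n_classes - 1


# Made-up populations for four areas.
populations = [0, 250, 500, 1000]
breaks = equal_interval_breaks(populations)
print(breaks)
print([class_index(p, breaks) for p in populations])
```

Because every class has the same width, one very large area stretches the range and pushes most other areas into the lowest classes. That is a trade-off of equal intervals compared with quantile classes.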

Legend – Symbols on the right – only show items inside the linked map and atlas feature: My legend design is worth mentioning for two reasons. A consistent display matters in my GIF files when making time comparisons, and placing the legend close to the map makes it easy to match the colors on the map with the legend.

Color selections in the illustration: Because I have no background in illustration or IXD design, I used Plasma, one of the default QGIS color series, because I found it useful and it has a dramatic range of color and value. In the gender analysis, I used only blue and red to refer to the male and female populations. Some maps are displayed in a row, and others are displayed as comparisons or individually, depending on the narrative and findings that come with each pair or individual map.


Before moving on to the next section, can you guess where downtown Chicago is on the map below?

Figure 3. 2021 Chicago total population distribution

4. Visualizations and Results

Figure 4.1
Figure 4.2
Figure 4.3

In these three illustrations, I focus on how minors, adults, and elders are distributed across the city. Putting the three 2021 maps together makes it clear that downtown Chicago actually has fewer people in all age groups than other areas. Another finding is that the middle and upper-middle parts of Chicago contain more of the city’s working-age population, whereas the southern part of Chicago is more favored by elders.

Figure 4.4
Figure 4.5

These two illustrations give a clear comparison of how males and females are distributed in the city. To my surprise, only a few of the 60 area codes show a slightly different proportion between the two genders. Apart from those few outliers, the population is distributed very evenly by gender across the entire city. The 2021 results look almost exactly the same.

Figure 4.6
Figure 4.7

Since some of the race data is only available as a total number and cannot be mapped by area, I illustrate the Latinx and Non-Latinx White population distributions by area. Figure 4.6 shows that in 2021 the Latinx population was concentrated in the mid-west and northwest of Chicago, whereas the White population predominantly occupied the north and northeast of Chicago, close to the financial district and downtown areas.


5. Further Engagements & Findings

A summary of the previous takeaways:

  • Downtown Chicago actually has fewer people in all age groups than other areas.
  • The working-age population, or “city labor” (age 18-65), tends to live in the middle and upper-middle parts of Chicago, whereas elders are more concentrated in the southern areas.
  • Apart from a few outliers, the population is distributed very evenly by gender across the entire city in the 2021 data.
  • The Latinx population is concentrated in the mid-west and northwest of Chicago.
  • The White population predominantly occupies the north and northeast of Chicago.
  • Downtown has a surprisingly low population, and this holds across age, race, and gender.

While gathering data and conducting interviews for my research, I also did a deeper analysis and visualization of questions related to race and age.

Figure 5.1. Population distribution of people from age 0-17 from 2018 to 2021

The pandemic hit everyone’s lives hard, and its effects on groups such as children, teenagers, and elders can easily be neglected. This graph shows how the population from newborns to teenagers changed from 2018 to 2021. The first main finding, even though it is not clear in the visual, is a dramatic drop in 2020: the maximum value in each of the other three documented years is well above 28,000, but in 2020 the maximum is only 23,661. The second finding is that the areas with the highest under-18 populations changed very little before and after the pandemic. They remain in the mid-west and middle parts of Chicago.

Figure 5.2. Population distribution of people from age 65+ from 2018 to 2021

Elders, on the other hand, unexpectedly did not show a drastic drop in 2020. The maximum population stayed at around 12,000 to 14,000 per area, and elders stayed in much the same parts of Chicago. The legend shows that, because the numbers are balanced and the range is small (approximately one-eighth to one-tenth of the area population), the result looks fairly evenly distributed across Chicago. The densest populations remain on the middle-north side and the south side of Chicago.

Figure 5.3. Chicago Population Distribution in Latinx from 2019 to 2021

Race is another factor that may lead people to move into or leave an area. Since the master file only provides Latinx and White non-Latinx population data by area, I made two GIF files showing how each group changed from 2019 to 2021. In this illustration, the population distribution does not seem to change much. Only the range of population values changes, in line with the overall population of each area. In 2021, the Latinx population made up 29% of the total Chicago population.

Figure 5.4. Chicago Population Distribution in White non-Latinx from 2019 to 2021

The White population behaves in the same way as the Latinx population. In 2021, the White non-Latinx population made up 33% of the total Chicago population, making it the largest group in Chicago. This GIF file shows that the period before and after the pandemic did not bring a large shift or movement in the White population either. Due to the lack of data and limited time, how other races fared and reacted during the pandemic is left for further research.

Interview Findings

This final project gave me about two weeks to explore and dig for information in this field, and it was a precious opportunity to test my findings and share them with peers. The feedback I gathered from the interviews also inspired new thoughts and expanded my viewpoint on this research. Following the prompt in most of the interviews, I conducted 4 of them in the past week, with a minimum of 13 minutes and a maximum of 32 minutes. All interviewees were first asked whether they had visited Chicago, and all answered that they had never visited before. The average interview time was 17’45”. Based on my prompt and the order of the questions, I gathered 11 main findings:

  • The most frequent motivations for visiting Chicago are to relax and to visit a new city. 2 interviewees mentioned visiting a friend or colleague, and 1 mentioned that it is similar to New York, which is also famous, large, and important in history and to society.
  • The importance of understanding the overall population distribution before a visit is closely tied to trip planning and to hesitation about crowds or a new environment. 1 interviewee specifically mentioned that planning ahead with all available information helps him avoid crowds, which suggests that he does not want to be disturbed.
  • The average interviewee age is 29. One interviewee, aged 38, is an outlier and may influence this value.
  • Population and age distribution: all of the interviewees mentioned wanting to find the groups of people they are interested in. Knowing the distribution ahead of time can help them locate those groups and even book their tours.
  • Although I had shown them the graph of the top-ranked attractions in Chicago, the majority of my interviewees knew only 1 or 2 of them. Half mentioned that they had seen pictures of specific attractions on social media, but when they saw the text and visualizations they could not recall exactly which attraction it was.
  • 3 out of 4 mentioned no familiarity with the race distribution in the city. 3 out of 4 knew that it is a very big city and similar to New York. One mentioned that she had never considered race or age as a factor in planning a trip.
  • All interviewees have previously visited large and diverse cities such as New York, Beijing, Shanghai, Milan, and Paris.
  • Half of the interviewees mentioned that age is not a factor that would make them refuse to visit a city, but that they tend to prefer places with younger people and the visiting experiences there. 1 interviewee mentioned that European and Asian countries may approach tour planning in totally different ways, so this can vary a lot between the U.S. and the rest of the world.
  • 4 out of 4 interviewees said that seeing my illustrations was inspiring or interesting. But half of them also mentioned hesitation, confusion, and unexpected feelings, which count as negative feelings in Dovetail. One quote: “It is helpful and can help make decisions if looking for someplace to visit, especially finding a good spot to know desired people”.
  • 2 mentioned in conversation that charts of the population distribution are useful information, but that for Chicago in particular, the safety issue is much better known than crowdedness. Some also guessed that staying near Asian or White communities might be helpful.
  • 1 mentioned that seeing a population visual does not make sense to her when planning a visit. She suggested that keywords, percentage figures, or cartoon-style illustrations could be more helpful for this project.

Reflection and Critique

Overall, this was my best experience of making illustrations and getting feedback based on user experience. It is more direct and much more inspirational than staying in the data world and presenting only what I want to present. Supporting needs, gathering feedback on what people want to know, and getting a practical idea of what they wish to see are all very helpful. Some concerns about this project that I have not yet resolved are:

1). My time was too limited. I compromised by staying with my most familiar tool to finish this project, not only because the visualization needs geographic displays, but also because it is close to the content I worked on in the last lab and was handy to continue with in this very limited timeframe.

2). Missed opportunities to present and display some illustrations. I am not a very tech-savvy person, and as someone with a museum/curatorial background it was a real obstacle to add the many fancy illustrations that other classmates had. I originally planned to combine the geographic map with my illustrations, but despite many attempts I could not do it in either Figma or Adobe Illustrator. Also, according to feedback from one of my interviewees, even though he accepts the illustrations as informative in their current form, he thinks that an art person visiting for the first time might just close the window because the illustrations do not attract them.

3). Expand the research criteria. Inspired by one of my interviewees, I should also include safety and living expenses for each area, to provide more precise and practical information for the target audience of this project: first-time visitors to a totally unfamiliar environment.


Reference

“10 Top Tourist Attractions in Chicago.” Vietravel, 2017. https://www.vietravel.com/en/around-the-world/10-top-tourist-attractions-in-chicago-v11883.aspx.

“Chicago Tourist Attractions.” Go Chicago. Accessed May 2, 2023. https://www.gochicago.com/chicago-tourist-attractions/.

Data.Gov, Data Catalog. “Chicago Population Counts – Comma Separated Values File.” Catalog. Accessed May 2, 2023. https://catalog.data.gov/dataset/chicago-population-counts/resource/571ce83e-9d97-4e6f-b396-8e3cab880d94.

Fiorentino, Alex Schultz and Fiona. “20 Top Tourist Attractions in Chicago.” Touropia, February 20, 2023. https://www.touropia.com/tourist-attractions-in-chicago/.

“Free Photo: Map Lying on Wooden Table.” Freepik, August 24, 2018. https://www.freepik.com/free-photo/map-lying-wooden-table_2862246.htm#page=3&query=tourism%20traveling&position=47&from_view=keyword&track=robertav1_2_sidr.

Google autogenerated result, Top sight in Chicago. Google search. Google. Accessed May 2, 2023. https://www.google.com/search?q=top%2Branked%2Bmost%2Bpopular%2Btourist%2Battractions%2Bin%2BChicago&oq=top%2Branked%2Bmost%2Bpopular%2Btourist%2Battractions%2Bin%2BChicago&aqs=chrome..69i57j33i160.126396j0j7&sourceid=chrome&ie=UTF-8#ip=1&ttdcs=EAE.

Law, Lana, and Lura Seavey. “18 Top-Rated Tourist Attractions & Things to Do in Chicago.” PlanetWare.com, March 7, 2023. https://www.planetware.com/tourist-attractions-/chicago-us-il-chi.htm.

Openresources, GitHub. Discover gists · github. Accessed May 2, 2023. https://gist.githubusercontent.com/erichurst/7882666/raw/5bdc46db47d9515269ab12ed6fb2850377fd869e/US%2520Zip%2520Codes%2520from%25202013%2520Government%2520Data.