Final Projects, Visualization


New York is a charming city where you can experience the cultures of the world. The diversity of its immigrant population has created a vibrant and open environment; there is no city like New York. People come from all over the world; they love New York, but they do not know it well. When people search online for information about New York, the results are full of long, text-heavy articles. People may want to know some data about New York, but the data is hard for them to read and understand. When I told people around me some exciting facts about New York, they were all very interested. So for the final project, I want to make a series of graphics about interesting facts of New York City, based on data I collected and analyzed.

In this project, my target audience is people who live in New York City, and my first goal is that everyone can find themselves in the graphics I create. In my previous projects, people found it hard to connect with the data visualizations I made. Some of the topics were too far from their own lives, and the topics may not have interested them. So I chose interesting facts about New York City as my topic. This series of graphics includes overall facts about New York City itself as well as more detailed data visualizations that relate to each person.


Illustration Style

In this project, I want to create a series of infographics, so how I organize them is important. To figure out which kinds of infographics are more readable for a general audience and spark their interest, I collected infographics in different styles and asked people to choose the ones they liked. Here are some of the graphics they chose.

Based on the graphics they chose, I found that:

  • Most of the graphics are storyboards.
  • A general audience is more interested in graphics with a storytelling style.
  • The aesthetics of design are also very important in information design.


This infographic cannot include every fact about New York City. When choosing topics, I narrowed the target audience down to people from outside the city who are not New York residents, and I chose topics based on what might interest them.

I interviewed some people about their impressions of New York. In the interview, each person was asked to name one thing they consider characteristic of New York, or one thing about New York that impressed them.

The most frequently mentioned things were: the subway is dirty, the pizza is very cheap, the diversity of people, etc. Based on their answers, I started to create my maps around those topics.

  • Pizza is very popular in New York City, so how many pizza shops are there in the city, where are they, and what is their connection with the subway?
  • What are the most common languages in New York other than English?
  • What is the median age in New York? What percentage of people are younger or older than you?



OpenRefine - Clean the messy data


Carto - Visualization of geography-based data

Tableau Public

Google Sheets


-An Adobe software for editing graphics


– Data collection

  1. Geo-data
Pizza store data collection

There is no existing dataset of all the pizza stores in New York City, so I needed to collect it myself. The original data was downloaded from the website USA REFERENCE.

The data collection process covered stores that sell pizza as well as pizza chain stores like Domino’s. I used OpenRefine to clean the data and delete the information I did not need, keeping only the ZIP code, address, and county. Then I combined all the store address information to make the complete data set. Before importing the data into Carto to visualize it, I used Geocell to create Lat and Lng cells so that Carto could recognize the locations.
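The cleaning in this project was done in OpenRefine, not in code. Still, the idea of the step (keep three columns, then merge the store lists) can be sketched in a few lines of Python. This is only an illustration: the column names `zip`, `address`, and `county`, the helper names, and the sample rows are all made-up assumptions, not the real dataset.

```python
# Sketch only: column names and sample rows are hypothetical, not the real data.
KEEP = ("zip", "address", "county")


def clean_rows(rows):
    """Keep only the ZIP code, address, and county of each record."""
    cleaned = []
    for row in rows:
        record = {key: str(row.get(key, "")).strip() for key in KEEP}
        if record["address"]:  # skip records that have no address at all
            cleaned.append(record)
    return cleaned


def combine(*sources):
    """Merge several store lists into one, dropping repeated addresses."""
    seen = set()
    merged = []
    for rows in sources:
        for record in clean_rows(rows):
            # Compare addresses case-insensitively within the same ZIP code.
            key = (record["zip"], record["address"].lower())
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


# Invented sample rows standing in for the downloaded lists.
independent = [
    {"name": "Example Pizza", "zip": "10001", "address": "1 Example Ave ",
     "county": "New York", "phone": "000"},
]
chains = [
    {"name": "Example Chain", "zip": "11201", "address": "2 Sample St",
     "county": "Kings", "phone": "000"},
    {"name": "Example Pizza", "zip": "10001", "address": "1 example ave",
     "county": "New York", "phone": "000"},
]
stores = combine(independent, chains)
```

With these sample rows, `stores` holds two records, because the repeated address in the second list is dropped.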

Original map of the pizza stores in New York City

2. Age data

The problem I met with this data is that it includes age, gender, and birth information from 2010 to 2017. The age database is too large, so the first step was to simplify it. I filtered by year to keep only the 2017 records and cleaned out the unnecessary information. I used a treemap to visualize the age data because I wanted to see the percentage of each group, and so that people could get a feeling for how many people are around their age. This image may not show the exact percentages, but it gives an overall feeling.
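The filter-then-percentage step can also be written out as a small Python sketch. It is not the Tableau workflow itself, and the field names (`year`, `age_group`, `population`), the group labels, and the numbers are invented for illustration.

```python
# Sketch only: field names, group labels, and numbers are invented.
def shares_for_year(records, year):
    """Return each age group's percentage of the population in one year."""
    totals = {}
    for rec in records:
        if rec["year"] != year:  # keep only the chosen year
            continue
        group = rec["age_group"]
        totals[group] = totals.get(group, 0) + rec["population"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: round(100 * count / grand_total, 1)
            for group, count in totals.items()}


sample = [
    {"year": 2016, "age_group": "group A", "population": 900},
    {"year": 2017, "age_group": "group A", "population": 300},
    {"year": 2017, "age_group": "group B", "population": 500},
    {"year": 2017, "age_group": "group C", "population": 200},
]
shares = shares_for_year(sample, 2017)
```

In a treemap, the area of each rectangle is proportional to a share like these, which is why it gives an overall feeling more than an exact reading.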

The age map in Tableau Public

-Visual refinement

After I created all the original infographics, I asked four of my friends to review them. After talking to them, I found some points I could improve in the graphics the software created, such as the age population map. The original map uses squares in different shades of blue to show the percentages, but it is hard for viewers to connect these with people and age. To solve this problem, I used icons of people instead of the blue squares.

I also considered the use of color. When people talk about New York, the yellow cab always comes to mind, so I chose yellow as the main color. In this graphic, different shades of yellow represent different age levels. The lightest yellow is the population under seven years old, and the darkest yellow represents people above 75 years old. There are six age groups, and people can easily find which group they belong to and what percentage of New York City’s population their age group makes up. This graphic also includes the median age of New York City, which gives viewers additional information while they look at this age-based graphic.

When I designed the charts, in some of them I made only one part colorful to emphasize that data. For example, I want to emphasize that New York has the largest Chinese population, so I used yellow for this part and left the others grey. In this chart, I hope the audience will see the New York data first.

When the data points are equally important and I want viewers to notice them all, I keep all of them in the main color, yellow. For example, in this chart: lots of people know Spanish is popular in the United States, but which languages come after it? This is a question most people cannot answer. It is reasonable to keep them all in color because they are all important here. At the same time, I put them in descending order, which fits people’s reading habits for this kind of data. This helps people read the data more quickly and clearly and leads to less misunderstanding.
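The two color rules described above (one highlighted bar versus all bars in the main color) plus the descending order can be summed up in a short Python sketch. The hex colors, the function name `style_bars`, and the sample values are placeholders chosen for illustration; they are not taken from the final graphics.

```python
# Sketch only: the hex colors and sample values are placeholders.
MAIN_YELLOW = "#F7C600"
NEUTRAL_GREY = "#BDBDBD"


def style_bars(values, highlight=None):
    """Sort bars from largest to smallest and pick a color for each.

    With highlight=None every bar gets the main color (equally important
    data). Otherwise only the highlighted label is yellow, the rest grey.
    """
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    bars = []
    for label, value in ordered:
        if highlight is None or label == highlight:
            color = MAIN_YELLOW
        else:
            color = NEUTRAL_GREY
        bars.append((label, value, color))
    return bars


sample = {"B": 2, "A": 5, "C": 3}
all_yellow = style_bars(sample)                 # equally important bars
emphasized = style_bars(sample, highlight="C")  # one bar stands out
```

Keeping both rules in one function makes the choice explicit for each chart: either nothing is highlighted, or exactly one item is.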


Final storyboard

Everyone in New York City knows and loves pizza. Even on streets with few shops, you can always find a pizza restaurant. The pizza is delicious and fast, and it suits the fast pace of life in New York. Many people know that there are a lot of pizza stores, but exactly how many are there in New York? The first graphic shows the result. People can easily tell that pizza stores cover almost the whole city. Besides this, there is a well-known idea called the pizza principle: the cost of a slice of pizza is always the same as the price of a ride on the subway. If the price of one goes up, the other soon follows. For people who are interested in this part, it could serve as a hook into the more specialized field.


This graphic aims to give people an overall feeling of which age group they belong to. I divided the population into three groups: 0-24 years old, 35-54 years old, and 55-75 and over. On one hand, people can figure out what percentage of New York City’s population their age group makes up; on the other hand, within each group, different colors show the detailed age ranges. People can easily find themselves on this map. The map also shows the median age in NY, which I think is valuable additional information for viewers.

Another comment I often hear from people is that there are Chinese restaurants and Chinese people everywhere. There are so many Chinese people in this city, so how does it rank in the country? When I worked with a dataset on the Chinese population, I found that New York City’s Chinese population is actually the largest in the United States. But this interesting fact is not widely known. So I made the second chart so that everyone can see this conclusion at first glance.

New York is a city with a lot of immigrants, and it has the largest Chinese population in the United States. So which languages are most commonly spoken in New York other than English? What is the ranking? After collecting the data, I visualized it in a bar chart; the result shows that Spanish is the most common language, followed by Chinese. The top 5 languages are Spanish, Chinese, Russian, French Creole, and Bengali.


In this final project, I used almost all the software I learned in this course. I enjoyed the process of finding the original datasets on each resource website and then using other software to filter them into clean, useful data. I am a communication designer, so in this final project, besides playing around with those complicated datasets and visualizing them, I also put more focus on refining them into interesting and attractive graphics. Some people hold the stereotype that data is unreadable and boring. To break this idea, the goal of these graphics is for viewers to enjoy these playful data visualizations.

Looking back at the overall process, there are some parts I could improve:

Data collection:
– The data I had is not very up to date; most of it is from around 2017. Next time I may spend more time finding more recent data.
– The data I had is time-based, but the results I designed are still images, so people cannot change the time period and see exciting findings based on that.

The map graphic of pizza stores does not meet the first goal I expected. I wish this map could show the density of stores. I want viewers to feel that New York City really does have more pizza stores than they imagine. Next time I may find another way to visualize them in Carto.
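One possible way to get at density, sketched here under assumptions and not tested on the real data, is to count stores per ZIP code before mapping, so that each area can be shaded by its count instead of drawing one dot per store. The `zip` field name and the sample rows below are hypothetical.

```python
# Sketch only: the "zip" field and the sample rows are hypothetical.
from collections import Counter


def stores_per_zip(stores):
    """Count how many stores fall in each ZIP code, skipping blank ZIPs."""
    return Counter(store["zip"] for store in stores if store.get("zip"))


sample = [
    {"zip": "10001"},
    {"zip": "10001"},
    {"zip": "11201"},
    {"zip": ""},  # a record with a missing ZIP code is ignored
]
counts = stores_per_zip(sample)
densest = counts.most_common(1)  # the ZIP code with the most stores
```

Raw counts favor large ZIP code areas, so dividing by area or population would be a fairer measure if that data is available.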

How to maintain a balance between the amount of information and the reader’s ability to analyze it is a question I kept coming back to in this project. The dataset is vast and complex. Simple diagrams cannot clearly show the relationships and structure within it. But when a diagram is too complicated, it is ineffective for the average reader and only works for experts. How to make a chart communicable to regular readers and less prone to bias is the future development goal of my project.

Data source: