Why I Chose This Topic
Air pollution is not something we can always see, but I know it affects our health every day. Having grown up in a big city (Seoul, South Korea) and now living in New York City, I began to wonder: does air quality vary depending on where you are, even within a single city? Because I spend time in different neighborhoods and have a few areas I visit especially often, I wanted to take a closer look at this question.
I downloaded New York City air quality surveillance data from data.gov. With the help of ChatGPT, I identified three main pollutants: Nitrogen Dioxide (NO₂), Fine Particles (PM2.5), and Ozone (O₃). Each comes from different sources and affects our health in different ways. NO₂ mostly comes from traffic; it can irritate the lungs and trigger asthma. PM2.5, tiny particles from diesel, heating oil, and other sources, can reach deep into the lungs and is linked to heart and lung disease. Ozone forms when sunlight reacts with chemicals in the air and can make breathing harder, especially in summer.
I chose to focus on Nitrogen Dioxide (NO₂) for a few reasons. It mainly comes from vehicle traffic, and it has a direct impact on breathing problems like asthma. Unlike some pollutants that change a lot with seasons or weather, NO₂ tends to stay more stable across the year, which makes it a strong candidate for spatial analysis.
Visualizing the Data
Before creating maps, I first explored the data with charts and graphs. However, those visuals made it difficult to show where air quality was better or worse. To better understand the spatial pattern, and to act on feedback to think more spatially, I switched to a map-based approach.
First Map: NO₂ Gradient by District (2023)

The air quality dataset I downloaded from data.gov only included average NO₂ values by community district code. To make the map, I downloaded a shapefile of NYC community district boundaries from NYC OpenData and joined it manually with the air quality dataset using the borough and district codes.
This map shows the average NO₂ levels in each community district across NYC for the year 2023. I used a color gradient:
- Blue = cleaner air (lower NO₂ levels)
- Orange = more polluted air (higher NO₂ levels)
Central Manhattan is clearly more orange, indicating higher NO₂, while outer-borough areas like Staten Island, southern Brooklyn, and eastern Queens are mostly blue.
About the gray areas
Some areas appear gray due to insufficient data for 2023. Rather than estimate or fill in missing values, I chose to leave them empty to preserve the accuracy of the visualization. Although this results in a less visually complete map, it still effectively communicates the spatial variation in NO₂ levels across NYC.
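The gradient and the gray "no data" treatment described above can be sketched in a few lines of Python. This is only an illustration of the idea: the RGB values, the function name `no2_to_rgb`, and the value range are assumptions chosen for the example, not the settings used to produce the map.

```python
from typing import Optional, Tuple

# Assumed colors for illustration only (not taken from the actual map).
BLUE = (50, 130, 190)    # lower NO2, cleaner air
ORANGE = (230, 120, 30)  # higher NO2, more polluted air
GRAY = (200, 200, 200)   # district with no value for the year


def no2_to_rgb(value: Optional[float], vmin: float, vmax: float) -> Tuple[int, ...]:
    """Map an NO2 value to a color on a blue-to-orange gradient.

    Missing values (None) are drawn in gray instead of being estimated.
    """
    if value is None:
        return GRAY
    # Position of the value within the range, clamped to [0, 1].
    t = 0.0 if vmax <= vmin else (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    # Linear interpolation, channel by channel.
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(BLUE, ORANGE))
```

Keeping `None` as its own gray case, rather than filling it with an average, mirrors the choice to leave districts without data visibly empty.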
Second Map: Safe vs. Not Safe (2023)

Inspired by Professor Sula’s feedback to “segment rather than add,” I made a simplified version of the map using a safety threshold. According to the NYC Community Air Survey (NYCCAS), 20 ppb is considered a safe level for most people. So, I categorized each district as follows:
- Safe: Below 20 ppb
- Not Safe: 20 ppb or above
- No Data: Gray
This map offers a clearer answer to a common concern: Is the air safe to breathe where I live or spend time? Again, Manhattan shows more orange zones, while Staten Island and most of Brooklyn fall in the safe category.
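The three categories can be written as a small classification rule. The sketch below is a minimal Python version, assuming each district has either an average NO₂ value in ppb or no value at all; the function name `classify_no2` and the sample district values are hypothetical and are not taken from the dataset.

```python
from typing import Optional

SAFE_THRESHOLD_PPB = 20.0  # threshold used for the second map


def classify_no2(value_ppb: Optional[float],
                 threshold: float = SAFE_THRESHOLD_PPB) -> str:
    """Return 'Safe', 'Not Safe', or 'No Data' for one district."""
    if value_ppb is None:
        return "No Data"
    # Below the threshold is safe; the threshold itself counts as not safe.
    return "Safe" if value_ppb < threshold else "Not Safe"


# Hypothetical values, keyed by community district code, for illustration.
districts = {101: 24.1, 501: 12.3, 313: None}
labels = {code: classify_no2(value) for code, value in districts.items()}
```

Note that a district at exactly 20 ppb falls in the "Not Safe" group, matching the "20 ppb or above" rule in the list.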
Making Design Decisions

One of the challenges in this project was aligning the NO₂ data with NYC’s geographic map. The pollution data from data.gov did not include spatial information, so I had to join it with a shapefile of community district boundaries from NYC OpenData. Because the two datasets used different formats for district names and codes, I matched the records by hand using BoroCD values. It took time, but it ensured that each district’s data appeared in the right place on the map.
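For readers who want to script this matching step, here is a minimal Python sketch. It assumes the air quality rows carry a borough name and a district number, and it builds a BoroCD-style key (borough digit followed by a two-digit district number, so Manhattan district 1 becomes 101). The field names `borough`, `district`, and `no2_ppb` and the sample rows are hypothetical; the real columns may differ.

```python
# Borough digits used in BoroCD codes.
BOROUGH_DIGIT = {
    "Manhattan": 1,
    "Bronx": 2,
    "Brooklyn": 3,
    "Queens": 4,
    "Staten Island": 5,
}


def to_borocd(borough: str, district) -> int:
    """Build a BoroCD-style code, e.g. ('Manhattan', 1) -> 101."""
    name = borough.strip().title()  # tolerate case and stray spaces
    return BOROUGH_DIGIT[name] * 100 + int(district)


def join_no2(boundary_codes, no2_rows):
    """Left join: every boundary keeps an entry; unmatched ones get None."""
    lookup = {
        to_borocd(row["borough"], row["district"]): row["no2_ppb"]
        for row in no2_rows
    }
    return {code: lookup.get(code) for code in boundary_codes}


# Hypothetical sample data for illustration only.
boundary_codes = [101, 102, 501]
no2_rows = [
    {"borough": "Manhattan", "district": 1, "no2_ppb": 22.5},
    {"borough": "staten island", "district": "01", "no2_ppb": 11.0},
]
joined = join_no2(boundary_codes, no2_rows)
```

Joining from the boundary side means a district without a matching NO₂ row stays in the result with `None`, which is what lets it appear as a gray "no data" area instead of disappearing from the map.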
In terms of labeling, I included district names on one map but left them out on the other to avoid clutter. These small design choices helped balance clarity and simplicity throughout the visualizations.
What I Learned and What’s Next
This project helped me realize how powerful maps can be in revealing invisible patterns. One limitation was that I could not combine the NO₂ data with income data, even though it would have made the story stronger. I tried to find a suitable dataset, but matching it to the same geographic level as the NO₂ data proved difficult.
Still, I’m happy I was able to test two different ways of visualizing the same data: a gradient and a binary segmentation. It showed me how the same dataset can tell slightly different stories depending on the framing. As a next step, I’d like to combine the pollution data with community health outcomes or income, once the right dataset is available.