Background
When approaching the data from the Census and American Community Survey (ACS) in Social Explorer, I had two immediate thoughts: the first was to explore how life had changed since pre-pandemic times (for instance, commute time in the metropolitan area), and the second was to explore data that could be used to illustrate community-level challenges and influence leadership. I went with the second route because of my own background in education policy and political organizing. I have worked with teams who used maps and charts to analyze local political leanings, levels of public school funding across districts, and demographic data in order to target outreach and educational materials. While exploring the Social Explorer data for this lab, I found data on children living in poverty across congressional districts in greater New York City, which gave me the idea to look further into this issue. I had also recently read about state-level initiatives by legislators and the governor to target child poverty. I wanted to answer: Which congressional districts in the greater New York metropolitan area have the highest and lowest incidence of child poverty as a share of the child population? Where are children struggling the most versus the least?
Materials
I mapped a dataset from the 2020 American Community Survey (ACS) in Social Explorer and published the final interactive map via its platform. I then exported the underlying dataset that I created as a .csv file, along with the final map images. I cleaned the data in Microsoft Excel before uploading the cleaned .csv file to Datawrapper to build my visualization. I both published and exported my chart images via Datawrapper.
Methodology
In Social Explorer, I mapped a dataset from the 2020 ACS at the geographic level of congressional district (labeled CD 116th, for the 116th Congress), covering not only NYC’s five boroughs but also the neighboring areas of New Jersey surrounding Newark and along the Hudson River. I chose the Vitamin C color palette because, culturally, its bright red signals “danger” where child poverty is higher and its yellows and oranges signal a “warning.” Following feedback from class discussion, I changed the map classification from the default cutpoints to quantiles in order to highlight the differences between congressional districts more clearly.
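To illustrate why the classification choice matters, here is a minimal sketch comparing quantile classes with equal-width classes. It is not Social Explorer's actual algorithm, and I am only assuming that evenly spaced cutpoints behave like this; the function names and the eight rates are hypothetical, not the ACS figures.

```python
def quantile_classes(values, n_classes):
    """Assign class indices 0..n_classes-1 so each class holds roughly equal counts.

    Ties are split by position, which is a simplification for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes


def equal_interval_classes(values, n_classes):
    """Assign class indices by cutting the value range into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    if width == 0:
        return [0] * len(values)
    # The maximum value would land one bin too high, so clamp it into the top bin.
    return [min(int((v - lo) / width), n_classes - 1) for v in values]


# Hypothetical child poverty shares (percent) for eight districts.
rates = [18, 22, 25, 27, 30, 33, 36, 59]

print(quantile_classes(rates, 4))        # two districts per color class
print(equal_interval_classes(rates, 4))  # one outlier takes the top class alone
```

With one very high value in the mix, equal-width bins push most districts into the lowest colors, while quantiles spread the districts evenly across the palette. That is the kind of contrast I was after on the map.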
In Excel, I cleaned up this dataset to narrow it down to the most important variables, where Area Name indicated Congressional District (CD) numbers: Area Name | Population for Whom Poverty Status Is Determined: Under 1.00 (Doing Poorly) | Population for Whom Poverty Status Is Determined: 1.00 to 1.99 (Struggling) | Population for Whom Poverty Status Is Determined: Under 2.00 (Poor or Struggling) | Population for Whom Poverty Status Is Determined: 2.00 and Over (Doing Ok) | Total Population Poverty Status Known | State Postal Abbreviation
From there, I uploaded the data to Datawrapper, where I decided to work with just the Area Name (CD), Total Population Poverty Status Known, and two poverty statuses: 2.00 and Over (Doing Ok) and Under 2.00 (Poor or Struggling). I hid the other columns, chose a stacked bar chart for the visualization in order to compare these parts of the population on one line for each CD, then refined and annotated the visuals.
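I did this column selection by hand in Excel and Datawrapper, but the same step could be scripted. The following is a rough sketch using only Python's standard library. The column names are shortened versions of the exported headers, the district names and counts are made up for illustration, and `select_columns` is a hypothetical helper.

```python
import csv
import io

# Hypothetical two-row extract; real headers are longer and real counts differ.
RAW_CSV = (
    "Area Name,Under 2.00 (Poor or Struggling),2.00 and Over (Doing Ok),"
    "Total Population Poverty Status Known,State Postal Abbreviation\n"
    "District A,18000,82000,100000,ny\n"
    "District B,118000,82000,200000,ny\n"
)

KEEP = [
    "Area Name",
    "Under 2.00 (Poor or Struggling)",
    "2.00 and Over (Doing Ok)",
    "Total Population Poverty Status Known",
]


def select_columns(csv_text, keep):
    """Keep only the named columns and convert every count column to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        kept = {col: row[col] for col in keep}
        for col in keep[1:]:  # everything after the name column is a count
            kept[col] = int(kept[col])
        rows.append(kept)
    return rows


def parts_match_total(row):
    """Check that the two poverty categories add up to the known total."""
    return (row["Under 2.00 (Poor or Struggling)"]
            + row["2.00 and Over (Doing Ok)"]
            == row["Total Population Poverty Status Known"])


rows = select_columns(RAW_CSV, KEEP)
for row in rows:
    print(row["Area Name"], parts_match_total(row))
```

The sum check is a cheap way to confirm that the two categories I kept really do cover the whole "Poverty Status Known" population before charting them as parts of one bar.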
I simplified the labels to read “Poverty Status: Poor or Struggling” and “Poverty Status: Doing Ok” to highlight the meaning of the scores in this visualization. Again, I chose orange to represent the “Poor or Struggling” population, because culturally we tend to associate orange with a warning or danger. I chose gray for “Doing Ok” as a neutral comparison. I toggled through Datawrapper’s color-blindness checks for each condition and found that the color choice still worked. I then had Datawrapper stack the bars by percentage and sort them to show the highest and lowest percentages of “Poor or Struggling” children. I tried a version that also showed the absolute values next to the percentages, but it cut off some percentages, so I kept the percentage-only version.
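Datawrapper handled the percentage stacking and sorting for me, but the underlying arithmetic is simple enough to sketch. This is an illustration under my own assumptions, not Datawrapper's implementation; `poverty_shares` is a hypothetical helper and the counts are invented.

```python
def poverty_shares(districts):
    """Return (name, pct_struggling, pct_ok) tuples, highest struggling share first.

    Each input item is (name, struggling_count, ok_count).
    """
    shares = []
    for name, struggling, ok in districts:
        total = struggling + ok
        pct_struggling = round(100 * struggling / total)
        # Derive the second share from the first so each bar sums to exactly 100.
        shares.append((name, pct_struggling, 100 - pct_struggling))
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Hypothetical counts of children under 18, not the ACS figures.
districts = [
    ("District A", 18000, 82000),
    ("District B", 118000, 82000),
    ("District C", 45000, 55000),
]

for name, pct_struggling, pct_ok in poverty_shares(districts):
    print(f"{name}: {pct_struggling}% poor or struggling, {pct_ok}% doing ok")
```

Stacking by share rather than by count keeps a large district from visually dominating a small one, which is why the bars in my chart are all the same length.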
I also added contextual information at the bottom to let readers know that the dataset was from 2020 ACS data and was based only on the population of people under 18 for whom poverty status was determined.
Results
Both the Social Explorer map and the Datawrapper stacked bar chart allowed me to immediately understand where child poverty is more prevalent. There appear to be fairly dramatic differences between congressional districts in 2020, for instance on the map between northern Manhattan and the Bronx on one hand and, on the other, the Upper East Side of Manhattan and the parts of Brooklyn and Queens just across the East River.
It’s also interesting to note the differences in how CDs have been drawn. While CD 13 and CD 15 appear to cluster adjacent streets and neighborhoods together, CD 10 is a strange mix that runs from southern Brooklyn up to the Upper West Side of Manhattan. It left me wondering whether these boundaries are obscuring more dramatic data in some areas (e.g., if wealthier neighborhoods are lumped in with poorer neighborhoods so that the average rates appear middling). Note: In order to see the CD numbers, it’s best to hover over them in the interactive map here.
The chart tells a visual story of which CDs are doing better on this issue than others and provides a jumping-off point for further discussion and investigation. Why are only 18% of children struggling in CD 12, versus 59% in CD 15?
One obvious challenge is that labeling bars by congressional district means that anyone who doesn’t know their own CD, or where any of the other CDs are located, will not have an immediate geographic reference point. More information probably needs to be provided, whether via a map, additional guide, or maybe (as suggested in class) adding in corresponding congresspeople’s names and faces for reference. You can view the full interactive chart here on Datawrapper.
Reflection
In general, New Yorkers could use datasets and visuals like these in local investigative journalism, to organize and/or lobby elected officials (in this case, congresspeople), to inform policy writing and grant making, or to spur further research and longer studies on this issue. However, working with this dataset raised multiple questions for me and illustrated several challenges and opportunities for deeper research. For one, I would want more information on the actual survey questions and methodology used to compile this data. The poverty scores and corresponding labels are a little opaque to me as-is, so I’d want to pull more meaning out of labels such as “struggling” and “doing ok.” We discussed this briefly in class, but there seems to be a huge amount of complexity involved in determining these categories, and I think we have to highlight this.
Additionally, Social Explorer only had data available for this one year, so I would want to plan ahead to study whether child poverty rates in this area change over time. Other interesting research questions relating to this dataset might be: Would these rates be different if the CDs were redrawn, as they are being redrawn right now in New York? Is there any correlation with educational attainment (e.g., high school dropout rate), or with current and future policy interventions such as the Child Poverty Reduction Act passed in 2021? Also, does “Poverty Status Known” include most children? How much data do we actually have here relative to the entire population of children in these CDs?