During the day, I am a 7th grade math teacher. 7th grade is often a touchpoint in NYC specifically because students' grades that year help determine high school enrollment for the next year. So I was curious about trends over the last year across the board as well as specifically for 7th graders. To me, 7th grade is a struggle point for many students, due to many factors, some academic and some personal. I was curious whether the data aligned with that. I also wanted the bigger picture and to see how that idea might fit within a larger story.
Since the 1800s, NYC schools have collected data around school testing and specifically about math examinations. At the school level, standardized math test results are often tied to public and charter schools securing additional funding and to negotiations over teacher salaries.
For the standardized test in NYC, math scores are reported as proficiency levels from 1 to 4.
- Level 1 means: Below Standard
- Level 2 means: Partially Proficient
- Level 3 means: Proficient
- Level 4 means: Exceeds Proficiency
I used the DOE color palette as my reference point, and tried to use consistent colors across the graphs.
In NYC, these tests are also a factor in eligibility for competitive high school enrollment. Usually DOE data is shown with budgetary commentary and sometimes scale. I wanted to see it without that context. The DOE publishes its own results regularly. This specific data set seemed to be missing some data, as some schools only reported 1 test, which made me hesitant to use those entries. For averaged numbers, I chose to use the grouped data provided in the data set, hoping to avoid additional bias from those sparse entries.

For the number of tests recorded, Queens and Brooklyn recorded the most data, which makes sense as they have the largest number of students enrolled. The numbers still seem small compared to the enrollment, which suggests that some data has been omitted from this set. The smallest number of records came from Staten Island, which also aligns with population. I next wanted some idea of the number recorded for each grade level.
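A minimal sketch of how the per-borough record count could be tallied, assuming each record carries a borough label. The `borough` field name and the sample rows are invented for illustration and are not taken from the data set.

```python
from collections import Counter

# Invented sample records; the "borough" and "grade" field names are assumptions.
records = [
    {"borough": "Queens", "grade": 7},
    {"borough": "Queens", "grade": 3},
    {"borough": "Queens", "grade": 5},
    {"borough": "Brooklyn", "grade": 7},
    {"borough": "Brooklyn", "grade": 4},
    {"borough": "Staten Island", "grade": 7},
]

def count_by(rows, key):
    """Count how many rows share each value of `key`."""
    return Counter(row[key] for row in rows)

borough_counts = count_by(records, "borough")

# List boroughs from most to fewest records.
for borough, n in borough_counts.most_common():
    print(borough, n)
```

The same helper can be called with `"grade"` to get the number of records for each grade level.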

There is a decrease in the number recorded for 8th grade students, which made me curious whether this is because some students sit the Regents Algebra 1 exam while others sit the regular 8th grade test. With the exception of the 8th grade numbers, the numbers were consistent throughout the levels, with Queens and then Brooklyn having the highest counts.
I then used a bar graph to look at the cumulative number of students who sat for the exam and received a level 3 or 4, which is proficient or higher. The numbers were highest in 3rd grade and dropped from there; by 7th grade the decrease was about 20%.
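As a rough sketch of the arithmetic behind that comparison, the proficient count per grade can be summed from the level 3 and level 4 counts and then compared between grades. The field names and all of the numbers below are invented for illustration; they are not values from the data set.

```python
from collections import defaultdict

# Invented rows for illustration only; field names are assumptions.
rows = [
    {"grade": 3, "level3": 600, "level4": 400},
    {"grade": 5, "level3": 550, "level4": 350},
    {"grade": 7, "level3": 500, "level4": 300},
]

def proficient_by_grade(rows):
    """Sum level 3 and level 4 counts (proficient or higher) per grade."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["grade"]] += r["level3"] + r["level4"]
    return dict(totals)

def percent_change(start, end):
    """Relative change from start to end, as a percentage."""
    if start == 0:
        raise ValueError("start must be nonzero")
    return (end - start) / start * 100

totals = proficient_by_grade(rows)
change = percent_change(totals[3], totals[7])
print(f"Grade 3 to grade 7: {change:.1f}%")
```

A negative result means fewer students reached level 3 or 4 in the later grade than in the earlier one.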

This made me curious as to why this was happening, so I graphed some statistics around 7th grade.

I graphed just the 7th grade mean scores over the 10 year span, which showed a sharp increase in the mean score in 2018 that held until 2022, despite COVID, followed by a decrease in 2023. The gap in 2020-2021 is due to COVID restrictions, when testing was temporarily suspended. I also looked at level 1 students who are ELL students. The trend was also downward.
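One way to keep the suspended testing years visible as a gap, rather than letting a line connect straight across them, is to give every year in the range an entry and leave the missing years empty. This is only a sketch under that assumption; the scores are invented and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented (year, score) pairs for illustration; note there are no 2020 or 2021 rows.
rows = [(2018, 600.0), (2018, 604.0), (2019, 603.0), (2022, 601.0)]

def yearly_means(rows, first_year, last_year):
    """Mean score per year, with None for years that have no data."""
    by_year = defaultdict(list)
    for year, score in rows:
        by_year[year].append(score)
    return {
        year: (mean(by_year[year]) if year in by_year else None)
        for year in range(first_year, last_year + 1)
    }

series = yearly_means(rows, 2018, 2022)
for year, value in series.items():
    print(year, "no data" if value is None else round(value, 1))
```

Most charting tools can draw an empty value as a break in the line, which makes the missing years explicit instead of implying a trend through them.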

I then compared the percent of level 1 students in each borough.
When I started with the data, I tried looking at average proficiency across all schools and all grade levels. The resulting numbers were grossly inaccurate, and the graph was difficult to read, which obscured any information it might otherwise have shown. I redid it with a focus on one grade level and one score over a 10 year period, as that seemed more impactful.

When I first graphed the data, the numbers were not reasonably close. I tried again and was able to achieve more reasonable numbers. The number of entries I excluded because only 1 test was present is, I feel, a strong indicator of a data gap. Another possible direction for further research would be to see whether there is a consistent group that stays below level, or whether that is a misguided understanding of the system. To verify that, I would need a significantly larger data set. I also could have created or found a data set that shows the data in the context of population, neighborhood income information and neighborhood safety data, as income and safety have both been shown to impact test scores. This data would probably be best shown as a choropleth map, as the neighborhood usually affects the school.
I could have used more consistent colors for the 7th grade graph representing level 1 scores. I wanted to differentiate the two graphs, but that might have made it confusing. I think I could have also organized the data using an outside platform to help further limit the unhelpful data (namely schools that reported only 1-10 test scores).
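A small sketch of what that filtering step might look like, assuming each entry records how many students were tested. The `number_tested` field name, the school labels, and the cutoff of 11 are all assumptions chosen to match the 1-10 range mentioned above.

```python
# Invented entries for illustration; field names are assumptions.
entries = [
    {"school": "A", "number_tested": 1},
    {"school": "B", "number_tested": 8},
    {"school": "C", "number_tested": 45},
    {"school": "D", "number_tested": 120},
]

def drop_small_entries(rows, min_tested=11):
    """Keep only entries with at least `min_tested` students tested."""
    return [r for r in rows if r["number_tested"] >= min_tested]

kept = drop_small_entries(entries)
dropped = len(entries) - len(kept)
print(f"kept {len(kept)} entries, dropped {dropped}")
```

Reporting how many entries were dropped alongside the results makes the size of the data gap visible to the reader.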