Library After-School Programs & Youth Safety


Visualization

Click here to access the interactive map in CartoDB.

For this lab in CartoDB, I wanted to visualize multiple data sets to examine relationships between different geographic variables. I decided to use this data on after-school programs in NYC and to focus on the public library branches in the data, as opposed to schools and other community centers that are also included. To map these locations in relation to a data set with more quantitative variables, I chose to use this data compiled by the NYC Administration for Children’s Services (ACS) on child abuse/neglect investigations in each community district for the years 2010-2014. The ACS defines child abuse/neglect as: “the act, or failure to act, by any parent or caretaker that results in the death, serious physical or emotional harm, sexual abuse, or exploitation of a child under the age of 18.”

Because the ACS data relates to child/teen welfare, I thought there could be a story to tell by comparing the prevalence of abuse investigations by community district to the locations and concentration of libraries that offer youth services in the form of after-school programs. This story informed the goals I had for what information to emphasize and how the user should experience the map. I entered the lab aiming to do at least the following:

  • Map the community districts and the library locations together to show the relationship between them
  • Let users quickly determine which community districts have the highest prevalence of child abuse/neglect investigations and immediately identify the libraries that offer after-school programs
  • Let users draw conclusions about the youth library services offered in high-risk areas of the city
  • Include filters so users can manipulate the displayed data, making it possible to communicate more information within a single map

Inspirational Visualizations

I like this Danger Perception Map of NYC because it shows the relationships between spaces in the city and specific geospatial points. Color is used to encode different reasons why students feel certain areas are dangerous and the legend provides more precise reasons by percentage, while the points represent important locations for the students. It is easy to see where the two elements intersect. This made me consider how I would use color in my map and led me to think color would be most effective for encoding the community district statistics, since the library points are qualitative rather than quantitative.

This map created by the Institute for Children, Poverty, and Homelessness depicts the percent of cost-burdened renters divided by community district, the same way my data is organized. It is simple but effective, using just one color to encode a single variable, but incorporating a second color for sections of the map that are not community districts (parks and airports). It is also useful that the boroughs are divided with a black line, while the community districts are divided by white, making them less distracting. I incorporated this into my own design in order to show both borough and community district divisions without confusion.

I found this very interesting map of the homeless population in Los Angeles that plots individuals/groups within census tracts. When the user hovers over the map, an outline of the corresponding tract appears along with the total number of people/groups/etc. in that area. I would really like to incorporate this feature into my visualization in the future, but I need to figure out how to create the outline when the user hovers. It appears to require custom code or a plug-in, and I need to investigate further.

Methods & Results

Before the lab, I needed to prepare both sets of data. For the after-school program data, I filtered by location to isolate the libraries and then geocoded each address. The ACS data required normalization, and I also needed to change the community district codes to match those in the community district shapefile (e.g., BX01 became 201).
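The recoding step can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure I actually followed: the function name is hypothetical, and apart from BX (the one example given above) the two-letter borough prefixes are assumptions about how the ACS data labels each borough.

```python
# Borough digit used as the first digit of the shapefile's district code.
# Only "BX" -> 2 is confirmed by the example in the text; the other
# prefixes are assumed spellings and may differ in the real ACS data.
BOROUGH_DIGIT = {"MN": 1, "BX": 2, "BK": 3, "QN": 4, "SI": 5}


def recode_district(code: str) -> int:
    """Convert a code such as 'BX01' to the numeric form 201."""
    prefix = code[:2].upper()
    number = int(code[2:])
    return BOROUGH_DIGIT[prefix] * 100 + number


print(recode_district("BX01"))  # 201
```

Keeping the mapping in one dictionary makes it easy to correct a prefix if the source data turns out to use a different abbreviation.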

I imported the two data sets and the community district shapefile, then merged the shapefile and the ACS data on their shared community district codes. CartoDB misread some of the data, which required further changes: I removed the commas from numbers (they turned into decimal points upon import) and removed the “%” signs, which had caused percentages to be read as strings of text rather than as numbers.
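A cleanup like this could also be done before import. The sketch below is a hypothetical helper, and the sample values are made up for illustration rather than taken from the ACS data.

```python
def clean_number(raw: str) -> float:
    """Strip thousands separators and percent signs, then parse as a number."""
    return float(raw.replace(",", "").replace("%", "").strip())


print(clean_number("1,234"))  # 1234.0
print(clean_number("38.5%"))  # 38.5
```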

Starting with the library points as the first layer, I decided to use the category encoding option to differentiate between full-time and part-time weekly hours for the after-school programs. There are many attributes in this data set, but I wanted to emphasize the availability of library services and weekly hours were most fitting for this. I include more variables in the hover and click boxes, however, so users can see more information. Together, these variables provide a general summary of what a given library offers to youth in that area.

The ACS data required much more experimentation because of its quantitative variables. The main difficulty was caused by the inclusion of multiple years in the data. I tried to make a time-slider that would allow all years to be included on a single layer on top of the library location map, but this function is not yet available for polygon shapefiles in CartoDB. Instead, I tried creating a layer for each year using the filtering function. However, the program would not recognize 2014 as a filtering option, and there is a 4-layer limit. I could not resolve these issues, but ultimately this was for the best. I re-evaluated my approach and determined that displaying change over time was not relevant to the geographical information I aimed to convey to the user. Using SQL, I isolated the 2014 data in order to show only the most recent information.
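The year filter amounts to a single WHERE clause. The sketch below runs such a query against an in-memory SQLite table from Python so that it is self-contained; the table name, column names, and rows are all hypothetical, and the query I ran in CartoDB used that tool's own table and column names.

```python
import sqlite3

# Build a tiny stand-in table; names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE acs_investigations (cd INTEGER, year INTEGER, investigations INTEGER)"
)
conn.executemany(
    "INSERT INTO acs_investigations VALUES (?, ?, ?)",
    [(201, 2013, 10), (201, 2014, 12), (202, 2014, 7)],
)

# Keep only the most recent year.
rows_2014 = conn.execute(
    "SELECT cd, investigations FROM acs_investigations "
    "WHERE year = 2014 ORDER BY cd"
).fetchall()

print(rows_2014)  # [(201, 12), (202, 7)]
```

Only the SELECT statement would carry over to a mapping tool's SQL panel; the table setup exists just to make the example runnable.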

Layer showing indication rates.

I experimented with the choropleth map layout using different variables in the ACS data. I ultimately chose to use two more layers to encode the number of investigations and the indication rate in each community district. As noted on my map, the indication rate refers to the share of investigations in which credible evidence was found to substantiate the claims (more helpful background on investigations can be found in this ACS report). Both of these layers are effective in showing the prevalence of abuse, the former in terms of the total number of investigations and the latter in terms of how many cases were verified (perhaps a more precise indicator of the state of child welfare in a given area). Again, I include more information in the hover and click boxes so the user can see other variables.
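To make the second variable concrete, here is a minimal sketch that assumes the indication rate is expressed as the percentage of investigations that were indicated. The function name and the sample figures are hypothetical; the ACS data may already supply this value as a precomputed column.

```python
def indication_rate(indicated: int, investigations: int) -> float:
    """Percentage of investigations in which credible evidence was found."""
    if investigations == 0:
        return 0.0  # avoid dividing by zero for districts with no investigations
    return 100.0 * indicated / investigations


print(round(indication_rate(40, 100), 1))  # 40.0
```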

Layer showing the number of investigations.

After the lab period, I was not satisfied with my map because I wanted a layer that would present a more general and more easily interpreted view than the numerous community districts could convey. To summarize the data even more clearly, I imported a shapefile for the five boroughs and added a column for the total number of investigations in each borough in 2014. This became my fourth layer. I found that by increasing the transparency of each borough's color, it is possible to view the community district layers at the same time (if both layers are enabled by the user). This gives the map more depth by adding another dimension for users to interpret. Now they can examine the data by community district, by borough, or both together.
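The borough totals can be derived from the district-level counts, since the first digit of each three-digit community district code identifies the borough. The sketch below is one possible way to do that roll-up; the function name and the sample counts are made up for illustration.

```python
from collections import defaultdict

# First digit of the three-digit district code identifies the borough.
BOROUGH_NAME = {1: "Manhattan", 2: "Bronx", 3: "Brooklyn", 4: "Queens", 5: "Staten Island"}


def borough_totals(district_counts: dict[int, int]) -> dict[str, int]:
    """Sum district-level investigation counts into one total per borough."""
    totals = defaultdict(int)
    for cd, count in district_counts.items():
        totals[BOROUGH_NAME[cd // 100]] += count
    return dict(totals)


sample = {201: 12, 202: 7, 101: 5}  # hypothetical counts, not ACS figures
print(borough_totals(sample))  # {'Bronx': 19, 'Manhattan': 5}
```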

Layer of borough totals displayed over layer of investigations by community district.

Conclusions & Future Directions

Interestingly, there appear to be more libraries in community districts that have more child abuse/neglect investigations, most visibly in the Bronx and Manhattan. The next step will be to calculate the exact number of libraries in each district and compare districts on that number, as well as on the rank assigned to each district in the ACS data. I will need to find a way to integrate those calculations into the visualization in a meaningful way. I hoped to start by adding a new column to the data within CartoDB for the total number of library programs in each district, but I had difficulty updating the data with the SQL query applied and did not want to risk introducing errors before submitting this lab.
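Counting libraries per district is a point-in-polygon problem. The following is a rough sketch of that calculation using a plain ray-casting test, under simplifying assumptions: each district is a single simple polygon given as a list of (x, y) vertices, and the district shapes and library coordinates below are invented for illustration. Real district boundaries can be multi-part shapes, and in CartoDB itself, which runs on PostGIS, the same count would more likely be expressed with a spatial predicate such as ST_Contains.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_district(points, districts):
    """Return {district_code: number of points falling inside it}."""
    counts = Counter({cd: 0 for cd in districts})
    for x, y in points:
        for cd, polygon in districts.items():
            if point_in_polygon(x, y, polygon):
                counts[cd] += 1
                break  # assume districts do not overlap
    return dict(counts)


# Two hypothetical square "districts" and three hypothetical library points.
districts = {
    201: [(0, 0), (2, 0), (2, 2), (0, 2)],
    202: [(2, 0), (4, 0), (4, 2), (2, 2)],
}
libraries = [(1, 1), (0.5, 1.5), (3, 1)]
print(count_points_per_district(libraries, districts))  # {201: 2, 202: 1}
```

The resulting counts could then be joined back to the district table as the new column described above.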

I am also questioning the effectiveness of having a layer for the number of investigations by community district and a layer for indication rate. Switching between the layers does not facilitate comparison between the two variables. As I plan to use this visualization for user testing, I hope to gain insight into what users will find most helpful to be represented in the map.

The visualization raises several questions for further research, including:

  • Why are all of the after-school programs in Queens part-time while those in the rest of the city are full-time?
  • Why are libraries concentrated in certain areas more than others? Is this related to population?
  • What are the population and demographics of each district?
  • Does the number of investigations correspond to other factors, like population?
  • How many children and teens are utilizing library resources?

My main concern about this visualization is that it represents just one measure of youth welfare conditions and does not account for the many variables behind the ACS data. It may be useful to create a series of visualizations that show more dimensions and paint a broader picture of the living conditions faced by youth such as poverty, hunger, and homelessness/runaway rates. An assessment of the quality of schools could also be important to analyze and would be an opportunity to include the after-school program locations that I left out of this map. Another option could be to explore a specific borough more deeply, accounting for more variables and analyzing the actual programs offered by the libraries. This is a huge topic with many possible directions, but my map could be a good starting point.

References

Abuse/Neglect by Community District data. Retrieved from https://data.cityofnewyork.us/Social-Services/Abuse-Neglect-by-Community-District-CD-/rnjn-x48k

After-School Program data. Retrieved from https://data.cityofnewyork.us/Social-Services/After-School-Programs/6ej9-7qyi

NYC Administration for Children’s Services. (2001). Progress on ACS Reform Initiatives, Status Report 3. Retrieved from http://www.nyc.gov/html/acs/downloads/pdf/stats_status_report3.pdf

NYC Administration for Children’s Services. “What is Child Abuse?” Retrieved from http://www1.nyc.gov/site/acs/child-welfare/what-is-child-abuse-neglect.page