
Summary
INFO 658 – Information Visualization – Spring 2025 Semester – Tableau Report and User Testing Assignment
This report uses Tableau to visualize variables in a dataset containing more than 3,000 K-12 school shooting incidents that took place in the U.S. from 1966 to 2024.
Context
I wanted to seek out more information about school shootings after seeing “Wake Up,” a data story with a very striking design. The author animated beautiful transitions between each part of the story, and also used the same visual language of moving data points to create a series of illustrations about a parent saying goodbye to their child, only to never see them alive again. I liked their colour choices (red, black, and white) and thought the design choices were perfect for the theme. There were a lot of interesting charts and I wanted to find out more.

Data
“Wake Up” did not cite a data source, but I found a database by David Riedman that is publicly available, frequently updated, and widely cited. Upon requesting access, I was given an Excel workbook with unique identifiers for each incident and detailed variables describing the situation, shooters, victims, weapons, latitude and longitude, and more. For most of the incidents, the source (which includes a reliability rating) is a news article.
To clean the data, I removed some Excel formatting and a handful of commas in some of the observations, and filled all empty cells with “NA.” A full summary of each incident is also included, but I omitted those for this project in order to quickly make a clean CSV. I exported each of the sheets as a CSV and linked them in Tableau with [Incident_ID] as the primary key. At first, I made the mistake of trying to merge all the sheets together, but realized I couldn’t because there is a one-to-many relationship between incidents and shooters (many incidents involved more than one shooter). I then read the CSVs into R for some basic analysis and calculations, making frequency tables to get a quick idea of how the data could be aggregated and visualized.
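To illustrate why the one-to-many relationship matters for counting, here is a minimal sketch. It uses Python rather than R or Tableau, purely for illustration, and it is not the code used for this project. Only the [Incident_ID] key comes from the workbook; the other column names and every row below are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical miniature versions of two exported sheets. Only the
# Incident_ID key comes from the report; every other column name and
# value here is made up for illustration.
INCIDENTS_CSV = """Incident_ID,Year,Location_Type
A1,2019,Outside
A2,2019,Inside
A3,2021,Outside
"""

SHOOTERS_CSV = """Incident_ID,Gender
A1,Male
A1,Male
A2,Female
A3,NA
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


incidents = read_rows(INCIDENTS_CSV)
shooters = read_rows(SHOOTERS_CSV)

# Index incidents by their key so each shooter row can look up its incident.
incident_by_id = {row["Incident_ID"]: row for row in incidents}

# One-to-many join: one output row per shooter, so an incident with two
# shooters (A1) appears twice in the joined rows.
joined = [{**incident_by_id[s["Incident_ID"]], **s} for s in shooters]

# Frequency tables. Counting years on the joined rows double-counts
# incident A1, so incident-level counts use the incidents table directly.
incidents_per_year = Counter(row["Year"] for row in incidents)
joined_rows_per_year = Counter(row["Year"] for row in joined)
shooters_by_gender = Counter(row["Gender"] for row in shooters)

print("Incidents per year:", dict(incidents_per_year))
print("Joined rows per year:", dict(joined_rows_per_year))
print("Shooters by gender:", dict(shooters_by_gender))
```

In this toy example, the joined rows report three entries for 2019 even though only two incidents occurred, which is the kind of inflation a flat merge of all sheets would introduce.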
Research
For inspiration, I searched data visualizations on the same topic to see how different people interpreted the variables and themes.
Datawrapper Blog uses the same dataset that I found and shows a map, incident counts by year, and a breakdown of school types showing that the proportion of shootings at elementary schools is growing.
The Washington Post created what looks like a bee swarm chart with each year since Columbine (1999) plotted onto a circle. They maintain their own database, which is available on GitHub.
On Tableau Public, I found a few visualizations that were interesting, many of which used the same database (the first two were created by the database owners):
K-12 School Shooting Database & The Information Lab Collab by David Riedman
K-12 School Shooting Database Center for Homeland Defense and Security at Naval Postgraduate School
USA School Shootings in 2023 by Carlos Pacheo
School shootings in America by Amarendra
Twenty Years of Gun Violence in US Schools 1999-2018 by Joshua Preston
Fatal School Shootings by Stephen Mutea
Preventable Catastrophe by Elisa Davis
School Shooting Charts
In Tableau, I then created a series of charts to make sense of the database. To start, I charted the number of incidents by year. There is a small dip visible in 2020 when the pandemic hit, but otherwise it appears that school shootings, according to this database, have increased dramatically in frequency starting around 2017.

The four deadliest incidents so far have been highlighted above, and the chart below shows the number of fatalities for each one.

There were several types of location data that described where on the school property the shootings occurred. This chart shows the data grouped by location type and reveals a significant increase in shootings that occurred outside on school property, with the pandemic as a seeming turning point.

Shooters who made it inside the building were less likely to flee or escape, according to the outcomes charted as a flow from shooter location.

A significant portion of shootings occurred at a sporting event. Some databases do not include these incidents and limit the data to school hours only. Apart from those, dismissal time was the next most frequent time of day for shootings to occur over the last seven years.

Since 1966, the reasons, if known, and weapon types have not changed too much. These charts show the categories of shooting incident reasons (the [Situations] variable) and weapons from 1966-2024.

Similarly, school shootings are overwhelmingly committed by males, and that has not changed over time, either. This chart shows the gender proportions of the shooters from 1966-2024, with unknown and others omitted because they were not visible at this scale.

The data on race is not comprehensive enough to draw any conclusions. Since the database extracts a lot of incident information from news reports, we are probably seeing biases in writing when it comes to whether or not race is mentioned at all (especially if the perpetrator was white), or over-emphasized if the perpetrator was not white.


User Testing
To check whether these charts were intelligible, I conducted some informal user testing. I recruited a couple of middle-aged people who are parents of school-age children and incorporated their feedback about chart effectiveness.
Previous versions and experiments
The following are the charts that didn’t make the final cut or could be explored further.
Most maps of school shooting incidents look like this (below) and show the absolute number of incidents around the country. To explore this further in the future, I would like to bring in some census data and calculate the number of shootings per capita.
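A per capita calculation of this kind could be sketched as follows. This is only an illustration in Python of the arithmetic involved: the state names, incident counts, and populations are hypothetical placeholders, not figures from the database or the census.

```python
# Hypothetical counts and populations; none of these figures come from
# the school shooting database or from census data.
incident_counts = {"State A": 30, "State B": 30, "State C": 5}
populations = {"State A": 10_000_000, "State B": 1_500_000, "State C": 500_000}


def rate_per_million(count, population):
    """Return the number of incidents per one million residents."""
    return count * 1_000_000 / population


rates = {
    state: rate_per_million(incident_counts[state], populations[state])
    for state in incident_counts
}

# Rank states by rate rather than by absolute count.
ranked = sorted(rates, key=rates.get, reverse=True)

for state in ranked:
    print(f"{state}: {incident_counts[state]} incidents, "
          f"{rates[state]:.1f} per million residents")
```

In this made-up example, State A and State B have the same absolute count, but State B ranks far higher once population is taken into account, which is the difference a per capita map would surface.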

Both the donut chart for gender and this stacked bar chart communicate the same message effectively, so it was hard to choose between them. User testers preferred the donut chart, possibly because it was easier to discern the two parts from each other.

These tree maps illustrate the difference in location types before and after the sharp increase, but the feedback I received was that the stacked area chart provides a little more information, such as the point during the pandemic where shootings increasingly occurred outdoors. After seeing the draft below, users suggested that I try a stacked area chart with a wider range of years.



The two charts above did not elicit any strong emotions from the testers, possibly because they contain too much information. The stacked area chart was ultimately chosen over the bubble chart below, but I think this bubble chart has some interesting spatial analogies that could be turned into a data illustration.

Users asked more questions about these two drafts showing the time of day that the shooting occurred than the chronological bar chart that was selected in the end. For the donut chart, the clock-like shape made it unclear whether each segment was an amount of time or the number of shootings, despite the title.

Although this bar chart is sorted to show the most frequent times, the chronological bar chart was preferred by the testers.

Concluding Thoughts
For this database, the criteria for inclusion seem to be broad, so it has more incidents than some others. The Datawrapper blog post mentioned above suggests there may be a recency bias at play. According to this interview, the creator of the database started the project in 2018, which coincides with the sharp increase in recorded shootings seen in the charts. However, other charts, like this one from CNN, show a similar pattern using different sources. A synthesis of multiple school shooting data sources would be a logical next step to round out the analyses in this report. I would also like to bring in population data for more robust spatial analysis.