Introduction
As a new resident in the United States, I have recently had significantly more exposure to the ongoing opioid crisis. Although the crisis is widely reported internationally, my recent proximity to evidence of widespread drug use and to community-based efforts, such as Narcan training sessions, has emphasized how far it has permeated cities across America. I decided to use this as an opportunity to dive further into the available data and understand firsthand the scale of the issue. By using a dataset with both high-level demographic indicators and detailed drug-specific statistics, I hoped to gain an understanding of the full range of the problem and perhaps surface interesting trends that might help support solutions.
Visualizations that Informed my Design
A large number of visualizations were available for reference, which meant I was able to see examples of both good and bad practice.
I found Figure 1 to be a good reference as it was similarly based on state-level drug data over a few years. It provided a good indication of dimensions that were worth exploring to show a range of movement over time. However, the visualization was also an example of a cluttered dashboard that hindered understanding. For example, it is not clear what the chart in the background is meant to represent, as it lacks a clear title, labels, and axes. I was therefore conscious of using white space in my dashboard to separate charts and of clearly displaying all the labels needed to understand them.
Figure 2 was more useful as a reference for dashboard layout and use of color. Stacking the charts made it easy to read each one individually. The simple but effective use of color (limited to two per chart, always with a prominently displayed key) also informed my dashboard design.
Materials
To produce this visualization, I used Tableau Public. Each chart/graph was created individually and then collated in a summary dashboard. The data for this visualization was produced by the state of Connecticut and was found on the data.gov website here. I then used OpenRefine to clean and group the data before moving it to Tableau Public.
Methodology
I chose the dataset from Connecticut for two particular reasons:
- It spanned 5+ years (2012-2018), allowing for richer illustrations of trends and movement over time;
- State-level data that included demographic and drug-specific statistics meant I would be able to explore overarching trends, and more minute details, providing a more complete picture of the crisis in Connecticut.
The data was already in a good format and didn’t need much manipulation. However, many causes of death were duplicated because of misspellings or slight variations in how they were typed, so I used OpenRefine to standardize the ‘COD’ column (Figure 3). There were also empty columns for state details, so I populated these with ‘CT’ in OpenRefine to avoid unnecessary blanks in the dataset (Figure 3).
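For readers who want to reproduce this cleaning step outside OpenRefine, the idea can be sketched in Python. This is only an illustration, not the process I actually followed: it assumes a fingerprint-style key (lowercase, punctuation removed, tokens sorted) to group near-identical spellings, and the ‘State’ column name and the sample rows are hypothetical.

```python
import string
from collections import Counter, defaultdict


def fingerprint(value):
    """Build a normalized key so near-identical spellings collide."""
    cleaned = value.strip().lower()
    # Drop punctuation so "Heroin, Fentanyl" and "heroin fentanyl" match.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate tokens so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def standardize(rows, column="COD", state_column="State", default_state="CT"):
    """Return copies of rows with one spelling per cluster and no blank state.

    The column names are assumptions for this sketch.
    """
    clusters = defaultdict(Counter)
    for row in rows:
        clusters[fingerprint(row[column])][row[column]] += 1
    # Use the most common original spelling as the label for each cluster.
    canonical = {
        key: counts.most_common(1)[0][0] for key, counts in clusters.items()
    }
    cleaned_rows = []
    for row in rows:
        fixed = dict(row)
        fixed[column] = canonical[fingerprint(row[column])]
        if not fixed.get(state_column, "").strip():
            fixed[state_column] = default_state
        cleaned_rows.append(fixed)
    return cleaned_rows


# Hypothetical sample rows, not taken from the real dataset.
sample = [
    {"COD": "Acute Fentanyl Intoxication", "State": ""},
    {"COD": "acute fentanyl intoxication ", "State": "CT"},
    {"COD": "Acute Fentanyl Intoxication", "State": ""},
    {"COD": "Heroin Toxicity", "State": ""},
]
cleaned = standardize(sample)
```

A key like this only catches differences in case, spacing, punctuation, and word order. Genuine misspellings would still need manual review, which is where most of the effort in OpenRefine went.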
Once the data was ready, I imported it into Tableau to begin building my charts. Each one was created individually and compiled in a dashboard to allow for direct comparison between state-wide demographic and drug-specific statistics. I wanted to create visualizations with time and demographic ranges, which would allow me to understand where and/or when certain trends developed. Scale was also important to me, as I was unfamiliar with the size of the opioid crisis at a more granular state level, so I also included charts detailing death rates across different dimensions.
Data clarifications:
- In the map visualization, 145 cities and their corresponding death counts were not included. Tableau did not recognize these cities due to their small populations (under ~4,000). For the same reason, their death counts would have been low, so excluding them did not significantly affect the visualization. Note that the cities on the map are cities of residence, not the cities where the deaths occurred.
- In the drug-specific charts, deaths are likely counted more than once. The original .csv was structured to show the combination of drugs present at the time of death, so separating these into individual categories duplicates any death where more than one drug was present. However, as I was interested in scale, seeing how frequently certain drugs appeared relative to others still added to my understanding of the crisis at a more granular level.
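
The double counting in the drug-specific charts can be illustrated with a small Python sketch. The records below are made up for illustration and the field names are hypothetical. The point is only that counting each drug separately produces more drug mentions than deaths, while each drug's share of all deaths remains meaningful.

```python
from collections import Counter

# Hypothetical records: each death lists every drug that was present.
deaths = [
    {"id": 1, "drugs": ["Fentanyl", "Heroin"]},
    {"id": 2, "drugs": ["Cocaine"]},
    {"id": 3, "drugs": ["Fentanyl", "Cocaine", "Heroin"]},
    {"id": 4, "drugs": ["Fentanyl"]},
]


def drug_mentions(records):
    """Count how many deaths involved each drug."""
    counts = Counter()
    for record in records:
        # A set ensures a drug is counted once per death at most.
        counts.update(set(record["drugs"]))
    return counts


mentions = drug_mentions(deaths)
total_deaths = len(deaths)
# Exceeds total_deaths whenever any death involved more than one drug.
total_mentions = sum(mentions.values())
# Share of all deaths in which each drug was present.
share = {drug: n / total_deaths for drug, n in mentions.items()}
```

Here the per-drug counts sum to more than the number of deaths, so the drug-specific bars should be read as "deaths involving this drug" rather than added together.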
Results and Interpretation
The results indicated that the rate and trends for accidental drug-related deaths in Connecticut were largely consistent with those often seen across America: rapidly increasing over time, largely male, occurring across all age groups and with the same drugs found frequently across the country (such as fentanyl, heroin, and cocaine).
I found the year/month-based graph especially curious. There appears to be a clear pattern: the first half of the year is the least deadly, and the last three months are the most deadly.
The map also provided an interesting overview, as it clearly illustrates that accidental overdoses are not confined to specific areas of the state. Due to population density, some cities have much higher death counts, but the dots are spread across the whole state, indicating a widespread problem.
Reflections and Future Directions
I enjoyed working with Tableau and found it easy to manipulate data in. Using OpenRefine first to clean the data made a significant difference and made it much easier to sort categories into groups in Tableau. I like the charts I created but think the final dashboard is too crowded. In my attempts to understand both high- and low-level trends, I may have included too much information. A better approach would probably have been to create two separate dashboards.
In terms of future directions, I would like to continue exploring significant parts of the data I did not have time to go into. This includes the causes of death, injury type, and injury location columns. There is a lot of manual work required in OpenRefine to ensure these categories can be manipulated in Tableau, but I think they have the potential to produce very interesting charts.
Having a greater understanding of drug and opioid-related terms would also be very useful for future visualizations, especially when trying to determine how different labels can and should be grouped.
Dashboard
The final dashboard can be viewed here.