Visualizing Concerning Experiences Online with Tableau Public

Framing the Experiment

As more of daily life is experienced online, reports of both positive and negative experiences specific to online environments will increase. Abuse and harassment are gendered experiences, as are experiences on digital platforms. How prevalent are concerning experiences online? How do women, men and gender non-conforming individuals experience this differently? How are online platforms responding?

The dataset I wanted to use for this experiment does not exist. Historical, gender-sensitive, geographically and ethnically diverse, statistically representative (or at least large-scale), academically sound surveys on the impact, experiences, opinions, and use of online platforms have yet to be published. This is somewhat unsurprising, given how underappreciated the importance of gender generally is, but I did expect more disaggregation in what I found.

For visualization inspiration, I referenced three books: The Women’s Atlas by Joni Seager (2018) in paperback, and two hardcovers by David McCandless, Knowledge is Beautiful (2014) and Information is Beautiful (2012). I also looked at the charts and graphs provided by the dataset sources. Pew’s visualizations in this write-up of the data embed survey responses as labels in the graphs. I chose graph styles that highlighted these labels even further.

The corresponding Amnesty report (datasets are described in detail below) includes analysis and interpretation, but no dashboard environment for exploring or visualizing the data. A related blog post did represent some of the data visually, and it inspired some of my visuals. I liked the simplicity of its bar charts and that the percentages were shown as labels, and I applied that logic to my own representation of the data.

bar chart of data showing percentage of women who think response of social media platforms to abuse has been inadequate

Image Source: Unsocial Media: The Real Toll of Online Abuse against Women, Amnesty Global Insights, Nov. 20, 2017

Materials, Software and Datasets

The software used includes Tableau Public 2020.4 for data visualization, Google Sheets for cleaning and organizing data, and Google Docs for drafting the lab report, which was posted on Pratt-hosted WordPress. These web applications were run on Chrome in a Windows 10 environment on a Dell XPS 15 900. Peer reviews were conducted using Zoom for desktop.

Three datasets were used. I cleaned and formatted them in several collated Google Sheets, which I then connected to Tableau (available to view here).

2017 Amnesty International-commissioned Ipsos MORI online poll of women aged 18–55 in the UK, USA, Spain, Denmark, Italy, Sweden, Poland and New Zealand about their experiences of online abuse or harassment on social media platforms, in .xlsx format. To import only the relevant data into Tableau Public, I found it most efficient to manually select that data from the original dataset and copy it to Google Sheets.
The Pew Research Center American Trends Panel (ATP) Wave 74 (W74), from September 2020. I used two questions from this poll. First, for all survey respondents (N = 10,093), “Which, if any, of the following have happened to you, personally online?” and second, of that group who answered “Yes” to any (N = 3,893) “In which of the following online environments did your most recent experience occur?”
From the latest Facebook Transparency Report on Community Standards Enforcement, released February 2021, I used the data on “content taken action on” that violated the community standards pertaining to bullying and harassment.

Methods & Processes

Creating a logical flow of information across such diverse datasets through a dashboard design was a fun challenge. I noticed how easy it is to make a dashboard that inaccurately represents the information, especially with so many different sources, sample sizes, attributes and magnitudes. So I deliberately chose to include the sample size N, the source of the data, and some demographic information with each chart. For the first time, I experimented with two colors I almost never use: brown and yellow.

The red annotations show the information I added for clarity

After much deliberation, I settled on a consistent color scale across the dashboard to represent numerical data. I used advanced color modification on each graph to highlight the scale in more contrast (i.e. stepped color, start and end points). I am still uncertain whether this is more interesting or more distracting than using fewer color variants, and I will be experimenting with this in the future.
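To make the stepped-color idea concrete, here is a minimal Python sketch of how a value could be assigned to one of several color steps between a chosen start and end point. It assumes equal-width steps, which is my understanding of how stepped color behaves, and it is not Tableau's actual implementation. The function name stepped_bin and its parameters are hypothetical.

```python
def stepped_bin(value, start, end, steps):
    """Return the 0-based color step for a value on a stepped scale.

    Assumption: the range from start to end is split into equal-width
    steps, and values outside the range take the first or last step.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if end <= start:
        raise ValueError("end must be greater than start")

    # Clamp the value so out-of-range marks still get a color.
    clamped = min(max(value, start), end)

    # Width of each step, then the index of the step the value falls in.
    width = (end - start) / steps
    index = int((clamped - start) / width)

    # The end point itself belongs to the last step.
    return min(index, steps - 1)


if __name__ == "__main__":
    # A percentage scale from 0 to 100 split into 5 color steps.
    for pct in (0, 45, 100):
        print(pct, "-> step", stepped_bin(pct, 0, 100, 5))
```

Setting the same start and end points on every graph is what keeps a given color meaning the same magnitude across the whole dashboard.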

The largest graph in the dashboard warrants discussion: “In which of the following online environments did your most recent experience occur?”

Knowing that an area chart is unorthodox for this data, I chose it because it illustrated two key things: that more people than not have had concerning experiences online, and that these experiences occur across many online environments. The main point I wanted to portray visually is the overwhelming volume of one online environment compared to the others: social media sites. I also padded the interior by 40 laterally and 10 vertically so that the chart would not look too flat.

The dataset only included percentages, not the counts of individual responses. Due to rounding and other statistical manipulation Pew conducted on the raw data before releasing the dataset, it would not have been prudent to crudely calculate the counts from the percentages provided. However, read critically, the percentages add up to more than 100%, from which we can deduce that some respondents reported harassment in more than one online environment.
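Both points can be illustrated with a short Python sketch. The subgroup size of 3,893 comes from the dataset description above, but the environment names and percentages are hypothetical placeholders, not values from the Pew dataset, and the helper names are my own. The sketch also ignores survey weighting, which is a further reason not to back-calculate counts.

```python
import math

# Subgroup size reported for the follow-up question.
N_RESPONDENTS = 3893

# Hypothetical placeholder percentages, NOT values from the Pew dataset.
HYPOTHETICAL_PERCENTAGES = {
    "Environment A": 75,
    "Environment B": 25,
    "Environment C": 14,
    "Environment D": 10,
}


def count_bounds(pct_rounded, n):
    """Whole-number counts consistent with a percentage rounded to an integer.

    Assumption: unweighted data and ordinary rounding. Weighted survey
    percentages widen the uncertainty further.
    """
    low = math.ceil((pct_rounded - 0.5) / 100 * n)
    high = math.floor((pct_rounded + 0.5) / 100 * n)
    return low, high


def excess_selections(percentages):
    """Selections beyond one per respondent, per 100 respondents.

    If every respondent chose at most one option, the percentages could
    not add up to more than 100, so a positive result implies that at
    least some respondents chose several options.
    """
    return max(0, sum(percentages) - 100)


if __name__ == "__main__":
    for name, pct in HYPOTHETICAL_PERCENTAGES.items():
        low, high = count_bounds(pct, N_RESPONDENTS)
        print(f"{name}: {pct}% could be anywhere from {low} to {high} people")
    print("Excess:", excess_selections(HYPOTHETICAL_PERCENTAGES.values()))
```

A single rounded percentage is compatible with a range of dozens of possible counts at this sample size, which is why the dashboard shows percentages rather than reconstructed totals. A small excess over 100 could also come from rounding alone, so only a clear excess supports the overlap reading.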

The red annotation shows where the internal padding is relative to the dashboard section heading

I tried to keep the other visualizations relatively simple given this complicated graph is featured so prominently. To do that, I took full advantage of the tooltips and details. Tooltips are responsive on all graphs and include pertinent information. Having grouped data in Tableau for the top two visualizations, I used color and details to show aggregations.

All the details are included in this tooltip, not visually represented as a label

Example showing tooltip explaining distinction in the data

It also shows the number of people, versus the mark label displayed in %

It proved time-consuming to consistently format the tooltips, axes and labeling individually for each graph, and some minor inconsistencies likely remain.

Results, Interpretations & Reflections

I admittedly gave in to the temptation to continually modify the graphs, the dashboard, the data involved and the styling during this experiment. I oscillated between telling a complete narrative and illustrating key points succinctly, iterating on both during drafting. In the end, I had to decide which main finding to focus on, and I resolved to create different visualizations for the other findings in the future.

Given this was my first foray into Tableau Public, one goal was to test many different types of graphs, and include a variety in my dashboard. This helped me eliminate some graphs from the dashboard and guided further reductions, which were based on what was visually appealing as well as cohesive to the narrative. Below are two of the eliminated graphs.

Moreover, the Amnesty poll surveyed women aged 18–55 in eight countries. The Facebook data is presumably gathered from each country it operates in, but it is not disaggregated by country (even though doing so would be more transparent). As its methodology states, the Pew ATP is a “nationally representative panel of randomly selected U.S. adults” and so has findings for only one country.

Therefore, a world map visualization showing only eight countries, although useful for the Amnesty data, could be confusing when combined with the other data in my dashboard. Had I decided to feature geographic attributes consistently across all datasets, world maps would have been perfect.

Interesting Data Finding

The fact that fewer concerning experiences were reported in 2017 than in the first iteration of the survey could be attributed to the fact that the surveys were not identical. According to footnote 7 in the Pew report, “In the 2017 survey, this was part of a larger battery regarding how much of a problem, if at all, people thought various experiences that might happen to people when they use the internet might be.” Moreover, several metrics were different in 2014 and specific wording changes were applied for the 2017 survey, so the 2017 results could have been artificially lower. Having done extensive topical research outside of this experiment, I have yet to find evidence that 2017 was a unique moment online in which fewer negative experiences would have been reported. Alternatively, theories about the digital aftermath of the 2016 U.S. election cycle could be applied in support of decreased negative experiences online.

Future Directions

As stated before, the dataset I wanted to use for this experiment does not exist.

Without the “perfect” dataset, I collated three surveys to craft the narrative (Google Sheet here). To my dismay, the ATP survey was not gender-disaggregated, but it did ask some questions about gender and sexual orientation. Unsurprisingly, the Facebook data was not gender-disaggregated either. The Amnesty poll specifically targeted women.

Building on my years of survey design and implementation experience, I intend to create a survey involving the critical aforementioned aspects. Recalling the lessons learned in this experiment, I will be able to more efficiently and effectively analyze and visualize the resulting information.

Aside from that larger research project, I could create a series on this topic. The next Tableau dashboard would cover opinions about how well social media companies are addressing the problem of online abuse, featuring already-existing data from the Amnesty and Pew polls, among others.

Link to The Tableau Public Dashboard

Information Visualization

Student work at the School of Information, Pratt Institute