For this lab, I wanted to focus on an issue that affects K-12 students both in and outside of the classroom and that also connects to my interest in data literacy and analysis. I began reading about issues arising from children’s and young adults’ social media and internet usage and landed on the Pew Research Center’s archive of surveys, one of which is the 2018 Teens and Tech survey. The survey covered 743 teenagers, ages 13-17, across the United States, and was repeated in 2022 (though the full 2022 data has yet to be released on Pew’s site). Reading through the top-line findings and several of Pew’s follow-up articles analyzing this and other datasets, I learned that although teens in 2018 were generally positive about social media, they also frequently experienced challenges such as bullying and various forms of privacy invasion. They also voiced worries similar to those of adults about their device usage and about leaders moderating or addressing negative experiences online (Anderson & Jiang, 2020; Pew Research Center, 2018). Pew has already visualized some of this data in charts and graphs, but since the survey was so lengthy, I hoped to pull out some different and more specific insights from the responses using several of Tableau’s visualization tools.
I downloaded a few different dataset folders from Pew’s “Tools and Resources” pages. After reviewing their top-line data and methodologies, I narrowed my focus to the Teens and Tech 2018 dataset. Because the data was in SPSS file format, I read the file into a data frame in R to fully translate it, then exported that data frame as a .csv file. I then cleaned the data further in Microsoft Excel before uploading the Excel file to Google Drive. From there, I connected Tableau Desktop to my Google Drive to query this spreadsheet and build my visualizations. I published and exported my visualizations via Tableau Desktop and Tableau Public.
As I was researching datasets to use from Pew, I discovered that the coding for much of their demographic data was missing from their CSVs, and realized it was probably in their original SPSS files. This forced me to go into R to translate and export the full data frames so I could save, understand, and clean the data. In Excel, I renamed all the variables to describe more explicitly the survey question they related to, removed columns with extraneous information or questions I wasn’t going to focus on, and cleaned up formatting inconsistencies such as responses in all caps and a range that had been read in as a date. Once the data was cleaned, I uploaded it to Google Drive and connected it to Tableau.
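To illustrate the kind of cleaning described above, here is a minimal Python sketch. It is not the Excel workflow I actually used, and the column codes (`Q1`, `Q2`), the descriptive names, and the sample responses are all hypothetical stand-ins rather than values from Pew’s file.

```python
import csv
import io

# Hypothetical raw export: coded column names and inconsistent casing.
RAW_CSV = """Q1,Q2
MAJOR PROBLEM,Mostly positive
Minor problem,MOSTLY NEGATIVE
"""

# Hypothetical mapping from coded variable names to descriptive ones.
RENAME = {"Q1": "bullying_is_a_problem", "Q2": "social_media_effect"}


def clean_rows(text):
    """Rename columns and normalize each response to sentence case."""
    reader = csv.DictReader(io.StringIO(text))
    cleaned = []
    for row in reader:
        cleaned.append({
            # Fall back to the original name if no descriptive name is defined.
            RENAME.get(col, col): value.strip().capitalize()
            for col, value in row.items()
        })
    return cleaned


rows = clean_rows(RAW_CSV)
for row in rows:
    print(row)
```

Doing the renaming and case normalization in one pass like this keeps the cleaned file consistent before it ever reaches a visualization tool.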
In Tableau, I used aggregation and normalization to represent the variables in new ways. I calculated percentages instead of raw counts and displayed them in pie charts to show demographic data such as race/ethnicity. I also grouped data, such as income levels, into brackets that were more easily divided and understood (reducing the number of slices from over ten brackets in the original dataset to five). Occasionally, if fewer than five respondents skipped a question entirely and the sample size was still in the hundreds, I excluded the skips in order to visualize the data more clearly. I chose colors to reflect certain cultural connotations; for instance, a green-gold scale to show income brackets and red-orange to highlight negative or “dangerous” issues.
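The percentage calculation and the rule for excluding skips can be sketched in Python as follows. This is only an approximation of what I did through Tableau’s interface: the function name, the `"Refused"` skip label, and the exact thresholds (fewer than five skips, at least 100 remaining responses) are assumptions chosen to mirror the description above.

```python
from collections import Counter


def percent_shares(responses, skip_label="Refused", max_skips=5, min_sample=100):
    """Convert raw response counts to percentages.

    Skipped responses are dropped only when they are rare (fewer than
    max_skips) and the remaining sample is still large (at least min_sample).
    """
    counts = Counter(responses)
    skips = counts.get(skip_label, 0)
    if 0 < skips < max_skips and len(responses) - skips >= min_sample:
        del counts[skip_label]
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Made-up responses for illustration only.
sample = ["Yes"] * 120 + ["No"] * 80 + ["Refused"] * 3
shares = percent_shares(sample)
print(shares)
```

Keeping the exclusion rule explicit in one place makes it easy to see when skips were dropped and when they were left in the chart.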
I decided to focus on some key explanatory charts and tables in one Tableau story. I displayed charts and graphs answering these three questions: How do teenagers use social media? And how does it make them feel in general? What are they most concerned about online?
I created some introductory graphs in the first two parts of the Tableau story: first, side-by-side graphs highlighting the differences in social media use by gender, and then a dashboard showing a couple of the major feelings teens reported about their own internet and social media use. I then created another dashboard to drill down on a major online concern that was highlighted: bullying.
In drilling down into the bullying issue, I represented the data in a few ways. First, I used a heat map colored in a gradation of red-orange to show that, across ages, teens agreed that bullying was a “Major problem.” I also presented this data in a cross-tab to give a numerical representation of the same data. Next, I decided to see whether there was a correlation between age and the opinion that bullying was a Major, Minor, or No problem. To do this, I used the Analytics tab to drag a Trend Line over the data, which was already measuring the total number of respondents in each opinion category by age. Once I was able to see the trend lines organized by opinion, I could view the linear function of each as well as the strength of the fit via the r-squared and p-values for each category.
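For readers who want to see what a linear trend line computes, here is a minimal Python sketch of an ordinary least-squares fit with its r-squared value. It is a stand-in for the idea, not Tableau’s implementation; the function name and the respondent counts are hypothetical, and the p-value that Tableau also reports is not computed here.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by least squares and return r-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r-squared is undefined when y never varies, so report NaN in that case.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else float("nan")
    return slope, intercept, r_squared


# Made-up counts of respondents choosing one opinion category at each age.
ages = [13, 14, 15, 16, 17]
counts = [70, 66, 61, 64, 55]
slope, intercept, r_squared = linear_trend(ages, counts)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r_squared={r_squared:.2f}")
```

A negative slope here would correspond to fewer respondents choosing that opinion as age increases, and an r-squared close to 1 would indicate that the line fits the points closely.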
Results and Interpretation
How do teenagers use social media?***
Note the differences in the social media tools preferred by male and female respondents.
***This data is from 2018, so it will be interesting to see what has changed in the 2022 survey, and whether there are still differences across gender lines. We already know that TikTok has become one of the most popular social media platforms among teens (Pew, 2022).
And how does it make them feel in general?
Note the overall positive feelings, though there were several negative ones, including a) female respondents reporting that they spent “Too Much Time” on the internet at a higher rate than males and b) a still sizable share of all respondents feeling overwhelmed by drama on social media.
What are they most concerned about online?
Since bullying appeared to be the first point of agreement between respondents, I decided to examine this issue in a few ways. One thing that became fairly clear surprised me: in this particular dataset, the older the teenager, the less they appeared to think bullying was a problem. I had hypothesized that as kids aged, they would experience more challenges like bullying behavior online, or at least hear more about it from others. It’s worth mentioning that the p-values are on the higher side and the correlations are not particularly strong here, so this wasn’t a very successful experiment: the trend lines are suggestive at best.
I also built a dashboard to show more about the survey’s sample population in order to give my audience context for the research. Under a story point titled “More about these respondents,” I included demographic data in two pie charts, one on self-reported race/ethnicity and one on income levels, which I had grouped by ranges to simplify the brackets. I also added a stacked bar graph showing the regions of the US represented, as well as how much each region drew upon metro-area versus non-metro-area respondents. Finally, I showed a comparison of the genders represented.
Something that came up almost immediately for me was the need to translate the coding of the demographic data I downloaded from Pew (it was not readily available in their questionnaire, topline, or CSV files). I’m grateful that my background in R programming allowed me to transform and clean this data directly from their SPSS files. Once I did that, though, I found the data transparent and robust to work with. However, I then realized how much more helpful it would have been for trend lines and other analyses to have the data recoded from categorical strings to numerical values. For instance, income was bracketed in strings such as “Less than $5,000” up to “$200,000 and more,” which was difficult to transform in Tableau but would have been easy to work with once coded as ordinal data (where 1, 2, 3, and so on each corresponded to a bracket).
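The ordinal recoding described above might look like the following Python sketch. Only the lowest and highest bracket labels come from the dataset as I described it; the two middle labels are placeholders, and the helper names are hypothetical.

```python
# Brackets listed from lowest to highest. Only the first and last labels are
# taken from the survey; the middle ones are placeholders for illustration.
INCOME_ORDER = [
    "Less than $5,000",
    "Placeholder bracket A",
    "Placeholder bracket B",
    "$200,000 and more",
]


def ordinal_codes(ordered_labels):
    """Map each label to its 1-based rank in the ordered list."""
    return {label: rank for rank, label in enumerate(ordered_labels, start=1)}


def encode(responses, codes):
    """Replace each response with its ordinal code, or None if unrecognized."""
    return [codes.get(response) for response in responses]


codes = ordinal_codes(INCOME_ORDER)
encoded = encode(["$200,000 and more", "Less than $5,000", "Refused"], codes)
print(encoded)
```

Leaving unrecognized responses as `None` rather than assigning them a number keeps skips and refusals from being mistaken for a real income level in a trend line.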
A future direction for this experiment would be to repeat this same survey periodically to create time series data alongside the cross-sectional data for each year. Social media tools and usage change so rapidly that the 2018 data is probably already quite outdated. It doesn’t include now-popular apps like TikTok and BeReal, or even an “Other” category for newer apps. The Teens and Tech 2022 survey results begin to cover these changes, but Pew has only published articles and top-line data for this survey, not the full dataset for download.
I personally found working with Tableau challenging, not because of its large number of features and its ability to customize displays, but because I wanted to be able to mathematically transform data and create conditions for certain variables much more quickly. Understanding what Tableau could and couldn’t do, and what was easier to do outside of it, was a learning curve for me (maybe just due to my amateur level with this software). For instance, I realized that I’d rather clean the data so it’s exactly how I need it for visualizing before querying it in Tableau, rather than trying to clean it or do more complex calculations in this software.
Pew Research Center. (2018, November 28). Teens and Tech Survey 2018. Pew Research Center: Internet, Science & Tech. Retrieved October 21, 2022, from https://www.pewresearch.org/internet/dataset/teens-and-tech-survey-2018
Anderson, M., & Jiang, J. (2020, August 28). Teens’ social media habits and experiences. Pew Research Center: Internet, Science & Tech. Retrieved October 22, 2022, from https://www.pewresearch.org/internet/2018/11/28/teens-social-media-habits-and-experiences/
Resources. Public.tableau.com. (n.d.). Retrieved October 22, 2022, from https://public.tableau.com/app/resources/learn