Visualizing Representation in Children’s/YA Literature (2018-2022) 

While discussions of book banning in school and public libraries have dominated children’s and YA literature discourse for the last few years, I thought it would be rewarding to look at a (potentially) more positive element in the field. To do so, I drew on the Cooperative Children’s Book Center’s Diversity Statistics, an ongoing data project originating in 1985 that tracks the University of Wisconsin-Madison research library’s new acquisitions of teen and children’s literature that include BIPOC, LGBTQ+, disability, and religious representation, in the hope of visualizing recent trends in this publishing sector. By taking data only from 2018-2022, my initial goal (which was of course always dependent on the data) was to visually represent two things: how much change can happen in just a few years’ time, and how limited that change remains once you break down the intersections of the tracked representations that appear in the data.

Before embarking on my design process, I was able to speak with two employees of the CCBC, Madeline Tyner and the director, Tessa Michaelson Schmidt. They discussed with me the organization’s need for simple graphics that could be used in the myriad contexts for which this data is often pulled: by stakeholders, researchers, undergraduate and graduate students, journalists, educators, and authors. In all likelihood, most of these users would not be shocked to hear that BIPOC LGBTQ+ representation still lags significantly behind depictions of white LGBTQ+ characters, but especially given the public outcry about some of these topics appearing in YA and children’s lit, it could still be surprising to see just how little there is of the former when the two are shown side by side. As the only professional entity that has been working on a project like this for as long as it has, the CCBC has considerable interest in being able to share these statistics in accessible and digestible formats for its wide range of users.

Material and Methodology 

Basic formats of this data are freely available on the CCBC’s website, and the center welcomes any researcher or interested party to get in touch with further questions or requests for more material. One of the librarians, Madeline Tyner, is a friend of mine and oversees the logistical elements of the projects; I discussed with them what information I was looking for, and they were able to share a Google Sheets document of relevant information. This originally included all data for books featuring LGBTQ+ representation, starting in 2018, and listed all of the information the CCBC records: ID number, title, ISBN, year, call number, collection (fiction versus nonfiction), genre, publisher, character notes for sexuality, race, disability, and religion, author, and author identity notes. To meet my needs for producing effective visuals, I cut this down significantly and kept only the year, collection, genre, and notes on primary character race and sexuality. The final dataset had 539 rows.

Additionally, there was little in the way of data entry standardization. Notes on race and sexuality were sometimes entered in full sentences instead of individual markers, and given the large discrepancies between the actual entries and the simplified categories that I required, I decided to edit the data manually. Using search tricks to mass-edit small blocks at a time, I ended with two sets of categories, one for race and one for sexuality/gender. The former contained Arab, Asian, Black, Brown-skinned (when characters were described as such but no specific identity was indicated within the data), Indigenous, Latinx, Multiracial, Pacific Islander, Unknown, and White. The latter, which I decided much more subjectively, included Gay, Lesbian, Bisexual, Queer/Unspecified, and Trans/Nonbinary. I originally intended to include trans identities and gender nonconformity in Queer/Unspecified but decided against it when I saw that the range of books was substantial enough to constitute a standalone category.
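The manual search-and-edit pass described above could also be approximated in code. The following is a minimal Python sketch, not the process actually used for this project: the keyword patterns, the function name normalize_race, and the sample notes are all hypothetical, and any real rule set would need to be checked against the actual wording in the CCBC sheet.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
# The patterns are illustrative guesses, not the CCBC's own vocabulary.
RACE_RULES = [
    (r"multiracial|biracial", "Multiracial"),
    (r"pacific islander", "Pacific Islander"),
    (r"brown[- ]skinned", "Brown-skinned"),
    (r"\barab\b", "Arab"),
    (r"\basian\b", "Asian"),
    (r"\bblack\b", "Black"),
    (r"\bindigenous\b", "Indigenous"),
    (r"\blatinx\b", "Latinx"),
    (r"\bwhite\b", "White"),
]


def normalize_race(note, rules=RACE_RULES, default="Unknown"):
    """Map a free-text note to one simplified category label."""
    text = (note or "").lower()
    for pattern, label in rules:
        if re.search(pattern, text):
            return label
    # Nothing recognizable in the note: fall back to the Unknown category.
    return default


# Invented example notes, written in the full-sentence style described above.
sample_notes = [
    "The main character is a Black teen.",
    "Biracial (Black and white) protagonist",
    "Described only as brown-skinned",
    "",
]
cleaned = [normalize_race(note) for note in sample_notes]
```

Because the first matching rule wins, a note that names two groups without using a word like "biracial" would be assigned to whichever rule comes first, so a pass like this would still need manual review of ambiguous entries.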

With the dataset prepared for visualization, my next step was to upload my information into Tableau Public. Not only is Tableau the most accessible program that I encountered in this course, but I also see it as an effective tool for producing the kinds of straightforward visuals that were discussed with the CCBC director. Because I wanted to feel as comfortable and confident as possible during the process, and because the platform fulfilled the needs of the organization, it seemed the most appropriate choice.

The Design Process 

My Tableau design process started with a considerable amount of experimentation: which measures would result in the most accessible and interesting design options? I was aware of certain themes and ideas I expected to express through the data, but I needed to work with it hands-on in order to arrive at the array of visualizations I ended up producing.

The first visual is a line graph featuring all books with LGBTQ+ protagonists across 2018-2022, broken down by every available racial category. The line for ‘White’ soars above every other descriptor, generally increasing with every year save for a drop in 2022 (the CCBC site also notes that since 2020, given the pandemic, they have received fewer review copies in general from publishers). Multiracial consistently records the highest numbers among BIPOC categories (with, in 2021, a difference of over sixty between it and White), with only Black starting to come close in 2022. Arab and Indigenous maintain a level of only one or two total titles across all years. More than any other visual, I believe this one provides a general and clear understanding for users to take away from this data – that while LGBTQ+ representation may be reliably increasing across YA and children’s literature, there are still massive gaps among them when it comes to the racial makeup of these characters and that similar, consistent jumps are not being seen for these BIPOC groups. 
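The counting behind a line graph like this one can be sketched outside Tableau. The Python below is only an illustration of the aggregation: the rows are invented and are not CCBC figures, and the field names year and race and both function names are assumptions made for the example.

```python
from collections import Counter

# Invented rows for illustration only; these are not CCBC figures.
rows = [
    {"year": 2018, "race": "White"},
    {"year": 2018, "race": "White"},
    {"year": 2018, "race": "Multiracial"},
    {"year": 2019, "race": "White"},
    {"year": 2019, "race": "Black"},
    {"year": 2019, "race": "Black"},
]


def titles_per_year_and_race(rows):
    """Count titles for every (year, racial category) pair."""
    return Counter((row["year"], row["race"]) for row in rows)


def line_series(counts, race, years):
    """Return one line of the graph: the count for `race` in each year.

    A Counter returns 0 for pairs that never occur, so categories with
    no titles in a given year still get a point on the line.
    """
    return [counts[(year, race)] for year in years]


counts = titles_per_year_and_race(rows)
years = sorted({row["year"] for row in rows})
white_line = line_series(counts, "White", years)
black_line = line_series(counts, "Black", years)
```

Keeping the zero counts matters for this data, since several categories have years with no titles at all and a missing point would hide that.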

Secondly, I used the packed bubbles design to visualize the distribution of LGBTQ+ categories across all BIPOC representations, again across all years. After taking a broader approach in the first visual, I appreciated being able to delve deeper into the BIPOC representation that is available to see what LGBTQ+ identities are being explored. Queer/Unspecified dominated with 93 entries, with Gay and Trans/Nonbinary far behind but surprisingly close to each other (45 and 32, respectively). While I did not recreate this kind of visual to include White and Unknown, my own experience with this data suggests that Queer/Unspecified making up the majority of LGBTQ+ representation is in line with the overall data. Given that titles were assigned this category when no explicit identity is named in the text but queer or same-sex attraction is depicted, it makes sense that BIPOC narratives follow this pattern.
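The filter-then-count step behind a packed bubbles view can be sketched as follows. This is a hedged Python illustration rather than what Tableau does internally: the rows are invented, the field names race and identity are assumptions, and the exclusion set simply mirrors the choice above to leave White and Unknown out of this visual.

```python
from collections import Counter

# Categories left out of the bubbles view, per the description above.
EXCLUDED_FROM_BUBBLES = {"White", "Unknown"}

# Invented rows for illustration only; these are not CCBC figures.
rows = [
    {"race": "Black", "identity": "Queer/Unspecified"},
    {"race": "Latinx", "identity": "Queer/Unspecified"},
    {"race": "Asian", "identity": "Gay"},
    {"race": "White", "identity": "Gay"},
    {"race": "Unknown", "identity": "Lesbian"},
    {"race": "Multiracial", "identity": "Trans/Nonbinary"},
]


def bipoc_identity_counts(rows, excluded=EXCLUDED_FROM_BUBBLES):
    """Count LGBTQ+ categories among rows whose race is not excluded."""
    return Counter(
        row["identity"] for row in rows if row["race"] not in excluded
    )


# Each entry corresponds to one bubble; the count drives the bubble size.
bubbles = bipoc_identity_counts(rows)
largest_first = bubbles.most_common()
```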

The following two visualizations, Black LGBTQ+ Representation 2018 and 2022, are an attempt at seeing whether the LGBTQ+ identities within a single racial category changed at all between the first and last year of the available data. Interestingly, the overall total exactly doubled from six to twelve, but the makeup changed drastically, from an almost even representation across all categories in 2018 to over half appearing in Queer/Unspecified in 2022. In 2022, Lesbian disappeared entirely.

Next, I included a discrete line graph that compares the categories of Gay and Trans/Nonbinary from 2018-2022 and across all racial categories. I found this one to be possibly the most distinctive and interesting visual of the group, as the findings across race are by no means consistent beyond Gay generally having higher numbers than Trans/Nonbinary, with the two categories often starting to even out by 2022. Predictably, White has the highest numbers of all, while Arab, Indigenous, Brown-skinned, and Unknown all have years with no representation in either category. Others, like Black, Asian, and Multiracial, stay largely close together except for an outlier year.

Lastly, since I did intend to include another element outside of character identities and the year of publication, I produced an area chart for all 539 entries of books with LGBTQ+ protagonists broken down by genre. Contemporary overwhelmingly carried the majority every year, which will likely not surprise anyone familiar with the general trends of current children’s and YA literature, while Fantasy and Historical stayed almost identical year to year from 2020 onward.

UX Research 

For my user experience research, I decided to provide my visualizations to the two CCBC employees I had previously interacted with: Madeline Tyner and director Tessa Michaelson Schmidt. I provided them both with a link to my public Tableau project in advance and spoke to each over the phone for a short interview of approximately fifteen minutes during the work day. I asked them both the same four basic questions to receive instructive and informed feedback on my visualizations:

  1. Is the data and information in each different visualization clear? Does anything need further explanation?
  2. As someone familiar with this data, are these visualizations the kind you’d find useful to share with users and/or stakeholders? 
  3. Is there anything else you would like to see? Any visual element you would do differently?
  4. Do you have any additional feedback of any type that you’d like to share?

Generally, both had extremely positive reactions to my work. In answer to the first question, both Madeline and Tessa had small quibbles that were not in my realm of Tableau expertise to fix but that were interesting: Madeline pointed out that when you hover over the bubbles design, the actual category titles from the data sheet are included in listing the total count (for example, PrimaryCharacterHeritageNotes1). Tessa expressed interest in having more descriptive elements for the individual visualizations beyond the titles, essentially a method of providing more context around the information. In this vein, both expressed enthusiasm about these being shared with the various audiences who often interact with this data project. Elements that I have already highlighted, such as the comparative nature of seeing the massive difference between White and BIPOC LGBTQ+ representation, were communicated simply with a strong visual component. Tessa specifically cited stakeholders being able to better understand this discrepancy in more than just a conceptual way, as well as how dire the problem actually is and why the project itself is important to document. In the context of funding and allocating resources, thoroughly understanding this issue is paramount. Students were also highlighted as an important user group, as these visuals could be a useful tool in the realm of education for appreciating the current state of the literature.

When it came to the third question, Madeline reiterated my earlier hope to break down similar information to the first graph as a packed bubbles visual, which would employ a different and effective form of communication around the same data. Tessa was interested in work outside of the scope of this project, which is definitely compelling; she proposed including studies done about the real-life identities of children and teenagers today. If we know more about the actual makeup of the gender identities, sexual orientations, and racial identities of the intended audience for these works, it can add another layer to the apparent issue of diversity in publishing versus how students look and identify today. For final feedback, they both emphasized the importance of how these visuals can serve as complementary elements to the narrative the CCBC and its users tell around this data. Generally, they provide succinct and accessible visualizations that effectively portray numerous issues of contention in children’s and YA literature and provide key jumping-off points for larger conversations.

Findings and Reflection 

Going into this project, I had the same assumptions that Madeline and Tessa discussed when it came to stakeholders and other audiences who interact with the CCBC’s diversity data: while LGBTQ+ representation has grown within children’s and YA literature over the last few years, the racial breakdown of this work is more stagnant, and the gaps more glaring, than some users may assume. My visualizations reflect this finding starkly, which gives the CCBC, as an organization, the means to both fundamentally justify this ongoing project and advocate for change across the channels available to it and the users of its data. The sections of data that I have chosen to work with are simplified compared to the project as a whole, but the visualizations necessary for this information also require a level of simplification and easy access, so that the average user does not have to sift through thousands of books over numerous years to understand these trends and issues in diverse children’s and YA representations. As Tessa said to me when we first spoke about the project, simple visuals are effective in a subject like this, which has the emotional element of portraying marginalized experiences via distanced numbers and statistics: that huge gap between White LGBTQ+ characters and just about anyone else speaks volumes.

Going forward, I see massive potential in finding different ways to portray both the macro and micro elements of this data. As Madeline pointed out, trying different formats for the first visual comparing all the racial categories within LGBTQ+ protagonists would likely be quite helpful, especially if a user experience process were included to see which visualizations are preferred by different users. In the process of writing a report on this data, other opportunities for more specific visualizations would likely become clear. Because there is so much available, the needs and interests of the user or audience would best guide these more specific visual works. When it came to my own work, I wish that there had been more room and time to pursue a comparative track between the different categories, especially since my one product comparing Gay and Trans/Nonbinary works had some of the least clear takeaways across racial groups. Regardless, the visualizations that have emerged from this project depict a scene of children’s and YA literature that still needs a significant push from all sides to do better by BIPOC LGBTQ+ representation, and it motivates me as a library professional to advocate for the same across other disciplines in the tradition that the CCBC has been championing since the project’s inception.


2022. Data on books by and about Black, Indigenous and People of Color compiled by the Cooperative Children’s Book Center (CCBC), School of Education, University of Wisconsin-Madison, based on its work analyzing the content of books published for children and teens received by the CCBC annually. [Data set.] CCBC. Accessed: 7 December 2022.