A Look Into Cataloging & Metadata Librarianship – Observation

For my observation, I met with Enerel Dambiinyam, Head of the Monographs Copy Cataloging Department at Columbia University. I chose to meet with Enerel because I’m interested in potentially pursuing cataloging and metadata librarianship in academic libraries. Enerel’s office is in Columbia’s Butler Library. In my time with her, I was able to ask her a myriad of questions about cataloging and librarianship in general; she also showed me around her department and introduced me to her colleagues to give me an idea of the sprawling operation that is cataloging in an academic library.

In meeting with Enerel, my goal was to get a sense of how she views the field of cataloging today and where it is headed. I wanted a perspective not only on what I could focus on during my time in grad school here at Pratt, but also on what the job market might look like when I leave. One aspect of cataloging that I’m really interested in is how it’s evolving. There are fewer jobs titled simply ‘cataloging’; many titles now include metadata. In essence, cataloging is metadata, but the two jobs are not necessarily viewed the same way within librarianship. Enerel described the two jobs as essentially the same, though cataloging librarian jobs typically focus on books, while metadata librarian jobs focus on other types of media.

There is a debate in the field as to whether metadata jobs will simply subsume cataloging, especially as there are new cataloging standards on the horizon. Older standards such as MARC 21 are firmly established within all types of libraries, and have been amended with RDA standards. Looking to the future, there are standards such as BIBFRAME and, ultimately, the use of Linked Open Data to record metadata. Newer standards allow for more flexibility in recording metadata for non-bibliographic materials, and make salient the relationships between the content of all kinds of records.

Transitioning to these newer standards will be extremely labor intensive and will require a lot of technical support to establish and use them. It will therefore require deep thinking about how cataloging and metadata departments are composed and how they will interact with IT departments, which will need to grow within institutions. Enerel did stress that a grounding in MARC will be a good foundation for the newer standards, especially if I ever work on creating roadmaps to convert those records.

For now, Columbia’s Monographs Copy Cataloging Department has one metadata librarian. I met with her as well, and she mentioned that there has been hesitation in adopting newer metadata practices because of a fear of a large knowledge gap. She did, however, say that she had not intended to get into metadata librarianship; it just sort of happened, and she learned everything on the job. Right now, there is also not much top-down pressure in institutions to push for newer standards, but Enerel feels the change is inevitable.

Aside from grander discussions of the field, I wanted to know more about what the day-to-day of a cataloger looks like in an academic institution. One of the biggest challenges is backlog. Books are constantly coming in: some have been requested by professors, students, or the subject specialist librarians, and others are gifts in numerous languages and formats. Books that have been requested expressly by faculty and students need to be in circulation as quickly as possible. Backlog is also why any transition to new cataloging standards will be so difficult and require so much support, and why it can be hard to implement new ideas and policies that faculty may get from webinars. Another aspect of cataloging new books and backlog is that, depending on where a book comes from and how obscure it is, creating the record can take anywhere from 20 minutes to a year, because the information needed for a complete record can be hard to come by. You could get through 30 books in one day or only 4. In this way, Enerel describes cataloging as a kind of detective work. I think it’s that aspect of cataloging and metadata that is interesting to me: the idea that I can find new and interesting connections between information.

I also met other cataloging librarians in Enerel’s department at Columbia. There are many catalogers: some focus on books from a particular region of the world and in a few different languages, while others focus on other aspects of cataloging, such as authority control and database maintenance. I must have met around 15 catalogers in total. It was really interesting to see how sprawling and collaborative the department was, with catalogers often checking in with one another about a book they needed advice on how to catalog. For some reason, I’ve always thought of cataloging as a fairly isolated position, but in shadowing Enerel, I learned that the opposite is true.

It was a really fantastic experience to shadow Enerel and learn more about cataloging. I’ve never worked explicitly in cataloging, only at the circulation desk of the library; although I have been learning a lot about different areas of librarianship this semester, it was all still a bit opaque to me, so this was a great opportunity to make it more concrete. I got a better sense of the field, where I could potentially fit in, and where I could go in it. I also got a better sense of some of the challenges in the field, which only makes me more motivated to get involved in it.

Wikidata Workshop

On Saturday, November 10, 2018, I attended the Wikidata Workshop. The event was organized by the Semantic Lab at Pratt, and the workshop was led by Megan Wacha, the Scholarly Communications Librarian at CUNY, and President of Wikimedia NYC. The purpose of the event was to learn more about Wikidata, an initiative associated with the Wikimedia Foundation, and to have an opportunity to work with Wikidata by editing and/or adding new records.

To learn about Wikidata as an initiative, we first learned how it relates to the Wikimedia Foundation. Before the Wikimedia Foundation existed, Wikipedia was established in 2001 to be a free and open online reference. Today it is the 5th largest website in the world and the largest reference source on the internet, with approximately 15 billion views a month, and it is largely written by volunteer editors. Wikipedia is a multilingual and international website, meaning there are numerous versions of Wikipedia in different languages, which has implications for what information is available in each language. The Wikimedia Foundation was established in 2003 to field donations to maintain Wikipedia and its sister wiki-based projects. It disburses funds to those projects, which include Wikipedia, Wikimedia Commons, Wikibooks, and Wikidata. Wikidata is a free, linked database that serves as central storage for structured data across Wikimedia. Wikimedia NYC, of which Megan is president, is separate from but affiliated with Wikimedia, and acts as a bridge between Wikimedia and cultural institutions to improve records and increase access to information through Wikidata. With those relationships established, we were led through examples of why Wikidata is necessary and how we can contribute.

Two of the biggest issues with Wikipedia are consistency and redundancy. I mentioned that Wikipedia can be read in multiple languages, but pages about the same thing across languages are not necessarily consistent. For example, the English Wikipedia page about Gabriel Garcia Marquez differs from the Spanish Wikipedia page about him. Pages are not simply translated from other pages; they are often written with whatever information is available to the writer in that language. This is where Wikidata can step in. On Garcia Marquez’s Wikidata page, his dates of birth and death, nationality, and profession are entered; with the help of a bot, this information can be pulled and displayed across the pages about him in all the different languages.
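
The central-storage idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Item` class, the `render_infobox` function, the `Q_EXAMPLE` identifier, and the label tables are names invented for this sketch, not Wikidata's actual data model or API. The point is simply that the facts are stored once, and each language's page only supplies its own display labels.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Simplified, hypothetical stand-in for one central Wikidata-style item."""
    item_id: str
    statements: dict[str, str] = field(default_factory=dict)


# Illustrative per-language display labels for each property (assumed, not official).
PROPERTY_LABELS = {
    "en": {"date_of_birth": "Born", "date_of_death": "Died"},
    "es": {"date_of_birth": "Nacimiento", "date_of_death": "Fallecimiento"},
}


def render_infobox(item: Item, lang: str) -> list[str]:
    """Build the lines a page in `lang` would display from the shared item."""
    labels = PROPERTY_LABELS[lang]
    return [
        f"{labels[prop]}: {value}"
        for prop, value in item.statements.items()
        if prop in labels
    ]


# One central record; "Q_EXAMPLE" is a placeholder identifier, not a real one.
author = Item(
    "Q_EXAMPLE",
    {"date_of_birth": "1927-03-06", "date_of_death": "2014-04-17"},
)

english = render_infobox(author, "en")
spanish = render_infobox(author, "es")
```

Because both pages read from the same `author` record, correcting a date in one place would change what every language version displays, which is the consistency benefit described above.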

After learning about the foundations of Wikidata and how it works, our next task was to edit some actual records. Collectively, we worked on editing the records for previous Lambda Literary Award winners, as Wikimedia NYC is trying to push to update the data for LGBT+ people. Something especially pertinent we discussed in relation to updating entries for LGBT+ authors was the debate surrounding linking gender identity to people’s pages. In the current structure of Wikidata, there is the option to enter ‘sex or gender,’ which leads to a conflation between sex and gender that is not an accurate representation of the lived experience of many people. The category is further restrictive because there are not many options available within it, and because gender and sex are expressed differently across languages and cultures, there is no good translation for the categories. Some words may be more particular in English, but have no equivalent in Chinese, for example. This raises the question: should we try to classify sex and gender further in Wikidata entries for everyone? Or not include it at all? But what if it’s important to the person’s work? And even if classification can be made more specific, flexible and translatable across languages, there is still the issue that, as Emily Drabinski wrote, ‘as we attempt to contain entire fields of knowledge or ways of being in accordance with universalizing systems and structures, we invariably cannot account for knowledges or ways of being that are excess to and discursively produced by those systems.’[1] Drabinski goes on to show that, according to queer theory, it is not desirable to move toward an all-encompassing standardized system of knowledge organization, but rather toward an environment in which there is a more consistent critical eye on our organization schemes. The category of ‘sex or gender’ in Wikidata is a prime example of this, and of how material these issues of categorization can be, and thus, how important it is to consider carefully how we categorize them.

Wikidata, and linked open data in general, can be a way forward for information to be more flexibly and fluidly categorized, because it makes relationships between pieces of information explicit rather than creating hierarchies of information. But it is still a knowledge organization scheme, and one that has been unevenly applied across cultural institutions or simply ignored. The example of the ‘sex or gender’ category in Wikidata shows us that it is still necessary to be critical, and that no knowledge organization scheme is going to finish the work of being critical of our classification systems.

-Taylor Baker, INFO 601-03

[1] Drabinski, Emily. “Queering the Catalog: Queer Theory and the Politics of Correction.” The Library Quarterly: Information, Community, Policy, 94-111. Chicago: The University of Chicago Press, 2013.