With Fellow Allysha A. Leonard
You can view the presentation here.
https://docs.google.com/presentation/d/1P4vVwc7MkNOYyXmy8GR2CzQDc0fsK6XH6wpiN4Om1Mg/edit?usp=sharing
MS Graduation Portfolio
https://allyshaaleonard.wixsite.com/pratt-portfolio
The presentation discusses the project I have been working on with Jonathan Lill for the Linked Open Data Fellowship at the Museum of Modern Art. It covers the use of digital tools such as OpenRefine and the upload of shared data to Wikidata, the free and open knowledge base.
Linked open data is the practice of publishing and interlinking data on the internet in a way that is freely accessible, shareable, and machine-readable. It uses standardized formats, such as the Resource Description Framework (RDF), to create links between different data sets so that computers can easily combine and analyze them. Because the data follows open standards and a shared format, information can move between different systems and platforms. The result is a more connected and collaborative web of data in which information is easier to discover, analyze, and reuse.
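To make the idea of interlinked data concrete, here is a minimal sketch in plain Python. It models each statement as a subject-predicate-object triple, which is the shape RDF uses, and shows how two separately published data sets can be combined because they share an identifier. The `ex:` identifiers, the `Triple` type, and the `describe` helper are placeholders invented for illustration; they are not real records or part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Two data sets published independently (placeholder values, not real records).
museum_data = {
    Triple("ex:artist-1", "ex:name", "Example Artist"),
    Triple("ex:artist-1", "ex:exhibitedAt", "ex:exhibition-9"),
}
authority_data = {
    Triple("ex:artist-1", "ex:birthYear", "1900"),
}


def describe(subject, *datasets):
    """Collect every statement about one subject across all data sets."""
    merged = set().union(*datasets)
    return sorted(t for t in merged if t.subject == subject)


for triple in describe("ex:artist-1", museum_data, authority_data):
    print(triple.subject, triple.predicate, triple.obj)
```

Because both sets use the same identifier for the artist, no matching step is needed to bring the statements together; that shared identifier is what the reconciliation work described below provides.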
My specific job included adding all artists exhibited at MoMA since its opening in 1929 and linking them to the already uploaded records of past exhibitions. This work connects different databases and sources, such as the MoMA website URLs for both exhibitions and artists, the Virtual International Authority File (VIAF), Social Networks and Archival Context (SNAC), and the Union List of Artist Names (ULAN) from the Getty Research Institute, on one accessible page that Wiki editors can populate with background information. Part of this work includes setting a base schema for each asset and using machine processing to upload the data in bulk.
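A base schema of this kind can be pictured as a mapping from spreadsheet columns to Wikidata properties. The sketch below is only an illustration of that idea, not the OpenRefine schema itself: the property and item IDs are the ones listed for individuals in the process section, while the column names (`birth_date`, `moma_artist_id`, `source_url`, and so on) and the `build_statements` helper are hypothetical.

```python
INSTANCE_OF = "P31"
HUMAN = "Q5"
REFERENCE_URL = "P854"

# Wikidata property ID -> spreadsheet column (column names are assumptions).
INDIVIDUAL_SCHEMA = {
    "P569": "birth_date",       # Date of Birth
    "P570": "death_date",       # Date of Death
    "P361": "part_of",          # Part of
    "P2174": "moma_artist_id",  # Museum of Modern Art artist ID
    "P245": "ulan_id",          # Union List of Artist Names ID
    "P214": "viaf_id",          # VIAF ID
    "P3430": "snac_ark_id",     # SNAC ARK ID
}


def build_statements(row, schema=INDIVIDUAL_SCHEMA, item_class=HUMAN):
    """Turn one spreadsheet row into a list of statements for one item."""
    statements = [{"property": INSTANCE_OF, "value": item_class}]
    for prop, column in schema.items():
        value = str(row.get(column) or "").strip()
        if not value:
            continue  # skip empty cells instead of creating blank statements
        statement = {"property": prop, "value": value}
        if row.get("source_url"):
            # Attach a reference URL to every statement that has a source.
            statement["references"] = [{REFERENCE_URL: row["source_url"]}]
        statements.append(statement)
    return statements


example_row = {
    "birth_date": "1900-01-01",
    "death_date": "",
    "moma_artist_id": "123",
    "source_url": "https://example.org/artist/123",
}
for statement in build_statements(example_row):
    print(statement)
```

Keeping the mapping in one dictionary means a second schema, for example for institutions, only needs a different dictionary and a different `item_class`; the row-processing logic stays the same.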
PROCESS
- Step 1: OpenRefine – Artist Reconciliation
– Download an XLSX of all artists and data from the MoMA Microsoft Access database and upload it into OpenRefine
– Separate projects by Individuals and Institutions, since they have different schemas
– Reconcile artists’ Display Name and Identifiers with existing Wikidata pages
- Step 2: Artist and Institution Schema
– Create Schema for Artist Page using “instance of” (P31) “human” (Q5) for individuals
Individual Inclusions: Date of Birth (P569), Date of Death (P570), Part of (P361), Museum of Modern Art artist ID (P2174), Union List of Artist Names ID (P245), VIAF ID (P214), SNAC ARK ID (P3430), ADD REFERENCE URL (P854) for every provable statement
– Create Schema for Artist Page using “instance of” (P31) “organization” (Q43229) for institutions
Institution Inclusions: Start Time (P580), End Time (P582), Location (P276), Museum of Modern Art artist ID (P2174), Union List of Artist Names ID (P245), VIAF ID (P214), SNAC ARK ID (P3430), ADD REFERENCE URL (P854) for every provable statement
- Step 3: Upload to Wikidata – QIDs
– Upload items to Wikibase – https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine/Editing/Uploading
– Track uploads via the EditGroups tool on Toolforge and User Contributions on Wikidata – important for avoiding duplicate uploads and multiple QIDs for one constituent
– Create a spreadsheet of new and existing artist QIDs by adding a new QID column, then export as XLSX
– Combine exhibition QIDs with artist QIDs in an Excel spreadsheet using the Exhibition_Constituents_Match table in Microsoft Access – 50,890 lines of data
(QIDs will later be uploaded to the MoMA TMS system)
- Step 4: OpenRefine Schema: Linking Artist and Exhibition
– Use a new project in OpenRefine to reconcile the now-existing artists and exhibitions by QID. Reconciliation should now take ¼ of the original time, as all QIDs have been created and are more easily matched
– Upload in multiple projects/batches to ensure correct tracking and quicker reconciliation
(18 projects with 2,000-3,000 lines of data were used in this case)
– Create a Schema for the Art Exhibition page using “Exhibited Creator” (P10661) for attaching reconciled artist names.
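
The joining and batching in Steps 3 and 4 were done with Excel, Microsoft Access, and OpenRefine. As a rough illustration of the same logic, the following Python sketch pairs each row of a match table with exhibition and artist QIDs, drops duplicate pairs, sets aside rows that still lack a QID, and splits the result into batches. The field names (`exhibition_id`, `constituent_id`), the lookup dictionaries, and both helper functions are hypothetical, and the QID strings are placeholders rather than real Wikidata items.

```python
from itertools import islice

EXHIBITED_CREATOR = "P10661"


def link_rows(match_rows, exhibition_qids, artist_qids):
    """Build (exhibition QID, P10661, artist QID) statements from match rows."""
    linked, unmatched, seen = [], [], set()
    for row in match_rows:
        exhibition = exhibition_qids.get(row["exhibition_id"])
        artist = artist_qids.get(row["constituent_id"])
        if not exhibition or not artist:
            unmatched.append(row)  # needs reconciliation before upload
            continue
        statement = (exhibition, EXHIBITED_CREATOR, artist)
        if statement not in seen:  # avoid uploading the same link twice
            seen.add(statement)
            linked.append(statement)
    return linked, unmatched


def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Placeholder data for illustration only.
exhibition_qids = {"e1": "Q-EXHIBITION-1"}
artist_qids = {"a1": "Q-ARTIST-1", "a2": "Q-ARTIST-2"}
match_rows = [
    {"exhibition_id": "e1", "constituent_id": "a1"},
    {"exhibition_id": "e1", "constituent_id": "a2"},
    {"exhibition_id": "e1", "constituent_id": "a1"},  # duplicate pair
    {"exhibition_id": "e1", "constituent_id": "a3"},  # artist has no QID yet
]

linked, unmatched = link_rows(match_rows, exhibition_qids, artist_qids)
for number, batch in enumerate(batches(linked, 2500), start=1):
    print(f"batch {number}: {len(batch)} statements")
print(f"{len(unmatched)} rows still need a QID")
```

A batch size between 2,000 and 3,000 corresponds to the project sizes mentioned above; smaller batches make each upload easier to track and to undo if something goes wrong.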