Unlocking the Treasure Chest: Archiving in the Digital Age

In his 2003 article titled “Scarcity or Abundance? Preserving the Past in a Digital Era,” historian Roy Rosenzweig wrote of his concern about the “fragility of evidence” in today’s digital world. In the past, archivists collected, organized, and preserved paper documents and photographs, known today as “analogue” records. Most of these are now being “made-digital,” meaning they are photographed or scanned and converted into digital media. With the rise of technology, more and more records are “born-digital”; that is, they are created in electronic form from the start and are not intended to have an analogue equivalent.

Though digital records provide greater access to information and save shelf space, Rosenzweig laments their short life span.

“Digital and magnetic media deteriorate in ten to thirty years,” he writes. But that’s not even the biggest problem. “The life expectancy of digital media may be as little as ten years, but very few hardware platforms or software programs last that long. Indeed, Microsoft only supports its software for about five years.”[1. Rosenzweig, Roy, “Scarcity or Abundance? Preserving the Past in a Digital Era,” The American Historical Review, Vol. 108, No. 3 (June 2003): 741-42.]

Imagine, if you will, that a treasure chest sits on the ground before you. You know it is full of something – gold, gems, riches, or perhaps something not so desirable. Your curiosity to find out what’s inside leads you to unlock it, but wait – the key doesn’t work! It’s an old treasure chest and the type of key needed to open it is no longer being made. Then, in a stroke of luck, you manage to find a key that fits! You turn the key, hearing the click of the lock that signifies you’ve unlocked the chest. You attempt to pry the lid open, only to find it’s rusted shut. You can’t open the treasure chest, and you’re unable to discover what it holds.

This is the conundrum archivists are facing in the digital era. They have access to countless old files and floppy disks, but these records are no good if they can’t be opened because the software needed to run them is obsolete. Even if the software or hardware is available, the likelihood that the disk has deteriorated is high, and so the information contained within remains hidden.

A corollary to the short lifespan of digital records is the need to archive them as soon as possible, rather than allowing years to pass before they are collected. Rosenzweig provides an interesting example:

“What might happen, for example, to the records of a writer active in the 1980s who dies in 2003 after a long illness? Her heirs will find a pile of unreadable 5¼” floppy disks with copies of letters and poems written in WordStar for the CP/M operating system or one of the more than fifty now-forgotten word-processing programs used in the late 1980s.”[2. Ibid., 745-46]

As thought-provoking as this example is on its own, it’s even more captivating because it mirrors a real-life situation.

In 1996, playwright and composer Jonathan Larson, best known for his hit Broadway show Rent, died suddenly the night before the musical was to open. He left behind seven years of drafts, compositions, and letters saved on 189 floppy disks. The Library of Congress acquired these records in 2003. Five years later, Doug Reside, New York Public Library’s Digital Curator of Performing Arts, obtained permission to work with the files in the hopes of discovering what they contained.

In a 2012 interview, Reside commented, “There were over 30 files containing texts of Rent, many of which contained within themselves early drafts preserved by Microsoft Word 5.1’s ‘fast save’ feature. There were also music files in early versions of Digital Performer and Finale and letters Larson wrote to his agents, to Stephen Sondheim, and to friends about the show.”

Unfortunately, Larson wrote his drafts on software that is now obsolete, and saved them to storage systems that are now outdated. Simply opening the files on a modern-day computer was not an option.

First, Reside copied the materials bit-for-bit and stored them on a more stable medium at the Library.[3. Doug Reside, “‘No Day But Today’: A look at Jonathan Larson’s Word Files,” New York Public Library, 2011 <http://www.nypl.org/blog/2011/04/22/no-day-today-look-jonathan-larsons-word-files>.] This process is known as migration, defined by Rosenzweig as “moving documents from a medium, format, or computer technology that is becoming obsolete to one that is becoming more common.”[4. Op. cit., Rosenzweig, p. 747.] Then, in order to read the drafts, Reside used a “Basilisk II emulator, which allowed him to see the files exactly as Larson had seen them, right down to the chunky fonts and irritating pop-up error messages.”[5. Jennifer Schuessler, “Tale of the Floppy Disks: How Jonathan Larson Created ‘Rent’,” New York Times, 2012 <http://artsbeat.blogs.nytimes.com/2012/02/01/tale-of-the-floppy-disks-how-jonathan-larson-created-rent/?_r=0>.]
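For readers curious what a bit-for-bit copy involves in practice, here is a minimal Python sketch. It is not the procedure Reside or the Library used; the function names and the choice of SHA-256 are assumptions made for illustration. The idea is to copy every byte unchanged and then confirm, by comparing checksums, that the copy matches what was read from the original.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # read in blocks so large disk images do not fill memory


def copy_with_checksum(source: Path, destination: Path) -> str:
    """Copy source to destination byte for byte; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with source.open("rb") as src, destination.open("wb") as dst:
        while chunk := src.read(CHUNK_SIZE):
            digest.update(chunk)
            dst.write(chunk)
    return digest.hexdigest()


def sha256_of(path: Path) -> str:
    """Re-read a file from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(CHUNK_SIZE):
            digest.update(chunk)
    return digest.hexdigest()


def preserve(source: Path, destination: Path) -> str:
    """Copy, then verify the copy by comparing checksums (hypothetical helper)."""
    expected = copy_with_checksum(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"copy of {source} does not match the original")
    return expected
```

Storing the returned checksum alongside the copy would let an archivist re-run `sha256_of` years later and detect whether the stored file has silently changed.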

The final draft of Rent, as Larson saw it January 15, 1996.[6. Op. cit., Reside.]

Using a text editor called TextWrangler, Reside was able to uncover the last 14 revisions Larson made, highlighting the playwright’s creative process. Because of Reside’s work, we now know what hidden information Larson’s floppy disks contained.
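Opening an old word-processing file in a plain text editor works because earlier text can survive inside the file as readable runs of characters among the binary data. The Python sketch below illustrates that general idea only; it is not Reside’s method, the function name is hypothetical, and it assumes plain ASCII text, so characters in older encodings such as curly quotes would be missed.

```python
import re


def printable_runs(data: bytes, min_length: int = 6) -> list[str]:
    """Return runs of at least min_length printable ASCII characters found in data."""
    # \x20-\x7e covers the space character through the tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Example: text fragments separated by non-printable bytes.
sample = b"\x00\x01Hello, world\x00\xffab\x00No day but today\x02"
fragments = printable_runs(sample)
```

In this example `fragments` holds the two readable phrases, while the two-character run `ab` is dropped because it is shorter than `min_length`.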

But as both Rosenzweig and Reside point out, other cases may not be so successful. What if Larson hadn’t died so young? What if he had gone on to write more shows, leaving behind an even larger body of work? What if his records hadn’t been made available so soon after his death? Most likely his work – his drafts, revisions, early compositions – would remain a mystery, hidden behind a deteriorated medium, unreadable by software and hardware now obsolete.

As the use of technology increases, archivists, librarians, and historians must find a way to keep up before records are lost forever.
