

Working With Digital Archives at the Harry Ransom Center: A Presentation About Processing the Digital Archives of British Playwright Arnold Wesker
Metadata and Digital Object Roundtable, Society of American Archivists Annual Meeting 2007
Catherine Stollar Peters, New York State Archives

Background
Worked at the Harry Ransom Center in Austin, Texas, from 2004 to early 2007

Austin

Albany

Background
Now work at the New York State Archives, Cultural Education Center (New York State Archives)

In January 2007 the Ransom Center was
Processing collections with electronic records
Developing policies and procedures for processing electronic records
Evaluating options for a Trusted Digital Repository (TDR)
  – At the School of Information at the University of Texas at Austin
  – At the University Libraries at the University of Texas at Austin
  – Or develop an institutional TDR
Conducting a general electronic records survey and needs assessment (with a more thorough survey planned for the fall)

HRC DSpace at School of Information

About the Case Study

In January 2007 at the School of Information
Dr. Patricia Galloway was offering the course Problems in Permanent Retention of Electronic Records
Dr. Galloway contacted the Ransom Center for potential support of group projects

School of Information Course
Three collections were processed by students during Spring 2007 semester
Leon Uris Papers
  – Lessons in digital archeology
  – Limited migrated content
John Crowley Papers
  – Standard manual processing
Arnold Wesker Papers
  – Largely automated processing, migration, ingest procedures
  – Fragile media
  – Living author

Arnold Wesker
British playwright and author
Born in London in 1932
The Four Seasons ran in March 2007 at Arcola Theatre
Ransom Center maintains paper archives
Works include
  – As Much as I Dare (autobiography)
  – Longitude (adaptation of Dava Sobel’s book)
  – Groupie
  – Chips with Everything

Automated Processing
Largely automated processing, migration and ingest procedures possible because
  – One author
  – Similar content/materials (works, correspondence, diaries, personal files)
  – Mostly same format (Corel WordPerfect 5.0, 9.0 and Microsoft Word 97 and 2000)
  – Easily migrated (to RTF)
  – Well arranged
  – Manageable number of files (5,000+)
  – Readable disks ( inch floppies and 1 zip disk)

Processing Issues
Some files were password restricted
Bank account numbers were included
Encoded date fields would automatically update

Archival Theory Applied to Digital Materials
Acquisition
  – Create a disk catalog with all pertinent metadata
  – Copy to a processing computer drive
Appraisal
  – Appraise for duplicates and restricted material
Arrangement
  – Arrange material according to author’s original arrangement
Description
  – Create a file catalog with the pertinent metadata
  – Create and record checksums
  – Extract metadata
  – Transform metadata from NLNZ Schema to Dublin Core
Preservation
  – Migrate all of the files to a more stable format, such as Rich Text Format
  – Make physical copies of all the files onto new media
  – Ingest the files into DSpace
  – Ingest the project documentation
Reference
  – Integration into paper-based finding aid

Disk Catalog

File Catalog

Appraise for Duplicates
Files on zip disk contained some duplicates
Developed rules for removing duplicates so that files with duplicate names but different content were not automatically deleted
Erased duplicate files but recorded presence of duplicates in file catalog
Used Zizasoft’s comparison software zsCompare and zsDuplicate Hunter Standard 2.31
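The distinction between duplicate names and duplicate files can be sketched in a few lines of Python. This is only an illustration, not the zsCompare workflow the project used: the function names are hypothetical, and it assumes that "duplicate file" means byte-identical content (same SHA-256 digest).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def group_duplicates(root: Path):
    """Group files under root by content digest and, separately, by name.

    Returns (same_content, same_name). Each maps a key to the list of
    paths sharing it, keeping only keys shared by more than one path.
    """
    by_digest = defaultdict(list)
    by_name = defaultdict(list)
    for p in sorted(root.rglob("*")):
        if p.is_file():
            by_digest[file_digest(p)].append(p)
            by_name[p.name].append(p)
    same_content = {k: v for k, v in by_digest.items() if len(v) > 1}
    same_name = {k: v for k, v in by_name.items() if len(v) > 1}
    return same_content, same_name
```

Only the paths in `same_content` are candidates for removal; paths that appear in `same_name` alone share a file name but differ in content and need manual review.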

Restricted Material
Bank account numbers
  – Investigate to see if the accounts were closed
Password protected diary entries
  – Remove password to migrate
  – Place restrictions on access through DSpace instead of word processing software
  – Paper copy already exists and is in restricted section of stacks

Checksums
A command-line utility automatically creates the checksums
Jacksum is one Java checksum utility
Export results to spreadsheet
Compare to MD5 hash created by DSpace
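The checksum steps can be illustrated with Python's standard library in place of Jacksum. This is a hedged sketch, not the project's procedure: the function names are hypothetical, the CSV file stands in for the spreadsheet export, and the repository checksums are assumed to be available as a simple path-to-MD5 mapping (how they are obtained from DSpace is not covered here).

```python
import csv
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def write_checksum_catalog(root: Path, out_csv: Path) -> dict:
    """Checksum every file under root and write path/md5 rows to a CSV."""
    checksums = {
        p.relative_to(root).as_posix(): md5_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    with out_csv.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "md5"])
        writer.writerows(checksums.items())
    return checksums


def mismatches(local: dict, repository: dict) -> list:
    """Paths whose repository checksum is missing or differs from the local one."""
    return sorted(p for p, digest in local.items() if repository.get(p) != digest)
```

An empty list from `mismatches` means every locally recorded checksum agrees with the one reported after ingest.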

Migrate Text to More Stable Format
Chose RTF because it is widely accessible by multiple readers and it retains formatting
  – ODF is new and as yet untested
  – TXT loses formatting
  – Microsoft Word DOC and Corel WordPerfect WPD are proprietary and accessible by few readers
Used ABC Text Converter to migrate files from DOC or WPD into RTF
  – Used Perl script to add extensions to files to mitigate Wesker’s use of 3-digit extensions
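The extension fix might look something like the following Python sketch. It is not the project's Perl script, and it rests on assumptions: that the problem files end in a three-digit suffix such as `.001`, and that appending a format extension (here `.wpd`, chosen only for illustration) is enough for the converter to recognize them. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: the problem files end in a purely numeric three-digit suffix.
THREE_DIGIT_SUFFIX = re.compile(r"^\.\d{3}$")


def plan_renames(root: Path, new_ext: str = ".wpd") -> list:
    """List (old, new) path pairs for files whose suffix is three digits.

    The original name is kept and the format extension is appended,
    so DIARY.001 becomes DIARY.001.wpd.
    """
    plan = []
    for p in sorted(root.rglob("*")):
        if p.is_file() and THREE_DIGIT_SUFFIX.match(p.suffix):
            plan.append((p, p.with_name(p.name + new_ext)))
    return plan


def apply_renames(plan: list) -> None:
    """Rename each file in the plan, refusing to overwrite an existing file."""
    for old, new in plan:
        if new.exists():
            raise FileExistsError(new)
        old.rename(new)
```

Separating the plan from the rename makes it possible to record the old and new names in the file catalog before anything on disk changes.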

Create Duplicate Physical Copy
Save files to CD, DVD or hard drive as an extra, short-term backup copy while processing (and before ingest into the institutional repository)

Extract Metadata

National Library of New Zealand XML

National Library of New Zealand XML (cont.)

Dublin Core XML

Directory Arrangement for DSpace Bulk Ingest

Automated Processes
Created Perl scripts to automate processing
  – Modified Perl scripts from Queen’s University Library in Ontario, Canada project/tutorials/qspace_bulk_upload.doc
  – Metadata conversion script (from National Library of New Zealand Metadata Extraction Tool v 3.0)
  – Script to move individual xml files into individual directories
  – Script to create contents file for each directory
  – Scripts to rename files for format transformation
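The directory-building scripts can be illustrated with a short Python sketch. It is not a port of the Perl scripts named above. It assumes the DSpace Simple Archive Format layout (one directory per item holding the file, a `dublin_core.xml` file and a `contents` file), and the function name, the `item_NNNN` directory naming and the metadata dictionary are hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def build_item(archive_dir: Path, index: int, source_file: Path, metadata: dict) -> Path:
    """Create one item directory for bulk ingest and return its path.

    metadata maps "element" or "element.qualifier" keys (for example
    "title" or "date.created") to string values.
    """
    item_dir = archive_dir / f"item_{index:04d}"
    item_dir.mkdir(parents=True)
    shutil.copy2(source_file, item_dir / source_file.name)

    # dublin_core.xml: one <dcvalue> per metadata field.
    root = ET.Element("dublin_core")
    for key, value in metadata.items():
        element, _, qualifier = key.partition(".")
        dcvalue = ET.SubElement(
            root, "dcvalue", {"element": element, "qualifier": qualifier or "none"}
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # contents: one file name per line.
    (item_dir / "contents").write_text(source_file.name + "\n", encoding="utf-8")
    return item_dir
```

Calling `build_item` once per migrated file, with metadata taken from the converted extraction output, yields a directory tree that can be handed to a bulk importer.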

Issues with Metadata Extraction
Author unreliable
  – Partially solved by adding code to Perl scripts to export standard author information
No subject metadata
Inaccurate dates
  – Date created sometimes newer than date modified due to Windows file system
Inaccurate titles
  – First line in document
  – Title from template
Format problems when extensions are used as part of name field
No recipient information (potential text mining project)
Path name derived from location of file on processing computer, not original author’s system
Sometimes NLNZ Metadata Extractor v 3.0 processes files with default adapter instead of actual suitable adapter
Dublin Core metadata is not robust enough for digital preservation needs
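One of these issues, a creation date later than the modification date, is easy to flag automatically. The sketch below is illustrative only: the function name and the record layout (dictionaries with `path`, `created` and `modified` keys holding ISO 8601 timestamps) are assumptions, not the extractor's actual output format.

```python
from datetime import datetime


def suspect_dates(records: list) -> list:
    """Return paths of records whose created timestamp is later than modified."""
    flagged = []
    for rec in records:
        created = datetime.fromisoformat(rec["created"])
        modified = datetime.fromisoformat(rec["modified"])
        if created > modified:
            flagged.append(rec["path"])
    return flagged
```

Flagged files can then be reviewed by hand rather than having an unreliable creation date copied into the descriptive metadata.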

New Zealand XML Wrong Author

Dublin Core XML

Ingest
Created detailed ingest procedures, using Cornell’s DSpace instructions as an example

Takeaways
More automated tools
Toolkit to aggregate tasks
Better metadata extraction potential
Support of more schemas

MetaTools: Investigating Metadata Generation Tools
JISC funded grant project undertaken by the Arts and Humanities Data Service, King’s College London
18 month project, ends September 2008
Project goals
  – Develop a methodology for evaluating metadata generation tools
  – Compare the quality of currently available metadata generation tools (including NLNZ Metadata Extractor, DROID, JHOVE)
  – Develop, test and disseminate prototype web services that integrate metadata generation tools

Student Publication
Lorraine Dong, Megan Durden and Sarah Kim presented Silicon Chips with Everything: Preserving Arnold Wesker’s Digital Manuscripts at SSA (look for their forthcoming publication)

Contact Information
Catherine Stollar Peters
New York State Archives
Cultural Education Center
Albany, New York
(518)