The world’s libraries. Connected. Demystifying Born Digital ARLIS/NA, Pasadena, 27 April 2013 Jackie Dooley Program Officer OCLC Research.

Presentation transcript:

Taking Our Pulse(s)

Top education and training needs
1. Born-digital materials: 83%
2. Information technology: 65%
3. Intellectual property: 56%
4. Cataloging and metadata: 51%

Born-digital materials are …
  Undercollected
  Undercounted
  Undermanaged
  Unpreserved
  Inaccessible
American Heritage Center

Demystifying Born Digital
In response, we launched …

“Make things as simple as possible, but not simpler.” --Albert Einstein

Target audiences
  Research library directors and institutional administration
  Archivists and special collections librarians
  Other specialists
    Collection development
    Curatorial
    Digital library
    Information technology
    Metadata
    Records management

First Steps for Managing Born-Digital Physical Media (Published August 2012)

Intent of First Steps
  Build confidence rather than overwhelm novices with complex information and procedures.
  Knowing what you have (i.e., doing an inventory) and taking some simple technical steps can allay the fear factor.
  Archivists may have to begin alone, without help from IT staff.
  Having taken first steps, it’s then easier to continue learning.

Part 1: Inventory & prioritize
  Inventory what you have
    Types & quantities of physical media
    File formats
    Estimated number of gigabytes
  Prioritize materials for processing
    Anticipated level/nature of use
    Level of significance/uniqueness
    Potential loss due to age or type of media
    Unique content not replicated elsewhere

Part 2: Technical steps
1. Use a “clean” computer.
2. Use a write blocker.
3. Insert source media. Do not attempt to open any files.
4. Create a disk directory.
5. Copy files from media to the directory. Consider copying as a disk image.
6. Generate a copy of the directory.
7. Generate and record a checksum.
8. Create a readme file.
9. Copy the directory to trustworthy archival storage.
10. Return the original physical media to storage.
11. Create or update any associated descriptive tool(s).
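Steps 4, 5, and 8 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from the report: it assumes the media is already mounted read-only behind a write blocker, it copies files rather than creating a disk image, and the function name `transfer_media`, the `files` subdirectory, and the readme fields are all hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def transfer_media(source, dest_root, accession_id, note=""):
    """Copy files from mounted source media into a new directory and write a readme.

    All names and the directory layout here are illustrative assumptions.
    """
    source = Path(source)
    accession_dir = Path(dest_root) / accession_id
    files_dir = accession_dir / "files"

    # Step 4: create the directory; fail rather than overwrite an earlier transfer.
    accession_dir.mkdir(parents=True, exist_ok=False)

    # Step 5: copy the files without opening them in any application.
    # copytree uses shutil.copy2 by default, which tries to preserve timestamps.
    shutil.copytree(source, files_dir)

    # Step 8: record basic facts about the transfer in a readme file.
    file_count = sum(1 for p in files_dir.rglob("*") if p.is_file())
    readme = accession_dir / "readme.txt"
    readme.write_text(
        f"Accession: {accession_id}\n"
        f"Transferred: {date.today().isoformat()}\n"
        f"Source: {source}\n"
        f"Files copied: {file_count}\n"
        f"Notes: {note}\n",
        encoding="utf-8",
    )
    return accession_dir
```

A call such as `transfer_media("/mnt/floppy", "/archive/incoming", "2013-001")` (both paths hypothetical) would leave the copied files and a `readme.txt` side by side, ready for checksumming and transfer to archival storage.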

Detailed Steps for Managing Born-Digital Physical Media

First Steps: Checksums
7. Generate and record a checksum (a unique value based on the contents of a file) on the disk image. Alternatively, if you copied the files instead of copying a disk image, generate and record a checksum on each file in the subdirectory.
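For the per-file alternative, a minimal sketch using Python's standard `hashlib` module might look like the following. The function names, the choice of SHA-256 as the default, and the manifest layout (one "digest, two spaces, relative path" line per file) are assumptions made for illustration; the slides do not prescribe a format.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory, manifest_path, algorithm="sha256"):
    """Record a checksum for every file under directory; return the number of files."""
    directory = Path(directory)
    lines = []
    for path in sorted(p for p in directory.rglob("*") if p.is_file()):
        relative = path.relative_to(directory).as_posix()
        lines.append(f"{file_checksum(path, algorithm)}  {relative}")
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

Writing the manifest outside the directory being hashed keeps the manifest itself from being picked up on a later run.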

Detailed Steps: Checksums
Level of Difficulty: Easy to Complex
Desirability: Highly Recommended

A checksum, or hash, is a unique value based on the contents of a file and is generated by specific algorithms (e.g., MD5 or SHA-256). Comparison of checksums generated from the same file at different times identifies whether and when the file has changed.

Creating checksums is not difficult and may be done during several processes described earlier (such as creating a disk image, generating a directory list, or using the Duke Data Accessioner). It is very easy to create a hash for a single file and then to compare that hash to one generated for another copy of the file. An automated technique is necessary, however, when processing a large number of files.

It is important to note that while a changed checksum can alert a repository to the fact that something in a file or folder has changed, it cannot indicate what exactly has changed, nor can it reverse the change.

Regularly hashing the file or image you have copied and checking those new hashes against the hashes made at the time of the transfer should be part of your digital curation workflow. During the lifecycle of your digital collections you will need to periodically verify the checksums to ensure that files remain unchanged.
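The periodic verification described above could be automated along these lines. This is a hedged sketch, not one of the tools named in the report: the function names are hypothetical, SHA-256 is assumed, and the recorded checksums are assumed to be available as a mapping from relative path to hex digest made at the time of transfer.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(directory, recorded):
    """Compare current checksums against those recorded at transfer.

    recorded maps a relative path to the hex digest stored earlier.
    Returns a report listing unchanged, changed, and missing files.
    """
    directory = Path(directory)
    report = {"unchanged": [], "changed": [], "missing": []}
    for relative, expected in sorted(recorded.items()):
        path = directory / relative
        if not path.is_file():
            report["missing"].append(relative)
        elif sha256_of(path) == expected:
            report["unchanged"].append(relative)
        else:
            report["changed"].append(relative)
    return report
```

As the slide notes, a mismatch only signals that something changed. The report says which files to investigate, not what changed in them, and restoring a file still depends on having another good copy.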

Detailed Steps: Checksums, cont.
Disk imaging or disk copying tools that incorporate checksums (see the Copy the Files or Create a Disk Image section for more details on these tools):
  BitCurator
  FTK Imager (Forensic ToolKit Imager)
  Duke Data Accessioner
File directory printing tools that incorporate checksums (see the Record the File Directory section for more details on these tools):
  Karen's Directory Printer
  Beyond Compare
  NARA File Analyzer and Metadata Harvester
Collection management tools that incorporate checksums:
  Archivematica: From the website: “A free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects. Archivematica uses a micro-services design pattern to provide an integrated suite of software tools that allows users to process digital objects from ingest to access …”

Detailed Steps: Checksums, cont.
  Curator’s Workbench: From the website: “The Workbench helps archivists manage files before they are stored in an institutional repository or dark archive. As the files are selected, arranged, and described, a METS file is generated by the software that documents these processes. In addition, checksums and UUIDs are generated for each object and MODS descriptive metadata elements can be mapped to individual objects and folders.” Developed at the University of North Carolina, Chapel Hill.
Standalone checksum tools:
  Jacksum
  Md5summer
  Md5deep: command line tool that can also be used as a directory printer http://md5deep.sourceforge.net
Further Resources:
  “Checksum Verification Tools: Guest Post by Carol Kussmann,” Practical E-Records, records.chrisprom.com/checksum-verification-tools/ (accessed December 2012). This blog, maintained by Christopher Prom, reviewed five checksum generating and verification tools.

Swatting the Long Tail of obsolete media

Collaboration for converting obsolete media

“A call for a network of hubs to enable cost-effective outsourcing of the transfer of various types of physical media, particularly obsolete formats. We seek to reduce the need for everyone to figure everything out on their own, and instead set up a network of expert sites that have the necessary equipment and experience.”

“A community-based approach would use SWAT sites wherein a few self-selected institutions acquire and maintain the gear and expertise to read data and transfer content from particular types of obsolete media. The SWAT sites would provide transfer services for institutions that don’t have the capacity to read a particular medium...”

Our next two reports will …
  Articulate the relevant skills and expertise of archivists
  Describe how these pertain to various types of born-digital material and how special collections and archives intersect with “born digital”

Jackie Dooley
Thank you!
Demystifying Born Digital