PREMIS in Thought: Data Center for LC Digital Holdings Ardys Kozbial, Arwen Hutt, David Minor February 11, 2008.

Presentation transcript:


Context: UCSD Libraries and SDSC
- Collaborative work in digital preservation
- Long-term preservation of video content (NDIIPP/DigArch)
- LC Pilot Project
- Mass Transit (with CDL)
- NDIIPP / Chronopolis

Context: LC Pilot Project
- National Digital Information Infrastructure Preservation Program (NDIIPP)
- Project report:
- Scenario
  - LC is looking for a trustworthy digital repository to manage its assets. Is SDSC that trustworthy repository?
- Building trust
  - Deliverables and tests specified by LC
- From the UCSD Libraries
  - Ardys Kozbial, Arwen Hutt

Parameters
- Trusted Digital Repository Checklist
  - A1.2 Repository has an appropriate, formal succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.
- Preservation
- Digital Archives
- Metrics for...
- TRAC
- Transfer of all deposited data from SDSC to LC
- Transferring preservation responsibility from SDSC to LC
- Application must be system neutral, not proprietary
- Assumption: after transfer of data, SDSC no longer has responsibility for maintenance of the file

State Information
- Migration of files
  - Institution name that provided the file - in this case LC
  - Collection name for the record series - for example: Prokudin-Gorskii Collection, Web Crawl Data, NDNP
  - LC identifier for each file
  - name used to organize the files at SDSC
  - physical file name for each file
  - storage location for each file
  - LC checksum for each file to verify integrity
  - SDSC checksum for each file
  - Date SDSC checksum was validated
  - Status of transfer of file from LC
  - Date file was received at SDSC
  - number of replicas
  - location of each replica
  - creation date for each replica
  - checksum for each replica
  - synchronization date for each replica
- LC access to the files
  - Unique identity for each LC user (these would be the ~12 users stated in the TDL)
  - Group membership for each LC user (simplifies assignment of permissions)
  - Access controls on each file for each user for each allowed role
  - access controls on metadata for each file
  - access controls on storage systems
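The per-file state information listed above can be pictured as one record per file plus a fixity check. The following is a minimal sketch in Python, not the project's actual schema: the class and field names are hypothetical, and SHA-256 is an assumption, since the slides do not name a checksum algorithm.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Replica:
    """State kept for each replica of a file (hypothetical field names)."""
    location: str
    created: date
    checksum: str
    synchronized: Optional[date] = None


@dataclass
class FileState:
    """State kept for each migrated file (hypothetical field names)."""
    institution: str          # institution that provided the file, here LC
    collection: str           # collection name for the record series
    lc_identifier: str        # LC identifier for the file
    logical_name: str         # name used to organize the file at SDSC
    physical_name: str        # physical file name
    storage_location: str
    lc_checksum: str          # checksum supplied by LC
    sdsc_checksum: Optional[str] = None
    sdsc_checksum_validated: Optional[date] = None
    transfer_status: str = "pending"
    received: Optional[date] = None
    replicas: List[Replica] = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    """Checksum helper; SHA-256 is an assumption, not stated in the slides."""
    return hashlib.sha256(data).hexdigest()


def validate_fixity(state: FileState, data: bytes, today: date) -> bool:
    """Recompute the checksum at SDSC and compare it with the LC checksum.

    The validation date is recorded only when the two checksums match.
    """
    state.sdsc_checksum = sha256_hex(data)
    matches = state.sdsc_checksum == state.lc_checksum
    if matches:
        state.sdsc_checksum_validated = today
    return matches
```

Keeping the LC checksum and the SDSC checksum as separate fields mirrors the list above, where each party's value and the validation date are tracked independently.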

State Information
- LC modification of data at SDSC
  - link for each file to LC metadata. Note two different catalogs are being used by LC.
  - version number for changes to file
  - audit trails for logging accesses to file (at least all write accesses)
- Data integrity
  - logging of all errors for each collection
  - logging of all errors for each storage system
  - name of procedure for recovering from each error type
  - logging of execution of recovery procedures
  - Result of execution of each recovery procedure
  - validation of consistency of the metadata catalog (file exists for each record)
  - validation of consistency of the storage vaults (record exists for each file)
  - dates of consistency checks
  - most recent date all checksums have been verified
  - most recent date all replicas have been synchronized
  - location of metadata catalog backups
  - most recent date metadata catalog backup created
  - location of metadata catalog log file
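The two consistency validations named above (a file exists for each catalog record, and a record exists for each stored file) amount to a two-way set comparison. A minimal sketch in Python, assuming both sides can be listed as sets of file identifiers; the function name and the result keys are hypothetical.

```python
from typing import Dict, Set


def check_consistency(catalog_records: Set[str],
                      vault_files: Set[str]) -> Dict[str, Set[str]]:
    """Compare the metadata catalog with the storage vault in both directions."""
    return {
        # Catalog check: every record should have a file in the vault.
        "records_without_file": catalog_records - vault_files,
        # Vault check: every stored file should have a catalog record.
        "files_without_record": vault_files - catalog_records,
    }


report = check_consistency({"a", "b", "c"}, {"b", "c", "d"})
```

Here `report["records_without_file"]` is `{"a"}` and `report["files_without_record"]` is `{"d"}`; two empty sets would mean the catalog and the vault agree.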

Highlights File Preservation Transfer Report: Standards
- What information is needed to effectively transfer preservation responsibility for the files themselves?
- Use the data standards supported by LC
- METS
  - Content packaging standard
  - Does not place restrictions on schemas
  - The METS Profile communicates rules about content and construction of METS objects.
  - METS is used to document this File Preservation Transfer Package
- PREMIS
  - Use of metadata to support digital preservation
  - Does not prescribe how information is expressed
  - Data dictionary is valuable for identifying existing metadata which satisfies requirements of the standard (SDSC State Information)

Highlights File Preservation Transfer Report: Scope
- Not relevant
  - Data that describe the specific repository environment but are not intrinsic to the file outside that repository context.
  - Example: storage location of replicas
- Relevant
  - Preservation processes that were applied to the file

Highlights File Preservation Transfer Report: Characteristics
- Descriptive metadata
  - None provided in this context; rather, a link to the LC Prints + Photographs database
- Technical and digital provenance metadata
  - Technical characteristics of the file
    - Can be extracted from file headers
  - Preservation events associated with the file
    - Examples: ingestion, fixity check
  - Identification of agent(s) responsible for an event
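A preservation event with its responsible agent, as described above, could be serialized along these lines. This is a hedged sketch using Python's standard `xml.etree.ElementTree`: the element names follow the PREMIS data dictionary, but the PREMIS 3 namespace is an assumption (the talk predates that version), the identifier values are invented, and the output is not validated against any schema or against the project's METS profile.

```python
import xml.etree.ElementTree as ET

# Assumption: PREMIS 3 namespace; the 2008 project would have used an earlier version.
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def q(tag: str) -> str:
    """Return the namespace-qualified tag name for ElementTree."""
    return f"{{{PREMIS_NS}}}{tag}"


def build_event(event_id: str, event_type: str, when: str,
                outcome: str, agent_id: str) -> ET.Element:
    """Build one PREMIS-style event element linked to a responsible agent."""
    event = ET.Element(q("event"))

    ident = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(ident, q("eventIdentifierType")).text = "local"
    ET.SubElement(ident, q("eventIdentifierValue")).text = event_id

    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = when

    info = ET.SubElement(event, q("eventOutcomeInformation"))
    ET.SubElement(info, q("eventOutcome")).text = outcome

    agent = ET.SubElement(event, q("linkingAgentIdentifier"))
    ET.SubElement(agent, q("linkingAgentIdentifierType")).text = "local"
    ET.SubElement(agent, q("linkingAgentIdentifierValue")).text = agent_id
    return event


# Hypothetical values for illustration only.
fixity_event = build_event("event-0001", "fixity check",
                           "2008-02-11T00:00:00Z", "success", "agent-sdsc")
xml_text = ET.tostring(fixity_event, encoding="unicode")
```

In a transfer package of the kind described in these slides, a fragment like `xml_text` would sit inside the METS document rather than stand alone.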

Questions Outstanding
- Not implemented
- Procedures for handling file versions created as part of the preservation function should be explored.
- Controlled value lists for event types, event outcomes, etc. should be developed to facilitate consistent application of terminology.
- Although the profile was developed for all file preservation transfer needs, it was created in the context of a particular scenario (image files), so it needs more testing.
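One way to picture a controlled value list is as an enumeration that rejects terms outside the list. A minimal sketch in Python: the event types reuse the two examples given earlier in the slides (ingestion, fixity check), while the outcome values and all names are hypothetical placeholders, not a proposed vocabulary.

```python
from enum import Enum


class EventType(Enum):
    """Controlled list of event types (only the two examples from the slides)."""
    INGESTION = "ingestion"
    FIXITY_CHECK = "fixity check"


class EventOutcome(Enum):
    """Controlled list of event outcomes (hypothetical values)."""
    SUCCESS = "success"
    FAILURE = "failure"


def parse_event_type(label: str) -> EventType:
    """Normalize a free-text label and map it to a controlled term.

    Raises ValueError when the label is not in the controlled list.
    """
    return EventType(label.strip().lower())
```

With this sketch, `parse_event_type(" Fixity Check ")` resolves to `EventType.FIXITY_CHECK`, while an unlisted label raises `ValueError` instead of silently entering the metadata.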

Future Work: NDIIPP/Chronopolis
- This profile is the starting point for work that will be done on the Chronopolis project.
- NDIIPP/Chronopolis
  - Preservation environment
  - Replicate data
- Providers
  - SDSC, UCSD Libraries, University of Maryland, NCAR
- Clients
  - CDL (web crawls), ICPSR (social science data)

Discussion questions follow

Round Robin Discussion Questions
1. How are you currently using or planning to use PREMIS?
2. What types of information/objects do you currently preserve in your organization?
3. What preservation metadata do you currently record about the objects you preserve (if any)?
4. How are you recording it, e.g., database, METS/XML, other?
5. What are the barriers to PREMIS implementation?