Capturing preservation metadata from institutional repositories. Preserv Project. Presented by Steve Hitchcock, Intelligence Agents Multimedia Group

Capturing preservation metadata from institutional repositories. Preserv Project. Presented by Steve Hitchcock, Intelligence Agents Multimedia Group, School of Electronics and Computer Science (ECS), Southampton University. DCC Workshop on the Long-term Curation within Digital Repositories, Cambridge, 6 July 2005

Abstract: Preservation scenarios are often based on hypothetical situations, generalised applications such as digital libraries or cultural heritage organisations, or on specific applications such as digitisation. This presentation will consider the emerging but real scenario of preservation in the context of institutional repositories (IRs), being investigated by the JISC Preserv project. While on the surface this particular scenario may not seem to differ significantly from others, we will build a picture of relationships between repositories and preservation service providers to reveal what differences there may be with other scenarios and to understand the implications. We will use this analysis to inform the capture of some preservation metadata from IRs through the user deposit interface, perhaps the most critical data capture point in the IR preservation chain. Some initial ideas on formalising these IR preservation elements will be proposed for consultation, with a view to learning from, and possibly contributing to, the standard reference in this area, currently the PREMIS Working Group Data Dictionary for Preservation Metadata.

Preserv, Capturing preservation metadata from institutional repositories, DCC, Cambridge, 6 July 2005

Preservation

Preservation: storage media, media refreshing, reformatting, backups and disaster recovery, environment, audit, security, preservation strategy, migration, technology preservation, emulation, records management, etc.

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Institutional repository: Eprints.org, DSpace, FAIR, JISC DRs

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access)

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI
"Access is still not the primary purpose of a preservation system" (Cornell OAIS tutorial)

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface (OAI); Machine interface
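The diagram names OAI as the machine interface between repositories and service providers. Assuming this refers to the OAI Protocol for Metadata Harvesting (OAI-PMH), a harvesting request is just a URL with a `verb` and a few arguments. The sketch below only builds such a URL; the endpoint address is hypothetical and no request is sent.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, arguments: dict) -> str:
    """Build an OAI-PMH request URL; nothing is sent over the network."""
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# Hypothetical IR endpoint, for illustration only.
BASE_URL = "https://ir.example.org/oai"

# Ask for Dublin Core records added or changed since a given date.
harvest_url = oai_request_url(
    BASE_URL,
    "ListRecords",
    {"metadataPrefix": "oai_dc", "from": "2005-01-01"},
)
print(harvest_url)
```

A preservation service provider could issue requests like this on a schedule to learn what has been deposited, although `oai_dc` carries descriptive rather than preservation metadata.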

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface
"It is important to build the concept of preservation from the outset" (JISC Circular 4/04)

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; Eprints deposit interface

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; Eprints deposit interface
Contents of IRs: many types of digital objects, formats; versioning issues, some duplication; different degrees of moderation: institutional membership is selection baseline

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface; Format; Format; Format ID: TNA + Pronom

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface; Format; Format; Format ID: TNA + Pronom; Influence/feedback
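To make the "Format ID" step concrete, here is a minimal sketch of identifying a file's format at deposit time from its leading bytes. The signature table is a tiny illustrative subset written for this example, not data taken from the Pronom registry, and `deposit_record` is a hypothetical helper rather than part of any repository software.

```python
# Leading-byte signatures for a few common formats (illustrative subset only,
# not drawn from the Pronom registry).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"%!PS", "PostScript"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data: bytes) -> str:
    """Return a format name based on the file's first bytes, or 'unknown'."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def deposit_record(filename: str, data: bytes) -> dict:
    """Hypothetical helper: metadata captured automatically at deposit."""
    return {
        "filename": filename,
        "format": identify_format(data),
        "size_bytes": len(data),
    }


record = deposit_record("paper.pdf", b"%PDF-1.4\n% example content")
print(record)
```

Capturing the format this way needs no extra effort from the depositing author, and an "unknown" result could feed back to the depositor or the repository manager.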

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface; Format; Format; Format ID: TNA + Pronom
Stakeholders: users, IR managers and admins, heads of institutions, research funders, course leaders, teachers, education funders

Connecting IRs (content providers) and preservation services
– How far can we apply the OAIS model across our IR-preservation model?
– How can we embrace preservation metadata in this model?
– To what extent do these apply just to the preservation component? It looks as if many of the ideas focus on the preservation archive rather than the content provider.
– How can we connect content providers with preservation services?

Preservation metadata is seldom shared across organizations
Clifford Lynch: "there has been some useful work done on metadata standards for preservation, although that work is not highly advanced. Part of the problem is that a lot of the work on preservation metadata has given rise to organizational guidance, the kinds of things you should think about as you attach metadata to objects when you want to preserve them, rather than hard specifics that would be more typical in interchange format, because preservation metadata today is seldom shared across organizations."*
* "Since this talk was given there has been a great deal of progress in relevant areas here. I would point the interested reader at the work on METS, PREMIS, and the NISO Still Image Technical Metadata draft standard."
Preserving Digital Documents: Choices, Approaches, and Standards, Law Library Journal, 96 (4)

OAIS model: This model is not very different from the schematic sketched for Preserv, especially in terms of the core components (Ingest, Data Management, Archival Storage, Access), but how effectively can OAIS be applied across different organisations?

Ingest relies upon rules established to determine the metadata that must be present, the formats that are acceptable, the means that may be used for transferring objects, and the quality checks that must be performed.
Archival Storage functions are like storage functions that are performed in all kinds of digital storage environments, whether long-term preservation is a goal or not. The difference lies in the added rigor in error checking, media replacement, and disaster recovery.
Data Management provides the glue for the system by capturing and managing all of the metadata that is needed to operate the system. As in Archival Storage, the functions of Data Management are familiar to anyone who has worked with production databases.
Access in OAIS may provide objects to an intermediary system that then interacts directly with users, or it may deliver directly to users. Access is still not the primary purpose of a preservation system.
From Digital Preservation Management, 4B. The OAIS Reference Model. A Cornell tutorial
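The ingest rules described above (required metadata, acceptable formats, quality checks) can be sketched as a simple validation step. The required fields, the accepted formats and the use of a SHA-256 checksum as the quality check are all assumptions made for this example; a real archive would set its own rules.

```python
import hashlib

# Assumed ingest rules, for illustration only.
REQUIRED_METADATA = {"title", "creator", "date"}
ACCEPTED_FORMATS = {"application/pdf", "text/xml"}


def check_submission(metadata: dict, mime_type: str, data: bytes,
                     declared_sha256: str) -> list:
    """Return a list of problems; an empty list means the object may be ingested."""
    problems = []
    missing = REQUIRED_METADATA - set(metadata)
    if missing:
        problems.append("missing metadata: " + ", ".join(sorted(missing)))
    if mime_type not in ACCEPTED_FORMATS:
        problems.append("format not accepted: " + mime_type)
    # Quality check: the bytes received must match the checksum declared by the sender.
    if hashlib.sha256(data).hexdigest() != declared_sha256:
        problems.append("checksum mismatch")
    return problems


payload = b"example eprint content"
good = check_submission(
    {"title": "An eprint", "creator": "A. Author", "date": "2005-07-06"},
    "application/pdf",
    payload,
    hashlib.sha256(payload).hexdigest(),
)
bad = check_submission({"title": "An eprint"}, "image/bmp", payload, "0" * 64)
print(good)
print(bad)
```

In the IR scenario, checks like these could run at the deposit interface, at the preservation service, or both, which is part of the question of how OAIS applies across organisations.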

OAIS information model: AIP
Archival Information Package (AIP):
– Content Information: original target of preservation; Information Object (Data Object & Representation Information)
– Preservation Description Information (PDI): other information (metadata) "which will allow the understanding of the Content Information over an indefinite period of time"
From Michael Day, Categories, uses and challenges of metadata and process documentation

OAIS information model: PDI
Preservation Description Information (PDI) comprises Reference Information, Provenance Information, Context Information and Fixity Information (Figure 4-16)
From Michael Day, Categories, uses and challenges of metadata and process documentation
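As a rough illustration of the information model, the sketch below represents an AIP as Content Information plus PDI with its four categories. The class and field names, and the choice of a SHA-256 digest as Fixity Information, are assumptions for this example and are not prescribed by OAIS.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class PreservationDescriptionInformation:
    reference: str     # Reference Information: an identifier for the content
    provenance: list   # Provenance Information: history of the content
    context: str       # Context Information: relation to its environment
    fixity: str        # Fixity Information: here, a SHA-256 digest (assumed)


@dataclass
class ArchivalInformationPackage:
    data_object: bytes                 # the bits being preserved
    representation_information: str    # what is needed to interpret the bits
    pdi: PreservationDescriptionInformation

    def fixity_ok(self) -> bool:
        """Check the stored bits against the recorded Fixity Information."""
        return hashlib.sha256(self.data_object).hexdigest() == self.pdi.fixity


def build_aip(data: bytes, fmt: str, identifier: str, depositor: str,
              collection: str) -> ArchivalInformationPackage:
    pdi = PreservationDescriptionInformation(
        reference=identifier,
        provenance=["deposited by " + depositor],
        context="member of " + collection,
        fixity=hashlib.sha256(data).hexdigest(),
    )
    return ArchivalInformationPackage(data, fmt, pdi)


# Hypothetical identifier and names, for illustration only.
aip = build_aip(b"example eprint content", "PDF", "oai:ir.example.org:123",
                "A. Author", "ECS eprints")
print(aip.fixity_ok())
```

Much of this PDI can be filled in at deposit, which is why the deposit interface matters as a capture point.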

A suggestion: PDI, or somewhere, ought also to indicate what you want to do with the object, and perhaps act as the basis of selection for preservation services. What isn't clear is how these features could be incorporated.

Preservation metadata
PREMIS = Preservation Metadata: Implementation Strategies
Preservation metadata = "the information a repository uses to support the digital preservation process"
The PREMIS Data Dictionary for Preservation Metadata (May 2005)

PREMIS data model: Intellectual entities, Objects, Events, Rights, Agents

PREMIS Data Dictionary, v 1.0
Defines semantic units for Objects, Events, Agents and Rights
– Object: objectIdentifier, preservationLevel, objectCategory, objectCharacteristics (format, significant properties, etc.), creatingApplication, storageMedium, environment (dependencies, hardware and software details, etc.), relationship, …
– Event: eventIdentifier, eventType (from a controlled list, e.g. ingestion, migration, normalization), eventDateTime, eventDetail, eventOutcomeInformation, linkingAgentIdentifier, …
– Agent: agentIdentifier, agentName, agentType, …
– Rights: permissionStatement, …
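To show how these semantic units might fit together in practice, here is a minimal sketch that records an ingestion event against an object. The field names follow the semantic units listed above; the flat class layout, the `events` list on the object and all example values are assumptions for illustration, not the structure defined by the Data Dictionary.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Agent:
    agentIdentifier: str
    agentName: str
    agentType: str  # e.g. person, organization, software


@dataclass
class Event:
    eventIdentifier: str
    eventType: str                # e.g. ingestion, migration, normalization
    eventDateTime: str            # ISO 8601 date-time string
    eventDetail: str
    eventOutcomeInformation: str
    linkingAgentIdentifier: str   # points at the Agent responsible


@dataclass
class PremisObject:
    objectIdentifier: str
    preservationLevel: str        # example value below is an assumption
    objectCategory: str
    format: str                   # part of objectCharacteristics in the dictionary
    # Illustrative shortcut: events are kept on the object for easy printing.
    events: list = field(default_factory=list)


# Hypothetical identifiers and values throughout.
software = Agent("agent-1", "IR deposit interface", "software")
eprint = PremisObject("obj-123", "full", "file", "PDF")
eprint.events.append(Event(
    "event-1", "ingestion", "2005-07-06T10:00:00Z",
    "deposited through the author deposit interface",
    "success", software.agentIdentifier,
))

print(asdict(eprint))
```

Because the event is created by software at deposit, this kind of record can be captured automatically rather than typed in by the author.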

Limits to scope of PREMIS dictionary
– Does not focus on descriptive metadata: domain specific and dealt with by many other schemes
– Does not deal with technical metadata for all different types of digital file (left to format experts)
– Does not consider in detail the business rules of a repository, e.g. roles, policies, and strategies (but this could be added to data model)

Resonance
Points raised by PREMIS that have resonance for Preserv are:
– "Questions about business plans, policies, preservation strategies, as well as metadata"
– "Recognition of the need for automatic capture of metadata"

Some questions
– Is there a need or scope for Preserv to describe semantic units within the PREMIS data dictionary that may be relevant to preservation metadata for IRs? Is what we need already included?
– For example, might the result of interaction with Pronom produce data residing in the Object entity (but noting that PREMIS "Does not deal with technical metadata" for all types of digital file)?
– Which other entities might we contribute to? Events is one possibility.
– It's possible that some information we'd like to capture, e.g. funder, might fall within the scope of the Intellectual entity, which is outside the scope of PREMIS.

Diagram: Preservation (storage media, migration, etc.); Preserv partner: British Library; Other preservation service providers; Preserv partners: eprints.soton, Oxford Univ.; IRs: Eprints.org, DSpace, etc.; User/author (Deposit); User/reader (Access); Machine interface: OAI; IR author deposit interface; Format; Format; Format ID: TNA + Pronom
Stakeholders: users, IR managers and admins, heads of institutions, research funders, course leaders, teachers, education funders
Q. Preservation metadata or selection metadata?

Preservation metadata or selection metadata?
Who are the key stakeholders in selecting materials to preserve?
– Institutions: selection by admission to IR, or other criteria?
– Research funders: preserve outputs of a funded programme
– Authors: objective or subjective?
Preservation business models for IRs: perhaps the answer lies in who pays?

Credits
– Southampton University: Les Carr, Tim Brody, Jessie Hey, Steve Hitchcock
– British Library: Richard Boulderstone, Adam Farquhar, Richard Masters
– National Archives: Adrian Brown
– Oxford University: David Price, Frances Boyle, Neil Jefferies, Michael Popham