Preservation Fedora Preservation Svcs WG - Sept. 1, 2006 1 Fedora Preservation Services (A Working Group Report) Long-term Repositories: Taking the Shock.


Fedora Preservation Svcs WG - Sept. 1, 2006

Fedora Preservation Services (A Working Group Report)
Long-term Repositories: Taking the Shock out of the Future
Two-day forum on PREMIS Preservation Metadata and the Trusted Digital Repositories
August 31, September 1, National Library of Australia

Topics for Discussion Today
- Working Group Formation
- Digital Preservation - Background and Philosophy
- Concept Architecture and the Digital Object
- Fedora Preservation Services and the Audit Checklist

Working Group
Vision: Expand the Fedora framework to facilitate the creation of trusted digital repositories
Objectives
- Define the requirements and architecture for preservation services that can be integrated into Fedora.
- Process focus: from ingest throughout the object life cycle
- Organize and coordinate collaborative development
Formation of WG
- From Fedora Users’ Conference at Rutgers (May, 2005)
- Charter Members from: Cornell, Harris, Northwestern, Rutgers, Tufts, Yale

Reference Documents
- RLG/NARA draft “An Audit Checklist for the Certification of Trusted Digital Repositories”
- RLG (2001). Attributes of a Trusted Digital Repository: Meeting the Needs of Research Resources. Mountain View, CA.
- PREMIS
- OAIS Reference Model
- Global Digital Format Registry
- Maintain Guide, draft – March 2006, Digital Collections and Archives, Tufts University; Manuscripts & Archives, Yale University.

Skeptics and Possibilities
- Cullen (2000) asks rhetorically, “How confident can we be when an object whose authentication is crucial depends on electricity for its existence?”*
- “... the proliferation of experience, research, and infrastructure throughout the cultural heritage community has made trustworthy digital repositories conceptually realistic”**
* Cullen, C. (2000). Authentication of digital objects: Lessons from a historian’s research. In Authenticity in a Digital Environment (CLIR publication 92, pp. 1 – 7). Washington, D.C., Council on Library and Information Resources.
** RLG (2005). An Audit Checklist for the Certification of Trusted Digital Repositories. Mountain View, CA.

Archiving Digital Objects (Some questions)
- Can we define the “digital original”?
- Can we trace back from the nth migration to the digital original?
- Is this object format at risk of obsolescence?
- Can this object be properly preserved/migrated?
- Can/should we preserve dynamic behavior?

Digital Preservation Definitions (There are many)
- The long term maintenance of a byte stream (including metadata) sufficient to reproduce a suitable facsimile of the document, and continued accessibility of the contents thru time and changing technology. (Research Libraries Group)
- The ability to keep digital documents and files for time periods that transcend technological advances without concern for alteration or loss of readability (Association for Information and Image Management)
- “... the process of migrating a digital entity forward in time while preserving its authenticity and integrity” (Moore & Marciano, 2005)

Preservation Services Concept Architecture
[Diagram labels: Fedora Framework; Fedora Repository Service; Digital Object Repository; Content Models; Format Registry; Preservation Services (Alerting, Migration, Statistics); Event Notification; Preservation Monitoring; Preservation Integrity; Preservation Portal; Monitoring]

The Digital Object
- The digital object is the basic unit of management, encapsulating all essential information about the “document” to be disseminated and preserved
- The digital object should be independent of the environment where possible
- Use standards and non-proprietary formats to minimize dependencies

Events (Transformations) in the Life Cycle of the Digital Object
[Diagram labels: Submission Information Package; T1; Digital Original (AIP); T2; Migrated Derivatives; T3; Output Device; Repository]

Preservation Services (Candidate Capabilities)
Object Level Features
- Audit trails and datastream versioning (available in Fedora 2.1)
- Persistent Identifiers (available in Fedora 2.1)
- Checksum creation and validation (active)
- Object format validation (active)
- Content model validation (active)
- Whole object versioning
System Level Services
- Event management and alerting (active)
- Repository redundancy/mirroring service (active)
- Format migration
- Enable Repository static/active states
- History service of major repository events
- Preservation planning – set up object life cycle policies
- Statistics reporting – ingests, purges, signature failures, etc.

Digital Object Integrity
Ability to create and compare checksums on a datastream
- On-Demand Checksum - A new Fedora API to support client-initiated checksums.
  - CreateChecksum - Allows the application to request the Fedora repository service to create a checksum for a datastream.
  - CompareChecksum - Compares a checksum from “contentDigest” on a datastream to a re-computed checksum.
- Auto-Checksum option – A repository configuration option that will automatically calculate a checksum of datastream content for every datastream in every object.
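The create/compare idea can be sketched in a few lines of Python. This is only an illustration of the fixity technique, not the Fedora API: the function names `create_checksum` and `compare_checksum` are hypothetical, and SHA-256 is an assumed algorithm choice that the slides do not specify.

```python
import hashlib


def create_checksum(content: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of a datastream's content (hypothetical helper)."""
    return hashlib.new(algorithm, content).hexdigest()


def compare_checksum(content: bytes, stored_digest: str,
                     algorithm: str = "sha256") -> bool:
    """Recompute the digest and compare it with the stored value."""
    return create_checksum(content, algorithm) == stored_digest.lower()


# Illustrative datastream content, not taken from any real repository.
datastream = b"archival master bytes"
digest = create_checksum(datastream)

intact = compare_checksum(datastream, digest)            # unchanged bytes
altered = compare_checksum(datastream + b"!", digest)    # one byte appended
print(intact, altered)
```

An auto-checksum option would amount to calling the create step on every datastream at ingest or modification time and storing the result alongside the content.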

Events and Outcomes
An event is an:
- ... action that involves at least one object, agent, and/or rights entity (PREMIS).
- ... occurrence that is significant to the performance of a task
Event outcome – a situation or state that follows an event and is a result of the event.

Fedora Event Management
Generic Framework
- Events can have messages which are associated with all types of services (preservation, collection, user, etc.)
- Messages represent events with actions and outcomes
- Fedora will provide a middleware messaging solution based on the open-source Java Message Service (JMS)
Fedora Working Group Focus
- Preservation events are atomic (i.e. associated with a Fedora API)
- The event message will be based on the PREMIS event entity
- Initial types: ingest, delete, modify, fixityCheck

The Event Message
Event message structure
- The message payload will be XML-based and use the PREMIS event entity semantic units
- Global identifiers (URIs) will be used for event type and outcome
An example might look like the following:
  Rucore event
  info:premis/preservation/event/ingest
  T19:20:30
  (to be used for general information)
  info:premis/preservation/outcome/success
  (more text)
  rutgers-lib:200
  rutgers-lib:400
  rutgers-lib:4291
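As a rough illustration of such a payload, the sketch below builds an XML message from PREMIS event semantic units (eventIdentifier, eventType, eventDateTime, eventDetail, eventOutcomeInformation, linkingObjectIdentifier). The element layout, the `build_event_message` helper, the "local" identifier type, the event identifier, and the timestamp are all assumptions made for the example. The working group's actual message schema is not recoverable from the slide.

```python
import xml.etree.ElementTree as ET


def build_event_message(event_id, event_type, when, outcome, object_ids, detail=""):
    """Serialize one event as XML using PREMIS event semantic unit names.

    The nesting is an assumed layout, not the working group's schema.
    """
    event = ET.Element("event")
    ident = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(ident, "eventIdentifierType").text = "local"  # assumed type
    ET.SubElement(ident, "eventIdentifierValue").text = event_id
    ET.SubElement(event, "eventType").text = event_type
    ET.SubElement(event, "eventDateTime").text = when
    if detail:
        ET.SubElement(event, "eventDetail").text = detail
    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = outcome
    for oid in object_ids:
        link = ET.SubElement(event, "linkingObjectIdentifier")
        ET.SubElement(link, "linkingObjectIdentifierType").text = "local"
        ET.SubElement(link, "linkingObjectIdentifierValue").text = oid
    return ET.tostring(event, encoding="unicode")


message = build_event_message(
    event_id="example-event-1",                      # hypothetical identifier
    event_type="info:premis/preservation/event/ingest",
    when="2000-01-01T00:00:00",                      # placeholder timestamp
    outcome="info:premis/preservation/outcome/success",
    object_ids=["rutgers-lib:200"],
)
print(message)
```

Using URIs for the event type and outcome, as the slide proposes, lets subscribers match on exact strings without parsing free text.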

Event Management - Ingest (Using the publisher/subscriber model)
[Diagram labels: User Input; Workflow Management System; XML Digital Object; Ingest; Digital Object Repository (Fedora), JMS (snd/rcv); JMS Topic Queue (ingest, delete); Preservation Service (reporting), JMS (snd/rcv); Preservation Service (alerting), JMS (snd/rcv)]
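The publisher/subscriber flow in this diagram can be mimicked with a small in-memory broker. This is a stand-in written for illustration and does not use the JMS API. The `TopicBroker` class, the message dictionaries, and the failed delete are all hypothetical.

```python
from collections import defaultdict


class TopicBroker:
    """In-memory stand-in for a message broker with named topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to receive every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver a message to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = TopicBroker()
report_counts = defaultdict(int)   # state kept by the reporting service
alerts = []                        # state kept by the alerting service


def reporting_service(message):
    report_counts[message["type"]] += 1


def alerting_service(message):
    if message["outcome"] != "success":
        alerts.append(message)


for topic in ("ingest", "delete"):
    broker.subscribe(topic, reporting_service)
    broker.subscribe(topic, alerting_service)

# The repository acts as publisher; both preservation services are subscribers.
broker.publish("ingest", {"type": "ingest", "object": "rutgers-lib:200", "outcome": "success"})
broker.publish("delete", {"type": "delete", "object": "rutgers-lib:400", "outcome": "failure"})
print(dict(report_counts), len(alerts))
```

The point of the pattern is that the repository does not need to know which services exist. New subscribers such as a statistics service can be added without changing the publisher.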

Content Models (Content Model Dissemination Architecture – CMDA)
The CM object specifies constraints on the digital object (DO)
- MIME type and format
- Min/max of number of datastreams
- Whether multiple datastreams are ordered
The CM is used to determine runtime behavior
- On ingest, Fedora validates DO based on CM constraints
- Disseminators are not bound into the DO
- Run time binding occurs through the CM object and the rels-ext datastream
The CM can point to a format registry
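A minimal sketch of constraint checking at ingest follows, assuming a content model reduces to a list of per-datastream rules (ID, allowed MIME types, min/max count). The `DsTypeModel` class, the `validate` function, and the MIME types are hypothetical names and values chosen for the example. Fedora's actual validation logic is not described in the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DsTypeModel:
    """One datastream rule in a content model (hypothetical structure)."""
    ds_id: str
    mime_types: Tuple[str, ...]
    min_count: int = 1
    max_count: int = 1


def validate(datastreams: List[Tuple[str, str]], model: List[DsTypeModel]) -> List[str]:
    """Check (datastream ID, MIME type) pairs against the model; return error strings."""
    errors = []
    for rule in model:
        found = [mime for ds_id, mime in datastreams if ds_id == rule.ds_id]
        if not rule.min_count <= len(found) <= rule.max_count:
            errors.append(
                f"{rule.ds_id}: expected {rule.min_count}..{rule.max_count}, found {len(found)}"
            )
        for mime in found:
            if mime not in rule.mime_types:
                errors.append(f"{rule.ds_id}: MIME type {mime} not allowed")
    return errors


# Assumed book model: one archival master and one presentation PDF.
book_model = [
    DsTypeModel("ARCH1", ("image/tiff",)),
    DsTypeModel("PDF1", ("application/pdf",)),
]

good = validate([("ARCH1", "image/tiff"), ("PDF1", "application/pdf")], book_model)
bad = validate([("PDF1", "text/plain")], book_model)
print(good, bad)
```

An empty error list means the object satisfies every rule. Anything else would be grounds to reject the ingest or raise an event.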

Content Models and Disseminators (A book example)
[Diagram. Objects shown: Book Object, Content Model, Bmech Object, Bdef Object, Format Registry. Relationships shown: hasCM, hasBmech, hasBdef.]
Book Object datastreams:
- PDF1 - presentation
- XML1 – OCR text
- ARCH1 - Archival master (tiffs of each page)
- DJVU1 - presentation
- SMAP1 – StrMap (TOC)
Other labels: Persistent ID, Metadata, Rels-Ext, Composite Model, WSDL, MethodMap
Composite Model entry: <dsTypeModel ID=“ARCH1” ordered=“false” min=“1” max=“1”>

A Trusted Repository
is one “...that establishes methodologies for system evaluation that meet community expectations of trustworthiness; that can be depended upon to carry out its long-term responsibilities to depositors and users openly and explicitly; and whose policies, practices, and performance can be audited and measured”*
* RLG (2001). Attributes of a Trusted Digital Repository: Meeting the Needs of Research Resources. Mountain View, CA.

Certification Checklist (How Fedora Preservation Services Can Help)
B. Repository Functions, Processes, & Procedures
B1. Ingest/acquisition of content
- B1.1 Repository identifies properties it will preserve for each class of digital object. => Content Models
- B1.3 Repository has an identifiable, written definition for each SIP or class of information ingested by the repository. => Content Models
- B1.6 Repository’s ingest process verifies each SIP for completeness and correctness. => Content Model validation
- B1.7 Repository provides Producer/depositor with appropriate responses at predefined points during the ingest processes. => Event Management
B2. Archival storage: management of archived information
- B2.1 Repository has an identifiable, written definition for each AIP or class of information preserved by the repository. => Content Models
- B2.4 Repository has and uses a naming convention that can be shown to generate visible, unique identifiers for all AIPs. => Persistent IDs
- B2.6 Repository verifies each AIP for completeness and correctness when generated. => Content Models
B3. Preservation planning, migration, & other strategies
- B3.3 Repository uses appropriate international Representation Information (including format) registries. => Content Models
- B3.7 Repository actively monitors AIP integrity. => Create and validate checksum
- B3.8 Repository has contemporaneous records of actions taken associated with ingest and archival storage processes and those administration processes that are relevant to the preservation. => Audit trails
- B3.9 Repository has mechanisms in place for monitoring and notification when Representation Information (including formats) approaches obsolescence or is no longer viable. => Event Management

Certification Checklist (How Fedora Preservation Services Can Help)
B. Repository Functions, Processes, & Procedures (continued)
B4. Data management
- B4.1 Repository captures or creates minimum descriptive metadata and ensures that it is associated with the AIP. => Content Models
B5. Access management
- B5.2 Repository logs all access management failures, and staff review inappropriate “access denial” incidents. => Event Management
- B5.6 Repository enables the dissemination of authentic copies of the original or objects traceable to originals. => Audit trails and versioning
C. The Designated Community & the Usability of Information
C3. Use & usability
- C3.2 Repository has implemented a policy for recording all access actions (includes requests, orders etc.) that meet the requirements of the repository and information Producers/depositors. => Event Management
- C3.4 Repository has documented and implemented access policies (authorization rules, authentication requirements) consistent with deposit agreements for stored objects. => Security – XACML policy enforcement
D. Technologies & Technical Infrastructure
D1. System infrastructure
- D1.2 Repository ensures that all platforms have a backup function sufficient for the repository’s services and for the data held, e.g., metadata associated with access controls, repository main content, etc. => Journaling/mirroring
- D1.5 Repository has effective mechanisms to detect data corruption or loss. => Checksum compare
- D1.6 Repository reports to its administration all incidents of data corruption or loss, and steps taken to repair/replace corrupt or lost data. => Event Management
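The checklist-to-capability pairing lends itself to a simple lookup table, which a repository team might use for gap analysis. The sketch below records a subset of the pairs listed on the two checklist slides. The `CHECKLIST_SUPPORT` name and both helper functions are hypothetical.

```python
# Subset of checklist items and the Fedora capability the slides pair them with.
CHECKLIST_SUPPORT = {
    "B1.1": "Content Models",
    "B1.6": "Content Model validation",
    "B1.7": "Event Management",
    "B2.4": "Persistent IDs",
    "B3.7": "Create and validate checksum",
    "B3.8": "Audit trails",
    "C3.4": "Security - XACML policy enforcement",
    "D1.5": "Checksum compare",
    "D1.6": "Event Management",
}


def items_for(capability):
    """Return the checklist items in the table that a capability supports."""
    return sorted(item for item, cap in CHECKLIST_SUPPORT.items() if cap == capability)


def uncovered(items):
    """Return the requested checklist items that have no entry in the table."""
    return [item for item in items if item not in CHECKLIST_SUPPORT]


print(items_for("Event Management"))
print(uncovered(["B1.1", "B1.2"]))
```

Items reported by `uncovered` are simply absent from this table. That alone does not show that a repository fails them.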

Proposed Development Plan
Core Fedora Development
- Support for checksums on content bytestreams (Fedora R2.2)
- Messaging service in Fedora framework (R2.3)
- Formal expression and registration of “content models” (R3.0)
- Object validation based on “content models” (R3.0)
- Fedora API-M journaling and replay (for repository replication)
Sun Center of Excellence Partnership – Rutgers University Libraries
- Preservation services
- Digital Preservation Portal
Community Development
- We’re looking for those in the Fedora community who would be interested in developing preservation features and services.

Next Steps
Continuation of First Year WG Activities
- Decisions on WG renewal and continuation
- White paper on reference architecture
Possible Second Year Activities
- Workshop on Fedora-based preservation
- Possible grant applications
- Initiation of community development partnerships

Membership in the WG
- Grace Agnew – Rutgers
- Paul Bevan – National Library of Wales
- Dan Davis – Harris Corporation
- Kevin Glick – Yale
- Ron Jantz (chair) – Rutgers
- Karen Miller – Northwestern
- Sandy Payette – Cornell
- Eliot Wilczek – Tufts

Digital Preservation Process (Working Group Focus)
[Diagram labels: Acquire or Create the Submission Information Package; Ingest, Store, create access to the digital Object; Life Cycle Management of the Digital Object; Decision to Manage or Preserve Object (Yes/No); P1.0; P2.0; P3.0; XML-SIP; XML-AIP]

Event Management Concept Architecture (Using Java Message Service – JMS)
[Diagram labels: Fedora Messaging Framework; JMS Msg Broker; Closed Msg Pool (Msg 1, Msg 2, Msg 3, Msg 4, ... Msg n); Msg Structure (Header, Property, Body (payload)); Listening & Communication Services; Systems & Applications (send/receive msgs): Crawler, Applications, Fedora, Collection, User, Preservation, each with JMS (snd/rcv)]