Chronopolis – MetaArchive Improving and Strengthening Inter-Institutional Preservation.

From Silos to Interoperability
Digital preservation is still an emerging field
Two successful approaches:
  – Integrated Rule-Oriented Data System (iRODS)
  – Lots of Copies Keep Stuff Safe (LOCKSS)
Powerful technologies, currently isolated
Seeking to bridge the gap and foster interoperability

Presentation sections
Chronopolis Program overview
MetaArchive Cooperative overview
Current and proposed work to automate the exchange of data between the systems

Chronopolis Basic Facts
Three-node federated data grid at UCSD/SDSC, NCAR and UMIACS, with capacity for up to 50 TB of data per node (150 TB total)
Using the Storage Resource Broker (SRB) for data management (moving to iRODS)
Using the BagIt file packaging format and SRB tools to ingest and transfer data
Using the Auditing Control Environment (ACE) for integrity checking

Current Chronopolis collections, Spring 2010
Data providers:
  – Inter-university Consortium for Political and Social Research: preservation copy of collections, including 40 years of social science data and Census data
  – California Digital Library: political and government web crawls, Web-at-risk collection
  – SIO Explorer: data from 50 years of research voyages
  – NCSU Libraries: state and local geospatial data

MetaArchive Basic Facts
Established in 2004, preserving content for 15 member institutions
Uses LOCKSS software to provide long-term care for materials in a distributed digital preservation network
Sustainable organizational framework: membership organization with a 501(c)(3) host (Educopia Institute)
254 TB network capacity (adding more as new members join)
Compliant as a Trustworthy Digital Repository (2009 TRAC audit available on our site)

MetaArchive Collections
Current Members/Contributors, Spring 2010:
  – Auburn University
  – Boston College
  – Clemson University
  – Florida State University
  – Folger Shakespeare Library
  – Georgia Tech
  – Library of Congress
  – Penn State University
  – PUC Rio de Janeiro
  – Rice University
  – University of Hull
  – University of Louisville
  – University of North Texas
  – University of South Carolina
  – Virginia Tech
Current Affiliates: Library of Congress NDLTD SDSC Chronopolis
We welcome new members!

Collaboration Roadmap
Chronopolis and MetaArchive realize the value in looking at inter-institutional preservation
Have been pursuing this informally
Looking at ways of formalizing this process for long-term preservation goals

The Plan
Develop tools and methods to automate exchange of data between MetaArchive Cooperative (LOCKSS-based) and Chronopolis (iRODS-based)
Examine data transfer tools/protocols from:
  – California Digital Library micro-services
  – iRODS protocols for data transfer
  – LOCKSS “plug-in” approach for data transfer
Goal: a highly robust, easy-to-use preservation “system,” allowing digital objects to be shared between several major preservation networks in the U.S.

Focus Issues
What does it mean to unite systems?
Ability to export data between systems
  – Verify appropriate fixity
  – Transparency for system administrators
Ability to track collections between systems
  – Verify collections are retrievable
  – Verify collections retain original characteristics
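
The fixity and retrievability checks listed above can be sketched as a comparison of two checksum inventories, one exported from each system. This is only an illustration under stated assumptions: the slides do not describe how either system reports its holdings, so the inventory shape (a mapping of file path to checksum) and the names TransferReport and compare_inventories are hypothetical.

```python
from typing import Dict, List, NamedTuple


class TransferReport(NamedTuple):
    """Result of comparing a source inventory with a destination inventory."""
    missing: List[str]      # listed at the source, absent at the destination
    mismatched: List[str]   # present in both, but the checksums differ
    unexpected: List[str]   # present only at the destination


def compare_inventories(source: Dict[str, str],
                        destination: Dict[str, str]) -> TransferReport:
    """Compare two {path: checksum} inventories (hypothetical format)."""
    missing = sorted(p for p in source if p not in destination)
    mismatched = sorted(
        p for p in source
        if p in destination and source[p] != destination[p]
    )
    unexpected = sorted(p for p in destination if p not in source)
    return TransferReport(missing, mismatched, unexpected)


if __name__ == "__main__":
    source = {"data/a.txt": "aaa", "data/b.txt": "bbb"}
    destination = {"data/a.txt": "aaa", "data/b.txt": "xxx"}
    print(compare_inventories(source, destination))
```

An empty report on all three lists would indicate that every file is present at the destination with its original checksum. A report in this form is also something a system administrator on either side could read directly.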

Technical Issues
What are the best ways to have an SRB/iRODS datagrid and a LOCKSS PLN interact?
What does it mean to have an active system (MetaArchive) and an archival system (Chronopolis) work together?
What are the appropriate transfer technologies?
  – iRODS and LOCKSS native tools
  – CDL Micro-services, e.g. BagIt

The Process
Identify the atomic units in our process
  – E.g. ingest, verification, data transfer, fixity checking
Identify commonalities and differences
Resolve needed issues

Transfer technology: BagIt
Hierarchical file packaging format for exchanging digital content
  – There is no software to install
  – Consists of base directory with manifest file & subdirectory with content
  – Manifest file has a row for each content file with:
      Full path in content directory
      A checksum for file
“Holey” Bags
  – Have additional ‘fetch.txt’ file in base directory & empty content directory
  – URLs for each content file are listed in fetch.txt file
  – Can reduce transfer time by fetching content in parallel
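
The bag layout described above can be illustrated with a minimal sketch that uses only the Python standard library. The file and directory names (data/, manifest-md5.txt, and the "URL LENGTH FILEPATH" line format of fetch.txt) follow the BagIt specification rather than the slide, and a complete bag also needs a bagit.txt declaration, which is omitted here. The function names are hypothetical, and this is not the tooling used by either project.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List, Tuple


def file_checksum(path: Path, algorithm: str = "md5") -> str:
    """Hash a file in chunks so large payload files are not read at once."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path, algorithm: str = "md5") -> Path:
    """Write manifest-<algorithm>.txt with one row per file under data/."""
    rows = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            relative = path.relative_to(bag_dir).as_posix()
            rows.append(f"{file_checksum(path, algorithm)}  {relative}")
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    manifest.write_text("\n".join(rows) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir: Path, algorithm: str = "md5") -> List[str]:
    """Return payload paths that are missing or whose checksum differs."""
    failures = []
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    for row in manifest.read_text(encoding="utf-8").splitlines():
        if not row.strip():
            continue
        expected, relative = row.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file() or file_checksum(target, algorithm) != expected:
            failures.append(relative)
    return failures


def parse_fetch(text: str) -> List[Tuple[str, str, str]]:
    """Parse fetch.txt rows of the form: URL LENGTH FILEPATH."""
    entries = []
    for row in text.splitlines():
        if row.strip():
            url, length, filepath = row.split(maxsplit=2)
            entries.append((url, length, filepath))
    return entries


if __name__ == "__main__":
    bag = Path(tempfile.mkdtemp())
    (bag / "data").mkdir()
    (bag / "data" / "example.txt").write_text("hello", encoding="utf-8")
    write_manifest(bag)
    print(verify_manifest(bag))  # an empty list means every checksum matched
```

For a holey bag, the entries returned by parse_fetch could be handed to a pool of workers so that content files download in parallel, after which verify_manifest confirms that everything arrived intact.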

Initial development goals
XML-standardized representation of common technical data that needs to be tracked for exchange and preservation of data and metadata
Ingestion reference model and framework to enable automated and interoperable capture of metadata from files in MetaArchive and Chronopolis
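
As a rough illustration of what an XML representation of tracked technical data might look like, the sketch below writes and reads a small record with Python's xml.etree.ElementTree. The slides do not define a schema, so every element and attribute name here (exchangeRecord, file, fixity, sourceSystem) is a hypothetical placeholder, as are the function names.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Tuple


def build_exchange_record(identifier: str, source_system: str,
                          files: Iterable[Tuple[str, int, str, str]]) -> ET.Element:
    """Build a record from (path, size_bytes, algorithm, checksum) tuples.

    All element and attribute names are hypothetical, not a project schema.
    """
    record = ET.Element(
        "exchangeRecord", {"id": identifier, "sourceSystem": source_system}
    )
    for path, size_bytes, algorithm, checksum in files:
        entry = ET.SubElement(record, "file", {"path": path, "size": str(size_bytes)})
        fixity = ET.SubElement(entry, "fixity", {"algorithm": algorithm})
        fixity.text = checksum
    return record


def read_exchange_record(xml_text: str) -> Dict[str, Tuple[str, str]]:
    """Return {path: (algorithm, checksum)} parsed from a serialized record."""
    record = ET.fromstring(xml_text)
    result = {}
    for entry in record.findall("file"):
        fixity = entry.find("fixity")
        result[entry.get("path")] = (fixity.get("algorithm"), fixity.text)
    return result


if __name__ == "__main__":
    record = build_exchange_record(
        "collection-001", "MetaArchive",
        [("data/a.txt", 5, "md5", "5d41402abc4b2a76b9719d911017c592")],
    )
    xml_text = ET.tostring(record, encoding="unicode")
    print(xml_text)
    print(read_exchange_record(xml_text))
```

Keeping the checksum algorithm next to each checksum lets the receiving system recompute fixity without guessing how the sending system hashed the file.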

Procedural Issues
What exactly are the inter-institutional ties?
  – “Just” backup?
  – Added service for our customers/members?
  – Will all customers want this?
Legal issues with data owners
MetaArchive and Chronopolis have very different management approaches. How do cross-institutional decisions get made?

Organizational Issues
Having a “seat at the table” at meetings and planning processes
Working together on staffing and hiring
Working together to identify customers and new opportunities

The Big Win
Important data preservation demonstration
  – No single system can solve all problems
  – No single system appeals to all user needs
Practical, useful process for our organizations
  – Makes us individually stronger
  – Provides LOCKSS and iRODS systems with exit strategies if they ever prove necessary
  – Enables tools built for one system to be used by both

Contacts
MetaArchive:
Chronopolis:
Katherine Skinner:
David Minor: