Preserving eScholarship and Digitized Special Collections: Distributed Digital Preservation
Bill Donovan


25 March 2010. Bill Donovan, Boston College

Slide 2: Summary
As stewards of eScholarship and digitized special collections, we are responsible for saving these and other treasures effectively and economically.
One approach for digital preservation is being spearheaded by the MetaArchive Cooperative; collections are replicated by peer institutions to guard against loss.
The MetaArchive approach is one model for cultural memory organizations to consider adopting/adapting for their own use.

Slide 3: Rationale for this talk
Not recruiting for MetaArchive Cooperative
DDP = a work in progress
Just one approach, but promising
  – Adaptable for other “CMO” consortia?
  – Cultural memory organizations (CMOs)
Perspective of just one member
Ulterior motive: convince management


Slide 5: Special Collections

Slide 6: “Digital Preservation” defined
“Digital preservation” combines policies, strategies and actions that ensure access to digital content over time.
esources/preserv/defdigpres0408.cfm

Slide 7: Distributed Digital Preservation (DDP)
geographically dispersed sites

Slide 8: “MetaArchive Cooperative”?
low-cost, high-impact DDP for “CMOs”
  – e.g. libraries, research centers, and museums
founded in 2004; funding from:
  – NDIIPP (Library of Congress)
  – NHPRC (National Archives)
Not vendor-based; enables CMOs to own and control the process of digital preservation for themselves.

Slide 9: MetaArchive’s networks

Slide 10: MetaArchive’s ETD network

Slide 11: Policies & Strategy
Flat, Trim, Tight-Knit organization
  – P2P: no supermember, no host institution
  – Minimal overhead, bureaucracy
  – Emphasis on communication & collaboration
  – Committees: steering, technical, content, preservation
Self-sufficiency
  – avoid outsourcing; retain control
  – cost containment, understand & refine process
  – sustainable sources of funding

Slide 12: Policies & Strategy
Caches (dark archives)
  – 6 replications
  – Access only via contributing member
Active monitoring of the integrity of stored digital content (NOT just back-ups)
For ETDs, discovery via Networked Digital Library of Theses & Dissertations (NDLTD)
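The difference between active integrity monitoring and a plain back-up can be illustrated with a small sketch. It assumes a cache keeps a manifest of SHA-256 digests recorded at ingest and periodically re-checks the stored files against it. The function names, the manifest layout, and the choice of SHA-256 are illustrative assumptions, not details taken from the talk or from MetaArchive's software.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a digest for every file under root (hypothetical ingest step)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Re-check stored files against the manifest; report 'missing' or 'altered'."""
    problems = {}
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems[name] = "missing"
        elif sha256_of(target) != expected:
            problems[name] = "altered"
    return problems
```

A back-up alone would copy an altered file forward without complaint; an audit like this one is what notices the change.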

Slide 13: Local actions/responsibilities
Skills & infrastructure
Copyright responsibility
Data wrangling
  – Format choices
      Proprietary versus open formats
  – Bit preservation versus migration
  – Filenaming & directories
Preservation information (OAIS)

Slide 14: OAIS = Open Archival Information System
Adapted from: “Reference Model for an Open Archival Information System,” CCSDS B-1 (2002)

Slide 15: OAIS preservation information
Preservation Description Information
  – Reference Information
  – Provenance Information
  – Context Information
  – Fixity Information

Slide 16: OAIS preservation information
Preservation Description Information: Reference Information
… identifies, and if necessary describes, one or more mechanisms used to provide assigned identifiers for the Content Information. It also provides identifiers that allow outside systems to refer, unambiguously, to a particular Content Information. An example of Reference Information is an ISBN.

Slide 17: OAIS preservation information
Preservation Description Information: Provenance Information
… documents the history of the Content Information. … tells the origin or source of the Content Information, any changes that may have taken place since it was originated, and who has had custody of it since it was originated. Examples of Provenance Information are the principal investigator who recorded the data, and the information concerning its storage, handling, and migration.

Slide 18: OAIS preservation information
Preservation Description Information: Context Information
… documents the relationships of the Content Information to its environment. This includes why the Content Information was created and how it relates to other Content Information objects.

Slide 19: OAIS preservation information
Preservation Description Information: Fixity Information
… documents the authentication mechanisms and provides authentication keys to ensure that the Content Information object has not been altered in an undocumented manner. Example: Cyclical Redundancy Check code for a file.
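As a rough illustration of how the four kinds of Preservation Description Information might travel with a digital object, the sketch below bundles them into one record and uses a CRC-32 value as the Fixity Information, following the slide's example. The class and field names are assumptions made for this example; OAIS itself does not prescribe a data layout.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescription:
    """The four PDI categories from OAIS (field names are illustrative only)."""
    reference: str      # an assigned identifier, e.g. an ISBN
    provenance: str     # origin, changes, and custody history
    context: str        # why the content exists and what it relates to
    fixity_crc32: int   # CRC-32 of the content bytes


def describe(content: bytes, reference: str, provenance: str,
             context: str) -> PreservationDescription:
    """Build a PDI record, computing the fixity value from the content."""
    return PreservationDescription(reference, provenance, context,
                                   zlib.crc32(content))


def is_unaltered(content: bytes, pdi: PreservationDescription) -> bool:
    """True if the content still matches the recorded CRC-32."""
    return zlib.crc32(content) == pdi.fixity_crc32
```

A CRC is good at catching accidental corruption but offers no protection against deliberate tampering, so a cryptographic hash is a common alternative where that matters.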

Slide 20: MetaArchive hierarchy
Archive (6+ caches per network)
  – Genre- or Format-based
Collections (1+ per member)
  – Collection level metadata
Archival unit (1+ per ingest)
  – e.g., all ETDs for each year
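The three-level hierarchy can be sketched as nested records. This is only a reading of the slide: the class names, the fields, and the rule that every collection should hold at least one archival unit are assumptions for illustration, not MetaArchive's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivalUnit:
    """Smallest preserved unit, e.g. all ETDs for one year."""
    label: str


@dataclass
class Collection:
    """One member's collection, carrying collection-level metadata."""
    member: str
    metadata: dict[str, str]
    archival_units: list[ArchivalUnit] = field(default_factory=list)


@dataclass
class Archive:
    """A genre- or format-based archive replicated across caches."""
    theme: str
    caches: list[str]
    collections: list[Collection] = field(default_factory=list)

    def problems(self) -> list[str]:
        """List violations of the minimum counts assumed from the slide."""
        found = []
        if len(self.caches) < 6:
            found.append(f"caches: {len(self.caches)} present, at least 6 expected")
        for collection in self.collections:
            if not collection.archival_units:
                found.append(f"{collection.member}: no archival units")
        return found
```

For example, an ETD archive with six caches and one collection holding an "ETDs 2009" archival unit would report no problems.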

Slide 21: Lots of Copies Keep Stuff Safe (LOCKSS)
open-source software/support to preserve web-published materials
decentralized digital preservation infrastructure
migrates content forward in time
bits & bytes continually audited & repaired
MetaArchive members also join LOCKSS
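The idea behind "continually audited & repaired" can be shown with a deliberately simplified sketch: each cache's copy is hashed, the copies vote, and any copy that disagrees with a strict majority is replaced from a copy that agrees. This is not the actual LOCKSS polling protocol, which is considerably more involved; the function name and the in-memory dictionary of replicas are assumptions for illustration.

```python
import hashlib
from collections import Counter


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Overwrite replicas that disagree with the majority; return their cache names.

    `replicas` maps a hypothetical cache name to the bytes that cache holds.
    """
    if not replicas:
        return []
    digests = {cache: hashlib.sha256(data).hexdigest()
               for cache, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        # Without a strict majority there is no trustworthy copy to repair from.
        raise ValueError("no majority agreement among replicas")
    good_copy = next(data for cache, data in replicas.items()
                     if digests[cache] == winner)
    repaired = [cache for cache in replicas if digests[cache] != winner]
    for cache in repaired:
        replicas[cache] = good_copy
    return repaired
```

With six replicas, as on the earlier slide, up to two damaged copies can be outvoted and repaired in this simplified model.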

Slide 22: Private LOCKSS network (PLN)
PLN is a LOCKSS network deployed by a set of like-minded institutions in order to preserve content in a closed preservation network.
Not maintained by the Stanford University-based LOCKSS staff

Slide 23: Manifest page

Slide 24: Archival unit
An independent collection of content in a LOCKSS cache. Archival units are maintained as a whole by LOCKSS daemons. They are defined by the plugin and plugin parameters.

Slide 25: Digital object and its metadata

Slide 26: Metadata XML file

Slide 27: ETD (electronic thesis/dissertation)

Slide 28: Plug-in
An XML file that instructs the LOCKSS software how to ingest and preserve content. Each cache on the network writes a plug-in for its collection, enabling other caches to replicate its content.

Slide 29: Security
Copies on different power grids
No one person has access to all copies
Each cache secure and used for DDP only
Security-enhanced Linux
SSL-encrypted inter-cache communication
IP-address-based firewall exceptions
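The IP-address-based exceptions amount to an allowlist: a cache accepts inter-cache traffic only from the known addresses of its peers. The sketch below shows that check with Python's standard `ipaddress` module. The peer addresses are hypothetical values from the reserved documentation ranges, and a real deployment would enforce the rule in the firewall itself rather than in application code.

```python
import ipaddress

# Hypothetical peer caches (documentation-only address ranges, not real hosts).
PEER_CACHES = ["192.0.2.10", "198.51.100.0/28", "203.0.113.7"]


def build_allowlist(entries: list[str]) -> list:
    """Parse single addresses and CIDR blocks into network objects."""
    return [ipaddress.ip_network(entry) for entry in entries]


def is_allowed(source_ip: str, allowlist: list) -> bool:
    """True if the source address falls inside any allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowlist)
```

A connection from any address outside the list would be refused, which keeps each cache closed to everything except its preservation partners.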

Slide 30: For more details…

Slide 31: MA regional library systems
Massachusetts Networks:
CLAMS*        MBLN
SAILS*        NOBLE*
C/W MARS*     MVLC
Minuteman*    OCLN