MetaArchive of Southern Digital Culture: A Practical, Working and Replicable Approach to Preservation Martin Halbert, Emory University Gail McMillan, Virginia.


MetaArchive of Southern Digital Culture: A Practical, Working and Replicable Approach to Preservation Martin Halbert, Emory University Gail McMillan, Virginia Tech Tyler Walters, Georgia Tech Aaron Trehub, Auburn University

Introduction to the MetaArchive of Southern Digital Culture Martin Halbert Director for Library Systems Emory University CNI Fall 2006 Task Force Meeting Washington, D.C. December 4-5, 2006

The Problem
Preservation of digital content is an enormous problem, too big for individual institutions to solve in isolation.
We realized that we needed a way to work together cooperatively on this challenge, but there was a dearth of effective models for how to do this.

MetaArchive Project Summary
Six partner institutions collaborating with LoC to develop a Cooperative Network for the preservation of digital content in targeted cultural heritage subject domains (one of the eight NDIIPP partnership projects).

Project Goals:
1. Conspectus of digital content held by the partner sites
2. Harvested body of the most critical content to be preserved
3. Model cooperative agreement for ongoing collaboration
4. Distributed preservation network infrastructure for replication based on the LOCKSS software

MetaArchive Network

Key Features of a Secure MetaArchive
A collaboratively maintained archive of archives (a meta-archive) is a new concept to be modeled.
Our group advocated seven principal attributes of such a system in the project plan preamble.
These key features were identified in the planning process that led to the project.

#1: Distributed Preservation
Effective preservation succeeds by distributing copies of content in secure, distributed locations over time.
This preservation network is based on a leading preservation software package for distributed digital replication (LOCKSS), establishing from the beginning a distributed means of replicated archives.
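
The value of keeping replicated copies at several sites can be illustrated with a minimal sketch. It is not the LOCKSS polling protocol; it only assumes that each node can hash its own copy and that the copy held by the majority of nodes is treated as authoritative. The node names and the function `find_damaged_copies` are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one copy of the content."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return the majority digest and the nodes whose copy disagrees with it.

    Assumption for illustration only: simple majority agreement decides
    which copy is good. Real preservation networks use richer protocols.
    """
    digests = {node: digest(data) for node, data in copies.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(node for node, d in digests.items() if d != majority_digest)
    return majority_digest, damaged


# Three hypothetical nodes; one holds a truncated copy.
copies = {
    "node-a": b"yearbook page 1",
    "node-b": b"yearbook page 1",
    "node-c": b"yearbook pa",
}
majority, damaged = find_damaged_copies(copies)
print(damaged)
```

A node flagged this way could then be repaired from any node that holds the majority copy, which is why more independent replicas make silent damage easier to detect.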

#2: Flexible Organizational Model
The project has developed a relatively simple and flexible cooperative agreement as a model for other institutions seeking to cooperate for purposes of digital preservation.
The agreement entails minimal overhead, enlists straightforward mechanisms for collaboration, and is widely applicable to many sorts of institutions.
Fundamentally, it is a commitment of institutions to preserve each other’s content.

#3: Content Selection
A shared interest in preserving targeted types of content is what brings groups together to collaborate.
The subject domain conspectus for digital content was guided by a group of content experts from the partner libraries.
The team evaluated content at the partner sites in terms of its importance for cultural heritage, and preservation considerations (including formats and planning for subsequent migration).

#4: Migrating Archives
Factors laying the basic groundwork for subsequent migration efforts:
  • Metadata concerning the archived content must be carefully maintained
  • Selecting migratable formats and data structures
  • Open source software, so that the software itself can be preserved and evolved

#5: Relatively Dark Archiving (to start with)
Our philosophy decouples the needs of long-term preservation from those of presentation, access, and high availability.
Our initial scope is long-term preservation, avoiding the expense and effort associated with the other needs.
Information about MetaArchive contents is available publicly through our conspectus database.

#6: Relatively Low Cost
This approach to digital preservation is intentionally designed to require minimal expenditures by collaborating groups of medium sized institutions.
Building on the LOCKSS approach of low-cost, low barriers to adoption, our preservation network is a model that can be easily implemented by many ad hoc groups of collaborating institutions.

#7: Self-Sustaining Incentives
Our cooperative addresses sustainability issues in two ways:
  • Provides participating institutions with a capability that they fundamentally lack individually
  • Plan to offer pricing models for additional institutions to participate without requiring technical infrastructure

Simple Preservation Exchange Mechanisms
The distributed and automated approach of this project to digital preservation simplifies the mechanism for sharing the resulting archive of digital content with the Library of Congress and other preservation networks.
Validated, ongoing replication of the exact content of the MetaArchive at the Library of Congress is ensured by the design of the system.

Project Successes
Feb 2005: Conspectus completed
May 2005: Network in operation
Aug 2005: Initial archiving completed (ongoing)
Feb 2006: Cooperative model analysis completed
Aug 2006: Cooperative Charter drafted
Oct 2006: Nonprofit host organization formed
2007: Collaborative workshops for others interested in LOCKSS VPNs, extension of project

The MetaArchive’s Collections: Decisions and Descriptions Gail McMillan Director, Digital Library and Archives Virginia Polytechnic Institute & State University CNI Fall 2006 Task Force Meeting Washington, D.C. December 4-5, 2006

Planning the MetaArchive Conspectus
Scope
Standards
  • Schema
  • Controlled vocabulary
Database and Conspectus
  • Inventory of Collections
  • Formats
Prioritizing
  • At risk
  • Data wrangling
Adapting LOCKSS
Rights Issues

Scoping the MetaArchive’s Content
Southern digital cultural heritage
  • Broad topics
      Not just the Civil War
      Slave Narratives
      Civil Rights Movement
      Business, industry, and technological development
      Music
      Crafts
      Church histories
      Encyclopedia of Southern Culture
  • Local decisions

MetaArchive Conspectus Database
Describes the collections to be preserved
Provides information for
  • Storage estimates
  • Format migration
  • Accrual rules
  • Location
  • Ownership
  • LOCKSS specific elements
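
A collection-level record covering the kinds of information listed on this slide might look like the following sketch. The field names, the sample values, and the helper `collections_with_format` are assumptions made for illustration; they are not taken from the MetaArchive metadata specification.

```python
from dataclasses import dataclass, field


@dataclass
class ConspectusRecord:
    """Hypothetical collection-level record; all field names are illustrative."""
    title: str
    owner: str                      # ownership
    location: str                   # where the collection is served from
    formats: list[str]              # file formats, relevant to format migration
    storage_estimate_mb: float      # storage estimate
    accrual_rule: str               # how the collection is expected to grow
    lockss_elements: dict[str, str] = field(default_factory=dict)


def collections_with_format(records: list[ConspectusRecord], fmt: str) -> list[str]:
    """List the titles of collections that contain a given format."""
    return [r.title for r in records if fmt in r.formats]


# Invented sample records, used only to exercise the helper.
records = [
    ConspectusRecord("Yearbooks", "Library A", "server-1", ["TIFF"], 7900.0, "annual"),
    ConspectusRecord("Theses", "Library B", "server-2", ["PDF"], 101.0, "each term"),
    ConspectusRecord("Photos", "Library B", "server-2", ["TIFF", "JPEG"], 250.0, "closed"),
]
tiff_titles = collections_with_format(records, "TIFF")
print(tiff_titles)
```

A query of this kind is one way a conspectus could support migration planning: when a format becomes a concern, the affected collections can be listed directly.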

Genesis of the MetaArchive’s Metadata Specifications
MetaArchive Metadata Specification
  • Dublin Core Elements & Refinements
  • Dublin Core Collection Level Description
  • RSLP Collection Level Description
  • MODS Physical Description
  • MetaArchive Specific Elements

MetaArchive Collection-Level Conspectus Metadata Specification

MetaArchive Conspectus Database
Auburn: 4 collections / 7.9 GB
  • Extensions pubs, yearbooks (+TIFFs)
Emory: 10 collections / 23 GB
  • Born digital (Southern Spaces), image masters
FSU: 3 collections / 101 MB
  • Juvenile lit, historic photos, 2004 theses
Georgia Tech: 12 collections / 809 MB
  • Digitized special collections, SMARTech, ETDs
Louisville: 3 collections / 17 GB
  • Oral histories, image masters
VT: 50 collections / 1.9 GB
  • Bio DB, online exhibits, faculty archives, digital Spec Coll
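
The per-partner figures on this slide can be totalled with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 1000 MB), which the slide does not specify, so the total is approximate.

```python
# Collection counts and sizes as listed on the slide, converted to MB.
# Assumption: 1 GB = 1000 MB (the slide does not say which convention it uses).
MB_PER_GB = 1000

holdings = {
    "Auburn": (4, 7.9 * MB_PER_GB),
    "Emory": (10, 23 * MB_PER_GB),
    "FSU": (3, 101),
    "Georgia Tech": (12, 809),
    "Louisville": (3, 17 * MB_PER_GB),
    "VT": (50, 1.9 * MB_PER_GB),
}

total_collections = sum(count for count, _size in holdings.values())
total_gb = sum(size for _count, size in holdings.values()) / MB_PER_GB

print(f"{total_collections} collections, about {total_gb:.2f} GB per full copy")
```

Under that assumption the six partners describe 82 collections and roughly 50.7 GB per full copy of the content.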

Risk Analysis
Auburn: Glomerata TIFFs
  • Risk Rank: 3
  • Risk Factors: …stored on a single server at Auburn, with backup copies on multiple DVDs.
Emory: Nunn’s s
  • Risk Rank: 2
  • Risk Factors: … part of the Electronic Data Center's collections… The master file (restricted) needs to be maintained along with the access file (open for research).
FSU: campus photos
  • Risk Rank: 2
  • Risk Factors: Images are in a transitional location... They are at risk of degradation or loss because there are no systematic, periodic integrity verification processes at this time.

Risk Analysis
Georgia Tech: ETDs
  • Risk Rank: 5
  • Risk Factors: Born digital material
Louisville: interviews
  • Risk Rank: 3
  • Risk Factors: Masters exist only on CD. Analog originals are on audiocassette (i.e., also at risk).
VT: digital image database
  • Risk Rank: 5
  • Risk Factors: This is THE source of digital masters for all official scanning done for teaching, research, and historical preservation.

Adapting LOCKSS: Digital Collections are not like EJournals
LOCKSS ejournal model
  • Expects stable Archival Units
  • Completed volumes do not change
Digitizing Special Collections
  • Changing Archival Units
  • Not yet scanning entire collections
ETDs don’t even fit the ejournal pattern
  • Annual academic cycles
  • Born-digital: Completion date vs. Accessible date
  • Scanned theses/dissertations: as they circulate

Rights Issues for the MetaArchive
1. Fit “fair-use” doctrine or other provisions relating specifically to library copying and other activities
2. Determine whether the work still enjoys protection or has lapsed into the public domain
3. Occur as a result of valid permission from the copyright owner(s)

Rights Issues for the MetaArchive
4. Constitute an acceptable risk for the institution in the potential absence of “clear” resolution
  • Group or individual definitions?
  • Some more risk averse than others?
  • Who will decide?
  • Dark archive: less risk of infringement?
  • In the spirit of Sect. 108, US Copyright Law

The MetaArchive’s Current Working Environment Tyler Walters Associate Director for Technology and Resource Services Georgia Institute of Technology CNI Fall 2006 Task Force Meeting Washington, D.C. December 4-5, 2006

Current Developments
Testing the Network: Disaster Recovery
We will:
  • Focus on three components: Hardware, Content (LOCKSS), Network
  • Simulate crashing primary node
  • Intentionally damage content (truncate files)
  • Disable access to plug-ins
  • Run routine tests for “bad disk,” cache manager, conspectus database, yum repository, kickstart script, XML config file, etc.
Then:
  • Reconstruct primary node, resurrect network, reconstruct content
  • Create documentation on these three items
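
The "intentionally damage content (truncate files)" step can be rehearsed on a small scale. The sketch below is not the project's test procedure; it assumes a checksum manifest recorded before the drill, and the function names and file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large masters need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose checksum no longer matches."""
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            damaged.append(rel)
    return damaged


def run_drill() -> list[str]:
    """Create two sample files, truncate one, and report what the check finds."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.tif").write_bytes(b"A" * 1024)
        (root / "b.tif").write_bytes(b"B" * 1024)
        manifest = build_manifest(root)
        with (root / "b.tif").open("r+b") as f:
            f.truncate(100)  # simulate the intentional damage
        return find_damaged(root, manifest)


drill_result = run_drill()
print(drill_result)
```

In a real drill, the files reported as damaged would then be restored from another node and re-verified against the same manifest.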

New Services / New Members
Contributing Partners: cultural memory institutions that possess digital content to preserve via the MetaArchive Preservation Network. They contribute fees for this service; they do not operate a node.
Fee Structure / “Pricing”
MetaArchive Cooperative services to members
  • Digital preservation (network dev./maintenance, content ingest/retrieval)
  • Format migration
  • Digital collection disaster recovery
  • Digital preservation network consulting / training
  • LOCKSS services
Adding New Members
  • Issues / practical steps, MoU, technology adherence, funds, collection profile, etc.

MetaArchive and Educopia Institute
Non-profit management entity. Three issues:
1) Continuing need for financial resources
2) Expose MetaArchive to new digital projects to inform development
3) Economically efficient, catalytic structure to bring these about
Educopia Institute:
  • Provide oversight to MA Cooperative and other digital projects
  • Low-cost, low overhead conduit for digital library, scholarly communications technology projects
  • Advance cyberinfrastructure needed to drive research, teaching and learning in contemporary digital era
  • NSF (2003) and ACLS (2006) Cyberinfrastructure reports:
      Scholarly activity (teaching, research, learning, knowledge transfer via scholarly communications) needs rational, strategic cyberinfrastructure
EI: Generate DL technology projects to support mission and goals

Drafting the MetaArchive Cooperative Charter Aaron Trehub Director of Library Technology Auburn University CNI Fall 2006 Task Force Meeting Washington, D.C. December 4-5, 2006

Episodes
1: The Assignment
2: Getting Started
3: Models?
4: First Pass: Fall-Winter 2005
5: Second Pass: Spring-Summer 2006
6: In Which Things Get Complicated
7: Enter the Educopia Institute
8: The End Product: Fall-Winter 2006
Questions

1: The Assignment
Cooperative agreement = one of the project’s deliverables
Mission: to draft, from scratch, an agreement that will govern and sustain the cooperative’s ongoing activities, and serve as a model for other projects
Principles:
  • Clarity
  • Simplicity
  • Flexibility
  • Openness
  • Sustainability

2: Getting Started
  • “Cooperative Agreement Analysis Plan” (CAAP) prepared in August 2005
  • Six-member working group formed
  • Editing tasks divvied up
  • 20 questions (what, who, how, how much, etc.)…
  • …organized into 5 categories: preparatory, cultural/philosophical, organizational, financial, technical

3: Models?
  • CIC OAI-PMH MoA
  • ExLibris Users’ Constitution
  • InCommon Federation Operating Practices and Procedures
  • TEI Bylaws and Prospectus
  • Selected language from the MetaArchive proposal

4: First Pass: Fall-Winter 2005
  • Rough draft of cooperative agreement and MoA posted to MetaArchive wiki in late September 2005
  • Four sections: Introduction; Partnership; Organization and Governance; Financial and Economic Sustainability, plus appendices
  • Editing via and in weekly conference calls, with input from the working group and the larger collaborative
  • Extensive contributions from Caroline Arms (LC), Gail McMillan (Virginia Tech), and Tyler Walters (Georgia Tech)
  • Revised drafts posted to wiki in October-November 2005

5: Second Pass: Spring-Summer 2006
  • Status check and on-the-spot editing at in-person meeting in February 2006
  • Further editing by project PI, Martin Halbert (Emory)
  • Agreement begins to take on more-formal character
  • Legal review: consultants at the University of Louisville and a private firm in Atlanta

6: In Which Things Get Complicated
Sticky Issues:
  • Indemnification and liability
  • Copyright
  • Exiting the cooperative
  • Non-compliance and breach of contract
  • Sanctions
  • Money
Cooperative Agreement → Cooperative Charter
One-page letter of agreement → Multipage Memorandum of Understanding (MoU)

7: Enter the Educopia Institute
Non-profit organization, created in October 2006 in Atlanta
Provides administrative services for the MetaArchive Cooperative, including:
  • Billing member organizations for annual dues
  • Maintaining and distributing such funds
  • Organizing and hosting annual meetings of MetaArchive members
  • Holding members accountable for completing agreed-upon tasks
  • Hosting workshop programs on digital preservation topics

8: The End Product: Fall-Winter 2006
The MetaArchive Cooperative Charter and MoU will address:
  • Membership
  • Duration and cost
  • Governance structure
  • Conditions of breach
  • Liability issues
Final editing by Katherine Skinner (Emory), working with lawyers at University of Louisville and private law firm in Atlanta
Finished versions available by early 2007

Draft cooperative charter and MoU
  • Draft charter available at Charter0906.pdf
  • Memorandum of Understanding still being drafted; expected in early 2007
  • Will require review by legal counsel

9: Example of effective partnership…
The drafting process involved various constituencies:
  • MetaArchive members
  • Colleagues at other institutions
  • Administrators at other institutions
  • Legal consultants
Successfully mixed virtual and in-person editing
Approximately one year from start to finish: not bad for a complex legal document

…Or tour of the sausage factory? “Laws are like sausages. It's better not to see them being made.” (Otto von Bismarck)

Answer: Well, yes
Pluses
  • Effective collaboration among different constituencies and institutions
  • Effective drafting process (eventually)
  • Difficult questions resolved or at least broached
Not-so-pluses and “known unknowns”
  • Departure from original conception
  • Additional complexity
  • Institutional buy-in remains to be seen

The Verdict
  • Fulfilled one of the project’s deliverables
  • A more-complex-than-anticipated instrument that nevertheless could be adopted (and perhaps whittled down) by other projects
  • So: a qualified success that needs now to be tested in practice

Thank you! Questions re the MetaArchive? Aaron Trehub (334) Tyler Walters (404) Gail McMillan (540) Martin Halbert (404)