Digital information lasts forever or five years, whichever comes first. —Jeff Rothenberg.


Digital information lasts forever, or five years, whichever comes first. —Jeff Rothenberg The average lifespan of info on the Web? 44 days. —Brewster Kahle, Internet Archive

The Digital Dark Age
New generations of software and platforms make old data unreadable
– “Migration” required every 10 years
– Old data grows in value
– Almost no one is working on the problem, because it lives in a slower time frame
RESULT: catastrophic extinction of data

Moore’s Wall
Moore’s Law

Time scatters everything.

Cuneiform writing circa 6th Century BCE

Hyakumanto Darani CE 1 million copies World’s oldest printed text

ROSETTA STONE Created BCE Found CE Deciphered CE

Library Of Congress

Digital Preservation: A Five-Layered Challenge
Curatorial challenge: How to decide what to preserve?
Technological challenge: How to deal with obsolescence of hardware and software?
Intellectual property challenge: How to provide access while protecting rights of creators?
Economic challenge: How to pay for it all?
Organizational / institutional challenge: How to enable creation of a distributed system that works without the federal government in control?

Five-Stage Design Process for Phase One
1. Environmental Scan: three convening sessions with about 50 leading experts on relevant issues
2. Scenario Development Workshop: 30 leading experts explored different solution spaces and how they might be affected by external forces
3. Draft design of High-Level Technical Architecture
4. Pilot Projects Workshop: convening of stakeholders to explore possible experiments
5. Draft NDIIPP Plan with Recommended Path Forward for Iterative Learning

Preservation Architecture Design Principles
The NDIIPP digital preservation architecture must:
– support relationships between institutions,
– allow questions of preservation to be handled separately from questions of access,
– be built modularly, using existing technology and efforts where possible,
– be able to be assembled over time, rather than needing to be built all at once,
– be upgradable in pieces, without disrupting the whole system, and
– be specified using broadly acceptable protocols
Source: “Plan for the National Digital Information Infrastructure and Preservation Program” (p. 52)

Congressional Approval for Next Phases Source: New York Times, February 13, 2003 (p. E7)

Next Steps for
1. Develop working prototype of the technical architecture
2. Continue to build the preservation network through additional workshops and other outreach
3. Recruit and select pilot projects, applying criteria agreed upon in plan
4. Create strong learning feedback loops between work of the technical team and pilot projects by creating a learning interface and infrastructure

Preliminary Architectural Proposal for Long-term Digital Preservation (via Clay Shirky) System design by Bob Spinrad

Argument to Congress
Digital data loss is huge and growing exponentially
The problem will not solve itself
Solving the problem will be very hard, but not impossible
It must be solved for national continuity

People Above
Institutions In Between
Bits Below

Design Principles
First, do no harm
Humility in the face of change
– Legal
– Economic
– Cultural
– Technological

Design Principles II
Modular approach
Minimal requirements at each level
Implementation-agnostic
The system should never be optimized
Survive the first migration
Piggyback on existing work
– OAIS/W3C/IETF/Digital Libraries

Caveats
30,000 Foot View
Iterative Process
Collaborative work

4 Layers Between People and Bits
Interfaces
Collections
Gateways
Repositories

Up

Repositories

Repository Characteristics
Don't know about patrons, Interfaces, or even Collections
Don't know about contents, only numbers
Intentionally “stupid”
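A minimal Python sketch of a repository in the sense of this slide, assuming a simple in-memory store. The class name `Repository` and its `write` and `read` methods are hypothetical and are not taken from the NDIIPP proposal. The only point illustrated is that the store deals in numbers and bits, with no notion of patrons, Interfaces, Collections, or content types.

```python
class Repository:
    """Hypothetical "stupid" bit store: numbers in, bits out."""

    def __init__(self) -> None:
        self._blobs: dict[int, bytes] = {}  # number -> stored bits
        self._next_id = 1

    def write(self, bits: bytes) -> int:
        """Store the bits and return the number they are filed under."""
        blob_id = self._next_id
        self._next_id += 1
        self._blobs[blob_id] = bytes(bits)
        return blob_id

    def read(self, blob_id: int) -> bytes:
        """Return the stored bits; raises KeyError for an unknown number."""
        return self._blobs[blob_id]


repo = Repository()
first = repo.write(b"these are some bits")
second = repo.write(b"\x00\x01\x02")
```

Whether the bits are a photo or a database dump is deliberately invisible at this level; that knowledge would live higher up the stack.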

Gateways

Gateway Characteristics
Go-betweens for Collections & Repositories
Bridge Collections & Repository Views
– "This is a photo" vs. "These are some bits"
Hold minimal metadata: IDs and Authorizations
Both Facilitator and Barrier
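The gateway role can be sketched in a few lines of Python, assuming a plain dictionary stands in for the repository. `Gateway`, `register`, `fetch`, and the credential strings are hypothetical names used only for illustration. The sketch holds nothing but IDs and authorizations, and it acts as facilitator or barrier depending on the credential presented.

```python
from typing import Optional


class Gateway:
    """Hypothetical go-between holding only IDs and authorizations."""

    def __init__(self, bit_store: dict[int, bytes]) -> None:
        self._bit_store = bit_store                 # stand-in for a Repository
        self._numbers: dict[str, int] = {}          # Collection item ID -> Repository number
        self._authorized: dict[str, set[str]] = {}  # item ID -> credentials allowed to read it

    def register(self, item_id: str, number: int, credentials: set[str]) -> None:
        """Record where an item's bits live and who may ask for them."""
        self._numbers[item_id] = number
        self._authorized[item_id] = set(credentials)

    def fetch(self, item_id: str, credential: str) -> Optional[bytes]:
        """Facilitator when authorized, barrier (None) otherwise."""
        if credential not in self._authorized.get(item_id, set()):
            return None
        return self._bit_store[self._numbers[item_id]]


bit_store = {7: b"these are some bits"}
gateway = Gateway(bit_store)
gateway.register("photo-001", 7, {"collection-key"})
granted = gateway.fetch("photo-001", "collection-key")
denied = gateway.fetch("photo-001", "guess")
```

The Collection speaks in item IDs such as `photo-001`, the store speaks in numbers, and the gateway is the only place where the two views meet.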

Collections

Collection Characteristics
Semantics, not storage
Where judgment & responsibility reside
Served by Gateways, not Repositories
Serve Interfaces, not patrons

Interfaces

Interface Characteristics
Link patrons with digital assets
Build on Collections
Minimally defined, to allow for innovation

Down

Connection Principles
Each layer is potentially opaque
Dark archives can be triply opaque
Each connection defined as Protocol
Minimally defined
– Easily audited
– Easily debugged
– Easily extended
– Easily replaced
Exit strategies
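One way to picture a minimally defined, easily replaced connection in code is a small structural interface. The sketch below uses Python's `typing.Protocol`, which is a language feature for interfaces rather than the communication protocol the slide most likely means. `BitStore`, `MemoryStore`, `DirectoryStore`, and `round_trip` are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BitStore(Protocol):
    """The whole connection: two operations, nothing else."""

    def write(self, number: int, bits: bytes) -> None: ...
    def read(self, number: int) -> bytes: ...


class MemoryStore:
    """One implementation: keeps bits in a dictionary."""

    def __init__(self) -> None:
        self._blobs: dict[int, bytes] = {}

    def write(self, number: int, bits: bytes) -> None:
        self._blobs[number] = bits

    def read(self, number: int) -> bytes:
        return self._blobs[number]


class DirectoryStore:
    """A replacement: keeps bits in files named by number."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def write(self, number: int, bits: bytes) -> None:
        (self._root / str(number)).write_bytes(bits)

    def read(self, number: int) -> bytes:
        return (self._root / str(number)).read_bytes()


def round_trip(store: BitStore, number: int, bits: bytes) -> bytes:
    """Caller code depends only on the interface, not on the element behind it."""
    store.write(number, bits)
    return store.read(number)


with tempfile.TemporaryDirectory() as tmp:
    results = [
        round_trip(MemoryStore(), 1, b"bits"),
        round_trip(DirectoryStore(Path(tmp)), 1, b"bits"),
    ]
```

Because `round_trip` never names a concrete class, swapping one store for the other needs no change on the calling side, which is the property the slide asks of every connection.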

Patron and Interface
Only negative definitions
– Interface must not violate the terms of any Collection it accesses
Conversations between Interfaces and patrons otherwise out of scope

Interface and Collection
Collection presents Interface with:
– What material Interface may use
– In what fashion
Interface presents Collection with:
– Requests for material
– Authorization credentials, if required
OAIS/DIP

Collection and Gateway
Collection presents Gateway with:
– Requests for operations on bits
– Authorization credentials, if required
Gateway presents Collection with:
– Results of its request, from 'Not Authorized' through 'Here are the bits'

Gateway to Repository
Gateway presents Repository with:
– Requests for operations on bits (read, write, return)
– Authorization credentials, if required
Repository presents Gateway with:
– Results of request, from 'Not Authorized' through 'Here are the bits'
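The request and result exchange between Gateway and Repository can be sketched as two small message types. Everything here is an assumption made for illustration: the names `Operation`, `Status`, `Request`, `Result`, and `handle`, the intermediate statuses `NOT_FOUND` and `STORED`, and the choice to always require a credential (the slide says only "if required"). Only read and write are modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Operation(Enum):
    READ = "read"
    WRITE = "write"


class Status(Enum):
    NOT_AUTHORIZED = "Not Authorized"
    NOT_FOUND = "Not Found"                  # assumed intermediate result
    STORED = "Stored"                        # assumed result of a write
    HERE_ARE_THE_BITS = "Here are the bits"


@dataclass(frozen=True)
class Request:
    operation: Operation
    number: int
    credential: Optional[str] = None
    bits: Optional[bytes] = None


@dataclass(frozen=True)
class Result:
    status: Status
    bits: Optional[bytes] = None


def handle(request: Request, blobs: dict[int, bytes], accepted: set[str]) -> Result:
    """Answer one request; this sketch always demands a known credential."""
    if request.credential not in accepted:
        return Result(Status.NOT_AUTHORIZED)
    if request.operation is Operation.WRITE:
        blobs[request.number] = request.bits or b""
        return Result(Status.STORED)
    if request.number not in blobs:
        return Result(Status.NOT_FOUND)
    return Result(Status.HERE_ARE_THE_BITS, blobs[request.number])


blobs: dict[int, bytes] = {}
accepted = {"gateway-key"}
stored = handle(Request(Operation.WRITE, 1, "gateway-key", b"\x01\x02"), blobs, accepted)
found = handle(Request(Operation.READ, 1, "gateway-key"), blobs, accepted)
refused = handle(Request(Operation.READ, 1), blobs, accepted)
missing = handle(Request(Operation.READ, 2, "gateway-key"), blobs, accepted)
```

Keeping the message types this small is one reading of "minimally defined": a result is a status plus, at most, some bits.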

Connection Overview
These matter more than any instantiation
Must survive changes of hardware
Must survive element design changes

Top to Bottom Comparisons
Interface is:     Repository is:
Volatile          Stable
Frequent use      Rare use
Read/Write        Read-Only

Ingest Principles
Intake is done by Collections
Easy in/Hard out
Follow OAIS SIP model

Ramifications for Ingest
Bit storage is cheap
Asset preservation is expensive
Security is potentially very expensive
Responsibility resides with Collections

Sideways

LoC Involvement
Left: Owned and operated by the LoC.
Right: The National Library of Latvia
Center: Certified partners in National Digital Preservation Infrastructure

Principles of Horizontality
Interface definitions fit on a floppy
– The manual will have to go on a CD
System could be built on one box
Many inter-layer diagonal connections could exist.
No lateral connections are required.

System as a Whole
So where's the preservation layer?
Preservation arises from institutions, not technology
System as a whole provides tools for stewardship

The Library Cares About Everything In Red

Use Cases: System Stories
Caddyshack III - The Iron Mountain
Janine's American Memory
The Weblogs that Changed the World

Certified Preservation
Fills three characteristics
– In a certified repository
– Referenced by a certified agent
– Listed in a certified collection
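The three characteristics read naturally as a single conjunction. A minimal sketch, assuming each characteristic can be recorded as a yes/no fact about an asset; `Asset` and `is_certified_preservation` are hypothetical names, and how certification itself is established is outside the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical record of the three facts named on the slide."""

    in_certified_repository: bool
    referenced_by_certified_agent: bool
    listed_in_certified_collection: bool


def is_certified_preservation(asset: Asset) -> bool:
    """All three characteristics must hold; any one missing fails the test."""
    return (
        asset.in_certified_repository
        and asset.referenced_by_certified_agent
        and asset.listed_in_certified_collection
    )


fully_certified = is_certified_preservation(Asset(True, True, True))
missing_collection = is_certified_preservation(Asset(True, True, False))
```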

LC Scenario Framework for the Future Evolution of Digital Preservation
What is Saved? Most Important / Everything
Who saves? Library of Congress / Everyone
Scenarios: Triage, Congress of Libraries, Universal Library
– LC takes the lead in collecting the most critical materials of our digital heritage
– LC facilitates the development of a “peer-to-peer” system of comprehensive preservation
– LC plays a clearinghouse role in coordinating preservation efforts that are executed mainly by other institutions

Open Architectural Issues
One too many layers for any given problem
– Right number of layers for all problems
Separating semantics from presentation will be very difficult
How to handle dynamic materials?
– Self-altering content
– Time-sensitive content
– Volatile databases

Open Organizational Issues
Questions from rights holders
– Legal
– Economic
Moral hazard as collector of last resort
Faith in the system is a critical lubricant

Conclusions
Preservation is a systemic process
What you sign up for is stewardship
Divorce between preservation and access
Elements can change quickly
Connection protocols must change slowly
Costs are mainly human and ongoing, not technological and upfront

Conclusions II
Profound mental shift in what a library is.
Architecture provides tools, not a full solution.
Hardware stores, institutions preserve.