Digital information lasts forever or five years, whichever comes first. —Jeff Rothenberg.

Presentation transcript:

1 Digital information lasts forever or five years, whichever comes first. —Jeff Rothenberg

2 Digital information lasts forever, or five years, whichever comes first. —Jeff Rothenberg The average lifespan of info on the Web? 44 days. —Brewster Kahle, Internet Archive

3 The Digital Dark Age New generations of software and platforms make old data unreadable –“Migration” required every 10 years –Old data grows in value –Almost no one is working on the problem, because it lives in a slower time frame RESULT: catastrophic extinction of data

4 Moore’s Wall Moore’s Law

5 ©

6 Time scatters everything.

7

8 Time scatters everyth

9 Time scatters everything.

10 Cuneiform writing circa 6th Century BCE

11 Hyakumanto Darani 00764 CE 1 million copies World’s oldest printed text

12 ROSETTA STONE Created 00196 BCE Found 01799 CE Deciphered 01822 CE

13

14

15

16 Library Of Congress

17

18 Digital Preservation: A Five-Layered Challenge Curatorial challenge: How to decide what to preserve? Technological challenge: How to deal with obsolescence of hardware and software? Intellectual property challenge: How to provide access while protecting rights of creators? Economic challenge: How to pay for it all? Organizational / institutional challenge: How to enable creation of a distributed system that works without the federal government in control?

19 Five-Stage Design Process for Phase One
1. Environmental Scan: three convening sessions with about 50 leading experts on relevant issues
2. Scenario Development Workshop: 30 leading experts explored different solution spaces and how they might be affected by external forces
3. Draft design of High-Level Technical Architecture
4. Pilot Projects Workshop: convening of stakeholders to explore possible experiments
5. Draft NDIIPP Plan with Recommended Path Forward for Iterative Learning

20 Preservation Architecture Design Principles The NDIIPP digital preservation architecture must: support relationships between institutions, allow questions of preservation to be handled separately from questions of access, be built modularly, using existing technology and efforts where possible, be able to be assembled over time, rather than needing to be built all at once, be upgradable in pieces, without disrupting the whole system, and be specified using broadly acceptable protocols Source: “Plan for the National Digital Information Infrastructure and Preservation Program” (p. 52)

21

22 Congressional Approval for Next Phases Source: New York Times, February 13, 2003 (p. E7)

23 Next Steps for 2003
1. Develop working prototype of the technical architecture
2. Continue to build the preservation network through additional workshops and other outreach
3. Recruit and select pilot projects, applying criteria agreed upon in plan
4. Create strong learning feedback loops between work of the technical team and pilot projects by creating a learning interface and infrastructure

24 Preliminary Architectural Proposal for Long-term Digital Preservation (via Clay Shirky) System design by Bob Spinrad

25 Argument to Congress Digital data loss is huge and growing exponentially The problem will not solve itself Solving the problem will be very hard-- but not impossible It must be solved for national continuity

26 People Above Institutions In Between Bits Below

27 Design Principles First, do no harm Humility in the face of change – Legal – Economic – Cultural – Technological

28 Design Principles II Modular approach Minimal requirements at each level Implementation-agnostic The system should never be optimized Survive the first migration Piggyback on existing work – OAIS/W3C/IETF/Digital Libraries

29 Caveats 30,000 Foot View Iterative Process Collaborative work

30 Interfaces Collections Gateways Repositories 4 Layers Between People and Bits

31 Up

32 Repositories

33 Repository Characteristics Don't know about patrons, Interfaces, or even Collections Don't know about contents, only numbers Intentionally “stupid”
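
One way to picture an intentionally "stupid" repository is as a store that hands back a number for every blob of bits and remembers nothing else. The sketch below is only an illustration under that reading: the class name `Repository` and its `write`/`read` methods are hypothetical and are not part of the NDIIPP proposal.

```python
import itertools


class Repository:
    """Hypothetical bit store: maps opaque numbers to bytes and nothing else."""

    def __init__(self):
        self._bits = {}                     # number -> raw bytes
        self._numbers = itertools.count(1)  # the numbers carry no meaning

    def write(self, bits: bytes) -> int:
        """Store the bits and return the number they can be fetched by."""
        number = next(self._numbers)
        self._bits[number] = bytes(bits)
        return number

    def read(self, number: int) -> bytes:
        """Return the stored bits; raises KeyError for an unknown number."""
        return self._bits[number]


repo = Repository()
photo_number = repo.write(b"\x89PNG...")  # the repository never learns this is a photo
```

Nothing in the class refers to patrons, Interfaces, Collections, or file formats, which is the property the slide describes.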

34 Gateways

35 Gateway Characteristics Go-betweens for Collections & Repositories Bridge Collections & Repository Views –"This is a photo" vs. "These are some bits" Hold minimal metadata -- IDs and Authorizations Both Facilitator and Barrier
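
The go-between role can be sketched as an object that keeps only two small tables: which Collection-facing ID maps to which repository number, and which credentials may ask for it. This is a minimal sketch under assumptions; `Gateway`, `register`, `fetch`, and the sample IDs and credentials are all hypothetical names chosen for illustration.

```python
class Gateway:
    """Hypothetical go-between that holds only IDs and authorizations."""

    def __init__(self, repository):
        self._repository = repository  # assumed: a mapping from number to bytes
        self._ids = {}                 # Collection-facing ID -> repository number
        self._authorized = {}          # Collection-facing ID -> allowed credentials

    def register(self, asset_id, number, credentials):
        """Record where an asset's bits live and who may request them."""
        self._ids[asset_id] = number
        self._authorized[asset_id] = set(credentials)

    def fetch(self, asset_id, credential):
        """Facilitator and barrier: bits for an authorized caller, None otherwise."""
        if credential not in self._authorized.get(asset_id, set()):
            return None
        return self._repository[self._ids[asset_id]]


repository = {17: b"\xff\xd8\xff"}                   # "these are some bits"
gateway = Gateway(repository)
gateway.register("photo-1", 17, ["collection-key"])  # the Collection knows "this is a photo"
```

The Gateway stores no description of the asset, so the Collection's view and the Repository's view stay separate.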

36 Collections

37 Collection Characteristics Semantics, not storage Where judgment & responsibility reside Served by Gateways, not Repositories Serve Interfaces, not patrons

38 Interfaces

39 Interface Characteristics Link patrons with digital assets Build on Collections Minimally defined, to allow for innovation

40 Down

41

42 Connection Principles Each layer is potentially opaque Dark archives can be triply opaque Each connection defined as Protocol Minimally defined – Easily audited – Easily debugged – Easily extended – Easily replaced Exit strategies

43 Patron and Interface Only negative definitions –Interface must not violate the terms of any Collection it accesses Conversations between Interfaces and patrons otherwise out of scope

44 Interface and Collection Collection presents Interface with: –What material Interface may use –In what fashion Interface presents Collection with: –Requests for material –Authorization credentials, if required OAIS/DIP
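
The exchange between an Interface and a Collection might be sketched as a table of terms (what material, in what fashion) plus an optional credential check. Everything named here is an assumption made for illustration: the `Collection` class, the `allows` method, and the "display"/"download" vocabulary do not come from the slides or from OAIS.

```python
class Collection:
    """Hypothetical Collection that publishes terms and answers requests."""

    def __init__(self, terms, required_credential=None):
        # terms: material name -> fashions of use the Collection permits
        self._terms = {m: frozenset(f) for m, f in terms.items()}
        self._required_credential = required_credential  # None: no credential needed

    def terms(self):
        """What material an Interface may use, and in what fashion."""
        return dict(self._terms)

    def allows(self, material, fashion, credential=None):
        """Answer an Interface's request for material."""
        if (self._required_credential is not None
                and credential != self._required_credential):
            return False
        return fashion in self._terms.get(material, frozenset())


memory = Collection({"map-042": {"display"}}, required_credential="reader-pass")
```

An Interface that checks `allows` before every use satisfies the negative definition on the previous slide: it never violates the terms of a Collection it accesses.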

45 Collection and Gateway Collection presents Gateway with: –Requests for operations on bits –Authorization credentials, if required Gateway presents Collection with: –Results of its request, from 'Not Authorized' through 'Here are the bits'

46 Gateway to Repository Gateway presents Repository with: –Requests for operations on bits: read, write, return –Authorization credentials, if required Repository presents Gateway with: –Results of request, from 'Not Authorized' through 'Here are the bits'
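
The request and result messages on this connection can be sketched as two small record types. This is a minimal sketch under stated assumptions: the type names, field names, and the `handle` function are hypothetical, and only the two results the slides name are modeled. Anything between 'Not Authorized' and 'Here are the bits' is left out because the slides do not spell it out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Operation(Enum):
    READ = "read"
    WRITE = "write"


class Result(Enum):
    NOT_AUTHORIZED = "Not Authorized"
    HERE_ARE_THE_BITS = "Here are the bits"


@dataclass(frozen=True)
class Request:
    operation: Operation
    number: int                       # the repository only knows numbers
    credential: Optional[str] = None  # sent only if required
    bits: Optional[bytes] = None      # payload for a write


@dataclass(frozen=True)
class Response:
    result: Result
    bits: Optional[bytes] = None


def handle(store: dict, allowed: Optional[set], request: Request) -> Response:
    """Hypothetical repository side; allowed=None means no credential is required."""
    if allowed is not None and request.credential not in allowed:
        return Response(Result.NOT_AUTHORIZED)
    if request.operation is Operation.WRITE:
        store[request.number] = request.bits
    # Reading an unknown number raises KeyError; that case is not modeled here.
    return Response(Result.HERE_ARE_THE_BITS, store[request.number])


store = {}
written = handle(store, {"gw-key"}, Request(Operation.WRITE, 1, "gw-key", b"abc"))
refused = handle(store, {"gw-key"}, Request(Operation.READ, 1, "bad-key"))
```

Keeping the messages this small is in the spirit of a minimally defined protocol: either side can be replaced as long as it still speaks `Request` and `Response`.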

47 Connection Overview These matter more than any instantiation Must survive changes of hardware Must survive element design changes

48 Top to Bottom Comparisons
Interface is: Volatile, Frequent use, Read/Write
Repository is: Stable, Rare use, Read-Only

49 Ingest Principles Intake is done by Collections Easy in/Hard out Follow OAIS SIP model

50 Ramifications for Ingest Bit storage is cheap Asset preservation is expensive Security is potentially very expensive Responsibility resides with Collections

51 Sideways

52

53 LoC Involvement Left: Owned and operated by the LoC. Right: The National Library of Latvia Center: Certified partners in National Digital Preservation Infrastructure

54 Principles of Horizontality Interface definitions fit on a floppy –The manual will have to go on a CD System could be built on one box Many inter-layer diagonal connections could exist. No lateral connections are required.

55 System as a Whole So where's the preservation layer? Preservation arises from institutions, not technology System as a whole provides tools for stewardship

56 The Library Cares About Everything In Red

57 Use Cases: System Stories Caddyshack III - The Iron Mountain Janine's American Memory The Weblogs that Changed the World

58 Certified Preservation Has three characteristics – In a certified repository – Referenced by a certified agent – Listed in a certified collection
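
The three characteristics read naturally as a conjunction: an asset counts as certified only if all three hold. The sketch below assumes certification is tracked as plain sets of names; the `Asset` record, its fields, and the sample names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    repository: str  # where the bits are stored
    agent: str       # what references the bits
    collection: str  # where the asset is listed


def is_certified_preservation(asset, repositories, agents, collections):
    """True only when all three characteristics hold at once."""
    return (asset.repository in repositories
            and asset.agent in agents
            and asset.collection in collections)


certified_repositories = {"repo-a"}
certified_agents = {"agent-a"}
certified_collections = {"collection-a"}

kept = Asset("repo-a", "agent-a", "collection-a")
stray = Asset("repo-a", "agent-a", "someone-elses-list")
```

An asset stored in a certified repository but listed only in an uncertified collection, like `stray` above, fails the check.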

59 LC Scenario Framework for the Future Evolution of Digital Preservation [Diagram: axes "What is saved?" (Most Important to Everything) and "Who saves?" (Library of Congress to Everyone); scenarios: Triage, Congress of Libraries, Universal Library] –LC takes the lead in collecting the most critical materials of our digital heritage –LC facilitates the development of a “peer-to-peer” system of comprehensive preservation –LC plays a clearinghouse role in coordinating preservation efforts that are executed mainly by other institutions

60 Open Architectural Issues One too many layers for any given problem – Right number of layers for all problems Separating semantics from presentation will be very difficult How to handle dynamic materials? – Self-altering content – Time-sensitive content – Volatile databases

61 Open Organizational Issues Questions from rights holders –Legal –Economic Moral hazard as collector of last resort Faith in the system is a critical lubricant

62 Conclusions Preservation is a systemic process What you sign up for is stewardship Divorce between preservation and access Elements can change quickly Connection protocols must change slowly Costs are mainly human and ongoing, not technological and upfront

63 Conclusions II Profound mental shift in what a library is. Architecture provides tools, not a full solution. Hardware stores, institutions preserve.

64 www.longnow.org www.rosettaproject.org www.digitalpreservation.gov/ndiipp

