Besser--TextOneZero 5/22/01 1 The New Information Environments: Helping content persist over time Howard Besser UCLA School of Education & Information.

1 The New Information Environments: Helping content persist over time
Howard Besser, UCLA School of Education & Information
http://www.gseis.ucla.edu/~howard

2 The New Information Environments: Helping content persist over time
- What the Movie Industry is learning
- Major Issues Facing Digital Projects
- The Short Life of Digital Info
- Who's working on these problems?
- Important Planning Considerations

3 What the Movie Industry is learning
- Repurposing is a key part of future business models
- The products they sell will be an integral part of a larger infrastructure and a larger set of informational products

4 What this implies
- Must save digital content over very long periods of time (much longer than backlists)
- Digital content must be designed to interoperate with other digital content coming from other publishers/vendors (the age of the stand-alone book is gone)
- Publishers need to seriously worry about
  - Longevity Issues
  - Standards

5 Major Issues Facing Digital Projects
- Changes in Intellectual Property Law
- Intellectual Access
- Storage
- Delivery
- Integration with other tools
- Interoperability

6 Serious Longevity Problems
- What we know from prior widespread digital file formats
- Images separating from their metadata
- Inaccessibility of software needed to view a complex work
- Inability to even decode the file format of a work

7 The Short Life of Digital Info: Digital Longevity Problems
- Disappearing Information
- The Viewing Problem
- The Scrambling Problem
- The Inter-relation Problem
- The Custodial Problem
- The Translation Problem

8 The Viewing Problem
- Digital Info requires a whole infrastructure to view it
- Each piece of that infrastructure is changing at an incredibly rapid rate
- How can we ever hope to deal with all the permutations and combinations?

9 The Scrambling Problem
Dangers from:
- Compression to ease storage & delivery
- Container Architecture to enhance digital commerce

10 The Inter-relation Problem
- Info is increasingly inter-related to other info
- How do we make our own Info persist when it points to and integrates with Info owned by others?
- What is the boundary of a set of information (or even of a digital object)?

11 The Custodial Problem
- In the past, much of survival was due to redundancy
- How do we decide what to save?
- Who should save it?
- Mellon-funded E-Journal Archives
- How should they save it?

12 The Custodial Problem: How to save information?
- Methods for later access
  - Refreshing
  - Migration
  - Emulation
- Issues of authenticity and evidence
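
As a rough illustration of refreshing (copying the same bits onto new storage without changing their format), the sketch below copies a file and checks that the copy is bit-identical by comparing SHA-256 digests. The function names `sha256_of` and `refresh` are hypothetical and not from the talk, and a real workflow would also record the digest alongside the file.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target unchanged and confirm the bits match.

    Returns the shared digest; raises IOError if the copy differs.
    (Hypothetical helper for illustration only.)
    """
    before = sha256_of(source)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"refresh of {source} failed its fixity check")
    return before
```

Keeping the returned digest with the object gives a later custodian a way to test the evidential value of a copy. Migration and emulation change the format or the environment, so a simple digest comparison like this does not cover them.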

13 The Translation Problem
- Content translated into new delivery devices changes meaning
  - A photo vs. a painting
  - If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  - Behaviors

14 Still another problem: Layers of rights
- e.g., recent electronic versions of art books have been released with most of the art missing!

15 Pieces of the Solution (1/2)
- We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
- We need to standardize on fewer file formats
- We should discourage scrambling
- We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
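
One existing, partial form of format self-identification is the signature ("magic number") that many formats place in their first bytes. The sketch below, which is an illustration and not something proposed in the talk, checks a header against a few well-known signatures; the table is deliberately tiny and the name `identify_format` is hypothetical.

```python
from typing import Optional

# Well-known leading byte signatures ("magic numbers") for a few formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None  # the object does not identify itself in a way we recognize
```

A signature only names the container format. It says nothing about compression schemes or required software versions, which is part of why the slides ask for clearer, standardized self-identification.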

16 Pieces of the Solution (2/2)
- People and organizations wishing to make information persist need guidelines of how to go about doing it
- We need to better understand how translating from one storage or display format to another affects the meaning of a work
- We need to save the “behaviors” of a digital object, not just its “contents”
- Supporting strong Copyright legislation can come back to bite us

17 Conceptual Approaches to Digital Preservation
- Refreshing always necessary due to volatility of physical strata
  - Impact on evidential value
- Migration -- advantages & disadvantages
- Emulation -- advantages & disadvantages

18 To deal with Immediately
- Persistent IDs
- Metadata

19 Persistent IDs--the Problem
- Need to separate work ID from work location
- URNs probably won’t be ready until 2003
- Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)

20 More Persistent IDs--the Approach for today
- PURLs
- Handles
- HTTP redirects
- And worry about costs now and conversion costs when URNs become feasible
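
The approaches listed above all rest on indirection: a stable ID is looked up in a table that holds the current location. The sketch below models that lookup as an HTTP-style redirect. It is a minimal illustration only; the ID, the URLs, and the names `resolve` and `relocate` are all hypothetical, and it does not reproduce how any actual PURL or Handle server works.

```python
from typing import Dict, Tuple

# Hypothetical resolver table: persistent ID -> current location.
RESOLVER_TABLE: Dict[str, str] = {
    "example/report-1": "http://server-a.example.org/docs/report1.html",
}


def resolve(persistent_id: str, table: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return an HTTP-style (status, headers) pair for a persistent ID."""
    location = table.get(persistent_id)
    if location is None:
        return 404, {}
    # 302 keeps clients coming back to the resolver rather than caching the target.
    return 302, {"Location": location}


def relocate(persistent_id: str, new_location: str, table: Dict[str, str]) -> None:
    """Record that the work moved; the ID that others cite stays the same."""
    table[persistent_id] = new_location
```

When the resource moves, only the table entry changes. Citations that use the ID keep working, although someone still has to pay for maintaining the table.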

21 Data Set Management: More issues with referencing IDs
- References for mirror sites
- References for back-up sites when main site is down or bottle-necked
- References for off-site copies and archival copies
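
One way to picture these referencing issues is an ID that maps to an ordered list of copies, not to a single location. The sketch below is an assumption-laden illustration: `first_available` and the `is_up` callback are hypothetical names, and how availability is actually tested is left to the caller.

```python
from typing import Callable, Iterable, Optional


def first_available(
    locations: Iterable[str], is_up: Callable[[str], bool]
) -> Optional[str]:
    """Return the first location that the caller-supplied check reports as usable.

    Locations are tried in order, e.g. main site, mirror, back-up, archival copy.
    """
    for location in locations:
        if is_up(location):
            return location
    return None
```

For example, with the hypothetical list `["http://main.example.org/x", "http://mirror.example.org/x"]` and a check that reports the main site as down, the function returns the mirror.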

22 Metadata can be the first line of defense
- Can tell you
  - where the file is (if you can’t find the file)
  - where more info about the file is (if you have the file but most other metadata has become separated)
  - what the file format is
  - what the compression scheme is
  - what application program and version is needed for the file
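
The five items above can be read as the fields of a small record kept with each file. The sketch below stores them in a Python dataclass and round-trips them through XML. The element names and the `PreservationRecord` class are hypothetical choices for illustration and do not follow any particular metadata standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class PreservationRecord:
    """Hypothetical record with one field per item on the slide."""
    location: str      # where the file is
    more_info: str     # where more info about the file is
    file_format: str   # what the file format is
    compression: str   # what the compression scheme is
    application: str   # application program and version needed


def to_xml(record: PreservationRecord) -> str:
    """Serialize the record as a flat XML element, one child per field."""
    root = ET.Element("record")
    for field_name, value in asdict(record).items():
        ET.SubElement(root, field_name).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> PreservationRecord:
    """Rebuild the record from XML produced by to_xml."""
    root = ET.fromstring(text)
    return PreservationRecord(**{child.tag: child.text or "" for child in root})
```

Storing such a record both inside the file (where the format allows it) and beside it gives two chances of recovery if one copy becomes separated from the content.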

23 Metadata Encoding
- XML Mark-up
- Structural & Administrative Metadata -- http://sunsite.berkeley.edu/moa2
- File Name management

24 Groups Working on the Big Problem
http://sunsite.berkeley.edu/Longevity/
- CPA Task Force
- Getty “Time & Bits” Conference & Follow-ups
- Emulation experiments in US and Europe
- NEDLIB, CURL, Michigan
- Mellon Journal Archiving experiments
- Internet Archive
- Long Now

25 Time & Bits

26 Time & Bits Participants
- Stewart Brand
- Howard Besser
- Brian Eno
- Danny Hillis
- Peter Lyman
- Brewster Kahle
- Kevin Kelly
- Jaron Lanier
- Doug Carlston
- John Heilemann
- Ben Davis
- Margaret MacLean
- Bruce Sterling
- Paul Saffo

27 Groups Working on Pieces of the Big Problem
http://sunsite.berkeley.edu/Longevity/
- Internet Archive
- Long Now
- Emulation experiments in US and Europe
- NEDLIB, CURL, Michigan
- Mellon Journal Archiving experiments

28 Important Planning Considerations
- File Formats
- Choosing Interoperable Systems
- Adhere to standards
- Vendors with large installed base
- Refreshing and/or Migration

29 Key Considerations for Imaging Projects
- Users' Needs
- Image Quality
- Intellectual Property
- Standards
- Topology
- Tools & Processes

30 Key Considerations for Imaging Projects (1 of 3)
- Users' Needs
  - Quality of Digital Surrogate
  - Interoperable desktop applications
- Image Quality
  - Archival
  - Current online delivery

31 Some nuts-and-bolts Planning Considerations
- Think about users (and potential users), uses, and type of material/collection
- Scan at the highest quality that does not exceed the likely potential users/uses/material
- Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
- Many documents which appear to be bitonal actually are better represented with greyscale scans
- Include color bar and ruler in the scan
- Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct)
- Don’t use lossy compression
- Store in a common (standardized) file format
- Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)

32 The New Information Environments: Helping content persist over time
Howard Besser, UCLA School of Education & Information
- http://sunsite.berkeley.edu/Longevity/
- http://www.gseis.ucla.edu/~howard
- http://sunsite.berkeley.edu/moa2
- http://lockss.stanford.edu
- http://www.longnow.com/10klibrary/TimeBitsDisc/
- http://www.archive.org/

34 Architecture: Separating Longevity and Delivery Servers
[Diagram labels: Berkeley Longevity Server; Berkeley Delivery Server; Other Delivery Server (three instances); User]

35 Journal Archiving
- License, don’t own; may not be even able to obtain right to make archival copy
- Increasingly no paper back-up at all
- Usually we don’t have the important redundancy factor
- Stanford’s LOCKSS Project (Lots of Copies Keeps Stuff Safe) and its problems (http://lockss.stanford.edu)

