Designing Digital Libraries, Museums, Archives


Designing Digital Libraries, Museums, Archives Howard Besser NYU Archiving and Preservation Program and Library Senior Scientist http://www.tisch.nyu.edu/preservation http://www.gseis.ucla.edu/~howard

Designing Digital Libraries, Museums, Archives- Models for Digital Repositories Importance of Metadata Standards & Philosophies Introduction Discovery Metadata: The Dublin Core Administrative & Structural Metadata; Digital Object Standards (METS) Fitting METS in--Content Management Content Format Standards (Images) Longevity & Preservation Repositories Other Elements Actors Metadata Preserving Electronic Art...

Columbia Library

Using Online Catalog for…

Library Workstations for…

Models for Digital Repositories

From Digital Collections to Digital Libraries, Museums, and Archives No longer merely experiments Adhere to our fields’ traditions (access, interoperability, sustainability, privacy, …) Provide services

To respond to our needs for both Service & Traditions, we face the challenges of: Access (discovery) Sustainability (longevity)- Interoperability-

Serious Longevity Problems What we know from prior widespread digital file formats Images separating from their metadata Inaccessibility of software needed to view an image Inability to even decode the file format of an image

Traditional Digital Repository Model [Diagram: four digital libraries (DL), each with its own search & presentation layer, serving users]

Ideal Digital Repository Model [Diagram: four digital libraries (DL) behind a single shared search & presentation layer, serving users]

Importance of Metadata Standards & Philosophies

For Interoperability, Repositories Need Standards (as well as Sustainability & Access) Descriptive Metadata for consistent description Discovery Metadata for finding Administrative Metadata for viewing and maintaining Structural Metadata for navigation ... Terms & Conditions Metadata for controlling access...

Why are Standards and Metadata consensus important? Managing digital files over time Longevity Interoperability Veracity Recording in a consistent manner Will give vendors incentive to create applications that support this

Philosophical Metadata Decisions- Warwick vs MARC Where to put the metadata

Containers and Packages of Metadata Warwick, not MARC modular overlapping extensible community-based designed for a networked world to aid commonality btwn communities while still providing full functionality within each community

Some different schemes for where Metadata is kept: embedded within the object (TIFF headers) encapsulated with image (MOA2/METS) in a separate related DB maintained by same organization (OPAC) in a separate DB maintained by a separate organization (Books in Print, ratings systems)

Discovery Metadata Dublin Core - NISO Z39.85 (3/95)- CBIR (ongoing)

Dublin Core--further work Warwick Framework metadata packages for extensible functions laid groundwork for RDF Canberra Qualifiers refining the semantics of the element set to provide more precise info SUBELEMENT, SCHEME, LANG Granularity no hierarchical relationships w/i a given DC record; only one record per discrete object (collection or item-level), and relationship field plus qualifier links them

The Research Process and Functional Categories of Metadata Discovery Retrieval Collation Analysis Re-presentation

Metadata Mapping Crosswalks Resource Description Framework (RDF) Open Archives & metadata harvesting

Crosswalks mapping btwn differing metadata structures eliminate the need for monolithic, universally adopted standards focus on flexibility and interoperability RDF-based metadata registries
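
A crosswalk can be sketched as a simple lookup table between two element sets. The minimal Python sketch below assumes a handful of MARC-tag-to-Dublin-Core pairings chosen only for illustration; `MARC_TO_DC` and `crosswalk` are hypothetical names, and real crosswalks handle subfields, qualifiers, and many more elements.

```python
# Illustrative crosswalk: MARC tag -> Dublin Core element.
# The pairings are simplified assumptions, not an official mapping.
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "520": "description",
    "650": "subject",
}


def crosswalk(marc_record):
    """Map a {tag: [values]} MARC-like record to a {element: [values]} DC-like record."""
    dc_record = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # fields with no mapping are dropped (a lossy step)
        dc_record.setdefault(element, []).extend(values)
    return dc_record


record = {
    "245": ["Scrapbook of a California family"],
    "650": ["Scrapbooks", "Family history"],
    "999": ["local note"],
}
dc = crosswalk(record)
print(dc)
```

The unmapped local field ("999") is silently dropped here, which illustrates why crosswalks trade some precision for interoperability.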

Crosswalk Example

Resource Description Framework (RDF, spec released 2/99) W3C Metadata activity designed to move the Web beyond simple links to semantically-rich relationships btwn resources metadata application using XML as a common syntax for exchange and processing flexible architecture for managing diverse application-specific metadata packets that can be processed by machines associates resources, property types, and corresponding values http://www.w3.org/RDF/

RDF Resources (character strings, names, digital objects) Property (“is the author of”) Value resources+properties=relationships many different relationships can be reflected
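
The resource/property/value model above can be sketched as a list of triples. This is a minimal Python illustration, not an RDF library: the `example.org` identifiers and the helper names `values_of` and `resources_with` are assumptions made for the sketch.

```python
# Each statement is a (resource, property, value) triple.
# The example.org identifiers are placeholders.
triples = [
    ("http://example.org/doc1", "creator", "Howard Besser"),
    ("http://example.org/doc1", "title", "Designing Digital Libraries, Museums, Archives"),
    ("http://example.org/doc2", "creator", "Howard Besser"),
]


def values_of(resource, prop):
    """All values a resource has for a given property."""
    return [v for (r, p, v) in triples if r == resource and p == prop]


def resources_with(prop, value):
    """All resources that carry a given property/value pair."""
    return [r for (r, p, v) in triples if p == prop and v == value]


print(values_of("http://example.org/doc1", "creator"))
print(resources_with("creator", "Howard Besser"))
```

Because every statement has the same three-part shape, many different relationships can be stored and queried the same way.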

XML-encoded RDF <?xml:namespace ns="http://www.w3.org/RDF/RDF" prefix="RDF" ?> <?xml:namespace ns="http://purl.oclc.org/DC/" prefix="DC" ?> <RDF:RDF> <RDF:Description> <DC:Creator>Howard Besser</DC:Creator> </RDF:Description> </RDF:RDF>

Open Archives & metadata harvesting

Standardized Digital Objects METS Metadata Encoding & Transmission Standard (slides courtesy of Bernie Hurley, UCB Library Chief Scientist)

Structural & Administrative Metadata Not enough to merely capture still images (book example) Must capture Behaviors

What is a “Digital Object?” Combined Digital Content & Metadata Digital Content Digitized materials -- photographs, page images from a book, maps, digitized audio or video… Born Digital – GIS maps, digitally captured audio or video, numeric datasets (census files, scientific dataset), Web sites… Metadata Descriptive Administrative Structural Behavior

What is METS? An XML Schema that is used to Encode all the Content and Metadata for a Digital Object The relationships between content and metadata are also captured METS Object -- METS Document A METS Document can be A single file with all content & metadata A “hub document” that points to content and metadata A combination of the above

Uses of METS Transfer Syntax: standard for transmitting/exchanging digital objects; SIP (Open Archival Information System Reference Model). Functional Syntax: basis for providing end users with the ability to view and navigate digital content and its associated metadata; DIP. Archiving Syntax: standard for archiving digital objects; AIP.

Why Is METS Important? Interoperability Scalability Preservation Share objects between digital library systems Allow a DL to work with objects from other repositories Scalability Same software can be used to index, navigate and display different content types E.g., book, diary, scrapbook, music score, etc. Preservation Aids Migration Strategies

History of METS Originates in Making of America II Initiative Making of America II (MOA2) was a NEH funded Digital Library Federation initiative started in 1997. Participants included UC Berkeley (lead), Stanford, Penn State, Cornell, and NYPL. GOAL: to create a digital object standard for encoding structural, descriptive and administrative metadata along with primary content RESULT: MOA2.DTD (an XML DTD) Adopted by UC Libraries

History of METS (cont’d) Concerned Parties Meet at NYU in February, 2001 to Discuss Future of MOA2 Additional needs emerge Support for time-based content More flexibility in Descriptive and Administrative metadata Outcome MOA2 revised & renamed to METS Outcome: mets.xsd is endorsed by DLF METS Governance Structure Editorial Board, Jerry McDonough is Chair RLG coordinates editorial board activities Library of Congress is the Maintenance Agency for METS

A Partial List of Organizations that Plan to Use METS California Digital Library UC Berkeley Library of Congress (A/V project) Harvard NYU Stanford MIT MetaE (Metadata Engine Project: R&D project funded by the European Commission) British Library

Display of METS Objects

How Does METS Work? METS uses XML to 1) Identify the digital pieces (files) that together comprise a digital object Scrapbook: Digitized pages, photographs, newspaper clippings, digital audio, etc. 2) Specify the location of these pieces Are we pointing to these files? Are they embedded in the METS document? A combination of the above?

Express structural relationships between: [Think of the “structure” as a “Table of Contents”] Content files Links the proper content files to the TOC entry for the scrapbook’s cover, page1, page2, the photo on page20, the DVD on page 50, etc. Descriptive Metadata (DM) Links the proper DM entries to the TOC, so you can have separate DM entries for the scrapbook, photos, audio DVDs… Administrative Metadata (AM) Links AM entries to the TOC or to files (e.g., links rights MD to a photo, Tech. MD to a group of files) Behaviors Links the proper behaviors to TOC entries (e.g., links program to run the audio to the DVD TOC entry)

Anatomy of METS METS Header Descriptive Metadata (Optional) Admin. Metadata (Optional) File Inventory (Optional, but typical) Structural Map (Required) Behavior Metadata (Optional)

1. METS Header Records Administrative Metadata about the METS Document itself, such as Author/agent & agent role E.G., UC Berkeley Library as custodian Alternate identifiers for METS document Creation and update dates and times Status

2. Structural Map Section(s) Specifies the Structure of the Digital Object as a Hierarchy of Division (div) Elements Division (type=“scrapbook”) Division (type=“page”) Division (type=“photo”) Division (type=“digital audio file”) Division (type=“letter”) Division (type=“newspaper clipping”)

3. File Section Records all of the Files that Together Comprise the Content of the Digital Object Files may be internal or external to the METS document (or both) Files are organized into File Groups based on format (tiff, hi-res jpeg, med-res jpeg, gif, etc) Files are linked to the Structural Map
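
The link between the File Section and the Structural Map can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal illustration under stated assumptions: the file URL, the `ID` values, and the `USE`/`TYPE` values are invented for the example, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


mets = ET.Element(q("mets"))

# File Section: one file group holding one externally stored TIFF.
file_sec = ET.SubElement(mets, q("fileSec"))
tiff_grp = ET.SubElement(file_sec, q("fileGrp"), {"USE": "master"})
page_file = ET.SubElement(
    tiff_grp, q("file"), {"ID": "FILE001", "MIMETYPE": "image/tiff"}
)
ET.SubElement(
    page_file,
    q("FLocat"),
    {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": "http://example.org/scrapbook/page1.tif"},
)

# Structural Map: a hierarchy of div elements; fptr points back to the file.
struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "physical"})
book = ET.SubElement(struct_map, q("div"), {"TYPE": "scrapbook"})
page = ET.SubElement(book, q("div"), {"TYPE": "page", "ORDER": "1"})
ET.SubElement(page, q("fptr"), {"FILEID": "FILE001"})

xml_text = ET.tostring(mets, encoding="unicode")
print(xml_text)
```

The `fptr` element carries only an identifier, so the same structural map works whether the file is referenced by URL (as here) or embedded in the document.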

3. File Section (cont.) Scrapbook Example (a complex object) 100 Digitized pages with text entries Three images per page (GIF, JPEG, TIFF) Transcribed text for each page Photos and newspaper clippings attached to the pages Envelopes glued to the pages that hold Letters & cards DVDs

4. Descriptive Metadata Section(s) METS can Record all of the Units of Descriptive Metadata Pertaining to the Digital Object Multiple Descriptive Metadata Sections can Exist in a METS Document Descriptive Metadata could take any form E.g., a MARC or Dublin Core record, Finding Aid May be Internal or external to the METS document (or both)

5. Administrative Metadata Section(s) 4 Flavors of Admin. Metadata Per Section Technical metadata Source Metadata Rights Metadata Digital Provenance Metadata Admin. Metadata may be Internal or external to the METS document (or both) Linked to files or file groups, or the structural map

6. Behavior Section Behavior Sections Identify Software that can be used with the Digital Object, or its Parts E.g., Software to View the Complex Digital Object which is the Scrapbook; Software to listen to the DVD A Behavior Unit May Contain: A reference to an external interface definition that defines a set of related behaviors A reference to an external executable that implements these behaviors A reference to the Division or Divisions of the object structure to which the behaviors apply.

Some Characteristics of the New Information Environment Increased Quantity of Information With the Web, everyone can become a publisher Varying level of quality Digital Libraries Need to Work With New Classes of Information Web Pages, Museum Artifacts, GIS, Statistical Information, etc.

Characteristics of the New Information Environment (Cont.) Information is Decentralized Distributed repositories Information is in Proprietary Formats Everyone has their own method of creating a digital book, journal, manuscript, Etc. How Do We Cope????

Defining Digital Libraries in the NIE A Series of Collaborating Services & Systems that Allow for the Discovery, Display, Maintenance and Preservation of Complex Digital Objects The Traditional ILS Created to manage physical materials Almost all metadata is descriptive (e.g., MARC) Digital Libraries Created to manage complex digital objects New types of metadata (administrative, structural, etc.) New Services (content management, digital preservation)

Complex Digital Objects Scrapbook Example Digitized pages with text entries Photos and newspaper clippings attached to the pages Envelopes glued to the pages that hold Letters & cards DVDs The Scrapbook has Multiple material types (text, image, audio) Structure (e.g., like a table of contents) Internal Relationships The DVD on page 5 is linked to the file that is the DVD content and to its descriptive metadata

“A Series of Collaborating Services” Content Management Systems (CMS) Create & maintain complex digital objects Preservation Repositories Long-term retention of digital objects Access Systems & Integration Global Access Portals Subject Access Portals Material Type Portals

How Can These Systems Collaborate? Via “Standardized Digital Objects” A means to “wrap-up” a digital object and send it to another system or repository Same idea as MARC, but for entire digital objects E.g., A CMS sending a digital object to a Preservation Repository The METS Digital Object Standard Metadata Encoding and Transmission Standard

Illustrative Digital Library Services Diagram Global Access Portal Material Type Portal [books] Material Type Portal [images] Material Type Portal [fossils] METS Content Management METS Content Management Preservation Repository Preservation Repository METS

Content Management Systems

Content Management Systems Used to… Create and edit digital objects Import & export digital objects Manage objects (acquire, inventory, validate) Content Management Systems will Vary Depending on the Materials they Support Metadata schemes will vary Descriptive Metadata MARC/MODS/Dublin Core for Books Code books for numeric datasets Administrative Metadata Images, audio, text, etc.

Content Format Standards (Images)

Images- Content Format & Best Practices Identification/Provenance Technical Imaging metadata Special discovery & descriptive metadata

Best practices Use/Users/Collection: Benchmarking Masters vs. Derivatives Scanning- Administrative Metadata- Structural Metadata-

Scanning Best Practices Think about users (and potential users), uses, and type of material/collection Scan at the highest quality that does not exceed the likely potential users/uses/material Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery Many documents which appear to be bitonal actually are better represented with greyscale scans Include color bar and ruler in the scan Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct) Don’t use lossy compression Store in a common (standardized) file format Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
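
The gap between digital masters and delivery derivatives is easy to see from uncompressed file sizes. The Python sketch below is illustrative only: the 8 x 10 inch original and the 600/72 dpi resolutions are assumed figures, not recommendations from the slides.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scan: pixel count times bits per pixel, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed example: an 8 x 10 inch original.
master = uncompressed_bytes(8, 10, 600, 24)      # 24-bit color master
greyscale = uncompressed_bytes(8, 10, 600, 8)    # 8-bit greyscale
bitonal = uncompressed_bytes(8, 10, 600, 1)      # 1-bit bitonal
derivative = uncompressed_bytes(8, 10, 72, 24)   # screen-resolution derivative

for label, size in [
    ("master", master),
    ("greyscale", greyscale),
    ("bitonal", bitonal),
    ("derivative", derivative),
]:
    print(f"{label:>10}: {size / 1_000_000:.1f} MB")
```

Under these assumptions the master is roughly 70 times larger than the delivery derivative, which is why delivery limits should shape the derivative rather than the master.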

Why Scale is important

Identification/Provenance (Images)- The number of variant forms of a work can be enormous Image Families A digital image frequently has many layers of parentage Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation") how to deal with different versions derived from the same scan or different encoding schemes Vocabulary Standards to express this

The number of variant forms of a work can be enormous different views of the same object different scans of the same photo different resolutions different compression schemes different compression ratios different file storage formats different details of the same image ...

Image Families

Identification/Provenance how to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF) Vocabulary Standards to express this VRA Surrogate Categories CIMI's "Image Elements"
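
One way to record the parentage of versions in an image family is for each version to point at the image it was derived from. The Python sketch below is a hypothetical illustration: the `ImageVersion` class, the `lineage` helper, and the identifiers are assumptions, not part of the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageVersion:
    identifier: str
    file_format: str
    parent: Optional["ImageVersion"] = None  # the image this one was derived from


def lineage(image):
    """Walk from a version back to the original scan, nearest parent first."""
    chain = []
    while image is not None:
        chain.append(image.identifier)
        image = image.parent
    return chain


master = ImageVersion("scan-0001-master", "TIFF")
medium = ImageVersion("scan-0001-medium", "JFIF", parent=master)
browse = ImageVersion("scan-0001-browse", "JFIF", parent=medium)

print(lineage(browse))
```

Keeping the parent link with each version lets a user judge quality and veracity by tracing a browse image back to its master scan.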

Incorporate parts of Functional Requirements for Bibliographic Records (FRBR) work expression manifestation item (and push into “change history” section of Technical Image Metadata)

NISO/DLF Technical Image Metadata Workshop--4/99 (Z39.87-2002 draft) create metadata needed to manage images in digital repositories over long periods of time (full life-cycle mgmt) document image provenance & history ensure that the images will be rendered accurately on any output device

Technical Image Metadata Focus on Metadata that may prove helpful for management use preservation ...

Technical Image Metadata In Scope still, bit-mapped pictorial images scanned/reformatted images (+ born digital)

Technical Image Metadata Out of Scope vector images moving images images of OCR-able text structural and hierarchical relationships between images rights management, terms of use (authenticity/security)

Technical Image Metadata Technical Image Metadata-Z39.87 Image parameters (MIME type, compression, colorspace & profile, …) Image Creation (source, capture info, etc.) Image performance assessment (sampling, colormap, whitepoint, target data, etc.) Change history (source, processing, etc.)

Technical Image Metadata Technical Image Metadata-Z39.87 additional XML implementation schema (MIX)

Other Metadata Description of depiction/surrogate (What VRA calls its "Surrogate Categories") Description of original object Rights and Reproduction Information Location Information VRA Core, LCSH, TGM, AAT, ULAN, TGN, DOI, <indecs>, ...

Longevity & Preservation Repositories

Digital Preservation- The Problem Preservation Repositories Preservation Metadata Other Digital Preservation Activities Special concerns of Cult Heritage community

Serious Longevity Problems What we know from prior widespread digital file formats Previous formats required little ongoing intervention (remote storage facilities, Iron Mtn); digital formats require intense ongoing management The Short Life of Digital Info-

The Short Life of Digital Info: Digital Longevity Problems Disappearing Information The Viewing Problem The Scrambling Problem The Inter-relation Problem The Custodial Problem The Translation Problem

The Viewing Problem Digital Info requires a whole infrastructure to view it Each piece of that infrastructure is changing at an incredibly rapid rate How can we ever hope to deal with all the permutations and combinations

The Scrambling Problem Dangers from: Compression to ease storage & delivery Container Architecture to enhance digital commerce

The Inter-relation Problem -Info is increasingly inter-related to other info -How do we make our own Info persist when it points to and integrates with Info owned by others? -What is the boundary of a set of information (or even of a digital object)?

The Custodial Problem In the past, much of survival was due to redundancy How do we decide what to save? Who should save it? Mellon-funded E-Journal Archives How should they save it?-

The Custodial Problem: How to save information? Methods for later access Refreshing Migration Emulation Issues of authenticity and evidence

The Translation Problem Content translated into new delivery devices changes meaning -A photo vs. a painting -If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format? Behaviors

Older Longevity Projects http://sunsite.berkeley.edu/Longevity/ CPA Task Force Getty “Time & Bits” Conference & Follow-ups- Preservation experiments in US and Europe NEDLIB, CURL, Michigan Internet Archive Long Now

Preservation Repositories: Projects based on OAIS Model CEDARS NEDLIB Pandora CDL OCLC/RLG Working Group on Preservation Metadata, Attributes of a Trusted Digital Repository, August 2001-

Preservation Metadata OCLC/RLG Working Group on Preservation Metadata, Preservation Metadata for Digital Objects: A Review of the State of the Art, January 31 2001 OCLC/RLG Working Group on Preservation Metadata, A Recommendation for Content Information, October 2001

Preservation Repositories: Open Archival Info System Model OAIS Repository (Repository Administration) DIP Consumer SIP Producer AIP Management

Preservation Repositories: Open Archival Info System Model High-level reference model describing submission, organization and management, and continuing access Conceptual framework for different organizations to share discussions with a common language Producers, consumers, management, actual repository SIP, DIP, AIP AIP consists of data objects plus representation info (Content, Preservation Description, Packaging, Descriptive) Originally developed for Space Science community

Preservation Repositories -- AIP Metadata Preservation Description Info reference info context info provenance info fixity info Packaging Info Descriptive Info Content Info
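
Fixity info is commonly recorded as a checksum that can be recomputed later to detect change. The Python sketch below assumes SHA-256 as the digest (the OAIS model does not prescribe an algorithm), and the function names are hypothetical.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Compute a SHA-256 digest to store as fixity information."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, recorded: str) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    return fixity(data) == recorded


content = b"page image bytes"
recorded = fixity(content)                       # stored alongside the object at ingest

intact = verify(content, recorded)               # unchanged content
damaged = verify(content + b"\x00", recorded)    # a single added byte

print(intact, damaged)
```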

OCLC/RLG Digital Repository Attributes Administrative responsibility Organizational viability Financial sustainability Technological suitability System security Procedural accountability

OCLC/RLG Selected Recommendations Policies, Certification processes, Risk management, Persistent ID, Migration/Emulation experiments Stakeholders meet to decide how to describe what is in a dig repository Examine special properties of particular classes of digital objects Technical standards for exchange and interoperability btwn repositories Develop projects and case studies Copyright issues

Other Digital Preservation Activities- LC Natl Dig Info Infrastructure & Preservation InterPARES Emulation Projects E-Journal Archiving ERPANET Persistent Naming

LC’s National Digital Information Infrastructure and Preservation Program Authorized Dec 2000 LC, Dept of Commerce, NARA, White House Office of Sci & Tech Policy with help from CLIR, NLM, NAL, OCLC, RLG Ongoing collab process Commissioned papers on preserving: the Web, periodicals, digital sound, E-Books, Digital TV, Digital Video

InterPARES International Research on Permanent Authentic Records in Electronic Systems Ongoing international archival world project examining how to make electronically-generated records last over time Developing the theoretical and methodological knowledge needed, then will formulate model policies, strategies, and standards In 2003 was extended to include images and rich media

Emulation Projects CAMiLEON (Michigan/Leeds) NEDLIB

E-Journal Archiving Issues Mellon funded projects (2001) License, don’t own; may not be even able to obtain right to make archival copy Increasingly no paper back-up at all Usually we don’t have the important redundancy factor Mellon funded projects (2001) Yale, Harvard, Penn working w/individual publishers Cornell, NYPL--specific disciplines MIT exploring characteristics that change (dynamic) Stanford--archiving software tools

Electronic Resource Preservation and Access NETwork (ERPANET) Best practices and skills development for digital preservation of cultural heritage and scientific objects 3 year project launched Nov 2001; 1.2 million Euros

Persistent Naming URNs Handles PURLs Re-directs
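
The schemes listed above share one idea: a stable name is resolved through a lookup to wherever the object currently lives. The Python sketch below is a toy resolver, not an implementation of URNs, Handles, or PURLs; the `Resolver` class, the name, and the URLs are all hypothetical.

```python
class Resolver:
    """Toy persistent-name resolver: stable name -> current location."""

    def __init__(self):
        self._table = {}

    def register(self, name, url):
        """Create or update the location for a persistent name."""
        self._table[name] = url

    def resolve(self, name):
        """Return the current location, or None if the name is unknown."""
        return self._table.get(name)


resolver = Resolver()
resolver.register("urn:example:scrapbook-42", "http://old.example.org/scrapbook42")
first = resolver.resolve("urn:example:scrapbook-42")

# The object moves; only the resolver table changes, not the published name.
resolver.register("urn:example:scrapbook-42", "http://new.example.org/objects/42")
second = resolver.resolve("urn:example:scrapbook-42")

print(first, second)
```

Citations that use the persistent name keep working after the move because only the table entry is updated.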

Other Elements- Actors Metadata Other Metadata Preserving Electronic Art

Reference Models for Digital Libraries: Actors and Roles http://www.delos-nsf.actorswg.cdlib.org/ DELOS/NSF Working Group Reference Models for Digital Libraries: Actors and Roles

NSF/DELOS Actors/Roles Project Classes of Actors, including Persons Organizations automata Roles & implications Production Dissemination Management use

Multimedia & Collaborative Authorship imply Not only: Authors Editors Publishers But also creators of Text Illustrations Composers Musicians...

And goes beyond conventional authors Others that are part of digital library process Users Catalogers Reference librarians Even other groups/entities Software agents Mediators Special rights holders...

Borbinha’s “naive tentative sketch” of the problem... User Registered Anonymous Librarian Agent Creator Editor Distributor Preservation Publication Licensing Acquisition Registration Dissemination Search Digital Library Access

Benefits for Linking metadata to authority records Rights management Privacy protection

Deliverables Workshop proceedings: proceedings with invited contributions and papers selected from a call, intended to be a reference source for the current state of the art. White paper: Definition and introduction to the problem. Description and analysis of the requirements. A proposal to the community for a reference model, focusing on definitions of key concepts, terminology, classes of agents, services, relationships, etc. Proposals for an international agenda for further technical and collaborative developments.

Core group DELOS (Europe) José Borbinha, National Library of Portugal (DELOS coordinator) Michel Mabe, Elsevier Science, UK (Publishing industry) Peter Mutschke, Social Science Information Centre, Germany (Software agents, Information Retrieval) Hans-Jörg Lieder, Berlin State Library, Germany (LEAF project) Gunnar Karlsen, University of Bergen, Norway (Archives) WIPO – World Intellectual Property Organisation Glenn Macstravic NSF (USA) John Kunze, University of California, USA (NSF coordinator) Barbara Tillett, Library of Congress, USA (Libraries) Becky Dean, OCLC, USA (Libraries services) Angela Spinazze, CIMI/RLG, USA (Museums) Howard Besser, University of California, USA (Multimedia and digital art production) DCMI - Dublin Core Metadata Initiative Warwick Cathro, National Library of Australia

Work plan Phase 1: Starting (March - April 2002) Tuning objectives, scope, and action plan Identification of reference sources Call for contributions to the workshop Phase 2: Internal Discussion (May - June 2002) Analysis of the problem Draft paper Phase 3: Public Discussion (July - October 2002) Expose the draft paper. Promote open public discussion Workshop in Portugal (July 3-5). Workshop report Draft paper (second version) Phase 4: Conclusions (November - December 2002) Review of the work done... Final report

... Actors and Roles ???

Data Structures: The VRA Core 28 elements specifically for visual resource collections: Work Description Categories and Visual Document Description Categories http://www.oberlin.edu/~art/vra/dsc.html

VRA Core: Work Description Categories Work type Title Measurements Material Technique Creator Role Date Repository name Repository place Repository number Current site Original site Style/period/group/movement Nationality/culture Subject Related work Relationship type Notes

VRA Core: Visual Document Description Categories Visual document type Visual document format Visual document measurements Visual document date Visual document owner Visual document owner number Visual document view description Visual document subject Visual document source
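The two category lists above can be sketched as a simple data structure. This is a minimal sketch, not an official VRA Core encoding: the function name `make_record` is hypothetical, every category is assumed to be repeatable, and the example values are illustrative rather than taken from a real catalog record.

```python
# The category names are copied from the two slides above.
WORK_CATEGORIES = [
    "Work type", "Title", "Measurements", "Material", "Technique",
    "Creator", "Role", "Date", "Repository name", "Repository place",
    "Repository number", "Current site", "Original site",
    "Style/period/group/movement", "Nationality/culture", "Subject",
    "Related work", "Relationship type", "Notes",
]

VISUAL_DOCUMENT_CATEGORIES = [
    "Visual document type", "Visual document format",
    "Visual document measurements", "Visual document date",
    "Visual document owner", "Visual document owner number",
    "Visual document view description", "Visual document subject",
    "Visual document source",
]


def make_record(categories, values):
    """Build a record, rejecting any key that is not a listed category."""
    unknown = set(values) - set(categories)
    if unknown:
        raise KeyError(f"not a listed category: {sorted(unknown)}")
    # Categories without a supplied value get an empty list.
    return {name: list(values.get(name, [])) for name in categories}


# The work is described once; each slide or scan of it gets its own
# visual document record. Values here are illustrative assumptions.
work = make_record(WORK_CATEGORIES, {
    "Title": ["Wall Drawing 340"],
    "Creator": ["LeWitt, Sol"],
})
slide = make_record(VISUAL_DOCUMENT_CATEGORIES, {
    "Visual document type": ["slide"],
})
```

Keeping the work and its visual documents in separate records mirrors the split between the two category sets: one work can be documented by many images.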

Data Value Metadata (vocabularies) LCSH TGM AAT ULAN TGN VRA Core

LCSH very general

Thesaurus for Graphic Materials designed for subject indexing of pictorial materials, particularly large general collections of historical images for cataloging and retrieval good for general audiences and broad approaches to the material TGM-I: Subject Terms & TGM-II: Genre and Physical Characteristic Terms http://lcweb.loc.gov/rr/print/tgm/toc.html

AAT 120,000 terms for describing objects, textual materials, images, architecture, and material culture from antiquity to present large and complex http://www.getty.edu/gri/vocabularies/

ULAN name authority http://www.getty.edu/gri/vocabularies/

Thesaurus of Geographic Names over 1 million records hierarchical and global throughout history most records include coordinates and descriptive notes
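A hierarchical vocabulary of this kind can be pictured as records that point to a broader parent record. The sketch below is an assumption-laden illustration, not the TGN data format: the `Place` class and its fields are hypothetical, and the three places are ordinary examples rather than actual TGN records (coordinates are left unset rather than guessed).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Place:
    name: str
    parent: Optional["Place"] = None
    coordinates: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    note: str = ""

    def path(self):
        """Names from the broadest place down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


def within(place, ancestor):
    """True if `place` is `ancestor` or sits somewhere beneath it."""
    node = place
    while node is not None:
        if node is ancestor:
            return True
        node = node.parent
    return False


europe = Place("Europe")
portugal = Place("Portugal", parent=europe)
lisbon = Place("Lisbon", parent=portugal)
```

With the hierarchy in place, a search for a broad region can also retrieve items indexed under any narrower place, using a check such as `within(lisbon, europe)`.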

Metadata for Digital Commerce DOI <indecs>

<indecs> formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts); will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model; focus is on encoding rights at a high level, not on resource discovery; likely to involve metadata schema registration and a directory to allow interoperation of personal identifiers for rightsholders and users; supported by EEC DG-13; First meeting July 1999 http://www.indecs.org/
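The three things this slide says the model describes (the intellectual property, the parties trading it, and the agreements between them) can be sketched as three linked record types. This is only an illustration of that idea, not the <indecs> data model itself: all class names, field names, identifiers, and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Party:
    identifier: str   # hypothetical unique identifier for a person or business
    name: str


@dataclass(frozen=True)
class Creation:
    identifier: str   # hypothetical unique identifier for the work
    title: str


@dataclass(frozen=True)
class Agreement:
    creation: Creation
    grantor: Party
    grantee: Party
    permitted_uses: Tuple[str, ...]


def may_use(agreements, party, creation, use):
    """True if some agreement grants `party` the named use of `creation`."""
    return any(
        a.grantee == party and a.creation == creation and use in a.permitted_uses
        for a in agreements
    )


# Illustrative records only.
publisher = Party("party:1", "Example Publisher")
museum = Party("party:2", "Example Museum")
image = Creation("creation:1", "Example Photograph")
agreements = [Agreement(image, publisher, museum, ("display", "reproduce"))]
```

Note that nothing here helps a user find the photograph; the records only state who may do what with it, which matches the slide's point that the focus is on rights rather than resource discovery.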

What’s special about Cultural Heritage Materials? Images & rich media Inter-relationships between parts For Contemporary Art: What is the Work?

LeWitt: Wall Drawing 340

Installing LeWitt

LeWitt Install Directions

Complexity of Rich Media Works often have artistic nature (including video games) Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact) Too complex to save every one of these aspects for every type of material Importance of saving documentation

What can we do specific to Electronic Art? Works themselves may no longer even exist; in many cases, what we can save amounts to forensic evidence Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact) Too complex to save every one of these aspects for every type of material Importance of saving pieces, representations, and documentation Involve the artists to capture their intentions Importance of Standards Familiarize ourselves with recent conservation developments (Who Cares?, TechArcheology, Tate, IMAP)

Standards for encoding artists’ intentions (group efforts within the Cultural Heritage community) Artists’ Interviews Project, Netherlands Institute for Cultural Heritage 1998-1999, Modern Art: Who Cares? (http://www.icn.nl/english/6.4.2.html) TechArcheology: A Symposium on Installation Preservation (SFMOMA) More recent SFMOMA/Tate collaborations IMAP Guggenheim’s Variable Media

Structural Metadata Standards for Encoding Multimedia (no time for details) SMIL MPEG-4

A few questions our community should address Special issues raised by non-library institutions Special issues raised by images and rich media What is the work (or salient points we need to preserve)? Bring the arts communities (artist intent, BAVC) together with the preservation repository communities and the preservation metadata communities Specifically get Cultural Heritage communities involved with the selected OCLC/RLG recommendations Get cultural heritage groups started on working to make sure that structure standards incorporate our works What organizations will take responsibility to save today’s digital “ephemeral” materials (online ‘zines, arts discussion groups, etc.)?

Digital Repository Traditions & Services require Sustainability Interoperability Access And all of these require Standards and Metadata

Building a Digital Future: Sustainable, Interoperable, Accessible Repositories Howard Besser, NYU Archiving & Preservation Program Bernie Hurley, UC Berkeley Library http://www.firstmonday.dk/issues/issue7_6/besser/ Baca, Murtha (ed). Introduction to Metadata, Los Angeles: Getty Information Institute, 1998 http://www.getty.edu/gri/standard/intrometadata/ http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/ http://sunsite.berkeley.edu/Metadata/sp2000.html http://sunsite.berkeley.edu/Longevity/ http://www.oclc.org/digitalpreservation/presmeta_wp.pdf http://is.gseis.ucla.edu/us-interpares/ http://www.niso.org/commitau.html http://www.ifla.org/II/metadata.htm METS official site: http://www.loc.gov/standards/mets UC Libraries Systemwide Operations and Planning Advisory Group (SOPAG) Site http://www.slp.ucop.edu/sopag/ for the UC Digital Preservation & Archiving Committee Final Report, the Access Integration Model white paper and the Library Services Privacy report