Metadata for Digital Objects


1 Metadata for Digital Objects
With an emphasis on preservation… Pat Galloway, SoD, 9/10/09

2 Remarks on digitization
Cost-benefit
Sliver of a sliver? Or corpus?
Digitization as preservation
Obligation to preserve
Resulting requirements for metadata

3 What is metadata?
Data about data
  Database usage
  Web usage (metatags)
Functions? Kinds?
Several perspectives from which to consider metadata: orders, functions, life-cycle

4 First-order metadata: representation schemes
Encoding (ASCII, proprietary formatting schemes)
Compression schemes
Encryption or other intentional distortion schemes
These lie at the base of digital objects and exist before the creation of the object
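To make the point about representation schemes concrete, here is a minimal Python sketch. It assumes a short sample string (not from the slides) and uses only the standard library. It shows that the same text becomes different bytes under different encodings, that reading bytes with the wrong scheme silently produces different text, and that compression adds a further layer that has to be known before the object can be read.

```python
import zlib

text = "café"  # illustrative sample content, not from the presentation

# Encoding: the same characters become different bytes under different schemes.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Decoding with the wrong scheme raises no error; it just yields other text.
misread = utf8_bytes.decode("latin-1")

# Compression is a second representation layer on top of the encoding.
compressed = zlib.compress(utf8_bytes)
restored = zlib.decompress(compressed).decode("utf-8")

print(utf8_bytes, latin1_bytes, repr(misread), restored)
```

Without a record of which encoding and compression scheme were used, the bytes alone do not say how they should be interpreted.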

5 Second-order metadata
Written natural language (for example)
  Layout conventions
  Separation of words
  Arrangement of groups of words
  Punctuation, capitalization, etc.
Note that this is usually considered to belong to an external standard (“English”)

6 Third-order metadata
“Connections to the world”
Meaning
Semantics
Pragmatics

7 Fourth-order metadata
Functions
  What can you do with the digital object?
  What is its purpose?
  How does it work?
Functionality significant for preservation
Explicit digital object types

8 Fifth-order metadata
Groups of digital objects
  Archival series
  Project files
  “Complex documents”
Context of the group

9 More orders?
Additional intermediate orders could be thought of
  Depends on granularity
  May depend on object type

10 Classic objects of preservation in archives
Content Context Structure

11 Functional types of metadata
Administrative Descriptive (especially resource discovery) Preservation Technical Use

12 Life cycle view of metadata
Appraisal/Inventory/Scheduling Creation and versioning Transfer/Authenticity Descriptive Use Rights management Preservation and disposition

13 Attributes of metadata items
Source of metadata (internal or external)
Method of metadata creation (auto or manual)
Nature of metadata (lay or expert)
Status (static or dynamic)
Structure (structured or unstructured)
Semantics (controlled or uncontrolled)
Level (item or collection)
Note: these attributes are relevant for all metadata

14 Major Archival Metadata Schemes

15 University of Pittsburgh metadata reference model in six layers
Handle Terms & Conditions Structural Contextual Content Use History

16 Example: Structural Layer specifies technical details
File identification metadata File encoding metadata File rendering metadata Record rendering metadata Content structure metadata Source metadata

17 InterPARES Project Authenticity template
Documentary form Extrinsic elements Intrinsic elements Annotations Medium Context

18 Dublin Core Metadata Initiative
Supported by OCLC Primarily a surrogate/discovery metadata scheme Does not aim to document everything Useful for management of active digital objects

19 Basic Dublin Core elements
Title Creator Subject Description Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights
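As an illustration of how these elements are used, the following sketch builds a small Dublin Core record with Python's standard `xml.etree.ElementTree`. The `dc` namespace URI is the standard Dublin Core 1.1 element namespace. The `record` wrapper element and the helper name `dublin_core_record` are hypothetical, and the field values are illustrative: the title and creator come from the title slide, and the ISO date assumes that "9/10/09" means 10 September 2009.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat record with one dc:<element> child per (name, value) pair."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = dublin_core_record([
    ("title", "Metadata for Digital Objects"),
    ("creator", "Pat Galloway"),
    ("date", "2009-09-10"),  # assumed reading of "9/10/09"
    ("type", "Text"),        # illustrative value
    ("language", "en"),      # illustrative value
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Every element is optional and repeatable, which is why the helper takes a list of pairs rather than a fixed set of arguments.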

20 Dublin Core development
Initial development of simple elements Subelements and user communities Warwick Framework Qualified Dublin Core RDF and XML

21 Metadata Encoding and Transmission Standard (METS)
Developed out of LoC’s MOA project
Designed to support maintenance of libraries of digital objects
METS document is a “wrapper” containing pointer to the object plus its metadata
Three overall types of metadata (three segments of METS document)
  Descriptive
  Administrative
  Structural

22 METS Descriptive metadata
External (e.g., finding aid that can be pointed to via a URL) Internal (included in the document) Can include several different metadata sets as relevant

23 METS Administrative metadata
Technical metadata Intellectual property rights metadata Source metadata (for analog source) Digital provenance metadata Relations between files Migration/transformation data

24 METS Structural metadata
File groups list Structural map (defines relations between files and METS element structure) Behavior segment (associates executable methods with specific files, e.g. for display)
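The "wrapper" idea can be sketched in a few lines of Python. This is a minimal illustration, not a schema-validated METS document: the element and attribute names (`dmdSec`, `mdRef`, `amdSec`, `techMD`, `fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) follow the METS schema, but the function name `build_wrapper`, the IDs, and the example.org URLs are assumptions made for the example. The `techMD` element is left empty here, whereas a real document would wrap or reference technical metadata inside it.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def mets(tag: str) -> str:
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{tag}"


def build_wrapper(object_url: str, finding_aid_url: str) -> ET.Element:
    root = ET.Element(mets("mets"))

    # Descriptive metadata: an external finding aid, pointed to via a URL.
    dmd = ET.SubElement(root, mets("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, mets("mdRef"), {
        "LOCTYPE": "URL",
        "MDTYPE": "EAD",
        f"{{{XLINK_NS}}}href": finding_aid_url,
    })

    # Administrative metadata: placeholder technical metadata section.
    amd = ET.SubElement(root, mets("amdSec"))
    ET.SubElement(amd, mets("techMD"), {"ID": "TECH1"})

    # File list: the pointer to the digital object itself.
    file_sec = ET.SubElement(root, mets("fileSec"))
    group = ET.SubElement(file_sec, mets("fileGrp"), {"USE": "master"})
    file_el = ET.SubElement(group, mets("file"), {"ID": "FILE1", "ADMID": "TECH1"})
    ET.SubElement(file_el, mets("FLocat"), {
        "LOCTYPE": "URL",
        f"{{{XLINK_NS}}}href": object_url,
    })

    # Structural map: ties the file to a division of the object.
    struct_map = ET.SubElement(root, mets("structMap"))
    div = ET.SubElement(struct_map, mets("div"), {"DMDID": "DMD1"})
    ET.SubElement(div, mets("fptr"), {"FILEID": "FILE1"})
    return root


root = build_wrapper("http://example.org/object.tif",
                     "http://example.org/finding-aid.xml")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The sections are linked only through ID references (`ADMID`, `DMDID`, `FILEID`), which is what lets one wrapper carry several metadata sets for the same file.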

25 METS and XML
The METS XML schema
Why is it all so complicated?
How can anyone ever keep track of all this metadata?

26 XML in 10 Points
XML is for structuring
XML looks like HTML
XML is text for computers
XML is purposely verbose
XML is a family
XML is only partly new
XHTML->XML
XML is modular
XML is base for RDF, Semantic Web
XML is free, universal, supported

27 Creation Metadata

28 Metadata added at creation
By the creator By the creating application program (note: some of this is meant for system use) Example of hybrid process: creation of Word file

29 Example: Word processing


34 Digitization as creation
Preprocessing Conversion Quality control Object manipulation Surrogate outputs (see handouts)

35 Appraisal / Inventory / Retention Schedule Metadata

36 Digital Appraisal Decisions
Keep (costs of carrying into the future) Allow to Die (keep but do nothing) Repurpose (separating content and form) Destroy (microwave the disk?)

37 Digital Appraisal: What to Appraise
Content (as with paper?) Technical support System Creating application Display requirements Functionality

38 What is a Retention Schedule?
Classic record statuses: active, semiactive, inactive
Keep
  Alter function of custodian
  Alter custodianship
Allow to Die
  Leave with creator?
  Why not always do this?
Destroy
  Determine when to destroy
  Almost always a method for reprieve exists…

39 Record-level vs Group-level Metadata
Record-level: Metadata orders 1-4
  1 encoded (content)
  2 written (content)
  3 meaning (ontology)
  4 function/purpose=type (form)
Group-level: Metadata order 5
  5 Object grouping schemes (categories)
    Record groups, record series (intellectual management)
    Format, security concerns (physical management)

40 Transfer / Authenticity Metadata

41 The central problem: Security guaranteeing Authenticity
Guarding the object (authenticity, integrity)
Tracking the object through its lifetime
Proving the identities of the people responsible for transferring the object (authentication, non-repudiation)
Transferring the object in a secure way

42 What is transfer about?
What is a digital copy? What qualifies?
  Data compression issues
  Data segmentation issues
  Creating application vs file-management application
How can a digital copy be guaranteed?
  Digital object as string of bits
  Message digest of object as math on the bits
  Ship the message digest with the object
  Recalculate and compare at the other end
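The digest-and-compare procedure above can be sketched with Python's standard `hashlib`. The choice of SHA-256 and the function names `message_digest` and `verify_transfer` are assumptions for illustration, since the slides do not name a particular algorithm.

```python
import hashlib


def message_digest(data: bytes) -> str:
    """Compute a one-way hash over the object's bits (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


def verify_transfer(received: bytes, shipped_digest: str) -> bool:
    """Recalculate the digest at the receiving end and compare."""
    return message_digest(received) == shipped_digest


# Sender side: compute the digest and ship it alongside the object.
original = b"Metadata for Digital Objects"
digest = message_digest(original)

# Receiver side: an identical copy passes, a one-byte change does not.
intact = verify_transfer(original, digest)
altered = verify_transfer(original + b" ", digest)
print(intact, altered)
```

A matching digest shows only that the bits arrived unchanged. Establishing who sent them requires the authentication measures listed on the previous slide.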

43 Guaranteeing the authenticity of the object (Integrity)
Object as open or secret
  Must we disguise the object?
  Can we move it around in clear?
Message digest
  Creates single number: “one-way hash”
  Number will change with the slightest change in the object on which it was calculated
Encryption (Confidentiality)
  Asymmetric
  Symmetric

44 Accession Metadata

45 What is the nature of the accession task?
The object received has been uprooted from its former context
Object is equipped with enough metadata to reconstruct that context
Contextual metadata now is no longer functional but descriptive of the old context
Object must be integrated into a new context (which may mirror the old)
New functions must be provided for (meta-activities)

46 Validation of the object
Validation test suite
Validation tools
Formal validation process
Validation outcomes
  Rejection
  Re-transfer
  Acceptance

47 Preparation of the object for storage
Metadata as data and as processing instructions Digital object and use copy Storage issues

48 Descriptive Metadata

49 Descriptive metadata for what?
Individual objects (Dublin Core, RDF) Books and other chunks (MARC, MODS) Multimedia objects (METS, MPEG 21) Finding aids (EAD): collection-level

50 What about the single object?
Is Dublin Core enough? What for?
Who will describe at the object level?
  Zillions of archivists?
  Automatic analysis?
  Ad hoc analysis?
  Taggers on the Internet?

51 Preservation Metadata

52 What is Preservation Metadata?
Object stability (OAIS “content data object”)
  What elements of the object’s content should be preserved? What is it? What is it for?
  What functions of the object should be preserved?
  (i.e., how can it remain itself into the future, and what do we mean by “itself”?)
Environmental support (OAIS “environment”)
  What kind of environmental characteristics does the object need to stay alive (software, hardware)?
  (i.e., how do we specify its life support system?)

53 Object Stability I: Content
Authenticity revisited: stability for what?
  Access to genuine article
  Historical truth
  Guarantee of prior art
  Intellectual property guarantee
Range of attributes needed for each
What does “content” mean?

54 Object Stability II: Functionality
Static objects (e.g. text)
  Look and feel
Dynamic objects (e.g. computer game)
  Connectivity
  Interactivity

55 Environmental Support I: Emulation
Making it possible to see the object as it was originally seen
Making it possible for the object to function as it originally did
Providing software support for that to happen
  Running the original program (in an environment that emulates the original environment)
  Running something that looks like (emulates) the original program

56 Environmental Support II: Migration
Deciding what to migrate (deciding what to lose)
Transformations to the object
  If reversible, no need to keep original object
  If not, retention of original object necessary
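One way to make the reversibility test concrete is a round-trip check: migrate the object, migrate it back, and compare the result with the original bits. The sketch below is an illustration under stated assumptions. The sample object and the two migrations (a re-encoding to UTF-8 and a flattening to ASCII) are hypothetical, and a round trip on one object demonstrates reversibility for that object only, not for the transformation in general.

```python
def is_reversible(original: bytes, migrate, restore) -> bool:
    """Return True if restore(migrate(x)) reproduces the original bits."""
    return restore(migrate(original)) == original


# Hypothetical legacy object stored as UTF-16 (little-endian) text.
source = "naïve café".encode("utf-16-le")


def to_utf8(data: bytes) -> bytes:
    return data.decode("utf-16-le").encode("utf-8")


def from_utf8(data: bytes) -> bytes:
    return data.decode("utf-8").encode("utf-16-le")


def to_ascii(data: bytes) -> bytes:
    # Characters outside ASCII are replaced with "?", so information is lost.
    return data.decode("utf-16-le").encode("ascii", errors="replace")


def from_ascii(data: bytes) -> bytes:
    return data.decode("ascii").encode("utf-16-le")


# Decision rule from the slide: keep the original only if the change is lossy.
keep_original_after_utf8 = not is_reversible(source, to_utf8, from_utf8)
keep_original_after_ascii = not is_reversible(source, to_ascii, from_ascii)
print(keep_original_after_utf8, keep_original_after_ascii)
```

In this sketch the UTF-8 migration round-trips cleanly, while the ASCII migration discards the accented characters, so the original would have to be retained.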

57 Documentation requirements for preservation
What the object was What the object is What happened in between

58 OAIS metadata model I

59 OAIS metadata model II
SIP (send), AIP (archive), DIP (disseminate)
Parts of an object
  Content
  Preservation description
    Reference (unique identifier)
    Provenance (history in and out of repository)
    Context (archival bond)
    Fixity (message digest)
  Packaging
  Descriptive

60 OAIS metadata model III
What is “representation information”?
  How much must be kept?
  Monitoring changes
What is the “knowledge base”?
  Designated user community
  DUC as “the public”

61 PREservation Metadata Implementation Strategies (PREMIS)
Preservation metadata set, 2003-present
Assumes OAIS model
Maintaining viability, renderability, understandability, authenticity, identity
Emphasis on provenance and relationships
Entity concept
  [Intellectual entity: descriptive metadata]
  Object
  Event
  Agent (MARC, MADS)
  Rights
Technical/hardware metadata out of scope

62 PREMIS Example: Object
objectIdentifier
objectCategory
preservationLevel
significantProperties
objectCharacteristics
originalName
storage
environment
signatureInformation
relationship
linkingEventIdentifier
linkingIntellectualEntityIdentifier
linkingRightsStatementIdentifier
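To show how a few of these semantic units might be carried in practice, here is a minimal Python sketch. The class `PremisObject` and its method `link_event` are hypothetical and cover only a subset of the units listed above. The field names mirror the PREMIS unit names, while all values are invented for illustration. It is a data-structure sketch, not an implementation of the PREMIS schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PremisObject:
    """Hypothetical holder for a subset of PREMIS Object semantic units."""
    objectIdentifier: str
    objectCategory: str  # PREMIS categories: file, bitstream, representation
    originalName: str
    preservationLevel: str = ""
    significantProperties: List[str] = field(default_factory=list)
    linkingEventIdentifier: List[str] = field(default_factory=list)

    def link_event(self, event_id: str) -> None:
        """Record that a preservation event (e.g. a migration) touched this object."""
        self.linkingEventIdentifier.append(event_id)


# Illustrative values only.
obj = PremisObject(
    objectIdentifier="obj-0001",
    objectCategory="file",
    originalName="metadata_for_digital_objects.ppt",
    significantProperties=["slide order", "speaker notes"],
)
obj.link_event("event-0001-ingest")
obj.link_event("event-0002-migration")
print(obj)
```

Keeping event links on the object is one way to answer "what happened in between": the object record points to every event that changed or checked it.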

63 Usage Metadata

64 What is Usage Metadata?
Internal users (with respect to the creator)
External users (with respect to the creator)
Internal users (with respect to the repository)
External users (with respect to the repository)

65 Creator Usage
The creator’s actual use of the object
  Version control
The creator’s colleagues’ use of the object
  Object function
  Object used for reference, model
The creator’s customers’ use of the object
  Object function: mediates relationship

66 Repository Usage
Management usage
  Object maintenance and preservation
  Object analysis
Designated user community
  Object viewing
  Object acquisition

67 Rights Management Metadata

68 What is Rights Management?
Protection of copyright Protection of patent Protection of the integrity of the digital object (and thereby reputation of the author/creator herself)

69 What is being protected?
Object itself (integrity)
Uses of the object (access controls)
  Limiting use (protecting rights of the owner)
  Enabling use (protecting rights of the user)

70 Protection against theft
Threats of the law Fully document with metadata and protect the metadata Authentication of users and user requests Watermarking/steganography

71 What about integrity of the digital object?
Relevant even in public domain
E.g. “copyleft” agreement: See but not change, or change only with notification

72 Metadata Conclusions?

