Metadata Modularization Concepts and Tools Carl Lagoze CS502 2001-03-14.


1 Metadata Modularization Concepts and Tools Carl Lagoze CS502 2001-03-14

2 Metadata Structured data about data….

3 Why is Metadata important? Key to organizing, managing, preserving, and locating content and services in digital libraries

4 Why is Metadata difficult? Cost Interoperability –Syntax –Semantics Customizability Extensibility Distribution Integrity, Authenticity, Quality Human and Machine Factors Naming

5 Metadata Thoughts Metadata takes a variety of forms –descriptive cataloging –specialized: terms and conditions, administrative, content ratings, provenance, linkage

6 More Metadata Thoughts New metadata sets will continually evolve Many metadata sets are “community-specific” –administration –use Human and machine use

7 Dublin Core Metadata Set for Simple Resource Discovery 15 elements allowing simple descriptive sentences about document-like objects: –“Document has title Hamlet” –“Document has creator William Shakespeare” –“Document has subject love and anguish”

8 The Dublin Core 15: Title, Creator, Subject/Keywords, Description, Publisher, Other Contributor, Date, Resource Type, Format, Resource Identifier, Source, Language, Relation, Coverage, Rights Management

9 A Scope for the Dublin Core Increase or decrease number of elements? Structured or Unstructured value syntax? Accommodate community extensions?

10 Warwick Framework Provide context for Dublin Core effort Integrate multiple sets of metadata addressing issues of: –individual integrity –distinct audiences –separate realms of responsibility and management

11 Warwick Framework Design Containers for aggregating … Packages of typed metadata sets General principles - information hiding: –only operation defined at container level returns sequence of contained packages –packages are opaque at the container level –access to package contents subject to terms and conditions

12 Package Types Simple metadata set –segregating distinct metadata into separate packages Recursive container –nesting semantically related metadata sets Indirect reference –allowing distribution and sharing of metadata sets
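The container and package design on slides 11 and 12 can be sketched as a small data structure. This is a minimal illustration, not an implementation from the lecture: the class names, the `packages()` method name, and the example payloads and URI are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class MetadataSet:
    """Simple package: one typed metadata set, opaque to the container."""
    type_name: str  # illustrative type label, e.g. "DublinCore" or "MARC"
    payload: bytes  # never interpreted at the container level


@dataclass
class IndirectReference:
    """Package that points at a metadata set held elsewhere."""
    uri: str


@dataclass
class Container:
    """Aggregates packages; a container is itself a package (recursion)."""
    _packages: List["Package"] = field(default_factory=list)

    def add(self, package: "Package") -> None:
        self._packages.append(package)

    def packages(self) -> Iterator["Package"]:
        """The only container-level operation: the sequence of packages."""
        return iter(self._packages)


Package = Union[MetadataSet, IndirectReference, Container]

# Hypothetical example data.
inner = Container()
inner.add(MetadataSet("MARC", b"..."))

outer = Container()
outer.add(MetadataSet("DublinCore", b"..."))
outer.add(inner)                                          # recursive container
outer.add(IndirectReference("http://example.org/terms"))  # assumed URI

kinds = [type(p).__name__ for p in outer.packages()]
```

Here `kinds` lists the package kinds in insertion order; nothing at the container level inspects a payload, which is the information-hiding principle the slide describes.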

13 Metadata Container Container Package Dublin Core Package MARC record Package Indirect Reference Package Terms and Conditions URI

14 Open Implementation Issues Data encoding Semantic interaction of overlapping sets –between semantically-related packages –between semantically distinct packages Type registry

15 Modeling & Encoding Metadata Components: XML Namespaces Prevent term clash: –record?, creator? Establish concept spaces through URIs xmlns:dc=“http://purl.org/dc xmlns:abc=“http://ilrt.ac.uk/abc Herbert Van de Sompel Cornell University
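A short sketch of how namespaces prevent term clash, using Python's standard `xml.etree.ElementTree`. The Dublin Core namespace URI is the one quoted on slide 20; the second namespace, the `record` wrapper element, and the element values are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.0/"   # URI as quoted on slide 20
OTHER = "http://example.org/other-vocab#"  # hypothetical second vocabulary

# Two elements share the local name "creator" but live in different namespaces.
doc = f"""<record xmlns:dc="{DC}" xmlns:ex="{OTHER}">
  <dc:creator>William Shakespeare</dc:creator>
  <ex:creator>scanning-station-7</ex:creator>
</record>"""

root = ET.fromstring(doc)

# ElementTree expands each prefix to its URI, so the two terms never collide.
dc_creator = root.find(f"{{{DC}}}creator").text
ex_creator = root.find(f"{{{OTHER}}}creator").text
```

Because each name is qualified by a URI, a processor can tell the two "creator" terms apart even though the local names are identical.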

16 Modeling & Encoding Metadata Components: RDF RDF (Resource Description Framework) The instantiation of the Warwick Framework on the Web Provides enabling technology for richly-structured metadata Rich data model supporting notions of distinct entities and properties Syntax expressed in XML

17 RDF Components Formal data model Syntax for interchange of data Schema Type system (schema model)

18 RDF Data Model Directed labeled graphs Model elements –Resource –Property –Value –Statement –Containers

19 RDF Model Primitives Resource Property Value Resource Statement
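The resource/property/value primitives can be pictured as a list of three-part statements forming a directed labeled graph. A minimal sketch, assuming a plain in-memory list; the identifier `doc:hamlet` and the helper `values_of` are hypothetical, and the values echo the Dublin Core sentences on slide 7.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    resource: str  # the node the arc leaves
    prop: str      # the arc label
    value: str     # the node or literal the arc points at


graph: List[Statement] = [
    Statement("doc:hamlet", "dc:Title", "Hamlet"),
    Statement("doc:hamlet", "dc:Creator", "William Shakespeare"),
    Statement("doc:hamlet", "dc:Subject", "love and anguish"),
]


def values_of(statements: List[Statement], resource: str, prop: str) -> List[str]:
    """Follow every arc labeled `prop` that leaves `resource`."""
    return [s.value for s in statements if s.resource == resource and s.prop == prop]


creators = values_of(graph, "doc:hamlet", "dc:Creator")
```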

20 RDF Syntax Example URI:R “CIMI Presentation” Title Creator dc: “Eric Miller” <RDF xmlns = “http://www.w3.org/TR/WD-rdf-syntax#” xmlns:dc = “http://purl.org/dc/elements/1.0/”> CIMI Presentation Eric Miller
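The element tags of the slide's XML did not survive in this transcript. The sketch below shows one plausible shape for such a document and how statements could be read back out of it. The `Description` element and `about` attribute are assumptions borrowed from later RDF/XML syntax, not recovered from the slide; the two namespace URIs and the literal values are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"
DC_NS = "http://purl.org/dc/elements/1.0/"

# Assumed reconstruction: element and attribute names are illustrative.
rdf_xml = f"""<RDF xmlns="{RDF_NS}" xmlns:dc="{DC_NS}">
  <Description about="URI:R">
    <dc:Title>CIMI Presentation</dc:Title>
    <dc:Creator>Eric Miller</dc:Creator>
  </Description>
</RDF>"""


def statements(xml_text):
    """Return (resource, property URI, value) for each described property."""
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get("about")
        for child in desc:
            # ElementTree tags look like "{namespace}local"; join the two parts.
            namespace, _, local = child.tag[1:].partition("}")
            found.append((subject, namespace + local, child.text))
    return found


triples = statements(rdf_xml)
```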

21 RDF Model Example #2 Nodes: URI:R, URI:ERIC, URI:OCLC Literals: “CIMI Presentation”, “Eric Miller”, “emiller@oclc.org”, “OCLC” Arc labels: dc:Title, dc:Creator, bib:Name, bib:Email, bib:Aff, oa:

22 RDF Syntax Example #2 <RDF xmlns = “http://www.w3.org/TR/WD-rdf-syntax#” xmlns:dc = “http://purl.org/dc/elements/1.0/” xmlns:bib = “http://www.bib.org/persons#”> CIMI Presentation Eric Miller emiller@oclc.org

23 RDF Containers Permit the aggregation of several values for a property Express multiple aggregation semantics –unordered –sequential or priority order –alternative
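The three aggregation semantics can be contrasted with a small sketch. The class names follow the RDF container names Bag, Seq, and Alt; the methods `same_as` and `choose` are hypothetical helpers written for this illustration, and the member values are made up.

```python
class Bag:
    """Unordered: only membership matters."""
    def __init__(self, *members):
        self.members = list(members)

    def same_as(self, other):
        return sorted(self.members) == sorted(other.members)


class Seq:
    """Sequential or priority order: position matters."""
    def __init__(self, *members):
        self.members = list(members)

    def same_as(self, other):
        return self.members == other.members


class Alt:
    """Alternatives: pick one member; the first is treated as the default."""
    def __init__(self, *members):
        self.members = list(members)

    def choose(self, acceptable):
        for member in self.members:
            if member in acceptable:
                return member
        return self.members[0]


bag_equal = Bag("Lagoze", "Miller").same_as(Bag("Miller", "Lagoze"))
seq_equal = Seq("Lagoze", "Miller").same_as(Seq("Miller", "Lagoze"))
picked = Alt("en", "fr", "de").choose({"fr", "de"})
```

Swapping the members leaves a Bag unchanged but produces a different Seq, while an Alt yields a single value chosen from its members.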

24 RDF Schemas Declaration of vocabularies –properties defined by a particular community –characteristics of properties and/or constraints on corresponding values Schema Type System - Basic Types –Property, Class, SubClassOf, Domain, Range –Minimal (but extensible) at this time –minimize significant clashes with the typing system being designed by the XML Schema WG Expressible in the RDF model and syntax

25 Relationships among vocabularies dc:Creator ms:director marc:100 bib:Author
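The slide names four terms but the transcript does not preserve how they are connected. One common reading, assumed here, is that the community-specific properties refine the broader `dc:Creator`. The sketch below shows how a schema-level mapping of that kind would let one query span several vocabularies; the `REFINES` table, the helper functions, and the sample statements are all hypothetical.

```python
# Assumed direction: each specific property refines dc:Creator.
REFINES = {
    "ms:director": "dc:Creator",
    "marc:100": "dc:Creator",
    "bib:Author": "dc:Creator",
}


def generalize(prop):
    """Walk up the refinement chain to the broadest known property."""
    while prop in REFINES:
        prop = REFINES[prop]
    return prop


def query(statements, resource, prop):
    """Return values whose property is `prop` or a refinement of it."""
    return [v for (r, p, v) in statements if r == resource and generalize(p) == prop]


# Hypothetical statements drawn from two different vocabularies.
statements = [
    ("film:1", "ms:director", "Jane Doe"),
    ("film:1", "dc:Title", "Example Film"),
    ("doc:1", "bib:Author", "Eric Miller"),
]

film_creators = query(statements, "film:1", "dc:Creator")
doc_creators = query(statements, "doc:1", "dc:Creator")
```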

26 Bringing it together RDF Data Model –Support consistent encoding, exchange and processing of metadata… critical when aggregating data from multiple sources RDF Schema –Declare, define, reuse vocabularies RDF Metadata transmission –XML encoding

27 Interoperability among Metadata Vocabularies core classes Dublin Core MARC INDECS IMS

28 Attribute/Value approaches to metadata… “Hamlet has a creator Shakespeare” (subject, implied verb, metadata noun, literal) “Playwright” (metadata adjective): “The playwright of Hamlet was Shakespeare” R1 dc:title “Hamlet” R1 dc:creator.playwright “Shakespeare”

29 …run into problems for richer descriptions… “The playwright of Hamlet was Shakespeare, who was born in Stratford” “Hamlet has a creator Shakespeare” “Hamlet has a creator birthplace Stratford” R1 dc:creator.playwright “Shakespeare” R1 dc:creator.birthplace “Stratford”

30 …because of their failure to model entity distinctions R1 title “Hamlet” R1 creator R2 R2 name “Shakespeare” R2 birthplace “Stratford”
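The difference can be made concrete with a sketch in which the creator is its own resource (R2) rather than a string hung off the document (R1). The resource and property names come from the slide; the helper functions are hypothetical.

```python
# R2 is a distinct entity, so properties of the person attach to the person.
graph = [
    ("R1", "title", "Hamlet"),
    ("R1", "creator", "R2"),
    ("R2", "name", "Shakespeare"),
    ("R2", "birthplace", "Stratford"),
]


def objects(statements, subject, prop):
    """Values reached from `subject` along arcs labeled `prop`."""
    return [o for (s, p, o) in statements if s == subject and p == prop]


def subjects(statements, prop, value):
    """Resources that have `value` for `prop`."""
    return [s for (s, p, o) in statements if p == prop and o == value]


def creator_birthplaces(statements, title):
    """Document -> creator entity -> birthplace, one hop at a time."""
    places = []
    for doc in subjects(statements, "title", title):
        for person in objects(statements, doc, "creator"):
            places.extend(objects(statements, person, "birthplace"))
    return places


birthplaces = creator_birthplaces(graph, "Hamlet")
```

In the flat attribute/value form of slide 29, "Stratford" is attached to the document and nothing ties it to "Shakespeare"; once R2 exists, the birthplace is unambiguously a property of the person.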

31 Understanding Metadata based on Query Capabilities Simple boolean tags? Agent, time, place questions? –Who was responsible for what and when

32 Applying a Model-Centric Approach Formally define common entities and relationships underlying multiple metadata vocabularies Describe them (and their inter-relationships) in a simple logical model Provide the framework for extending these common semantics to domain and application-specific metadata vocabularies.

33 Conceptual Basis: Evolution of Content over Time IFLA Entity Model From Bearman et al., D-Lib Magazine, January 1999.

34 Events are key to understanding metadata relationships? Recognizing inherent lifecycle aspects of digital content - transformation of “input” resources to “output” resources and of their descriptions. (e.g., IFLA model) Modeling implied events as first-class objects provides attachment points for common entities – e.g., agents, contexts (times & places), roles. Clarifying attachment points facilitates mapping across common entities in different vocabularies.
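A sketch of what treating an event as a first-class object might look like, with the agent, role, time, and place attached to the event rather than to the resource. The field names, the helper function, and every value in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    kind: str            # what happened, e.g. "digitization"
    agent: str           # who was responsible
    role: str            # in what capacity
    when: str            # context: time
    place: str           # context: place
    inputs: List[str]    # resources consumed or transformed
    outputs: List[str]   # resources produced


# Hypothetical lifecycle: a print is scanned, then the scan is described.
events = [
    Event("digitization", "Imaging Lab", "digitizer", "2001-03-01", "Ithaca",
          inputs=["print:1"], outputs=["image:1"]),
    Event("cataloging", "A. Cataloger", "describer", "2001-03-14", "Ithaca",
          inputs=["image:1"], outputs=["record:1"]),
]


def responsible_for(history: List[Event], resource: str) -> List[Tuple[str, str, str]]:
    """Answer "who was responsible for this resource, and when?"."""
    return [(e.agent, e.role, e.when) for e in history if resource in e.outputs]


image_history = responsible_for(events, "image:1")
```

Because agents and contexts hang off events, two vocabularies that name roles differently can still be compared at the event level, which is the attachment-point idea the slide describes.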

35 Content, Events, & Descriptions

36 Museum Data

