Metadata for Digital Libraries: A Functional Approach Sandra Payette Digital Library Research Group Cornell University Cornell Digital.


1 Metadata for Digital Libraries: A Functional Approach Sandra Payette Digital Library Research Group Cornell University payette@cs.cornell.edu Cornell Digital Imaging Workshop October 21, 1998

2 Metadata Metadata is structured data about data that imposes order on a disordered information universe. [Diagram: a descriptive record (CREATOR: Plato; TITLE: The Republic); image file storage (Image 1: cdrom 1; Image 2: cdrom 1; Image 3: cdrom 2); Access Control List]

3 Many Types of Metadata Descriptive Structural Terms and conditions Administrative Content ratings Provenance Relationship

4 Basic Functions We Must Support Resource Discovery Access and Use Preservation and Administration

5 Resource Discovery: Focus on Descriptive Metadata

6 Metadata for Resource Discovery Catalogs –OPAC / MARC Records Indexes –Structured descriptive records (e.g., Dublin Core) –Abstracts –Full-text surrogates (e.g., via OCR)

7 Challenges Impracticality of large-scale traditional cataloging –time consuming, labor intensive, special skills –limited coverage - only “selected” items Problems with resource discovery –full-text indexing ineffective (false hits, irrelevancy, overload) –full-text approaches not useful for non-textual data (e.g., audio, video, executable programs)

8 One Solution: Simple Descriptive Surrogates Easy to create Applicable across domains Applicable to different genres of objects Allows interoperability among robots, indexers, and search clients

9 Dublin Core Element Set Good baseline descriptive record Can exist alongside other specialized metadata Common ground for discovery across disparate resources No specialized skills required Flexibility through qualifiers Source: http://www.purl.org/Metadata/dublin_core/

10 Dublin Core: 15 Elements
–Title: name given to the work by the author
–Author or Creator: person(s) responsible for the intellectual content
–Subject and Keywords: the topic of the work, keywords, or formal classification schemes
–Description: textual description of the content (abstract, prose describing an image, etc.)
–Publisher: the organization making the work available in its present form
–Other Contributor: person(s) other than the author who have made significant contributions to the intellectual content
–Date: the date the work was made available
–Resource Type: category of the resource
–Format: data representation of the resource
–Resource Identifier: unique identification string (e.g., URL, URN, ISBN...)
–Source: object from which this object is derived (if applicable)
–Language: language of the intellectual content of the object
–Relation: relationship of the object to other objects or collections
–Coverage: spatial locations and temporal duration characteristics
–Rights Management: a pointer to a copyright notice, a rights management statement, or a rights server
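As a rough illustration of how lightweight such a record is, the fifteen elements can be held in a plain mapping. This is only a sketch: the short labels follow the element names used in RFC 2413, the helper `make_record` is hypothetical, and the sample values are the Plato example from slide 2.

```python
# Short labels for the fifteen elements, in the order listed on the slide.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def make_record(**fields):
    """Build a record; every element is optional and may repeat (stored as a list)."""
    unknown = sorted(set(fields) - set(DC_ELEMENTS))
    if unknown:
        raise ValueError(f"not Dublin Core elements: {unknown}")
    return {
        name: list(value) if isinstance(value, (list, tuple)) else [value]
        for name, value in fields.items()
    }


# Hypothetical usage with the values shown on slide 2.
record = make_record(Title="The Republic", Creator="Plato")
print(record)
```

Storing each value as a list reflects that Dublin Core elements are all optional and repeatable, so a work with two creators needs no special case.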

11 Dublin Core in HTML META Tags Cornell Digital Library Research Group <META name="DC.identifier" scheme="URL" content="http://www2.cs.cornell.edu/NCSTRL/CDLRG/cdlrg.htm"> Source: http://www.w3.org/TR/REC-html40/
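A small generator for such tags could look like the sketch below. It assumes the `DC.` name prefix and optional `scheme` attribute shown in the tag above; the function name `dc_meta_tags` and the use of `DC.title` for the page title are illustrative assumptions, not something the slide specifies.

```python
from html import escape


def dc_meta_tags(record):
    """Render {element: value} or {element: (value, scheme)} as HTML META tags."""
    lines = []
    for element, value in record.items():
        scheme = None
        if isinstance(value, tuple):
            value, scheme = value
        # escape() also escapes double quotes, so values are safe inside attributes.
        scheme_attr = f' scheme="{escape(scheme)}"' if scheme else ""
        lines.append(
            f'<META name="DC.{escape(element)}"{scheme_attr} content="{escape(value)}">'
        )
    return "\n".join(lines)


# Hypothetical usage: the identifier is the one on the slide.
print(dc_meta_tags({
    "title": "Cornell Digital Library Research Group",
    "identifier": ("http://www2.cs.cornell.edu/NCSTRL/CDLRG/cdlrg.htm", "URL"),
}))
```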

12 Warwick Framework Developed by Dublin Core community Broader framework to accommodate diverse metadata schemes Encourages community-specific definition and administration of metadata Modularity supports interoperability among: – content providers – catalogers and indexers – automated resource discovery systems

13 Warwick Framework [Diagram: a Container holding packages: a Dublin Core package, an other-descriptive package, a package holding a reference (URI) to a MARC record, and a nested Container. Simple package: typed metadata set.]
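The container idea can be sketched as a small recursive data structure. The class names below (`MetadataSet`, `Indirect`, `Container`) and the example URI are hypothetical; the sketch only assumes what the framework describes: a container aggregates typed packages, and a package is either a metadata set, a reference by URI, or another container.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class MetadataSet:
    """Simple package: a typed metadata set carried inside the container."""
    scheme: str
    data: dict


@dataclass
class Indirect:
    """Package that points to metadata stored elsewhere, by URI."""
    scheme: str
    uri: str


@dataclass
class Container:
    """Aggregates packages; may itself be nested inside another container."""
    packages: List[Union["MetadataSet", "Indirect", "Container"]] = field(default_factory=list)

    def schemes(self):
        """List the metadata schemes present, descending into nested containers."""
        found = []
        for package in self.packages:
            if isinstance(package, Container):
                found.extend(package.schemes())
            else:
                found.append(package.scheme)
        return found


# Hypothetical usage: a Dublin Core package plus a reference to a MARC record.
container = Container([
    MetadataSet("Dublin Core", {"Title": "The Republic", "Creator": "Plato"}),
    Indirect("MARC", "http://example.org/marc/record-1"),  # made-up URI
])
print(container.schemes())
```

Because each package carries its own type, a client can use the packages it understands and skip the rest, which is the modularity the previous slide describes.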

14 WWW Infrastructure Evolving in this Direction Dublin Core submitted to IETF as RFC –ftp://ftp.isi.edu/in-notes/rfc2413.txt Resource Description Framework (RDF) –http://www.w3.org/RDF/ Extensible Markup Language (XML) –http://www.w3.org/XML/

15 Resource Description Framework (RDF) Influenced by the Warwick Framework, among others Enables interoperability between applications that exchange metadata Mix and match of metadata elements from different schemas An application of XML (transfer syntax)

16 A Simple RDF Model www2.cs.cornell.edu/CDLRG/doc1 DC:Creator DC:Publisher QCSchema:Rating www.xxx.org/rate A B MyRating YourRating

17 RDF Expressed in XML Dublin Core Element Set <?xml:namespace name="http://www.purl.org/Metadata/dublin_core/" as="DC"?> <?xml:namespace name="http://www.w3.org/Schemas/RDF/" as="RDF"?> <RDF:Assertions href="http://www2.cs.cornell.edu/CDLRG/doc1"> <DC:Creator>Sandy Payette</DC:Creator> <DC:Publisher>Cornell DLRG</DC:Publisher> </RDF:Assertions>
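Whatever the transfer syntax, the RDF model reduces to statements of the form (resource, property, value). The sketch below holds the example as a list of such triples. The creator and publisher values come from this slide; the pairing of MyRating with A and YourRating with B is an assumption read off the slide 16 diagram, and the helper `values` is hypothetical.

```python
DOC = "http://www2.cs.cornell.edu/CDLRG/doc1"
RATE = "http://www.xxx.org/rate"  # placeholder rating node from slide 16

# Each statement is (resource, property, value).
triples = [
    (DOC, "DC:Creator", "Sandy Payette"),
    (DOC, "DC:Publisher", "Cornell DLRG"),
    (DOC, "QCSchema:Rating", RATE),
    (RATE, "MyRating", "A"),    # assumed pairing of label and value
    (RATE, "YourRating", "B"),  # assumed pairing of label and value
]


def values(subject, prop):
    """Return every value asserted for a property of a resource."""
    return [v for s, p, v in triples if s == subject and p == prop]


print(values(DOC, "DC:Creator"))

# A value can itself be a resource, so statements chain into a graph.
for rating_node in values(DOC, "QCSchema:Rating"):
    print(values(rating_node, "MyRating"))
```

The property names keep their schema prefixes (DC, QCSchema), which is how elements from different vocabularies can be mixed in one description.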

18 RDF: Why is it important? Market demand for metadata deployment Software infrastructure will be ubiquitous (e.g., free in browsers, servers, proxies, editors, etc.) RDF is a general-purpose framework that provides structured, human-readable and machine-understandable metadata for the web Allows stakeholder communities to independently develop, maintain, and reuse vocabularies

19 Access and Use Focus on Structural Metadata

20 Structural Metadata What is it? Data that…. –Defines structure within documents –Aggregates images into meaningful entities –Correlates document components to image files –Organizes a collection of objects Where is it? –ASCII text files in directories –Relational databases –Embedded in documents or surrogates (e.g. SGML)

21 First... A Data Model Data models mirror natural attributes and relationships of real-world objects [Diagram: entities Page, Chapter, Table, Contents, Index, and Front, linked by 0:1 and 1:N relationships]
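One way to picture such a model in code is sketched below. The diagram's exact relationships are not recoverable from the transcript, so the cardinalities here are assumptions: a book has many chapters, a chapter has many pages (1:N), and front matter, table of contents, and index are each optional (0:1). All class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    sequence: int
    image_file: str  # correlates the page with its image file


@dataclass
class Chapter:
    title: str
    pages: List[Page] = field(default_factory=list)  # 1:N


@dataclass
class Book:
    title: str
    chapters: List[Chapter] = field(default_factory=list)  # 1:N
    front: Optional[List[Page]] = None     # 0:1, assumed optional
    contents: Optional[List[Page]] = None  # 0:1, assumed optional
    index: Optional[List[Page]] = None     # 0:1, assumed optional

    def image_files(self):
        """Image files in reading order: front, contents, chapters, index."""
        pages = list(self.front or []) + list(self.contents or [])
        for chapter in self.chapters:
            pages.extend(chapter.pages)
        pages.extend(self.index or [])
        return [page.image_file for page in pages]


# Hypothetical usage with made-up file names.
book = Book(
    title="The Republic",
    front=[Page(1, "0001.tif")],
    chapters=[Chapter("Book I", [Page(2, "0002.tif"), Page(3, "0003.tif")])],
)
print(book.image_files())
```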

22 “Binding” Document Images with SGML <!DOCTYPE EBIND PUBLIC "-//UC Berkeley//DTD ebind.dtd (ElectronicBinding (Ebind))//EN" [ <!ENTITY % birch PUBLIC "-//UC Berkeley//ENTITIES Birch-tree fairy book (Page Images)//EN"> %birch;]> Introductory note Source: http://sunsite.berkeley.edu/Ebind/

23 Finding Aids in SGML Encoded Archival Description (EAD) –SGML mark up of descriptive access tools (inventories, registers, indexes, and guides) –provides more detail about a collection than in typical catalog record –facilitates access - “drill down” into collection –potential international standard –maintained jointly by Library of Congress and Society of American Archivists (SAA) Source: http://www.loc.gov/rr/ead/eadhome.html

24 Preservation and Administration Focus on Administrative Metadata and Persistent Identifiers

25 Administrative Metadata Information for managing images… over time –relocation –migration (new formats) –copyright tracking –archiving of objects and services Where is it? –File headers (to help prevent orphaned images) –External databases (e.g., relational db) –Separate files stored with images

26 Create a Preservation Audit Trail Image File Attributes: formats versions compression Image Attributes: resolution bit depth orientation Process Data: creation date/time equipment used Rights Management Data: Expiration dates Copyright info source statements
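An audit trail of this kind can be sketched as an append-only list of entries, one per state of the image file. The field names mirror the attribute groups on the slide, while the class names and every sample value (dates, formats, equipment) are hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    # Process data
    created: str            # creation date/time, ISO 8601 text
    equipment: str
    # Image file attributes
    file_format: str
    format_version: str
    compression: str
    # Image attributes
    resolution_dpi: int
    bit_depth: int
    orientation: str
    # Rights management data
    copyright_info: str
    source_statement: str
    rights_expire: Optional[str] = None


@dataclass
class AuditTrail:
    image_id: str
    entries: List[AuditEntry] = field(default_factory=list)

    def record(self, entry):
        """Append a new entry; earlier entries are never modified."""
        self.entries.append(entry)

    def current(self):
        return self.entries[-1]


# Hypothetical usage: an initial scan, then a later format migration.
trail = AuditTrail("image-0001")
trail.record(AuditEntry(
    created="1998-10-21T09:00:00", equipment="flatbed scanner",
    file_format="TIFF", format_version="6.0", compression="none",
    resolution_dpi=600, bit_depth=1, orientation="portrait",
    copyright_info="public domain", source_statement="scanned from print original",
))
trail.record(replace(
    trail.current(), created="2003-01-15T12:00:00", equipment="format converter",
    file_format="PNG", format_version="1.2", compression="deflate",
))
print(len(trail.entries), trail.current().file_format)
```

Keeping entries immutable and only appending means the history of a migrated image stays intact alongside its current state.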

27 Persistent Identifiers Globally unique names Persistent … names are permanent, lasting Used in resolution services to locate the object (locations change over time). Unique Identifier: cnri.dlib/april97-payette (Naming Authority: cnri.dlib; Item Name: april97-payette) URL: http://www.somewebserver.org/somedirectory/somefile
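The separation of name from location can be illustrated with a toy resolver. This is not how any real resolution service is implemented; the `Resolver` class, the `split_name` helper, and the second URL are hypothetical, and only the identifier and first URL come from the slide.

```python
def split_name(name):
    """Split 'authority/item' into (naming authority, item name)."""
    authority, sep, item = name.partition("/")
    if not sep or not authority or not item:
        raise ValueError(f"expected 'authority/item', got {name!r}")
    return authority, item


class Resolver:
    """Maps persistent names to their current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        split_name(name)  # validate the name's shape
        self._locations[name] = url

    def resolve(self, name):
        return self._locations[name]


resolver = Resolver()
name = "cnri.dlib/april97-payette"
resolver.register(name, "http://www.somewebserver.org/somedirectory/somefile")
print(split_name(name), resolver.resolve(name))

# The object moves: only the resolver's table changes, the name does not.
resolver.register(name, "http://www.otherserver.org/newdirectory/somefile")  # made-up URL
print(resolver.resolve(name))
```

Citations and catalog records store only the name, so a relocation requires one update in the resolver rather than a fix to every reference.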

28 Identifiers: Current Initiatives IETF Uniform Resource Names (URN) –specification of URN framework –requirements for resolution systems –syntax definition Existing Systems –CNRI’s Handle System –OCLC PURLs –DOI Initiative

29 Further reading IFLA: A Good List - http://www.nlc-bnc.ca/ifla/II/metadata.htm Lynch et al.: CNI Resource Discovery White Paper - http://www.cni.org/projects/nidr/nidr.html Lagoze: Resource Discovery in the Digital Age - http://www.dlib.org/dlib/june97/06lagoze.html Payette: Persistent Identifiers, RLG DigiNews - http://www.rlg.org/preserv/diginews/diginews22.html W3C: Metadata Overview - http://www.w3.org/Metadata

