1 CS 430: Information Discovery Lecture 6 Descriptive Metadata 2 Library Catalogs Dublin Core.

Presentation transcript:

1 CS 430: Information Discovery Lecture 6 Descriptive Metadata 2 Library Catalogs Dublin Core

2 Course Administration Assignment 1 Submission instructions will be posted soon. You will need a csuglab account. If you do not have such an account, go to Upson 311. Programming in Perl First class on Perl is Wednesday night, Hollister 110, 7:30 to 9:00 p.m.

3 Course Administration New Course LAW 410 Limits on and Protection of Creative Expression - Copyright Law and Its Close Neighbors This course, offered during fall term 2001, provides an introduction to copyright law and closely related legal regimes for non-law students.

4 Example: Monograph catalog record Citation Caroline R. Arms, editor, Campus strategies for libraries and electronic information. Bedford, MA: Digital Press, 1990.

5 MARC fields
tag   value
      r Z675.U5C /
      Campus strategies for libraries and electronic information/Caroline Arms, editor.   (title statement)
260   {Bedford, Mass.} : Digital Press, c1990.   (publisher)
300   xi, 404 p. : ill. ; 24 cm.   (collation)
440   EDUCOM strategies series on information technology   (series title)
504   Includes bibliographical references (p. {373}-381).
020   ISBN X : $34.95

6 MARC fields (continued)
650   Academic libraries--United States--Automation.   (subject heading)
650   Libraries and electronic publishing--United States.
650   Library information networks--United States.
650   Information technology--United States.
700   Arms, Caroline R. (Caroline Ruth)
040   DLC DLC DLC
043   n-us
CIP ver. br02 to SL APIF/MIG

7 MARC Encoding: For Print and Computer Processing
tag: 260
  subfield a: {Bedford, Mass.} :
  subfield b: Digital Press,
  subfield c: c1990.
MARC encoding: &2600#abc#{Bedford, Mass.} :#Digital Press,#c1990.%
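The encoding on this slide can be reproduced in a few lines of Python. This is a minimal sketch of the slide's simplified notation only, not of the real MARC exchange format. The function name `encode_field` is hypothetical, and reading the `0` after the tag as an indicator character is an assumption, since the slide does not explain it.

```python
def encode_field(tag, indicator, subfields):
    """Encode one field in the simplified notation shown on the slide.

    subfields is a list of (code, value) pairs.
    Assumed layout: & tag indicator # subfield-codes # value # value ... %
    """
    codes = "".join(code for code, _ in subfields)
    values = "#".join(value for _, value in subfields)
    return f"&{tag}{indicator}#{codes}#{values}%"


field_260 = encode_field(
    "260",
    "0",  # assumption: the slide does not say what this character means
    [("a", "{Bedford, Mass.} :"), ("b", "Digital Press,"), ("c", "c1990.")],
)
print(field_260)
```

Note that the punctuation (" :", ",") is stored inside the subfield values, which is one sign that the format was designed for printing catalog cards rather than for computation.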

8 Name authority files
Caroline R. Arms or Caroline Ruth Arms?
Which William Phillips of Cardiff?
Mark Twain or Samuel Clemens?
Epithets: of Cardiff, doctor
Dates: flourished 1860, circa
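A name authority file can be pictured as a mapping from every variant form of a name to one authorized heading. The sketch below is illustrative only: the table `AUTHORITY` and the function `authorized_heading` are hypothetical, the Arms heading is taken from the 700 field on the earlier slide, and the form chosen for Twain is an assumption.

```python
# Hypothetical authority file: each variant form points to one authorized heading.
AUTHORITY = {
    "Caroline R. Arms": "Arms, Caroline R. (Caroline Ruth)",
    "Caroline Ruth Arms": "Arms, Caroline R. (Caroline Ruth)",
    "Mark Twain": "Twain, Mark",        # assumed form of the heading
    "Samuel Clemens": "Twain, Mark",    # pseudonym and real name collocate
}


def authorized_heading(name):
    """Return the authorized heading for a name, or the name itself if unknown."""
    return AUTHORITY.get(name, name)


print(authorized_heading("Caroline Ruth Arms"))
print(authorized_heading("Samuel Clemens"))
```

A lookup table like this cannot answer "Which William Phillips of Cardiff?": two different people with the same name need distinguishing data such as epithets or dates in the heading itself.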

9 Shared cataloguing
OCLC: a large centralized transaction-processing database system
When a library catalogs a book, it deposits a MARC record in OCLC
Other libraries can copy the record:
  saves duplication of cataloguing
  builds a database of holdings
The OCLC database has 43 million records

10 Subject information
Library of Congress Subject Headings:
  Academic libraries--United States--Automation
Hierarchical classification:
  Library of Congress call number: Z675.U5C16
  Dewey Decimal Classification: 027.7
Creation and maintenance of lists of subject headings and classifications is a never-ending task.

11 Notes on MARC
A great achievement:
  Developed in the 1960s
  Magnetic tape exchange format for printing catalog records
  The dawn of computing:
    mixed upper and lower case
    variable length fields, repeated fields
    non-Roman scripts
  100(?) million records with standard content and format
  Thousands of trained librarians (millions?)

12 Notes on MARC
A great problem:
  Not designed for computer algorithms
  One record per item (poor links between records)
  Tied to traditional materials and traditional practices
  Not Unicode
  100 million records at $… : $10 billion
A classic legacy system!

13 Cataloguing Objectives
Functions of catalogs:
  finding
  collocating (recall and precision)
  choosing
  acquiring
  navigating
... among items in a bibliographic universe
Compare use cases in software design.

14 IFLA Model: Work
A work is the underlying abstraction, e.g.,
  The Iliad
  The Computer Science departmental web site
  Beethoven's Fifth Symphony
  Unix operating system
  The 1996 U.S. census
This is roughly equivalent to the concept of "literary work" used in copyright law.

15 IFLA Model: Expression
A work is realized through an expression, e.g.,
  The Iliad has oral expressions and written expressions.
  A musical work has score and performance(s).
  Software has source code and machine code.
Many works have only a single expression, e.g., a web page or a book.

16 IFLA Model: Manifestation
An expression is given form in one or more manifestations, e.g.,
  The text of The Iliad has been manifest in numerous manuscripts and printed books.
  A musical performance can be distributed on CD, or broadcast on television.
  Software is manifest as files, which may be stored or transmitted in any digital medium.

17 IFLA Model: Item
When many copies are made of a manifestation, each is a separate item, e.g.,
  a specific copy of a book
  a computer file
[Works, expressions, manifestations and items are explored in CS 502, Computing Methods of Digital Libraries.]
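The four levels of the IFLA model nest inside each other, which can be sketched as a small containment hierarchy. The class and field names below are hypothetical, chosen only to mirror the slides, and the Iliad data is taken from the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    label: str  # one concrete copy


@dataclass
class Manifestation:
    form: str  # the physical or digital form
    items: list = field(default_factory=list)


@dataclass
class Expression:
    kind: str  # how the work is realized
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str  # the underlying abstraction
    expressions: list = field(default_factory=list)


iliad = Work("The Iliad", [
    Expression("oral"),
    Expression("written", [
        Manifestation("manuscript"),
        Manifestation("printed book", [Item("a specific copy of a book")]),
    ]),
])


def count_items(work):
    """Count every item reachable from a work through its expressions and manifestations."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )


print(count_items(iliad))
```

A MARC record, by contrast, describes roughly one of these levels per record, which is why links between editions and copies of the same work are weak.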

18 Dublin Core
Simple set of metadata elements for online information
  15 basic elements
  intended for all types and genres of material
  all elements optional
  all elements repeatable
Developed by an international group chaired by Stuart Weibel since …
(Diane Hillmann and Carl Lagoze of Cornell are very active in this group.)

20 Dublin Core
publisher: OCLC
creator: Weibel, Stuart L.
creator: Miller, Eric J.
title: Dublin Core Reference Page
date:
format: text/html (MIME type)
language: en (English)
identifier:

21 Dublin Core with Meta Tags
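One common way to carry a Dublin Core record inside a web page is a sequence of HTML `meta` tags. The sketch below renders the record from the previous slide in that style. The function name `to_meta_tags` is hypothetical, the `DC.` prefix follows the notation used on the qualifier slides, and the elements whose values are missing from the record (date, identifier) are left out rather than guessed.

```python
from html import escape

# (element, value) pairs; a list keeps repeated elements such as creator.
record = [
    ("publisher", "OCLC"),
    ("creator", "Weibel, Stuart L."),
    ("creator", "Miller, Eric J."),
    ("title", "Dublin Core Reference Page"),
    ("format", "text/html"),
    ("language", "en"),
]


def to_meta_tags(pairs):
    """Render each (element, value) pair as one HTML meta tag."""
    return [
        f'<meta name="DC.{element.capitalize()}" content="{escape(value, quote=True)}">'
        for element, value in pairs
    ]


for tag in to_meta_tags(record):
    print(tag)
```

Because every element is optional and repeatable, a list of pairs is a more faithful container than a dictionary with one value per key.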

22 Dublin Core elements
1. Title: The name given to the resource by the creator or publisher.
2. Creator: The person or organization primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents; artists, photographers, or illustrators in the case of visual resources.
3. Subject: The topic of the resource. Typically, subject will be expressed as keywords or phrases that describe the subject or content of the resource. The use of controlled vocabularies and formal classification schemes is encouraged.

23 Dublin Core elements
4. Description: A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources.
5. Publisher: The entity responsible for making the resource available in its present form, such as a publishing house, a university department, or a corporate entity.
6. Contributor: A person or organization not specified in a creator element who has made significant intellectual contributions to the resource but whose contribution is secondary to any person or organization specified in a creator element (for example, editor, transcriber, and illustrator).

24 Dublin Core elements
7. Date: A date associated with the creation or availability of the resource.
8. Type: The category of the resource, such as home page, novel, poem, working paper, preprint, technical report, essay, dictionary.
9. Format: The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource.
10. Identifier: A string or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs.

25 Dublin Core elements
11. Source: Information about a second resource from which the present resource is derived.
12. Language: The language of the intellectual content of the resource.
13. Relation: An identifier of a second resource and its relationship to the present resource. This element permits links between related resources and resource descriptions to be indicated. Examples include an edition of a work (IsVersionOf), or a chapter of a book (IsPartOf).

26 Dublin Core elements
14. Coverage: The spatial locations and temporal durations characteristic of the resource.
15. Rights: A rights management statement, an identifier that links to a rights management statement, or an identifier that links to a service providing information about rights management for the resource.
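Since the element set is closed at fifteen names, a record can be checked mechanically. The sketch below assumes a record is a dictionary from element name to a list of values (to allow repetition); the names `DC_ELEMENTS` and `unknown_elements` are hypothetical.

```python
# The fifteen elements listed on the slides above, in lower case.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_elements(record):
    """Return, in sorted order, the keys of a record that are not Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


record = {
    "title": ["Dublin Core Reference Page"],
    "creator": ["Weibel, Stuart L.", "Miller, Eric J."],  # repeated element
    "author": ["Weibel, Stuart L."],  # not a Dublin Core element name
}
print(unknown_elements(record))
```

No check for missing elements is needed, because all fifteen are optional.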

27 Qualifiers
Element qualifier
Example: Date
  DC.Date -> Created:
  DC.Date -> Issued:
  DC.Date -> Available: /
  DC.Date -> Valid: /

28 Qualifiers
Value qualifiers
Example: Subject
  DC.Subject -> DDC:
  DC.Subject -> LCSH: Digital libraries-United States
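The two kinds of qualifier do different jobs: an element qualifier narrows the meaning of the element, while a value qualifier names the scheme the value is drawn from. A minimal sketch follows, with hypothetical class and field names. The LCSH value comes from the slide; the date value is invented for illustration because the slide's own values did not survive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    element: str
    value: str
    refinement: Optional[str] = None  # element qualifier, e.g. Created
    scheme: Optional[str] = None      # value qualifier, e.g. LCSH

    def render(self):
        """Render in the 'DC.Element -> Qualifier: value' notation of the slides."""
        qualifier = self.refinement or self.scheme
        label = f"DC.{self.element}"
        if qualifier:
            label += f" -> {qualifier}"
        return f"{label}: {self.value}"


subject = Statement("Subject", "Digital libraries-United States", scheme="LCSH")
created = Statement("Date", "2000-01-15", refinement="Created")  # hypothetical date

print(subject.render())
print(created.render())
```

A client that does not understand a qualifier can drop it and still hold a valid unqualified statement, here simply `DC.Subject` or `DC.Date`.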

30 Dublin Core with qualifiers
Digital Libraries and the Problem of Purpose
David M. Levy
Corporation for National Research Initiatives
January 2000
article
/january2000-levy
English
Copyright (c) David M. Levy

33 Limits of Dublin Core
Complex objects:
  Article within a journal
  A thumbnail of another image
  The March 28 final edition of a newspaper
Complete object, sub-objects, metadata records

34 Flat v. linked records
Flat record: all information about an item is held in a single Dublin Core record, including information about related items
  convenient for access and preservation
  information is repeated -- maintenance problem
Linked record: related information is held in separate records with a link from the item record
  less convenient for access and preservation
  information is stored once
Compare with normal forms in relational databases
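The trade-off can be made concrete with two representations of the same data. The sketch below is illustrative only: the key names, the `flatten` helper, and the second article are hypothetical; the first article and the journal come from the slides.

```python
# Linked: journal information is stored once and referenced by key.
journals = {
    "dlib": {
        "title": "D-Lib Magazine",
        "publisher": "Corporation for National Research Initiatives",
    },
}
linked_articles = [
    {"title": "Digital Libraries and the Problem of Purpose", "is_part_of": "dlib"},
    {"title": "A second article (hypothetical)", "is_part_of": "dlib"},
]


def flatten(article, journals):
    """Build a self-contained flat record by copying the journal data into it."""
    journal = journals[article["is_part_of"]]
    return {
        "title": article["title"],
        "journal_title": journal["title"],
        "journal_publisher": journal["publisher"],
    }


# Flat: each record stands alone, but the journal data is repeated in each one.
flat_articles = [flatten(article, journals) for article in linked_articles]

copies = sum(1 for record in flat_articles if record["journal_title"] == "D-Lib Magazine")
print(copies)
```

A correction to the journal's details touches one place in the linked form and every article record in the flat form, which is the maintenance problem the slide refers to.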

35 Dublin Core with flat record extension Continuation D-Lib Magazine

36 Events
Version 1, new material, Version 2
Should Version 2 have its own record, or should extra information be added to the Version 1 record?
How are these represented in Dublin Core?

37 Minimalist versus structuralist
Minimalist:
  15 elements, no qualifiers, suitable for non-professionals
  encourages creators to provide metadata
Structuralist:
  15 elements, qualifiers, RDF, detailed coding rules
  will require trained metadata experts
[For an example of how complex Dublin Core can become, see the source of: htm#]

38 Dublin Core in many languages See: Thomas Baker, Languages for Dublin Core, D-Lib Magazine December 1998,

39 Dublin Core: Personal Opinion
Dublin Core is a simple way to describe digital content that:
  is a single, self-contained object ("document-like")
  is static with time
  has few relationships
Some web sites satisfy these criteria.
Dublin Core is not suitable for digital content that:
  is heavily structured
  changes dynamically