1 CS 430 / INFO 430 Information Retrieval Lecture 16 Library Catalogs 1.

1 CS 430 / INFO 430 Information Retrieval Lecture 16 Library Catalogs 1

2 Course Administration

3 Information Retrieval with High Recall Full-text Indexing (automated) Text only. Most effective on medium-length documents on related topics. High recall requires tuning system to the specific collection and skilled users. Catalogs and Indexes (created manually) Can be used for all formats of material Requires close quality control of metadata creation High recall requires tuning system to the specific collection and skilled users.

4 Descriptive metadata Information discovery can be very effective when applied to metadata rather than raw information. Allows fielded searching: author = "Goethe". Suitable for non-textual material: type = "picture" and subject = "Ithaca". Can be used with controlled vocabulary: language = "en" (English).
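Fielded searching over metadata can be illustrated with a minimal Python sketch. The records and the function name `fielded_search` are hypothetical and exist only for illustration; the field names follow the slide's examples.

```python
# Hypothetical metadata records; only the field names come from the slide.
RECORDS = [
    {"author": "Goethe", "type": "text", "subject": "Faust", "language": "de"},
    {"author": "Unknown", "type": "picture", "subject": "Ithaca", "language": "en"},
    {"author": "Unknown", "type": "picture", "subject": "Boston", "language": "en"},
]


def fielded_search(records, **criteria):
    """Return the records whose fields equal every given criterion (Boolean AND)."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# author = "Goethe"
by_author = fielded_search(RECORDS, author="Goethe")

# type = "picture" and subject = "Ithaca"
pictures_of_ithaca = fielded_search(RECORDS, type="picture", subject="Ithaca")
```

Because the query runs against fields rather than raw content, the same function works for non-textual items such as pictures.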

5 Examples of Library Catalogs Cornell University Library catalog: Library of Congress, Prints and Photographs:

6 Origins of Library Catalogs Bibliographic Objective: To bring together like items To differentiate among similar ones Sir Anthony Panizzi, Keeper of Books at the British Museum ( ). His Ninety-One Rules (1841) were the basis of modern catalog rules.

7 Origins of Library Catalogs Information Discovery: to enable a person to find a book of which either the author, title or subject is known to show what the library has by a given author, on a given subject, or in a given kind of literature to assist in the choice of a book as to its edition (bibliographically) or to its character (literary or topical). Charles Ammi Cutter Librarian of the Boston Athenaeum Rules for a Dictionary Catalog, 1874

8 Origins of Library Catalogs Classification: Division of subject matter into a hierarchy. Typically used in libraries to provide a subject-based order for shelving books. Melvil Dewey, Acting Librarian of Amherst College (1874). The Dewey Decimal system of book classification uses the numbers 000 to 999 to cover the general fields of knowledge and decimals to fit special subjects.
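The hierarchy in a Dewey number can be sketched in Python: the first digit selects one of ten main classes, and the decimals refine the subject. The class labels below are the ones commonly given for modern editions of the scheme and are an assumption, not taken from the slides; `main_class` is a hypothetical helper name.

```python
# Ten main classes keyed by the first digit of the three-digit number.
# Labels are the commonly cited modern ones (an assumption, not from the slides).
DDC_MAIN_CLASSES = {
    0: "Computer science, information and general works",
    1: "Philosophy and psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts and recreation",
    8: "Literature",
    9: "History and geography",
}


def main_class(number: str) -> str:
    """Return the main class for a Dewey number such as '027.7'."""
    whole = number.split(".")[0]
    if len(whole) != 3 or not whole.isdigit():
        raise ValueError(f"not a Dewey number: {number!r}")
    return DDC_MAIN_CLASSES[int(whole[0])]


# 027.7 is the Dewey number given later in the lecture for the example book.
example = main_class("027.7")
```

The number is handled as a string so that leading zeros such as `027` are preserved.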

9 Technology Materials to be catalogued: Originally books Extended to serials, maps, music, etc., but concepts still rely heavily on experience with books Form of catalog: Entries in books (Panizzi) Index cards (Cutter) Online databases (Kilgour)

10 Catalogs as Investments Costs: Conventional Catalog Records are created by skilled librarians. (cost estimate $100 per record). OCLC's catalog has 52 million records. Total investment is several billion dollars. Cataloguing Standards: Enable libraries to share records Combine records of the past with records created today Allow readers and librarians to move between libraries
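The "several billion dollars" figure follows from the two numbers on the slide. This is a quick check in Python, using only the slide's estimates (52 million records, $100 per record):

```python
# Figures quoted on the slide; both are rough estimates.
records = 52_000_000
cost_per_record_usd = 100

total_usd = records * cost_per_record_usd
summary = f"${total_usd / 1e9:.1f} billion"
print(summary)  # $5.2 billion
```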

11 Shared Cataloguing: OCLC OCLC -- Large centralized transaction processing database system. When a library catalogs a book it deposits a MARC record in OCLC. Other libraries can copy the record, which saves duplication of cataloguing and builds a database of holdings. OCLC database has 52 million records, serves 47,000 libraries. When developed in 1967, OCLC was a pioneering computer system (had to develop own network, computer terminal, etc.)

12 Layers of a Library Catalog Encoding Rules that define how catalog records are encoded in a computer system, e.g., XML mark-up. Syntax Rules that define the fields and subfields, whether repeated, optional, etc. Semantics Rules that define the values of the field and subfield, with instructions for cataloguers of what data to include and how to decide when choices have to be made.

13 Library Cataloging using the Anglo American Cataloguing Rules Anglo American Cataloguing Rules (AACR2) Rules for each category of material, e.g., monographs (books). Specify what fields should be used and what data to include in each field. Text strings were originally intended for printed catalog cards. MARC format An exchange format for catalog records. Includes encoding rules and syntax specification. "MARC Catalog" Catalog in MARC format, where content of each field follows AACR2.

14 Anglo American Cataloguing Rules The Anglo American Cataloguing Rules (AACR) provide detailed rules for: the choice of fields; the content of the data that goes into each field; the syntax of the data that goes into each field. The rules are an excellent example of technical writing, precise but clear. For an example, see:

15 Example: Controlled Vocabulary
Level 1: Arts
Level 2: Architecture; Art therapy; Careers*; Computers in art; Dance; Drama/dramatics; Film; History*; Informal education*; Instructional issues*; Music; Photography; Popular culture*; Process skills*; Technology*; Theater arts; Visual arts
Terms marked * can appear in other hierarchies.
Source: presentation by Diane Hillmann, 2004
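A controlled vocabulary like the one above can be enforced at data entry. The Python sketch below checks a Level 2 term against its Level 1 heading. The data structure and the names `VOCABULARY`, `SHARED_TERMS`, and `is_valid_term` are hypothetical, and the term list reflects one reading of the slide.

```python
# Level 1 heading -> allowed Level 2 terms (one reading of the slide).
VOCABULARY = {
    "Arts": {
        "Architecture", "Art therapy", "Careers", "Computers in art", "Dance",
        "Drama/dramatics", "Film", "History", "Informal education",
        "Instructional issues", "Music", "Photography", "Popular culture",
        "Process skills", "Technology", "Theater arts", "Visual arts",
    },
}

# Terms the slide marks with *: they may also appear under other headings.
SHARED_TERMS = {
    "Careers", "History", "Informal education", "Instructional issues",
    "Popular culture", "Process skills", "Technology",
}


def is_valid_term(level1: str, level2: str) -> bool:
    """True if level2 is an allowed term under the level1 heading."""
    return level2 in VOCABULARY.get(level1, set())


ok = is_valid_term("Arts", "Dance")
rejected = is_valid_term("Arts", "Sculpture")
```

Rejecting free-text values at this point is what makes later fielded searches on the subject reliable.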

16 MARC Format The MARC format was developed in the late 1960s as a tagging scheme for exchanging catalog records on magnetic tape. It remains the standard way to represent such data. At present, MARC is slowly being converted to modern computing formats, e.g., Unicode, XML.

17 MARC: Monograph catalog record Citation Caroline R. Arms, editor, Campus strategies for libraries and electronic information. Bedford, MA: Digital Press, 1990.

18 MARC fields
tag value
r Z675.U5C / Campus strategies for libraries and electronic information/Caroline Arms, editor. (title statement)
260 {Bedford, Mass.} : Digital Press, c1990. (publisher)
300 xi, 404 p. : ill. ; 24 cm. (collation)
440 EDUCOM strategies series on information technology (series title)
504 Includes bibliographical references (p. {373}-381).
020 ISBN X : $34.95

19 MARC fields (continued)
650 Academic libraries--United States--Automation. (subject heading)
650 Libraries and electronic publishing--United States.
650 Library information networks--United States.
650 Information technology--United States.
700 Arms, Caroline R. (Caroline Ruth)
040 DLC DLC DLC
043 n-us CIP ver. br02 to SL APIF/MIG

20 MARC Encoding tag: 260 subfield a:{Bedford, Mass.} : subfield b:Digital Press, subfield c:c1990. MARC encoding: &2600#abc#{Bedford, Mass.} :#Digital Press,#c1990.% [Definitely not a modern encoding!] Note that the content is designed to be part of a printed catalog record and is not in a convenient format for computer manipulation.
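To make the layout of the encoded string concrete, here is a minimal Python sketch that splits the slide's illustrative string back into a tag and subfields. It assumes the slide's simplified notation (`&` starts the field, `#` separates parts, `%` ends the field), not the real tape format. The slide does not explain the character that follows the tag, so the sketch keeps it uninterpreted. `parse_slide_field` is a hypothetical name.

```python
def parse_slide_field(encoded: str) -> dict:
    """Split the slide's illustrative field string into tag and subfields."""
    if not (encoded.startswith("&") and encoded.endswith("%")):
        raise ValueError("expected a field of the form '&...%'")
    # Parts: tag plus trailing character, subfield codes, then one value per code.
    head, codes, *values = encoded[1:-1].split("#")
    if len(codes) != len(values):
        raise ValueError("number of subfield codes and values differ")
    return {
        "tag": head[:3],
        "extra": head[3:],  # meaning not stated on the slide; kept as-is
        "subfields": list(zip(codes, values)),
    }


field = parse_slide_field("&2600#abc#{Bedford, Mass.} :#Digital Press,#c1990.%")

# Joining the values rebuilds the printed-card text, punctuation included.
display = " ".join(value for _, value in field["subfields"])
```

The punctuation (` :` and `,`) lives inside the subfield values, which is why the slide calls the content inconvenient for computer manipulation: a program that wants the bare publisher name must strip it.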

21 Name authority files An Authority File "brings together like items and differentiates among similar ones." Caroline R. Arms or Caroline Ruth Arms? Which William Phillips of Cardiff? Mark Twain or Samuel Clemens? Epithets: of Cardiff, doctor. Dates: flourished 1860, circa

22 Name authority: example
LC Control Number:n
HEADING: Arms, Caroline R. (Caroline Ruth)
cz n n|acannaab |a aaa c
010__ |a n __ |a (DLC)n __ |a InU |c DLC |d DLC |a Arms, Caroline R. |q (Caroline Ruth) |w nna |a Arms, Caroline Ruth |a Arms, C. R. |q (Caroline Ruth)
670 __ |a Arms, W.Y. Report on the performance problems of the RLIN computer system, 1982: |b t.p. (Caroline R. Arms)
670 __ |a LC data base, 8/24/87 |b (hdg.: Arms, Caroline Ruth; usage: Caroline R. Arms, C. R. Arms)
670 __ |a Campus networking strategies, 1988: |b CIP t.p. (Caroline Arms)
670 __ |a Phone call to pub., 2/10/88 |b (Caroline Ruth Arms; studied at Oxford)
670 __ |a Campus strategies for libraries and electronic information, c1990: |b CIP t.p. (Caroline Arms) data sheet (b )
953 __ |a bz46 |b bd24
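An authority record of this kind is, in effect, a mapping from variant forms of a name to one authorized heading. A minimal Python sketch follows, using the heading and variants from the record above. The function names and the normalization rule (lowercase, collapsed whitespace) are assumptions made for illustration.

```python
AUTHORIZED_HEADING = "Arms, Caroline R. (Caroline Ruth)"
VARIANT_FORMS = ["Arms, Caroline Ruth", "Arms, C. R. (Caroline Ruth)"]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def build_authority_index(heading, variants):
    """Map the heading and each variant form to the authorized heading."""
    index = {normalize(heading): heading}
    for variant in variants:
        index[normalize(variant)] = heading
    return index


def authorized_form(name, index):
    """Return the authorized heading for a name, or None if it is unknown."""
    return index.get(normalize(name))


AUTHORITY_INDEX = build_authority_index(AUTHORIZED_HEADING, VARIANT_FORMS)
resolved = authorized_form("arms,  caroline ruth", AUTHORITY_INDEX)
```

Every catalog record then stores only the authorized heading, so a search on any variant retrieves the same set of works.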

23 Subject information Library of Congress Subject Headings: Academic libraries--United States--Automation. Hierarchical classification: Library of Congress call number: Z675.U5C16; Dewey Decimal Classification: 027.7. Creation and maintenance of lists of subject headings and classifications is a never-ending task.

24 Online public access catalog (OPAC) History: First stage Library mounts its MARC records on a central computer Provides a simple terminal interface and dedicated terminals Boolean search -- fielded searching [Most university libraries reached this stage about 1990] History: Second stage Library connects computer to a campus network and Internet Converts card catalog records to MARC (retrospective conversion)
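Boolean search over fielded records is typically built on an inverted index, in which each (field, word) pair points to the set of matching records. The Python sketch below is one such arrangement under stated assumptions: the first two titles appear earlier in the lecture, the third record is invented for illustration, and all names are hypothetical.

```python
from collections import defaultdict

# Records 1 and 2 use titles cited in the lecture; record 3 is invented.
CATALOG = {
    1: {"title": "Campus strategies for libraries and electronic information",
        "author": "Arms"},
    2: {"title": "Campus networking strategies", "author": "Arms"},
    3: {"title": "Library automation handbook", "author": "Smith"},
}


def build_fielded_index(catalog):
    """Map each (field, word) pair to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, record in catalog.items():
        for field, value in record.items():
            for word in value.lower().split():
                index[(field, word)].add(record_id)
    return index


def lookup(index, field, word):
    """Return the ids of records whose field contains the word."""
    return index.get((field, word.lower()), set())


INDEX = build_fielded_index(CATALOG)

# title = campus AND title = libraries
and_hits = lookup(INDEX, "title", "campus") & lookup(INDEX, "title", "libraries")
# author = Arms OR author = Smith
or_hits = lookup(INDEX, "author", "Arms") | lookup(INDEX, "author", "Smith")
# title = campus NOT title = networking
not_hits = lookup(INDEX, "title", "campus") - lookup(INDEX, "title", "networking")
```

Each Boolean operator becomes a set operation, which is part of why this style of search was practical on the central computers of the period.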

25 Library information systems When the catalog is online... Add other collections and services: Secondary information (Inspec, Medline, Chemical Abstracts) Reference works (dictionaries, encyclopedias) Improve user interface Add full text searching Add web interface Add gateway to off-campus information sources: Scientific journals Databases (census, genome)

26 Library management systems A library management system, sometimes called an integrated library system, integrates the internal processes of a library, e.g., acquisitions, cataloguing, binding, circulation, etc. It usually contains an online public access catalog, but does not provide integrated services to users. Library management systems are produced by small companies who lack the capital and technical expertise to develop modern digital libraries.

27 Notes on MARC A great achievement: Developed in 1960s Magnetic tape exchange format for printing catalog records The dawn of computing: mixed upper and lower case variable length fields, repeated fields non-Roman scripts 100(?) million records with standard content and format Thousands of trained librarians (millions?)

28 Notes on MARC A great problem: Not designed for computer algorithms. One record per item (poor links between records). Tied to traditional materials and traditional practices. Not Unicode. 100 million records at $100 per record = $10 billion. A classic legacy system!