OCLC Online Computer Library Center Virtual International Authority File Ed O’Neill Prepared with the assistance of Rick Bennett Australian Committee.

Presentation transcript:

OCLC Online Computer Library Center
Virtual International Authority File
Ed O’Neill
Prepared with the assistance of Rick Bennett
Australian Committee on Cataloguing Seminar
Sydney, Australia, January 31, 2005

Background
The IFLA Section on Cataloguing recognized the need for an international authority file:
– Where authority records from the world’s national bibliographic agencies could be linked
– Would be available via the Internet
– Would be a practical expansion of the concept of universal bibliographic control
– Would build on the work done by each national bibliographic agency
– Allowing national or regional variations in authorized form to co-exist
– Supporting worldwide users’ needs for variations in preferred language, script, and spelling

Background
– The VIAF could be one of the basic building blocks for a “semantic web” when combined with other controlled vocabularies and authority files from such sources as abstracting and indexing services, archives, museums, publishers, etc.
– Libraries now have an opportunity to make a great contribution to this future and should help make this vision a reality
– The VIAF should be made freely available on the Web to users worldwide

Joint Project
A project to test the concept of a VIAF is being jointly undertaken by:
– Die Deutsche Bibliothek (DDB)
– The Library of Congress (LC)
– OCLC Online Computer Library Center (OCLC)

VIAF Formally Approved in Berlin
Beacher Wiggins, Barbara Tillett, Christel Hengel-Dittrich, Renate Gömpel, Elisabeth Niggemann, Jay Jordan, Ed O’Neill

Project Goal
Demonstrate the feasibility of VIAF by linking the personal name authority records between:
– Personennormdatei (PND)
– Library of Congress Name Authority File (LCNAF)

What is the VIAF?
– The VIAF will be a file of metadata to link users from records in one national bibliographic agency’s personal name authority file to matching records in other national authority files
– The VIAF will provide for web access through a specially designed user interface
– The VIAF will support multilingual and multi-script capability
– The VIAF will use Open Archives Initiative (OAI) protocols to harvest metadata from the agencies’ authority files, which would then be added to the shared servers to keep the file updated
– The system is being designed so that any number of authority files can be linked

The Problem
In the LCNAF and PND authority files:
– A person may have the same established form in both authority files
– Different people may be assigned the same established form
– Different forms of the name may be established for the same person
– A particular person may not be established in both files

Two People – One Name
Adams, Mike
– In the PND, the name is established for a golfer
– In LCNAF, the name is established for an author of a Beatles collector's guide

Two Names – One Person
LC: Morel, Pierre
PND: Morellus, Petrus

Brief LC Authority
010 n DLC $c DLC $d DLC Larson, Jack. 670 Thomson, V. The cat, c1982: $b t.p. (Jack Larson)

Information in Bibliographic Records
From the bibliographic records we gain significant additional information about Jack Larson:
– He is a lyricist
– His primary subject area is music
– He was published in the 80s and 90s by G. Schirmer and Belwin Mills in New York
– Worked with Virgil Thomson and Gerhard Samuel
– Jack Larson is the only name he has used on his publications
– Etc.

Project Phases
Phase 1: Build enhanced authority files for both PND and LC personal names
Phase 2: Match PND and LC enhanced authority records to create the initial version of the VIAF
Phase 3: Build OAI Server
Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols
Phase 5: Build end user interface with Unicode displays

Phase 1: Building the Enhanced Authority Files
– Authority records generally include very few, if any, details about the person and/or their publishing history
– The information is rarely sufficient to determine if two different authority records represent the same person
– To provide the additional information needed to match authority records for the same author unambiguously, information from bibliographic records is used to enhance the authority record

Enhancing the Authorities
Diagram: Bibliographic Record, Derived Authority Record, Enhanced Authority

Mining the Bibliographic Record
LDR 00826ccm a ocm s1982 nyuuua n eng 10 $a $a DLC $c DLC 19 $a $c $ $a $b G. Schirmer 45 2 $b d $b d $b va01 $b ve01 $a ka $a M $b.T $a Thomson, Virgil, $d $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson]. 260 $a New York : $b G. Schirmer, $c c $a 1 score (11 p.) ; $c 31 cm. 500 $a For soprano, baritone, and piano $a Vocal duets with piano $a Larson, Jack $x Musical settings $a Larson, Jack.
Elements mined from the record: Authors, LC Control Number, LC Classification, Title, Material Type, Publisher, Place of Publication, Language, Date of Publication, Usage

Derived Authority Record
00525nz n xlc OCoLC nneanz||abbn n and d 4 40 $a OCoLC $b eng $c OCoLC $f viaf $a Larson, Jack $a $a the cat $b duet for soprano and baritone $a g schirmer $a nyu $a jack larson $a eng $a $a 198x $a cm $a thomson, virgil $d 1896
– All text is normalized
– Subjects are grouped into broad subject areas
– Material type is coded
– Publication date is by decade
– Coauthor
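The normalization and decade coding noted on this slide can be sketched in Python as follows. This is a minimal illustration rather than the project's actual rules: the function names `normalize_text` and `decade_code` are hypothetical, and the normalization steps (lowercasing, dropping diacritics and most punctuation) are assumed from sample values such as "g schirmer" and "198x".

```python
import re
import unicodedata


def normalize_text(value: str) -> str:
    """Lowercase, strip diacritics and punctuation, collapse whitespace (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep letters, digits, whitespace and commas; everything else becomes a space.
    cleaned = re.sub(r"[^\w\s,]", " ", without_marks.lower())
    # Collapse runs of whitespace and trim stray spaces or commas at the ends.
    return re.sub(r"\s+", " ", cleaned).strip(" ,")


def decade_code(year: int) -> str:
    """Reduce a publication year to its decade, e.g. 1982 becomes '198x'."""
    return f"{year // 10}x"


print(normalize_text("G. Schirmer"))
print(normalize_text("Thomson, Virgil,"))
print(decade_code(1982))
```

Commas are kept here only so that inverted names such as "thomson, virgil" stay in the form shown on the slide.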

90x Control numbers
901 ISBN: $a Numeric portion of ISBN
902 ISSN: $a Numeric portion of ISSN
903 LCCN: $a Numeric portion of LCCN

91x Title fields
910 Title from 245, subfields a & b
911 Abbreviated title from 210, subfields a & b
913 Uniform title from 240, subfields a & b
914 Translated title from 242, subfields a & b
915 Collective uniform title from 243, all subfields
916 Variant title from 246, subfields a & b
917 Uniform title extracted from Name/Title authorities, field 100 $t

92x Publisher fields
920 Publisher number (publisher number from ISBN)
921 Publisher name (publisher name from the 260 $b or 533 $c)
922 Place of publication (country of publication code from 008 field)

93x Usage
930 Name usage (form of name found in the statement of responsibility, 245 subfield $c)

94x Attributes
940 Language (language code from the 008 or 041 subfield $a)
941 Author's role (relator code from 700, subfields $e and/or $4)
942 North American Title Count subject (NATC survey line number)
943 Decade of publication
944 Format (type and bib level)
945 Broader subject area

95x Joint Authors
950 Personal authors (from either the 100 or 700 fields)
951 Corporate authors

96x Names as Subjects
960 Name as subject

99x Number of Records
999 Number of associated bibliographic records
– $a Total number of associated bibliographic records
– $b Bibliographic record control number
– $2 Source of bibliographic record
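As a rough illustration of how the 9xx scheme above could be populated, the Python sketch below maps a simplified bibliographic record to (tag, value) pairs. The dictionary keys (`title`, `publisher`, `country`, `usage`, `language`, `year`, `coauthors`) and the function name `derive_fields` are hypothetical stand-ins for the MARC fields named on the slides; values are assumed to be normalized already, and only a subset of tags is covered.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from simplified record keys to the derived 9xx tags.
SIMPLE_RULES = [
    ("910", "title"),      # title from 245 $a and $b
    ("921", "publisher"),  # publisher name from 260 $b
    ("922", "country"),    # country of publication code from 008
    ("930", "usage"),      # form of name from 245 $c
    ("940", "language"),   # language code from 008 or 041 $a
]


def derive_fields(bib: Dict) -> List[Tuple[str, str]]:
    """Build (tag, value) pairs for a derived authority record."""
    fields = [(tag, bib[key]) for tag, key in SIMPLE_RULES if bib.get(key)]
    if bib.get("year"):
        # 943: decade of publication, e.g. 1982 becomes "198x".
        fields.append(("943", f"{bib['year'] // 10}x"))
    for name in bib.get("coauthors", []):
        # 950: personal joint authors from the 100 or 700 fields.
        fields.append(("950", name))
    return fields


record = {
    "title": "the cat",
    "publisher": "g schirmer",
    "country": "nyu",
    "usage": "jack larson",
    "language": "eng",
    "year": 1982,
    "coauthors": ["thomson, virgil"],
}
for tag, value in derive_fields(record):
    print(tag, value)
```

The sample values are taken from the Jack Larson example; a real implementation would read MARC records directly rather than a flat dictionary.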

Enhanced Authority Record
00824nz n oca n| acannaab| |n aaa ||| 3 10 $a n $a DLC $c DLC $d DLC $a Larson, Jack $a Thomson, V. The cat, c1982: $b t.p. (Jack Larson) $a $ $a $ $a the cat $b duet for soprano and baritone $ $a sun like $b on a poem by jack larson $ $a g schirmer $ $a belwin mills publ corp $ $a nyu $ $a jack larson $ $a eng $ $a 234 $ $a 198x $ $a 197x $ $a cm $ $a thomson, virgil $d 1896 $ $a samuel, gerhard $9 1
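One way to picture the step from derived records to an enhanced record is to pool the derived fields from every bibliographic record associated with a name and count how often each value occurs. The Python sketch below does this under two loud assumptions: that the trailing `$9` subfield carries such a count, which the slides do not state, and that the sample values split across two input records as shown, which is invented for illustration. The function name `merge_derived` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Field = Tuple[str, str]  # (tag, normalized value)


def merge_derived(records: Iterable[List[Field]]) -> Counter:
    """Pool derived fields across records, counting records per distinct field."""
    counts: Counter = Counter()
    for fields in records:
        # A field repeated within one record is counted once for that record.
        counts.update(set(fields))
    return counts


# Illustrative grouping only: which values belong to which record is assumed.
derived_a = [("910", "the cat"), ("921", "g schirmer"), ("940", "eng"),
             ("943", "198x"), ("950", "thomson, virgil")]
derived_b = [("910", "sun like"), ("921", "belwin mills publ corp"), ("940", "eng"),
             ("943", "197x"), ("950", "samuel, gerhard")]

enhanced = merge_derived([derived_a, derived_b])
for (tag, value), count in sorted(enhanced.items()):
    print(tag, value, count)
```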

LC Bibliographic Records
Number of records: 7,612,979
Personal names assigned: 6,318,094
Unique personal names: 2,554,266

LCNAF Personal Name Authorities
Differentiated names: 3,834,162
Undifferentiated names: 37,990
Total authority records: 3,872,152

LC Names
Established Names: 3,834,162
Names from Bib Records: 2,554,266
Uncontrolled Names: 394,951
Orphaned Names: 1,674,847
Active Established Names: 2,159,315

DDB Bibliographic Records
Die Deutsche Bibliothek (DDB): 6,316,675
Bibliotheksverbund Bayern (BVB): 5,022,316
Total number of records: 11,338,991
Number of assignments: 12,080,387
Number of unique names: 2,371,461

DDB Names
Established Names: 2,498,071
Names from Bib Records: 2,371,461
Uncontrolled Names: 313,931
Orphaned Names: 440,541
Active Established Names: 2,057,530
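The LC and DDB name counts fit a simple set relationship between established names and names found in bibliographic records. The slides do not define the categories explicitly, so the reading below is inferred from the arithmetic: active names appear in both sets, orphaned names are established but unused, and uncontrolled names are used but not established. The function names and the toy name sets are hypothetical.

```python
from typing import Dict, Set


def name_categories(established: Set[str], in_bib: Set[str]) -> Dict[str, Set[str]]:
    """Split names into active, orphaned and uncontrolled groups (inferred definitions)."""
    return {
        "active": established & in_bib,        # established and used in bib records
        "orphaned": established - in_bib,      # established but never used
        "uncontrolled": in_bib - established,  # used but not established
    }


def counts_agree(c: Dict[str, int]) -> bool:
    """Check reported counts against the inferred relationships."""
    active = c["from_bib"] - c["uncontrolled"]
    orphaned = c["established"] - active
    return active == c["active"] and orphaned == c["orphaned"]


toy = name_categories({"name-1", "name-2", "name-3"}, {"name-2", "name-3", "name-4"})
print(toy)

# Counts as reported on the LC Names and DDB Names slides.
LC = {"established": 3_834_162, "from_bib": 2_554_266, "uncontrolled": 394_951,
      "orphaned": 1_674_847, "active": 2_159_315}
DDB = {"established": 2_498_071, "from_bib": 2_371_461, "uncontrolled": 313_931,
       "orphaned": 440_541, "active": 2_057_530}
print(counts_agree(LC), counts_agree(DDB))
```

Both sets of reported figures satisfy these relationships, which supports the inferred definitions without confirming them.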

Phase 2 Matching the Enhanced Authorities

Linking Retrospective Files
Diagram: Enhanced LCNAF Authorities, Enhanced PND Authorities, Matching Algorithms, VIAF Authorities

Matching Objectives
Each distinct author should be uniquely identified.
– Author: An individual person responsible for the intellectual or artistic content of a work.
– Established Names: A symbol (character string) used to represent an author. Names will not necessarily be the same in the LCNAF and the PND authority files.

Matching
Diagram: LCNAF, PND

Name Matching
To be considered for a match, two names must be consistent:
– Consistent: Smith, J. William and Smith, John
– Inconsistent: Smith, J. William and Smith, John Q.
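The consistency test can be illustrated with a small Python sketch. The slide gives only the two examples, so the rule used here is an assumption: surnames must be equal, and forenames are compared position by position, with an initial treated as compatible with any forename that starts with the same letter. The function names are hypothetical.

```python
from typing import List, Tuple


def parse_name(name: str) -> Tuple[str, List[str]]:
    """Split 'Surname, Forenames' into a lowercased surname and forename tokens."""
    surname, _, rest = name.partition(",")
    tokens = [t.strip(".").lower() for t in rest.split()]
    return surname.strip().lower(), [t for t in tokens if t]


def tokens_compatible(a: str, b: str) -> bool:
    """Equal tokens match; a single-letter initial matches a forename it begins."""
    if a == b:
        return True
    if len(a) == 1:
        return b.startswith(a)
    if len(b) == 1:
        return a.startswith(b)
    return False


def names_consistent(first: str, second: str) -> bool:
    """Assumed rule: same surname and no conflicting forename at any shared position."""
    surname_1, forenames_1 = parse_name(first)
    surname_2, forenames_2 = parse_name(second)
    if surname_1 != surname_2:
        return False
    # zip stops at the shorter list, so extra forenames on one side are allowed.
    return all(tokens_compatible(a, b) for a, b in zip(forenames_1, forenames_2))


print(names_consistent("Smith, J. William", "Smith, John"))
print(names_consistent("Smith, J. William", "Smith, John Q."))
```

Under this rule "J." is compatible with "John", while "William" conflicts with "Q.", which reproduces both examples on the slide.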

Strong Matching Attributes
– A work (title) in common
– Common control numbers (ISBN, ISSN, or LCCN)
– Dates: the combination of birth and death year. A moderate match score value is given for matching birth dates
– Joint authors
– Distinct form of alternate name. For example, LC has 100 Schade, Peter, $d Mosellanus, Petrus, $d while PND has 100 Mosellanus, Petrus, $d Schade, Peter, $d

Weaker Attributes
– Role (author, illustrator, composer, etc.)
– Subject area of publications
– Format (books, films, musical scores, etc.)
– Language
– Country
– Date of publications

Similarity Measure
– The total similarity measure is a weighted sum of the individual attribute matches
– A similarity measure is only computed for consistent names
– The weighting factor is lower for the weaker attributes and higher for the stronger attributes
– Care is taken to avoid double counting or using scores that are correlated
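A weighted-sum similarity of this kind could look like the Python sketch below. The weights are invented for illustration, and dividing by the total weight of the compared attributes, so that the score falls between 0 and 1, is an assumption; the slides give neither the actual weights nor the normalization. Handling of correlated attributes is not modelled.

```python
from typing import Dict, Optional

# Hypothetical weights: stronger attributes count more than weaker ones.
WEIGHTS = {
    "title": 5.0, "control_number": 5.0, "dates": 4.0,
    "joint_author": 4.0, "alternate_name": 4.0,
    "role": 1.0, "subject_area": 1.0, "format": 1.0,
    "language": 1.0, "country": 1.0, "publication_date": 1.0,
}


def similarity(consistent: bool, matches: Dict[str, float]) -> Optional[float]:
    """Weighted sum of per-attribute match scores (each in 0..1).

    Returns None when the names are not consistent, since no measure
    is computed in that case.
    """
    if not consistent:
        return None
    total_weight = sum(WEIGHTS[attr] for attr in matches)
    if total_weight == 0:
        return 0.0
    weighted = sum(WEIGHTS[attr] * score for attr, score in matches.items())
    return weighted / total_weight


# A shared title outweighs a language mismatch.
print(similarity(True, {"title": 1.0, "language": 0.0}))
print(similarity(False, {"title": 1.0}))
```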

Similarity Metric
Enhanced LC (DLC) and DDB (RAK-PND) records for Tarrant, John, compared side by side:
– Heading: LC: Tarrant, John, $d; DDB: Tarrant, John
– LC source note: The light inside the dark, 1998: $b CIP t.p. (John Tarrant) data sheet (John M. Tarrant; b. 1949)
– Titles: LC: the light inside the dark $b zen soul and the spiritual life; DDB: licht im herzen der dunkelheit $b die nacht der seele und der weg zur erleuchtung; the light inside the dark
– Publisher: LC: harpercollins publishers; DDB: goldmann
– Place of publication: LC: nyu; DDB: gw
– Name usage: LC: john tarrant; DDB: john tarrant
– Language: LC: eng; DDB: ger
Similarity Metric = 0.89

Future of VIAF?
If the proof-of-concept is successful, the VIAF will be expanded:
– To include other authority files for personal names
– To include other types of authorities
  – Corporate names
  – Geographic names
  – etc.

First VIAF Record

Phase 3: Build OAI Server
Diagram: LCNAF, DDB/PND, OAI Server(s)
Slide courtesy of Barbara Tillett, Library of Congress

Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols Slide Courtesy of Barbara Tillett, Library of Congress
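Incremental harvesting with OAI-PMH typically relies on the `ListRecords` verb with a `from` date, followed by `resumptionToken` requests for later pages. The Python sketch below only builds the request URLs. The base URL is a placeholder, the function name is hypothetical, and `oai_dc` is used because it is the one metadata format every OAI-PMH repository must support; an authority harvest would presumably use a MARC-based prefix instead, which is an assumption here.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Harvest only records changed on or after this date (YYYY-MM-DD).
            params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; not a real VIAF or library server.
print(list_records_url("https://example.org/oai", from_date="2005-01-31"))
print(list_records_url("https://example.org/oai", resumption_token="abc123"))
```

A harvester would fetch the first URL, process the returned records, and keep requesting with each new resumption token until the repository returns none.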

Phase 5: Build End User Interface with Unicode displays
User’s cookie specifies Hangul is preferred. Display 700 form, building on local system’s authority structure
Slide courtesy of Barbara Tillett, Library of Congress
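Choosing which linked form to display can be reduced to a lookup against the user's stated preference. The Python sketch below keys the linked forms by source file, using the Morel example from the earlier slide; a preference for a script such as Hangul would work the same way with forms keyed by script. The function name, the preference list, and the fallback behaviour are all assumptions.

```python
from typing import Dict, List


def choose_heading(forms: Dict[str, str], preferences: List[str], fallback: str) -> str:
    """Return the first preferred form that exists, else the fallback form."""
    for source in preferences:
        if source in forms:
            return forms[source]
    return forms[fallback]


# Linked established forms for one person, keyed by authority file.
linked_forms = {"LC": "Morel, Pierre", "PND": "Morellus, Petrus"}

print(choose_heading(linked_forms, ["PND"], fallback="LC"))
print(choose_heading(linked_forms, ["OTHER"], fallback="LC"))
```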

Questions? Thank you