Building Blocks for the Future: Making Controlled Vocabularies Available for the Semantic Web. Dr. Barbara B. Tillett, Chief, Policy & Standards Division, Library of Congress. For MOUG, Feb. 8, 2011
Linked Data: DBpedia; National Library of Sweden; LCSH; VIAF
Internet “Cloud”: Databases, Repositories; Web front end; Services
Internet “Cloud”: Web front end; Services; VIAF; Databases, Repositories; LCSH
VIAF Objectives: Facilitate exposure of authority data; reduce cataloging costs; simplify authority control (creation and maintenance) internationally; provide authority data in the form, language, and script users want.
VIAF: ЧАЙКОВСКИЙ, ПЕТР ИЛЬИЧ; Tchaikovsky, Peter Ilich; Tschaikowski, Peter I.; Čajkovskij, Petr Il'ič; Chaĭkovski, P. I
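The goal of giving each user the script they want can be sketched in Python. This is only an illustration, assuming a cluster is a plain list of variant headings; `script_of` and `forms_in_script` are hypothetical helpers, not part of any VIAF service, and they classify a heading by the Unicode name of its first letter.

```python
import unicodedata

# Variant forms of one name, copied from the slide above.
CLUSTER = [
    "ЧАЙКОВСКИЙ, ПЕТР ИЛЬИЧ",
    "Tchaikovsky, Peter Ilich",
    "Tschaikowski, Peter I.",
    "Čajkovskij, Petr Il'ič",
]

def script_of(heading: str) -> str:
    """Return the script of the first letter, e.g. 'LATIN' or 'CYRILLIC'."""
    for ch in heading:
        if ch.isalpha():
            # Unicode character names begin with the script name,
            # e.g. 'CYRILLIC CAPITAL LETTER CHE'.
            return unicodedata.name(ch).split()[0]
    return "UNKNOWN"

def forms_in_script(cluster, script):
    """Keep only the headings written in the requested script."""
    return [h for h in cluster if script_of(h) == script]

print(forms_in_script(CLUSTER, "CYRILLIC"))
```

A real service would also need language and preferred-form information per source file; script detection alone only separates, for example, Cyrillic from Latin forms.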
VIAF: The Virtual International Authority File. Original VIAF partners: Library of Congress (LC); Deutsche Nationalbibliothek (DNB); Bibliothèque nationale de France (BnF); OCLC (host). Virtually combining the name authority files of all institutions into a single name authority service.
Virtual International Authority File: Matches names across 21 authority files of 18 institutions; 13 million name records; 10 million personas; 4.5 million clusters. Based on KSY Cooperative Identities Hub, CEAL
Library of Congress/NACO; Deutsche Nationalbibliothek; Bibliothèque nationale de France; National Library of Australia; National Library of the Czech Republic; Bibliotheca Alexandrina (Egypt); Getty Research Institute; National Library of Israel; Istituto Centrale per il Catalogo Unico (Italy); Biblioteca Nacional de Portugal; Biblioteca Nacional de España; National Library of Sweden; Swiss National Library; Vatican Library; NUKAT Center (Poland); Library and Archives Canada; National Széchényi Library (Hungary); RERO (Switzerland)
Current Status: Available as linked data with URIs (Uniform Resource Identifiers); Unicode throughout; MARC 21, UNIMARC, and RDF supported; usage tripled this last year; thousands of visits daily.
Enhancing the Authorities: Bibliographic Record; Derived Authority Record; Enhanced Authority
Mining the Bibliographic Record: LDR 00638ncm a a s1965 oruuua n eng 10 $a $a DLC $c DLC 019 $a $c $ $a $b Matrix Publ. Co $b d $b d $b va01 $b ve01 $a ka $a M1258 $b.L $a Leigh, Mitch, $d $a The man of La Mancha / $c by Mitch Leigh & Joe Darion; arr. by Roland Barrett & Alan Keown. 260 $a Springfield, OR : $b Matrix Publ. Co., $c c $a 1 score (16 p.) ; $c 18 x 27 cm. 500 $a Brief record $a Musicals $x Excerpts $a Leigh, Mitch $x Musical settings $a Darion, Joe. Labeled elements: Authors; LC Control Number; LC Classification; Title; Material Type; Publisher; Place of Publication; Language; Date of Publication; Usage
Derived Authority Record: 00505cz a n xlc OCoLC n|acannaab|n aaa c $a OCoLC $b eng $c OCoLC $f viaf $a Leigh, Mitch $a $a the man of la mancha $a matrix publ co $a oru $a mitch leigh $a eng $a $a 196x $a cm $a darian, joe $d Notes: All text is normalized; subjects are grouped into broad subject areas; material type is coded; publication date is by decade; coauthor.
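The normalization noted on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the algorithm VIAF actually uses: `normalize_text` and `decade_of` are hypothetical helpers that lowercase text, drop punctuation, and reduce a year to its decade, which reproduces values such as "matrix publ co" and "196x" from the derived record.

```python
import re

def normalize_text(value: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    lowered = value.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return " ".join(no_punct.split())

def decade_of(year: int) -> str:
    """Reduce a four-digit year to its decade, e.g. 1965 -> '196x'."""
    return f"{year // 10}x"

print(normalize_text("Matrix Publ. Co.,"))       # publisher from the 260 field
print(normalize_text("The man of La Mancha /"))  # title statement
print(decade_of(1965))                           # date of publication
```

The sketch does not reorder names (the record shows "mitch leigh" for "Leigh, Mitch"), nor does it group subjects or code material types; those steps would need rules the slide does not spell out.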
Enhanced Authority Record 00505cz a n oca n| acannaab| |n aaa ||| 3 10 $a n $a DLC $c DLC $d DLC $a Leigh, Mitch, $d $a the man of la mancha, c1966: $b t.p. (Mitch Leigh) $a $ $a $ $a impossible dream $ $a century library of music and sound by mitch leigh $ $a matrix publ co $ $a kapp $ $a oru $ $a mitch leigh $ $a eng $ $a 234 $ $a 196x $ $a 197x $ $a cm $ $a darian, joe $d $ $a wasserman, dale $9 1
Information in Bibliographic Records: He writes music. His primary subject area is music. He was published in the 1960s and 1970s by Matrix Publ. Co. in Oregon and Kapp in New York. He worked with Joe Darion and Dale Wasserman. Mitch Leigh is the only name he has used on his publications. Etc.
Hosted by OCLC
viaf.org
Search suggestions for “cer” (as viewed Nov. 1, 2010): Cervantes Saavedra, Miguel de 1547 Cervantes de Salazar, Francisco, ca Cervantes, Cervantes Juan, Cervantes, Ignacio, Cervantes, Juan de, Cervantès, François, Cervani, Giulio, Cervantes, María Antonieta Cervantes de Haro, fl
Cervantes
Preferred Forms
Cervantes
MARC 21 Cervantes
RDF Cervantes
VIAF and Catalogers: Use as a reference tool to resolve conflicts, questionable dates, forms of name, etc. Cite as source in 670 $a, for example: BNF in VIAF, Feb. 8, 2011; Nat. Lib. of Australia in VIAF, Feb. 4, 2011; LAC in VIAF, Feb. 3, 2011
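The citation pattern in these examples can be sketched as a small Python helper. `viaf_citation` is a hypothetical function written for illustration; the month abbreviations other than "Feb." are an assumption, since the slide shows only February dates.

```python
from datetime import date

# Month abbreviations; only "Feb." is confirmed by the slide, the rest are assumed.
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
          "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]

def viaf_citation(source: str, viewed: date) -> str:
    """Format a 670 $a source citation in the style '<source> in VIAF, <date viewed>'."""
    month = MONTHS[viewed.month - 1]
    return f"{source} in VIAF, {month} {viewed.day}, {viewed.year}"

print(viaf_citation("BNF", date(2011, 2, 8)))
```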
Next steps for VIAF: Better searching; more “linked data” (related persons as in WorldCat Identities, Wikipedia, etc.); participants beyond libraries (rights management agencies, publishers, museums, archives); more name types (corporate and family names, uniform titles, geographic names … not topical terms)
SKOS: Simple Knowledge Organization System. “Provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary” (SKOS Primer)
SKOS: Based on the Resource Description Framework (RDF); resources can be exchanged between software applications and published on the Web; interconnects data on the Web, helping create the Semantic Web.
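As a concrete illustration of SKOS built on RDF, the following Python sketch prints one concept in Turtle syntax. The concept URI under example.org and the helper `skos_concept_turtle` are hypothetical; only `skos:Concept` and `skos:prefLabel` come from the SKOS vocabulary itself. The labels "Text" and "Texto" are borrowed from the English and Spanish RDA slides later in this talk.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

def skos_concept_turtle(uri: str, pref_labels: dict) -> str:
    """Serialize one skos:Concept with language-tagged preferred labels as Turtle.

    Assumes pref_labels is non-empty and that labels contain no double quotes;
    a real serializer would escape them.
    """
    lines = [f"@prefix skos: <{SKOS_NS}> .", "", f"<{uri}> a skos:Concept ;"]
    labels = [f'    skos:prefLabel "{text}"@{lang}'
              for lang, text in pref_labels.items()]
    lines.append(" ;\n".join(labels) + " .")
    return "\n".join(lines)

# Hypothetical URI, one preferred label per language.
turtle = skos_concept_turtle(
    "http://example.org/concept/text",
    {"en": "Text", "es": "Texto"},
)
print(turtle)
```

Because the result is plain RDF, any application that reads Turtle can load it and follow the concept URI, which is the interchange property the slide describes.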
id.loc.gov/authorities: “Authorities & Vocabularies” from the Library of Congress. Intent: to provide human and programmatic access to commonly found standards and vocabularies developed by LC.
“Authorities & Vocabularies”: LCSH was the first offering (subject headings, genre/form headings, children’s subject headings, subdivision records, validation records). Provides links from LCSH headings to RAMEAU headings. Exploring Répertoire de vedettes-matière (RVM) and others.
“Authorities & Vocabularies” also includes: Thesaurus for Graphic Materials (TGM); MARC geographic area codes; MARC language codes; MARC relator codes; Preservation Events; … etc.
“Authorities & Vocabularies” benefits: Servers can download entire controlled vocabularies and the values within them, in multiple formats; available for free on the Web.
“Authorities & Vocabularies”: Human end-users can search and view individual headings and data elements (details of the record, visualization) and suggest additions and changes.
“Authorities & Vocabularies”: URI for specific LCSH records/concepts: id.loc.gov/authorities/[LCCN]; id.loc.gov/authorities/sh
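The URI pattern on this slide can be sketched as a small Python helper. It is an illustration only: `lcsh_uri` is a hypothetical function, the sample LCCN is a made-up placeholder, and the path id.loc.gov/authorities/[LCCN] is taken from the slide as presented in 2011, so the live service may use a different layout. The http scheme is also an assumption.

```python
def lcsh_uri(lccn: str) -> str:
    """Build a URI following the slide's pattern id.loc.gov/authorities/[LCCN]."""
    # LCCNs are sometimes written with internal spaces; remove all whitespace.
    compact = "".join(lccn.split())
    if not compact:
        raise ValueError("empty LCCN")
    return f"http://id.loc.gov/authorities/{compact}"

# Placeholder LCCN, not a real subject heading record.
print(lcsh_uri("sh 00000000"))
```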
“Authorities & Vocabularies” contact information: Content of site: Libby Dechman; technical questions: Larry Dixson
A comment form and discussion list are available for “Authorities & Vocabularies”.
RDA Controlled Vocabularies (Registries): Free on the Web at the Open Metadata Registry
Carrier type
URI
RDA Carrier Types URI
RDA Linked Data: Don Quixote; Madrid, 1979; English; Spanish; French; German; Cervantes; Library of Congress Copy 1; Green leather binding; Exemplary novels; Wasserman; The Man of La Mancha; Text; Movies; …; Derivative works; Subject; created
RDA Linked Terms for Languages: Don Quijote; Madrid, 1979; Inglés; Español; Francés; Alemán; Cervantes; Library of Congress Copia 1; Encuadernación en piel color verde; Novelas Ejemplares; Wasserman; The Man of La Mancha; Texto; Películas; …; Obras derivadas; Materias
Internet “Cloud”: Web front end; Services; VIAF; Databases, Repositories; LCSH
iPhone apps to connect to libraries via WorldCat (OCLC): Pic2shop app (MHiuaDXipWQ); RedLaser app (Dv1cAYR5wc&feature=related)