Authorities Futures T Hickey OCLC. Why authorities?

Presentation transcript:

Authorities Futures T Hickey OCLC

Why authorities?

Searching

Browsing

Variations on Tchaikovsky
NACO: Tchaikovsky, Peter Ilich,
German: Čajkovskij, Pëtr I
French: Čajkovskij, Piotr Ilʹič
Cyrillic: Чайковский, Пётр Ильич ( )

More ways to say Chajkowskii
Ciaikovsky, Piotr Ilic
Tschaikowsky, Peter Iljitch
Tchaikowsky, Peter Iljitch
Ciaikovsky, Pjotr Iljc
Cajkovskij, P. I
Tsjaikovsky, Peter Iljitsj
Czajkowski, Piotr
Chaikovsky, P. I
Csajkovszkij, Pjotr Iljics
Tsjaïkovskiej, Pjotr Iljietsj
Tjajkovskij, Pjotr Ilitj
Čaikovskis, P
Chaĭkovskiĭ, Petr Ilʹich
Tchaikovski, P
Tchaikovski, Piotr Ilyitch
Chaĭkovskiĭ, P
Tchaikovsky, P
Tchaïkovsky, Piotr Ilitch
Tschaikowsky, Pjotr Iljitsch
Tschajkowskij, Pjotr Iljitsch
Tchaïkovski, P. I
Ciaikovskij, Piotr
Ciaikovskji, Piotr Ilijich
Tschaikowski, P. I
Tschaikowski, Peter Illic
Tjajkovskij, Peter
Chaĭkovski, Pʹotr Ilich
Tschaikousky
Tschaijkowskij, P. I
Tschaikowsky, P. I
Chaĭkovski, P. I
Tchaïkovski, Petr Ilitch
Ciaikovski, Peter Ilic
Tschaikowski, Pjotr
Tchaikowsky, Pyotr
Sinopov, P
Tchaikovskij, Piotr Ilic

Wider coverage
Published, unpublished, objects, licensed, archival
Multiple sources
Machine generated
Info. professionals, scholars, researchers, enthusiasts
Broader use of APIs
Multiple views
Better context
Better navigation
More mashups
Authorities touch everything

33 nodes
132 CPUs
528 gigabytes memory
33 terabytes disk
100-fold speed-up
– 1 hour to <1 minute
– 1 day to 15 minutes
– 1 month to 8 hours
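As a rough check on the "100-fold" figure, the three before/after timings on this slide can be turned into ratios. This is only arithmetic on the slide's own numbers: treating "1 month" as 30 days is an assumption, and because the slide says "<1 minute" the first ratio is a lower bound.

```python
# Before/after run times from the slide, expressed in minutes.
# Assumption: "1 month" is taken as 30 days; "<1 minute" is taken as 1 minute.
timings = {
    "1 hour": (60, 1),
    "1 day": (24 * 60, 15),
    "1 month": (30 * 24 * 60, 8 * 60),
}

# Speed-up is simply the old run time divided by the new run time.
speedups = {label: before / after for label, (before, after) in timings.items()}

for label, ratio in speedups.items():
    print(f"{label}: about {ratio:.0f}x faster")
```

Under these assumptions the ratios fall between 60 and 96, which is consistent with a speed-up of roughly two orders of magnitude.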

Controlling WorldCat
Virtual International Authority File
WorldCat Identities

Controlling names in WorldCat
Has been done semi-manually
– Encourages review of all links
For Identities we did this automatically
– Research copy of WorldCat
– Very aggressive matching
How to move links to WorldCat?

Pretend you are a Connexion Client
Program to:
– Log in
– Search for record
– Verify heading hasn't changed
– Insert authorized form
– Add link
– Do replace
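The steps on this slide can be sketched as a small update routine. This is a minimal illustration, not the Connexion interface: `FakeSession`, `Record`, `control_heading`, the record identifier and the link value are all hypothetical stand-ins, and the "catalog" is just an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    record_id: str
    heading: str
    authority_link: Optional[str] = None


class FakeSession:
    """Hypothetical in-memory stand-in for a cataloging client session."""

    def __init__(self, records):
        self._records = {r.record_id: r for r in records}
        self.logged_in = False

    def log_in(self):
        self.logged_in = True

    def search(self, record_id):
        # Return a copy so that edits only take effect on replace().
        r = self._records[record_id]
        return Record(r.record_id, r.heading, r.authority_link)

    def replace(self, record):
        self._records[record.record_id] = record


def control_heading(session, record_id, expected_heading, authorized_form, link):
    """Retrieve a record, verify its heading, then insert the form and link."""
    record = session.search(record_id)
    if record.heading != expected_heading:
        return False  # heading changed since it was matched; leave it alone
    record.heading = authorized_form
    record.authority_link = link
    session.replace(record)
    return True


session = FakeSession([Record("rec-1", "Tschaikowsky, Peter Iljitch")])
session.log_in()
updated = control_heading(
    session, "rec-1",
    expected_heading="Tschaikowsky, Peter Iljitch",
    authorized_form="Tchaikovsky, Peter Ilich",
    link="auth-123",  # hypothetical authority identifier
)
print(updated, session.search("rec-1"))
```

The verification step is what keeps an automated run from overwriting a heading that a cataloger has edited since the match was computed.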

Then just replace 26 million records
Each update takes two transactions
– Retrieve the record
– Replace the record
If it takes 2 seconds/update
– 52,000,000 seconds
– ~2 years
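The serial estimate on this slide is straightforward to reproduce. The sketch below uses only the slide's two inputs (26 million records, 2 seconds per update) and assumes a 365-day year.

```python
RECORDS = 26_000_000          # records to replace (from the slide)
SECONDS_PER_UPDATE = 2        # retrieve + replace, per the slide
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year

total_seconds = RECORDS * SECONDS_PER_UPDATE
years = total_seconds / SECONDS_PER_YEAR

print(f"{total_seconds:,} seconds is about {years:.2f} years for a single client")
```

A single client would need a little over a year and a half, which the slide rounds up to about two years.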

But, we can run multiple clients
Connexion can handle 40+ of these clients
– ~20 records/second
Offline processing has limited capacity
– Run 32 clients for 12 hours for 16 updates/second
– ~700,000 overnight
– Up to a million/day
3 million/week
2-3 months elapsed time
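The throughput figures on this slide follow from the 2 seconds per update quoted earlier in the talk. The sketch below reproduces them; the function names are illustrative, and it assumes every client sustains one update every 2 seconds with no contention.

```python
def updates_per_second(clients, seconds_per_update=2.0):
    """Aggregate rate if each client completes one update per interval."""
    return clients / seconds_per_update


def updates_in_window(clients, hours, seconds_per_update=2.0):
    """Total updates finished by all clients in a run of the given length."""
    return int(updates_per_second(clients, seconds_per_update) * hours * 3600)


rate_40 = updates_per_second(40)          # the "40+ clients" case
rate_32 = updates_per_second(32)          # the overnight configuration
overnight = updates_in_window(32, 12)     # one 12-hour run
weeks_needed = 26_000_000 / 3_000_000     # at the slide's 3 million/week

print(rate_40, rate_32, overnight, round(weeks_needed, 1))
```

The overnight run comes to 691,200 updates, matching the slide's "~700,000", and 26 million records at 3 million per week takes just under nine weeks, the low end of the 2-3 month estimate.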

Virtual International Authority File

VIAF
DNB Bib & Authority
BnF Bib & Authority
LC Bib & Authority
~7.5 million personal name authority records
~25 million bibliographic records
~1.2 million links between files

Match on
Names and dates in headings
Standard numbers
Titles
Coauthors
Publishers
Personal name as subject
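One way to picture matching on these points is to compare two records field by field and report which kinds of evidence they share. The sketch below is a simplified illustration under stated assumptions, not VIAF's actual algorithm: the field names, the `is_candidate_match` rule (same name plus at least one shared field) and the sample titles and publisher are all hypothetical.

```python
# Hypothetical field names for some of the match points listed on the slide.
EVIDENCE_FIELDS = ("dates", "standard_numbers", "titles", "coauthors", "publishers")


def shared_evidence(a, b):
    """Return the evidence fields on which two records share at least one value."""
    return [f for f in EVIDENCE_FIELDS if set(a.get(f, ())) & set(b.get(f, ()))]


def is_candidate_match(a, b):
    """Assumed rule: identical name (ignoring case) plus one shared field."""
    same_name = a["name"].casefold() == b["name"].casefold()
    return same_name and bool(shared_evidence(a, b))


# Illustrative records: the dates disagree, but a title is shared.
authority = {"name": "Fournier, Marcel", "dates": ["1945-"],
             "titles": ["Example title"]}
bib = {"name": "Fournier, Marcel", "dates": ["1946-"],
       "titles": ["Example title"], "publishers": ["Example Press"]}

evidence = shared_evidence(authority, bib)
print(evidence, is_candidate_match(authority, bib))
```

Reporting which fields agreed, rather than a bare yes or no, makes it possible to tally matches by evidence type.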

Matching situations

Hickey, Thomas Butler, d 1947-

Dempsey, Lorcan

Tchaikovsky, Peter Ilich
Čajkovskij, Pëtr I.
Čajkovskij, Pëtr I. / Tchaikovsky, Peter Ilich / Чайковский, Пётр Ильич

Fournier, Marcel
Fournier, Marcel, 1946-
Fournier, Marcel, 1945-

What makes a match?
1,338,606 Title
526,234 Double date
67,749 Joint author
47,499 LCCN
15,867 Partial date and partial title
6,454 Partial date and publisher
4,673 Partial title and publisher
4,116 Name as subject
2,158 Standard number

Next steps for VIAF
Merged display
Better documentation
More participants
Geographics

New Zealand Identities (in WorldCat)
82,868 Mahy, Margaret
73,871 Mansfield, Katherine
53,779 Marsh, Ngaio
52,876 Cowley, Joy
23,009 Frame, Janet
11,986 Park, Ruth

Australian Identities (in WorldCat)
51,399 Keneally, Thomas
42,679 Fox, Mem
30,301 Travers, P. L.
28,998 Lindsay, Jack
19,179 Marsden, John
16,688 Stead, Christina
15,041 Malouf, David
14,717 Jennings, Paul
13,769 Lawson, Henry
12,612 Winton, Tim

Editing

Merged result
Immediately visible in Identities
Persistent in Identities
Information fed into established channels

Name Finder

Implementation
SRU/SRW server (Z39.50 for the Web)
XML returned
XSLT style sheets transform it to HTML
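Because SRU carries the search in an ordinary URL, a request can be built with nothing more than a query-string encoder. The sketch below uses the standard SRU 1.1 parameter names (`operation`, `version`, `query`, `maximumRecords`); the base URL and the CQL index name `local.name` are placeholders, not the real Identities endpoint.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and hypothetical index name, for illustration only.
url = sru_search_url("https://example.org/sru", 'local.name = "hickey"', 5)
print(url)

# Decoding the query string again shows what the server would receive.
decoded = parse_qs(urlparse(url).query)
print(decoded["query"][0])
```

The server answers such a request with XML, which is what the XSLT style sheets mentioned on the slide turn into HTML.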

Syndication
Searchable via SRU, OpenURL
Sitemaps for harvesters
HTML for harvesters and mobile devices
Links in Wikipedia

More Identities

Gods

Other identities

Thomas Hickey Chief Scientist OCLC