OCLC Cluster Service Leiden March Discussion Session With KB & UVA Janifer Gatenby, Strategic Research
2 Agenda
Welcome and Introductions
Presentation
–Clustering
–Audience Level
–Copyright / Rareness
–FAST subject headings
Discussion
Lunch
3 Some slides from NCSU’s Endeca Test Catalog using OCLC work identifiers for Clustering
7 Some slides from PiCarta (Netherlands) Test Catalog using OCLC work identifiers for Clustering
8 Without clustering
9 With Clustering
10 Consolidation of Holdings
The above example shows 2 holdings, one per bibliographic record.
Consolidating holdings permits reservations (holds) and requests at work level.
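As a rough illustration of consolidation at work level, the sketch below groups holdings by work identifier rather than by bibliographic record, so that a hold or request could be satisfied by any copy of the work. The record layout, identifiers and library names are invented for the example and are not taken from PiCarta or WorldCat.

```python
from collections import defaultdict

# Hypothetical sample data: (bib_record_id, work_id, holding_library).
HOLDINGS = [
    ("bib-1", "work-A", "Library X"),
    ("bib-2", "work-A", "Library Y"),
    ("bib-3", "work-B", "Library X"),
]


def consolidate_holdings(holdings):
    """Group holdings by work identifier instead of by bibliographic record."""
    by_work = defaultdict(list)
    for bib_id, work_id, library in holdings:
        by_work[work_id].append((bib_id, library))
    return dict(by_work)


consolidated = consolidate_holdings(HOLDINGS)
for work_id, copies in consolidated.items():
    print(work_id, len(copies), "holding(s)")
```

Here the two bibliographic records that share "work-A" contribute their holdings to a single work-level list.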
11 Dutch / NCSU
Dutch: 6.7 million work identifiers / 7.7 million bib records
Collapse rate of 13%
–Av bibliographic records per work record
Software adaptation less than 1 week
NCSU: 1.64 million work identifiers / 1.7 million bib records
Collapse rate of 3%
–Av bibliographic records per work record
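The figures on this slide can be checked with a little arithmetic. The sketch below assumes that "collapse rate" means the share of bibliographic records absorbed by clustering, that is 1 minus work identifiers divided by bib records; the slide does not define the term, so this definition is an assumption. With the rounded counts quoted above it gives roughly 13% for the Dutch catalogue and about 3.5% for NCSU.

```python
def collapse_rate(work_ids, bib_records):
    """Assumed definition: fraction of bib records that collapse into an existing work."""
    return 1 - work_ids / bib_records


def records_per_work(work_ids, bib_records):
    """Average number of bibliographic records per work identifier."""
    return bib_records / work_ids


# Rounded counts as quoted on the slide: (work identifiers, bib records).
CATALOGUES = {
    "Dutch": (6.7e6, 7.7e6),
    "NCSU": (1.64e6, 1.7e6),
}

for name, (works, bibs) in CATALOGUES.items():
    rate = collapse_rate(works, bibs)
    average = records_per_work(works, bibs)
    print(f"{name}: collapse rate {rate:.1%}, {average:.2f} bib records per work")
```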
12 Method
Columns: OCLC #, OCLC Work ID, Title
Goldene vliess (six rows)
13 Method
Columns: PPN, OCLC #, OCLC Work ID, Title, Comments
Goldene vliess: not in main group
Goldene vliess: not in main group
Goldene vliess: not in main group
Goldene vliess: in main group x
Goldene vliess: in main group
Goldene vliess: in main group
14 Fixing Mismatches
Alternatives
–Fix data at source
–Apply name / title authority records
–Enhance algorithm
  Eliminate foreign articles
  Convert “fünf”, “vijf”, “cinq” to “5” etc.
At OCLC
–Quality control
–Office of Research
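The two algorithm enhancements named above (dropping foreign articles and mapping number words to digits) could look something like the following. This is only a minimal sketch of the idea, not the OCLC clustering algorithm; the function name `normalize_title` and both lookup tables are hypothetical and deliberately tiny.

```python
import re
import unicodedata

# Illustrative lookup tables only; real lists would be far longer.
LEADING_ARTICLES = {"the", "a", "an", "der", "die", "das", "de", "het", "een", "le", "la", "les"}
NUMBER_WORDS = {"fünf": "5", "vijf": "5", "cinq": "5", "five": "5"}


def normalize_title(title):
    """Build a hypothetical match key from a title.

    Lowercases the title, drops one leading article, and replaces
    known number words with digits.
    """
    text = unicodedata.normalize("NFC", title).lower()
    words = re.findall(r"\w+", text)
    # Only strip the article if something is left afterwards.
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(NUMBER_WORDS.get(word, word) for word in words)


print(normalize_title("Das goldene Vliess") == normalize_title("Goldene vliess"))
print(normalize_title("Fünf Freunde"))
```

A single cross-language article list is a simplification: a word such as "die" is an article in German but not in English, so a fuller version would probably choose the list from the record's language code.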
15 Authorities Ensure Matching
Foreign union catalogue data
–Non AACR2, not native MARC21, other language of cataloguing, non standard uniform titles
–Requesting 1,000 name / title authority records per union catalogue
Bib record for a translation without uniform title will match if there is a comprehensive author / title authority record
16 Bib
100 …Rowling, J.K.
245 …La chambre secrète
…
Authority
Rowling, J.K.
The secret chamber
De geheime kamer
La chambre secrète
Die geheime kammer
…
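The matching idea on this slide can be sketched as a simple lookup: every title variant in an author / title authority record points to the same work, so a bib record for a translation matches even without a uniform title. The dictionary layout and the function names `build_index` and `match_work` are assumptions made for illustration; only the author and titles come from the slide.

```python
# Author / title authority record, using the titles shown on the slide.
AUTHORITY_RECORDS = [
    {
        "author": "Rowling, J.K.",
        "titles": [
            "The secret chamber",
            "De geheime kamer",
            "La chambre secrète",
            "Die geheime kammer",
        ],
    },
]


def build_index(records):
    """Map every (author, title variant) pair to the position of its authority record."""
    index = {}
    for position, record in enumerate(records):
        for title in record["titles"]:
            index[(record["author"].casefold(), title.casefold())] = position
    return index


def match_work(bib, index):
    """Return the authority record position for a bib record, or None if nothing matches."""
    key = (bib["100"].casefold(), bib["245"].casefold())
    return index.get(key)


index = build_index(AUTHORITY_RECORDS)
french = {"100": "Rowling, J.K.", "245": "La chambre secrète"}
dutch = {"100": "Rowling, J.K.", "245": "De geheime kamer"}
print(match_work(french, index), match_work(dutch, index))
```

Both bib records resolve to the same authority record, which is what lets them fall into one cluster.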
17 FRBR – Divide and conquer
Creation of works (38 million)
Algorithm
Authority records
Cleaning bibliographic records where necessary
No manual links created
Improved user interfaces
Harvesting
Loading IDs & records
Authority records
Improved user interfaces
Suggestions for the improvement of the algorithm and records
18 ALA Midwinter Meeting
Representatives of 19 libraries with substantial holdings in WorldCat
Clear Requirements
–XML cluster record service
–Minimum of daily update
19 Discussion
20 Phase 2
Phase 1 – table
Phase 2 – work record with enriched data
–Audience level
–Rareness
–Copyright
–FAST headings for faceted search
21 Audience Level and Rareness
22 OpenURL Request Transfer Message
23 Faceted Search
24 FAST headings
Fully formed concepts
Suitable for faceted search
–LCSH “sentences” – breaking into concepts is tricky
25 Discussion
26 Cluster Identifier Type Value Instance/s Identifier/s + type Copyright estimate Holdings count (rarity) Description Related Works WC Cluster Identifier Instance/s Relationship (sequel etc.) OCLC Number
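One possible reading of this cluster record layout is sketched below as Python dataclasses. The nesting (which elements belong to the cluster, to an instance, or to a related work) is an assumption inferred from the order of the labels on the slide, and all class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identifier:
    type: str
    value: str


@dataclass
class Instance:
    identifiers: List[Identifier]
    copyright_estimate: Optional[str] = None
    holdings_count: Optional[int] = None  # read here as the rarity indicator
    description: Optional[str] = None


@dataclass
class RelatedWork:
    cluster_identifier: str
    relationship: str  # e.g. "sequel"
    oclc_numbers: List[str] = field(default_factory=list)


@dataclass
class ClusterRecord:
    identifier: Identifier
    instances: List[Instance] = field(default_factory=list)
    related_works: List[RelatedWork] = field(default_factory=list)

    def total_holdings(self):
        """Sum holdings across instances, treating unknown counts as zero."""
        return sum(instance.holdings_count or 0 for instance in self.instances)


# Placeholder values purely for illustration.
record = ClusterRecord(
    identifier=Identifier(type="example-type", value="example-value"),
    instances=[
        Instance(identifiers=[Identifier("example-type", "id-1")], holdings_count=3),
        Instance(identifiers=[Identifier("example-type", "id-2")]),
    ],
)
print(record.total_holdings())
```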
28 Deployment
CBS incorporating cluster record in test due Easter
Installation in LBS
OCLC Distribution service – development to start in April
PSI modifications to use cluster record
Looking for testing partners