Published by Brittney Skinner. Modified over 9 years ago.
1
OCoLR 20041025 #53928015 OCLCR
Making data work harder
Lorcan Dempsey, OCLC
OVGTSL 2005 Conference, Newark, May 11-13
2
Overview
- Some context
- Looking at data in action
- Open WorldCat
- FRBR
- Data mining
3
Context: value
- Amazoogle: what should we be doing that fits into the world they occupy? Where do we provide unique value?
- ROI: libraries invest in data but do not extract as much value from it as they might. Unless we release more value, the argument for this investment becomes weaker.
- User: how do we co-create value with users? What opportunities are there for mixing catalog data and user-contributed data?
- Management intelligence: how do we use data better to inform management decisions?
4
Context: consequences
- The role of the catalog?
- The role of structured data?
- The role of the library?
6
Data
- Open WorldCat
- FRBR
- WorldCat Wiki
- Management intelligence
11
FRBR
- ‘Interim FRBR’ in OWC
- FRBR in research projects
  - FictionFinder
  - Curioser
  - xISBN
  - Algorithm
  - Top 1000
- FRBR in FirstSearch: late this year
18
Top Sets for Fiction (Records)
Records  Key
1,296    defoe, daniel\1661 1731/robinson crusoe
1,267    carroll, lewis\1832 1898/alices adventures in wonderland
971      cervantes saavedra, miguel de\1547 1616/don quixote
828      stevenson, robert louis\1850 1894/treasure island
689      twain, mark\1835 1910/adventures of huckleberry finn
624      twain, mark\1835 1910/adventures of tom sawyer
618      swift, jonathan\1667 1745/gullivers travels
19
Top Sets for Fiction (Holdings)
Holdings  Key
29,043    twain, mark\1835 1910/adventures of huckleberry finn
26,088    carroll, lewis\1832 1898/alices adventures in wonderland
20,843    twain, mark\1835 1910/adventures of tom sawyer
19,410    defoe, daniel\1661 1731/robinson crusoe
18,566    cervantes saavedra, miguel de\1547 1616/don quixote
18,492    stevenson, robert louis\1850 1894/treasure island
18,123    dickens, charles\1812 1870/christmas carol
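The keys in the two Top Sets tables look like normalized author/title strings of the form name\dates/title. As a rough illustration only, the sketch below builds keys of that shape and tallies records and holdings per key. The normalization rules (lowercasing, deleting apostrophes, turning other punctuation into spaces, dropping a leading English article) are assumptions inferred from the keys shown; they are not the actual OCLC work-set algorithm, and the function names and record fields are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: only leading English articles are dropped from titles.
LEADING_ARTICLES = ("a ", "an ", "the ")


def _normalize(text: str, keep_commas: bool = False) -> str:
    """Lowercase, delete apostrophes, turn other punctuation into spaces.

    Limitation of this sketch: non-ASCII letters are also replaced by spaces.
    """
    text = text.lower().replace("'", "").replace("\u2019", "")
    pattern = r"[^a-z0-9, ]" if keep_commas else r"[^a-z0-9 ]"
    text = re.sub(pattern, " ", text)
    return re.sub(r"\s+", " ", text).strip()


def work_key(name: str, dates: str, title: str) -> str:
    """Build a key shaped like 'twain, mark\\1835 1910/adventures of ...'."""
    norm_title = _normalize(title)
    for article in LEADING_ARTICLES:
        if norm_title.startswith(article):
            norm_title = norm_title[len(article):]
            break
    return f"{_normalize(name, keep_commas=True)}\\{_normalize(dates)}/{norm_title}"


def summarize_work_sets(records):
    """Count records and sum holdings for each work key.

    `records` is an iterable of dicts with the hypothetical fields
    'name', 'dates', 'title', and 'holdings'.
    """
    sets = defaultdict(lambda: {"records": 0, "holdings": 0})
    for rec in records:
        key = work_key(rec["name"], rec["dates"], rec["title"])
        sets[key]["records"] += 1
        sets[key]["holdings"] += rec["holdings"]
    return dict(sets)
```

Sorting the resulting dictionary by its "records" or "holdings" value would give rankings of the kind shown in the two tables.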
20
Taking FRBR onto the open web
Curio(u)ser
24
MetaWiki
- WIKI: web pages
- metaWIKI: data
- Capture user input in structured ways
25
Extending Wiki’s utility
Wiki:
- supported markup: wikitext
- page editing: a single text block
- searches: full text searching
- collections managed: one per wiki
MetaWiki:
- supported markup: wikitext, structured data (e.g., MARC, METS, DC…)
- page editing: a single text block, or field level
- searches: full text searching, fielded searching
- collections managed: one/multiple per OaiWiki
26
Lorcan: note that this is a work in progress
27
Management intelligence
- So we have all this data: what can it tell us?
- Several projects are underway; only some are discussed here
28
Making Data Work Harder
- Activities “shed” data:
  - Cataloging: bibliographic information
  - Web site traffic: transaction logs
  - Reference queries: search term lists
- Need to mine this data for intelligence that creates value for libraries and users
- OCLC Research is undertaking a number of data-mining projects aimed at:
  - Knowing more about the characteristics of library collections
  - Creating interesting and useful data displays
  - Generating intelligence to support library decision-making
29
Data mining
- OCLC has a new collection analysis service
- Some research projects looking at systemic questions are described here
30
Looking at Library Print Book Collections … Systematically
- OCLC/Ithaka collaboration: use WorldCat to characterize the “system-wide” print book collection, i.e., aggregate print book holdings in WorldCat
- 32 million print books, representing 26 million distinct works
- Half of print books were published after 1977; more than 80% are still “in copyright”
- Rareness is common! Only a third of print books have more than five holdings; half have two or fewer
- Only about 120,000 works had both print book and e-book manifestations
- Intelligence of this kind can help establish digitization priorities and inform preservation planning
- More information: http://www.oclc.org/research/presentations/lavoie/cni2005.ppt
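The "rareness" figures on this slide are shares of records at particular holdings thresholds. A minimal sketch of how such a profile could be computed from a list of per-record holdings counts follows; the function name and the input shape are assumptions made for illustration, and the numbers in any usage would be invented rather than WorldCat data.

```python
from statistics import median


def holdings_profile(holdings_counts):
    """Summarize how widely held a set of records is.

    `holdings_counts` is an iterable with one integer per record:
    the number of libraries holding that record (hypothetical input).
    """
    counts = list(holdings_counts)
    if not counts:
        raise ValueError("need at least one record")
    total = len(counts)
    return {
        "records": total,
        # Share of records with more than five holdings.
        "share_more_than_five": sum(c > 5 for c in counts) / total,
        # Share of records with at most two holdings.
        "share_two_or_fewer": sum(c <= 2 for c in counts) / total,
        "median_holdings": median(counts),
    }
```

Run over a full holdings file, the two shares correspond to the "more than five holdings" and "two or fewer" statements above.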
31
The Implications of GooglePrint …
- Potentially covers about one third of print books in WorldCat
- ~60 percent of “GooglePrint” books held by only one of the Google 5
- Less than 5 percent held by all of the Google 5
- ~20 percent of “GooglePrint” books out of copyright
- Paper forthcoming …
32
Know Your Audience!
- Holdings represent selection decisions by librarians …
  - implies there are about 1 billion individual selection decisions in the WorldCat holdings file
- Selections are made to serve the interests of a library’s target community …
  - Associate target community (audience level) with particular library profiles, e.g., ARL, non-ARL academic, public, K-12 school …
- Implies: we can infer materials’ audience level from holdings patterns, which in turn can support:
  - Collection management
  - Readers’ advisory services
  - Reference services
  - Information retrieval
- Paper forthcoming!
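One way to picture the inference on this slide is a holdings-weighted average: give each library profile a score, then average those scores over the libraries holding an item. The sketch below does only that. The weights, the profile labels, and the function name are all hypothetical choices for illustration; the slide does not say how OCLC Research actually computes audience level.

```python
# Hypothetical scores per library profile: 0 = general/juvenile, 1 = research.
AUDIENCE_WEIGHTS = {
    "k12": 0.1,
    "public": 0.3,
    "academic": 0.6,  # non-ARL academic
    "arl": 0.9,
}


def audience_level(holdings_by_type):
    """Return a holdings-weighted audience score between 0 and 1.

    `holdings_by_type` maps a library profile label to the number of
    libraries of that profile holding the item. Unknown labels are ignored.
    Returns None when no recognized library holds the item.
    """
    total = sum(holdings_by_type.get(t, 0) for t in AUDIENCE_WEIGHTS)
    if total == 0:
        return None
    weighted = sum(
        weight * holdings_by_type.get(t, 0)
        for t, weight in AUDIENCE_WEIGHTS.items()
    )
    return weighted / total
```

Under these assumed weights, an item held mostly by ARL libraries scores near 0.9 and one held mostly by school libraries scores near 0.1, which is the kind of signal collection management or readers' advisory tools could sort on.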
33
“Last Copy”: Identifying At-Risk Materials
- ~23 million WorldCat records have only a single holding attached
- Libraries need to know what portions of their collections are:
  - Rare …
  - Rare and valuable …
  - “Last copy” (artifact and/or content)
- Identification of rare materials is essential intelligence in support of storage, digitization, and preservation decision-making
- Data-mining study of Vanderbilt holdings in WorldCat:
  - Identified 23,000 items held uniquely by Vanderbilt
  - ~60% are print books
  - ~60% produced prior to 1950; ~25% produced after 1970
- Paper forthcoming!
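Finding candidate "last copies" amounts to selecting records with exactly one holding attached, then filtering by the holding library. A minimal sketch, assuming holdings are available as a mapping from record identifier to the set of holding-library symbols; the identifiers, symbols, and function names are hypothetical and the data shape is an assumption, not a WorldCat format.

```python
def singly_held(holdings):
    """Map each record with exactly one holding to its sole holding library.

    `holdings` maps a record identifier to a set of library symbols.
    """
    return {
        record: next(iter(libraries))
        for record, libraries in holdings.items()
        if len(libraries) == 1
    }


def unique_to(holdings, library):
    """Return the sorted record identifiers held only by `library`."""
    return sorted(
        record
        for record, holder in singly_held(holdings).items()
        if holder == library
    )
```

A single holding only flags a record for review: as the slide notes, a uniquely held item may be rare without being valuable, so the output is a starting list for storage, digitization, or preservation decisions rather than a verdict.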
34
Thank you!
OCLC Research: http://www.oclc.org/research/
Lorcan: http://orweblog.oclc.org/