Published by Jerome Ray. Modified over 9 years ago.
1
Prospecting in the library data mines
  Brian Lavoie, Consulting Research Scientist, OCLC Programs & Research
  OCLC Programs & Research Annual Partners Meeting
  Washington, DC, June 4, 2007
2
Making data work harder
  Data is an asset
    Informs planning and decision-making
    Drives new forms of services
  Libraries have many data assets
    Bibliographic, holdings, usage, reference inquiries, …
  Opportunities to collect data increase in network spaces …
    Web site traffic, click-through patterns, e-usage, …
  Make data work harder
    Use library data in innovative ways to create value
3
Data mining & OCLC Research
  Networks of collaboration and coordination
    Decisions taken in “system-wide context”
    Focus on resources of “system”
    Mass digitization, cooperative print storage, shared discovery environments, …
  As library networks develop and expand, opportunities arise to create value through:
    Collective action
    Aligning local collections with system-wide environment
  Data is context
  Research area focused on data mining activities
    Aggregate collections
    “System-wide collection” (as represented in WorldCat)
4
Managing the collective collection
  Mass digitization
  “Last copies”
  Long tail
5
Mass digitization
  Google Book Search (aka Google Print for Libraries)
  Aggregate collection of digitized print books (combined holdings of Harvard, Michigan, Oxford, NYPL, and Stanford)
  Data-mining to provide empirical context to inform community-wide dialog
  http://www.dlib.org/dlib/september05/lavoie/09lavoie.html
6
“Rareness is common”
  System-wide print book collection: ~32 million print books
  Holdings distribution:
    Held by 1: 37%
    Held by 2 - 5: 30%
    Held by 6 - 25: 20%
    Held by 26 - 50: 5%
    Held by 51 - 100: 3%
    Held by > 100: 5%
  Data-mining to better understand nature of the “collective collection”
  Identify rare & unique materials in system-wide collection (“last copies”)
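A breakdown like the one on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used for the WorldCat analysis: the input format (a mapping from a record identifier to its number of holding libraries), the function names, and the sample data are all assumptions made up for the example. Only the band boundaries come from the slide.

```python
from collections import Counter

# Holdings bands from the slide: (label, lowest count, highest count).
# None marks the open-ended top band.
BANDS = [
    ("1", 1, 1),
    ("2-5", 2, 5),
    ("6-25", 6, 25),
    ("26-50", 26, 50),
    ("51-100", 51, 100),
    (">100", 101, None),
]


def band_for(count):
    """Return the label of the band that a holdings count falls into."""
    for label, low, high in BANDS:
        if count >= low and (high is None or count <= high):
            return label
    raise ValueError(f"holdings count must be at least 1, got {count}")


def holdings_distribution(holdings_by_record):
    """Return {band label: percent of records in that band}."""
    tally = Counter(band_for(c) for c in holdings_by_record.values())
    total = sum(tally.values())
    if total == 0:
        return {label: 0.0 for label, _, _ in BANDS}
    return {label: 100.0 * tally[label] / total for label, _, _ in BANDS}


def last_copies(holdings_by_record):
    """Return the sorted identifiers of records held by exactly one library."""
    return sorted(rec for rec, c in holdings_by_record.items() if c == 1)


# Hypothetical sample data: record id -> number of holding libraries.
sample = {
    "rec-a": 1, "rec-b": 1, "rec-c": 3, "rec-d": 12,
    "rec-e": 40, "rec-f": 75, "rec-g": 250, "rec-h": 2,
}

if __name__ == "__main__":
    for label, percent in holdings_distribution(sample).items():
        print(f"Held by {label}: {percent:.1f}%")
    print("Last copies:", last_copies(sample))
```

Keeping the bands in one table makes it easy to try other groupings without touching the counting logic.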
7
The Library Long Tail (using holdings as measure of popularity)
  [Chart axes: Number of Holdings; Items ranked by system-wide popularity]
  HEAD: Top 10% of WorldCat records (ranked by holdings) account for 80% of total WorldCat holdings
    Small proportion of items account for lion’s share of collecting activity
  LONG TAIL: Bottom 90% of WorldCat records (ranked by holdings) account for 20% of total WorldCat holdings
    Everything else spread out across Long Tail of diffuse collecting activity
  Data-mining to inform strategies/policies aimed at optimizing system-wide supply & demand for library materials
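The head/tail split described on this slide amounts to ranking records by holdings and asking what share of all holdings the top slice accounts for. The sketch below shows that calculation under stated assumptions: the function name, the plain list of per-record holdings counts, and the sample figures are hypothetical and chosen only so the arithmetic is easy to follow. It is not drawn from the WorldCat analysis itself.

```python
def head_share(holdings_counts, head_fraction=0.10):
    """Return the fraction of total holdings accounted for by the head.

    The head is the top `head_fraction` of records when ranked by
    holdings count, and always contains at least one record.
    """
    ranked = sorted(holdings_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    head_size = max(1, round(len(ranked) * head_fraction))
    return sum(ranked[:head_size]) / total


# Hypothetical holdings counts for ten records (they sum to 100).
counts = [80, 4, 3, 3, 2, 2, 2, 2, 1, 1]

if __name__ == "__main__":
    share = head_share(counts)
    print(f"Head (top 10% of records): {share:.0%} of holdings")
    print(f"Long tail (bottom 90%): {1 - share:.0%} of holdings")
```

Varying `head_fraction` shows how sharply holdings concentrate, which is the kind of evidence the slide suggests feeding into supply-and-demand strategies.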
8
Others …
  Registry of Copyright Evidence
  New York Art Museum study
9
Shared print storage
  Use library data to inform decision-making:
    Data about library assets (bibliographic)
    Data about choices involving these assets (holdings, circ., ILL)
    System-wide aggregation (larger aggregation = richer context)
  Shared print storage decision-making:
    Data about assets (local inventories of print materials)
    Data about system-wide availability (holdings)
    Data about usage (local & system-wide)
  Role of Research:
    Data collection
    Data-mining analysis in support of project needs
    Inform community dialog on shared print storage issues
    Analyze “collective collection” in shared print context
    Support development of effective print storage strategies
    Standardize analysis to maximize applicability/re-use