Anatomy of Aggregate Collections: The Example of Google Print for Libraries Brian Lavoie Senior Research Scientist OCLC Research OCLC Members Council Meeting.

Presentation transcript:

Anatomy of Aggregate Collections: The Example of Google Print for Libraries Brian Lavoie Senior Research Scientist OCLC Research OCLC Members Council Meeting October 2005

Aggregate collections
- Boundaries between local and external collections increasingly blurred …
- Resource sharing (digital/network technologies)
- Cooperative collection management (resource allocation)
- Shift in focus to resources of the system (or subsets of the system), rather than individual collections
- Need data to support/illuminate system-wide perspective
- Characterize/analyze aggregate collections
- WorldCat: largest aggregate collection
- Aggregate holdings of >20,000 libraries
- Bridge from local to system-wide perspective

The system-wide print book collection as represented in WorldCat (January 2005) ~55 million ~41 million ~35 million ~32 million print books More information:

Google Print for Libraries
- Aggregate collection of print books
- Aggregate print book holdings of five major research libraries (Harvard, Michigan, Oxford, NYPL, and Stanford)
- Focus on copyright issues; very little discussion of Google Print for Libraries as an aggregate collection
- What are characteristics of this aggregate collection?
- How does it relate to the system-wide collection?
- WorldCat: useful data source for analysis
- Lavoie, Connaway, Dempsey: "Anatomy of Aggregate Collections: The Example of Google Print for Libraries", D-Lib (September 2005)

G5 coverage of system-wide print book collection
10.5 million unique books

Holdings overlap
Potential redundancy rate of 40 percent
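The coverage and redundancy figures on these slides can be read as simple set arithmetic over library holdings. The following is a minimal sketch under that assumption, not the method used in the original analysis: the function name `aggregate_stats`, the toy library names, and the book identifiers are all hypothetical, and redundancy is taken to mean the share of total holdings that duplicate a book already counted (one reading that is consistent with the ~18 million holdings and 10.5 million unique books reported in this deck).

```python
def aggregate_stats(holdings, system_wide_size):
    """Summarize an aggregate collection.

    holdings: dict mapping a library name to the set of book IDs it holds.
    system_wide_size: number of distinct books in the system-wide collection.
    """
    total_holdings = sum(len(books) for books in holdings.values())
    unique_books = set().union(*holdings.values())
    n_unique = len(unique_books)
    return {
        "total_holdings": total_holdings,
        "unique_books": n_unique,
        # Share of the system-wide collection present in the aggregate.
        "coverage": n_unique / system_wide_size if system_wide_size else 0.0,
        # Share of holdings that repeat a book held elsewhere in the group.
        "redundancy": 1 - n_unique / total_holdings if total_holdings else 0.0,
    }


# Toy data: three hypothetical libraries, system-wide collection of 20 books.
toy = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 6, 7}}
stats = aggregate_stats(toy, system_wide_size=20)
print(stats)
```

In the toy data, ten holdings collapse to seven distinct books, so three of the ten holdings are redundant.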

Language distribution
Table columns: Language, Google 5, System-wide
Languages listed: English, German, French, Spanish, Chinese, Russian, Italian, Japanese, Hebrew, Arabic, Portuguese, Polish, Dutch, Latin, Korean, Swedish (0.01, < 0.01), All others
More than 430 languages in Google 5 collection

Cumulative age distribution of G5 holdings
More than 80 percent of Google 5 collection still in copyright

Works
- Coverage slightly higher (35 %)
- Holdings overlap slightly greater (56 % held uniquely)
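Counting at the level of works rather than individual editions tends to raise coverage, because holding any one edition is enough to count the work as covered. The sketch below illustrates that effect on toy data. The mapping `work_of` and all identifiers are hypothetical; it is not the work-clustering procedure used for the figures above.

```python
def to_works(manifestations, work_of):
    """Map a set of manifestation IDs to the set of works they belong to."""
    return {work_of[m] for m in manifestations}


# Hypothetical mapping: six manifestations (editions) grouped into four works.
work_of = {
    "m1": "w1", "m2": "w1",
    "m3": "w2",
    "m4": "w3", "m5": "w3",
    "m6": "w4",
}
system_wide = set(work_of)          # every manifestation in the system
collection = {"m1", "m3", "m4"}     # manifestations held by the aggregate

manifestation_coverage = len(collection) / len(system_wide)
work_coverage = len(to_works(collection, work_of)) / len(to_works(system_wide, work_of))

print(f"manifestation-level coverage: {manifestation_coverage:.2f}")
print(f"work-level coverage: {work_coverage:.2f}")
```

Here the collection holds half of the manifestations but three of the four works, which is the same direction of change the slide reports.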

Some speculation …
- What results would have been obtained if a different group of libraries had been selected?
- What incremental extensions to coverage can be obtained by adding additional library collections to original Google 5?
- Chose 5 new libraries:
  - Small US liberal arts college
  - Large US public university
  - Large US private university
  - Large US metropolitan library
  - Large Canadian university

Beyond the Google 5 …

                        New Google 5    Original Google 5
Total holdings:         ~8 million      ~18 million
Total unique books:     5.9 million     10.5 million
% of system-wide:       18 percent      33 percent
Redundant holdings:     26 percent      42 percent

Impact by library type: % of holdings unique relative to original G5 collection
- Large US metropolitan library: 39 percent (most unlike G5)
- Large US private university: 25 percent
- Large Canadian university: 23 percent
- Large US public university: 21 percent
- Small US liberal arts college: 13 percent (most like G5)

The Google 10
Original Google 5 (10.5 million books)
Google 10 collection: 12.3 million books, +1.8 million (17 %)
Diminishing returns?
- Original G5: ~18 million holdings, 58% unique
- New G5: ~8 million holdings, 22% unique
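The diminishing-returns question comes down to how many books a new group of libraries adds that the existing aggregate does not already hold. The sketch below shows that calculation on toy sets and then reworks the rounded figures from this slide. The function name `incremental_gain` and the toy IDs are hypothetical, and the arithmetic uses only the rounded numbers shown in the deck, so the results are approximate.

```python
def incremental_gain(base, additions):
    """Return the items in `additions` that are not already in `base`."""
    return additions - base


# Toy example with made-up book IDs.
base = {1, 2, 3, 4, 5}
newcomer = {4, 5, 6}
gain = incremental_gain(base, newcomer)
print(f"new books contributed: {len(gain)} of {len(newcomer)} held")

# Rounded slide figures, in millions.
g5_unique = 10.5        # unique books in the original Google 5
g10_unique = 12.3       # unique books in the combined Google 10
new_g5_holdings = 8.0   # total holdings of the five added libraries

added = g10_unique - g5_unique              # books new to the aggregate
growth = added / g5_unique                  # relative growth of the collection
share_new = added / new_g5_holdings         # share of new holdings that add a book

print(f"added: {added:.1f} million")
print(f"growth over original G5: {growth:.1%}")
print(f"share of new holdings that are unique: {share_new:.1%}")
```

On these rounded inputs the added books come to about 1.8 million, roughly 17 percent growth, and a little over a fifth of the new libraries' holdings contribute a book not already in the original group.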

Anatomy of aggregate collections
- Mass digitization programs and other aggregate collections increasingly common features of library landscape
- Effective decision-making/planning aided by convergence on set of standard questions that help map out anatomy of aggregate collections
- Example: mass digitization programs
  - What are characteristics of overarching population of materials that is target of digitization effort?
  - How much of population will digitization effort cover?
  - What is potential degree of redundancy?
  - What bibliographic unit is focus of digitization (e.g., manifestations, expressions, works)?
  - What number of participants and combination of institution types is optimal for obtaining maximum benefit with minimum cost?

Aggregate collections and WorldCat
- WorldCat more than tool for cataloging and reference; also strategic resource for managing aggregate collections
- OCLC Group Services
- OCLC WorldCat Collection Analysis Service
- OCLC Research data-mining activities
- Web site: