Ithaka. A Systemwide View of Library Collections. Brian Lavoie, OCLC Research; Roger C. Schonfeld, Ithaka. CNI Spring Task Force Meeting, April 5, 2005
Systemwide View of Library Collections: Print collections have been changing, as the distinction between local and external resources is increasingly blurred due to resource sharing. Digitization combined with network technologies creates opportunities for one copy of a resource to be shared across many libraries. These forces will inevitably lead to a shift in focus to the resources of the system, rather than individual library collections.
Mass Digitization: A great deal of public and private investment has gone into digitization programs, e.g., JSTOR and ARTstor, and of course mass digitization spearheaded via GooglePrint. Digitization opportunities are unlimited; resources are not. How to determine priorities? What programs of digitization will be necessary to meet the needs of the scholarly community?
Print Preservation: From a systemwide perspective, what preservation framework makes most sense for print resources? How have preservation frameworks changed over time? As retrospective materials become increasingly available in digital form, will new frameworks for print preservation be necessary?
What Are We Going to Do Today? The kinds of collaborations necessary to begin to take advantage of a systemwide perspective are very hard, from both economic and political standpoints. We will not be proposing any answers! Instead, we thought to take advantage of the WorldCat resource, which affords the broadest view of print collections, to build a bridge from a local perspective to the beginnings of a systemwide perspective. Today's presentation focuses on print books.
Data Sources: WorldCat is the world's largest and most comprehensive bibliographic database. More than 20,000 libraries worldwide have contributed to the development of WorldCat. Copy of WorldCat from January 2005: ~55 million records. Copy of WorldCat holdings file from January 2005: ~950 million holdings.
Data Source Limitations: Not all published materials are cataloged in WorldCat. Not all library holdings are represented in WorldCat. WorldCat largely reflects North American library collections. So WorldCat does not embody the whole universe of library collections and holdings, but it's a very good approximation!
1. The Systemwide Collection: Size, Age
How Many Books Are Held in the Systemwide Collection?
Works and Manifestations: FRBR (Functional Requirements for Bibliographic Records) is a hierarchy of bibliographic entities: Works, Expressions, Manifestations, Items. Work: a distinct intellectual or artistic creation, e.g., Macbeth. Manifestation: the physical embodiment of an expression of a work, e.g., Macbeth, Folger Shakespeare Library edition, published in paperback by Washington Square Press (2004). WorldCat records describe FRBR manifestations. Works are identified using the OCLC FRBRization algorithm, which converts MARC21 bibliographic databases into FRBR work-sets.
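To make the work/manifestation distinction concrete, here is a minimal sketch of grouping manifestation-level records into work-sets. It is not the OCLC FRBRization algorithm: the normalized author/title key, the record fields (`id`, `author`, `title`), and the function names are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def work_key(author: str, title: str) -> str:
    """Build a crude normalized author/title key (hypothetical, illustration only)."""
    def norm(text: str) -> str:
        # Strip diacritics, lowercase, drop punctuation, collapse whitespace.
        text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
        text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
        return " ".join(text.split())
    return f"{norm(author)}/{norm(title)}"


def group_into_works(records):
    """Map each work key to the list of manifestation ids that share it."""
    works = defaultdict(list)
    for rec in records:
        works[work_key(rec["author"], rec["title"])].append(rec["id"])
    return dict(works)


# Toy manifestation records (invented data, not drawn from WorldCat).
records = [
    {"id": "m1", "author": "Shakespeare, William", "title": "Macbeth"},
    {"id": "m2", "author": "Shakespeare, William", "title": "Macbeth."},
    {"id": "m3", "author": "Shakespeare, William", "title": "Hamlet"},
]
work_sets = group_into_works(records)
```

In this toy data the two Macbeth manifestations collapse into one work and Hamlet forms another, so three manifestations yield two works.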
Most Book Works Have Few Manifestations
Print Book Manifestations and Works – and Digital Manifestations
How Old Are the Components of the Systemwide Collection? Cumulative Book Works/Manifestations Over Time
How Old Are the Components of the Systemwide Collection? Book Works/Manifestations per Year
Age of Works and Manifestations: Relative to 1923 (millions)
2. Individual Collections Cumulate to Form the System: How will digitization bring them together virtually?
Minimal Overlap: Book Works Held by X or More Libraries (in millions)
Works Held Broadly: Book Works Held by X or More Libraries (in millions)
Works Held Broadly: Book Works Held by X or More Libraries, as Percent of Total Book Works
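The "held by X or more libraries" measure on these slides can be sketched as a simple threshold count over per-work holdings. The input shape (a mapping from work id to number of holding libraries), the function name, and the sample numbers are assumptions for illustration; none of the values come from WorldCat.

```python
def held_by_at_least(holdings_per_work, thresholds):
    """For each threshold X, return (count, percent) of works held by X or more libraries."""
    total = len(holdings_per_work)
    result = {}
    for x in thresholds:
        count = sum(1 for h in holdings_per_work.values() if h >= x)
        percent = 100.0 * count / total if total else 0.0
        result[x] = (count, percent)
    return result


# Invented holdings counts for five hypothetical works.
sample = {"w1": 1, "w2": 1, "w3": 3, "w4": 12, "w5": 250}
table = held_by_at_least(sample, [1, 2, 10, 100])
```

The same function yields both views shown on the slides: the absolute count and the share of total works at each threshold.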
The Virtual System in Practice: GooglePrint digitization initiative. Questions: How many print books does this initiative potentially impact? What proportion of the systemwide print book collection does this represent? Overlap (how much held broadly? how much held uniquely?) A forthcoming paper from OCLC researchers will offer some perspective on these questions. Hopefully, work like this will help to establish a set of important questions and metrics that need to be addressed when considering digitization initiatives and when considering the implications of a changing world of research and learning for collections.
3. How Is Rareness Distributed through the System?
Systemwide Holdings of Print Works
More than 9 million works are held only once
4. What Systemwide Preservation Frameworks Have Served Us?
The Growth and Peak in Average Holdings Over Time
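One way to compute an "average holdings over time" series is to average the number of holding libraries per work within each publication decade. This is only a sketch under assumptions: the slide does not say how the data were binned, and the decade grouping, the `(year, holdings)` input shape, and the sample values are all hypothetical.

```python
from collections import defaultdict


def average_holdings_by_decade(works):
    """Average holdings per work, grouped by publication decade.

    `works` is an iterable of (publication_year, holdings_count) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # decade -> [sum of holdings, number of works]
    for year, holdings in works:
        decade = (year // 10) * 10
        totals[decade][0] += holdings
        totals[decade][1] += 1
    return {decade: s / n for decade, (s, n) in sorted(totals.items())}


# Invented (year, holdings) pairs for illustration only.
sample = [(1895, 4), (1899, 6), (1962, 40), (1968, 20), (2001, 9)]
averages = average_holdings_by_decade(sample)
```

Plotting such a series by decade would show where average redundancy rises and where it peaks.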
Steady, Gradual Nineteenth Century Growth in Works Held Many Times…
…Rapid Postwar Increase in Works Held Many Times
Of Works with Multiple Holdings, Steady Increase Through the 1960s in the Proportion Held Many Times
Summary and Discussion
Summary: Findings. 1. Roughly 26 million print title works, represented in 32 million print title manifestations, are held by OCLC member libraries. This should be seen as a minimum in considering the number of printed books over time. Half of the books date from the period since … How can a mass digitization strategy effectively manage the intellectual property ramifications of this finding? 2. Publications are distributed across a wide number of libraries, and any mass digitization strategy that ignores this distributional reality is likely to omit numerous works. How should this finding impact the library system's planning for a massive format migration?
Summary: Findings. 3. Rareness is very common within the system. This has been recognized by many librarians but is not always taken into account in policy development. How will any future print preservation strategy address this reality? Can data on rareness help to inform digitization strategies? 4. Redundancy in holdings across the system has changed over time. How has this led our framework for preservation to become more or less secure? What lessons should be drawn as we consider other print preservation strategies, such as paper repositories, particularly in the era of mass digitization? What lessons might there be for digital preservation?
More information … More in-depth article forthcoming … Contact us with comments and questions: Brian Lavoie: Roger C. Schonfeld: