KAT HAGEDORN HATHITRUST SPECIAL PROJECTS COORDINATOR UNIVERSITY OF MICHIGAN LIBRARIES OCTOBER 9, 2009 Seamless Sharing: NYU, HathiTrust, ReCAP and the Cloud Library With thanks to Constance Malpas at OCLC and John Wilkin at University of Michigan for their considerable contributions
Overview: Need for cloud library; Our pilot project; Brief overview of HathiTrust; Scope and process for pilot project; Expectations and benefits
Cloud Library, not cloud computing: Similar but vastly different; Necessity/desire to share resources (leverage shared investment, reduce local cost); Multiple digital and print repositories; Repositories can now move into a cloud that will become a shared network resource; What infrastructure is needed?
Registry Transfers Borrowing System Shared Collections Withdrawals Retrievals Commitments Holdings Loans Disclose Aggregate holdings and joint commitments constitute a shared asset enabling collaborative management strategies Procedures Policies Infrastructure Assets Local Collections Off-Site Collections ReCAP Digitized Library Collections
Perceived need: Already good support of other virtual shared services (e.g., ILL, doc delivery). What exists in off-site storage and digital repositories that isn't currently accessible? Collection development mechanisms need to discover accessibility and preservation statuses. How should we build such a service for consumers?
Partners in pilot: NYU – model customer (acute space pressures, major library renovation, limited mandate to build local collection of record); ReCAP – model supplier (large-scale shared academic storage collection); HathiTrust – model supplier (large-scale shared digital repository); OCLC Research and CLIR – consultants & convener
Demand for services: Multiple, sometimes overlapping, reasons institutions will be interested in being part of a cloud library: preserving titles that are rare and/or special in some manner; removing titles that are duplicated across many institutions; gaining the added value of shared materials in a digital repository (discovery, search); contributing to a public good
A bit about HathiTrust To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge materials converted from print improve access …to meet the needs of the co-owning institutions reliable and accessible electronic representations coordinate shared storage strategies public good …sustaining the historical record simultaneously …centralized …open
Growth of HathiTrust Includes ingest of materials not from Google (GBS)
Goals of pilot study service expectations for both digital and print repositories cost/benefit analyses for sharing resources processes for discovery of shareable titles not the build-out of technical solutions
N=7.6M ReCAP N=3.8M HathiTrust Material that NYU can already source through existing ILL – enhance local collection Material that NYU can obtain through HT dependent on copyright status – enhance local collection N=2.3M opportunities for institutional cooperation shared policy frameworks joint service agreements increased operational efficiencies Intersections Material that NYU may choose to relegate with appropriate service level agreement Material that NYU can relegate with a high degree of confidence Material that NYU may choose to relegate based on copyright/availability
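The intersections on this slide can be illustrated with plain set operations on OCLC numbers. This is a minimal sketch, not the pilot's actual method: the function name `classify_holdings`, the region labels, and the toy OCLC numbers are all assumptions made for illustration, and the regions are not mapped onto the slide's specific categories.

```python
def classify_holdings(nyu, recap, hathi):
    """Split NYU's OCLC numbers by where else they are held.

    All three arguments are sets of OCLC numbers (hypothetical inputs).
    """
    return {
        "nyu_recap_hathi": nyu & recap & hathi,   # held in all three
        "nyu_recap_only": (nyu & recap) - hathi,  # print overlap only
        "nyu_hathi_only": (nyu & hathi) - recap,  # digital overlap only
        "nyu_only": nyu - recap - hathi,          # no overlap found
    }


# Toy data: made-up OCLC numbers, not real holdings.
nyu = {1, 2, 3, 4, 5}
recap = {2, 3, 6}
hathi = {3, 4, 7}

regions = classify_holdings(nyu, recap, hathi)
for name, ids in regions.items():
    print(name, sorted(ids))
```

The four regions are disjoint and together cover every NYU title, so each title falls into exactly one category for relegation decisions.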
Process for discovery of overlap: Ingestion on a monthly basis; Checking of OCLC numbers (records without them can't be processed), with use of xID to derive more; New data structure…
Workflow (as listed on the diagram): Harvest Hathi metadata; Derive additional OCLC numbers via xID; Extract WorldCat data; Extract OCLC numbers; Normalize rights values; Process, index, analyze; Join Hathi and WorldCat data. Timing: Monthly data harvest; 2 weeks per cycle to process. Reports: Rights anomalies report; OCLCnum report; Overlap analysis report
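A rough sketch of the normalize-and-join steps in this workflow, under stated assumptions: the record fields (`id`, `oclc`, `rights`, `holdings`), the `RIGHTS_MAP` labels, and both function names are hypothetical, and the xID lookup is left out entirely because its interface is not described here. Records lacking an OCLC number are set aside, mirroring the note that they cannot be processed.

```python
# Hypothetical mapping from raw rights labels to normalized values.
RIGHTS_MAP = {
    "pd": "public_domain",
    "public domain": "public_domain",
    "ic": "in_copyright",
    "in copyright": "in_copyright",
}


def normalize_rights(raw):
    """Map a raw rights label to a normalized value, or 'unknown'."""
    return RIGHTS_MAP.get(raw.strip().lower(), "unknown")


def join_on_oclc(hathi_records, worldcat_records):
    """Join Hathi records to WorldCat records on OCLC number.

    Returns (joined rows, ids missing an OCLC number, ids with
    unrecognized rights values).
    """
    worldcat_by_oclc = {rec["oclc"]: rec for rec in worldcat_records}
    joined, missing_oclc, rights_anomalies = [], [], []
    for rec in hathi_records:
        oclc = rec.get("oclc")
        if oclc is None:
            missing_oclc.append(rec["id"])  # cannot be processed
            continue
        rights = normalize_rights(rec.get("rights", ""))
        if rights == "unknown":
            rights_anomalies.append(rec["id"])
        match = worldcat_by_oclc.get(oclc)
        if match is not None:
            joined.append({"id": rec["id"], "oclc": oclc,
                           "rights": rights, "holdings": match["holdings"]})
    return joined, missing_oclc, rights_anomalies


# Toy records: identifiers and counts are made up.
hathi_records = [
    {"id": "vol1", "oclc": 101, "rights": "PD"},
    {"id": "vol2", "oclc": None, "rights": "ic"},
    {"id": "vol3", "oclc": 103, "rights": "???"},
    {"id": "vol4", "oclc": 999, "rights": "ic"},
]
worldcat_records = [{"oclc": 101, "holdings": 12},
                    {"oclc": 103, "holdings": 3}]

joined, missing_oclc, rights_anomalies = join_on_oclc(
    hathi_records, worldcat_records)
```

The three return values loosely correspond to the overlap analysis, OCLCnum, and rights anomalies reports named on the slide; how the pilot actually produced those reports is not specified.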
HathiTrust: Looking forward. Ingesting from 4 institutions (UC, Indiana, Wisconsin, Michigan), more to come; Moving from off-site storage scanning to main libraries (result: slight changes in number of PD volumes); Change in membership …broader base of institutions for cost-sharing; Future contracts will mostly be picklists; Internet Archive ingest starts this winter/late fall; Completion of TRAC certification
Expectations: Service expectations for both HathiTrust and ReCAP (turnaround time, continuity of operations, access privileges); For ReCAP, agreements similar to current processes; With HathiTrust, all are par for the course
Partners in cloud library with HathiTrust: With HathiTrust as a service partner, institutions can reap the benefits of… preservation of texts and metadata; longevity and perpetuity; trust and reliability; access to titles not held by library (comprehensive); opportunity for voice in HathiTrust development
Outcome: Increased reliance on a network of collections and services with a robust underpinning of shared policy and service infrastructures that are jointly owned by participating libraries. Naturally, as the number of participants grows, the value of the partnership increases…
Questions? Constance Malpas (OCLC); John Wilkin (HathiTrust); Kat Hagedorn (HathiTrust)