Building the Universal Library: Introducing HathiTrust Patricia A. Steele Indiana University Libraries John Price Wilkin University of Michigan Libraries.

Presentation transcript:

Building the Universal Library: Introducing HathiTrust Patricia A. Steele Indiana University Libraries John Price Wilkin University of Michigan Libraries December 8, 2008

The Vision Universal Digital Library Common Goal Single Entity but Partnership of Many Libraries

The Reasons Google Digitization Project – Collective agreement with CIC announced in June 2007 – U of Michigan and U of Wisconsin projects already underway

The Reasons Librarians value preservation – How to ensure digital files are preserved?

The Reasons Librarians value access – How to create a comprehensive and coherent body of materials? Librarians believe in cooperation – How do you achieve a common goal?

The Beginning In 2007, CIC agreed to establish a shared digital repository University of Michigan and Indiana University initial leaders of this effort

The Beginning CIC Shared Digital Repository HathiTrust

The Name The name… hathitrust.org hathi.org olifant.org silverback.org kingkong.org toomai.org

The Name The meaning behind the name – Hathi (hah-tee)--Hindi for elephant – Big, strong – Never forgets, wise – Secure – Trustworthy

Banking Analogy

The Logo

The Partners When announced in October 2008, full partners included: – University of California system – CIC (Committee on Institutional Cooperation) – University of Virginia. CIC members: University of Chicago, University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University, University of Wisconsin-Madison

vs. The Differences

Sorting the Issues Cost Model – Partners charged a one-time start-up fee based on the number of volumes added to the repository, in addition to an annual fee for the curation of those volumes.

Sorting the Issues Governance HathiTrust Operational Advisory Board Executive Management Group Strategic Advisory Board

Sorting the Issues Impact of Google settlement – Full access to materials – More quickly than a court win would have permitted – Content would otherwise have been locked up for years

HathiTrust Architecture – Storage in Ann Arbor and Indianapolis – Encrypted backup to a 2nd Ann Arbor location – Inbound validation, standards-based object storage and related metadata – Rights database for rights metadata – Online catalog as source and storage for descriptive metadata

Page image and metadata repository Objectives: – A guiding principle: store archival images, create deliverables on demand – Incorporate TDR (trusted digital repository) practices. Simple filesystem layout using Pairtree structure – One directory per volume, all files inside a zip with an associated METS file – Use of a namespace lets identifiers that would otherwise conflict coexist – Namespaces for institutions and, if needed, types of identifiers within the institution
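The Pairtree layout named on this slide can be sketched as follows. This is a minimal illustration based on the public Pairtree specification (identifier cleaning, then splitting into two-character directory names), not HathiTrust's own code. The `obj` root directory, the function names, and the namespaces `nsa` and `nsb` are assumptions made up for the example.

```python
# Characters the Pairtree spec hex-encodes as ^hh in the first cleaning pass.
_HEX_ENCODE = set('"*+,<=>?\\^|')
# Second pass: single-character swaps for common identifier punctuation.
_SWAP = {"/": "=", ":": "+", ".": ","}


def clean_id(identifier: str) -> str:
    """Make an identifier safe to use as a file or directory name."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append(f"^{byte:02x}")  # non-visible or reserved character
        else:
            out.append(_SWAP.get(ch, ch))
    return "".join(out)


def pairtree_path(identifier: str) -> str:
    """Split the cleaned identifier into two-character path segments."""
    cleaned = clean_id(identifier)
    return "/".join(cleaned[i:i + 2] for i in range(0, len(cleaned), 2))


def volume_dir(namespace: str, local_id: str, root: str = "obj") -> str:
    """One directory per volume; `root` is a hypothetical top-level folder."""
    return "/".join(
        [root, namespace, "pairtree_root", pairtree_path(local_id), clean_id(local_id)]
    )


if __name__ == "__main__":
    # The same local identifier under two namespaces maps to two directories.
    print(volume_dir("nsa", "39015012345678"))
    print(volume_dir("nsb", "39015012345678"))
```

Because the namespace is a path component above the pair tree, two institutions can use the same local identifier without colliding on disk.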

Rights database, pt. 1 What information to store? – Considered complexity and maintenance – Considered using MARC directly – Needed to accommodate both bib record-derived rights and manual overrides. Approach: examine bib record, determine authoritative copyright status, store rights attribute, source, reason, and timestamp. Stored in MySQL

Rights database, pt. 2 Each rights attribute must have a reason. – bib: bibliographically-derived – man: manual access control override – ddd: due diligence documented. Typical rights attributes in use – pd: public domain – pdus: public domain for US viewers* – inc: in copyright – nobody (override): no access. Source (e.g., google)
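The rights record described on these two slides (attribute, reason, source, timestamp) can be sketched as a small data structure. This is an illustration only, not the actual MySQL schema. The class and function names are hypothetical. The rule that a manual or due-diligence record outranks a bib-derived one, and the use of a viewer country code for `pdus`, are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Codes exactly as listed on the slides.
ATTRIBUTES = {"pd", "pdus", "inc", "nobody"}
REASONS = {"bib", "man", "ddd"}


@dataclass(frozen=True)
class RightsRecord:
    namespace: str
    volume_id: str
    attribute: str   # e.g. "pd"
    reason: str      # every attribute must carry a reason
    source: str      # e.g. "google"
    timestamp: datetime

    def __post_init__(self):
        if self.attribute not in ATTRIBUTES:
            raise ValueError(f"unknown rights attribute: {self.attribute}")
        if self.reason not in REASONS:
            raise ValueError(f"unknown reason: {self.reason}")


def effective_rights(records):
    """Assumed rule: non-bib reasons win over bib; ties go to the newest record."""
    return max(records, key=lambda r: (r.reason != "bib", r.timestamp))


def can_view_full_text(record: RightsRecord, viewer_country: str) -> bool:
    """Assumed access rule; viewer_country would come from a GeoIP lookup."""
    if record.attribute == "pd":
        return True
    if record.attribute == "pdus":
        return viewer_country == "US"
    return False  # "inc" and "nobody"


if __name__ == "__main__":
    bib = RightsRecord("nsa", "123", "pd", "bib", "google",
                       datetime(2008, 6, 1, tzinfo=timezone.utc))
    override = RightsRecord("nsa", "123", "nobody", "man", "google",
                            datetime(2008, 5, 1, tzinfo=timezone.utc))
    print(effective_rights([bib, override]).attribute)
```

Keeping the reason alongside the attribute is what lets a bibliographically derived status and a manual override for the same volume be told apart.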

[Diagram: Pageturner page image retrieval. Components shown: rights database, GeoIP database, archival page image, library catalog metadata, METS XML, XSLT, online page image, XML, HTML, browser]

HathiTrust and TRAC Automatic validation in GROOVE – Check barcode check digit using Luhn algorithm – Fixity check on JPG, TIFF, UTF-8 using MD5 – Well-formedness and embedded metadata check on JPG, TIFF, UTF-8 using JHOVE – Various completeness cross-checks – Failures retried, admin will eventually intervene. Periodic fixity checks using MD5
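Two of the checks listed on this slide, the Luhn check digit and the MD5 fixity check, can be sketched with the standard library. This is not GROOVE itself. The function names are hypothetical, and the standard Luhn (mod 10) scheme over the whole digit string is assumed; the slide does not say which barcode positions are covered.

```python
import hashlib


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the standard Luhn check."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit (the check digit), doubling every second one.
    for pos, ch in enumerate(reversed(digits)):
        d = int(ch)
        if pos % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def md5_hex(data: bytes) -> str:
    """MD5 digest of a file's bytes as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def fixity_ok(data: bytes, recorded_md5: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return md5_hex(data) == recorded_md5.strip().lower()


if __name__ == "__main__":
    print(luhn_valid("79927398713"))
    print(fixity_ok(b"hello", "5d41402abc4b2a76b9719d911017c592"))
```

The same `fixity_ok` comparison would serve both the inbound check and the periodic re-check, since each compares a freshly computed digest with the stored one.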

[Diagram: OAIS Reference Model. Labels shown: GRIN, Internal Data Loading; Google, [OCA], In-house Conversion; MARC record extensions (Aleph), Rights DB; Page Turner, HathiTrust API, OAI, GeoIP DB, CNRI Handles, [Solr]; METS/PREMIS object, TIFF G4/JPEG2000, OCR, MD5 checksums; METS object, PNG, OCR, PDF; Isilon Site Replication, TSM, MD5 checksum validation; GROOVE (JHOVE)]

METS Object Why METS? – Can serve as an Archival Information Package and a Dissemination Information Package – Designed to record the relationship between pieces of complex digital objects – Can be created automatically as texts are loaded or reloaded

METS Object What's there? – metsHdr with an ID and CREATEDATE – dmdSec with a URL – Two techMD referencing notes files – Two fileGrps (images and OCR) – Physical structMap tying together the files with any metadata (page numbers or features)
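The parts listed on this slide can be sketched as a METS skeleton built with Python's standard XML library. This is an illustration of the structure only, not the HathiTrust METS profile. The function name, ID patterns, `USE` values, `MDTYPE`, and file names are assumptions, and the two techMD sections are left out for brevity.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("METS", METS)
ET.register_namespace("xlink", XLINK)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_mets(volume_id, created, catalog_url, pages):
    """pages is a list of (image_file, ocr_file, page_label) tuples."""
    href = f"{{{XLINK}}}href"
    root = ET.Element(q("mets"), {"OBJID": volume_id})
    ET.SubElement(root, q("metsHdr"), {"ID": "HDR1", "CREATEDATE": created})
    # Descriptive metadata is referenced by URL rather than embedded.
    dmd = ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, q("mdRef"),
                  {"LOCTYPE": "URL", "MDTYPE": "MARC", href: catalog_url})
    # Two file groups: page images and OCR text.
    file_sec = ET.SubElement(root, q("fileSec"))
    image_grp = ET.SubElement(file_sec, q("fileGrp"), {"USE": "image"})
    ocr_grp = ET.SubElement(file_sec, q("fileGrp"), {"USE": "ocr"})
    # Physical structMap: one div per page, pointing at both files.
    struct = ET.SubElement(root, q("structMap"), {"TYPE": "physical"})
    volume = ET.SubElement(struct, q("div"), {"TYPE": "volume"})
    for seq, (image, ocr, label) in enumerate(pages, start=1):
        img_id, ocr_id = f"IMG{seq:08d}", f"OCR{seq:08d}"
        for grp, file_id, name in ((image_grp, img_id, image),
                                   (ocr_grp, ocr_id, ocr)):
            f = ET.SubElement(grp, q("file"), {"ID": file_id})
            ET.SubElement(f, q("FLocat"),
                          {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "SYSTEM", href: name})
        page = ET.SubElement(volume, q("div"),
                             {"TYPE": "page", "ORDER": str(seq), "ORDERLABEL": label})
        ET.SubElement(page, q("fptr"), {"FILEID": img_id})
        ET.SubElement(page, q("fptr"), {"FILEID": ocr_id})
    return root


if __name__ == "__main__":
    doc = build_mets("nsa.123", "2008-12-08T00:00:00",
                     "https://catalog.example.org/record/123",
                     [("00000001.tif", "00000001.txt", "i"),
                      ("00000002.tif", "00000002.txt", "ii")])
    print(ET.tostring(doc, encoding="unicode"))
```

Since the whole document is derived from the file list, it can be regenerated whenever a volume is loaded or reloaded, which matches the automatic creation the slides mention.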

HathiTrust Services Preservation of digital surrogate Access (within bounds of law and settlement) – Viewing – Redistribution Services for print-disabled users Section 108 Non-consumptive research

HathiTrust Branding

Legal Status of the Books Outside of the Settlement – Public domain content digitized by libraries is unconstrained – Libraries continue to do preservation-related work with in-copyright works (Section 108). Settlement – LDC or cooperative LDC (HathiTrust) – Services for print-disabled users – Non-consumptive research – Section 108 uses – General discovery – Sharing of public domain

HathiTrust Future Expansion of partnership New services Revision of governance Refinement of content

Contacts, etc. (see sitemap) Patricia Steele John Wilkin

Digital library for the future