Building the Universal Library: Introducing HathiTrust Patricia A. Steele Indiana University Libraries John Price Wilkin University of Michigan Libraries.


1 Building the Universal Library: Introducing HathiTrust Patricia A. Steele Indiana University Libraries John Price Wilkin University of Michigan Libraries December 8, 2008

2 www.hathitrust.org The Vision Universal Digital Library Common Goal Single Entity but Partnership of Many Libraries

3 www.hathitrust.org The Reasons Google Digitization Project Collective Agreement with CIC Announced in June 2007 – U of Michigan and U of Wisconsin Projects already underway

4 www.hathitrust.org The Reasons Librarians value preservation – How to ensure digital files are preserved?

5 www.hathitrust.org The Reasons Librarians value access – How to create a comprehensive and coherent body of materials? Librarians believe in cooperation – How do you achieve a common goal?

6 www.hathitrust.org The Beginning In 2007, CIC agreed to establish a shared digital repository University of Michigan and Indiana University initial leaders of this effort

7 www.hathitrust.org The Beginning CIC Shared Digital Repository HathiTrust

8 www.hathitrust.org The Name The name… hathitrust.org hathi.org olifant.org silverback.org kingkong.org toomai.org

9 www.hathitrust.org The Name The meaning behind the name – Hathi (hah-tee)--Hindi for elephant – Big, strong – Never forgets, wise – Secure – Trustworthy

10 www.hathitrust.org Banking Analogy

11 www.hathitrust.org The Logo

12 www.hathitrust.org The Partners When announced in October 2008, full partners included: – University of California system – CIC (Committee on Institutional Cooperation): University of Chicago, University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University, University of Wisconsin-Madison – University of Virginia

13 www.hathitrust.org vs. The Differences

14 www.hathitrust.org Sorting the Issues Cost Model – Partners charged a one-time start-up fee based on the number of volumes added to the repository, in addition to an annual fee for the curation of those volumes.

15 www.hathitrust.org Sorting the Issues Governance HathiTrust Operational Advisory Board Executive Management Group Strategic Advisory Board

16 www.hathitrust.org Sorting the Issues Impact of Google settlement – Full access to materials – More quickly than a court win would have permitted – Content locked up for years

17 www.hathitrust.org HathiTrust Architecture – Storage in Ann Arbor and Indianapolis – Encrypted backup to 2nd Ann Arbor location – Inbound validation, standards-based object storage and related metadata – Rights database for rights metadata – Online catalog as source and storage for descriptive metadata

18 www.hathitrust.org Page image and metadata repository Objectives: – A guiding principle: store archival images, create deliverables on demand – Incorporate TDR-specific practices Simple filesystem layout using Pairtree structure: – One directory per volume, all files inside zip with associated METS file – Use of a namespace allows for conflicting identifiers – Namespaces for institutions and, if needed, types of identifiers within the institution
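The Pairtree layout named on this slide can be sketched as follows. This is a minimal illustration of the published Pairtree identifier-to-path mapping, not the repository's actual code. The function names, the `/sdr1/obj` root, the `mdp` namespace, and the sample barcode are assumptions made for the example.

```python
from pathlib import PurePosixPath

# Characters the Pairtree spec hex-encodes as ^hh (plus anything outside
# visible ASCII), and the single-character swaps applied afterwards.
_HEX_ENCODE = set('"*+,<=>?\\^|')
_SWAP = {"/": "=", ":": "+", ".": ","}


def clean_id(identifier: str) -> str:
    """Apply Pairtree identifier cleaning to a string."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append(f"^{byte:02x}")
        else:
            out.append(_SWAP.get(ch, ch))
    return "".join(out)


def pairtree_path(identifier: str) -> PurePosixPath:
    """Split the cleaned identifier into two-character directory names."""
    cleaned = clean_id(identifier)
    parts = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return PurePosixPath(*parts)


def volume_dir(root: str, namespace: str, local_id: str) -> PurePosixPath:
    """Hypothetical layout: one directory per volume, grouped by namespace."""
    return (PurePosixPath(root) / namespace / "pairtree_root"
            / pairtree_path(local_id) / clean_id(local_id))


if __name__ == "__main__":
    print(pairtree_path("ark:/13030/xt12t3"))
    print(volume_dir("/sdr1/obj", "mdp", "39015012345678"))
```

Under these assumptions, the zip and its METS file for a volume would both sit in the directory returned by `volume_dir`. The namespace component keeps two institutions' identical local identifiers from colliding.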

19 www.hathitrust.org Rights database, pt. 1 What information to store? – Considered complexity and maintenance – Considered using MARC directly – Needed to accommodate both bib record-derived rights and manual overrides Approach: examine bib record, determine authoritative copyright status, store rights attribute, source, reason, and timestamp. Stored in MySQL.

20 www.hathitrust.org Rights database, pt. 2 Each rights attribute must have a reason: – bib: bibliographically-derived – man: manual access control override – ddd: due diligence documented Typical rights attributes in use: – pd: public domain – pdus: public domain for US viewers* – inc: in copyright – nobody (override): no access Source (e.g., google)
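The two rights-database slides describe one record per volume holding an attribute, a reason, a source, and a timestamp. A minimal sketch of that record follows. It uses Python's built-in `sqlite3` as a stand-in for the MySQL database the slides mention. The table name, column names, and function names are hypothetical. The rule that a bib-derived update never overwrites a manual (`man`) or due-diligence (`ddd`) entry is an assumption about how overrides might be protected, not something the slides state.

```python
import sqlite3
from datetime import datetime, timezone

# Codes taken from the slide; anything else is rejected.
ATTRIBUTES = {"pd", "pdus", "inc", "nobody"}
REASONS = {"bib", "man", "ddd"}


def open_rights_db() -> sqlite3.Connection:
    """Create an in-memory rights table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE rights (
               namespace TEXT NOT NULL,
               id        TEXT NOT NULL,
               attr      TEXT NOT NULL,
               reason    TEXT NOT NULL,
               source    TEXT NOT NULL,
               time      TEXT NOT NULL,
               PRIMARY KEY (namespace, id))"""
    )
    return conn


def set_rights(conn, namespace, volume_id, attr, reason, source) -> bool:
    """Store a rights record; return False if an override blocked the write."""
    if attr not in ATTRIBUTES:
        raise ValueError(f"unknown rights attribute: {attr}")
    if reason not in REASONS:
        raise ValueError(f"every attribute needs a known reason, got: {reason}")
    row = conn.execute(
        "SELECT reason FROM rights WHERE namespace = ? AND id = ?",
        (namespace, volume_id),
    ).fetchone()
    # Assumed policy: bib-derived rights do not replace man/ddd entries.
    if row is not None and reason == "bib" and row[0] != "bib":
        return False
    conn.execute(
        "INSERT OR REPLACE INTO rights (namespace, id, attr, reason, source, time) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (namespace, volume_id, attr, reason, source,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return True


def get_rights(conn, namespace, volume_id):
    """Return (attr, reason, source, time) or None."""
    return conn.execute(
        "SELECT attr, reason, source, time FROM rights "
        "WHERE namespace = ? AND id = ?",
        (namespace, volume_id),
    ).fetchone()
```

Keeping the reason next to the attribute lets a later catalog reload tell a bibliographically derived value from one a person set deliberately.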

21 www.hathitrust.org Pageturner: page image retrieval [Diagram; labels shown: rights database, GeoIP database, archival page image, library catalog metadata, METS XML, online page image, XSLT, XML, HTML, browser]
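The diagram pairs the rights database with a GeoIP database. The rights slides suggest why: a `pdus` volume is public domain only for US viewers. The sketch below shows how such an access decision might combine the two inputs. The function name is hypothetical, and the viewer's country is passed in as a plain ISO code rather than looked up, so no real GeoIP library API is assumed.

```python
from typing import Optional


def can_view_page(attr: str, viewer_country: Optional[str]) -> bool:
    """Decide whether a page image may be shown.

    attr           -- rights attribute from the slides: pd, pdus, inc, nobody
    viewer_country -- two-letter country code derived from the viewer's IP
                      address, or None when the lookup failed (assumed input)
    """
    if attr == "pd":
        return True                      # public domain everywhere
    if attr == "pdus":
        return viewer_country == "US"    # public domain for US viewers only
    # inc (in copyright), nobody (override), and unknown codes: deny.
    return False


if __name__ == "__main__":
    for attr, country in [("pd", "DE"), ("pdus", "US"), ("pdus", "DE"), ("inc", "US")]:
        print(attr, country, can_view_page(attr, country))
```

Denying by default on an unknown attribute or a failed lookup is a conservative choice for this sketch. The slides do not say how the real pageturner handles those cases.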

22 www.hathitrust.org HathiTrust and TRAC Automatic validation in GROOVE – Check barcode check digit using Luhn algorithm – Fixity check on JPG, TIFF, UTF-8 using MD5 – Well-formedness and embedded metadata check on JPG, TIFF, UTF-8 using JHOVE – Various completeness cross-checks – Failures retried, admin will eventually intervene Periodic fixity checks using MD5
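Two of the checks listed here are easy to illustrate: the Luhn check digit and the MD5 fixity comparison. The following is a minimal sketch, not GROOVE's own code. The function names are hypothetical, and the sample number is a standard Luhn test value rather than a real barcode.

```python
import hashlib
import io
from typing import BinaryIO


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def md5_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a binary stream in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(stream: BinaryIO, expected_md5: str) -> bool:
    """Compare a freshly computed digest with the recorded one."""
    return md5_hex(stream) == expected_md5.strip().lower()


if __name__ == "__main__":
    print(luhn_valid("79927398713"))
    print(fixity_ok(io.BytesIO(b"hello"), "5d41402abc4b2a76b9719d911017c592"))
```

For files on disk, the same `fixity_ok` call works with `open(path, "rb")`. Reading in chunks keeps memory use flat for large page images. MD5 is used here because the slide names it. It detects accidental corruption but is not collision-resistant against deliberate tampering.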

23 www.hathitrust.org OAIS Reference Model [Diagram; labels shown: GRIN Internal Data Loading; Google [OCA] In-house Conversion; MARC record extensions (Aleph), Rights DB; Page Turner, HathiTrust API, OAI, GeoIP DB, CNRI Handles, [Solr]; METS/PREMIS object, TIFF G4/JPEG2000, OCR, MD5 checksums; METS object, PNG, OCR, PDF; Isilon Site Replication, TSM, MD5 checksum validation; GROOVE (JHOVE)]

24 www.hathitrust.org METS Object Why METS? – Can serve as an Archival Information Package and a Dissemination Information Package – Designed to record the relationship between pieces of complex digital objects – Can be created automatically as texts are loaded or reloaded

25 www.hathitrust.org METS Object What's there? – metsHdr with an ID and CREATEDATE – dmdSec with a URL – Two techMD referencing notes files – Two fileGrps (images and OCR) – Physical structMap tying together the files with any metadata (page numbers or features)
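The parts listed on this slide can be assembled into a skeleton METS document. The sketch below uses Python's standard `xml.etree.ElementTree`. Element names come from the METS schema. The attribute values (IDs, `USE` labels, `MDTYPE`, file names, and the catalog URL) are illustrative assumptions, and the output has not been validated against the schema or compared with a real HathiTrust METS file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("METS", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


HREF = f"{{{XLINK_NS}}}href"


def build_mets(volume_id, catalog_url, pages, created):
    """pages: list of (image_file, ocr_file, page_label) tuples."""
    root = ET.Element(m("mets"), {"OBJID": volume_id})
    ET.SubElement(root, m("metsHdr"), {"ID": "HDR1", "CREATEDATE": created})

    # dmdSec pointing at the catalog record by URL.
    dmd = ET.SubElement(root, m("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, m("mdRef"),
                  {"LOCTYPE": "URL", "MDTYPE": "MARC", HREF: catalog_url})

    # Two techMD sections, each referencing a notes file (names assumed).
    amd = ET.SubElement(root, m("amdSec"))
    for n in (1, 2):
        tech = ET.SubElement(amd, m("techMD"), {"ID": f"TECH{n}"})
        ET.SubElement(tech, m("mdRef"),
                      {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "SYSTEM",
                       "MDTYPE": "OTHER", HREF: f"notes{n}.txt"})

    # Two file groups: page images and OCR text.
    file_sec = ET.SubElement(root, m("fileSec"))
    groups = {use: ET.SubElement(file_sec, m("fileGrp"), {"USE": use})
              for use in ("image", "ocr")}

    # Physical structMap: one div per page, pointing at both files.
    struct = ET.SubElement(root, m("structMap"), {"TYPE": "physical"})
    volume = ET.SubElement(struct, m("div"), {"TYPE": "volume"})
    for order, (image_file, ocr_file, label) in enumerate(pages, start=1):
        attrs = {"TYPE": "page", "ORDER": str(order)}
        if label:
            attrs["ORDERLABEL"] = label
        page = ET.SubElement(volume, m("div"), attrs)
        for use, name in (("image", image_file), ("ocr", ocr_file)):
            file_id = f"{use.upper()}{order:08d}"
            file_el = ET.SubElement(groups[use], m("file"), {"ID": file_id})
            ET.SubElement(file_el, m("FLocat"),
                          {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "SYSTEM",
                           HREF: name})
            ET.SubElement(page, m("fptr"), {"FILEID": file_id})

    return ET.tostring(root, encoding="unicode")
```

Because the document is built from a plain page list, it can be regenerated whenever a volume is loaded or reloaded.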

26 www.hathitrust.org HathiTrust Services Preservation of digital surrogate Access (within bounds of law and settlement) – Viewing – Redistribution Services for print-disabled users Section 108 Non-consumptive research

27 www.hathitrust.org HathiTrust Branding

28 www.hathitrust.org Legal Status of the Books Outside of the Settlement – Public domain content digitized by libraries unconstrained – Libraries continue to do preservation-related work with in-copyright works (Section 108) Settlement – LDC or cooperative LDC (HathiTrust) – Services for print-disabled users – Non-consumptive research – Section 108 uses – General discovery – Sharing of public domain

29 www.hathitrust.org HathiTrust Future Expansion of partnership New services Revision of governance Refinement of content

30 www.hathitrust.org Contacts, etc. http://www.HathiTrust.org (see sitemap) Patricia Steele John Wilkin

31 www.hathitrust.org Digital library for the future

