HathiTrust: Possibilities Metadata Working Group Cornell University Library March 21, 2014
HathiTrust is a consortium. A consortium of over 80 libraries. Governance: Board of Governors; Program Steering Committee; Working Groups; Committees.
HathiTrust is a digital library. A continuously growing digital library, currently home to 11M+ books (see Statistics and Visualizations). Almost 500 terabytes. 3.7M volumes (~33% of total) are in the public domain. All volumes are indexed and searchable in full.
HathiTrust is a digital library. Cornell patrons can log in with a Cornell NetID to: create collections (public or private); download PDFs of any item available in full text; obtain “Enhanced Access”, if they are a CUL affiliate with a certified print disability.
HathiTrust is a preservation repository. HathiTrust is TRAC Certified. Every book, and every page of every book, has a persistent identifier. Cornell University Library deposits books digitized through the Google partnership.
HathiTrust: getting info out. Information = metadata (info about resources) + data (the resources themselves).
APIs: BibAPI. Returns bibliographic, rights, and volume information when given one or more standard identifiers: OCLC number, LCCN, ISSN, ISBN, HathiTrust Volume ID, or HathiTrust record number.
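A minimal sketch of a single-identifier BibAPI lookup, to make the request shape concrete. The base URL, the `brief`/`full` detail levels, and the identifier-type keywords (`oclc`, `lccn`, `issn`, `isbn`, `htid`, `recordnumber`) are assumptions based on the publicly documented URL pattern and should be checked against the current HathiTrust documentation; `bib_api_url` and `fetch_bib` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL and keywords; verify against the BibAPI documentation.
BASE = "https://catalog.hathitrust.org/api/volumes"
ID_TYPES = {"oclc", "lccn", "issn", "isbn", "htid", "recordnumber"}
DETAIL_LEVELS = {"brief", "full"}


def bib_api_url(id_type, id_value, detail="brief"):
    """Build the lookup URL for one identifier (hypothetical helper)."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unsupported detail level: {detail}")
    # Percent-encode characters that are not safe in a URL path segment.
    return f"{BASE}/{detail}/{id_type}/{quote(str(id_value))}.json"


def fetch_bib(id_type, id_value, detail="brief"):
    """Fetch and decode the JSON response (performs a network request)."""
    with urlopen(bib_api_url(id_type, id_value, detail)) as response:
        return json.load(response)


print(bib_api_url("oclc", "424023"))
```

Only the URL builder runs here; `fetch_bib` is left uncalled so the sketch works offline.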
APIs: DataAPI. Retrieves content (page images, OCR, METS, and/or whole volume packages) and bibliographic info.
OAI Harvesting. OAI feed (MARC21 and Dublin Core) of public domain materials.
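A rough sketch of how a harvester might page through an OAI feed with the standard OAI-PMH `ListRecords` verb. The verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespace come from the OAI-PMH 2.0 protocol. The base URL below is a placeholder, the `marc21` prefix name is an assumption about how the feed labels its MARC21 format (`oai_dc` is the protocol's standard Dublin Core prefix), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def next_token(response_xml):
    """Return the resumption token from a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    node = root.find(f".//{OAI_NS}resumptionToken")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


# Placeholder endpoint and a made-up response fragment, for illustration only.
sample = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>abc123</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)
token = next_token(sample)
print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until `next_token` yields `None`.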
Hathifiles. Tab-delimited files identifying the contents of the HathiTrust repository. Posted daily; a separate description of the files is available.
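Because the Hathifiles are plain tab-delimited text, they can be read with a standard CSV parser. The sketch below assumes the files have no header row, that the first three columns are the volume ID, an access flag, and a rights code, and that the access flag uses the values `allow` and `deny`. All of this should be confirmed against the published file description. The sample rows are made up and `read_hathifile` is a hypothetical helper.

```python
import csv
import io

# Assumed names for the leading columns; check the published description.
FIELDS = ["htid", "access", "rights"]


def read_hathifile(stream, fields=FIELDS):
    """Yield one dict per row, keeping only the leading named columns."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # zip stops at the shorter sequence, so unnamed trailing columns are dropped.
        yield dict(zip(fields, row))


# Made-up sample rows standing in for a downloaded file.
sample = "mdp.001\tallow\tpd\nmdp.002\tdeny\tic\n"
rows = list(read_hathifile(io.StringIO(sample)))
open_ids = [r["htid"] for r in rows if r["access"] == "allow"]
print(open_ids)
```

For a real file, the `io.StringIO` stand-in would be replaced by a text-mode file handle opened on the downloaded (and decompressed) file.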