HATHITRUST A Shared Digital Repository Preservation with a Purpose: End User Access Services in HathiTrust Jeremy York Rutgers University February 24,

Presentation transcript:

HATHITRUST A Shared Digital Repository Preservation with a Purpose: End User Access Services in HathiTrust Jeremy York Rutgers University February 24, 2015

About

HathiTrust Members: Allegheny College; American University of Beirut; Arizona State University; Baylor University; Boston College; Boston University; Brandeis University; Brown University; California Digital Library; Carnegie Mellon University; Case Western Reserve; Colby College; Columbia University; Cornell University; Dartmouth College; Duke University; Emory University; Getty Research Institute; Georgetown University; Georgia Tech; Harvard University Library; Indiana University; Iowa State University; Johns Hopkins University; Kansas State University; Lafayette College; Library of Congress; Massachusetts Institute of Technology; McGill University; Michigan State University; Montana State University; Mount Holyoke College; New York Public Library; New York University; North Carolina Central University; North Carolina State University; Northeastern University; Northwestern University; The Ohio State University; Oklahoma State University; Penn State; Princeton University; Purdue University; Rutgers University; Stanford University; State University System of Florida; Syracuse University; Temple University; Texas A&M University; Texas Tech; Tufts University; Universidad Complutense de Madrid; University of Alabama; University of Alberta; University of Arizona; University of British Columbia; University of Calgary; University of California (Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Diego, San Francisco, Santa Barbara, Santa Cruz); The University of Chicago; University of Connecticut; University of Delaware; University of Houston; University of Illinois; University of Illinois at Chicago; The University of Iowa; University of Kansas; University of Maine; University of Maryland; University of Massachusetts, Amherst; University of Miami; University of Michigan; University of Minnesota; University of Missouri; University of Nebraska-Lincoln; University of New Mexico; The University of North Carolina at Chapel Hill; University of Notre Dame; University of Oklahoma; University of Pennsylvania; University of Pittsburgh; University of Queensland; University of Tennessee, Knoxville; University of Texas; University of Utah; University of Vermont; University of Virginia; University of Washington; University of Wisconsin-Madison; Utah State University; Vanderbilt University; Virginia Tech; Wake Forest University; Washington University; Yale University Library

Partnership Preserve and expand access to library collections Leverage collective action – Shared Print Monographs Archive – US Federal Government Documents – Rights and Access – Discovery and Use

Digital Repository Launched 2008 Initial focus on digitized book and journal content – 13.2 million total volumes – 6.7 million book titles – 350,000 serial titles – 4.9 million volumes in the public domain (~37%)

The Name The meaning behind the name – Hathi (hah-tee)--Hindi for elephant – Big, strong – Never forgets, wise – Secure – Trustworthy

What is in HathiTrust?

Libraries in US by # Volumes ALA - Nation’s Largest Libraries: Data from

1. Michigan 4,712, California 3,612, Harvard 838, Wisconsin 561, Indiana 529, Cornell 510, Penn State 388, Illinois 329, NYPL 294, Princeton 252, Minnesota 193, Madrid 117, Library of Congress 108, Keio University 90,112

Collection Overlap 19% overlap in 2009 (2.93 million volumes) 31% overlap in 2010 (6.15 million volumes) More than 50% median overlap with ARL institutions – higher for small liberal arts colleges

HathiTrust contains materials in all disciplines (HathiTrust by call number) and includes a wide range of primary source materials, such as: Diaries, Correspondence, Reports, Newspapers, Memoirs

HathiTrust covers a wide range of formats, such as Books Encyclopedias Archival materials Directories Periodicals Maps Musical scores Statistics Visual Materials

Dates

Language Distribution (1) The top 10 languages make up ~87% of all content

Language Distribution (2) The next 40 languages make up ~12% of total

HathiTrust and other e-databases

Access and Services

Determinants of Access Copyright determination / Permissions Third-party agreements Overlap with print collection

Access levels: Full View; Limited View
Rights categories: PD Worldwide; PD U.S.; Open Access; In Copyright / Undetermined; IC U.S.
Restrictions: No Restrictions; Special Access Only
Download options: Full Download; Page-at-a-time Download; No Download

Content Distribution

✔ Full View (PD/PDUS, OA) No Restrictions Full View (PD/PDUS, OA) Restrictions Limited View

Full View (PD/PDUS, OA) No Restrictions Full View (PD/PDUS, OA) Restrictions Limited View ✔

Lawful uses: access for users who have print disabilities; access to works that are damaged or missing and also out of print. Subject to terms and conditions at

Type of work | Searchable (bibliographic and full-text) | Viewable* | Full-PDF download | Print on Demand | Print disabilities* | Preservation uses (Section 108)*
Public domain worldwide
– Searchable and viewable: Worldwide
– Full-PDF download: Partners-only if 3rd-party restrictions; if not, worldwide
– Print on Demand: Worldwide
– Print disabilities and preservation uses: N/A
Public domain (US) – Non-US works published between 1873 and
– Searchable: Worldwide
– Viewable: When accessed from within the United States
– Full-PDF download: Partners in the US if 3rd-party restrictions; if not, anyone in the US
– Print on Demand: Available within the United States
– Print disabilities: Partners in the US; partners worldwide where laws permit
– Preservation uses: N/A
Works that rights holders have opened access to in HathiTrust
– Searchable and viewable: Worldwide
– Full-PDF download and Print on Demand: Worldwide with permission; worldwide (if third-party restrictions, full-PDF only available if opened with CC license)
– Print disabilities and preservation uses: N/A
Works that are in-copyright or of undetermined status
– Searchable: Worldwide
– Viewable, Full-PDF download, Print on Demand: Not available
– Print disabilities: Partners in the US; partners worldwide where laws permit
– Preservation uses: Partners in the US; partners worldwide where laws permit
* Note: Access to in-copyright works is subject to conditions listed in HathiTrust’s policies on Access and Use.
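The viewing and full-PDF rules in the table above can be sketched as a small decision function. This is only an illustrative reading of the slide, not HathiTrust's implementation: the names `Rights`, `Request`, `can_view`, and `can_download_full_pdf` are hypothetical, and the handling of rights-holder-opened works without third-party restrictions is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Rights(Enum):
    """Rights categories named on the slide (hypothetical enum)."""
    PD = "public domain worldwide"
    PDUS = "public domain (US)"
    OA = "opened by rights holder"
    IC = "in copyright or undetermined"


@dataclass(frozen=True)
class Request:
    in_us: bool                           # request comes from within the United States
    partner_login: bool                   # user is logged in through a partner institution
    third_party_restricted: bool = False  # a third-party agreement limits full-PDF download
    cc_licensed: bool = False             # only meaningful for rights-holder-opened works


def can_view(rights: Rights, req: Request) -> bool:
    """Full view: PD and OA worldwide, PDUS only from within the US, IC never."""
    if rights in (Rights.PD, Rights.OA):
        return True
    if rights is Rights.PDUS:
        return req.in_us
    return False


def can_download_full_pdf(rights: Rights, req: Request) -> bool:
    """Full-PDF download, following the 'Full-PDF download' column."""
    if rights is Rights.PD:
        # Partners only if third-party restrictions apply; otherwise worldwide.
        return req.partner_login or not req.third_party_restricted
    if rights is Rights.PDUS:
        # Same rule as PD, but only for requests from within the US.
        return req.in_us and (req.partner_login or not req.third_party_restricted)
    if rights is Rights.OA:
        # Assumption: unrestricted OA works are downloadable; restricted ones need a CC license.
        return req.cc_licensed or not req.third_party_restricted
    return False  # in-copyright / undetermined: not available


if __name__ == "__main__":
    guest_abroad = Request(in_us=False, partner_login=False, third_party_restricted=True)
    print(can_view(Rights.PD, guest_abroad), can_download_full_pdf(Rights.PD, guest_abroad))
```

Logging in changes only the `partner_login` flag here, which is why the same volume can be viewable for everyone yet downloadable as a full PDF only for partner users.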

Best way to ensure you are getting full access: LOGIN

User Collections – Featured Collections – All Collections with at least 250 items

Adventure Novels: G. A. Henty Ancestry and Genealogy Ann Arbor History English Short Title Catalog Incunabula (Universidad Complutense de Madrid) Islamic Manuscripts Kean University NJ History Project Library Science Journals Manuscripts (Universidad Complutense de Madrid) Patent Indexes Records of the American Colonies UCSF University Publications UM Press UMich Hatcher Reference University of California, San Francisco University Press of Florida Utah State University Press

Examples of uses
– Oxford English Dictionary: Ben Zimmer, Problem is "cut the mustard" (OED 1891) predates "muster." Earliest I've seen for "muster" is
– Thesis research
– Islamic Manuscripts
– Local/Family History

APIs
Bibliographic API – Volume and rights information – MARC records
OAI
“Hathifiles”
Data API – Volume and rights information – Page images – OCR
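As a rough illustration of the Bibliographic API listed above, the sketch below builds a lookup URL and filters a response for full-view items. The URL pattern and the `items`, `htid`, and `usRightsString` field names reflect my understanding of the public API and should be checked against the current documentation. The sample response is invented for illustration and the identifiers in it are hypothetical. No network request is made.

```python
import json
from urllib.parse import quote

# Assumed base URL of the Bibliographic API; verify against the official docs.
BASE = "https://catalog.hathitrust.org/api/volumes"


def bib_api_url(id_type: str, value, detail: str = "brief") -> str:
    """Build a lookup URL for one identifier (e.g. id_type='oclc')."""
    if detail not in ("brief", "full"):
        raise ValueError("detail must be 'brief' or 'full'")
    return f"{BASE}/{detail}/{quote(id_type, safe='')}/{quote(str(value), safe='')}.json"


def full_view_items(response: dict) -> list:
    """Return the volume ids whose US rights string says 'Full view'."""
    return [
        item["htid"]
        for item in response.get("items", [])
        if item.get("usRightsString") == "Full view"
    ]


# Invented sample shaped like a brief response; ids are hypothetical.
SAMPLE = json.loads("""
{
  "records": {"000000001": {"titles": ["An example title"]}},
  "items": [
    {"htid": "example.0001", "fromRecord": "000000001", "usRightsString": "Full view"},
    {"htid": "example.0002", "fromRecord": "000000001", "usRightsString": "Limited (search-only)"}
  ]
}
""")

if __name__ == "__main__":
    print(bib_api_url("oclc", 424023))
    print(full_view_items(SAMPLE))
```

In practice the URL would be fetched with any HTTP client and the JSON body passed to `full_view_items`. Parsing is kept separate from fetching so the logic can be tested offline.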

Services Public domain and open access works ✔ Full download of materials where possible* ✔ Print on demand ✔ Lawful uses of in-copyright works* ✔ Collections and APIs ✔ Computational Access

Distribution of datasets
Non-Google-digitized Dataset (540,000+) – PD, PDUS, Open Access – Signed researcher statement
Google-digitized (4.4 million+) – PD, PDUS, Open Access – Agreement between institution and Google – Brief proposal (characterize texts; provide ids, custom sets possible; research, results, use of results) – Signed researcher statement

HTRC HathiTrust Research Center – Developed collaboratively by Indiana University and University of Illinois; launched July 2011 – Enables computational access to public domain and open access materials; working to support in-copyright materials as well – Secure Environment – bring researchers to the data – Build services and tools that facilitate research by digital humanities and informatics communities – Advanced Collaborative Support RFP: Awards:

Using the HTRC Portal: sign up, browse volume lists and algorithms, execute algorithms, view results – Workset Builder – Sandbox: run own algorithms Getting Started with the HTRC [Google doc] –

HTRC UnCamp Ann Arbor, Michigan March 30-31, 2015 Keynotes, demos, “unconference” sessions Registration, Agenda, Logistics: – lists –

Projects (1)
Detecting Literary Plagiarisms: The Case of Oliver Goldsmith. – Douglas Duhaime. University of Notre Dame.
Taxonomizing the Texts: Towards Cultural-Scale Models of Full Text. – Colin Allen, Jaimie Murdock. Indiana University Bloomington. – Allen and Murdock will carry out a cultural-scale investigation and topic modeling on HT public-domain full text through random sampling to select collections – Topic modeling to select collections according to the Library of Congress Subject Headings (LCSH).
The Trace of Theory. – Geoffrey Rockwell, Laura Mandell, Stefan Sinclair, Matthew Wilkens, Susan Brown. University of Alberta, Texas A&M University, University of Notre Dame. – Topic modeling; tools and methods to track the concept of “theory”.
Dr. Michelle Alexopoulos, University of Toronto – Tracking technology diffusion through time using the HT corpus.

Projects (2)
Burton, Vernon. “The South as ‘Other,’ the Southerner as ‘Stranger.’” – Explore how attitudes expressed in print about slavery, southerners, and non-southerners have changed over both time and space.
Ted Underwood, Associate Professor of English at the University of Illinois, Urbana-Champaign. – Using public domain texts received from HathiTrust to explore changing relationships in literary genres from
Andrew Piper, Associate Professor of German literature at McGill University. – Analyzing linguistic patterns in German texts from
Amanda Watson, librarian at New York University. – Studying how poetry anthologies in selected texts reflect the rise and fall of poets’ reputations over the course of the 19th century.
Glen Worthey, Digital Humanities Librarian at Stanford University Libraries. – Performing spatio-temporal investigation into the history of Brazilian Portuguese, to be accomplished by text-mining methods (n-gram analysis, etc.).
Matthew Wilkens, Assistant Professor of English, University of Notre Dame. – American Council of Learned Societies (ACLS) fellowship for project “Literary Geography at Scale.”

How to find out more About: Resources: Twitter: Facebook: Monthly newsletter: – – RSS Contact us: Blogs: – Large-scale Search – Perspectives from HathiTrust