The Hathi Trust Research Center and tool builders John Unsworth (with Beth Plale, Scott Poole, Robert McDonald, and others) Project Bamboo Corpora Space.


The Hathi Trust Research Center and tool builders John Unsworth (with Beth Plale, Scott Poole, Robert McDonald, and others) Project Bamboo Corpora Space Workshop II Maryland Institute for Technology in the Humanities 1-1:20 pm, June 6, 2011

Our raison d'etre
Phase I: starting Apr 2011 and going for 18 months.
Phase II: starting Fall 2012 and going for …
Goal: enable strong computational research and education on a collection that has not been amenable to computational exploration EVER before!

More formally
HTRC is founded as a joint venture between Indiana University and the University of Illinois Urbana-Champaign, aimed at solving the difficult challenges of increasing computational access to the public domain and copyrighted material in HathiTrust.

HTRC will:
Maintain a repository of text mining algorithms and retrieval tools, available online for human and programmatic discovery. Also register derived data sets, indexes, and versions in a registry repository.
Be a user-driven resource, with an active advisory board and a community model that allows users to share algorithms and tools.
Support interoperability across collections and institutions through use of InCommon SAML identity.

Non-consumptive research
One of HTRC’s unique research challenges is support for non-consumptive research. The Google Book Settlement defines non-consumptive research as “research in which computational analysis is performed on one or more books, but not research in which a researcher reads or displays.”
In practice, we believe this will mean HTRC users should not be able to export copyrighted materials from the research environment.
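
As a purely illustrative sketch, and not a description of HTRC's actual design, the export restriction could look like the following in Python. Analysis runs over the full text inside the research environment, and only derived data such as word counts may leave it. The function names `analyze_volume` and `export` and the `in_copyright` flag are hypothetical, introduced here only for illustration.

```python
import re
from collections import Counter


def analyze_volume(text: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Runs inside the research environment: reads the full text, returns only counts."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


def export(payload, in_copyright: bool):
    """Hypothetical export gate (an assumption, not an HTRC API).

    Derived data passes through; raw text of in-copyright works is refused.
    """
    if in_copyright and isinstance(payload, str):
        raise PermissionError("raw text of in-copyright volumes cannot be exported")
    return payload
```

A researcher would call `export(analyze_volume(text), in_copyright=True)` to take the counts out, while `export(text, in_copyright=True)` raises an error. A real system would need much stronger guarantees than a type check; the point here is only the separation between computing on the text and reading it.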

Long-term goals
Support innovation in cyberinfrastructure to deliver optimal access and use of the HathiTrust corpus.
Implement “non-consumptive” research: a technical and intellectual challenge.
Identify and host existing data analysis, text mining and retrieval tools that are of interest to the community.
Stimulate development of new analytical methods and tools. We hope that the scale of the HTRC will promote new levels of collaboration in tool development.