Digital Libraries and the Open Archives Initiative Louisiana State University June 30, 2000 Edward A. Fox CC CS DLRL Internet.


Digital Libraries and the Open Archives Initiative Louisiana State University June 30, 2000 Edward A. Fox CC CS DLRL Internet TIC Virginia Tech, Blacksburg, VA, USA

Acknowledgements (Selected)
- Sponsors: ACM, Adobe, ARL, Belgian Science Found., CLIR, DARPA, IBM, LANL, Microsoft, NSF, OCLC, SPARC, US Dept. of Ed. (FIPSE), …
- VT Faculty/Staff: Tony Atkins, Thomas Dunbar, John Eaton, Gwen Ewing, Peter Haggerty, Gary Hooper, Gail McMillan, Len Peters, James Powell, …
- VT Students: Emilio Arce, Fernando Das Neves, Brian DeVane, Robert France, Marcos Goncalves, Scott Guyer, Robert Hall, Neill Kipp, Paul Mather, Tim McGonigle, Todd Miller, Constantinos Phanouriou, William Schweiker, Ohm Sornil, Hussein Suleman, Patrick Van Metre, Laura Weiss, …

Virginia Tech Background
- Largest university in Virginia, land-grant, football, town population 35K plus 25K students
- Blacksburg Electronic Village, since 1992, with > 80% of community on Internet
- Net.Work.Virginia, largest ATM network, with over 750 sites, for education, research, government
- LMDS, Local Multipoint Distribution Service, gigabit wireless networking - 1/3 of Virginia
- Math Emporium, 500 workstations
- Faculty Development Initiative, round 2
- Hosting First Joint Conference on Digital Libraries, Summer Hotel Roanoke, VA

Remember!
- Digital libraries introduction
- Digital libraries to enhance learning
- OAi (help establish enormous international cooperative of data and service providers)

Digital Libraries
[Diagram labels:]
- SGML (1985)
- PDF (1992)
- NSF DLI (1994)
- Library Cancellations (1988)
- University Scholarly Electronic Pub. (1988)
- Info. Literacy (1995)
- Improving Education
- Internet (1984)
- WWW (1994)
- Multimedia (1986)

Digital Libraries Shorten the Chain from
[Diagram labels: Editor, Publisher, A&I, Consolidator, Library, Reviewer]

DLs Shorten the Chain to
[Diagram labels: Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian]

How do universities and digital libraries relate?
- Each U. will have its own digital libraries. Hence there will be large numbers (i.e., critical mass).
- All students will learn how to use and how to “feed” digital libraries (and bring those habits to future work as needs and skills).
- All digital library problems (esp. federation, flexibility, personalization) appear at U’s (so they are a good type of testbed, with willing collaborators in place for developing solutions).
- Start with NDLTD, extend to NUDL

Digital Libraries --- Virginia Tech
- MARIAN (NLM)
- CS DL Prototype - ENVISION (NSF, ACM)
- TULIP (Elsevier, OCLC)
- BEV History Base (NSF, Blacksburg)
- DL for CS Education - EI (NSF, ACM)
- WATERS, NCSTRL (NSF)
- NDLTD (SURA, US Dept. of Education)
- CSTC (NSF, ACM), CRIM (NSF, SIGMM)
- WCA (Log) Repository (W3C)
- VT-PetaPlex-1 (Knowledge Systems)

NCSTRL
- Networked Computer Science Technical Reference Library
- CS Technical Reports
- 1994 merger of CSTR + WATERS
- 1998 integration with LANL server (CoRR)
- Federated search, mirrors, Dienst protocol

Digital Libraries --- Objectives
- World Lit.: 24hr / 7day / from desktop
- Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies
- Ubiquitous, Higher Quality, Lower Cost
- Education, Knowledge Sharing, Discovery
- Disintermediation -> Collaboration
- Universities Reclaim Property
- Interactive Courseware, Student Works
- Scalable, Sustainable, Usable, Useful

Benefits
- Ease of use
- Effectiveness
- “The benefits of digital libraries will not be appreciated unless they are easy to use effectively.” - IITA Workshop report

DLs: Why of Global Interest?
- National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
- Knowledge and information are essential to economic and technological growth, education
- DL - a domain for international collaboration
  - wherein all can contribute and benefit
  - which leverages investment in networking
  - which provides useful content on Internet & WWW
  - which will tie nations and peoples together more strongly and through deeper understanding

DL Challenges
- Preservation - so people will trust DLs
- Supporting infrastructure - networks, ...
- Scalability, sustainability, interoperability
- DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info - “quality WWW” integrating IR, HT, MM, ...
  - Need tools & methods to make them easier to build

Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes labelled Computing (flops), Digital content, and Communications (bandwidth, connectivity), running from less to more. Digital Libraries technology trajectory: intellectual access to globally distributed information.]

Definitions
- Library ++ (library+archive+museum+…)
- Distributed information system + organization + effective interface
- User community + collection + services
- Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation

Definition: Digital Libraries are complex systems that
- help satisfy info needs of users (societies)
- provide info services (scenarios)
- organize info in usable ways (structures)
- present info in usable ways (spaces)
- communicate info with users (streams)

5S Layers
- Societies
- Scenarios
- Spaces
- Structures
- Streams

Document Models, Representations, and Accesses
- Doc = stream + structure + use-scenario; hybrid (paper/electronic), digital only
- Multilingual: content, summary, metadata
- Multimedia: structure, quality (QoS), search
- Structured: MARC, SGML, by user: MVD
- Distributed collection: Kleisli, CIMI, Z39.50
- Federated search: collecting, picking site(s), parallel search / fall-back, fusing results
- Access: IPR, payment, security, scenarios
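The federated search steps named on this slide (picking sites, parallel search with fall-back, fusing results) can be sketched roughly as follows. This is a minimal illustration, not the method used by any system in the talk: the site functions, document ids, and the max-score fusion rule are all hypothetical, and real sites would be reached over a network protocol such as Z39.50 or Dienst.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(hits):
    """Rescale one site's raw scores so its best hit scores 1.0."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    return [(doc, score / top if top else 0.0) for doc, score in hits]

def federated_search(query, sites, limit=10):
    """Query every site in parallel, skip sites that fail, fuse the results."""
    def ask(site):
        try:
            return normalize(site(query))
        except Exception:
            return []  # fall-back: a failed site contributes nothing

    with ThreadPoolExecutor() as pool:
        per_site = list(pool.map(ask, sites))

    fused = {}
    for hits in per_site:
        for doc, score in hits:
            # De-duplicate by document id, keeping the best normalized score.
            fused[doc] = max(score, fused.get(doc, 0.0))
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:limit]

# Hypothetical sites: each takes a query and returns (document id, raw score) pairs.
def site_a(query):
    return [("etd-1", 8.0), ("etd-2", 4.0)]

def site_b(query):
    return [("etd-2", 0.9), ("etd-3", 0.3)]

def site_down(query):
    raise ConnectionError("site unreachable")

results = federated_search("digital libraries", [site_a, site_b, site_down])
```

Normalizing per site before fusing is one simple way to compare scores from engines that rank on different scales; other fusion rules (rank-based, weighted by site) would fit the same structure.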

Architectural Issues
- Internet middleware
- Independent system / part of federation
- Decompositions vary
  - search engine, browser, DBMS, MM support
  - repository, handle server, client
  - information resources + mediators, bus or agent collection + client with workspace/environment
- Metrics: e.g., for federated search

Standards
- Protocols/federation
  - Z39.50, CIMI
  - Dienst, NCSTRL
  - OAi protocol
- Metadata
  - TEI: inline, detailed (structure in stream)
  - MARC: two-level, fine-grained
  - Dublin Core: high-level, 15 elements
  - RDF: describing resources/collections, annotation
  - OAMS and others used in OAi
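To make "high-level, 15 elements" concrete, here is a small sketch that builds a Dublin Core description as XML. The 15 element names and the namespace URI are those of the Dublin Core Metadata Element Set; the `record` wrapper element and the helper name `dc_record` are assumptions made for illustration, and the sample values are taken from this talk's title slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = ("title", "creator", "subject", "description", "publisher",
               "contributor", "date", "type", "format", "identifier",
               "source", "language", "relation", "coverage", "rights")

def dc_record(fields):
    """Build a <record> holding one dc:* child per supplied value."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        # Every element is optional and repeatable, so accept one value or many.
        for value in ([values] if isinstance(values, str) else values):
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record

record = dc_record({
    "title": "Digital Libraries and the Open Archives Initiative",
    "creator": "Fox, Edward A.",
    "date": "2000-06-30",
    "subject": ["digital libraries", "open archives"],
})
xml_text = ET.tostring(record, encoding="unicode")
```

Because all 15 elements are optional and repeatable, a record this small is still valid Dublin Core, which is part of what makes it a low barrier for archives to adopt.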

Remember!
- Digital libraries introduction
- Digital libraries to enhance learning
- OAi (help establish enormous international cooperative of data and service providers)

Enhancing Learning with DLs

NSF Education Innovation (EI)
- NSF “Interactive Learning with a Digital Library in Computer Science” ( )
- 45 online courses (esp. Internet, IR, MM, Professionalism, overall EI project pages): 100+K accesses/wk
- Tools: SWAN (visualization), QUIZIT
- Evaluation
  - traditional
  - network logging and analysis
  - tools for visualization

Digital Library Courseware
- WWW pages or large PDF copy files
- Online quizzes based on book by Michael Lesk (Morgan Kaufmann Publishers)
- Contents based on book, with several other popular topics added (e.g., agents)
- Separate pages to supplement: Definitions, Resources (People, Projects), and References

CS -> CSTC -> CRIM
- NSF and ACM Education Committee are funding a 2-year project, “A Computer Science Teaching Center” (CSTC)
- College of NJ, U. Ill. Springfield, Virginia Tech
- Focus initially on labs, visualization, multimedia
- Multimedia part is also supported by a 2nd grant to Virginia Tech and The George Washington University (with curricular guidelines also under development)

CS Teaching Center (CSTC)
- Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
- Learners benefit from having well-crafted modules that have been reviewed and tested.
- Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built. [See NSF NSDL - National Science (math, engineering, technology education) Digital Library (formerly SMETE-lib).]
- ACM Education Board and SIG support, new NSF grant with COLLEGIS Research Institute and others …

Browsing (1)

Browsing (2)

CRIM Rationale
- MM field needs properly trained personnel
- Support this with resources + curricula
- Together these help us move toward a DL for Interactive MM -> CS -> NSDL
- Benefits will go to teachers (who have more to build upon) and students (who will have a richer environment for learning)

CRIM Project Activities
- Workshops, other ways to involve community
- WWW site including DL in CSTC re MM
  - Devised cataloging schema, designed interface
  - Referring to all MM syllabi and curriculum
  - Inviting learning resources for the CRIM DL, with reviews, reuse certifications
- Publish report on MM curriculum through ACM and IEEE, after careful review
- Introducing into CC2001: information retrieval, hypertext/hypermedia, multimedia, digital libraries

Curriculum Resources in Interactive Multimedia (CRIM)
- MM field needs properly trained personnel
- Support this with resources + curricula
- Benefits will go to teachers (who have more to build upon) and students (who will have a richer environment for learning)
- CSTC, CRIM have led to ACM Journal of Educational Resources in Computing, JERIC
- Together these help us move forward: DL for Interactive MM -> CS -> NSDL

SMETE Library -> NSDL (from to NSF DLI-2)
- Context: Global movement toward Digital Libraries (see April 1998 CACM)
- NSF effort: Science, Mathematics, Engineering, and Technology Education Digital Library (focussed on undergraduates)
  - 3 workshops, yearly increasing funds / new calls
- NSDL will operate as a distributed federation, with separate parts for each key discipline, and should lead to a global effort.

Selected NSDL Projects/Topics
COLLEGIS Res. Inst.: IMS, CS, Math, Viz., …
Columbia University: Earth sciences
Stanford University: Medicine (images)
U. California Berkeley: Engineering
University of Maryland: K-12 education
U. Texas at Austin: Physical anthropology

Remember!
- Digital libraries introduction
- Digital libraries to enhance learning
- OAi (help establish enormous international cooperative of data and service providers)

Open Archives initiative OAi

OAi Philosophy
- Self-archiving = submission mechanism
- Long-term storage system = archive
- Open interface = harvesting mechanism
- Data provider + service provider
- Start with “gray literature”
  - e-prints/pre-prints, reports, dissertations, …

Tiered Model of Interoperability
- Mediator services
- Metadata harvesting
- Document models

Repository of Digital Objects
[Diagram labels: Repository Access Protocol, handle, Digital object, terms and conditions]

Open Archives initiative History
- xxx at LANL = Los Alamos National Laboratory (Ginsparg) for high-energy physics
- CSTR + WATERS = NCSTRL (Lagoze)
- xxx + NCSTRL = CoRR collaboration
- UPS (Universal Preprint Service) - 1999 mtg
  - Herbert Van de Sompel (U. Ghent, SFX) …
  - Dublin Core (DC), XML
  - Dienst protocol and software (Lagoze)
- Renamed late 1999 as OAi

Open Archives (protoproto)
- ArXiv & Los Alamos National Lab
- CogPrints & U. Southampton
- NACA & NASA (reports)
- NCSTRL & Cornell U.
- NDLTD & Virginia Tech
- RePEc & U. Surrey
- Total of around 200K records

Original Open Archives Members
- Caroline Arms, Library of Congress
- Leslie Carr, University of Southampton
- Mark Doyle, American Physical Society
- Dale Flecker, Harvard University
- Edward A. Fox, Virginia Tech
- Michael Friedman, HighWire Press, Stanford U.
- Paul M. Gherman, Vanderbilt U. & SPARC
- Paul Ginsparg, Los Alamos National Lab. & xxx
- Stevan Harnad, University of Southampton
- Thomas Krichel, University of Surrey & RePEc
- Carl Lagoze, Cornell University …

Original Open Archives Members cont’d
- Rick Luce, Los Alamos National Laboratory
- Clifford Lynch, Coalition for Networked Info.
- Kurt Maly, Old Dominion University
- Michael Nelson, NASA Langley Research Center
- John Ober, California Digital Library
- Bob Parks, Washington University & EconWPA
- Herbert Van de Sompel, University of Ghent
- Eric F. Van de Velde, Caltech
- Don Waters, The Andrew W. Mellon Foundation
- Ken Weiss, California Digital Library

Open Archives Future
- EconWPA (Washington University)
- e-biomed -> PubMed Central (NIH)
- PubScience (DOE)
- Clinical Medicine Netprints (+ other HighWire Press holdings)
- University ePub (California Digital Library)
- All public e-prints (MIT)
- Scholar’s Forum (Caltech)
- Int’l: CERN, Germany, India, Mexico, …
- Goal: millions of books/articles/reports / yr

Approaches to Open Archives
- Build By Discipline
- Build By Institution

Approaches to Open Archives
- Build By Discipline
- Build By Institution
[Diagram labels: Author, Category, Interdisciplinary, Year, Language, Query, …]

Open Archives initiative (OAi)
- high-energy physics (Ginsparg, 1991)
- CSTR + WATERS = NCSTRL (Lagoze, 1994)
- xxx + NCSTRL = CoRR collaboration (1998)
- Universal Preprint Service protoproto, Oct , 1999, Santa Fe - led by LANL, CNI, DLF, Mellon -> OAi
- Santa Fe Convention (see Feb. D-Lib Magazine article)
- Follow-on mtgs: Antonio, (ECDL)
- Archives -> Open Archives
  - Support unique archive identifiers
  - Implement Open Archives Metadata Set (DC-based, using XML)
  - Implement Dienst harvesting interface (based on Dienst protocol)
  - Register the archive
- Build tools, layer other services: linking, searching, …
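The data provider / service provider split behind these requirements can be sketched in a few lines. This is an in-memory illustration only: the class `Archive`, the functions `list_identifiers`, `get_record`, and `harvest`, and the `archive-id:local-id` identifier format are all hypothetical names chosen here, not the Dienst verbs or the Santa Fe Convention syntax. The point is only that unique archive identifiers plus a "what is new since a date" call are enough for a service provider to build and refresh a central index.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Archive:
    """Hypothetical data provider holding metadata records keyed by local id."""
    archive_id: str  # unique identifier assigned when the archive registers
    records: dict = field(default_factory=dict)  # local id -> (date added, metadata)

    def add(self, local_id, added, metadata):
        self.records[local_id] = (added, metadata)

    def list_identifiers(self, since=None):
        """Full identifiers of all records, or only those added on/after `since`."""
        return sorted(f"{self.archive_id}:{local_id}"
                      for local_id, (added, _) in self.records.items()
                      if since is None or added >= since)

    def get_record(self, identifier):
        """Return the metadata for one full identifier."""
        archive_id, _, local_id = identifier.partition(":")
        if archive_id != self.archive_id:
            raise KeyError(identifier)
        return self.records[local_id][1]

def harvest(archives, since=None):
    """Hypothetical service provider: pull metadata from every archive into one index."""
    index = {}
    for archive in archives:
        for identifier in archive.list_identifiers(since):
            index[identifier] = archive.get_record(identifier)
    return index

archive_a = Archive("archive-a")
archive_a.add("1", date(2000, 1, 5), {"title": "An early report"})
archive_a.add("2", date(2000, 6, 1), {"title": "A recent dissertation"})
archive_b = Archive("archive-b")
archive_b.add("1", date(2000, 6, 15), {"title": "A recent e-print"})

full_index = harvest([archive_a, archive_b])
recent_index = harvest([archive_a, archive_b], since=date(2000, 6, 1))
```

Prefixing every record id with the archive id is what keeps `archive-a:1` and `archive-b:1` distinct once records from many archives land in the same index.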

Mechanisms
- Sharing
  - Join federation, run software
  - Make metadata and archive available
- Aggregating
  - By discipline
  - By institution
  - By genre
- Automating
  - Workflow
  - Harvesting and providing services
  - Federated searching
  - Dynamic linking (e.g., with SFX)

Report on Open Archives work in progress at Virginia Tech
With students: Hussein Suleman, Dave Watkins, Robert France, Marcos Andre Goncalves

VT View of the Open Archives initiative (OAi)
- Enable sharing of publication metadata and full-text by digital libraries
- Standardize low-level mechanisms to share contents of libraries
- Build higher-level user-centric and administrative services in meta-libraries
- Install organizational mechanisms to support the technical processes

Virginia Tech Projects
- MARC XML-DTD
- Computer Science Teaching Centre (CSTC)
- W3C Web Characterization Repository
- OAi Repository Explorer
- Networked Digital Library of Theses and Dissertations (NDLTD)

MARC XML-DTD
- XML transport format for US-MARC records
- Standardized metadata exchange format for traditional library services joining OAi

CS Teaching Center (CSTC)
- Collection of reviewed online resources used to aid in teaching of Computer Science
- Supports author submission and peer-review process for new ACM Journal of Educational Resources In Computing (JERIC)
- Connected with NSDL (NSF 00-44)

W3C Web Characterization Repository
- Online database of metadata related to publications, tools and data sets dealing with Web characterization
- Project of the Web Characterization Activity working group of the World-Wide-Web Consortium

OAi Repository Explorer
- Serves as a compliance test
- Allows browsing of open archives using only OAi protocol
- Sends requests on behalf of user, parses and checks responses, and displays browsable interface
- Will detect most discrepancies in protocol
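The "parses and checks responses" step can be illustrated with a small checker. This is a sketch under stated assumptions, not the Repository Explorer's code: the response layout (`<response>` containing `<record>` elements) and the list of required fields are hypothetical, chosen only to show how a compliance test can report discrepancies rather than fail silently.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of fields every record in a response must carry.
REQUIRED = ("identifier", "datestamp", "title")

def check_response(xml_text):
    """Return a list of problems found in one response; an empty list means it passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    records = root.findall("record")
    if not records:
        problems.append("no <record> elements found")
    for number, record in enumerate(records, start=1):
        for name in REQUIRED:
            element = record.find(name)
            if element is None or not (element.text or "").strip():
                problems.append(f"record {number}: missing or empty <{name}>")
    return problems

good = ("<response><record><identifier>archive-a:1</identifier>"
        "<datestamp>2000-06-30</datestamp><title>Sample</title></record></response>")
bad = ("<response><record><identifier>archive-a:1</identifier>"
       "<datestamp>2000-06-30</datestamp></record></response>")

good_report = check_response(good)
bad_report = check_response(bad)
```

Returning every problem found, instead of stopping at the first, suits a tool meant to help archive maintainers fix their implementations.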

NDLTD
- Work has begun on interoperability between Virginia Tech and partners in Germany
- Wrappers have been created to harvest data from remote sites which use other protocols
- Harvested data to be stored in a central OAi-compliant database (work in progress)

Extending Services - 1 of 2
- Working with publishers
  - Motivate students: awards, …
  - Publicize support of NDLTD
    - ACM, ACS, IEEE-CS, Elsevier, …
  - Allow students to increase level of access
- Arranging preservation
  - Mirroring worldwide
  - Involving long-term trusted parties

Extending Services - 2 of 2
- Adding services currently prototyped
  - annotation and SDI (routing) capabilities
  - Dublin Core metadata, crosswalk to MARC
  - support for XML, *ML, preservation
  - harvesting, federated search
- Adding other services planned
  - building/using citation DB (CiteSeer, SFX, …)
  - implementing plagiarism check (like “SCAM”)
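A Dublin Core to MARC crosswalk is, at its simplest, a lookup table from element names to MARC tags and subfields. The sketch below is illustrative only: the tag choices loosely follow the Library of Congress crosswalk for unqualified Dublin Core and should be checked against it before any real use, the table covers only a few elements, and the function name `crosswalk` and the tuple output format are assumptions made here.

```python
# Partial, illustrative mapping: Dublin Core element -> (MARC tag, subfield code).
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "language": ("546", "a"),
    "rights": ("540", "a"),
}

def crosswalk(dc_fields):
    """Map DC fields to (tag, subfield, value) tuples; also report unmapped elements."""
    marc, unmapped = [], []
    for name, values in dc_fields.items():
        values = [values] if isinstance(values, str) else list(values)
        target = DC_TO_MARC.get(name)
        if target is None:
            unmapped.append(name)  # keep track of what the table cannot express
            continue
        tag, subfield = target
        marc.extend((tag, subfield, value) for value in values)
    return sorted(marc), unmapped

marc_fields, skipped = crosswalk({
    "title": "Digital Libraries and the Open Archives Initiative",
    "creator": ["Fox, Edward A."],
    "date": "2000",
    "coverage": "Virginia",
})
```

Reporting the unmapped elements matters because crosswalks between a 15-element set and a fine-grained format lose information in both directions.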

Remember!
- Digital libraries introduction
- Digital libraries to enhance learning
- OAi (help establish enormous international cooperative of data and service providers)