MARIAN: Searching and Querying Across Heterogeneous Federated Digital Libraries. Marcos André Gonçalves, Robert K. France, Edward A. Fox, Tamas E. Doszkocs.

MARIAN: Searching and Querying Across Heterogeneous Federated Digital Libraries. Marcos André Gonçalves, Robert K. France, Edward A. Fox, Tamas E. Doszkocs. Work performed at Virginia Tech, Blacksburg, VA, USA. Support provided in part by NSF & National Library of Medicine.

JCDL 2001: First Joint ACM/IEEE Conference on Digital Libraries (+ NSF DLI-2 PI mtg), June 24-28, 2001, in Roanoke, VA. Conference Committee: General Chair: Edward A. Fox, Virginia Tech; Program Chair: Christine Borgman, UCLA; Treasurer: Neil Rowe, Naval Postgraduate School; Posters Chair: Craig Nevill-Manning, Rutgers U.; …

Outline: NDLTD; Harvesting Strategies and the OAI; MARIAN Middleware; Generating Digital Libraries with 5SL; Future Directions.

NDLTD (1 of 3). Context: Networked Digital Library of Theses and Dissertations. Please join! Submit your (student’s) works! International federation of universities, libraries, and supporting institutions (e.g., VTLS union catalog). Extremely heterogeneous: autonomy of management and decentralization; disparate protocols, metadata, repositories (e.g., UMI, OCLC’s WorldCat), languages, encodings, user characteristics and preferences.

NDLTD (2 of 3). Worldwide organization: educational/social context. National/regional projects in Australia, Catalunya, Germany, India, Latin America (UNESCO/OAS/ISTEC), South Africa (Mellon), USA (including OhioLINK), … International conference (225 in March 2000; more expected for the next, at Caltech). Steering committee representing supporting groups as well as the hundreds of universities.

NDLTD (3 of 3). Unique collection: discipline/document context. Multilingual and multimedia content. Large, book-size documents. Full content in several formats (XML, PDF, etc.). Large number of bibliographic references. Several sets of metadata, of varying quality, that can fit with the Open Archives Initiative (

Harvesting Strategies. Harvesting vs. Federated Search. Harvesting plus Federated Search, plus local collections. The NDLTD Union Collection. Multiple Harvesting Protocols: Harvest™ System, Z39.50, Dienst, OAI.

Union Collection Architecture

Open Archives Initiative (OAI). Interoperability Standards: Released - Jan/Feb. Data + Service Providers. Metadata Harvesting Protocol: unique identifiers (URNs) for each record; date-stamp for each record when last modified/created/deleted; HTTP server with scripting capabilities; 6 service requests (verbs): Identify, ListMetadataFormats, ListSets, ListIdentifiers, GetRecord, ListRecords.
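
As a rough illustration of how the six verbs travel over plain HTTP, the following Java sketch assembles request URLs. The base URL is a made-up example, and the argument names `metadataPrefix` and `from` and the prefix `oai_dc` come from the published OAI harvesting specification rather than from this slide; treat the whole block as a minimal sketch, not as MARIAN's or NDLTD's harvester.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds OAI request URLs; the base URL used in main is hypothetical. */
final class OaiRequestSketch {

    /** The six protocol verbs named on the slide. */
    enum Verb { Identify, ListMetadataFormats, ListSets, ListIdentifiers, GetRecord, ListRecords }

    /** Joins the base URL, the verb, and any extra arguments into one HTTP GET URL. */
    static String buildUrl(String baseUrl, Verb verb, Map<String, String> args) {
        StringBuilder url = new StringBuilder(baseUrl).append("?verb=").append(verb.name());
        for (Map.Entry<String, String> arg : args.entrySet()) {
            url.append('&').append(arg.getKey()).append('=')
               .append(URLEncoder.encode(arg.getValue(), StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] argv) {
        String base = "http://example.org/oai"; // hypothetical data provider
        Map<String, String> args = new LinkedHashMap<>();
        args.put("metadataPrefix", "oai_dc"); // Dublin Core, per the OAI specification
        args.put("from", "2001-01-01");       // only records changed since this date
        System.out.println(buildUrl(base, Verb.Identify, new LinkedHashMap<>()));
        System.out.println(buildUrl(base, Verb.ListRecords, args));
    }
}
```

Because every request is an ordinary GET with a `verb` argument, a data provider needs nothing more than the "HTTP server with scripting capabilities" the slide mentions.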

Figure (Herbert Van de Sompel): low-barrier interop umbrella. Labels: metadata; OPAC, image, FTXT, A&I, e-print.

OAI harvesting tools (figure, Herbert Van de Sompel). Service provider: harvester. Data provider: repository. Repository elements: Datestamp, Identifier, Set, Records.

OAI harvesting tools (figure, Herbert Van de Sompel). Service provider: harvester. Data provider: repository. Supporting protocol requests: Identify, ListMetadataFormats, ListSets. Harvesting protocol requests: ListRecords, ListIdentifiers, GetRecord.
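
The datestamp and set elements shown in these figures are what make incremental, selective harvesting possible. The Java sketch below imitates that behaviour against an in-memory list; the `Item` class, the identifiers, and the set names are invented for illustration, and no real repository or network call is involved.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a data provider; not a real OAI implementation. */
final class SelectiveHarvestSketch {

    /** One repository entry: identifier, datestamp of last change, and set name. */
    static final class Item {
        final String identifier;
        final LocalDate datestamp;
        final String set;

        Item(String identifier, LocalDate datestamp, String set) {
            this.identifier = identifier;
            this.datestamp = datestamp;
            this.set = set;
        }
    }

    /** Returns identifiers changed on or after 'from'; a null 'set' means all sets. */
    static List<String> listIdentifiers(List<Item> repository, LocalDate from, String set) {
        List<String> ids = new ArrayList<>();
        for (Item item : repository) {
            boolean changedSince = !item.datestamp.isBefore(from);
            boolean inSet = set == null || set.equals(item.set);
            if (changedSince && inSet) {
                ids.add(item.identifier);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Item> repo = List.of(
            new Item("oai:example:etd-1", LocalDate.of(2001, 1, 10), "theses"),
            new Item("oai:example:etd-2", LocalDate.of(2001, 3, 5), "theses"),
            new Item("oai:example:rep-1", LocalDate.of(2001, 4, 1), "reports"));
        // A harvester that last ran on 2001-02-01 asks only for newer thesis records.
        System.out.println(listIdentifiers(repo, LocalDate.of(2001, 2, 1), "theses"));
    }
}
```

A service provider would remember the date of its last successful harvest and pass it as `from` on the next run, so that only changed records cross the network.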

Design Features: Combined Harvesting, Federated Search, and Local Collections; Object-Oriented Information Graph Representation; 5S Model and 5SL Specification Language.

MARIAN Middleware. Flexible Representation Model: Information Graph; Class Hierarchies; Weights and Weighted Sets (with lazy evaluation). Class-Based Search: Unified Searcher API. Combining Heterogeneous Information: Structural Matching; Synthetic Superclasses.

Information Graph Model (1/2). Each Information Object is a Node. Structure is exposed through Links. Features of interest can become Nodes or can remain hidden within Node Class Search Methods.
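
One way to picture this model is a small typed graph. In the Java sketch below, every class, field, and method name is hypothetical; only the ideas (objects as nodes, structure as links, some features kept inside a node) and the class and link names `ThesisDissertation`, `Individual`, and `HasAuthor` come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy information graph: objects are nodes, structure is carried by typed links. */
final class InfoGraphSketch {

    static final class Node {
        final String id;
        final String nodeClass;
        final String hiddenTitle; // a feature kept inside the node rather than as its own node

        Node(String id, String nodeClass, String hiddenTitle) {
            this.id = id;
            this.nodeClass = nodeClass;
            this.hiddenTitle = hiddenTitle;
        }
    }

    static final class Link {
        final String linkClass;
        final Node from;
        final Node to;

        Link(String linkClass, Node from, Node to) {
            this.linkClass = linkClass;
            this.from = from;
            this.to = to;
        }
    }

    private final List<Link> links = new ArrayList<>();

    void addLink(String linkClass, Node from, Node to) {
        links.add(new Link(linkClass, from, to));
    }

    /** Returns the nodes reachable from 'from' over links of the given class. */
    List<Node> follow(Node from, String linkClass) {
        List<Node> targets = new ArrayList<>();
        for (Link link : links) {
            if (link.from == from && link.linkClass.equals(linkClass)) {
                targets.add(link.to);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        InfoGraphSketch graph = new InfoGraphSketch();
        Node thesis = new Node("etd-1", "ThesisDissertation", "A Sample Thesis");
        Node author = new Node("person-1", "Individual", null);
        graph.addLink("HasAuthor", thesis, author); // the author is promoted to a node
        System.out.println(graph.follow(thesis, "HasAuthor").get(0).id);
    }
}
```

Here the author is a node of its own, reachable through a link, while the title stays hidden inside the thesis node and would be reached only through that node class's search methods.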

Information Graph Model (2/2)

Class-Based Search. Common Search Methods: Text; Link / Weighted Link; Node in Context. Common Searcher Operations: Match Best (weighted maximum); Match Most (summative union).
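
The two searcher operations can be read as two ways of merging weighted result sets. The Java sketch below assumes that a weighted set is a map from object identifier to weight, that "weighted maximum" keeps the highest weight per object, and that "summative union" adds the weights without normalization. These are interpretations of the slide's wording, not MARIAN's documented behaviour, and the sketch ignores lazy evaluation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Two ways of merging weighted result sets (object id -> weight). */
final class WeightedSetSketch {

    /** Match Best, read as: keep the maximum weight seen for each object. */
    static Map<String, Double> matchBest(List<Map<String, Double>> sets) {
        Map<String, Double> merged = new LinkedHashMap<>();
        for (Map<String, Double> set : sets) {
            for (Map.Entry<String, Double> e : set.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Math::max);
            }
        }
        return merged;
    }

    /** Match Most, read as: add up the weights seen for each object. */
    static Map<String, Double> matchMost(List<Map<String, Double>> sets) {
        Map<String, Double> merged = new LinkedHashMap<>();
        for (Map<String, Double> set : sets) {
            for (Map.Entry<String, Double> e : set.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Double> titleHits = Map.of("etd-1", 0.5, "etd-2", 0.25);
        Map<String, Double> abstractHits = Map.of("etd-1", 0.25, "etd-3", 0.75);
        System.out.println(matchBest(List.of(titleHits, abstractHits)));
        System.out.println(matchMost(List.of(titleHits, abstractHits)));
    }
}
```

Under these assumptions, Match Best rewards an object for its single strongest piece of evidence, while Match Most rewards objects that match in many places.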

Class-Based Search
public interface ClassManager {
    public WtdObjSet match(InfoDesc description);
    public boolean isInClass(FullID id);
    public Object idToObject(FullID id);
    public Vector idsToObjects(Vector ids);
}
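
To make the interface concrete, here is a toy manager for a class of title strings. The slide names `WtdObjSet`, `InfoDesc`, and `FullID` without defining them, so the versions below are hypothetical stand-ins, the generic `Vector` parameters are an addition for type safety, and the substring test with an all-or-nothing weight is an assumption chosen for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Vector;

// Hypothetical stand-ins: the slide names these types but does not define them.
final class FullID { final String value; FullID(String value) { this.value = value; } }
final class InfoDesc { final String term; InfoDesc(String term) { this.term = term; } }
final class WtdObjSet { final Map<String, Double> weights = new LinkedHashMap<>(); }

// Same four methods as on the slide, with generic Vectors added.
interface ClassManager {
    WtdObjSet match(InfoDesc description);
    boolean isInClass(FullID id);
    Object idToObject(FullID id);
    Vector<Object> idsToObjects(Vector<FullID> ids);
}

/** Toy manager for a class of title strings; matching is a plain substring test. */
final class TitleManagerSketch implements ClassManager {
    private final Map<String, String> titles = new LinkedHashMap<>();

    void add(String id, String title) { titles.put(id, title); }

    @Override public WtdObjSet match(InfoDesc description) {
        WtdObjSet result = new WtdObjSet();
        String term = description.term.toLowerCase();
        for (Map.Entry<String, String> e : titles.entrySet()) {
            if (e.getValue().toLowerCase().contains(term)) {
                result.weights.put(e.getKey(), 1.0); // all-or-nothing weight, an assumption
            }
        }
        return result;
    }

    @Override public boolean isInClass(FullID id) { return titles.containsKey(id.value); }

    @Override public Object idToObject(FullID id) { return titles.get(id.value); }

    @Override public Vector<Object> idsToObjects(Vector<FullID> ids) {
        Vector<Object> objects = new Vector<>();
        for (FullID id : ids) {
            objects.add(idToObject(id));
        }
        return objects;
    }
}
```

The point of the sketch is the division of labour: each class of object brings its own `match` logic, and callers see only weighted sets of identifiers that they can resolve to objects later.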

Class-Based Search

Combining Sources of Information. Structural Matching: Extends Weighted Retrieval to include “Best Match to Document Structure”; Recursive, Extensible. Collection Views: Simple Interface to Complex Collections; Common Interface to Diverse Collections; Weighted Interface to Collections of Varying Quality.

NDLTD Collection View (part) [figure]. Labels: PhysDis-ETD (SOIF); ThesisDissertation; Individual; Subject; SubClasses. Links: HasAuthor, HasSubject, HasDcCreator, HasCrawlerAuthor, HasDcSubject, HasHeadings, HasKeywords. Fields: Dc.creator, Dc.Subject, dc.title, dc.description, crawlerTitle, crawlerDescription, Headings, Keywords, title, description, body.
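
A collection view of this kind can be sketched as a search over a superclass link that fans out to source-specific subclass links. The Java sketch below assumes that `HasDcCreator` and `HasCrawlerAuthor` are subclasses of `HasAuthor` (suggested by the labels in the figure, not stated in the text), that each source carries a trust weight, and that results are merged by maximum. The weights and the merging rule are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Searching a synthetic superclass by combining its source-specific subclasses. */
final class CollectionViewSketch {

    /**
     * trustBySubclass: subclass link name -> trust in that source (invented values).
     * resultsBySubclass: subclass link name -> (document id -> raw match weight).
     * Each raw weight is scaled by its source's trust; the best scaled weight wins.
     */
    static Map<String, Double> searchSuperclass(
            Map<String, Double> trustBySubclass,
            Map<String, Map<String, Double>> resultsBySubclass) {
        Map<String, Double> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Double> source : trustBySubclass.entrySet()) {
            Map<String, Double> results =
                resultsBySubclass.getOrDefault(source.getKey(), Map.of());
            for (Map.Entry<String, Double> hit : results.entrySet()) {
                double scaled = hit.getValue() * source.getValue();
                merged.merge(hit.getKey(), scaled, Math::max);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // A query on HasAuthor fans out to both subclass links.
        Map<String, Double> trust = new LinkedHashMap<>();
        trust.put("HasDcCreator", 1.0);     // curated Dublin Core metadata
        trust.put("HasCrawlerAuthor", 0.5); // author guessed by a crawler
        Map<String, Map<String, Double>> results = new LinkedHashMap<>();
        results.put("HasDcCreator", Map.of("etd-1", 0.5));
        results.put("HasCrawlerAuthor", Map.of("etd-1", 0.75, "etd-2", 0.5));
        System.out.println(searchSuperclass(trust, results));
    }
}
```

The caller queries one link class and never needs to know which source supplied a given author, which is one reading of the "common interface to diverse collections" idea.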

5S Model for Digital Libraries (1/2). Formal Model: Streams, Structures, Spaces, Services, Societies.

5S Model for Digital Libraries (2/2). Formal Model: Streams, Structures, Spaces, Services, Societies. NDLTD / MARIAN Example: Document (presentable, indexable information object); Weighted Set (e.g., of results to a match operation); Collection Graph; Inheritance Lattice; Measure Space; Adaptive Search; Query History Maintenance; Library End-Users; DL Builders.

5SL Generates Digital Library (Components)

Generating Digital Libraries: XML

Interoperability with 5S and 5SL. Reductionist / Constructivist Approach. Compositional mappings between DLs: Composition of S-based constructs; Mapping language.

Student Projects to Integrate. Schedule-driven Harvester. SDI / Filtering for NDLTD. MARIAN-Phronesis (Spanish – Monterrey), and work with German (Oldenburg / DFG), Portuguese, Chinese, Japanese, Korean. TREC data formatted for loading.

Future Work. Fusion on hybrid architecture. Incorporation of belief networks. Using 5SL to generate wrappers. New services/functionalities: Personalization (e.g., history, folders); Visualization (e.g., Envision applet). Integration with PetaPlex (100 nodes, 2.5 Tbytes disk capacity, > 300 Mbps to campus backbone, Sornil inversion).

Conclusions. NDLTD provides a real, fertile DL testbed. Harvesting strategies and the OAI. MARIAN middleware: graphs, classes, views. Generating Digital Libraries with 5SL. Future: high-performance services, experimental comparisons.