Distributing the Indexing and Retrieval of Information. Winston Bourne, IRNLP.

Introduction
- Need for distributed IR
- How indices help
- Controlling a distributed search
- Using meta-data
- Distributed IR solutions
- Meta-search engines

Need for Distributed IR
- Large, ever-growing pool of data
- Finding required data
- Relevance and quality of results
- Distributing IR allows scaling of searches

How Indices Help
- Index resources by some method
- Indices are far smaller than the data pool
- Allow multiple agents to search quickly (consider a library)
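To make the index idea concrete, here is a minimal sketch of an inverted index in Python. It assumes whitespace tokenisation and boolean AND matching, neither of which comes from the slides; the names `build_index` and `search` and the sample documents are illustrative only.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data, not taken from the presentation.
documents = {
    "doc1": "distributed information retrieval",
    "doc2": "retrieval of scientific data",
    "doc3": "distributed scientific data",
}
index = build_index(documents)
hits = search(index, "distributed retrieval")
```

A query is answered from the index alone, so several agents could share or partition it without touching the underlying documents.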

Controlling a Distributed Search
- Prevent duplicate results
- Use domain-specific agents
- Identify and track queries
- Self-organizing networks, using hypertext
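One plausible reading of "identify and track queries" is that each query carries an identifier, so an agent that has already handled it can ignore a second copy. The sketch below assumes that reading; the `Agent` class, its fields, and the sample network are hypothetical and not taken from any system named in the slides.

```python
import uuid


class Agent:
    """A search agent that forwards queries to its neighbours."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # doc id -> text
        self.neighbours = []
        self.seen_queries = set()  # ids of queries already handled

    def handle(self, query_id, term):
        """Return {doc id: name of the first agent that reported it}."""
        if query_id in self.seen_queries:
            return {}  # query already tracked here, so stop the loop
        self.seen_queries.add(query_id)
        results = {
            doc_id: self.name
            for doc_id, text in self.documents.items()
            if term in text.lower().split()
        }
        for neighbour in self.neighbours:
            for doc_id, source in neighbour.handle(query_id, term).items():
                # setdefault keeps the first report and drops duplicates
                results.setdefault(doc_id, source)
        return results


# Hypothetical three-agent network containing a cycle.
a = Agent("a", {"d1": "harvest index"})
b = Agent("b", {"d2": "harvest broker", "d1": "harvest index"})
c = Agent("c", {"d3": "emerge science"})
a.neighbours = [b, c]
b.neighbours = [c, a]
c.neighbours = [a]

results = a.handle(uuid.uuid4().hex, "harvest")
```

Document `d1` is held by two agents but appears once in the result, and the cycle between the agents terminates because each one handles a given query id only once.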

Using Meta-Data
- Meta-data is data about data
- Create a description of a resource
- Convert query into meta-data
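The two steps above (describe a resource, then express the query in the same terms) can be sketched as follows. The field names `title`, `creator`, and `subject` are borrowed from Dublin Core; the `field:value` query syntax, the substring matching, and all function names are assumptions made for illustration.

```python
def describe(title, creator, subjects):
    """Build a small meta-data record describing one resource."""
    return {"title": title, "creator": creator, "subject": list(subjects)}


def query_to_metadata(query):
    """Parse 'field:value' pairs into a partial meta-data record."""
    wanted = {}
    for part in query.split():
        field, _, value = part.partition(":")
        if value:  # ignore tokens that are not field:value pairs
            wanted.setdefault(field.lower(), []).append(value.lower())
    return wanted


def matches(description, wanted):
    """True if every wanted value occurs in the matching field."""
    for field, values in wanted.items():
        held = description.get(field, [])
        if isinstance(held, str):
            held = [held]
        text = " ".join(held).lower()
        if not all(value in text for value in values):
            return False
    return True


# Hypothetical records; only the first title and author come from the slides.
records = [
    describe("Distributing the Indexing and Retrieval of Information",
             "Winston Bourne", ["distributed retrieval", "indexing"]),
    describe("An Unrelated Resource", "A. N. Other", ["cookery"]),
]
wanted = query_to_metadata("creator:bourne subject:retrieval")
found = [r["title"] for r in records if matches(r, wanted)]
```

Matching happens entirely on the descriptions, so the resources themselves never need to be fetched during the search.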

Distributed IR Solutions
- Emerge: specifically for scientific data
- Harvest: general distributed IR, with a choice of topology
- CHIC-Pilot project: a demonstration distributed IR architecture that converges many standards and protocols

Standards and Protocols
- Dublin Core: generic set of meta-data descriptors
- RDF: used by Emerge, XML based, inheritance hierarchy
- SOIF: used by Harvest, simple text based
- Centroids: remove redundant terms to further compress indices
- Z39.50: ISO, widely used by established institutions
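As a rough illustration of the centroid idea, the sketch below assumes a centroid is the set of distinct terms per attribute across one server's records, so each repeated term is stored once. The record layout, the server names, and the functions `centroid` and `may_hold` are hypothetical.

```python
def centroid(records):
    """Collapse records into one set of distinct terms per attribute."""
    summary = {}
    for record in records:
        for attribute, value in record.items():
            summary.setdefault(attribute, set()).update(value.lower().split())
    return summary


def may_hold(summary, attribute, term):
    """True if a server's centroid shows the term under the attribute."""
    return term.lower() in summary.get(attribute, set())


# Hypothetical records held by two index servers.
server_records = {
    "server_a": [
        {"creator": "Bourne", "subject": "distributed retrieval"},
        {"creator": "Bourne", "subject": "distributed indexing"},
    ],
    "server_b": [
        {"creator": "Smith", "subject": "image retrieval"},
    ],
}
centroids = {name: centroid(recs) for name, recs in server_records.items()}

# Forward the query only to servers whose centroid contains the term.
targets = sorted(
    name for name, summary in centroids.items()
    if may_hold(summary, "subject", "indexing")
)
```

The centroid for `server_a` keeps "bourne" and "distributed" once each although both occur in two records, which is the compression the slide refers to; a miss in the centroid means the server can be skipped.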

Meta-Search Engines
- Everyday demonstration of distributed IR
- Use a single interface to query many conventional search engines
- Collation of duplicate results
- Convert the query into a form compatible with each target engine
- Breadth of search at the expense of depth
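The meta-search steps above can be sketched with two stand-in engines. Everything here is hypothetical: the engines return canned results, the two query syntaxes are invented, and the ranking rule (more engines first, then best rank) is one possible choice rather than how any real meta-search engine works.

```python
# Canned results keyed by each stand-in engine's own query syntax.
ALPHA_DATA = {"distributed+retrieval": ["http://example.org/a",
                                        "http://example.org/b"]}
BETA_DATA = {"distributed AND retrieval": ["http://example.org/b/",
                                           "http://example.org/c"]}


def engine_alpha(native_query):
    return ALPHA_DATA.get(native_query, [])


def engine_beta(native_query):
    return BETA_DATA.get(native_query, [])


def to_plus_syntax(terms):
    return "+".join(terms)


def to_and_syntax(terms):
    return " AND ".join(terms)


def meta_search(terms, engines):
    """Query every engine in its own syntax and collate duplicate URLs."""
    merged = {}
    for name, (translate, search) in engines.items():
        for rank, url in enumerate(search(translate(terms))):
            key = url.rstrip("/").lower()  # crude URL normalisation
            entry = merged.setdefault(
                key, {"url": key, "engines": [], "best_rank": rank})
            entry["engines"].append(name)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Results found by more engines come first, then by best rank.
    return sorted(merged.values(),
                  key=lambda e: (-len(e["engines"]), e["best_rank"], e["url"]))


engines = {
    "alpha": (to_plus_syntax, engine_alpha),
    "beta": (to_and_syntax, engine_beta),
}
ranked = meta_search(["distributed", "retrieval"], engines)
urls = [entry["url"] for entry in ranked]
```

The page returned by both engines is listed once, with both engine names attached. Only each engine's top results are seen, which reflects the breadth-over-depth trade-off noted in the last bullet.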