Is Metasearching Really Better Searching? STM Innovations Seminar
UKOLN: a centre of expertise in digital information management (www.ukoln.ac.uk)

Presentation transcript:

Is Metasearching Really Better Searching?
STM Innovations Seminar, London, Friday 2 December 2005
Pete Johnston, Research Officer, UKOLN, University of Bath

Is Metasearching Better Searching?
What is metasearch?
Making metasearch work
  – The NISO Metasearch Initiative
Metasearch today
  – Metasearch and Google
  – Metasearch and "social bookmarking"

What is metasearch?

What is metasearch?
"Metasearch, parallel search, federated search, broadcast search, cross-database search, search portal are a familiar part of the information community's vocabulary. They speak to the need for search and retrieval to span multiple databases, sources, platforms, protocols, and vendors at one time."
NISO MetaSearch initiative

The search problem
User wants to find, access, and use items made available by multiple content providers
Content providers make their collections available through their own separate presentation services
User interacts with multiple services in succession, e.g.
  – Query Resource Discovery Network (RDN) for Web resources
  – Query Zetoc for journal articles
  – etc

[Diagram: The search problem. Label: Web Sites]

The search problem
User has to
  – Discover different services
  – Manage different authentication/access requirements
  – Use different user interfaces for search
  – Interpret different result sets
      – different metadata
  – Manipulate different result sets
      – human-readable (HTML) but difficult to merge, reuse
May still not have access to (appropriate copy of) resource

The metasearch solution
The provision of "metasearch" services that
  – enable user to search across the metadata databases of multiple content providers from a single interface
  – manage multiple result sets and present to user
  – manage authentication/access
  – (etc!)
Seamless (to the user) discovery of and access to heterogeneous, distributed resources!

Approaches to metasearch (1): cross-searching
Metasearch service accepts user query
Sends query to multiple content provider search targets
Receives responses from targets
Presents result sets to user

[Diagram: Metasearch: Cross-search. Labels: Web Site; Search Targets; Z39.50, SRW, SRU, etc]
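
The cross-searching steps (accept a query, send it to several targets, receive the responses, present the result sets) can be sketched roughly as follows. This is only an illustration: the target names and their contents are hypothetical in-memory stand-ins, and a real service would talk to each target over a protocol such as Z39.50, SRW or SRU.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "search targets" (assumption for illustration);
# a real target would be queried over Z39.50, SRW, SRU, etc.
TARGETS = {
    "target_a": ["Metadata harvesting basics", "Cross-searching in practice"],
    "target_b": ["Harvesting and indexing", "Journal article discovery"],
}


def search_target(name, query):
    """Return (target name, matching titles) for a single target."""
    hits = [title for title in TARGETS[name] if query.lower() in title.lower()]
    return name, hits


def cross_search(query):
    """Send the same query to every target in parallel and collect one result set per target."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda name: search_target(name, query), TARGETS)
        return dict(results)


if __name__ == "__main__":
    for target, hits in cross_search("harvesting").items():
        print(target, hits)
```

Because every query waits on the slowest target, this arrangement returns the latest results but may slow overall response time.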

Approaches to metasearch (2): harvesting
Metasearch service periodically gathers metadata records from content provider repositories into local database
Metasearch service accepts user query
Executes query on local database
Presents result sets to user
Some harvesting services may also harvest/index copy of resource

[Diagram: Metasearch: Harvester. Labels: Web Site; Repositories; OAI-PMH]
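
The harvesting approach can be sketched in the same spirit. The example below builds an OAI-PMH ListRecords request URL and then runs user queries against a local copy of the records only. The repository URL, repository names and record titles are hypothetical, and fetching and parsing the OAI-PMH XML responses is deliberately left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


class Harvester:
    """Keeps a local database of records so user queries never touch the repositories."""

    def __init__(self):
        self.local_db = []

    def harvest(self, repositories):
        # 'repositories' maps a hypothetical repository name to record titles
        # that have already been fetched; the HTTP and XML steps are omitted.
        self.local_db = [
            (name, title)
            for name, titles in repositories.items()
            for title in titles
        ]

    def search(self, query):
        """Run the query against the local database only."""
        return [(n, t) for n, t in self.local_db if query.lower() in t.lower()]


if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    harvester = Harvester()
    harvester.harvest({
        "repo_a": ["Crystal structure data"],
        "repo_b": ["Eprints on crystallography"],
    })
    print(harvester.search("crystal"))
```

Records added to a repository after the last call to harvest() are invisible to search(), which is the "only as up-to-date as last harvest" trade-off.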

Cross-searching & harvesting
Metasearch service may use both in combination!
Cross-search
  – Latest results returned
  – Content provider controls searches available
  – May slow overall performance
Harvesting
  – Better performance for user query
  – Options for normalisation etc by harvester
  – Only as up-to-date as last harvest

A hospitable climate for metasearch?
Metasearch service depends on access to metadata
Web Services
  – Standards for providing machine interfaces to applications on Web
  – Based on HTTP and XML
  – SOAP (messaging protocol), WSDL (service description), WS-* (!!)
  – WS not just for search!
  – Service-oriented approaches, modular applications
  – Google and Amazon provide Web Services
"Web 2.0"
  – "The Web as platform"
  – Recombining data and services from multiple sources

The problems with metasearch
User requires/expects resources from increasing range of content providers
What if content provider doesn't implement standard search/harvest interface?
Some proprietary APIs, "XML Gateways"
  – Scalability
Some "screen-scraping"
  – Parsing of HTML pages to obtain metadata
  – Rights issues
  – Scalability, volatility

The problems with metasearch
Metasearch services work, but…
For service provider
  – complex, laborious
  – fragile, susceptible to change by content provider
  – duplication of effort by service providers
For content provider
  – concerns over efficiency
  – concerns over access management
  – rights, branding, results presentation/ranking

Making metasearch work

Making metasearch work
Effective metasearch requires agreements between content providers and service providers
  – Transport protocol(s)
  – Query language(s)
      – syntax and semantics
  – Metadata schemas
      – syntax and semantics
  – Metadata quality
      – presence of values, formats of literals etc
  – Intellectual property rights issues
      – how metadata records and resources are presented, used
  – Authorisation / authentication
  – Disclosure / discovery of collections and services
Andy Powell, "Metasearching: an overview", Presentation to BCS EPSG Seminar, July 2004

The NISO Metasearch Initiative
Response to concerns of librarians, systems vendors, content providers
Aims to enable
  – metasearch service providers to offer more effective and responsive services
  – content providers to deliver enhanced content and protect their intellectual property
  – libraries to deliver services that distinguish their services from Google and other free web services
NISO MetaSearch initiative

Task Group 1: Access Management
Conducted survey of authentication methods in use
Developed use cases for authentication in metasearch context
Ranked methods by ability to satisfy needs of use cases
Recommends either:
  – IP-Authentication with a Proxy Server, or
  – Username/Password authentication
Liaison with Shibboleth community

Task Group 2: Collection Description
Metasearch service needs information about targets available for search/harvest
  – Discover collections of potential interest
  – Obtain sufficient information to identify a collection
  – Select one or more collections from amongst a number of discovered collections
  – Discover the services that provide access to the collection
  – Select a service with which to interact
  – Interact with service
Collection description
Service description

[Diagram labels: Metasearch 1; Metasearch 2; Collection/Service Knowledge Base 1; Collection/Service Knowledge Base 2; Shared Collection/Service Registry]

Task Group 2: Collection Description
Collection Description Specification
  – Metadata schema for collection-level description
  – Closely aligned with DCMI Collection Description Application Profile
  – Title, Subject, Size, Language, Item Type, Owner, Collector, Audience, Rights etc
  – Whole/Part relationships
  – Collection/Catalogue relationships
  – Collection/Service relationships

Task Group 2: Collection Description
Information Retrieval Service Description Specification
  – Describe those digital services that provide access to collections
  – ZeeRex
      – Indicates protocol used
      – Describes access point(s) for service
      – Describes authentication/authorization requirements
      – Lists operations/queries supported

Task Group 3: Search/Retrieve
Result Set Metadata
  – Metadata schema to describe result set and record within result set
  – To support ranking, branding etc
Citation Metadata
  – Metadata schema for citation components (based on subset of OpenURL)

Task Group 3: Search/Retrieve
NISO XML Gateway
  – Based on SRU ("non-conformant subset")
  – Query encoded in URI, transmitted in HTTP GET, response as XML document
  – Three levels of implementation
      – Level 0: Any query grammar
      – Level 1: Provide description record for database
      – Level 2: Support CQL
  – Liaison with A9 OpenSearch
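
To illustrate "query encoded in URI, transmitted in HTTP GET", here is a minimal sketch that builds a search URL using the standard SRU searchRetrieve parameter names. Whether the gateway's SRU subset expects exactly these parameters is an assumption here, and the base URL and the CQL query are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, maximum_records=10):
    """Encode a query into a URI suitable for an HTTP GET request.

    Parameter names follow SRU searchRetrieve (version 1.1); their use by
    any particular gateway is assumed, not confirmed.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and a simple CQL query.
    print(build_search_url("http://gateway.example.org/search", 'dc.title = "metasearch"'))
```

The response to such a request would be an XML document, which the metasearch service then parses and merges with the results from other targets.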

Metasearch today

Metasearch and Google
Google
  – Harvests full-text of Web pages by following links
  – Makes indexes available for search
  – Result ranking based on number of links to page
Index coverage limited to "visible Web"
  – Problems with
      – Authentication controls
      – Non-persistent URIs
      – Non-textual resources
Even if indexed, low ranking if few links
No fielded searching

Metasearch and Google
"Success is as much about what you don't search as what you do"
Selection is important
Relevance of results not determined only by links, citations
  – e.g. often useful/vital to select/filter by audience, purpose of resource
Roy Tennant, "Is Metasearch Dead?"

Metasearch and Google
Google interest in indexing "hidden Web"
  – Collaborations with repository providers, OCLC etc
  – Google Scholar
Google interest in metadata-based approach?
  – Google Base
Google and Metasearch as complementary approaches to discovery

Metasearch and "Social bookmarking"
[Screenshot: del.icio.us]

Metasearch and "Social bookmarking"
[Screenshot: Connotea. Callout: Bibliographic metadata added to item by Connotea]

Metasearch and "Social Bookmarking"
Simple user-generated metadata
  – Typically description plus "tags"
  – Capture user perceptions of resources
  – Some services adding richer metadata
Social: merging of personal collections
  – Bookmarking services as discovery services
  – Connotea as "community-driven recommendation system" (Lund et al)
Metadata available via RSS or simple API
  – Can metasearch services use/integrate metadata from bookmarking services?
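
As a rough illustration of how a metasearch service might pick up metadata exposed via RSS, the sketch below parses a small feed into simple records of title, link and tags. The feed is a made-up RSS 2.0 sample that uses category elements for tags; the feed formats actually used by particular bookmarking services may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 sample (assumption for illustration only).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example bookmarks</title>
    <item>
      <title>OAI-PMH overview</title>
      <link>http://example.org/oai</link>
      <category>metadata</category>
      <category>harvesting</category>
    </item>
    <item>
      <title>SRU introduction</title>
      <link>http://example.org/sru</link>
      <category>search</category>
    </item>
  </channel>
</rss>"""


def bookmarks_from_rss(xml_text):
    """Turn each RSS item into a dict with title, link and a list of tags."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "tags": [c.text for c in item.findall("category")],
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for record in bookmarks_from_rss(SAMPLE_FEED):
        print(record["title"], record["tags"])
```

Records in this shape could be loaded into the same local database as harvested metadata, though the tags are uncontrolled and would need normalising before they are merged with other sources.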

Is Metasearching Better Searching?
Technical components for metasearch available
User expectations of coverage mean metasearch is a cross-domain problem
However, quality of metasearch dependent on
  – metadata quality
  – metadata consistency
  – …across multiple providers
Metasearch can complement other approaches
Metasearch as "enabler"
  – supporting construction of many different services
