OCLC Online Computer Library Center Interoperability Standards & Searching Multiple Repositories Ralph LeVan/OCLC Ray Denenberg/Library of Congress.


1 OCLC Online Computer Library Center Interoperability Standards & Searching Multiple Repositories Ralph LeVan/OCLC Ray Denenberg/Library of Congress

2 The Problem How do I provide a common interface for my users? How do I combine results from multiple sources?

3 How do I provide a common interface for my users? How do I convert my queries into the Content Provider’s (CP’s) queries? How do I ask for 10 records? How do I ask for more records? How do I interpret their response?

4 How do I convert my queries into the CP’s queries? My user said "author=twain and title=huck finn"
–Google expects: +twain +"huck finn"
–Z39.50: twain/1=1003;4=2 "huck finn"/1=4;4=1 and
–Lucene: creator:twain and titlePhrase:"huck finn"
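The per-provider translation on this slide can be sketched in a few lines of Python. This is only an illustration: the `Clause` shape, the function names, and the two lookup tables are assumptions made for the example. The field names and attribute strings are copied from the slide, and the Z39.50 line reproduces the slide's textual postfix rendering, not the binary encoding a real Z39.50 client would send.

```python
from typing import List, Tuple

Clause = Tuple[str, str]  # (index name, search term); hypothetical shape

# Mappings copied from the slide; a real gateway would need one table per CP.
LUCENE_FIELDS = {"author": "creator", "title": "titlePhrase"}
Z3950_ATTRS = {"author": "1=1003;4=2", "title": "1=4;4=1"}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are treated as phrases."""
    return f'"{term}"' if " " in term else term


def to_google(clauses: List[Clause]) -> str:
    # Google-style: every term is required (+); index names are dropped.
    return " ".join("+" + quote(term) for _, term in clauses)


def to_lucene(clauses: List[Clause]) -> str:
    return " and ".join(
        f"{LUCENE_FIELDS[index]}:{quote(term)}" for index, term in clauses
    )


def to_z3950_text(clauses: List[Clause]) -> str:
    # Postfix layout as shown on the slide: operands first, then the operators.
    operands = [f"{quote(term)}/{Z3950_ATTRS[index]}" for index, term in clauses]
    return " ".join(operands + ["and"] * (len(operands) - 1))


user_query = [("author", "twain"), ("title", "huck finn")]
print(to_google(user_query))
print(to_lucene(user_query))
print(to_z3950_text(user_query))
```

Even this toy version needs a separate function and mapping table per target, which is the maintenance burden the slide is pointing at.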

5 How do I ask for 10 records?
–Amazon: won’t let you
–RedLightGreen: MAXRECORDS=n
–British Library: records=n

6 How do I ask for more records?
–Amazon: page=n
–RedLightGreen: STARTINDEX=n
–British Library: start=n
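A client that hides these differences ends up with a small dispatch table. The sketch below uses the parameter names from slides 5 and 6. The provider keys, the function name, the 1-based start position, and Amazon's page size of 10 are assumptions made for illustration.

```python
AMAZON_PAGE_SIZE = 10  # assumption: the slides do not state Amazon's page size


def paging_params(provider: str, start: int, count: int) -> dict:
    """Return provider-specific URL parameters for `count` records from `start` (1-based)."""
    if provider == "redlightgreen":
        return {"STARTINDEX": start, "MAXRECORDS": count}
    if provider == "britishlibrary":
        return {"start": start, "records": count}
    if provider == "amazon":
        # Amazon only exposes pages; the requested count cannot be honoured.
        return {"page": (start - 1) // AMAZON_PAGE_SIZE + 1}
    raise ValueError(f"unknown provider: {provider}")


for name in ("amazon", "redlightgreen", "britishlibrary"):
    print(name, paging_params(name, start=11, count=10))
```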

7 How do I interpret their response? How many records did I retrieve? Did something go wrong? How do I convert the CP’s records into something my users will recognize?

8 How many records did I retrieve?
–Amazon: Books (334)
–RedLightGreen: Viewing: 1-10 of 239 results
–British Library: 190
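Pulling the hit count out of each of these strings takes one pattern per site. The sketch below uses Python's `re` module against the three sample strings from the slide. The provider keys and the patterns themselves are assumptions fitted to those samples, and would break as soon as a site changed its wording.

```python
import re

# One hand-written pattern per provider, fitted to the sample strings on the slide.
COUNT_PATTERNS = {
    "amazon": re.compile(r"\((\d+)\)"),                 # "Books (334)"
    "redlightgreen": re.compile(r"of\s+(\d+)\s+results"),  # "Viewing: 1-10 of 239 results"
    "britishlibrary": re.compile(r"(\d+)"),             # "190"
}


def hit_count(provider: str, text: str):
    """Return the total hit count found in `text`, or None if nothing matches."""
    match = COUNT_PATTERNS[provider].search(text)
    return int(match.group(1)) if match else None


print(hit_count("amazon", "Books (334)"))
print(hit_count("redlightgreen", "Viewing: 1-10 of 239 results"))
print(hit_count("britishlibrary", "190"))
```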

9 Did Something Go Wrong? RedLightGreen: We didn't find any matches for dog and. British Library: Nothing found due to an error Too many hits. Refine your request.

10 How do I convert the records? Amazon: Thud! (Discworld, Book 32) by Terry Pratchett ( Hardcover - Sep 13, 2005) Books: See all 334 items Buy new : $24.95 $15.72 Used & new from $3.76 Usually ships in 24 hours Excerpt from page 2 : "... Terry Pratchett "Most of the news is... " See more references to pratchett in this book. Surprise me! See a random page in this book.

11 Converting Records Cont. RedLightGreen: Hogfather, by Terry Pratchett 3 editions published between 1996 and 1998 in English. Primary Subject: Discworld Imaginary Place - Fiction 2.

12 Converting Records Cont. British Library: Thud! / Terry Pratchett. http://catalogue.bl.uk/F/-?func=direct-doc-set&doc_number=013220851&l_base=BLL01&from=A9OpenSearch Pratchett, Terry. ; London : Doubleday, 2005. ISBN 0385608675 (hbk.) : £17.99. (Added : 20050614)

13 How do I combine results from multiple sources? Things you might want the server to do for you: –Common Record Format –Common Sort Order –Common Rank Order

14 Functional Matrix
–Request Record Starting Point
–Request Number of Records
–Request Record Schema
–Defined Query Grammar
–Specify Sort Order
–Specify Ranking Order
–Diagnostic Messages
–XML Response
–Record Count In Response
–Records In Known Schema

15 The Old Solutions Screen Scraping Private APIs Z39.50

16 Screen Scraping A query has to be generated and embedded in a CP-specific URL Code has to be written to examine the HTML returned by a CP Prone to breakage –Web sites change formatting frequently Every site is unique –Separate code to be maintained for every site

17 Private APIs Often only a slight improvement over screen scraping Provides documentation on how to construct the URL Might provide documentation on how to construct the query Might guarantee a stable response format Still requires unique code for each site

18 Z39.50 Guarantees a standard request and response. But…
–Not HTTP or HTML: binary encoding over raw TCP/IP
–Complicated: 11 services, 7 extended services
–Easy to be compliant and not interoperable
–Unfriendly: the response to a protocol error was to drop the connection

19 Why Use A Standard API? Defined requests and responses Reusable code across sites Open Source code

20 The New Solutions OpenSearch 1.1 MXG –Levels 0-2 SRU

21 OpenSearch 1.1 From Wikipedia –OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication. It is a way for search engines to publish their search results in a standard and accessible format

22 OpenSearch 1.1 (cont.) Defines a Description Record with information about the CP –ShortName and LongName –Description –Tags –URL template Example: http://herbie.bl.uk:9080/opensearch.xml

23 OpenSearch 1.1 (cont.) URL Template –The server indicates how to specify OpenSearch request parameters –Parameters not specified in the template are unavailable –The only mandatory parameter is {searchTerms}

24 OpenSearch 1.1 (cont.) Request Parameters –{searchTerms} –{count} –{startIndex} –{startPage} –{language} –{outputEncoding} –{inputEncoding}
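Filling in a URL template amounts to substituting the `{name}` placeholders. The sketch below assumes the OpenSearch 1.1 convention that a trailing `?` inside the braces marks an optional parameter. The function name and the `example.org` template are hypothetical and do not come from the British Library description record.

```python
import re
from urllib.parse import quote

PLACEHOLDER = re.compile(r"\{([^{}?]+)(\??)\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute OpenSearch-style {name} / {name?} placeholders in a URL template."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote(str(values[name]), safe="")  # percent-encode the value
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing mandatory parameter: {name}")

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical template, for illustration only.
template = "http://example.org/search?q={searchTerms}&n={count?}&start={startIndex?}"
print(fill_template(template, {"searchTerms": "huck finn", "count": 10}))
```

Because the template comes from the server's description record, the same client code can address any OpenSearch provider without per-site URL logic.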

25 OpenSearch 1.1 (cont.) Uses RSS 2.0 with a few extra elements for the response –RSS defines the title, description and link elements –OpenSearch adds the totalResults, startIndex, itemsPerPage, link and Query elements Example: http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q=levan&format=rss
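Reading such a response needs only a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` and the OpenSearch 1.1 namespace. The sample feed is invented for illustration (its values echo earlier slides) and is not the actual British Library output.

```python
import xml.etree.ElementTree as ET

OS_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Invented sample feed; a real response would come from the provider's URL.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example results</title>
    <opensearch:totalResults>190</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item><title>Thud!</title><link>http://example.org/1</link></item>
  </channel>
</rss>"""


def parse_opensearch_rss(text: str) -> dict:
    """Extract the OpenSearch paging elements and the item titles from an RSS feed."""
    channel = ET.fromstring(text).find("channel")

    def number(tag):
        node = channel.find(f"os:{tag}", OS_NS)
        return int(node.text) if node is not None else None

    return {
        "totalResults": number("totalResults"),
        "startIndex": number("startIndex"),
        "itemsPerPage": number("itemsPerPage"),
        "titles": [item.findtext("title") for item in channel.findall("item")],
    }


print(parse_opensearch_rss(SAMPLE))
```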

26 Functional Matrix OS 1.1
Request Record Starting Point | ●
Request Number of Records     | ○
Request Record Schema         |
Defined Query Grammar         |
Specify Sort Order            |
Specify Ranking Order         |
Diagnostic Messages           |
XML Response                  | ○
Record Count In Response      | ○
Records In Known Schema       | ○
Key: ● = Full Support, ○ = Limited Support

27 Cool Feature The RSS mechanism in OpenSearch provides the ability to have persistent and periodic queries!

28 NISO MetaSearch XML Gateway (MXG) MXG was designed to provide a low implementation barrier for content providers that want to make their databases available to metasearch engines. Interoperability across content providers was explicitly not a goal of MXG.

29 MXG Levels of Support Level 0: Requests are simple URLs using any query grammar, and responses are XML records. Level 1: Adds a description record for the database. Level 2: Supports a limited subset of a standard query grammar: CQL.

30 MXG Request
–Version (mandatory)
–Query (mandatory)
–StartRecord
–MaximumRecords
Example: http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
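Because the request is a plain URL with four named parameters, building it is a matter of URL encoding. A minimal sketch using `urllib.parse.urlencode` follows. The function name is hypothetical, and the base URL is the one shown on the slide.

```python
from urllib.parse import urlencode


def mxg_url(base, query, start_record=None, maximum_records=None, version="1.1"):
    """Build an MXG/SRU-style request URL; version and query are mandatory."""
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    return base + "?" + urlencode(params)  # urlencode percent-encodes the quotes


print(mxg_url("http://alcme.oclc.org/MXG/search/ORPubs", '"levan"', 1, 10))
```

The same function works unchanged against any MXG or SRU server, since only the base URL differs from one provider to the next.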

31 MXG Response <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"> 1.1 10 … 1 1.1 "stuff"

32 MXG Response Records info:srw/schema/1/dc-v1.1 xml … 1

33 MXG Response recordData rrl1234 Dog and Cat

34 MXG Error Messages info:srw/diagnostic/1/51 66ntqk http://www.loc.gov/z3950/agency/zing/srw/diagnostics- list.html
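The response slides above lost their XML tags in transcription. The sketch below therefore parses an invented response that follows the SRU 1.1 element names (`numberOfRecords`, `record`, `recordSchema`, `diagnostic`, and so on) and reuses the values visible on the slides. The exact layout of the original examples, and the Dublin Core title element inside `recordData`, are assumptions.

```python
import xml.etree.ElementTree as ET

SRW = "{http://www.loc.gov/zing/srw/}"
DIAG = "{http://www.loc.gov/zing/srw/diagnostic/}"

# Invented sample: element names follow SRU 1.1, values echo the slides.
SAMPLE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Dog and Cat</dc:title>
      </recordData>
      <recordPosition>1</recordPosition>
    </record>
  </records>
  <diagnostics>
    <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
      <uri>info:srw/diagnostic/1/51</uri>
      <details>66ntqk</details>
    </diagnostic>
  </diagnostics>
</searchRetrieveResponse>"""


def parse_response(text: str) -> dict:
    """Pull the record count, record schemas and diagnostics out of a response."""
    root = ET.fromstring(text)
    return {
        "numberOfRecords": int(root.findtext(SRW + "numberOfRecords")),
        "schemas": [r.findtext(SRW + "recordSchema") for r in root.iter(SRW + "record")],
        "diagnostics": [
            (d.findtext(DIAG + "uri"), d.findtext(DIAG + "details"))
            for d in root.iter(DIAG + "diagnostic")
        ],
    }


print(parse_response(SAMPLE))
```

Unlike the scraped strings on slides 8 and 9, the count and the error are found by element name, so one parser serves every provider.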

35 Functional Matrix MXG Level 0
Request Record Starting Point | ●
Request Number of Records     | ●
Request Record Schema         | ○
Defined Query Grammar         |
Specify Sort Order            |
Specify Ranking Order         |
Diagnostic Messages           | ●
XML Response                  | ●
Record Count In Response      | ●
Records In Known Schema       | ●
Key: ● = Full Support, ○ = Limited Support

36 MXG Level 1 Add a description record for the database http://www.loc.gov/z3950/agency/zing/srw/explain.html http://alcme.oclc.org/MXG/search/ORPubs

37 Functional Matrix MXG Level 1
Request Record Starting Point | ●
Request Number of Records     | ●
Request Record Schema         | ●
Defined Query Grammar         |
Specify Sort Order            |
Specify Ranking Order         |
Diagnostic Messages           | ●
XML Response                  | ●
Record Count In Response      | ●
Records In Known Schema       | ●
Key: ● = Full Support, ○ = Limited Support

38 MXG Level 2 Supports a limited subset of a standard query grammar: CQL. Supports indexes and Booleans. http://www.loc.gov/z3950/agency/zing/cql/ http://alcme.oclc.org/srw/search/ORPublications?version=1.1&query=dc.author=levan&maximumRecords=1

39 Functional Matrix MXG Level 2
Request Record Starting Point | ●
Request Number of Records     | ●
Request Record Schema         | ●
Defined Query Grammar         | ○
Specify Sort Order            |
Specify Ranking Order         |
Diagnostic Messages           | ●
XML Response                  | ●
Record Count In Response      | ●
Records In Known Schema       | ●
Key: ● = Full Support, ○ = Limited Support

40 SRU MXG Level 2 Plus: –Full Query Grammar (CQL) –Full Sort Specification

41 CQL: Common Query Language Loosely based on CCL Search Boolean & Proximity Operators Index Sets & Indexes String Indexes vs. Keyword Indexes Truncation Characters '*', '#' & '?' Relations: '=', all, any, exact, within Example: dc.title="harry potter" or bib1.isbn=123-456-78x
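A query like the example above can be assembled from (index, relation, term) triples. The sketch below is a simplified illustration and not a full CQL serializer. The function names and the rule for when a term needs quoting are assumptions, and it does not cover proximity operators or nested Booleans.

```python
SYMBOL_RELATIONS = {"=", "<", ">", "<=", ">=", "<>"}


def cql_clause(index: str, term: str, relation: str = "=") -> str:
    """Render one search clause, quoting the term when it contains special characters."""
    if any(ch in term for ch in ' "()=<>/'):
        term = '"' + term.replace('"', '\\"') + '"'
    if relation in SYMBOL_RELATIONS:
        return f"{index}{relation}{term}"
    return f"{index} {relation} {term}"  # word relations such as all, any, exact


def cql_join(clauses, boolean: str = "and") -> str:
    """Join clauses with a single Boolean operator."""
    return f" {boolean} ".join(clauses)


print(cql_join(
    [cql_clause("dc.title", "harry potter"), cql_clause("bib1.isbn", "123-456-78x")],
    "or",
))
```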

42 Sort sortKeys parameter with the following comma separated values specified: –Xpath (path to the element to be sorted on) –Schema (that the xpath comes from) –Ascending (value is 1==true or 0==false, default==true) –CaseSensitive (value is 1==true or 0==false, default==false) –missingValue (values are omit, abort, highValue or lowValue, default==highValue) e.g. &sortKeys=title,onix,0
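The comma-separated value can be built by listing the five fields in order and dropping trailing ones that are still at their defaults. This is a sketch under the defaults stated on the slide. The function name and the choice to trim trailing defaults are assumptions.

```python
def sort_key(path, schema="", ascending=True, case_sensitive=False,
             missing_value="highValue"):
    """Build one sortKeys value: path,schema,ascending,caseSensitive,missingValue."""
    fields = [
        path,
        schema,
        "1" if ascending else "0",
        "1" if case_sensitive else "0",
        missing_value,
    ]
    defaults = [None, "", "1", "0", "highValue"]
    # Trailing fields that still hold their default value can be omitted.
    while len(fields) > 1 and fields[-1] == defaults[len(fields) - 1]:
        fields.pop()
    return ",".join(fields)


# Descending sort on the ONIX title, as in the slide's example.
print("&sortKeys=" + sort_key("title", "onix", ascending=False))
```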

43 Functional Matrix SRU
Request Record Starting Point | ●
Request Number of Records     | ●
Request Record Schema         | ●
Defined Query Grammar         | ●
Specify Sort Order            | ●
Specify Ranking Order         | ○
Diagnostic Messages           | ●
XML Response                  | ●
Record Count In Response      | ●
Records In Known Schema       | ●
Key: ● = Full Support, ○ = Limited Support

44 Cool Feature Combining SRU response data and echoed data with JavaScript and stylesheets allows for thin, browser-based clients http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10

45 Functional Matrix
Feature                       | OS 1.1 | MXG L0 | MXG L1 | MXG L2 | SRU
Request Record Starting Point | ●      | ●      | ●      | ●      | ●
Request Number of Records     | ○      | ●      | ●      | ●      | ●
Request Record Schema         |        | ○      | ●      | ●      | ●
Defined Query Grammar         |        |        |        | ○      | ●
Specify Sort Order            |        |        |        |        | ●
Specify Ranking Order         |        |        |        |        | ○
Diagnostic Messages           |        | ●      | ●      | ●      | ●
XML Response                  | ○      | ●      | ●      | ●      | ●
Record Count In Response      | ○      | ●      | ●      | ●      | ●
Records In Known Schema       | ○      | ●      | ●      | ●      | ●
Key: ● = Full Support, ○ = Limited Support

