Interoperability Standards & Searching Multiple Repositories


1 Interoperability Standards & Searching Multiple Repositories
Ralph LeVan/OCLC
Ray Denenberg/Library of Congress

2 The Problem
How do I provide a common interface for my users?
How do I combine results from multiple sources?

3 How do I provide a common interface for my users?
How do I convert my queries into the Content Provider’s (CP’s) queries?
How do I ask for 10 records?
How do I ask for more records?
How do I interpret their response?

4 How do I convert my queries into the CP’s queries?
My user said: author=twain and title=huck finn
Google expects: +twain +"huck finn"
Z39.50: twain/1=1003;4=2 "huck finn"/1=4;4=1 and
Lucene: creator:twain and titlePhrase:"huck finn"
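The translation step on this slide can be sketched in a few lines of Python. This is only an illustration: the `Clause` type, the function names, and the field and attribute tables are hypothetical, and the output strings simply copy the notation shown on the slide rather than any CP's documented grammar.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    index: str  # the user's index name, e.g. "author" or "title"
    term: str


# Mappings copied from the slide's examples; a real gateway would need
# one such table per CP (assumption: only these two indexes exist).
LUCENE_FIELDS = {"author": "creator", "title": "titlePhrase"}
Z3950_ATTRS = {"author": "1=1003;4=2", "title": "1=4;4=1"}


def _quote(term):
    """Wrap multi-word terms in double quotes so they stay one phrase."""
    return f'"{term}"' if " " in term else term


def to_google(clauses):
    return " ".join("+" + _quote(c.term) for c in clauses)


def to_lucene(clauses):
    return " and ".join(
        f"{LUCENE_FIELDS[c.index]}:{_quote(c.term)}" for c in clauses
    )


def to_z3950(clauses):
    # The slide writes the operator after its operands, so each extra
    # clause is followed by "and".
    parts = [f"{_quote(c.term)}/{Z3950_ATTRS[c.index]}" for c in clauses]
    out = parts[0]
    for part in parts[1:]:
        out += f" {part} and"
    return out


query = [Clause("author", "twain"), Clause("title", "huck finn")]
print(to_google(query))
print(to_lucene(query))
print(to_z3950(query))
```

Every new CP adds another function and another mapping table, which is the maintenance cost the later slides argue against.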

5 How do I ask for 10 records?
Amazon: won’t let you
RedLightGreen: MAXRECORDS=n
British Library: records=n

6 How do I ask for more records?
Amazon: page=n
RedLightGreen: STARTINDEX=n
British Library: start=n
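A rough Python sketch of what a client has to carry around just to page through these three CPs. The parameter names come from the slides; the `PAGING` table, the function name, the 1-based record numbering, and the fixed Amazon page size of 10 are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Parameter names are from the slides; the table layout is hypothetical.
PAGING = {
    "Amazon": {"count": None, "start": "page", "paged": True},
    "RedLightGreen": {"count": "MAXRECORDS", "start": "STARTINDEX", "paged": False},
    "British Library": {"count": "records", "start": "start", "paged": False},
}


def paging_params(cp, start_record, count, page_size=10):
    """Return the CP-specific query parameters for one page of results.

    start_record is assumed to be 1-based. For page-oriented CPs the
    record number is converted to a page number using page_size, which
    is an assumed fixed size because the CP does not let us choose it.
    """
    rule = PAGING[cp]
    params = {}
    if rule["count"] is not None:
        params[rule["count"]] = str(count)
    if rule["paged"]:
        params[rule["start"]] = str((start_record - 1) // page_size + 1)
    else:
        params[rule["start"]] = str(start_record)
    return params


for cp in PAGING:
    print(cp, urlencode(paging_params(cp, start_record=11, count=10)))
```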

7 How do I interpret their response?
How many records did I retrieve?
Did something go wrong?
How do I convert the CP’s records into something my users will recognize?

8 How many records did I retrieve?
Amazon: <a href="/gp/search/ref=sr_nr_i_0/ ?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks&page=1">Books</a><span class="narrowValue"> (334)</span>
RedLightGreen: <b>Viewing:</b> 1-10 of 239 results
British Library: <opensearch:totalResults>190</opensearch:totalResults>
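To make the contrast concrete, here is a hedged Python sketch that pulls the hit count out of each of the three fragments above. The regular expressions are guesses fitted to these exact snippets (which is precisely why scraping is fragile), and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def count_from_amazon(html):
    """Scrape the count from the '(334)' inside the narrowValue span."""
    m = re.search(r'class="narrowValue">\s*\(([\d,]+)\)', html)
    return int(m.group(1).replace(",", "")) if m else None


def count_from_redlightgreen(html):
    """Scrape the count from text like '1-10 of 239 results'."""
    m = re.search(r"of\s+([\d,]+)\s+results", html)
    return int(m.group(1).replace(",", "")) if m else None


def count_from_opensearch(xml_text):
    """Read totalResults from an XML response, ignoring the namespace."""
    for el in ET.fromstring(xml_text).iter():
        if el.tag.rsplit("}", 1)[-1] == "totalResults":
            return int(el.text)
    return None


print(count_from_amazon('<span class="narrowValue"> (334)</span>'))
print(count_from_redlightgreen("<b>Viewing:</b> 1-10 of 239 results"))
```

Only the third function relies on a named element rather than on page layout, so it is the only one likely to survive a site redesign.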

9 Did Something Go Wrong?
RedLightGreen: <span class=smallText>We didn't find any matches for <b>dog and</b>.</span>
British Library: <item > <title >Nothing found due to an error</title> <description >Too many hits. Refine your request.</description></item>

10 How do I convert the records?
Amazon: <table class="searchresults" border="0" width="100%" cellpadding="0" cellspacing="0"> <tr><td width="100%" class="searchitem" id="Td:0"> <table border="0" width="100%" cellpadding="0" cellspacing="0"><tr valign="top"> <td> <table class="n2" border="0" cellpadding="0" cellspacing="0"> <tr> <td class="imageColumn" width="88"><table border="0" cellpadding="0" cellspacing="0"> <tr><td align="center" width="80"> <a href=" src=" width="55" alt="Thud! (Discworld, Book 32)" height="82" border="0" /></a> </td><td width="8"></td></tr></table></td> <td class="dataColumn"><table cellpadding="0" cellspacing="0" border="0"><tr><td> <a href=" class="srTitle">Thud! (Discworld, Book 32)</span></a> by Terry Pratchett (<span class="binding">Hardcover</span> - Sep 13, 2005)</td></tr> <tr><td class="brandLink"><span class="aliasName">Books:</span> <a href="/gp/search/ref=sr_nr_seeall_1/ ?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks">See all 334 items</a></td></tr> <tr><td><span class="priceType"><a href=" new</a>: </span> <span class="listprice">$24.95</span> <span class="saleprice">$15.72</span>   <span class="priceType"> <a href=" & new</a> </span> from <span class="otherprice">$3.76</span>   <span class="avail">Usually ships in 24 hours</span> </td></tr><tr><td colspan="2"><table cellpadding="0" cellspacing="0" border="0"> <tr><td class="excerptStart"><span class="excerptLead">Excerpt from</span> <a href="/gp/reader/ /ref=sib_aps_pg/ ?%5Fencoding=UTF8&keywords=pratchett&p=S00E&checkSum=y3glB4NEGJ6Ql3iAWFd6teZptAJmys3Uu8CCW9387%252BA%253D">page 2</a>: "<span class="excerpt">... 
Terry <b>Pratchett</b> "Most of the news is ...</span>"</td></tr> <tr><td class="excerptSeeMore"><a href="/gp/reader/ /ref=sib_aps_ref/ ?%5Fencoding=UTF8&keywords=pratchett&v=search-inside">See more references</a> to <span class="excerptUserInput">pratchett</span> in this book.</td></tr><tr><td style="padding-top: 5px; padding-bottom: 8px;"><span style="font-weight: bold; color: #339933;">Surprise me!</span> <a href=" a random page</a> in this book.</td></tr></table></td></tr> </table></td></tr></table> </td></tr></table></td> </tr>

11 Converting Records Cont.
RedLightGreen: <td class="highlightcell"><span class="titleText"><b><a title="View more information about this title." href="ucw.servlets.UCWController?ACTION=EDITION&WORKID= &LANGUAGE=ENG&MATERIAL=books&FROMRSLT=3&FROMWORK=1&lang=english">Hogfather</a></b>, by Terry Pratchett <br>3 editions published between 1996 and 1998 in English.<br>Primary Subject: Discworld Imaginary Place - Fiction<br><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/gray.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/></span></td></tr></table><table xmlns=" 
border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="recordsepcell" colspan="2"><img src="/ucwprod/web/images/clear.gif" height="1"/></td></tr></table><table xmlns=" border="0" cellpadding="3" cellspacing="0" width="100%"><tr valign="top"><td width="25" align="right" class="highlightcell"><span class="titleText">2.</span></td>

12 Converting Records Cont.
British Library: <item ><title >Thud! / Terry Pratchett.</title> <link > <description > Pratchett, Terry. ; London : Doubleday, ISBN (hbk.) : £ (Added : )</description></item>

13 How do I combine results from multiple sources?
Things you might want the server to do for you:
Common Record Format
Common Sort Order
Common Rank Order
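Once every source returns records in a common format, combining them becomes a small amount of client code. The sketch below assumes a hypothetical `Record` type with three fields and shows two possible merge policies; neither is prescribed by the slides.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Record:
    """A hypothetical common record format shared by all sources."""
    title: str
    creator: str
    source: str


def merge_sorted_by_title(*result_lists):
    """Common sort order: pool all records and sort them by title."""
    pooled = [r for results in result_lists for r in results]
    return sorted(pooled, key=lambda r: (r.title.casefold(), r.source))


def interleave(*result_lists):
    """No common rank order: take one record from each source in turn,
    preserving each source's own ranking."""
    merged = []
    for tier in zip_longest(*result_lists):
        merged.extend(r for r in tier if r is not None)
    return merged


bl = [Record("Thud!", "Pratchett, Terry", "British Library")]
rlg = [
    Record("Hogfather", "Terry Pratchett", "RedLightGreen"),
    Record("Thud!", "Terry Pratchett", "RedLightGreen"),
]
print([r.title for r in merge_sorted_by_title(bl, rlg)])
print([r.title for r in interleave(bl, rlg)])
```

Interleaving is a fallback: without a common rank order from the servers, scores from different sources are not comparable.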

14 Functional Matrix
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

15 The Old Solutions
Screen Scraping
Private APIs
Z39.50

16 Screen Scraping
A query has to be generated and embedded in a CP-specific URL
Code has to be written to examine the HTML returned by a CP
Prone to breakage
  Web sites change formatting frequently
Every site is unique
  Separate code to be maintained for every site

17 Private APIs
Often only a slight improvement over screen scraping
Provides documentation on how to construct the URL
Might provide documentation on how to construct the query
Might guarantee a stable response format
Still requires unique code for each site

18 Z39.50
Guarantees a standard request and response
But…
Not HTTP or HTML
  Binary encoding over raw TCP/IP
Complicated
  11 services
  7 extended services
  Easy to be compliant and not interoperable
Unfriendly
  The response to a protocol error was to drop the connection

19 Why Use A Standard API?
Defined requests and responses
Reusable code across sites
Open Source code

20 The New Solutions
OpenSearch 1.1
MXG Levels 0-2
SRU

21 OpenSearch 1.1
From Wikipedia:
OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication. It is a way for search engines to publish their search results in a standard and accessible format.

22 OpenSearch 1.1 (cont.)
Defines a Description Record with information about the CP
  ShortName and LongName
  Description
  Tags
  URL template
Example:

23 OpenSearch 1.1 (cont.)
URL Template
  The server indicates how to specify OpenSearch request parameters
  Parameters not specified in the template are unavailable
  The only mandatory parameter is {searchTerms}
<Url type="application/rss+xml" template=" />

24 OpenSearch 1.1 (cont.)
Request Parameters
{searchTerms}
{count}
{startIndex}
{startPage}
{language}
{outputEncoding}
{inputEncoding}
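A client fills the URL template by substituting these parameters. The following Python sketch shows one way to do that; the `example.org` template and the function name are hypothetical, and it assumes the OpenSearch 1.1 convention that a trailing `?` inside the braces marks a parameter as optional.

```python
import re
from urllib.parse import quote

PLACEHOLDER = re.compile(r"\{(\w+)(\??)\}")


def fill_template(template, **values):
    """Substitute OpenSearch parameters into a URL template.

    Optional placeholders ({name?}) with no value become empty strings.
    Values for parameters the template does not mention are rejected,
    since the server has not made them available.
    """
    available = {name for name, _ in PLACEHOLDER.findall(template)}
    unknown = set(values) - available
    if unknown:
        raise ValueError(f"not offered by this template: {sorted(unknown)}")

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"required parameter {name!r} not supplied")

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical template, for illustration only.
TEMPLATE = "http://example.org/search?q={searchTerms}&n={count?}&start={startIndex?}&format=rss"
print(fill_template(TEMPLATE, searchTerms="terry pratchett", count=10))
```

The same function works unchanged against any CP that publishes a template, which is the reuse argument for a standard API.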

25 OpenSearch 1.1 (cont.)
Uses RSS 2.0 with a few extra elements for the response
RSS defines the title, description and link elements
OpenSearch adds the totalResults, startIndex, itemsPerPage, link and Query elements
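Because the response is ordinary RSS plus a handful of named elements, parsing it needs no site-specific code. A minimal Python sketch follows; the namespace URI is the one commonly used for OpenSearch 1.1 and is an assumption here, since the slides do not show it, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed OpenSearch 1.1 namespace, in ElementTree's {uri}name form.
OS = "{http://a9.com/-/spec/opensearch/1.1/}"


def parse_opensearch_rss(xml_text):
    """Return paging information and items from an OpenSearch RSS response."""
    channel = ET.fromstring(xml_text).find("channel")

    def number(name):
        text = channel.findtext(OS + name)
        return int(text) if text is not None else None

    return {
        "totalResults": number("totalResults"),
        "startIndex": number("startIndex"),
        "itemsPerPage": number("itemsPerPage"),
        "items": [
            {
                "title": item.findtext("title"),
                "link": item.findtext("link"),
                "description": item.findtext("description"),
            }
            for item in channel.findall("item")
        ],
    }
```

Missing elements come back as None rather than raising, because a server may omit any of the OpenSearch additions.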

26 Functional Matrix
Key: ●==Full Support ○==Limited Support
Column: OS 1.1
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

27 Cool Feature
The RSS mechanism in OpenSearch provides the ability to have persistent and periodic queries!

28 NISO MetaSearch XML Gateway (MXG)
MXG has been designed to provide a low implementation barrier to content providers that want to make their databases available to metasearch engines.
Interoperability across content providers was explicitly not a goal of MXG.

29 MXG Levels of Support
Level 0: Requests are simple URLs using any query grammar and responses are XML records
Level 1: Adds a description record for the database
Level 2: Supports a limited subset of a standard query grammar: CQL

30 MXG Request
Version (mandatory)
Query (mandatory)
StartRecord
MaximumRecords
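Building such a request is a single URL-encoding step. The Python sketch below assumes the SRU-style lower camel case spellings (`version`, `query`, `startRecord`, `maximumRecords`) and a made-up base URL; check the target server's description record for the actual names before relying on it.

```python
from urllib.parse import urlencode


def build_request(base_url, query, version="1.1",
                  start_record=None, maximum_records=None):
    """Build a request URL from the four parameters on the slide.

    version and query are mandatory; the two paging parameters are
    only sent when the caller supplies them.
    """
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
print(build_request("http://example.org/mxg", '"stuff"',
                    start_record=1, maximum_records=10))
```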

31 MXG Response
<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records> … </records>
  <nextRecordPosition>1</nextRecordPosition>
  <echoedSearchRetrieveRequest>
    <query>"stuff"</query>
  </echoedSearchRetrieveRequest>
</searchRetrieveResponse>
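Reading the record count and the next starting position out of such a response takes only a few lines. This Python sketch matches elements by local name so that it does not depend on the namespace URI, which is not legible in the slide; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def summarize_response(xml_text):
    """Return the numeric paging fields found in a searchRetrieveResponse."""
    wanted = ("numberOfRecords", "nextRecordPosition")
    summary = {}
    for el in ET.fromstring(xml_text).iter():
        name = local_name(el.tag)
        if name in wanted and el.text:
            summary[name] = int(el.text)
    return summary
```

A client would typically pass nextRecordPosition back as the starting record of its next request and stop when the element is absent.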

32 MXG Response Records
<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData> </recordData>
  <recordPosition>1</recordPosition>
</record>

33 MXG Response recordData
<srw_dc:dc xmlns=" xmlns:dc=" xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
  <dc:identifier>rrl1234</dc:identifier>
  <dc:title>Dog and Cat</dc:title>
</srw_dc:dc>

34 MXG Error Messages
<diagnostics>
  <diagnostic xmlns="
    <uri>info:srw/diagnostic/1/51</uri>
    <details>66ntqk</details>
  </diagnostic>
</diagnostics>
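With structured diagnostics, "did something go wrong?" becomes a lookup instead of a text match. A small Python sketch, again matching by local element name because the namespace URI is not legible in the slide; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def extract_diagnostics(xml_text):
    """Return one dict per <diagnostic>, keyed by child element name."""
    found = []
    for el in ET.fromstring(xml_text).iter():
        if local_name(el.tag) == "diagnostic":
            found.append(
                {local_name(child.tag): (child.text or "").strip()
                 for child in el}
            )
    return found
```

An empty list means the response carried no diagnostics, so the caller can branch on the result without inspecting any display text.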

35 Functional Matrix
Key: ●==Full Support ○==Limited Support
Column: MXG Level 0
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

36 MXG Level 1
Add a description record for the database

37 Functional Matrix
Key: ●==Full Support ○==Limited Support
Column: MXG Level 1
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

38 MXG Level 2
Support a limited subset of a standard query grammar: CQL
Supports indexes and Booleans

39 Functional Matrix
Key: ●==Full Support ○==Limited Support
Column: MXG Level 2
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

40 SRU
MXG Level 2 Plus:
Full Query Grammar (CQL)
Full Sort Specification

41 CQL: Common Query Language
Loosely based on CCL
Search
Boolean & Proximity Operators
Index Sets & Indexes
String Indexes vs. Keyword Indexes
Truncation Characters ‘*’, ‘#’ & ‘?’
Relations: ‘=’, all, any, exact, within
Example: dc.title="harry potter" or bib1.isbn= x
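A client can assemble simple CQL queries from index, relation, and term triples. The Python sketch below covers only the index, relation, and Boolean features named on this slide; the function names are hypothetical, the quoting rule is a simplification rather than the full CQL grammar, and the `dc.creator` index in the usage lines is just an illustrative choice.

```python
def cql_term(term):
    """Quote a term if it is empty or contains characters that would
    otherwise be read as CQL syntax (a simplified rule)."""
    if term == "" or any(ch in term for ch in ' ()=<>/"'):
        return '"' + term.replace('"', '\\"') + '"'
    return term


def cql_clause(index, relation, term):
    """Build one search clause such as dc.title="harry potter"."""
    if relation == "=":
        return f"{index}={cql_term(term)}"
    # Word relations (all, any, exact, within) need surrounding spaces.
    return f"{index} {relation} {cql_term(term)}"


def cql_boolean(operator, *clauses):
    """Join clauses with a Boolean operator such as 'and' or 'or'."""
    return f" {operator} ".join(clauses)


print(cql_boolean(
    "or",
    cql_clause("dc.title", "=", "harry potter"),
    cql_clause("dc.creator", "any", "rowling pratchett"),
))
```

Because the query grammar is shared, the same string can be sent to every server that supports it, with no per-CP translation table.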

42 Sort
sortKeys parameter with the following comma-separated values specified:
  Xpath (path to the element to be sorted on)
  Schema (that the XPath comes from)
  Ascending (value is 1==true or 0==false, default==true)
  CaseSensitive (value is 1==true or 0==false, default==false)
  missingValue (values are omit, abort, highValue or lowValue, default==highValue)
e.g. &sortKeys=title,onix,0
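The five positional values can be assembled with a small helper. This Python sketch follows the field order and allowed values listed on the slide; the function name is hypothetical, and leaving unspecified middle fields empty while dropping trailing ones is an assumption about how servers read the value.

```python
MISSING_VALUES = ("omit", "abort", "highValue", "lowValue")


def sort_key(xpath, schema=None, ascending=None,
             case_sensitive=None, missing_value=None):
    """Build one sortKeys value: xpath,schema,ascending,caseSensitive,missingValue.

    Fields left as None fall back to the server defaults. Trailing
    unspecified fields are dropped; unspecified fields in the middle
    are left empty so later positions keep their meaning.
    """
    if missing_value is not None and missing_value not in MISSING_VALUES:
        raise ValueError(f"missingValue must be one of {MISSING_VALUES}")

    def flag(value):
        return None if value is None else ("1" if value else "0")

    fields = [xpath, schema, flag(ascending), flag(case_sensitive), missing_value]
    while fields and fields[-1] is None:
        fields.pop()
    return ",".join("" if f is None else f for f in fields)


print(sort_key("title", "onix", ascending=False))
```

The printed value reproduces the slide's example, a descending sort on the title element of the onix schema.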

43 Functional Matrix
Key: ●==Full Support ○==Limited Support
Column: SRU
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

44 Cool Feature
Combining SRU response data and echoed data with JavaScript and stylesheets allows for thin, browser-based clients

45 Functional Matrix
Key: ●==Full Support ○==Limited Support
Columns: OS 1.1, MXG L0, MXG L1, MXG L2, SRU
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema

