A REST-ful Web Services Approach to Library Federated Search using SRU Kevin Reiss Rutgers-Newark Law Library CALI 2005 – June 11th.

Presentation transcript:

A REST-ful Web Services Approach to Library Federated Search using SRU Kevin Reiss Rutgers-Newark Law Library CALI 2005 – June 11th

Wouldn't It Be Nice? If you could search all your library's electronic and print resources from one location? If interesting distributed resources could be easily integrated with local materials for your users? If you could get at the Deep Web in a structured and simple manner? Institutional repositories, database content, open access journals.

Project Genesis: Desire to increase the visibility of all library resources. Provide a common search interface to a growing group of resources: library catalog, in-house digital library collections, the Web and Academic Internet (future). Need to use an open-source solution: cost, compatibility with emerging digital library standards. Make our digitized resources programmer-friendly.

Solution: Federated Search. The SRU protocol (the Search/Retrieve URL Service), implemented using open-source Perl and Python modules. Search different collections and resources at the same time. Provides a complementary search interface to monolithic, single-application search interfaces: not a replacement, but a complement. Implementation of standard search and retrieval protocols can benefit authors, publishers, content providers, and users.

Why Should You Care? Directors: open standards and protocols = more return on software investment; bring the Academic Internet to your users. Web Developers/Masters: open protocols and standards bring down implementation barriers such as time and cost; develop customized interfaces using XML/XSLT for search and retrieval. Library Technologists: improve deep web accessibility; learn about a descendant of a familiar tool (Z39.50).

What is a Web Service? Facilitates communication with a networked application: the client requests something; the server carries out the request and reports success or failure. Requests and responses are (sometimes) encoded in XML. Programmers embed calls to a web service as part of a useful local application: query an online database, receive news updates, receive stock quotations.

What is REST? REpresentational State Transfer (Roy Fielding). A design philosophy, not a protocol. The fundamental concept behind the web: each URL/URI is a unique state transferred from server to client. Characteristics of REST-ful web services: always over HTTP; request: form a URL + query string; response: comes back in XML.

cat Cat in the Hat <!-- more results follow --> </results>

REST v. SOAP / XML-RPC: Eric Lease Morgan classifies web services. SOAP-ful web services: more complicated, but potentially more robust than REST; can use any sort of transport mechanism (SSH, telnet); encoded using the SOAP XML wrapper, a W3C standard for web services; example: the Google API (incorporate Google's search results into your own program). REST-ful web services: served up as arbitrary, application-defined XML only; transported via HTTP requests only.

What are SRW and SRU? The Search/Retrieve Web Service and the Search/Retrieve URL Service. A standard way to search any Internet information resource: set up an SRW or SRU server (relatively painless), accept queries, and return search results. For librarians: SRW/U comes out of the ZING (Z39.50 International: the Next Generation) group. Hasn't hit critical mass, unlike OAI or RSS. SRW is the SOAP-ful flavor; SRU is the REST-ful flavor.

SOAP-ful Example - SRW Law and Technology Journal <!-- record content -->

REST-ful Example - SRU <SRW:searchRetrieveResponse xmlns:SRW=" xmlns:DIAG=" Law and Technology Journal <!-- record content -->
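The markup on the two example slides above did not survive extraction. As a rough illustration of the contrast only, the sketch below builds the same search once as a SOAP envelope (SRW) and once as a plain URL (SRU). The endpoint `http://example.org/sru` and the helper names are hypothetical, and the element layout is a simplified reading of SRW/SRU 1.1, not the presenter's original example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SRW_NS = "http://www.loc.gov/zing/srw/"                # SRW/SRU 1.1 namespace


def srw_request_body(query, maximum_records=10):
    """Build a SOAP envelope carrying an SRW searchRetrieveRequest (hypothetical helper)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SRW_NS}}}searchRetrieveRequest")
    ET.SubElement(request, f"{{{SRW_NS}}}version").text = "1.1"
    ET.SubElement(request, f"{{{SRW_NS}}}query").text = query
    ET.SubElement(request, f"{{{SRW_NS}}}maximumRecords").text = str(maximum_records)
    return ET.tostring(envelope, encoding="unicode")


def sru_request_url(base_url, query, maximum_records=10):
    """Express the same search as a single URL with a query string (hypothetical helper)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# The base URL is a placeholder, not a real SRU server.
soap_body = srw_request_body("law and technology")
rest_url = sru_request_url("http://example.org/sru", "law and technology")
```

The SOAP version needs an XML document to be built and POSTed, while the REST version can be pasted into a browser address bar, which is the practical difference between the two flavors.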

Basic SRU Server: Understands queries written in CQL (Common Query Language). Queries are sent to an SRU server as a URL parameter. You receive a structured XML response with search results. Take this result and format it for your users using XSLT, or do some other processing with it.
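As one possible form of the "more processing" step, a client can pull the hit count and titles out of the XML response. The sketch below assumes an SRU 1.1 response carrying Dublin Core records; the sample document is invented for illustration and simplified (real servers usually wrap the Dublin Core fields in a container element).

```python
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented, simplified sample response; not output from any real server.
SAMPLE_RESPONSE = """
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/"
                            xmlns:dc="http://purl.org/dc/elements/1.1/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>info:srw/schema/1/dc-v1.1</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <dc:title>Law and Technology Journal</dc:title>
      </srw:recordData>
      <srw:recordPosition>1</srw:recordPosition>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
""".strip()


def parse_sru_response(xml_text):
    """Return (total hit count, list of titles) from an SRU searchRetrieveResponse."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("srw:numberOfRecords", default="0", namespaces=NS))
    titles = [t.text for t in root.iterfind(".//srw:recordData//dc:title", NS)]
    return total, titles


total, titles = parse_sru_response(SAMPLE_RESPONSE)
```

From here the titles could be fed into a template, merged with other result sets, or handed to an XSLT processor.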

CQL - Basics: Combines two search engine traditions: simple Google-like queries (don't worry, you don't have to write a query parser) and expressive, powerful, but non-intuitive query languages (Z39.50's query language, SQL and XML query languages). Examples: new jersey statutes; description = governor of new jersey and description = ethics; cat prox/distance=3/unit=word/ordered hat
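A client that assembles CQL from form fields might look roughly like the sketch below. The helper names are hypothetical. The slide shows multi-word terms unquoted; this sketch assumes the stricter convention of double-quoting any term that contains whitespace or CQL operator characters.

```python
def cql_term(term):
    """Double-quote a term that contains whitespace or CQL special characters."""
    if any(ch in term for ch in ' ()=<>/"'):
        return '"' + term.replace('"', '\\"') + '"'
    return term


def cql_clause(index, relation, term):
    """Build one 'index relation term' search clause."""
    return f"{index} {relation} {cql_term(term)}"


def cql_and(*clauses):
    """Join clauses with the boolean 'and' operator."""
    return " and ".join(clauses)


query = cql_and(
    cql_clause("description", "=", "governor of new jersey"),
    cql_clause("description", "=", "ethics"),
)
# query is now: description = "governor of new jersey" and description = ethics
```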

URL + Parameters: Test server: http://law-library2.rutgers.edu/SRU/sru.pl (inspired by an implementation by Mike Taylor). Basic SRU parameters: operation: tells the SRU server what it is supposed to do (there are only three: searchRetrieve, explain, scan); version: currently 1.1; startRecord: the first record that you want back; maximumRecords: the number of records you want back at any one time; recordSchema: the metadata format you want back. The full SRU request: library2.rutgers.edu/SRU/srucql.pl?query=new+jersey+statutes&startRecord=1&maximumRecords=10&collection=lawlib&version=1.1&operation=searchRetrieve&recordSchema=dc Application-defined parameters are allowed; just let users know about them in the documentation for your SRU server (e.g., the collection parameter above).
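The parameters listed above can be assembled programmatically. This is a minimal sketch: the function name is hypothetical and the base URL is a placeholder standing in for a real SRU server; `collection` is passed through as an application-defined extra, as on the slide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, query, start_record=1, maximum_records=10,
                     record_schema="dc", **extra):
    """Build a searchRetrieve URL; **extra carries application-defined parameters."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
        "recordSchema": record_schema,
    }
    params.update(extra)
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; substitute the address of an actual SRU server.
url = build_search_url("http://example.org/SRU/sru.pl", "new jersey statutes",
                       collection="lawlib")

# Round-trip the query string to confirm the encoding is reversible.
parsed = parse_qs(urlsplit(url).query)
```

`urlencode` handles the escaping (spaces become `+`), so CQL queries containing `=` or quotes can be passed in unmodified.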

XSLT for Formatting: Use the stylesheet parameter. This allows you to specify an XSLT stylesheet to format your search results. You can have different stylesheets for different users. Client-side v. server-side XSLT: browser support is unreliable; large XML documents can tax a server.

Current Collections: Electronic journals and databases (titles + descriptions only). Law Library website. Digital library collections [8 collections]. Collections indexed using swish-e. Library catalog: done using the British Library's Python wrapper class for Z39.50 servers; uses Python ZOOM and CQL.

Application Diagram [Figure. Components: harvested OAI data, Python Z39.50, Swish-e, digital library, library website, library catalog, SRU server, user. Flow labels: URL request with SRU parameters, XML response, XSL stylesheet, HTML.]

Other SRU Operations: Explain: tells a programmer about your SRU server. The explain response structure is defined by the ZeeRex XML DTD (Z39.50 Explain, Explained and Re-Engineered in XML): fields to search, metadata sets, default parameters for an SRU or SRW server, what portion of CQL is supported. Provides documentation for your SRU implementation. Scan: an index function. Could display a controlled vocabulary for a given collection. Not implemented on any SRU apps.
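Both operations are requested the same way as a search, by changing the `operation` parameter. The sketch below assumes the SRU 1.1 parameter names `scanClause` and `maximumTerms`; the function names and the endpoint are hypothetical, and, as the slide notes, a given server may not implement scan at all.

```python
from urllib.parse import urlencode


def explain_url(base_url):
    """URL asking the server to describe itself (returns a ZeeRex record)."""
    return base_url + "?" + urlencode({"operation": "explain", "version": "1.1"})


def scan_url(base_url, scan_clause, maximum_terms=20):
    """URL asking the server to list index terms near the given clause."""
    params = {
        "operation": "scan",
        "version": "1.1",
        "scanClause": scan_clause,
        "maximumTerms": maximum_terms,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, not a real server.
explain_request = explain_url("http://example.org/sru")
scan_request = scan_url("http://example.org/sru", "dc.subject = law")
```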

SRU Could Fight Search Engine Babble: Consider the query urban planning law as different search interfaces encode it: new.rutgers.edu/search/t?SEARCH=urban+planning+law ; g+law&btnG=Google+Search ; &sm=Yahoo%21+Search&fr=FP-tab-web-t&toggle=1 ; eld_2=author&value_1=urban+planning+law&value_2=&connector_3=and&field_3=ancestor.link&op_3=in&value_3=http%3A%2F%2Flaw.bepress.com%2Frepository&hidden_3=1&x_force_carryover=&format=cover_page&query=Processing...

To Combine These Search Results… Get the query syntax and all URL parameters correct. Then scrape the HTML output in order to get the information in the query response. Not a very reliable proposition. What if you could send a query to all of them using the same syntax and receive your responses back in the same format? You can, using SRU.
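With a shared query syntax and response format, the combining step reduces to a loop. The following is a sketch under stated assumptions: every target speaks SRU 1.1 and returns Dublin Core titles, the function names are hypothetical, and ranking and de-duplication are left out. The `fetch` argument is injectable so the logic can be exercised without a network.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def http_fetch(url):
    """Default fetcher: GET the URL and return the body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def federated_search(targets, query, fetch=http_fetch, maximum_records=5):
    """Send one CQL query to every SRU target and merge (target, title) pairs.

    targets maps a display name to an SRU base URL (all hypothetical here).
    """
    params = urlencode({
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "maximumRecords": maximum_records,
        "recordSchema": "dc",
    })
    merged = []
    for name, base_url in targets.items():
        try:
            root = ET.fromstring(fetch(f"{base_url}?{params}"))
        except (OSError, ET.ParseError):
            continue  # skip targets that are unreachable or return malformed XML
        for title in root.iterfind(".//dc:title", NS):
            merged.append((name, title.text))
    return merged
```

The same URL suffix goes to every target and the same XPath reads every response, which is exactly what HTML scraping cannot offer.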

Imagine if… Many useful remote resources supported SRU It would be easy to bring users search results from targeted SRU-compliant resources

Possible SRU Applications Incorporation of content from popular indexes and resources into federated searches Subject-specific search interfaces Institutional repository content harvested via OAI or something like it Blogs Self-published works Grey literature BePress, PLOS, or SSRN

OAI and SRU/W: OAI (Open Archives Initiative) is a harvesting protocol; full-text search isn't an option in OAI. SRU/W is a searching protocol; full-text search is an option in SRU if you want it. They complement each other: you could easily search harvested OAI data via SRU. Imagine if one could easily harvest court decisions via OAI… (Tom Bruce). Search an OAI data provider registry via SRU: see the University of Illinois OAI registry.

Potential OAI – SRU Synergy OAI Harvesters OAI Data Providers SRU Server Make Data Available Harvest and Maintain Updated Indexes of Data Search and Present Data to Users

Adding the NELLCO Repository: Harvest the records via OAI. Create a Swish config file for the harvested records. Index with Swish. Create an explain record and id value in our federated application for the index. Make the new index a search target. This is what Google Scholar is doing with IRs like DSpace (full-text possible).
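The first step above, harvesting via OAI, can be sketched as follows. This assumes a standard OAI-PMH 2.0 data provider serving `oai_dc` records; the endpoint, the sample response, and the function names are invented for illustration, and the Swish-e configuration and indexing steps are not shown.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], next resumption token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        title = record.findtext(".//dc:title", namespaces=OAI_NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return records, (token or None)


# Invented sample page of a ListRecords response.
SAMPLE_PAGE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Working Paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

records, next_token = parse_list_records(SAMPLE_PAGE)
next_url = list_records_url("http://example.org/oai", resumption_token=next_token)
```

A harvester would keep requesting `next_url` until no resumption token comes back, writing each record to disk for the indexer to pick up.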

Harvested Resource Results

What Can Libraries Gain? Improved discovery of institutional and remote resources. Subject-specific aggregation, especially for the Academic Internet. Different entry points to collections. User-sensitive display of resources using XSL stylesheets. "We must be able to accept and deliver multiple forms of metadata in order to build scalable digital libraries" - Roy Tennant

Federated Search Issues: Ranking of search results. Effective display of results. To build a fully featured search interface you need metadata and more metadata. Simple Dublin Core's focus is discovery: it doesn't represent technical information, and the identifier element isn't adequate for resources with multiple formats and manifestations.

For the Future: Your search engine is only as good as your data. XML that parses. Metadata: early and often. A robust and extensible format such as METS: METS provides facilities for encoding structural and technical information along with descriptive metadata, and crosswalks can be used to extract simpler metadata formats from METS for use by apps like SRU and OAI. Open protocols layered on top of each other: an SRU interface over OAI-harvested metadata; institutional OpenURL resolver information applied to identifiers returned via SRU and OAI to grant users access to subscriptions.

REST-ful Applications You've Heard Of: RSS/Atom: the technology behind blogs. OAI: the Open Archives Initiative. Some library OPACs are quasi REST-ful services: see the library lookup tool by Jon Udell. Sessions or cookies deployed for anonymous OPAC searches can kill the possibility of writing applications like library lookup. Consider this URL: new.rutgers.edu/search/t?SEARCH=urban+planning With SRU any OPAC could be queried in this fashion: dc.title=urban planning&recordSchema=dc

What Can You Do Next? Demand vendor support for simple REST-like open interfaces. Check out library2.rutgers.edu/SRU/examples/ for SRU/OAI examples and SRU/OAI practical research. Develop an SRU server for your own collection(s). Get involved: ZING meets in Chicago next week.