1 Interoperability: architectures and connections
John Gilby, M25 Systems Team, LSE
Ashley Sanders, Copac Team, MIMAS
“Hyper Clumps, Mini Clumps and National Catalogues: resource discovery for the 21st century”
11th November 2004, British Library, London

2 Contents
- Overview of technical architecture of union catalogues (Copac & InforM25)
- Introduce Z39.50 to Z39.50 middleware & issues to consider
- CC-interop and JAFER
  – Installation, configuration & testing
- Results set issues and searching times

3 A reminder, Z39.50 is:
- a standard for information retrieval
- a client/server relationship
  – Z-client: stand-alone on a PC or associated with a web server/user interface
  – Z-server: generally a module in library systems
- a method for communication between disparate computer systems (such as a library catalogue and a user’s PC)

4 Copac
- has 26 libraries (including large research, academic and BL, NLS)
- geographically covers whole of UK
- JISC funded, administered by MIMAS
- has “control” over indexes and searching process
- can be searched via Z39.50
- periodic data loads
- live circulation data via Z39.50
- = very successful and popular with users
- Copac V3: experimental Z39.50 searching of Copac and National Library of Wales

5 CURL/Copac database creation
[Diagram. Labels: Incoming MARC records from contributing institutions; MARC21 & UKMARC; Record pre-processing: standardisation & problem identification; Duplicate checks; pass/fail; Formation of consolidated and individual records & indexes; CURL database creation; Copac database; Z-server, OpenURL & web interface; web; Z39.50]

6 Distributed catalogue
- typically has up to 40 library catalogues (academic: CAIRNS, InforM25, RIDING; public: WiLL)
- regionally based
- funded by regional organisation
- relies on institutional catalogues for record standards, indexing and Z-server configurations
- some control over Z39.50 searching process
- data is as up to date as library OPAC
- ‘clump’ software combines result sets and presents them to user
- generally cannot accept queries outside of user interface
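
The result-set combination done by ‘clump’ software can be pictured roughly as below. This is a minimal sketch only, not the code of any actual clump product: the record fields (`isbn`, `title`), the placeholder ISBN value and the match rule (ISBN, falling back to lower-cased title) are all assumptions made for illustration.

```python
from collections import OrderedDict

def merge_result_sets(result_sets):
    """Combine per-library result sets into one list, grouping records
    that share a match key (assumed here: ISBN, else lower-cased title)."""
    merged = OrderedDict()
    for library, records in result_sets.items():
        for rec in records:
            key = rec.get("isbn") or rec["title"].strip().lower()
            # First record seen for a key supplies the display title.
            entry = merged.setdefault(key, {"title": rec["title"], "held_by": []})
            entry["held_by"].append(library)
    return list(merged.values())

# Made-up records; "X-0001" is a placeholder, not a real ISBN.
result_sets = {
    "Library A": [{"isbn": "X-0001", "title": "Pride and Prejudice"}],
    "Library B": [{"isbn": "X-0001", "title": "Pride and prejudice"},
                  {"isbn": None, "title": "Emma"}],
}
combined = merge_result_sets(result_sets)
```

Real records rarely match this cleanly, which is one reason duplication of material appears later in the talk as a problem.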

7 Union catalogues
[Diagram. Labels: User; Copac; single, large database; Distributed catalogue; Z-client software and user interface; network; Z-server/institutional library systems]

8 Z to Z Middleware
[Diagram. Labels: Remote user Z-client; Z39.50 to Z39.50 Middleware; Institution Z-server A; Institution Z-server B; Z39.50; ‘Local’ user web interface; e.g. M25 libraries; e.g. Copac V3]

17 Search tests
- Search set 1: Copac Z39.50 criteria
  – no query transformations
- Search set 2: M25 ‘best practice’ settings
  – query transforms applied
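
As an illustration of what a query transform might look like, the sketch below rewrites the Bib-1 attributes of an incoming query per target. The attribute type numbers (1 = Use, 4 = Structure, 5 = Truncation) and values (Use 4 = title, Structure 1 = phrase, 2 = word, Truncation 100 = do not truncate) come from the Bib-1 attribute set; the target names and the per-target overrides are hypothetical and are not taken from the JAFER or M25 configuration.

```python
# Bib-1 attribute types
USE, STRUCTURE, TRUNCATION = 1, 4, 5

# Hypothetical per-target overrides: the attribute values each
# catalogue is ASSUMED to need. Illustrative only.
TARGET_OVERRIDES = {
    "catalogue_a": {},                                # accepts the query unchanged
    "catalogue_b": {STRUCTURE: 2},                    # assumed to want word, not phrase
    "catalogue_c": {STRUCTURE: 2, TRUNCATION: 100},   # word, explicitly untruncated
}

def transform_query(attributes, target):
    """Return a copy of the attribute map adjusted for one target."""
    adjusted = dict(attributes)          # leave the incoming query untouched
    adjusted.update(TARGET_OVERRIDES.get(target, {}))
    return adjusted

# Incoming title search: Use=4 (title), Structure=1 (phrase)
incoming = {USE: 4, STRUCTURE: 1}
per_target = {t: transform_query(incoming, t) for t in TARGET_OVERRIDES}
```

With no transformations (search set 1) every target would receive `incoming` as-is, whether or not it supports that attribute combination.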

18 Search test results (1)
- Access failed
  – variable: always, sometimes, occasional, never
  – Talis & Aleph access problems
  – firewall problems
- Access succeeded
  – some searches received no response

19 Search test results (2)
- Response with Copac search settings
  – 203 searches carried out
  – 95 failed to return a result (a result being 0 or more records)
- Response with InforM25 settings
  – 199 searches carried out
  – 3 failed to return a result (a result being 0 or more records)
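
To put the two sets of figures on the same footing, the counts above can be turned into failure rates. The function name is just for illustration; the numbers are those on the slide.

```python
def failure_rate(failed, total):
    """Percentage of searches that failed to return a result."""
    return 100.0 * failed / total

copac_rate = failure_rate(95, 203)     # Copac search settings
inform25_rate = failure_rate(3, 199)   # InforM25 settings

print(f"Copac settings:    {copac_rate:.1f}% failed")     # 46.8% failed
print(f"InforM25 settings: {inform25_rate:.1f}% failed")  # 1.5% failed
```

That is roughly 47% of searches failing with the Copac settings against about 1.5% with the InforM25 settings.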

20 Middleware benefits
- Simplifies access to range of catalogues
- Query transformation improves search success rate
- Virtual catalogue staff can:
  – provide centralised development and maintenance
  – identify and investigate problems
  – act as a central contact point
- Can interconnect the (JISC) Information Environment
- Potentially useful for a National Catalogue

21 Search problems/solutions
- Users lose control of query
- Search consistency
  – failure of catalogues to respond
  – lowest common denominator or all options?
  – catalogues searching different fields
  – catalogues searching fields in different ways
- Standardisation
  – profiles, e.g. Bath Profile
  – work on index standardisation

22 Response times
- Improved access to resources
  – benefits end-user and library staff
- BUT
  – impacts on local catalogue
  – over-large result sets
  – duplication of material
- Response times
  – impact on local catalogue searcher
  – impact on virtual catalogue searcher

23 Response time test
- Hourly search for ‘Austen’
  – record time taken to obtain search result
  – does not include record collection or result processing
- Number of searches responding
  – c.90% within 2 seconds
  – c.4% within 4-27 seconds
- Overall response time governed by slowest catalogue
  – Timeouts for slow-to-respond or non-responding catalogues
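
Why the slowest catalogue governs the overall time, and what a timeout buys, can be shown with a small simulation. This is a sketch under stated assumptions: `search_catalogue` is a stand-in that merely sleeps (no real Z39.50 call is made), and the catalogue names, delays and timeout value are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def search_catalogue(name, delay):
    """Stand-in for a Z39.50 search; 'delay' simulates response time."""
    time.sleep(delay)
    return name

def broadcast_search(catalogues, timeout):
    """Search all catalogues in parallel and stop waiting after
    'timeout' seconds, reporting which ones had not answered."""
    pool = ThreadPoolExecutor(max_workers=len(catalogues))
    futures = {pool.submit(search_catalogue, n, d): n
               for n, d in catalogues.items()}
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)   # do not block on the slow catalogues
    responded = sorted(f.result() for f in done)
    timed_out = sorted(futures[f] for f in not_done)
    return responded, timed_out

# Invented delays in seconds; "slow" exceeds the timeout.
catalogues = {"fast": 0.05, "medium": 0.2, "slow": 1.5}
responded, timed_out = broadcast_search(catalogues, timeout=0.8)
```

Without the timeout the combined search would take as long as the slowest catalogue (1.5 seconds here); with it, the user gets the two prompt result sets after at most 0.8 seconds and the slow catalogue is reported as not responding.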

24 Restricted searches
- Should all searches be sent to all catalogues?
  – control where searches are sent initially
  – pre-defined search groups, by location/subject?
- Better to deal with large result sets through ranking and/or sorting?
  – which brings us back to response times…
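
Pre-defined search groups could be as simple as a lookup from group name to a list of targets. The sketch below is hypothetical throughout: the group names, catalogue names and the `select_targets` helper are made up for illustration and do not describe how InforM25 or any clump service is configured.

```python
# Hypothetical groups by location and by subject.
SEARCH_GROUPS = {
    "central-london": ["catalogue_a", "catalogue_b"],
    "medicine": ["catalogue_b", "catalogue_c"],
}
ALL_CATALOGUES = ["catalogue_a", "catalogue_b", "catalogue_c", "catalogue_d"]

def select_targets(groups=None):
    """Return the catalogues to search: the union of the chosen groups
    (in first-seen order), or every catalogue if no group is chosen."""
    if not groups:
        return list(ALL_CATALOGUES)
    selected = []
    for group in groups:
        for catalogue in SEARCH_GROUPS[group]:
            if catalogue not in selected:
                selected.append(catalogue)
    return selected

restricted = select_targets(["central-london", "medicine"])
everything = select_targets()
```

Restricting the initial search this way reduces the load on local catalogues, at the cost of possibly missing material held outside the chosen group.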

25 Summary & what next?
- JAFER tests: middleware works
- Enables distributed catalogues to be ‘plugged into’ the IE
- Dynamic resource selection is technically feasible
- Clump services interested
- Further investigations:
  – Response-time tests
  – Results processing

26 Further details
- Reports on the project website:
- Copac Team:
- M25 Systems Team:

