RoMEO and CRIS Technical Issues & Efficiency Tips

RoMEO and CRIS: Technical Issues & Efficiency Tips
Peter Millington, Centre for Research Communications, University of Nottingham
RoMEO and CRIS in Practice, Birmingham, 1st April 2011

Outline
- Patterns of usage
- Do we have a crisis?
- Approaches to using RoMEO in CRIS
  - Real time queries
  - Caching and reusing RoMEO query results
- Rates of change – Reality Check
  - And their implications
- Other efficiency tips

Usage of Interactive RoMEO

Usage of Interactive RoMEO
- Similar curve shapes for other measures
- Distinct weekly pattern
  - ~4,500 page views per day
  - ~1,000 visits per day
  - ~700 unique visitors per day
- Seems to be a stable seasonal pattern

Usage of the RoMEO API – All Users

Usage of the RoMEO API – Requests

Usage of the RoMEO API
- Much more variable pattern
  - Weekly cycle of visits less distinct
  - Number of requests very highly variable
- More usage by fewer users
  - ~60 unique visitors per day
  - Over 250,000 hits per day (>50 times interactive)
- Significant growth
  - Steady growth in number of API users
  - Rapid growth in number of requests

Do we have a Crisis?
- Do you ever think RoMEO is slow?
- Most API usage is by CRIS-like applications
- How can we improve things?
  - Higher capacity server? Funding? Unnecessary?
  - Improve efficiency? Optimise the API? More efficient usage?
  - Put a cap on number of requests per day? What level? 1000? 2000?
  - Block commercial software users (N.B. Creative Commons License)

API approaches in CRIS applications
- Real time requests when displaying data
  - Acceptable for individual article displays
  - Latency too high for lists of articles
- Caching RoMEO data for rapid local re-use
  - Initial (bulk) checks against RoMEO
  - Store the results locally
  - Periodically recheck for updated policies
    - Whole bibliography
    - Additions and updates only
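The caching approach on this slide could be sketched roughly as follows. This is a minimal illustration, not how any particular CRIS does it: the dictionary cache, the `get_policy` helper, the `fake_fetch` stand-in for a real API call and the ISSN are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Hypothetical local cache keyed by ISSN; a real CRIS would use its own database.
PolicyCache = Dict[str, dict]


def get_policy(issn: str,
               cache: PolicyCache,
               fetch: Callable[[str], Optional[dict]]) -> Optional[dict]:
    """Return the cached policy for an ISSN, calling RoMEO only on a cache miss."""
    if issn in cache:
        return cache[issn]      # rapid local re-use, no API request
    policy = fetch(issn)        # one real-time request for this journal
    if policy is not None:
        cache[issn] = policy    # store the result locally
    return policy


def fake_fetch(issn: str) -> Optional[dict]:
    """Stand-in for a real RoMEO API call (illustrative data only)."""
    fake_fetch.calls += 1
    return {"issn": issn, "colour": "green"}


fake_fetch.calls = 0

cache: PolicyCache = {}
# Displaying a list of five articles from the same journal costs one request.
for _ in range(5):
    get_policy("1234-5678", cache, fake_fetch)
print(fake_fetch.calls)
```

With a cache in front of the API, a list display only triggers requests for journals that have not been seen before.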

Real Time Usage Pattern

Real Time Usage Pattern
- Levels vary day by day
- Arguably high usage for one installation
- Occasional peaks
  - Special system jobs
  - Special end user projects

Caching with Monthly Updates

Caching with Monthly Updates
- Rechecking the whole database each cycle
  - Seems to take three days. Low priority setting?
- Scheduled job – starts 1st of the month
  - Could it be a weekend instead? Faster. Less intrusive.
- What is being checked?
  - Each reference?
  - Groups of records for each journal title?
- What about additions between cycles?
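To illustrate the difference between checking each reference and checking groups of records per journal, here is a small sketch. It assumes each reference carries an ISSN; the record IDs and ISSNs are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each reference is (record_id, issn); both values are illustrative.
Reference = Tuple[str, str]


def group_by_journal(references: Iterable[Reference]) -> Dict[str, List[str]]:
    """Group record IDs by ISSN so each journal needs only one RoMEO check."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record_id, issn in references:
        groups[issn].append(record_id)
    return dict(groups)


refs = [
    ("a1", "1111-1111"),
    ("a2", "2222-2222"),
    ("a3", "1111-1111"),
    ("a4", "1111-1111"),
]
groups = group_by_journal(refs)
print(len(refs), "references ->", len(groups), "journal checks")
```

The saving grows with the number of articles an institution publishes in the same journals.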

Caching with Daily Updates (1)

Caching with Daily Updates (1)
- Whole database checked every day
  - Institutions can easily have lists of 50,000 items!
  - Lists constantly growing, slowing things down
- What is being checked?
  - Each reference? Probably
- Additions and updates between checks?
  - No accuracy problems
- Sledgehammer to crack a nut

Is the nut cracking the sledgehammer?

Caching with Daily Updates (2)

Caching with Daily Updates (2)
- Note the logarithmic scale
- Large initial check of the whole database
- Daily check of added & changed items only
- Welcome low loading on the API
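One way a daily job might select only added and changed items is sketched below. The `Record` fields are assumptions about what a CRIS stores, not a description of any real system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Record:
    record_id: str
    modified: date                # last bibliographic change in the CRIS
    last_checked: Optional[date]  # None means never checked against RoMEO


def needs_check(record: Record) -> bool:
    """True for new records and for records edited since the last check."""
    if record.last_checked is None:
        return True
    # ">=" errs on the side of one extra check for same-day edits.
    return record.modified >= record.last_checked


def daily_batch(records: List[Record]) -> List[Record]:
    """Select only the additions and changes for today's RoMEO check."""
    return [r for r in records if needs_check(r)]


records = [
    Record("old-unchanged", date(2010, 5, 1), date(2011, 3, 1)),
    Record("newly-added", date(2011, 3, 30), None),
    Record("edited-since-check", date(2011, 3, 31), date(2011, 3, 1)),
]
batch_ids = [r.record_id for r in daily_batch(records)]
print(batch_ids)
```

After the large initial check, the daily batch stays small because unchanged records are skipped.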

Rates of Change – Reality Check
- Institutional Bibliographies
  - Up to 2,000 additions per year (<40 per week)
  - Few bibliographic changes after initial QA
- RoMEO Publishers’ Policies
  - c.25 additions or substantive changes per week
- Journal - Publisher Correlations
  - Change of publisher - infrequent - mostly January
  - Bulk changes - Business take-over or name change
- Expiry of archiving embargoes
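A rough back-of-envelope comparison of yearly request volumes, using the 50,000-item bibliography and 2,000 additions per year quoted in these slides. It assumes one API request per item checked and ignores list growth and policy rechecks, so the figures are indicative only.

```python
ITEMS = 50_000              # bibliography size quoted for a large institution
ADDITIONS_PER_YEAR = 2_000  # upper figure quoted for yearly additions

# Assumption: one API request per item checked.
full_daily = ITEMS * 365                     # recheck everything every day
full_monthly = ITEMS * 12                    # recheck everything every month
initial_plus_additions = ITEMS + ADDITIONS_PER_YEAR  # one bulk check, then additions only

print(full_daily, full_monthly, initial_plus_additions)
print(round(full_daily / initial_plus_additions))
```

Under these assumptions, a daily full recheck generates several hundred times more requests in the first year than an initial bulk check followed by additions only.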

RoMEO Implications of Change Rates
- Institutional Bibliographies
  - Only need to check additions & changes
  - Weekly check probably sufficient, or on first use
- RoMEO Publishers’ Policies
  - Recheck when the RoMEO record changes
  - Store RoMEO ID with article/journal for bulk updates
- Journal - Publisher Correlations
  - Full recheck annually on rolling cycle
  - Specific rechecks for known business/name changes
- Expiry of archiving embargoes
  - Scope for improvement in RoMEO
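Two of these implications can be sketched in code: spreading the annual full recheck over a rolling cycle, and using stored RoMEO IDs to find the journals affected when a publisher record changes. The 52-week slicing, the CRC-based slot assignment and all identifiers are assumptions for illustration, as is the idea that the set of changed RoMEO IDs is already known.

```python
import zlib
from typing import Dict, List, Set

WEEKS_PER_CYCLE = 52  # assumption: an annual cycle split into weekly slices


def rolling_slot(issn: str) -> int:
    """Stable week slot (0-51) for a journal's annual full recheck."""
    return zlib.crc32(issn.encode("ascii")) % WEEKS_PER_CYCLE


def journals_due(issns: List[str], week: int) -> List[str]:
    """Journals whose annual recheck falls in the given week of the cycle."""
    return [i for i in issns if rolling_slot(i) == week % WEEKS_PER_CYCLE]


def journals_for_changed_publishers(journal_publisher: Dict[str, str],
                                    changed_ids: Set[str]) -> List[str]:
    """Use the stored RoMEO ID to find journals needing a bulk update."""
    return sorted(i for i, pub in journal_publisher.items() if pub in changed_ids)


issns = ["1111-1111", "2222-2222", "3333-3333"]
stored = {"1111-1111": "p1", "2222-2222": "p2", "3333-3333": "p1"}
print(journals_for_changed_publishers(stored, {"p1"}))
```

Because the slot is derived from the ISSN, each journal is rechecked once per cycle without storing a separate schedule.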

Caching of RoMEO Publisher Data
- Download the whole database with “?all=yes”
  - Relatively fast
  - Download as often as you wish
  - Suggest weekly
- And/Or…
- Store key RoMEO data with bibliographic records
- Provide links to interactive RoMEO
  - Full publisher records using RoMEO ID, or
  - Journal level data using ISSN

Caching Journal-level Data
- Schema/Organisation
  - Per journal (efficient)
  - Per article (probably inefficient)
- Fields
  - Journal title
  - ISSN and ESSN
  - RoMEO Persistent Publisher ID
  - RoMEO Colour and/or Version-specific permissions
    - Normal – i.e. At the time of publication
    - Adjusted after the completion of any embargo period
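A per-journal cache record with the fields listed above might look like the sketch below. The field names, the `embargo_months` field, the permission strings and the month arithmetic are illustrative assumptions, not part of RoMEO.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class JournalPolicy:
    title: str
    issn: Optional[str]
    essn: Optional[str]
    romeo_publisher_id: str        # RoMEO persistent publisher ID
    colour: str
    permissions_normal: str        # at the time of publication
    permissions_post_embargo: str  # adjusted after any embargo period
    embargo_months: int = 0        # assumed field, needed to pick between the two


def months_between(start: date, end: date) -> int:
    """Whole months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


def permissions_on(policy: JournalPolicy, published: date, today: date) -> str:
    """Pick the normal or post-embargo permissions for one article."""
    if months_between(published, today) >= policy.embargo_months:
        return policy.permissions_post_embargo
    return policy.permissions_normal


example = JournalPolicy(
    title="Journal of Examples", issn="1234-5678", essn=None,
    romeo_publisher_id="0000", colour="yellow",
    permissions_normal="pre-print only",
    permissions_post_embargo="pre-print and post-print",
    embargo_months=12,
)
print(permissions_on(example, date(2010, 4, 15), date(2011, 4, 1)))
```

Storing one such record per journal, rather than per article, means the article only needs its publication date and a link to the journal.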

Most Efficient RoMEO Queries
- Journals
  - ISSN/ESSN or Exact Title
    - Unique or far fewer results, so faster
    - May avoid the overhead of needing to search Zetoc
- Publishers
  - RoMEO ID
    - Unique result. It gets no faster.
  - Exact publisher name
    - May sometimes find multiple results.
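The preference order above could be encoded in a small query builder, sketched here. `API_BASE` is the API address given on the closing slide and may not be the actual query endpoint; the parameter names `issn`, `jtitle`, `qtype` and `id` are assumptions that should be checked against the RoMEO API documentation.

```python
from typing import Optional
from urllib.parse import urlencode

# Address from the closing slide; treat as a placeholder for the real endpoint.
API_BASE = "http://www.sherpa.ac.uk/romeo/api"


def journal_query(issn: Optional[str] = None,
                  exact_title: Optional[str] = None) -> str:
    """Prefer ISSN/ESSN, then exact title (parameter names are assumed)."""
    if issn:
        params = {"issn": issn}
    elif exact_title:
        params = {"jtitle": exact_title, "qtype": "exact"}
    else:
        raise ValueError("need an ISSN/ESSN or an exact journal title")
    return API_BASE + "?" + urlencode(params)


def publisher_query(romeo_id: str) -> str:
    """Look up a publisher by RoMEO ID (parameter name is assumed)."""
    return API_BASE + "?" + urlencode({"id": romeo_id})


print(journal_query(issn="1234-5678"))
print(journal_query(exact_title="Journal of Examples"))
```

Falling back to the exact title only when no ISSN is held keeps most queries on the fastest path.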

What to do with failed requests?
- Don’t just keep rechecking!
- Not a journal article?
  - Outside RoMEO’s scope. Prevent rechecking
- Data error (e.g. typo, bad abbreviation)?
  - Correct the source data, then recheck
- No publisher or no policy in RoMEO?
  - Feedback to RoMEO – if important
  - Recheck infrequently – say annually or quarterly
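The handling rules above could be recorded against each failed lookup, for example as in this sketch. The `Failure` categories mirror the slide; the 365-day interval is one of the two options it suggests, and all names are hypothetical.

```python
from enum import Enum


class Failure(Enum):
    NOT_A_JOURNAL_ARTICLE = "outside RoMEO's scope"
    DATA_ERROR = "typo or bad abbreviation in source data"
    NOT_IN_ROMEO = "no publisher or no policy in RoMEO"


# Assumed interval: annual recheck for records RoMEO does not yet cover.
NOT_IN_ROMEO_RECHECK_DAYS = 365


def should_recheck(failure: Failure,
                   days_since_check: int,
                   source_corrected: bool = False) -> bool:
    """Decide whether a previously failed lookup is worth repeating."""
    if failure is Failure.NOT_A_JOURNAL_ARTICLE:
        return False                  # prevent rechecking entirely
    if failure is Failure.DATA_ERROR:
        return source_corrected       # only after the source data is fixed
    return days_since_check >= NOT_IN_ROMEO_RECHECK_DAYS


print(should_recheck(Failure.NOT_IN_ROMEO, days_since_check=30))
```

Recording the reason for a failure is what allows the daily or weekly job to skip it next time.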

Any Questions?
RoMEO: http://www.sherpa.ac.uk/romeo
API: http://www.sherpa.ac.uk/romeo/api
Blog: http://romeoblog.jiscinvolve.org
E-mail: romeo@sherpa.ac.uk
Twitter: @SHERPAServices
Peter Millington: peter.millington@nottingham.ac.uk, 0115 84 68481