Building a hosted repository service on DSpace Matthew Cockerill Director of Operations BioMed Central Ltd. Open Repository.

What is Open Repository?
- A hosted repository service
- Based on DSpace
- Operated by BioMed Central

Outline
- Background on BioMed Central
- Why is there a need for a hosted repository service?
- Why build it on DSpace?
- Why choose Open Repository?
- Technical implementation challenges
- Other challenges

Background on BioMed Central
- Scientific publisher, founded in 1999
- All research articles Open Access
- 130+ peer-reviewed journals
- 10,000+ articles published
- Continuing to grow rapidly

Open Access research
- All research is distributed under the Creative Commons Attribution License
- The license allows:
  - Redistribution
  - Reuse
  - Creation of derivative works
  - Commercial or non-commercial use

Institutional repositories and Open Access publishing
- Sometimes seen as alternative roads to Open Access
- In fact, the two roads are complementary
- Repositories can contain both:
  - Manuscript copies of articles from 'traditional journals'
  - Final, structured versions of articles from open access journals
- We expect growth in repositories to go hand in hand with growth in Open Access publishing

Why is there a need for a hosted repository service?
- Not all institutions want to operate, maintain and customize their own repository
- Small institutions
  - A hosted solution can offer better value, due to economies of scale
  - Alternative 'shoestring' solutions are possible but do not give reliability or flexibility
- Large institutions
  - A hosted solution may give greater flexibility

BioMed Central's track record as a service provider
- Has developed and operated a 24/7 web-based journal workflow system for thousands of authors, reviewers, and journal editors since 2000
- 25,000+ manuscripts have been submitted to BioMed Central journals to date

Why was DSpace chosen as the foundation for Open Repository?
- Java-based
- Large, active and diverse community of developers
- Designed with the big issues in mind
  - Modularity/extensibility
  - Scalability
  - Interoperability
  - Long-term digital preservation
- BSD-licensed

BSD License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the [...] nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Why choose Open Repository?
- Does not require extensive in-house IT skills/resources
- Flexible customization
- High availability, for a fraction of the price of a dedicated HA solution
- Additional features compared to standard DSpace software

Why not choose OR?
- Not for every institution
- Some institutions choose to make a major investment in developing and extending the repository platform
- In return for greater investment of staff and resources, an institution can:
  - arbitrarily customize DSpace to its precise needs
  - steer the overall direction of the DSpace platform

Impact of RCUK position statement
- The draft position statement on Open Access from RCUK proposes to mandate deposit of articles in an Open Access repository, where one is available
- Only a small minority of UK institutions currently have repositories
- RCUK policy is likely to encourage many smaller institutions to consider setting up repositories

High Availability
- Commercial Tier-1 network datacentre
- 24x7 monitoring, troubleshooting and fault resolution
- Fully redundant infrastructure: power / internet / firewall / LAN etc.
- High-end fibre-channel/RAID storage
- DSpace Tomcat servers configured as an active/passive cluster
- Oracle database: 2-node RAC cluster + offsite standby database

Examples of functionality added to core DSpace platform
- Automatic population of repository with Open Access content
- Improvements to ease of use of submission system
- Automated conversion of proprietary file formats to PDF suitable for archiving
- XML markup of submitted articles
- Enhanced usage reporting tools

Enhanced access statistics

Additional access stats reporting

Easy entry of metadata for items that are in PubMed

Keeping track of DOI/PubMed for items

XML full text rendering

Tomcat application
- Running multiple instances of DSpace within Tomcat is fairly straightforward and works well
- Ultimately, the DSpace code may need to be changed so that a single DSpace application instance can have many 'faces' (different repositories), i.e. break the 1:1 relationship between application instance and repository
- The single-instance, many-faces approach is what we use to operate our 70 independent journal websites
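As a rough illustration of what "one application instance, many faces" could look like, the sketch below picks the repository for each request from the request's host name. The slides do not say how the faces are selected, so the host-based lookup, the `Repository` and `RepositoryResolver` names, and the example host names are all assumptions made for illustration, not a description of Open Repository or DSpace code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """Hypothetical record describing one hosted repository ('face')."""
    repos_id: int
    name: str


class RepositoryResolver:
    """Maps an incoming host name to one of many hosted repositories.

    Assumption: each repository is reached through its own host name.
    """

    def __init__(self) -> None:
        self._by_host: dict[str, Repository] = {}

    def register(self, host: str, repo: Repository) -> None:
        self._by_host[host.lower()] = repo

    def resolve(self, host_header: str) -> Repository:
        # Strip an optional port ("example.org:8080") and normalise case.
        host = host_header.split(":", 1)[0].lower()
        try:
            return self._by_host[host]
        except KeyError:
            raise LookupError(f"no repository configured for host {host!r}") from None


# Example host names are placeholders, not real deployments.
resolver = RepositoryResolver()
resolver.register("repo-a.example.org", Repository(1, "Repository A"))
resolver.register("repo-b.example.org", Repository(2, "Repository B"))

repo = resolver.resolve("REPO-A.example.org:8080")
print(repo.repos_id, repo.name)
```

Once a request has been matched to a `repos_id` in this way, a single application instance can serve every repository, with the identifier deciding which data and branding are used.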

Database issues
- Each repository needs its own database schema (for metadata etc.)
- Don't want to have to independently manage dozens or hundreds of database schemas
- Need to maintain good performance
- Would also like all DSpace instances to effectively share a pool of connections, which is difficult if each connection is tied to a different user/schema

Database solution: Part 1
1. Partition all tables by a new repos_id column
2. Create a series of schemas, one for each Open Repository, identified by repos_id
3. Generate a set of views in each schema, which filter the underlying tables by the relevant repos_id
4. End result:
  - To the DSpace code, each schema is indistinguishable from a dedicated schema
  - A single set of tables provides easy manageability
  - Partitioning ensures high performance
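The view-per-repository idea in the steps above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not the Oracle setup the slides describe: SQLite has neither table partitioning nor separate schemas, so the composite primary key stands in for partitioning and a name prefix such as `or1_` stands in for a per-repository schema. The table name `all_items`, its columns, and the helper `create_repository_view` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One shared table for every repository, tagged by repos_id.
# (Oracle would partition this table by repos_id; SQLite cannot, so the
# composite primary key is only a stand-in for that.)
conn.execute(
    """
    CREATE TABLE all_items (
        repos_id INTEGER NOT NULL,
        item_id  INTEGER NOT NULL,
        title    TEXT    NOT NULL,
        PRIMARY KEY (repos_id, item_id)
    )
    """
)
conn.executemany(
    "INSERT INTO all_items VALUES (?, ?, ?)",
    [(1, 1, "Article A"), (1, 2, "Article B"), (2, 1, "Article C")],
)


def create_repository_view(conn: sqlite3.Connection, repos_id: int) -> str:
    """Create a view exposing only one repository's rows; return its name."""
    # CREATE VIEW cannot take bound parameters, so the id is forced to int
    # before being interpolated into the statement.
    rid = int(repos_id)
    view_name = f"or{rid}_item"
    conn.execute(
        f"CREATE VIEW {view_name} AS "
        f"SELECT item_id, title FROM all_items WHERE repos_id = {rid}"
    )
    return view_name


view1 = create_repository_view(conn, 1)
view2 = create_repository_view(conn, 2)

# Code querying a view sees what looks like a dedicated table.
titles_or1 = [r[0] for r in conn.execute(f"SELECT title FROM {view1} ORDER BY item_id")]
titles_or2 = [r[0] for r in conn.execute(f"SELECT title FROM {view2} ORDER BY item_id")]
print(titles_or1, titles_or2)
```

Because the views hide the `repos_id` column, application code written against a single-repository layout can query them unchanged, while administrators still manage only one set of underlying tables.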

Database solution: Part 2
1. To allow efficient sharing of database connections, all connections use the same username
2. ALTER SESSION SET CURRENT_SCHEMA is used to point a connection at the correct schema
3. Oracle's connection attribute functionality is used to ensure that connections already pointing at the correct schema are reused when possible
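The reuse logic described above might look roughly like the following. It is a toy model only: `FakeConnection` records statements instead of talking to a database, `SharedSchemaPool` is a hypothetical class, and remembering the schema on the connection object merely imitates the role the slides give to Oracle's connection attributes. The real Oracle pooling API is not reproduced here.

```python
class FakeConnection:
    """Stand-in for a database connection; records the statements it is sent."""

    def __init__(self) -> None:
        self.current_schema = None
        self.statements = []

    def execute(self, sql: str) -> None:
        self.statements.append(sql)


class SharedSchemaPool:
    """One pool for all repositories; every connection uses the same login."""

    def __init__(self, size: int) -> None:
        self._idle = [FakeConnection() for _ in range(size)]

    def acquire(self, schema: str) -> FakeConnection:
        if not self._idle:
            raise RuntimeError("pool exhausted")
        # Prefer an idle connection that already points at the wanted schema,
        # so no ALTER SESSION round trip is needed.
        for i, conn in enumerate(self._idle):
            if conn.current_schema == schema:
                return self._idle.pop(i)
        # Otherwise take the longest-idle connection and repoint it.
        # A real implementation must validate the schema name before
        # interpolating it into SQL.
        conn = self._idle.pop(0)
        conn.execute(f"ALTER SESSION SET CURRENT_SCHEMA = {schema}")
        conn.current_schema = schema
        return conn

    def release(self, conn: FakeConnection) -> None:
        self._idle.append(conn)


pool = SharedSchemaPool(size=2)
c1 = pool.acquire("OR1")   # no match yet, so the session is repointed
pool.release(c1)
c2 = pool.acquire("OR1")   # same connection comes back, no new statement
pool.release(c2)
c3 = pool.acquire("OR2")   # a different connection is repointed to OR2
```

The point of tagging connections is that a busy repository tends to get back connections it used recently, so the cost of switching schemas is paid only when the pool has to repurpose a connection.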

Each DSpace instance has own connection pool
[Diagram: web server, Tomcat applications OR1 to OR5, database connections (active and inactive), database. Labelled "INEFFICIENT".]

DSpace instances share a connection pool
[Diagram: web server, Tomcat applications OR1 to OR5, a shared connection pool of database connections (active and inactive), database. Labelled "EFFICIENT".]

Contributing code back to DSpace
- BioMed Central intends to contribute many of its tweaks to the core DSpace code back to the DSpace project
- Where possible, all proprietary functionality is being added as distinct modules
- DSpace's architectural evolution will hopefully make this easier to achieve
- BioMed Central's goal is for Open Repository to remain in sync, as far as possible, with the core DSpace code

Biggest challenge
- Persuading authors to contribute content to the repository
- Not trivial
- Need to:
  - Make it as easy as possible
  - Use carrots and sticks

Ease of use of BioMed Central’s manuscript submission system
- 96.8% rate ease of use as "good" or "very good"

End-to-end service
- The Open Repository service is not just about providing the technology
- Training and ongoing technical support for the institution's repository administrators
- Guidelines on best practice for successfully launching a repository

First live customer - INSERM

INSERM’s Open Repository

Acknowledgements
- Open Repository team
  - Mark Merifield
  - Liam Lynch
  - Tom Mowlam
  - Marie Martens