Software Sustainability Institute www.software.ac.uk Referencing software to make it sustainable: what, why and how? SciencePAD Persistent Identifiers.

Slides:



Advertisements
Similar presentations
Delivering User Needs: A middleware perspective Steven Newhouse Director.
Advertisements

Building Repositories of eprints in UK Research Universities Bill Hubbard SHERPA Project Manager University of Nottingham.
Subject Based Information Gateways in The UK Coordinated Activities in The UK Within the UK Higher Education community, the JISC (Joint Information Systems.
Web: The Future of OMII-UK e-Science: the Changing Landscape 17 April 2009 Neil Chue Hong.
OMII-UK Steven Newhouse, Director. © 2 OMII-UK aims to provide software and support to enable a sustained future for the UK e-Science community and its.
Supporting education and research Repositories in Context Digital repositories as components of an integrated infrastructure for education Leona Carpenter.
Software Sustainability Institute “Doing Science Properly in the Digital Age” UK e-Infrastructure Academic User Community Forum 12 September.
Managing your research data: University support for researchers Sally Rumsey The Bodleian Libraries University of Oxford Mary Harssch
Software Sustainability Institute SSI in one slide.
An Overview. BizLink BizLink is a Social Networking platform for business. It allows colleagues to come together, ask questions, share resources, form.
Co-funded by the European Union under FP7-ICT Alliance Permanent Access to the Records of Science in Europe Network Co-ordinated by aparsen.eu #APARSEN.
INDEPTH Network INDEPTH Data Systems Kobus Herbst.
Research Data Service at the IT Pro Forum HEIDI IMKER, DIRECTOR.
Software Sustainability Institute Software Information and Scientific Publications doi: /m9.figshare Beyond EMI: A Roadmap.
Software Sustainability Institute The Software Sustainability Institute 20 January 2015, HEP Software Foundation workshop Neil Chue.
Institutional Perspective on Credit Systems for Research Data MacKenzie Smith Research Director, MIT Libraries.
The student experience of e-learning Dr Greg Benfield Oxford Centre for Staff & Learning Development.
Software Sustainability Institute Training in Computational Skills Scientific Meeting 2014 “NGS Data after the Gold Rush” TGAC, Norwich.
Esri UC2013. Technical Workshop. Technical Workshop 2013 Esri International User Conference July 8–12, 2013 | San Diego, California ArcGIS for Local Government.
Data Management Development and Implementation: an example from the UK SLA Conference, Boston, June 2015 Geraldine Clement-Stoneham Knowledge and Information.
Good practice in Research Data Management Module 6: Tools, training and support.
Sharing the load – librarians and research data support services Stephen Grace, Research Services Librarian M25 Conference, Wellcome Collection, 23 April.
Software Sustainability Institute Software Sustainability: Issues, Challenges and Initiatives Neil Chue Hong,
Adopting Hydra Making the case and getting going Chris Awre Hydra Europe Symposium London School of Economics, 23 rd April 2015.
Software Sustainability Institute Linking software: Citations, roles, references,and more
Geoff Payne ARROW Project Manager 1 April Genesis Monash University information management perspective Desire to integrate initiatives such as electronic.
The DSpace Course Module – An introduction to DSpace.
Data Citation: the next big thing… ?!?! 1 Victoria University 20 Nov
JISC CETIS Conference, Oxford, November 2004 Repositories: State of ELF “volunteer”: Martin Morrey Intrallect Ltd.
Co-funded by the European Union under FP7-ICT Co-ordinated by aparsen.eu #APARSEN Why persistent identifiers are crucial in digital preservation.
Managing Research Data – The Organisational Challenge at Oxford James A J Wilson Friday 6 th December,
13 September 2012 The Libraries’ Role in Research Data Management: A Case Study from the University of Minnesota Meghan Lafferty, Chemistry, Chemical Engineering,
Software Sustainability Institute Putting the user back into software sustainability 16 December 2013, Scientific Software Days, Austin.
The Department of Energy’s Public Access Solution Giving Voice to Energy and Science R&D Results Jeffrey Salmon Deputy Director for Resource Management.
We are the 92% Valuing the contribution of research software Neil Chue Hong, FORCE2015 Research Communications and e-Scholarship.
A disaggregated model for preservation of E-Prints Gareth Knight SHERPA DP Project Arts and Humanities Data Service.
Caring and Sharing Collaboration in Digital Curation outside North America Ross Harvey Simmons College, Boston Curation Matters: 17 June 2010.
Software Sustainability Institute Dealing with software: the research data issues 26 August.
Scholarly communications Discussion group Linked Data Workshop May 2010.
An Introduction. Aspiration To begin the process of adding significant value to those emerging repositories in which.
Software Sustainability Institute Software Attribution can we improve the reusability and sustainability of scientific software?
Because good research needs good data Funded by: Digital Curation for Researchers, 28th February 2013 The Shifting Research Data Management Policy Landscape.
Key themes covered Search engines Locating/ assessing suitable resources Information Skills – knowing where to look Free web-based RDN,NLN, Ferl JISC or.
Research Services Ten top things researchers need to know about research data management Slides provided by DaMaRO Project, University of Oxford.
U.S. Department of the Interior U.S. Geological Survey CDI Webinar Series 2013 Data Management at the National Climate Change and Wildlife Science Center.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
A centre of expertise in digital information managementwww.ukoln.ac.uk Exploiting The Potential Of Wikis: Introduction Brian Kelly UKOLN University of.
LIBRARY SERVICES Strategies for gaining and maintaining academic support for the institutional open access.
Data Citation & Digital Object Identifiers DOIs. 2 Digital Object Identifiers 101 Persistent identifier Identifies intellectual property in the digital.
CASE (Computer-Aided Software Engineering) Tools Software that is used to support software process activities. Provides software process support by:- –
Publishing & Citing Research Data Arun Prakash. Agenda  Introduction  Why is Data publishing important ?  Ongoing Work  Role of Semantics.
Software Sustainability Institute Tracking Software Contributions doi: /m9.figshare Joint ORCID – DRYAD Symposium on Research.
Software Sustainability Institute Working with research software 2 nd - 4 th November.
Software Sustainability Institute Building sustainable software for science … why good code is only the beginning 10 April 2013, EGI.
The Claromentis Digital Workplace An Introduction
Working with your archive organization: Broadening your user community Robert R. Downs, PhD Socioeconomic Data and Applications Center (SEDAC) Center for.
SciencePAD Open Software for Open Science Alberto Di Meglio – CERN.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Software Sustainability Institute Open science is impossible without software 5 th April 2016,
Brian Nosek University of Virginia -- Center for Open Science -- Improving Openness.
| 1 Anita de Waard, VP Research Data Collaborations Elsevier RDM Services May 20, 2016 Publishing The Full Research Cycle To Support.
Esri UC 2014 | Technical Workshop | Address Maps and Apps for State and Local Government Allison Muise Nikki Golding Scott Oppmann.
E 3 : The Enlighten Embedding Experience William J Nixon How embedded and integrated is your repository? #jiscrte Nottingham 10 February 2012.
Open Exeter Project Team
Chapter 18 Maintaining Information Systems
Publishing software and data
How an RSE can benefit your projects:
Unlocking Thesis Data update DOI: /PUB.4314
Unlocking Thesis Data update DOI: /PUB.4314
Dataverse for citing and sharing research data
Presentation transcript:

Software Sustainability Institute Referencing software to make it sustainable: what, why and how? SciencePAD Persistent Identifiers workshop 30 January 2013, CERN Neil Chue Hong Unless otherwise indicated slides licensed under

Why is this important? An animated explanation Featuring a frustrated panda

Software Sustainability Institute Software is no longer easy to define, let alone sustain

Software Sustainability Institute Novel reuse of public sector data What do we sustain: - Map? - Software that creates map? - Software that uses map?

Software Sustainability Institute What do we choose to identify: - Workflow? - Software that runs workflow? - Software referenced by workflow? - Software dependencies? What’s the minimum citable part? Boundary

Software Sustainability Institute Algorithm Function Program Library / Suite / Package … Granularity

Software Sustainability Institute Versioning Personal v1 Personal v1 Personal v2 Personal v2 Personal v3 Personal v3 Personal v2a Personal v2a Public v1 Public v1 Personal v3a Personal v3a Personal v2a Personal v2a Public v2 Public v2 Public v3 Public v3 Why do we version? - To indicate a change - To allow sharing - To confer special status

Software Sustainability Institute Authorship Authorship Which authors have had what impact on each version of the software? Which authors have had what impact on each version of the software? Who had the largest contribution to the scientific results in a paper? Who had the largest contribution to the scientific results in a paper? OGSA-DAI projects statistics from Ohloh

Authorship Which authors have had what impact on each version of the software? Which authors have had what impact on each version of the software? Who is responsible for each component when authors leave? Who is responsible for each component when authors leave?

Software Sustainability Institute Software sustainability is the ability to continue to use, support, maintain, and evolve software

Software Sustainability Institute From xkcd.com

Software Sustainability Institute 12 RCUK Gateway to Research

Software Sustainability Institute The Software Sustainability Institute A national facility for cultivating world- class research through software Better software enables better research Software reaches boundaries in its development cycle that prevent improvement, growth and adoption Providing the expertise and services needed to negotiate to the next stage Developing the policy and tools to support the community developing and using research software Supported by EPSRC Grant EP/H043160/1

Software Sustainability Institute SSI: Long Term Goals Provision of useful, effective services for research software community Development and sharing of research community intelligence and interactions Promotion of research software best practice Mantra:  Keep the software in its respective community  Work with the community, to increase ability  Don’t introduce dependency on SSI as the developer  Expand and exploit networks and opportunities

Software Sustainability Institute SSI Drivers and Themes Two key drivers which cause people to seek the SSI’s advice:  They want to be more productive in their research  They don’t want to be embarrassed by appearing worse than their peers Broadly, our work falls into a few key themes:  Developing the scientific computing / software development skill base  The role and reward of software in research  Recognition of software career paths  Reproducible research

Software Sustainability Institute The Foundations of Digital Research Re- search Careers Recognition / Reward Skills and Capability Software Re-usable Re-producible software-evaluation-guide resources/guides software-carpentry training craftsperson-and-scholar software.ac.uk/blog/ what-research- software-community-and-why-should-you-care publish-or-be-damned-alternative- impact-manifesto-research-software Prlić A, Procter JB (2012) Ten Simple Rules for the Open Development of Scientific Software PLoS Comput Biol 8(12): e doi: /journal.pcbi Wilson G, et al. (2013) Best Practices for Scientific Computing Submitted to PNAS.

Software Sustainability Institute SSI Organisation Community Engagement (Shoaib Sufi)  Fellowship Programme Fellowship Programme  Events and Roadshows Consultancy (Steve Crouch)  Open Call for Projects Open Call for Projects  Software Evaluation Software Evaluation Policy and Publicity (Simon Hettrick)  Guides and Case Studies Guides and Case Studies  Best Practice and Policy Training (Mike Jackson)  Software Carpentry Software Carpentry  Software Surgeries Collaboration between universities of Edinburgh, Manchester, Oxford and Southampton. 9.5 FTEs for 5 yrs supplemented by additional project funding.

Software Sustainability Institute Case Study: Ligand Binding Centre for Computational Chemistry, Bristol  New methods for rapid MC sampling of biomolecular systems modelled using QM/MM  Developed two codes ProtoMS (F77) + Sire (C++)  Water-Swap Reaction Coordinate method to calculate absolute protein-ligand binding free energies SSI’s work is helping to scale development  ProtoMS and Sire both single developer codes  ASPIRE/ACQUIRE framework has multiple devs Split architecture between ASPIRE (adaptive multiresolution hybrid MD simulation) and ACQUIRE (WorkPacket scheduling system with optimisation for time to result vs “green-ness”

Software Sustainability Institute Case Study: Fusion Plasma Culham Centre for Fusion Energy  GS-2 used to study low-frequency turbulence in magnetized plasma  No common visualisation across different groups  Deliver mutually agreeable framework that can be extended easily and can be maintained by the small fusion community SSI’s work means the software can be used between groups  Simplified & enhanced plasma visualisation tool Based on ParaView o/s tool For simulations using GS-2 o/s package  Aim to allow CCFE to contribute back to GS-2 community  “I am very confident the tool will be invaluable” Colin Roach, CCFE

Software Sustainability Institute Case Study: Brain Imaging Brain Research Imaging Centre, Edinburgh  Develop PrivacyGuard software, a DICOM image deidentification toolkit  Created software to support new multispectral colouring modulation and variance identification technique (“MCMxxxVI”) to identify white matter lesions that are indicative of declining cognitive ability  BRIC are not principally software developers, but do provide software to other researchers SSI’s work means the software has been reviewed and refactored  Looked at exploitation Usability review, Naming/trademark review  Made it easier for BRIC staff to maintain and develop Move to standard repositories, testing and documentation processes Examination of licencing for MCMxxxVI Extraction and refactoring to create standalone tools

Software Sustainability Institute Case Study: SARoNGS UK NGS  Previous SARoNGS project aimed to provide a simple mechanism to obtain the security tokens required to access the National Grid Service.  This service uses the UK Access Management Federation to identify people using Shibboleth.  First in ongoing series of “NGS recommends…” SSI’s work means the software has been reviewed and issues corrected  Identified, fixed deep bugs in SARoNGS WMS It is now possible to run SARoNGS jobs successfully on NGS systems Though still needs NGS to roll out more widely…  “Happy with the report, using it to beat people now” David Wallom, OERC & NGS

Software Sustainability Institute Case Study: Climate Policy Modelling CIAS team at Tyndall Centre for Climate Change Research, University of East Anglia  Develop linked climate and economic models for detailed analysis  Their software was not ready to be used by other groups One researcher/developer at UEA, several users SSI’s work means the software is robust enough that it can be installed and used by others  Enabled use of the software by the WWFN’s Climascope project and James Cook University Documented software to allow extensions by contributors Made it easier to maintain and backup Added job scheduling to improve modeling throughput New modelling framework enables new models i.e. new science

Software Sustainability Institute Case Study: textual studies TextVRE team at CeRCH, Kings College London  Developed an environment which is used to integrate various tools used in the e-Humanities textual studies lifecycle  Builds on the German TextGrid project, and many other existing tools SSI’s work means the software is can be run “out of the box” – an important requirement for the researchers  Developed a VM image containing the TextVRE installation Improve installation instructions Develop tests to check each installed component Improve modularisation to allow others to contribute and maintain  Feeding back work to TextGrid

Software Sustainability Institute Case Study: NeISS Evaluate impact of traffic control measures over next 5/10/15 years Access baseline demographic data about the city Execute simulation of traffic system and population Visualise simulation outputs Augment with new forms of data Run dynamic models to assess future patterns (congestion, health, social inequality)

Software Sustainability Institute Case Study: NeISS 25

Software Sustainability Institute Case Study: NeISS 26

Software Sustainability Institute Software sustainability requires thriving communities of users and developers: how do we do this?

Software Sustainability Institute 5 Stars of Scientific Software We need a 5 stars for software!  Existence – there is accurate metadata that defines the software  Availability – you can access and run the software  Openness – the software has an open permissible license  Linked – related data, dependencies and papers are referenced  Assured – the software provides ways of demonstrating “correctness” c.f. 5 Stars of Linked Data (Berners-Lee) 5 Stars of Online Journals (Shotton)

Software Sustainability Institute Discoverable Software? To grow a community around software, first it must be discoverable  For users, wanting to find a solution  For developers, wanting to reuse or extend  For funders, wanting to promote or feature For sustainability  Provide useful information  Make it easier to attract and add contributors  Enable dormant projects to re-activate?

Software Sustainability Institute Software Hub Prototyping What information is useful? Do both provider and user benefit? What can be imported from other sites? What metadata must be collected to produce this information? Is it possible or easy to collect? How do people search for software?

Software Sustainability Institute Levels of Showcasing Level 1: internal  Has had support from Jisc  Has produced a software output  Metadata is incomplete Level 2: awaiting approval  enough metadata to publish it externally  perhaps not all quality criteria met Level 3: published  meets quality criteria  enough information to allow comparison Level 4: featured  seen as particularly useful, exciting, best of breed etc.  associated screencasts, tutorials to show off Offer incentives to move up the levels

Software Sustainability Institute Collecting Software Metadata Can we make the software metadata collection process work?  What are the benefits to provider and user?  Distinction between Project information and Product Information  Difference between information that enables discovery and choice, and the metadata that allows this information to be displayed E.g. “vitality” of project different for developer vs user

Software Sustainability Institute Citable Software?

Software Sustainability Institute Software Metapapers Create a complete scholarly record including “standard” publication, method, dataset and models, and software  e.g. modelling and simulation, statistical analysis  Enable replay, reproduction and reuse Pragmatic approach is to create a metadata record for the software, and link it to a copy of the software in some storage infrastructure  This is a software metapaper  Peer-review the metadata, not the software Journal of Open Research Software:  See: and the work by B. Matthews et al: The Significant Properties of Software: A Study

Software Sustainability Institute SoftwareCite Does the DataCite approach work with software?  What is the cost of minting a DOI?  What level do you mint DOIs for software?  What is the cost of storing the metadata associated with a software asset?  What is the cost of a software asset associated with a DOI disappearing?

Software Sustainability Institute Alternative Impact Stories GitHub Repository Starring as a means of recommendation Forking analogous to citing Direct measure of software citation? Requires user IDs, repository IDs, APIs

Software Sustainability Institute Software sustainability requires identifiers (and open APIs) for discovery, growth, attribution + reward

Why is this important? An animated explanation Featuring a frustrated panda

Software Sustainability Institute ISSUE TRACKER ISSUE TRACKER AUTOMATED BUILD&TEST AUTOMATED BUILD&TEST CODING STANDARDS CODING STANDARDS CODE REPOSITORY CODE REPOSITORY REAL TIME CHAT DEVELOPER LIST DEVELOPER LIST GENERAL LIST GENERAL LIST ANNOUNCE LIST ANNOUNCE LIST CRM USAGE TRACKING USAGE TRACKING CONTACT CONTACT REFERENCE MANAGER REFERENCE MANAGER DOMAIN NAME DOMAIN NAME PRIVATE WIKI/SITE SOCIAL NETWORKING SOCIAL NETWORKING PUBLIC WEBSITE PUBLIC WEBSITE BLOG FAQS FORUMS BROADCAST SERVICES BROADCAST SERVICES GOVERNANCE CONTINUOUS INTEGRATION CONTINUOUS INTEGRATION STYLE CHECKER STYLE CHECKER DEPENDENCY REPOSITORY DEPENDENCY REPOSITORY SHARED CALENDAR SHARED CALENDARPROTOTYPERESEARCHPRODUCTSUPPORTEDCOLLABORATION COMMUNICATION TRAINING