Restoring reproducibility: Making scientist software discoverable

Presentation transcript:

Restoring reproducibility: Making scientist software discoverable. Alice Allen, Astrophysics Source Code Library. I am here to talk about the effort in astronomy to make research software available for examination: the Astrophysics Source Code Library (ASCL), an online registry of scientist-written software used in research. I will cover the challenges of creating and growing the resource; the specific steps the ASCL has taken to encourage use of the resource and to improve code discovery in astronomy; and the effect this work is having within the community and more broadly. I also want to talk about software in general, and why it’s important to preserve it.

Research software: The integrity of research depends on transparency and reproducibility. “… anything less than release of actual source code is an indefensible approach for any scientific results that depend on computation...” (Ince, Hatton, & Graham-Cumming, The case for open computer programs, Nature, v. 482, Feb. 23, 2012). Software is important in research: it is a method, and that method should be available so that others can examine it. Or, to paraphrase that piece in Nature in 2012, not making the software available is indefensible. That is why the ASCL was created in 1999: to make the source codes (the software) used in astronomy available for examination. We’ll first look at the ASCL’s history, then discuss what we learned along the way and how we put the lessons to use, and finally look at the effects of our work.

ASCL: A brief history. Here is a brief history of the ASCL. I’m going to leave out a lot, but I’m happy to talk about it later if you’d like. Basically, it has had three phases: 1999-2003, dormancy, and 2010-present.

Code entry, 1999: The ASCL was started by Robert Nemiroff in 1999 to make the source codes (the software) used in astronomy available for examination. This is what the first version looked like: one HTML page per code, with 10 fields, including subject headings, version number, language(s), and a link to the deposited code. The site served as a repository. Subject headings categorized the codes, and these headings were used on an index page that listed all codes.

Code entry, present: This is what it looks like now. As you can see, the look is more professional. We retained most of the metadata from the previous version but dropped more ephemeral metadata such as version. We also dropped the requirement to have the code deposited and added a unique identifier, the ASCL ID. We use WordPress for content management and news (our blog) and have an online submission form. The MySQL database allows us to integrate better with ADS, the primary indexer of astro journals; we create the bibcode ADS uses for our entries, and we added a link to the ADS entry for our record to the page.

Home page, 2003: Here you can see the index page, the home page of the original site.

Home page, present: This is the current home page. It is much cleaner and lists the 10 newest codes right on the page in reverse date order, making it easy for people to see what’s new.

Browse, present: Browsing options were expanded, too; this shows a compact browsing screen, which lists just the ASCL ID and the name of the software, and there is also an expanded mode that displays abstracts. The original ASCL had two presentations at major astronomy meetings (1999 and 2000) and 37 codes. The search for a new editor in 2003 was unsuccessful, and the ASCL entered a period of dormancy as other similar efforts were started. It continued to be available to anyone who could find it, and lots of old links pointed to it.

Looking around and back: Why didn’t this work initially? Who else has tried this? What other similar efforts exist? What can I learn from them? When I took over the ASCL in 2010, I wondered, “Why didn’t this catch on? Who else has tried this? Have there been other similar efforts, and if so, what can I learn from them, and from the ASCL’s early try, too?” I found others, including one that had started before the ASCL; some were funded, some not, and none of those still existing was very active. I asked some of those involved in these other efforts what had worked well and what had not worked for them. The efforts I found: ASDS (most robust; funded; 1993-2000), AstroForge (funded, 2003-2006), SkySoft (2003-present), Astro-Code Wiki (2006-2011), Astro-Sim (2007-2010), and AstroShare (2008-2014; last update 2012).

Lessons learned: People don’t want to deposit their codes and like to keep their codes nearby; there is little incentive to register software; people don’t want to go first; they don’t want to have another site to update; the funding cycle is not long enough to get uptake by the community; authors will not update metadata; limited marketing means limited growth. With these lessons in mind, the question then is: how do we use what we’ve learned? How do we encourage software release in the community? Framed this way, it’s a change management issue, and change management has been well studied and offers guidance.

To bring about change: build it; enlist and involve others; market, market, market; engage the community; learn what barriers and incentives exist; mitigate barriers and nurture incentives; be patient. These are the most important steps we identified as necessary to change astronomy into a community that shares software more openly, and they can guide the ASCL to being more widely adopted. Let’s look at each step in more depth.

Total code entries by quarter, July 2010 - September 2011: Make it look like something. A few codes, the most recent of which had been submitted in 2003, weren’t going to entice anyone to use the site. Since software authors won’t enter their codes, I decided we’d take on that task. For the first 1.5 years I worked on the ASCL, much of my time was spent simply tracking down and entering codes. This graph shows the results of that work; it was created for the first presentations on the ASCL since 2000.

Number of code entries at year end, 2010 - 2015: We’ve continued to build it, and this is easier now as more and more software authors are submitting their codes. We now have over 1200 codes in the ASCL.

Advisory Committee: To enlist and involve others, we created an advisory committee. We selected people who provide credibility and knowledge, to inform the effort and to serve as change champions. We enlisted people prominent in the astro software community, computer science professors with ties to other fields, and others. There were eight members on the advisory committee by the end of 2011; there are ten members currently. We also email others interested in code release and sharing to let them know the ASCL exists.

Get the word out: Market, market, market! The chair of the ASCL AC and I devised a marketing plan for the ASCL, which includes linking from Astronomy Picture of the Day (APOD) to the ASCL; participating in meetings and conferences and writing the associated papers; writing to code authors upon software registration; writing guest posts for popular astro blogs; getting involved in open science discussions with others; and using social media (Facebook, Twitter, and G+).

Community work: Educate and engage the community. We organize sessions at astronomy meetings and split the allotted time between presentations and open discussion with attendees to learn about barriers to code sharing, incentives to encourage sharing, and community expectations.

Barriers: no payback or benefit to sharing software; not required to, so don’t, or too busy to do more work; fear of judgment for “messy” code; fear of losing competitive edge; university policies; expectation of support from community. (Image: http://www.pd4pic.com/road-works-site-street-away-barrier-warning-stop.html)

“No one can assume that valuable innovations will pop up magically in the public domain if their inventors received no reward for their labor and capital.” (Richard Epstein) Incentives: citations for software and tracking of citations; greater recognition for code contribution; complying with funding requirements; future funding and peer review; meeting changing or new expectations.

(Patience, the lion on the left of the NY Public Library.) We are not expecting overnight success. Building the resource and changing the community go hand in hand; changing the community is the harder and longer process. This is a ten-year plan.

Are we having any effect? The big question is: are we having an effect on the community? Are people changing their behavior?

Well, it appears so; for one, some journal editors are now encouraging authors to register new software with the ASCL before a paper is published. They can use the ASCL ID in the paper, and make it a link to the ASCL entry in the online version of the paper. We see more submissions now, too. 40% of the codes added to the ASCL since July 2014 were submitted by their authors, whereas before that, only a handful of entries had been submitted. Every major astro journal accepts citations to the ASCL.

Citations to ASCL entries by year: Citations to ASCL entries are generally more than doubling every year, as you can see from this chart. The ASCL has made it possible to cite software that does not have a descriptive paper in the literature to use for citation. The second largest source of referral traffic to the ASCL is ADS, which means people are finding links to code entries there and following them to the ASCL; codes are more easily discoverable.

Things I didn’t say but would have with more time: We have automated some manual processes and will automate others as time, volunteer work, ideas, and funding allow, but the ASCL will always need human editors. The ASCL has had little funding in the past: $5K from AAS for outreach one year, and for this year, the Heidelberg Institute for Theoretical Studies (HITS) has provided €6K in (unsolicited!) funding. Consistent funding is on my to-do list, and we are looking at perhaps gathering small amounts of funding annually ($500-$1000) from a broad group of organizations. The ASCL is built using open source technologies: MySQL, CodeIgniter, phpBB, WordPress. It can be cloned; see offer here and an empty(ish) site here. I’ve had requests from physicists to create a physics source code library, and we are looking at other fields, including economics. I get to meet the best people by working on this project! Thank you for being among them! Template image credit and information: This image is an average of the central 10 velocity planes of a mosaic of five data cubes released as part of the Galactic Arecibo L-band Feed Array HI (GALFA-HI) survey (Peek et al., 2011, Ap J Suppl, 194, 20; DOI 10.1088/0067-0049/194/2/20; ADS Bibcode 2011ApJS..194...20P). The mosaic was computed with version 4.0 of the Montage Image Mosaic Engine. Image courtesy John Good and Bruce Berriman, California Institute of Technology.