Better Science Through Software. Nick Barnes at UKMO, 2012-02-28. climatecode.org. Copyright Climate Code Foundation, license CC-BY.

Presentation transcript:

Slide 1: Better Science Through Software
Copyright Climate Code Foundation, license CC-BY.

Slide 2: What is the CCF?
A non-profit founded in 2010;
Continuing projects started in 2008;
A few software consultants, currently unpaid part-time;
Advisory committee of a dozen experts;
A growing network of climate scientists and others;
Several projects and publications;
and big plans.

Slide 3: What is the problem?
Scientists have to write code, but:
    They aren’t well-trained;
    They aren’t properly rewarded;
    There is no incentive to publish it.
The public need to know about climate science, but:
    The science isn’t accessible;
    The practices aren’t always transparent;
    They are lied to about ‘tricks’ and secrecy.

Slide 4: Foundation goals
“to promote public understanding of climate science,
    by increasing the visibility and clarity of the software used in climate science, and by encouraging climate scientists to do the same;
    by encouraging good software development and management practices among climate scientists;
    by encouraging the publication of climate science software as open source.”

Slide 5: Advisory Committee
Climate Scientists: Kate Willett, James Annan, V. Balaji, Stefan Brönnimann, John Christy, Reto Ruedy, Peter Thorne.
Other Scientists: Steve Easterbrook, Peter Murray-Rust, Cameron Neylon, Andrew Woolf.
Non-scientists: Paul Edwards, Glyn Moody.

Slide 6: Clear Climate Code
Project started in
Over-riding goal is clarity: code which interested members of the public can download, run, read and understand.
Open-source, of course.
First target NASA GISTEMP:
    12 KLOC of Fortran (etc.) became 3678 lines of Python (including 1500 of docstrings);
    fixed minor bugs;
    fosters new science: one paper out now, more in draft.
ccc-gistemp.googlecode.com

Slide 7: Why clarity?
Original motivation was to answer critics:
    Not the real code;
    Can’t be run;
    Contains “obvious bugs”;
    “divinci code written by the shortbus crew.”
But also a key message of software engineering:
    Your target audience is people, not compilers.
    Those people are often yourselves.

Slide 8: What is clarity?

    def step1(record_source):
        """An iterator for step 1.  Produces a stream of
        `giss_data.Series` instances.

        :Param record_source:
            An iterable source of `giss_data.Series` instances (which it
            will assume are station records).
        """
        records = comb_records(record_source)
        helena_adjusted = adjust_helena(records)
        combined_pieces = comb_pieces(helena_adjusted)
        without_strange = drop_strange(combined_pieces)
        for record in alter_discont(without_strange):
            yield record

Slide 9: Clear how?

    def step1(record_source):
        """An iterator for step 1.  Produces a stream of
        `giss_data.Series` instances.

        :Param record_source:
            An iterable source of `giss_data.Series` instances (which it
            will assume are station records).
        """
        records = comb_records(record_source)
        helena_adjusted = adjust_helena(records)
        combined_pieces = comb_pieces(helena_adjusted)
        without_strange = drop_strange(combined_pieces)
        for record in alter_discont(without_strange):
            yield record

Slide 10: Clear to whom?

    def step1(record_source):
        """An iterator for step 1.  Produces a stream of
        `giss_data.Series` instances.

        :Param record_source:
            An iterable source of `giss_data.Series` instances (which it
            will assume are station records).
        """
        records = comb_records(record_source)
        helena_adjusted = adjust_helena(records)
        combined_pieces = comb_pieces(helena_adjusted)
        without_strange = drop_strange(combined_pieces)
        for record in alter_discont(without_strange):
            yield record

Slide 11: Unclear how?

    def step1(record_source):
        """An iterator for step 1.  Produces a stream of
        `giss_data.Series` instances.

        :Param record_source:
            An iterable source of `giss_data.Series` instances (which it
            will assume are station records).
        """
        records = comb_records(record_source)
        helena_adjusted = adjust_helena(records)
        combined_pieces = comb_pieces(helena_adjusted)
        without_strange = drop_strange(combined_pieces)
        for record in alter_discont(without_strange):
            yield record
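The step1 function on these slides chains generators, so each stage reads like one line of the method description and records flow through one at a time. The sketch below shows the same idiom in a self-contained form. The stage names drop_short and subtract_mean and the list-of-floats record format are hypothetical stand-ins for illustration, not part of ccc-gistemp.

```python
def drop_short(records, min_length=3):
    """Hypothetical stage: skip records with fewer than min_length values."""
    for record in records:
        if len(record) >= min_length:
            yield record


def subtract_mean(records):
    """Hypothetical stage: convert each record to anomalies from its own mean."""
    for record in records:
        mean = sum(record) / len(record)
        yield [value - mean for value in record]


def pipeline(record_source):
    """Chain the stages. Nothing is computed until the result is iterated."""
    long_enough = drop_short(record_source)
    for record in subtract_mean(long_enough):
        yield record


if __name__ == "__main__":
    for anomalies in pipeline([[1.0, 2.0, 3.0], [5.0]]):
        print(anomalies)
```

Because every stage is a generator, a new stage can be slotted in by adding one line, and the intermediate names document what the data has been through at each point.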

Slide 12: Unclear how?

    for m in range(12):
        sum_new = 0.0 # Sum of data in new
        sum = 0.0 # Sum of data in average
        count = 0 # Number of years where both new and average are valid
        for a,n in itertools.izip(average[first_year*12+m: last_year*12: 12],
                                  new[first_year*12+m: last_year*12: 12]):
            if invalid(a) or invalid(n):
                continue
            count += 1
            sum += a
            sum_new += n
        if count < min_overlap:
            continue
        bias = (sum-sum_new)/count
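One possible clearer rendering of the fragment on this slide, written for Python 3 (where zip replaces itertools.izip) and avoiding the shadowing of the built-in sum. This is a sketch under stated assumptions, not the project's code: the MISSING sentinel, the invalid helper, the monthly_bias name, and the choice to return None for months with too little overlap are all assumptions, since the slide does not show what happens to bias afterwards.

```python
import math

MISSING = 9999.0  # assumed missing-value sentinel for this sketch


def invalid(value):
    """Return True when a monthly value is missing (assumed convention)."""
    return value == MISSING or math.isnan(value)


def monthly_bias(average, new, first_year, last_year, min_overlap):
    """For each calendar month, return mean(average) - mean(new) over the
    years in [first_year, last_year) where both series are valid, or None
    when fewer than min_overlap such years exist.
    """
    biases = []
    for month in range(12):
        # Every 12th value starting at this month: one value per year.
        window = slice(first_year * 12 + month, last_year * 12, 12)
        pairs = [(a, n) for a, n in zip(average[window], new[window])
                 if not (invalid(a) or invalid(n))]
        if len(pairs) < min_overlap:
            biases.append(None)
            continue
        total_average = sum(a for a, _ in pairs)
        total_new = sum(n for _, n in pairs)
        biases.append((total_average - total_new) / len(pairs))
    return biases
```

Naming the year-by-year window and collecting the valid pairs first makes the arithmetic on the last line match the sentence that describes it: the difference of the two sums, divided by the number of overlapping years.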

Slide 13: Clarity enables new science
By promoting “computational thinking” (Wing, NSF), clear code raises new questions…
    Airport-only trends?
    Effect of US data?
    Effect of restricting to long-record stations?
    Use of land data for ocean cells?
    Adding more data scraped from met sites?
…and helps answer them…
…for both original authors and others.

Slide 14: Homogenization project
GHCN 3.0 dataset (Menne & Williams 2009);
Re-implemented by Dan Rothenberg (Cornell, now MIT);
Working with Menne and Williams at NCDC;
Algorithm improved, bugs fixed;
Revised dataset – GHCN-M – see M&W tech note;
Funded by Google (Summer of Code 2011).
Presented at AMS New Orleans.
Many extensions possible: Peter Thorne has a dream….

Slide 15: Common Climate Project
Web framework for visualizing climate datasets;
Late Holocene paleoclimatology: Emile-Geay (USC), Smerdon & Anchukaitis (LDEO);
Open-source, open datasets;
Prototype online at commonclimate.net;
Implemented by Hannah Aizenman (grad student at CUNY);
Funded by Google (Summer of Code 2011).
Presented at AMS New Orleans.
… development continues.

Slide 16: Google Summer of Code
Google pays students to write code ($5000 for 3 months);
Any open-source project;
CCF acts as an “umbrella organization”.
Our 2011 projects:
    Hannah Aizenman: Common Climate Project;
    Filipe Fernandes: Extensions to ccc-gistemp;
    Daniel Rothenberg: Homogenization.
(all presented at AMS New Orleans).

Slide 17: Google Summer of Code
Timetable 2012:
    Feb 27–Mar 9: brief window for orgs to apply;
    Mar 16: orgs announced;
    Mar 26–Apr 6: brief window for students to apply;
    Apr 23: projects announced;
    May 21–Aug 20: Coding!
    Aug 27: final results;
    Oct 20/21: mentor summit.

Slide 18: Can you reproduce Fig 7a?
“Why?”
    Reproducibility;
    New data;
    Bug fixes;
    Revised model;
    Transparency.
Why not?
    Versioned code;
    Versioned data;
    Configuration Management.
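As a rough illustration of what “versioned code, versioned data, configuration management” can mean in practice, the sketch below bundles the three into one record that could be stored alongside a figure. Everything in it is an assumption made for illustration: the function names, the record layout, and the example values. How the code version string is obtained (for instance from a version control system) is deliberately left to the caller.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(code_version, data_bytes, config):
    """Bundle what is needed to re-run an analysis later (hypothetical layout)."""
    return {
        "code_version": code_version,           # e.g. a revision identifier
        "data_sha256": sha256_of(data_bytes),   # fingerprint of the input data
        "config": config,                       # parameters used for the run
    }


def same_inputs(record_a, record_b):
    """Compare two records via canonical JSON, so key order does not matter."""
    return (json.dumps(record_a, sort_keys=True)
            == json.dumps(record_b, sort_keys=True))


if __name__ == "__main__":
    record = provenance_record("abc123", b"1,2,3\n", {"min_overlap": 20})
    print(json.dumps(record, indent=2, sort_keys=True))
```

If two runs produce records for which same_inputs is true, a difference in the output figure points at something outside the recorded code, data and configuration, such as the environment.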

Slide 19: Open Science
Accelerating trend towards more openness in science.
Redefining publication:
    Open Access;
    Open Data;
    Open Knowledge;
    Open Notebooks;
    Data-driven intelligence;
Workshops, conferences, summits;
There’s a war on: PRISM, RWA;
Royal Society policy study: Science as a Public Enterprise;
But no coherent message about open software in science.
Michael Nielsen: Reinventing Discovery

Slide 20: Science Code Manifesto
Code: All source code written specifically to process data for a published paper must be available to the reviewers and readers of the paper.
Copyright: The copyright ownership and license of any released source code must be clearly stated.
Citation: Researchers who use or adapt science source code in their research must credit the code's creators in resulting publications.
Credit: Software contributions must be included in systems of scientific assessment, credit, and recognition.
Curation: Source code must remain available, linked to related materials, for the useful lifetime of the publication.

Slide 21: Future Plans
Changing policies:
    Transparency;
    Rewards for all research products.
Training scientists:
    Basic techniques (testing, version control, agile, etc);
    Code publication and reuse.
Providing resources:
    White papers, blog posts;
    Directories.
Building networks, partnering with institutions;
Leading by example: ccc-gistemp; ccf-homogenization; etc….

Slide 22: Questions?

Slide 23: Funding
I say “non-profit”. Approximately “non-revenue”.
All accounts open.
Total revenue to date £ (+ GSoC students).
Total costs to date £ (as of ).
All work unpaid (not counting GSoC students).
Personal lost income to date probably £40K.
Funding model seeks £150K-£500K annually from corporate or NGO sponsorship (plus some project money from academic collaborations).
Too much? Not enough? Depends who you ask.
Open to suggestions!