Nicholas Car Scientific Data Platforms & Policy

Presentation transcript:

We of the meta meta: Is Australia developing a transparent and reproducible approach to transparency and reproducibility? Nicholas Car, Scientific Data Platforms & Policy

Outline: Definitions of T & R. T & R approach: describe what a T & R approach to T & R would look like. Work to date: on T & R, in Australia. Assessment of work. Suggestions. MODSIM15 CompSci & Eng Keynote. We of the meta meta: is Australia developing a transparent and reproducible approach to transparency and reproducibility?

Definitions: In order to be transparent… For Transparency (of data generation & modelling), we must be able to: Find: traverse from final data/model output to inputs, and find all the elements needed. Access: once found, we need to be able to access data/models/code. Understand: once accessed, we need to understand what we have. (Monterey Bay Aquarium Research Institute, 2004)
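
The "Find" requirement above, traversing from a final output back to its inputs, can be sketched as a walk over a lineage graph. This is only a minimal illustration: the item names and the `derived_from` mapping are hypothetical and are not taken from the talk.

```python
from collections import deque

# Hypothetical lineage: each item maps to the items it was directly derived from.
derived_from = {
    "final_report": ["model_output"],
    "model_output": ["model_code", "cleaned_data", "config"],
    "cleaned_data": ["raw_data"],
}

def find_all_inputs(item, lineage):
    """Return every upstream element reachable from `item` (breadth-first walk)."""
    seen = set()
    queue = deque(lineage.get(item, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(lineage.get(current, []))
    return seen

upstream = find_all_inputs("final_report", derived_from)
```

If any element is missing from the mapping, the walk simply stops there, which is one way a gap in "Find" would show up in practice.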

Definitions: In order to be transparent… For Reproducibility (of data & modelling results): Transparency is a precondition. Know what we want ‘reproducible’ to mean: Get exactly the same result? Do the same process, possibly different results? Do the same process, tolerating similar results? Let’s be serious: we are a long way off any of the above for many complex processes! (AnimalFair, 2015)
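
The three possible meanings of "reproducible" listed above could be expressed as three different checks. The sketch below is an assumption-laden illustration: the function names, the step labels and the tolerance value are all hypothetical, and which check is appropriate depends on the form of reproducibility chosen.

```python
import math

def exactly_same(original, rerun):
    """Form 1: the rerun yields exactly the same result."""
    return original == rerun

def same_process(original_steps, rerun_steps):
    """Form 2: the same process was followed; results are not compared."""
    return original_steps == rerun_steps

def similar_within(original, rerun, rel_tol=1e-6):
    """Form 3: same process, tolerating results within a relative tolerance."""
    return len(original) == len(rerun) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(original, rerun)
    )

# Hypothetical results from an original run and a rerun.
original = [1.0, 2.5, 3.75]
rerun = [1.0, 2.5000001, 3.75]
steps = ["load", "clean", "model"]

exact = exactly_same(original, rerun)
process = same_process(steps, list(steps))
similar = similar_within(original, rerun)
```

Here the rerun fails the exact check but passes the other two, which is why the form has to be decided before anything can be tested.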

T & R approach: What would developing a T & R approach to T & R look like and do we need to do it? Yes, we need it! Too much effort per organisation. We are all connected. For Transparency: Explicit modelling for transparency (→ not present in DC2020). Standardised. Widespread use. Conventions known. Diagram labels: input data, plan, output data, activity, config.

T & R approach: What would developing a T & R approach to T & R look like and do we need to do it? For Reproducibility (extensibility): Scalability: size; time; uniform/known digital storage rules. Complexity: domain extension (datasets, processes, persons’ contributions, …).

Work to date: Have always had T & R as part of our science. Important to give credit. Important not to forget. “Hockey stick” increase in dedicated MODSIM papers: 2007: 2, 2009: 3, 2011: 2, 2013: 7, 2015: 15. CSIRO Land & Water projects: Pre-MDBSY: no formal dedication to task. MDBSY: first formal data management, some tools. Bioregional Assessments: client reqs, experienced formal DM & better tools. Post-BA: home institution commitment, procedures & tools ready before the projects, automated testing (on-going staff incentives). Figure: CSIRO’s BA Data Management System (BADMS) lineage graph.

Assessment of work: scale, longevity, complexity. (Faraday, 1821; NCI, 2015)

Assessment of work: scale & longevity. For recent data/models (short term), yes: We are now actually trying. A few years’ worth of data and metadata are kept. For 20+ years (long term), no guarantees: Will data be kept? Will cheap storage continue? Will institutional change make this too hard? Are our provenance models sufficient?

Assessment of work: complexity. For Transparency, I posited we need: explicit modelling for transparency; widespread use / standardised; conventions known. Explicit modelling: Not much. Where there is, it is mostly non-standardised. Where it is standardised, no guarantee of completeness. Diagram labels: input data, plan, output data, activity, config. How would you test for completeness other than by attempting reproduction?
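
A purely structural check is one partial answer to the completeness question above. The sketch below assumes a provenance record is a simple mapping keyed by the five diagram labels; the record contents and the `missing_elements` helper are hypothetical. Passing such a check is necessary but not sufficient: it shows that every element is recorded, not that the recorded elements are enough to reproduce the result.

```python
# The five elements named on the slide, used here as required record keys.
REQUIRED_ELEMENTS = ("input_data", "plan", "output_data", "activity", "config")

def missing_elements(record):
    """List the required elements that are absent or empty in a record."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]

# Hypothetical record for one modelling activity; the config was never captured.
record = {
    "activity": "run_groundwater_model",
    "input_data": ["rainfall.csv"],
    "output_data": ["recharge.nc"],
    "plan": "model_workflow_v2",
    "config": None,
}

missing = missing_elements(record)
```

A check like this is cheap enough to run automatically on every record, leaving the more expensive reproduction attempts for a sample.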

Suggestions: For Transparency: Actually model for it. Use the standards. Aim for completeness. For Reproducibility: Define what form you want. Set a testable cost target: 10% in 10 years. Test. No, really, you actually must.

Suggestions: For the T & R of T & R: Explicitly require T & R of others. Establish expectation norms. Establish modelling & delivery conventions. Establish SLAs. Be audited: better than tested.

Thank you Questions? Phone: +61 2 6249 9093 Web: www.ga.gov.au Email: nicholas.car@ga.gov.au Address: Cnr Jerrabomberra Avenue and Hindmarsh Drive, Symonston ACT 2609 Postal Address: GPO Box 378, Canberra ACT 2601