Nicholas Car Scientific Data Platforms & Policy

1 Nicholas Car Scientific Data Platforms & Policy
We of the meta meta: Is Australia developing a transparent and reproducible approach to transparency and reproducibility? Nicholas Car, Scientific Data Platforms & Policy

2 Outline
Definitions of transparency and reproducibility (T & R). T & R approach: describe what a T & R approach to T & R would look like. Work to date: on T & R, in Australia. Assessment of work. Suggestions. MODSIM15 CompSci & Eng Keynote. We of the meta meta: is Australia developing a transparent and reproducible approach to transparency and reproducibility?

3 Definitions: In order to be transparent…
For Transparency (of data generation & modelling), we must be able to: Find; Access; Understand. (Monterey Bay Aquarium Research Institute, 2004)

4 Definitions: In order to be transparent…
For Transparency (of data generation & modelling), we must be able to: Find (traverse from final data/model output to inputs; find all the elements needed); Access; Understand.
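
The "Find" step on this slide, traversing from a final output back to its inputs and collecting every element needed, can be sketched as a walk over a lineage graph. This is a minimal illustration only: the graph, the item names and both function names are hypothetical and are not taken from the talk or from any real system.

```python
from collections import deque

# Hypothetical lineage graph: each item maps to the items it was derived from.
# All names are illustrative assumptions, not from any real project.
LINEAGE = {
    "final_report": ["model_output"],
    "model_output": ["model_code", "model_config", "input_data"],
    "input_data": ["raw_observations"],
    "model_code": [],
    "model_config": [],
    "raw_observations": [],
}


def trace_inputs(graph, output):
    """Return every upstream element reachable from `output`, breadth-first."""
    seen = []
    queue = deque([output])
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, []):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


def untraceable(graph, output):
    """Return upstream elements that are referenced but have no record of their own."""
    return [n for n in trace_inputs(graph, output) if n not in graph]
```

An element that shows up in `untraceable` is one that something depends on but that cannot itself be found, which is where the "Find" requirement fails.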

5 Definitions: In order to be transparent…
For Transparency (of data generation & modelling), we must be able to: Find; Access (once found, we need to be able to access data/models/code); Understand.

6 Definitions: In order to be transparent…
For Transparency (of data generation & modelling), we must be able to: Find; Access; Understand (once accessed, we need to understand what we have).

7 Definitions: In order to be transparent…
For Reproducibility (of data & modelling results): Transparency is a precondition. Know what we want ‘reproducible’ to mean: Get exactly the same result? Do the same process, possibly different results? Do the same process, tolerating similar results? (AnimalFair, 2015)

8 Definitions: In order to be transparent…
For Reproducibility (of data & modelling results): Transparency is a precondition. Know what we want ‘reproducible’ to mean: Get exactly the same result? Do the same process, possibly different results? Do the same process, tolerating similar results? Let’s be serious: we are a long way off any of the above for many complex processes!
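
The three readings of "reproducible" listed on this slide can be made concrete by classifying a rerun against the original result. The sketch below is illustrative only: the function name `classify_rerun` and the default tolerance are assumptions, and a real project would need to choose its own tolerance and comparison rules.

```python
import math


def classify_rerun(original, rerun, rel_tol=1e-6):
    """Classify a rerun of the same process against the original numeric results.

    Returns "exact" when every value is identical, "similar" when every value
    agrees within `rel_tol` (an arbitrary illustrative tolerance), and
    "different" otherwise.
    """
    if len(original) != len(rerun):
        return "different"
    if all(a == b for a, b in zip(original, rerun)):
        return "exact"
    if all(math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(original, rerun)):
        return "similar"
    return "different"
```

Deciding in advance which of these outcomes counts as success is the point of "know what we want": the same rerun can pass under one definition and fail under another.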

9 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency.

10 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? Yes, we need it! Too much effort per organisation. We are all connected.

11 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency → not present in DC2020.

12 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised.

13 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised.

14 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised; Widespread use.

15 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised; Widespread use.

16 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised; Widespread use; Conventions known (input data, plan, output data, activity, config).

17 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Transparency: Explicit modelling for transparency; Standardised; Widespread use; Conventions known.

18 T & R approach
What would developing a T & R approach to T & R look like, and do we need to do it? For Reproducibility (extensibility): Scalability: size; time; uniform/known digital storage rules. Complexity: domain extension (datasets, processes, persons’ contributions, …).

19 Work to date
Have always had T & R as part of our science.

20 Work to date
Have always had T & R as part of our science. Important to give credit.

21 Work to date
Have always had T & R as part of our science. Important to give credit. Important not to forget.

22 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15.

23 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY; MDBSY; Bioregional Assessments; Post-BA.

24 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY – no formal dedication to task; MDBSY; Bioregional Assessments; Post-BA.

25 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY – no formal dedication to task; MDBSY – first formal data management, some tools; Bioregional Assessments; Post-BA.

26 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY – no formal dedication to task; MDBSY – first formal data management, some tools; Bioregional Assessments – client requirements, experienced formal data management & better tools; Post-BA.

27 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY – no formal dedication to task; MDBSY – first formal data management, some tools; Bioregional Assessments – client requirements, experienced formal data management & better tools; Post-BA. CSIRO’s BA Data Management System (BADMS) lineage graph.

28 Work to date
Have always had T & R as part of our science. “Hockey stick” increase in dedicated MODSIM papers: 2007, 2009, 2011, 2013, 2015 → 2, 3, 2, 7, 15. CSIRO Land & Water projects: Pre-MDBSY – no formal dedication to task; MDBSY – first formal data management, some tools; Bioregional Assessments – client requirements, experienced formal data management & better tools; Post-BA – home institution commitment, procedures & tools ready before the projects, automated testing (on-going staff incentives).

29 Work to date

30 Assessment of work: scale, longevity, complexity
(Faraday, 1821; NCI, 2015)

31 Assessment of work: scale & longevity
For recent data/models, yes: We are now actually trying. A few years’ worth of data and metadata are kept.

32 Assessment of work: scale & longevity
For recent data/models, yes: We are now actually trying. A few years’ worth of data and metadata are kept.

33 Assessment of work: scale & longevity
Short term, yes: We are now actually trying. A few years’ worth of data and metadata are kept. Long term, no guarantees: Will data be kept? Will cheap storage continue? Will institutional change make this too hard?

34 Assessment of work: scale & longevity
For recent data/models, yes: We are now actually trying. A few years’ worth of data and metadata are kept. For 20+ years, no guarantees: Will data be kept? Will cheap storage continue? Will institutional change make this too hard? Are our provenance models sufficient?

35 Assessment of work: complexity
For Transparency, I posited we need: Explicit modelling for transparency; Widespread use / Standardised; Conventions known.

36 Assessment of work: complexity
Explicit Modelling: Not much.

37 Assessment of work: complexity
Explicit Modelling: Not much. Where there is, it is mostly non-standardised.

38 Assessment of work: complexity
Explicit Modelling: Not much. Where there is, it is mostly non-standardised. Where it is standardised, no guarantee of completeness (input data, plan, output data, activity, config). How would you test for completeness other than by attempting reproduction?
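
A structural check is one weak, partial answer to the completeness question: it can confirm that a run record names all five elements shown on the slide, but not that those elements are sufficient to reproduce the result. The sketch below assumes a hypothetical record layout (plain dictionaries keyed by the five element names); the layout and both function names are illustrative and are not drawn from any standard or from the talk.

```python
# The five elements named on the slide, used here as hypothetical record keys.
REQUIRED_ELEMENTS = ("input_data", "plan", "output_data", "activity", "config")


def missing_elements(record):
    """Return the required elements that are absent or empty in one run record."""
    return [key for key in REQUIRED_ELEMENTS if not record.get(key)]


def audit(records):
    """Map each record id to its missing elements, omitting complete records."""
    report = {}
    for record_id, record in records.items():
        gaps = missing_elements(record)
        if gaps:
            report[record_id] = gaps
    return report
```

A record that passes this check may still fail an attempted reproduction, so a check like this narrows the question on the slide rather than replacing the reproduction attempt.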

39 Suggestions
For Transparency: Actually model for it; Use the standards; Aim for completeness. For Reproducibility: Define what form you want; Set a testable cost target: 10% in 10 years; Test.

40 Suggestions
For Transparency: Actually model for it; Use the standards; Aim for completeness. For Reproducibility: Define what form you want; Set a testable cost target: 10% in 10 years; Test. No, really, you actually must.

41 Suggestions
For the T & R of T & R: Explicitly require T & R of others; Establish expectation norms; Establish modelling & delivery conventions; Establish SLAs; Be audited – better than tested.

42 Thank you
Questions? Phone: +61 2 6249 9093. Address: Cnr Jerrabomberra Avenue and Hindmarsh Drive, Symonston ACT 2609. Postal Address: GPO Box 378, Canberra ACT 2601.

