Software for Ecological Synthesis
Matthew B. Jones
National Center for Ecological Analysis and Synthesis (NCEAS), University of California Santa Barbara
Advancing Software for Ecological Forecasting, March 25, 2014
Ocean Health Index (OHI) (Halpern et al. 2012)
The “long-tail” of science Heidorn, P doi: /lib
goa.nceas.ucsb.edu
Data Heterogeneity
[Figure: heterogeneity (low to high) plotted against data set size (low to high), contrasting two regimes]
– Tight coupling, simple subsetting, explicit semantics
– Loose coupling, hard subsetting, limited semantics
Diverse Analysis and Modeling
Wide variety of analyses used in ecology and environmental sciences
– Statistical analyses and trends
– Rule-based models
– Dynamic models (e.g., continuous time)
– Individual-based models (agent-based)
– Many others
Implemented in many frameworks
– R, Matlab, SAS, SPSS, JMP, C, Python, Fortran
Software & Data Interoperability
Tools: Kepler, DMP-Tool
[Figure: data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]
Produce an open-source scientific workflow system
– Design, share, and execute scientific workflows
Support scientists in a variety of disciplines
– e.g., biology, ecology, oceanography, astronomy
Features
– Data access
– Cross analytical packages
– Documentation
– Provenance tracking
– Model archiving and sharing
Scientific workflows promote interoperability
Why workflows?
– Executability
– Replicability
– Reproducibility
– Transparency
– Modularity
– Reusability
– Provenance
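A minimal sketch of how several of these properties can fit together, assuming a workflow is just an ordered list of named steps. This is not Kepler's API; `run_workflow`, `fingerprint`, and the two example steps are hypothetical names introduced only for illustration. Each step is a small reusable function (modularity), the runner executes them in order (executability), and a hash of every input and output is recorded (provenance), so a rerun can be checked against the original (replicability).

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple

# A step is a (name, function) pair; both are hypothetical, for illustration only.
Step = Tuple[str, Callable[[Any], Any]]


def fingerprint(value: Any) -> str:
    """Return a short, stable hash of a JSON-serializable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps: List[Step], data: Any) -> Tuple[Any, List[Dict[str, str]]]:
    """Run the steps in order and record one provenance entry per step."""
    provenance: List[Dict[str, str]] = []
    for name, func in steps:
        result = func(data)
        provenance.append(
            {"step": name, "input": fingerprint(data), "output": fingerprint(result)}
        )
        data = result  # the output of one step is the input of the next
    return data, provenance


def drop_missing(values: List[Any]) -> List[float]:
    """Remove missing observations (encoded here as None)."""
    return [v for v in values if v is not None]


def mean(values: List[float]) -> float:
    """Arithmetic mean of the remaining observations."""
    return sum(values) / len(values)


steps: List[Step] = [("drop_missing", drop_missing), ("mean", mean)]
result, provenance = run_workflow(steps, [2.0, None, 4.0, 6.0])
```

Running the same steps on the same input yields an identical provenance list, which is one simple way to check that a run was replicated. A real workflow system would also record software versions, parameters, and data identifiers.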
How do we harness the long tail?
– Efficient data federation
– Interoperable software workflows
– Central search for discovery
– Just-in-time data integration
  – Loose coupling
  – Schema-less storage
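One way to picture just-in-time integration over schema-less storage is sketched below. The data set names, field names, and the `integrate` helper are all hypothetical, chosen only for illustration. Records are stored as they arrive, each with its own fields, and per-data-set mappings are applied only when a query needs a common view, which keeps the data sets loosely coupled.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Schema-less store: every record keeps whatever fields its source used.
# All data set and field names below are hypothetical.
store: List[Dict[str, Any]] = [
    {"dataset": "survey_a", "site": "S1", "temp_c": 12.5},
    {"dataset": "survey_b", "station": "S2", "temperature_f": 59.0},
    {"dataset": "survey_c", "site": "S3", "salinity": 33.1},
]

Extractor = Callable[[Dict[str, Any]], Any]

# Mappings to a common view, defined per data set and applied only at query time.
MAPPINGS: Dict[str, Dict[str, Extractor]] = {
    "survey_a": {
        "site": lambda r: r["site"],
        "temperature_c": lambda r: r["temp_c"],
    },
    "survey_b": {
        "site": lambda r: r["station"],
        # Convert Fahrenheit to Celsius on the fly.
        "temperature_c": lambda r: (r["temperature_f"] - 32) * 5 / 9,
    },
}


def integrate(
    records: Iterable[Dict[str, Any]],
    mappings: Dict[str, Dict[str, Extractor]],
) -> Iterator[Dict[str, Any]]:
    """Yield records in the common view, skipping data sets with no mapping."""
    for record in records:
        mapping = mappings.get(record["dataset"])
        if mapping is None:
            continue  # nothing known about this data set for this query
        yield {field: extract(record) for field, extract in mapping.items()}


rows = list(integrate(store, MAPPINGS))
```

Because the stored records are never rewritten, a new data set can be added without touching existing ones. Only a new mapping is needed when a query first requires it.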