The MIRACLE project: Cyberinfrastructure for visualizing model outputs


1 The MIRACLE project: Cyberinfrastructure for visualizing model outputs
Dawn Parker, Michael Barton, Terence Dawson, Tatiana Filatova, Xiongbing Jin, Allen Lee, Ju-Sung Lee, Lorenzo Milazzo, Calvin Pritchard, J. Gary Polhill, Kirsten Robinson, and Alexey Voinov

2 Background and motivation
Growing interest in analyzing highly detailed “big data”
Concurrent development of a new generation of simulation models, including ABMs, which themselves produce “big data” as outputs
Need for tools and methods to analyze and compare these two data sources

3 Motivation for the ABM community
Sharing model code is great, but there are large barriers to entry in getting someone else’s model running
Sharing model output data can accomplish many of the goals of code sharing
It also lets other researchers explore new parameter spaces or use different algorithms
Sharing analysis algorithms may jump-start development of output analysis methods specific to complex systems

4 Project Objectives
1. Collect, extend, and share methods for statistical analysis and visualization of output from computational agent-based models of coupled human and natural systems (ABM-CHANS).
2. Create web-based facilities to interactively visualize and analyze archives of model output data for ABM-CHANS models as part of CoMSES Net, an existing community modeling archive for ABM.

5 Project Objectives, cont.
3. Conduct meta-analyses of our own projects, and invite the ABM-CHANS community to conduct further meta-analyses using the new tools.
4. Apply the statistical analysis algorithms we develop to empirical datasets to validate their applicability to large scale data from complex social systems.

6 Prototype goals
Simple-as-possible demonstration prototype
Hosted on the Compute Canada/Sharcnet platform
One example project under development from each participating research group, containing:
  Model output data and metadata
  Workflow description for data creation
  Scripts used to analyze output data, with documentation
  Output analysis

7 Planned minimal functionality: All Users
Permissions
  Apply for a user login
Query/plot results:
  Display an existing analysis, or run a new analysis using existing scripts and output data
  Show and download existing analysis scripts
  Save a query
  Comment on a query, data-set, or project
Explore
  Navigate projects, data-sets, queries, and comments to find an interesting project
  Search

8 Planned minimal functionality: Project user (research group creating data and script archive)
Upload data
Add/edit/rename metadata (preexisting categories)
Upload and activate scripts (must link scripts to data)
Permissions
  Join a project, accept members of a project group
  Publish a project

9 Planned minimal functionality: Administrators
Permissions
  Manage user logins
  Manage problematic users/postings
Record user workflows for research on how people model and how groups communicate

10 Simple (Simple prototype) prototype
Beta prototype
Development team: Xiongbing Jin, Allen Lee, Calvin Pritchard, Kirsten Robinson
S(sp)P presented by Allen Lee

11 Companion threads
Methods for statistical analysis of complex simulation model output data (Twente, lead institution)
Metadata standards for complex simulation model output data (James Hutton Institute, lead institution)
Both threads supported by user workshops

12 Community input: IEMSS 2014 workshop, “Analysing and synthesising results from complex socio-ecosystem models with high-dimensional input, parameter and output spaces”
Focus questions:
1) What existing and developing methodologies are currently being used to analyze, visualize, and synthesize model output data?
2) What are the further unmet requirements of this community for data analysis, visualization, and synthesis?

13 Review paper in press, JASSS
“The Complexities of Agent-Based Modeling Output Analysis” (J.S. Lee et al.)
Reviews state-of-the-art approaches to output analysis
Examines stability/convergence conditions, sensitivity analysis, spatio-temporal analysis, visualization, and communication
Proposed follow-up paper session at IEMSS 2016 will focus on novel methods from other domains that are promising for ABM output analysis

14 Community input: ESSA 2014 workshop “Towards metadata standards for social simulation outputs”
Rationale:
  Workflows used to create model output are unknown
  Simulation outputs need metadata to aid interpretation and ensure replicability: data need metadata, regardless of where they come from!
  If we are to create a tool where users can upload their output data, we need to know its structure
  Users also need to know what they are looking at

15 ESSA 2014 workshop continued
Questions:
  What file formats do you use for your simulation outputs?
  What metadata do you record in or about your simulation outputs?
Metadata schema paper in draft

16 Metadata for ABM output data
Goals
  User needs to understand the data (what’s inside the files, what are the relationships between the files, project and owners…)
  User needs to know how the data were generated (input data, analysis scripts, parameters, computer environment, workflows that chain several scripts…)
Two types of metadata
  Metadata that describe the current state of data (data structure, file and data table content → Fine Grain Metadata)
  Metadata that describe the provenance of data (how the data were generated → Coarse Grain Metadata)
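One way to picture the two metadata types is as two small record types attached to each output file. The sketch below is only an illustration under assumptions: the class names, field names, and file names are hypothetical and are not taken from the MIRACLE schema, which the slides do not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FineGrainMetadata:
    """Current state of one data file: its structure and content (hypothetical fields)."""
    file_name: str
    file_format: str      # e.g. "csv" or "shapefile"
    columns: List[str]    # column names of the data table
    row_count: int


@dataclass
class CoarseGrainMetadata:
    """Provenance of one data file: how it was generated (hypothetical fields)."""
    output_file: str
    script: str
    input_files: List[str] = field(default_factory=list)
    parameters: Dict[str, str] = field(default_factory=dict)
    environment: str = ""
    user: str = ""


def describe(fine: FineGrainMetadata, coarse: CoarseGrainMetadata) -> str:
    """Combine both metadata types into one human-readable summary line."""
    inputs = ", ".join(coarse.input_files) or "no recorded inputs"
    return (
        f"{fine.file_name}: {len(fine.columns)} columns, {fine.row_count} rows; "
        f"produced by {coarse.script} from {inputs}"
    )


# Illustrative values only; these files and this script are made up.
fine = FineGrainMetadata("land_use.csv", "csv", ["tick", "parcel_id", "land_use"], 1200)
coarse = CoarseGrainMetadata(
    "land_use.csv", "summarize_runs.py", ["raw_runs.csv"], {"threshold": "0.5"}
)
summary = describe(fine, coarse)
```

Keeping the two records separate mirrors the split on the slide: the fine grain record can be rebuilt from the file at any time, while the coarse grain record has to be written down when the file is produced.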

17 Capturing metadata
Goal: Automated metadata extraction with minimum user input
Fine grain metadata
  Automatically extracting metadata from files (CSV columns, ArcGIS Shapefile metadata and attribute table columns, etc.)
Coarse grain metadata
  Workflow describes how a script could produce a certain file type, while provenance describes how script A produces file B
  Provenance can be automatically captured when a user runs scripts and workflows using the MIRACLE system (computer environment, user name, application name, process, input files and parameters, output files)
  Workflows can be constructed based on captured provenance
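The two capture paths on this slide can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the MIRACLE implementation: the function names, the record keys, and the `summarize_runs` script are hypothetical, and only CSV input is handled (Shapefiles would need a GIS library).

```python
import csv
import getpass
import io
import platform
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, TextIO


def extract_csv_metadata(stream: TextIO) -> Dict[str, Any]:
    """Fine grain metadata: column names plus the number of rows after the header."""
    reader = csv.reader(stream)
    header = next(reader, [])
    row_count = sum(1 for _ in reader)
    return {"columns": header, "row_count": row_count}


def current_user() -> str:
    """Return the login name, or "unknown" where it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        return "unknown"


def run_with_provenance(
    script: Callable[..., List[str]],
    input_files: List[str],
    parameters: Dict[str, Any],
) -> Dict[str, Any]:
    """Coarse grain metadata: run a script and record how its outputs were made."""
    started = datetime.now(timezone.utc).isoformat()
    output_files = script(input_files, **parameters)
    return {
        "environment": platform.platform(),
        "user": current_user(),
        "application": f"Python {platform.python_version()}",
        "process": script.__name__,
        "input_files": list(input_files),
        "parameters": dict(parameters),
        "output_files": list(output_files),
        "started_utc": started,
    }


def summarize_runs(input_files: List[str], threshold: float) -> List[str]:
    """Hypothetical analysis script; a real one would read inputs and write files."""
    return ["summary.csv"]


# Illustrative inputs only; the CSV content and file names are made up.
csv_text = "tick,agent_id,wealth\n0,1,10.5\n1,1,11.0\n"
fine = extract_csv_metadata(io.StringIO(csv_text))
coarse = run_with_provenance(summarize_runs, ["runs.csv"], {"threshold": 0.5})
```

Because each provenance record names its input and output files, a chain of such records could in principle be linked into a workflow, which is the idea behind constructing workflows from captured provenance.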

18 Summing up: How might the MIRACLE platform be used?
Within a research group:
  Efficiently share and discuss new model results
  Let group members explore new parameter spaces
  Create accessible archives for publications
Across groups:
  Provide prototypes to new researchers, or those looking for new analysis methods
  Provide examples for teaching and labs
  Facilitate additional “after-market” research and publication

19 We hope the MIRACLE project will help to…
Develop, share, test, and compare new statistical methods appropriate for analysis of complex systems data
Improve communication and assessment within the modeling community
Reduce barriers to entry for use of models
Improve the ability of policy makers and stakeholders to understand and interact with model output

20 Acknowledgements
“Digging into Data” international funding award to Parker, Dawson, Filatova, and Barton (Canadian, UK, Netherlands, and US national science funding agencies)
Waterloo Institute for Complexity and Innovation
Workshop participants
Compute Canada/Sharcnet

