Published by Erin Watson. Modified over 9 years ago.
1
Research Design for Collaborative Computational Approaches and Scientific Workflows
Deana Pennington
January 8, 2007
2
Informatics and the Research Cycle
[Diagram; labels recovered from the slide]
Research cycle: Mental Model, Research Design, Collect Data, Conduct Analyses, Publish
Data-intensive: Data mining, Bio-inspired algs., Exp. Data Analysis, Visualization
Compute-intensive: Parallel processing, High throughput
Knowledge-intensive: Human cognition, Ontologies, Sem. mediation
Inductive, Descriptive: Statistics
Deductive, Prescriptive: Mechanistic
Scientific Workflow System:
– Automation => replication
– Access to distributed resources
– Reusability & sharing
– Empowered by knowledge-intensive approaches***
Data Management: Data models, Metadata, Storage
Cyberinfrastructure: Sharing data, analyses, mental models
3
Scientific Workflows
Scientists currently do their analyses by:
– Focusing on data collection and the analytical steps
– Manually coordinating export and import of data among software systems
Workflow systems collaborate with the scientist to:
– Discover existing data
– Handle data flow between components
– Document the analytical process
[Figure labels: Query EcoGrid to find data; Archive output to EcoGrid with workflow metadata]
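The last two roles of a workflow system (handling data flow between components and documenting the analytical process) can be illustrated with a very small sketch. This is not how any particular workflow system is implemented; `run_workflow` and the three step names are hypothetical stand-ins for real analysis components.

```python
from typing import Any, Callable, List, Tuple

# A step is a (name, function) pair; both are hypothetical placeholders.
Step = Tuple[str, Callable[[Any], Any]]

def run_workflow(steps: List[Step], data: Any) -> Tuple[Any, List[str]]:
    """Pass data through each step in order and record what was run."""
    log: List[str] = []
    for name, func in steps:
        data = func(data)                  # output of one step feeds the next
        log.append(f"{name} -> {data!r}")  # documentation of the process
    return data, log

steps = [
    ("drop_missing", lambda xs: [x for x in xs if x is not None]),
    ("to_celsius", lambda xs: [(x - 32) * 5 / 9 for x in xs]),
    ("mean", lambda xs: sum(xs) / len(xs)),
]

result, log = run_workflow(steps, [32, None, 212])
print(result)
for line in log:
    print(line)
```

The scientist only declares the steps; the runner moves the data between them and keeps a record that could be archived alongside the output.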
4
– Not linear
– Involve multiple data sets
– Involve multiple analytical steps
5
Automated Workflows
Scripts: single platform
Visual modeling: single environment
Workflows:
– Cross-platform
– Cross-environment
– Distributed data & analyses
6
Productivity Example
[Diagram; labels recovered from the slide]
Mental Model: Biomass = f(Temp, Soil, et al.)
Concept: Climate, Temp, Soil, Biomass
Conceptual Workflow: Merge, Model, Predict
Abstract Workflow: built from DS (Data Step), TS (Transformation Step), and AS (Analysis Step) nodes
Executable Workflow:
– “View1”: Excel, GIS, SAS, GIS
– “View2”: VBScript, R Script, GA, R
Dissemination
7
– Scientists design their research at the conceptual workflow level.
– This is often done on the fly over the period of time the research is being conducted.
– For automated approaches, the design must be well thought out from the beginning.
– HOWEVER, because of the automation, it is easy to modify the analysis and rerun it many times, so you are not locked into the original design.
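The point about rerunning an automated analysis can be sketched as follows. The `analysis` function, its `threshold` parameter, and the data values are all hypothetical; the sketch only shows that once an analysis is a callable, changing the design means changing an argument.

```python
from statistics import mean

def analysis(values, threshold):
    """Hypothetical analysis: mean of the values at or above a threshold."""
    kept = [v for v in values if v >= threshold]
    return mean(kept) if kept else None

data = [2.0, 4.0, 6.0, 8.0]  # made-up measurements

# Rerun the same automated analysis under several alternative designs.
results = {t: analysis(data, t) for t in (0.0, 5.0, 10.0)}
print(results)
```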
8
Benefits
– Reusable analysis steps, pipelines, and workflows
– Formal documentation of methods (output in report format)
– Reproducibility of methods
– Visual creation and communication of methods
– Versioning
– Automated data typing and transformation
9
Nested workflows
[Diagram; labels recovered from the slide]
Search for relevant data and analyses (Query)
Data sources: Field Data, Ground Sensors, Imagery (Semantically-integrated)
Nested pipelines: Image Processing Pipeline, Signal Processing Pipeline
Nodes: SW 0; AS x, AS y, AS z, AS r; TS 1, TS 2
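One way to picture nesting is to treat a workflow as something that can itself be used as a step in a larger workflow. The sketch below assumes steps are plain functions; `workflow`, `scale`, `threshold`, and `count_bright` are hypothetical names, and the step bodies are placeholders rather than real image processing.

```python
from functools import reduce

def workflow(*steps):
    """Compose steps into one callable, so a workflow can itself be a step."""
    def run(data):
        return reduce(lambda acc, step: step(acc), steps, data)
    return run

def scale(pixels):
    return [p / 255 for p in pixels]

def threshold(pixels):
    return [1 if p > 0.5 else 0 for p in pixels]

def count_bright(mask):
    return sum(mask)

# Inner pipeline, then an outer workflow that uses it as a single step.
image_pipeline = workflow(scale, threshold)
outer = workflow(image_pipeline, count_bright)

bright = outer([0, 200, 255])
print(bright)
```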
10
Ecological niche modeling conceptual workflow
[Diagram; labels recovered from the slide]
Data sources: EcoGrid DataBase (×4); EcoGrid Query (×2)
Data: Species pres. & abs. points; Env. layers; Integrated layers; Training sample; Test sample; GARP rule set; Model quality parameters; Native range prediction map; Selected prediction maps
Steps: Layer Integration; Transformation; Scaling; Sample Data; Data Calculation; Map Generation; Validation; Generate Metadata; Archive to EcoGrid
Other labels: User; + A1, + A2, + A3
11
Ecological niche modeling conceptual workflow
[Diagram; labels recovered from the slide]
Data sources: EcoGrid DataBase (×4); EcoGrid Query (×2)
Data: Species pres. & abs. points; Env. layers; Integrated layers; Training sample; Test sample; GARP rule set; Model quality parameters; Native range prediction map; Selected prediction maps
Steps: Layer Integration; Transformation; Scaling; Sample Data; Data Calculation; Map Generation; Validation; Generate Metadata; Archive to EcoGrid
Other labels: User; + A1, + A2, + A3; Spatial location; Temporal extent
12
Generic Workflow
[Diagram; labels recovered from the slide]
Data sources: EcoGrid DataBase (×4); EcoGrid Query (×2)
Data: Occurrence Data (Binary, Categorical or Numeric); Environmental layers; Integrated layers; Training sample; Test sample; GARP rule set; Model quality parameters; Prediction map; Selected prediction maps
Steps: Layer Integration; Physical Transformation; Scaling; Sample Data; Data Calculation; Map Generation; Validation; Generate Metadata; Archive to EcoGrid
Other labels: User; + A1, + A2, + A3
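The idea of a generic workflow (the same sample, model, and validate structure applied to different occurrence data) can be sketched as below. This is an illustration under strong assumptions: the rule-generation step is replaced by a trivial single cut-off, which is not GARP, and the split, the function names, and both data sets are hypothetical.

```python
def split(samples):
    """Deterministic split: even positions for training, odd for testing."""
    return samples[0::2], samples[1::2]

def fit_threshold(train):
    """Placeholder for the rule-generation step (NOT GARP): one cut-off."""
    present = [x for x, y in train if y == 1]
    absent = [x for x, y in train if y == 0]
    cut = (min(present) + max(absent)) / 2
    return lambda x: 1 if x >= cut else 0

def generic_workflow(samples, fit):
    """Sample the data, fit on the training part, validate on the test part."""
    train, test = split(samples)
    predict = fit(train)
    accuracy = sum(predict(x) == y for x, y in test) / len(test)
    return predict, accuracy

# Made-up (environmental value, occurrence) pairs for two unrelated topics.
species = [(5, 0), (7, 0), (9, 0), (11, 0), (20, 1), (22, 1), (24, 1), (26, 1)]
sinkholes = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]

_, species_accuracy = generic_workflow(species, fit_threshold)
_, sinkhole_accuracy = generic_workflow(sinkholes, fit_threshold)
print(species_accuracy, sinkhole_accuracy)
```

Only the input data changes between the two calls; the workflow structure is reused unchanged.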
13
Temperature Interpolation Workflow
[Diagram; labels recovered from the slide]
Data sources: EcoGrid DataBase (×4); EcoGrid Query (×2)
Data: Weather station temperature data; Environmental layers: elevation, aspect, land cover; Integrated layers; Training sample; Test sample; GARP rule set; Model quality parameters; Prediction map: Interpolated temperature grid; Selected prediction maps
Steps: Layer Integration; Physical Transformation; Scaling; Sample Data; Data Calculation; Map Generation; Validation; Generate Metadata; Archive to EcoGrid
Other labels: User; + A1, + A2, + A3
14
Temperature Interpolation Workflow
[Diagram; labels recovered from the slide]
Data sources: EcoGrid DataBase (×4); EcoGrid Query (×2)
Data: Sinkhole occurrence; Environmental layers: Groundwater level, chemistry, etc.; Integrated layers; Training sample; Test sample; GARP rule set; Model quality parameters; Prediction map: Sinkhole distribution; Selected prediction maps
Steps: Layer Integration; Physical Transformation; Scaling; Sample Data; Data Calculation; Map Generation; Validation; Generate Metadata; Archive to EcoGrid
Other labels: User; + A1, + A2, + A3
15
Exercise
1. Divide into groups of 4 (or so) with similar research interests
2. Pick a research topic to collaborate on
3. Construct a workflow diagram for an analysis that could be conducted
4. Discuss how it could be reused for other related or unrelated analyses