
1 CyberIntegrator: A Meta-Workflow System Designed for Solving Complex Scientific Problems using Heterogeneous Tools
Peter Bajcsy, Rob Kooper, Luigi Marini, Barbara Minsker and Jim Myers
National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign (UIUC)
POC: Peter Bajcsy, email: pbajcsy@ncsa.uiuc.edu

2 Outline
Problem Formulation
–Meta-Workflow Definitions
–Past Work
Design
–Workflow Requirements Driven by Environmental Observatories
–Architecture of NCSA Meta-workflow Prototype Called CyberIntegrator
Implementation
–Key Capabilities of CyberIntegrator
Use Cases
–Environmental and Hydrological Engineering
Summary

3 Problem Formulation

4 Science Problem Formulation

5 System Problem Formulation

6 Workflow Problem Formulation

7 Meta-Workflow Definition
Meta-workflow (MWF) definitions in the past:
–(1) Workflow aspect: a workflow is an aggregation of tasks; a meta-workflow is an aggregation of workflows or a hierarchy of workflows
–(2) Process management aspect: large activities have to be integrated, executed and evaluated in the process of conducting electronic commerce
Our meta-workflow definition covers multiple dimensions:
–(1) hierarchical structure and organization of software, including the combinatorial explosion of module connections
–(2) heterogeneity of software tools and computational resources, i.e., the many different engines and software applications that people use for a reason
–(3) usability of tool and workflow interfaces
–(4) community sharing of fragments and user-friendly security
–(5) community knowledge and provenance
–(6) execution and built-in fault tolerance, etc.
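The workflow aspect of the definition (a workflow aggregates tasks, a meta-workflow aggregates workflows) can be illustrated with a small composite structure. This is a minimal sketch, not CyberIntegrator code: the class names `Task` and `Workflow`, the example step names, and the choice of passing a single float between steps are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Task:
    """A single unit of work: a named function applied to a value."""
    name: str
    func: Callable[[float], float]

    def run(self, value: float) -> float:
        return self.func(value)

    def count_tasks(self) -> int:
        return 1


@dataclass
class Workflow:
    """An ordered aggregation of steps; a step is a Task or another Workflow."""
    name: str
    steps: List[Union[Task, "Workflow"]] = field(default_factory=list)

    def run(self, value: float) -> float:
        # Feed each step's output into the next step.
        for step in self.steps:
            value = step.run(value)
        return value

    def count_tasks(self) -> int:
        # Count leaf tasks across the whole hierarchy.
        return sum(step.count_tasks() for step in self.steps)


# Hypothetical example: two small workflows aggregated into a meta-workflow.
preprocess = Workflow("preprocess", [Task("scale", lambda x: x * 2.0)])
analyze = Workflow("analyze", [Task("offset", lambda x: x + 1.0),
                               Task("square", lambda x: x * x)])
meta = Workflow("meta", [preprocess, analyze])

result = meta.run(3.0)  # (3.0 * 2.0 + 1.0) squared = 49.0
```

Because a `Workflow` accepts other `Workflow` objects as steps, the same class covers both the "aggregation of workflows" and the "hierarchy of workflows" readings of the definition.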

8 Previous Work
Other efforts:
–Business process workflow architectures (FlowMark, WSFL and BPEL), serving the business community
–Scientific workflow architectures: DAGMan, Taverna, SciFlo, Kepler, D2K, OGRE, CCA, Pegasus, GridFlow, GridAnt, Triana and GSFL
Comparison:
–Our work focuses on the simplicity of end-user interactions with information technologies while utilizing all execution mechanisms transparently (workflow by example).
–Our work turns gathered provenance into recommendation pipelines for the benefit of a community (recommendations based on provenance information).
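One simple way to picture "recommendations based on provenance information" is to count which tool users ran after which in past sessions and suggest the most frequent successor. The slides do not describe the actual recommendation method, so the sketch below is only an assumed stand-in: the function names, the session format, and the tool names in the example log are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def build_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count, for every tool, which tool followed it in recorded sessions."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions


def recommend_next(transitions: Dict[str, Counter], tool: str) -> Optional[str]:
    """Return the most frequent successor of `tool`, or None if unseen."""
    followers = transitions.get(tool)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical provenance log: each inner list is one user's sequence of steps.
provenance_log = [
    ["load_image", "image_calculator", "pie_chart"],
    ["load_image", "image_calculator", "export"],
    ["load_image", "image_calculator", "pie_chart"],
]

transitions = build_transitions(provenance_log)
suggestion = recommend_next(transitions, "image_calculator")
```

With this log, `pie_chart` followed `image_calculator` twice and `export` once, so `pie_chart` is suggested. A tool that never appears in the log yields `None` rather than a guess.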

9 Research Topics
Data Translations: Semantic and syntactic mapping of data structures
Provenance Information: Granularity of gathered provenance information for recommendations, auditing and reconstruction
HCI: User interface design issues and community dependencies
Meta-Data: Federation of distributed (data, tool, computational resource) registries
Execution: Just-in-time data delivery with respect to remote computing; cost-benefit analysis of data transfer vs. CPU requirements; execution triggered by streaming data
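The cost-benefit question under Execution (is it worth moving the data to a faster machine?) can be made concrete with a toy estimate: total time is transfer time plus compute time, and the resource with the smallest total wins. This is a deliberately simplified sketch under stated assumptions: the `Resource` fields, the linear cost model, and all numbers are hypothetical, and a real analysis would also account for queueing, latency, and monetary cost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    name: str
    compute_rate: float  # work units processed per second (assumed known)
    bandwidth: float     # MB/s to reach this resource; inf if the data is already there


def total_time(resource: Resource, data_mb: float, work_units: float) -> float:
    """Estimated seconds to move the data and then run the computation."""
    transfer_s = data_mb / resource.bandwidth  # 0.0 when bandwidth is inf
    compute_s = work_units / resource.compute_rate
    return transfer_s + compute_s


def choose_resource(resources: List[Resource], data_mb: float,
                    work_units: float) -> Resource:
    """Pick the resource with the smallest estimated total time."""
    return min(resources, key=lambda r: total_time(r, data_mb, work_units))


# Hypothetical resources: a slow local machine and a faster remote cluster.
local = Resource("local", compute_rate=10.0, bandwidth=float("inf"))
cluster = Resource("cluster", compute_rate=100.0, bandwidth=5.0)
resources = [local, cluster]

# Data-heavy, light computation: transfer dominates, so stay local.
data_heavy_choice = choose_resource(resources, data_mb=1000.0, work_units=1000.0)
# Small data, heavy computation: the faster cluster pays for the transfer.
cpu_heavy_choice = choose_resource(resources, data_mb=10.0, work_units=10000.0)
```

In the first job the cluster needs 200 s of transfer plus 10 s of compute against 100 s locally; in the second it needs 2 s plus 100 s against 1000 s locally.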

10 Design

11 Design Goals
Make scientific discoveries easier
–Workflow by example (step-by-step experimentation)
–Design friendly user interfaces
–Build seamless access to heterogeneous data/tools/resources
–Provide data and process provenance information
–Recommend data, tools and computational resources
–Derive higher level semantic tools

12 Meta-workflow Architecture

13 Implementation

14 Meta-Workflow Features
Workflow by example
Support of heterogeneous executors
–Workflows: GeoLearn, D2K, Kepler/Ptolemy
–Applications: MS Excel, Im2Learn, ArcGIS
–Web services: D2KWS
Provenance
–Gathering & Meta-data repositories
Recommendations
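Supporting heterogeneous executors while gathering provenance suggests a common interface that hides how each step actually runs. The sketch below is an assumption about what such an interface could look like, not the CyberIntegrator design: `Executor`, `FunctionExecutor`, `CommandExecutor`, and `run_steps` are hypothetical names, and an in-process function plus an external command stand in for the real workflow engines, applications, and web services.

```python
import subprocess
import sys
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple


class Executor(ABC):
    """Common interface: every executor maps input text to output text."""

    @abstractmethod
    def execute(self, text: str) -> str:
        ...


class FunctionExecutor(Executor):
    """Runs a step as an in-process Python function."""

    def __init__(self, func: Callable[[str], str]) -> None:
        self.func = func

    def execute(self, text: str) -> str:
        return self.func(text)


class CommandExecutor(Executor):
    """Runs a step as an external program, passing data via stdin/stdout."""

    def __init__(self, argv: List[str]) -> None:
        self.argv = argv

    def execute(self, text: str) -> str:
        done = subprocess.run(self.argv, input=text, capture_output=True,
                              text=True, check=True)
        return done.stdout


def run_steps(steps: List[Tuple[str, Executor]],
              text: str) -> Tuple[str, List[Dict[str, str]]]:
    """Run steps one after another and record a provenance entry per step."""
    provenance: List[Dict[str, str]] = []
    for name, executor in steps:
        output = executor.execute(text)
        provenance.append({"step": name,
                           "executor": type(executor).__name__,
                           "input": text,
                           "output": output})
        text = output
    return text, provenance


# Hypothetical two-step run mixing an in-process function and an external process.
upper_cmd = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
steps = [("strip", FunctionExecutor(str.strip)),
         ("upper", CommandExecutor(upper_cmd))]
final_text, provenance = run_steps(steps, "  nitrate  ")
```

The caller sees the same `execute` call for both steps, and each provenance entry records which kind of executor produced which output, which is the sort of record a recommendation component could later draw on.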

15 Meta-workflow Editor

16

17 Use Cases

18 Meta-Workflow R&D Drivers
Community drivers:
–Environmental Science: CLEANER
–Hydrological Science: CUAHSI
Science drivers:
–Environmental Modeling of Nutrient Distribution: Monte Carlo simulations of the maximum amount of pollution that a water body can receive each day and still retain its uses
–Understanding the Dynamic Evolution of Land-Surface Variables in the Illinois River Basin: data-driven analyses of multi-variable relationships from remote sensing data
Technology drivers:
–Collaboratory Cyberenvironments

19

20

21 Summary
The problem of designing a highly interactive scientific meta-workflow system is very complex.
Key capabilities of our meta-workflow prototype implementation, called CyberIntegrator, were demonstrated with two use cases.
We plan on building and deploying a practical tool for multiple communities.
Publications:
–Image Spatial Data Analysis Group at NCSA
–URL: http://isda.ncsa.uiuc.edu
Questions:
–Peter Bajcsy; Email: pbajcsy@ncsa.uiuc.edu

22 Hydro-informatics

23 Backup

24 Meta-workflow System Information

25 Terminology
Engines are stand-alone environments and applications that are used by many tools
–Examples: Matlab, MS Excel, D2K, Im2Learn, ArcGIS, Kepler
Tools are solutions specific to a problem and consist of several algorithms
–Examples: Image Calculator in Im2Learn, Pie chart visualization in MS Excel, …
Algorithms are code fragments that perform a specific operation in a tool
–Examples: image addition operation in Image Calculator
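The three terms nest: an engine hosts tools, and a tool consists of algorithms. A minimal sketch of that containment as a searchable registry follows. The class and function names are assumptions; the entries reuse only the examples named on the slide, and the algorithm list for the pie chart tool is left empty because the slide does not name any.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Algorithm:
    """A code fragment that performs one specific operation in a tool."""
    name: str


@dataclass
class Tool:
    """A problem-specific solution made up of algorithms."""
    name: str
    algorithms: List[Algorithm] = field(default_factory=list)


@dataclass
class Engine:
    """A stand-alone environment or application that hosts tools."""
    name: str
    tools: List[Tool] = field(default_factory=list)


def find_algorithm(engines: List[Engine],
                   algorithm_name: str) -> List[Tuple[str, str]]:
    """Return (engine name, tool name) pairs that offer the named algorithm."""
    return [(engine.name, tool.name)
            for engine in engines
            for tool in engine.tools
            for algorithm in tool.algorithms
            if algorithm.name == algorithm_name]


# Registry built only from the examples given on the slide.
engines = [
    Engine("Im2Learn", [Tool("Image Calculator", [Algorithm("image addition")])]),
    Engine("MS Excel", [Tool("Pie chart visualization")]),  # algorithms not named
]

matches = find_algorithm(engines, "image addition")
```

Here `matches` holds the single pair `("Im2Learn", "Image Calculator")`, showing how a lookup by operation leads back to the tool and engine that provide it.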

26 Environmental Science

27 Hydrological Science

