CyberIntegrator: A Meta-Workflow System Designed for Solving Complex Scientific Problems using Heterogeneous Tools
Peter Bajcsy, Rob Kooper, Luigi Marini, Barbara Minsker and Jim Myers
National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign (UIUC)
POC: Peter Bajcsy

Outline
Problem Formulation
– Meta-Workflow Definitions
– Past Work
Design
– Workflow Requirements Driven by Environmental Observatories
– Architecture of NCSA Meta-workflow Prototype Called CyberIntegrator
Implementation
– Key Capabilities of CyberIntegrator
Use Cases
– Environmental and Hydrological Engineering
Summary

Problem Formulation

Science Problem Formulation

System Problem Formulation

Workflow Problem Formulation

Meta-Workflow Definition
Meta-workflow (MWF) definitions in the past:
– (1) Workflow aspect: a workflow is an aggregation of tasks; a meta-workflow is an aggregation of workflows or a hierarchy of workflows
– (2) Process management aspect: large activities have to be integrated, executed and evaluated in the process of conducting electronic commerce
Our meta-workflow definition covers multiple dimensions:
– (1) hierarchical structure and organization of software, and the combinatorial explosion of module connections
– (2) heterogeneity of software tools and computational resources: people use many different engines and software applications, each for a reason
– (3) usability of tool and workflow interfaces
– (4) community sharing of fragments and user-friendly security
– (5) community knowledge and provenance
– (6) execution and built-in fault tolerance, etc.
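The workflow-aspect definition above (a workflow aggregates tasks, a meta-workflow aggregates workflows, possibly in a hierarchy) can be illustrated with a small data structure. This is only a sketch: the class names `Task` and `Workflow`, the `execute` and `depth` methods, and the toy numeric steps are hypothetical and are not taken from CyberIntegrator.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Task:
    """A single unit of work (hypothetical name, for illustration only)."""
    name: str
    run: Callable[[float], float]


@dataclass
class Workflow:
    """An aggregation of tasks and/or nested workflows."""
    name: str
    steps: List[Union["Task", "Workflow"]] = field(default_factory=list)

    def execute(self, value: float) -> float:
        # Run the steps in order, feeding each output into the next step.
        for step in self.steps:
            if isinstance(step, Workflow):
                value = step.execute(value)
            else:
                value = step.run(value)
        return value

    def depth(self) -> int:
        # 1 for a flat workflow, plus 1 for each level of nesting.
        nested = [s.depth() for s in self.steps if isinstance(s, Workflow)]
        return 1 + max(nested, default=0)


# Two ordinary workflows, each aggregating tasks.
clean = Workflow("clean", [Task("offset", lambda x: x - 1.0)])
analyze = Workflow("analyze", [Task("scale", lambda x: x * 2.0)])

# A meta-workflow aggregating the two workflows.
meta = Workflow("meta", [clean, analyze])
result = meta.execute(5.0)
```

Treating tasks and workflows uniformly inside `steps` is one possible design; it lets a meta-workflow nest to any depth without a separate class.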

Previous Work
Other efforts:
– Business process workflow architectures (FlowMark, WSFL and BPEL), serving the business community
– Scientific workflow architectures: DAGMan, Taverna, SciFlo, Kepler, D2K, OGRE, CCA, Pegasus, GridFlow, GridAnt, Triana and GSFL
Comparison:
– Our work focuses on the simplicity of end-user interactions with information technologies while utilizing all execution mechanisms transparently (workflow by example).
– Our work creates provenance-to-recommendation pipelines for the benefit of a community (recommendations based on provenance information).
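One simple way to picture recommendations based on provenance information is to count which tool followed which in previously recorded sessions and suggest the most frequent successors. The slides do not describe the actual recommendation method, so the approach, the function names and the tool names below are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def build_successor_counts(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count, for every tool, which tool was executed right after it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for steps in sessions:
        for current, following in zip(steps, steps[1:]):
            counts[current][following] += 1
    return counts


def recommend(counts: Dict[str, Counter], tool: str, k: int = 2) -> List[str]:
    """Return up to k tools that most often followed `tool`."""
    return [name for name, _ in counts.get(tool, Counter()).most_common(k)]


# Hypothetical provenance: ordered tool names from three past sessions.
sessions = [
    ["load_raster", "mosaic", "threshold"],
    ["load_raster", "mosaic", "histogram"],
    ["load_raster", "reproject", "mosaic", "threshold"],
]
counts = build_successor_counts(sessions)
after_load = recommend(counts, "load_raster")
after_mosaic = recommend(counts, "mosaic")
```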

Research Topics
Data Translations: semantic and syntactic mapping of data structures
Provenance Information: granularity of gathered provenance information for recommendations, auditing and reconstruction
HCI: user interface design issues and community dependencies
Meta-Data: federation of distributed (data, tool, computational resource) registries
Execution: just-in-time data delivery with respect to remote computing; cost-benefit analysis of data transfer vs. CPU requirements; execution triggered by streaming data
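The cost-benefit question under Execution (data transfer vs. CPU requirements) can be made concrete with a back-of-the-envelope model. The model below is an assumption, not something stated in the slides: remote execution pays a transfer time of data size divided by bandwidth, and compute time is work divided by machine speed. All function names, parameters and numbers are hypothetical.

```python
def local_seconds(work_gflop: float, local_gflops: float) -> float:
    """Time to compute locally: no data transfer needed."""
    return work_gflop / local_gflops


def remote_seconds(data_gb: float, bandwidth_gb_per_s: float,
                   work_gflop: float, remote_gflops: float) -> float:
    """Time to ship the data to a remote machine and compute there."""
    return data_gb / bandwidth_gb_per_s + work_gflop / remote_gflops


def choose_site(data_gb: float, bandwidth_gb_per_s: float, work_gflop: float,
                local_gflops: float, remote_gflops: float) -> str:
    """Pick whichever site finishes sooner under this simple model."""
    local = local_seconds(work_gflop, local_gflops)
    remote = remote_seconds(data_gb, bandwidth_gb_per_s, work_gflop, remote_gflops)
    return "remote" if remote < local else "local"


# Heavy computation: the faster remote machine outweighs the transfer cost.
heavy = choose_site(data_gb=10, bandwidth_gb_per_s=0.1, work_gflop=10_000,
                    local_gflops=10, remote_gflops=100)

# Light computation: moving 10 GB costs more than computing in place.
light = choose_site(data_gb=10, bandwidth_gb_per_s=0.1, work_gflop=500,
                    local_gflops=10, remote_gflops=100)
```

A real scheduler would also have to account for queue wait times, result transfer and failures, which this sketch leaves out.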

Design

Design Goals
Make scientific discoveries easier
– Workflow by example (step-by-step experimentation)
– Design user-friendly interfaces
– Build seamless access to heterogeneous data/tools/resources
– Provide data and process provenance information
– Recommend data, tools and computational resources
– Derive higher level semantic tools

Meta-workflow Architecture

Implementation

Meta-Workflow Features
Workflow by example
Support of heterogeneous executors
– Workflows: GeoLearn, D2K, Kepler/Ptolemy
– Applications: MS Excel, Im2Learn, ArcGIS
– Web services: D2KWS
Provenance
– Gathering & Meta-data repositories
Recommendations
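Support of heterogeneous executors together with provenance gathering can be sketched as a registry that dispatches each step to an executor by kind and records what was run. This is a minimal illustration under stated assumptions: the `MetaWorkflowRunner` class, the executor kinds and the stand-in lambdas are hypothetical, and nothing here calls the real D2K, Kepler, Excel or ArcGIS interfaces.

```python
import time
from typing import Any, Callable, Dict, List

# An executor takes a tool name and input data and returns the output.
Executor = Callable[[str, Any], Any]


class MetaWorkflowRunner:
    """Dispatches steps to registered executors and logs provenance."""

    def __init__(self) -> None:
        self.executors: Dict[str, Executor] = {}
        self.provenance: List[Dict[str, Any]] = []

    def register(self, kind: str, executor: Executor) -> None:
        self.executors[kind] = executor

    def run_step(self, kind: str, tool: str, data: Any) -> Any:
        if kind not in self.executors:
            raise KeyError(f"no executor registered for kind {kind!r}")
        output = self.executors[kind](tool, data)
        # Record what ran, on what input, with what result, and when.
        self.provenance.append({
            "kind": kind,
            "tool": tool,
            "input": data,
            "output": output,
            "timestamp": time.time(),
        })
        return output


runner = MetaWorkflowRunner()
# Stand-in executors; real ones would wrap an application or a web service.
runner.register("application", lambda tool, data: [x * 2 for x in data])
runner.register("web_service", lambda tool, data: sum(data))

doubled = runner.run_step("application", "double_values", [1, 2, 3])
total = runner.run_step("web_service", "sum_values", doubled)
tools_used = [entry["tool"] for entry in runner.provenance]
```

Because every step passes through `run_step`, the provenance log is gathered as a side effect of execution rather than by a separate bookkeeping step.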

Meta-workflow Editor

Use Cases

Meta-Workflow R&D Drivers
Community drivers:
– Environmental Science: CLEANER
– Hydrological Science: CUAHSI
Science drivers:
– Environmental Modeling of Nutrient Distribution: Monte Carlo simulations of the maximum amount of pollution that a water body can receive each day and still retain its uses
– Understanding the Dynamic Evolution of Land-Surface Variables in the Illinois River Basin: data-driven analyses of multi-variable relationships from remote sensing data
Technology drivers:
– Collaboratory Cyberenvironments

Summary
The problem of designing a highly interactive scientific meta-workflow system is very complex.
Key capabilities of our meta-workflow prototype implementation, called CyberIntegrator, were demonstrated with two use cases.
We plan on building and deploying a practical tool for multiple communities.
Publications:
– Image Spatial Data Analysis Group at NCSA:
– URL:
Questions:
– Peter Bajcsy;

Hydro-informatics

Backup

Meta-workflow System Information

Terminology
Engines are stand-alone environments and applications that are used by many tools
– Examples: Matlab, MS Excel, D2K, Im2Learn, ArcGIS, Kepler
Tools are solutions specific to a problem and consist of several algorithms
– Examples: Image Calculator in Im2Learn, Pie chart visualization in MS Excel, …
Algorithms are code fragments that perform a specific operation in a tool
– Examples: image addition operation in Image Calculator
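The three terms above nest naturally: an engine hosts tools, and a tool consists of algorithms. A minimal sketch of that containment follows, using the slide's own example names as data. The classes and the `find_tools_with` helper are hypothetical and do not reflect how CyberIntegrator stores this information.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Algorithm:
    """A code fragment that performs one specific operation in a tool."""
    name: str


@dataclass
class Tool:
    """A problem-specific solution made of several algorithms."""
    name: str
    algorithms: List[Algorithm] = field(default_factory=list)


@dataclass
class Engine:
    """A stand-alone environment or application used by many tools."""
    name: str
    tools: List[Tool] = field(default_factory=list)

    def find_tools_with(self, algorithm_name: str) -> List[str]:
        # Names of the tools in this engine that contain the algorithm.
        return [
            tool.name
            for tool in self.tools
            if any(a.name == algorithm_name for a in tool.algorithms)
        ]


im2learn = Engine(
    "Im2Learn",
    [Tool("Image Calculator", [Algorithm("image addition")])],
)
matches = im2learn.find_tools_with("image addition")
```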

Environmental Science

Hydrological Science