
Presentation transcript:

• Scientific workflow management system based on Ptolemy II
• Allows scientists to visually design and execute scientific workflows
• Actor-oriented model with directors acting as the main workflow engine
• Enables different models of computation

• Modeling the flow of data from one step to another in a series of computations to achieve some scientific goal

• Software system for modeling, simulation, and design of concurrent, real-time, embedded systems, developed at UC Berkeley
• Objective: “The focus is on assembly of concurrent components. The key underlying principle in the project is the use of well-defined models of computation that govern the interaction between components. A major problem area being addressed is the use of heterogeneous mixtures of models of computation.”

 Directors  Actors  Ports  Relations

• Directors control execution of a workflow
• Actors are executable components of a workflow
• Directors govern execution of actors (scheduling, dispatching threads, etc.)

Actor-/Dataflow Orientation vs. Object-/Control-flow Orientation

• Every Kepler workflow needs a director
• Execute networks of components under multiple execution models
  › Synchronous vs. parallel vs. dataflow vs. time-based vs. event-based vs. all combined
• Computation model dictates semantics for component interaction
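As an illustration of the last point, here is a minimal Python sketch (not Kepler or Ptolemy II code) in which the same two steps run under two different execution models. The names `run_sdf_like` and `run_pn_like` are hypothetical and only loosely mirror a statically scheduled dataflow director and a process-network director; the steps themselves do not change, only the way they interact.

```python
import queue
import threading


def source(n):
    """Produce the integers 0..n-1."""
    return list(range(n))


def double(x):
    """Computation step shared by both execution models."""
    return 2 * x


def run_sdf_like(n):
    """Sequential, pre-planned schedule: for each token, fire double once."""
    return [double(token) for token in source(n)]


def run_pn_like(n):
    """Process-network style: each step is a thread, linked by blocking queues."""
    stop = object()  # sentinel marking the end of the stream
    q_in, q_out = queue.Queue(), queue.Queue()

    def producer():
        for token in source(n):
            q_in.put(token)
        q_in.put(stop)

    def worker():
        while True:
            token = q_in.get()  # blocks until a token is available
            if token is stop:
                q_out.put(stop)
                return
            q_out.put(double(token))

    threads = [threading.Thread(target=producer), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    results = []
    while True:
        item = q_out.get()
        if item is stop:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Both functions yield the same tokens for this simple chain; the difference lies in who decides when each step runs.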

• Make use of separation of concerns
  › e.g., component execution, workflow execution, and provenance tracking
• Managers act as a “common execution environment”
  › governing different concerns related to execution of the network and services

• CT – continuous time modeling
• DE – discrete event systems
• FSM – finite state machines
• PN – process networks
• SDF – synchronous dataflow
• DDF – dynamic dataflow
• SR – synchronous/reactive systems

• Reusable components that execute a variety of functions
• Communicate with other actors in the workflow through ports
• Composite actor – aggregation of actors
• Composite actor may have a local director

• Top-level workflows can be a conceptual representation of the science process
• Drilling down reveals increasing levels of detail
• Composing models using hierarchy promotes development of reusable components
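The idea of composite actors and hierarchy can be sketched in a few lines of Python. This is an illustrative toy, not the Kepler API: `FunctionActor` and `CompositeActor` are hypothetical names, and firing the inner actors in list order merely stands in for whatever a local director would do.

```python
class FunctionActor:
    """Atomic actor wrapping a single-argument function."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def fire(self, token):
        return self.func(token)


class CompositeActor:
    """Aggregates inner actors and exposes the same fire() interface."""

    def __init__(self, name, actors):
        self.name = name
        self.actors = list(actors)

    def fire(self, token):
        # Stand-in for a local director: fire the inner actors in order.
        for actor in self.actors:
            token = actor.fire(token)
        return token


# A composite can be nested inside another composite, giving a hierarchy.
normalize = CompositeActor(
    "normalize",
    [FunctionActor("strip", str.strip), FunctionActor("lower", str.lower)],
)
top_level = CompositeActor("top", [normalize, FunctionActor("length", len)])
```

Because a composite presents the same interface as an atomic actor, `normalize` can be reused in any workflow without exposing its internals.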

• Each actor implements several methods
  › initialize() – initializes state variables
  › prefire() – indicates if actor wants to fire
  › fire() – main point of execution
    • Read inputs, produce outputs, read parameter values
  › postfire() – update persistent state, see if execution complete
  › wrapup()
• Each director calls these methods according to its model
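The lifecycle above can be sketched in Python as follows. This is a simplified illustration rather than Ptolemy II code: `CounterActor` and `run` are hypothetical, and the simple loop in `run` is just one possible calling order, since each director decides for itself when to invoke these methods.

```python
class CounterActor:
    """Toy actor that emits 1, 2, ..., limit and then asks to stop."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self.outputs = []
        self.log = []  # records the order in which methods are called

    def initialize(self):
        self.log.append("initialize")
        self.count = 0
        self.outputs = []

    def prefire(self):
        self.log.append("prefire")
        return self.count < self.limit  # ready to fire?

    def fire(self):
        self.log.append("fire")
        self.outputs.append(self.count + 1)  # produce an output token

    def postfire(self):
        self.log.append("postfire")
        self.count += 1  # update persistent state
        return self.count < self.limit  # False means execution is complete

    def wrapup(self):
        self.log.append("wrapup")


def run(actor, max_iterations=100):
    """Hypothetical single-actor 'director' driving the lifecycle."""
    actor.initialize()
    try:
        for _ in range(max_iterations):
            if not actor.prefire():
                break
            actor.fire()
            if not actor.postfire():
                break
    finally:
        actor.wrapup()  # always called, even if firing raised an error
    return actor.outputs
```

Keeping state updates in `postfire()` rather than `fire()` is what lets a director call `fire()` more than once per iteration if its model requires it.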

• Copy actor – copy files from one resource to another during execution
  › Stage actor – local to remote host
  › Fetch actor – remote to local host
• Job execution actor – submit and run a remote job
• Monitoring actor – notify user of failures
• Service discovery actor – import web services from a service repository or web site
• RExpression actors
• MatlabExpression actors
• Web services actors – given a WSDL and the name of an operation of a web service, the actor dynamically customizes itself to implement and execute that operation
• Database connection and query actors

• Ports are used to produce and consume data and to communicate with other actors in the workflow
  › Input port – data consumed by actor
  › Output port – data produced by actor
  › Input/output port – data both produced and consumed

• Direct the same input or output to more than one port
• Example: direct output to
  › a display actor to show intermediate results, and
  › an operational actor for further processing
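A small Python sketch of this fan-out, under the assumption that a relation can be modeled as a list of destination ports. `InputPort`, `OutputPort`, and `connect` are hypothetical names, not Kepler classes.

```python
class InputPort:
    """Receives tokens; the owning actor would consume them when fired."""

    def __init__(self, name):
        self.name = name
        self.tokens = []


class OutputPort:
    """Sends each token to every connected input port."""

    def __init__(self, name):
        self.name = name
        self.destinations = []  # plays the role of the relation

    def connect(self, input_port):
        self.destinations.append(input_port)

    def send(self, token):
        for port in self.destinations:
            port.tokens.append(token)


result = OutputPort("result")
display_in = InputPort("display.input")   # shows intermediate results
process_in = InputPort("process.input")   # continues the computation
result.connect(display_in)
result.connect(process_in)
result.send(42)
```

After the `send` call both input ports hold the same token, so the display and the downstream processing step see identical data.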

• Execution options:
  › inside GUI
  › at command line
  › distributed computing

• Kepler components can be shared by exporting a workflow or component into a Kepler Archive (KAR) file (an extension of the JAR file format)
• The Component Repository is a centralized system for sharing Kepler workflows
• Users can search the repository for components from within Vergil
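Since JAR files are ZIP archives, a KAR file can presumably be inspected with any ZIP reader. The Python sketch below works under that assumption; it builds a toy archive in memory instead of reading a real KAR file, and the entry names are made up for illustration and are not the actual KAR layout.

```python
import io
import zipfile


def build_toy_archive(entries):
    """Create an in-memory ZIP archive from a {name: text} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in entries.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def list_archive(data):
    """Return the sorted entry names of a ZIP-compatible archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return sorted(archive.namelist())


# Hypothetical contents: a manifest plus one workflow description.
toy_kar = build_toy_archive(
    {
        "META-INF/MANIFEST.MF": "Manifest-Version: 1.0\n",
        "workflow.xml": "<entity/>",
    }
)
entry_names = list_archive(toy_kar)
```

To try this on a real file, `zipfile.ZipFile(path)` accepts a file path in place of the in-memory buffer.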

• Kepler provides direct access to scientific data archived in many commonly used data archives
  › e.g., access to data stored in a Knowledge Network for Biocomplexity (KNB) Metacat server and described using Ecological Metadata Language
• Additional supported data sources
  › DiGIR protocol, OPeNDAP protocol, GridFTP, JDBC, SRB, and others

• Kepler ships by default with:
  › Globus actors
  › GridFTP actors
• No BES implementation*
• Job submission to OpenPBS, gLite
• Kepler actors capable of using UNICORE, by Euforia (Poznań SC)
• TeraGrid gateways exist that use Kepler

• Actor data polymorphism:
  › Add numbers (int, float, double, complex)
  › Add strings (concatenation)
  › Add complex types (arrays, records, matrices)
  › Add user-defined types
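Data polymorphism means a single "add" actor handles all of these token types. A rough Python sketch of the idea follows; `add_tokens` and `Vector` are hypothetical, and the elementwise treatment of arrays and fieldwise treatment of records are assumptions of this sketch, not a statement of Kepler's exact rules.

```python
def add_tokens(a, b):
    """Add two tokens, dispatching on their type."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Records: add matching fields (both records need the same fields).
        if a.keys() != b.keys():
            raise ValueError("records must have the same fields")
        return {key: add_tokens(a[key], b[key]) for key in a}
    if isinstance(a, list) and isinstance(b, list):
        # Arrays: add element by element.
        if len(a) != len(b):
            raise ValueError("arrays must have the same length")
        return [add_tokens(x, y) for x, y in zip(a, b)]
    # Numbers, strings, and user-defined types that define __add__.
    return a + b


class Vector:
    """User-defined type that opts in to addition."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)
```

The actor's logic stays the same for every type; only the meaning of "add" is supplied by the data itself.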

• Distributed execution of workflow parts (peer to peer)
• Efficient data transfer
• Provenance tracking of data and processes
• Tracking workflow evolution
• Streaming data analysis
• Easy-to-deploy batch interfaces
• Intuitive workflow design
• Customizable semantic typing
• Interoperability with other workflow and analytical environments (at execution level)

• Ecology
  › SEEK: Ecological niche modeling and climate change
  › REAP: Modeling parasite invasions in grasslands using sensor networks
  › NEON: Ecological sensor networks; COMET: Environmental science
• Geosciences
  › GEON: LiDAR data processing, geological data integration
  › NEESit: Earthquake engineering
• Molecular biology
  › SDM: Gene promoter identification and ScalaBLAST
  › ChIP-chip: Genome-scale research; CAMERA: Metagenomics
• Oceanography
  › REAP: SST data processing; LOOKING/OOI CI: Ocean observing CI
  › ROADNet: Real-time data modeling and analysis
  › ATOL: Processing Phylodata; CIPRES: Phylogenetic tools
• Chemistry
  › Resurgence: Computational chemistry; DART/ARCHER: X-ray crystallography
• Library science
  › DIGARCH: Digital preservation; UK Text Mining Center: Cheshire feature and archival
• Conservation biology
  › SanParks: Thresholds of Potential Concerns
• Physics
  › SDM: Astrophysics TSI-1 and TSI-2; CPES: Plasma fusion simulation; ITER-EU: ITM fusion workflows