 Scientific workflow management system based on Ptolemy II  Allows scientists to visually design and execute scientific workflows  Actor-oriented.


1

2

3  Scientific workflow management system based on Ptolemy II  Allows scientists to visually design and execute scientific workflows  Actor-oriented model with directors acting as the main workflow engine  Enables different models of computation

4  Modeling flow of data from one step to another in a series of computations to achieve some scientific goal

5  Software system for modeling, simulation, and design of concurrent, real-time, embedded systems developed at UC Berkeley  Objective: “The focus is on assembly of concurrent components. The key underlying principle in the project is the use of well-defined models of computation that govern the interaction between components. A major problem area being addressed is the use of heterogeneous mixtures of models of computation.”

6

7  Directors  Actors  Ports  Relations

8  Directors control execution of workflow (scheduling, dispatching threads, etc.)  Actors are executable components of a workflow  Directors govern execution of actors

9 Actor-/Dataflow Orientation vs Object-/ Control flow Orientation

10  Every Kepler workflow needs a director  Execute networks of components under multiple execution models › Synchronous vs. Parallel vs. Dataflow vs. time-based vs. event-based vs. all combined  Computation model dictates semantics for component interaction

11  Make use of separation of concerns › e.g., component execution, workflow execution and provenance tracking  Managers act as a “common execution environment” › governing different concerns related to execution of network and services

12  CT – continuous time modeling  DE – discrete event systems  FSM – finite state machines  PN – process networks  SDF – synchronous dataflow  DDF – dynamic dataflow  SR – synchronous/reactive systems
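To make the SDF entry concrete, here is a minimal Python sketch of the idea that an SDF-style director can fix the firing order before execution starts. It is an illustration only, not Kepler or Ptolemy II code: the function name `sdf_schedule` and the actor names are hypothetical, and a real SDF scheduler also solves balance equations over token production and consumption rates, which this sketch leaves out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sdf_schedule(dependencies):
    """Return a fixed firing order in which every actor follows its upstream actors.

    `dependencies` maps an actor name to the set of actors that feed it.
    Simplification (assumption): every actor consumes and produces exactly
    one token per firing, so one firing per actor makes up one iteration.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-actor workflow: source -> scale -> display
workflow = {
    "source": set(),
    "scale": {"source"},
    "display": {"scale"},
}

schedule = sdf_schedule(workflow)
print(schedule)
```

Because the order is computed once up front, the director can repeat the same schedule for each iteration without any run-time scheduling decisions. Directors such as PN or DE decide what runs next while the workflow is executing.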

13  Reusable components that execute variety of functions  Communicate with other actors in workflow through ports  Composite actor – aggregation of actors  Composite actor may have a local director
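The composite-actor idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Kepler API: the classes `Scale`, `Offset` and `Composite` are hypothetical, and the simple loop inside `Composite.fire` stands in for a local director.

```python
class Scale:
    """Hypothetical atomic actor: multiplies each token by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def fire(self, token):
        return token * self.factor


class Offset:
    """Hypothetical atomic actor: adds a constant to each token."""

    def __init__(self, amount):
        self.amount = amount

    def fire(self, token):
        return token + self.amount


class Composite:
    """Aggregation of actors that looks like a single actor from outside."""

    def __init__(self, *inner):
        self.inner = inner

    def fire(self, token):
        # Stand-in for a local director: fire the inner actors in sequence.
        for actor in self.inner:
            token = actor.fire(token)
        return token


pipeline = Composite(Scale(2), Offset(3))   # computes 2 * x + 3
nested = Composite(pipeline, Scale(10))     # composites can contain composites
print(pipeline.fire(5), nested.fire(5))
```

Since a composite exposes the same `fire` interface as an atomic actor, it can be nested or reused wherever a single actor is expected.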

14  Top level workflows can be conceptual representation of science process  Drilling down reveals increasing levels of detail  Composing models using hierarchy promotes development of re-usable components

15  Each actor implements several methods › initialize() – initializes state variables › prefire() – indicates if actor wants to fire › fire() – main point of execution  Read inputs, produce outputs, read parameter values › postfire() – update persistent state, see if execution complete › wrapup()  Each director calls these methods according to its model
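The method sequence above can be sketched as a toy Python model. This is not the Ptolemy II Java API: the `Ramp` class and the `run` loop are hypothetical stand-ins, and the loop shows only one possible calling order that a director might use.

```python
class Ramp:
    """Toy source actor: emits init, init + step, ... for a fixed number of firings."""

    def __init__(self, init=0, step=1, limit=3):
        self.init, self.step, self.limit = init, step, limit
        self.outputs = []  # stands in for an output port

    def initialize(self):
        # Set up state variables before the first firing.
        self.value = self.init
        self.count = 0

    def prefire(self):
        # A source has no inputs to wait for, so it is always ready to fire.
        return True

    def fire(self):
        # Main point of execution: produce output, leave persistent state alone.
        self.outputs.append(self.value)

    def postfire(self):
        # Update persistent state; returning False signals that execution is complete.
        self.value += self.step
        self.count += 1
        return self.count < self.limit

    def wrapup(self):
        # Release per-run state at the end of execution.
        self.value = None


def run(actor):
    """Hypothetical single-actor director loop."""
    actor.initialize()
    try:
        while actor.prefire():
            actor.fire()
            if not actor.postfire():
                break
    finally:
        actor.wrapup()  # called even if a firing raises
    return actor.outputs


print(run(Ramp(init=0, step=2, limit=3)))
```

Keeping state changes in `postfire()` rather than `fire()` lets a director fire an actor more than once per iteration without corrupting its state.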

16  Copy actor – copy files from one resource to another during execution › Stage actor – local to remote host › Fetch actor – remote to local host  Job execution actor – submit and run a remote job  Monitoring actor – notify user of failures  Service discovery actor – import web services from a service repository or web site  RExpression actors  MatlabExpression actors  Web services actors – Given WSDL and name of an operation of a web service, dynamically customizes itself to implement and execute that method  Database connection and query actors

17  Ports used to produce and consume data and communicate with other actors in workflow › Input port – data consumed by actor › Output port – data produced by actor › Input/output port – data both produced and consumed

18  Direct same input or output to more than one port  Example: direct output to 1. display actor to show intermediate results, and 2. operational actor for further processing
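The fan-out example above can be sketched as follows. This is a toy Python model, assuming a hypothetical `Relation` class and plain lists standing in for input ports. It is not how Kepler implements relations.

```python
class Relation:
    """Toy relation: delivers each token to every connected input queue."""

    def __init__(self):
        self.sinks = []

    def connect(self, queue):
        self.sinks.append(queue)

    def send(self, token):
        for queue in self.sinks:
            queue.append(token)


display_in = []      # input of a display actor (intermediate results)
processing_in = []   # input of an operational actor (further processing)

relation = Relation()
relation.connect(display_in)
relation.connect(processing_in)

for token in (1.5, 2.5):
    relation.send(token)  # one output, two destinations

scaled = [t * 2 for t in processing_in]  # the "further processing" step
print(display_in, scaled)
```

Both consumers receive every token, so the intermediate values can be inspected without disturbing the downstream computation.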

19  Execution Options: › inside GUI › at command-line › distributed computing

20

21

22

23

24

25  Kepler components can be shared by exporting workflow or component into a Kepler Archive (KAR) file (extension of JAR file format)  Component Repository is centralized system for sharing Kepler workflows  Users can search for components from repository from within Vergil

26  Kepler provides direct access to scientific data archived in many commonly used data archives. › e.g., access to data stored in the Knowledge Network for Biocomplexity (KNB) Metacat server and described using Ecological Metadata Language.  Additional supported data sources › DiGIR protocol, OPeNDAP protocol, GridFTP, JDBC, SRB, and others.

27  Kepler ships by default with: › Globus actors › GridFTP actors  No BES implementation*  Job submission to OpenPBS, gLite  Kepler actors capable of using UNICORE by Euforia (Poznań SC)  TeraGrid gateways exist that use Kepler

28

29

30  Actor Data Polymorphism: › Add numbers (int, float, double, complex) › Add strings (concatenation) › Add complex types (arrays, records, matrices) › Add user-defined types
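Data polymorphism means a single Add actor works for any token type that defines addition. A minimal Python sketch follows. The `add_tokens` function and `Vec2` type are hypothetical, and Python's own `+` stands in for the Kepler type system. Complex types are left out because Python's list `+` concatenates, which need not match how an array-typed Add actor behaves.

```python
import operator
from functools import reduce


def add_tokens(*tokens):
    """Add tokens using whatever '+' their own types define."""
    if not tokens:
        raise ValueError("need at least one token")
    return reduce(operator.add, tokens)


class Vec2:
    """Hypothetical user-defined type that supplies its own addition."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


print(add_tokens(1, 2.5))            # int + float
print(add_tokens(1, 2 + 3j))         # int + complex
print(add_tokens("work", "flow"))    # strings: concatenation
print(add_tokens(Vec2(1, 2), Vec2(3, 4)) == Vec2(4, 6))  # user-defined type
```

The actor code never inspects the token type. Supporting a new type only requires that the type itself define addition.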

31

32  Distributed execution of workflow parts (peer to peer)  Efficient data transfer  Provenance tracking of data and processes  Tracking workflow evolution  Streaming data analysis  Easy-to-deploy batch interfaces  Intuitive workflow design  Customizable semantic typing  Interoperability with other workflow and analytical environments (at exec level)

33  Ecology › SEEK: Ecological Niche Modeling and climate change › REAP: Modeling parasite invasions in grasslands using sensor networks › NEON: Ecological sensor networks; COMET: Environmental science  Geosciences › GEON: LiDAR data processing, Geological data integration › NEESit: Earthquake engineering  Molecular biology › SDM: Gene promoter identification and ScalaBLAST › ChIP-chip: Genome-scale research; CAMERA: Metagenomics  Oceanography › REAP: SST data processing; LOOKING/OOI CI: ocean observing CI › ROADNet: real-time data modeling and analysis › ATOL: Processing Phylodata; CiPRES: Phylogenetic tools  Chemistry › Resurgence: Computational chemistry; DART/ARCHER: X-Ray crystallography  Library science › DIGARCH: Digital preservation; UK Text Mining Center: Cheshire feature and archival  Conservation biology › SanParks: Thresholds of Potential Concerns  Physics › SDM: astrophysics TSI-1 and TSI-2; CPES: Plasma fusion simulation; ITER-EU: ITM fusion workflows

