1
Programming Scientific and Distributed Workflow with Triana Services
Matthew Shields, GGF10 Workflow Workshop, 9th March
2
Matthew Shields, Cardiff University
Presentation Outline
- Triana Overview
- Triana services and their distribution
- Distribution policies
- The GAP interface and its relation to the GridLab GAT
- Scientific Workflow
- Binary Inspiral Algorithm Example
- Dynamic Distributed Workflow
- Service Composition on the Grid
- Service Usage, dynamically distributing a Triana workflow
- Conclusion
3
What is Triana?
4
Distributed Triana Work-flow
- Flexible distribution: based around Triana Groups
- HPC and Pipelined distribution
[Diagram labels: GAP; Any GAP service, e.g. Web service; Triana Distributed Work-flow Network; Action Commands; Workflow, e.g. BPEL4WS; Triana Engine; Triana Controlling Service (TCS); Triana Service & Engine (twice); Other Engine; Triana Gateway]
5
GAP Overview
- Based around a series of Java interface classes
- Concrete implementations form the GAP bindings
- The core interface covers:
  - Service Creation and Discovery
  - Pipe Creation and Discovery
  - Message Communication
  - Information
  - Job Submission
  - Data Management: transfers, logical lookup
- Will become an adapter for the GridLab Java GAT, providing:
  - Advertisement, discovery, deployment and communication of services
  - GRMS job submission adapter
  - Data Management Services
6
- Set of generic Java interfaces: high level abstractions to Grid services
- Factory design: dynamic pluggable services
[Diagram labels: Jxtaserve; GSI Enabled; NS-2; And more..; Java GAT Prototype; Jxta; GridLab GAT (www.gridlab.org); Advertising; Discovery; Communication; GAP (Java Prototype); Web Services; P2PS; Job Submission (GRMS); Generic Job Submission; Virtual filename data access; Data Management; OGSA (planned)]
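The idea of generic interfaces with pluggable bindings selected through a factory can be sketched roughly as follows. This is only an illustration written in Python rather than Java; `ServiceBinding`, `InMemoryBinding`, `create_binding` and the endpoint strings are hypothetical names and do not come from the real GAP API.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ServiceBinding(ABC):
    """Hypothetical stand-in for a GAP-style interface; not the real GAP API."""

    @abstractmethod
    def advertise(self, name: str, endpoint: str) -> None:
        """Make a service visible to other peers."""

    @abstractmethod
    def discover(self, name: str) -> list[str]:
        """Return the endpoints advertised under a service name."""


class InMemoryBinding(ServiceBinding):
    """Toy concrete binding that keeps advertisements in a local dict."""

    def __init__(self) -> None:
        self._registry: dict[str, list[str]] = {}

    def advertise(self, name: str, endpoint: str) -> None:
        self._registry.setdefault(name, []).append(endpoint)

    def discover(self, name: str) -> list[str]:
        return list(self._registry.get(name, []))


# Factory table: callers ask for a binding by name and never touch the
# concrete class, so bindings can be swapped without changing caller code.
BINDINGS: dict[str, type[ServiceBinding]] = {"memory": InMemoryBinding}


def create_binding(kind: str) -> ServiceBinding:
    return BINDINGS[kind]()


binding = create_binding("memory")
binding.advertise("triana-service", "host-a:8080")
binding.advertise("triana-service", "host-b:8080")
endpoints = binding.discover("triana-service")
```

A new binding would be registered by adding one more entry to `BINDINGS`; code written against `ServiceBinding` stays unchanged.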
7
Triana Prototype
- Distributed Triana Prototype
  - Based around Triana Groups, i.e. aggregate tools
  - Each group can be distributed
- Distribution policies:
  - HTC: high throughput/task farming
  - Pipeline: allows node-to-node communication
- Each service can be a gateway to finer granularities of distribution
[Diagram labels: Pipeline Distribution; Task-Farming Distribution; Triana Service]
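The difference between the two policies can be sketched as below. This is a sequential simulation under simplifying assumptions, not Triana's scheduler: a group is modelled as a list of functions, services are just integer ids, and `task_farm` and `pipeline` are hypothetical names.

```python
from typing import Callable, Sequence

Stage = Callable[[float], float]


def run_group(stages: Sequence[Stage], item: float) -> float:
    """Run every tool of a group, in order, on one data item."""
    for stage in stages:
        item = stage(item)
    return item


def task_farm(stages: Sequence[Stage], items: Sequence[float], n_services: int):
    """HTC policy: every service holds a full copy of the group and
    data items are dealt out round-robin."""
    assignments = {service: [] for service in range(n_services)}
    for index, item in enumerate(items):
        assignments[index % n_services].append((index, run_group(stages, item)))
    return assignments


def pipeline(stages: Sequence[Stage], items: Sequence[float]):
    """Pipeline policy: stage k lives on service k, and each item is
    handed from node to node."""
    trace = []
    for item in items:
        hops = []
        for service, stage in enumerate(stages):
            item = stage(item)
            hops.append((service, item))
        trace.append(hops)
    return trace


group = [lambda x: x + 1, lambda x: x * 2]
farmed = task_farm(group, [1.0, 2.0, 3.0], n_services=2)
piped = pipeline(group, [1.0])
```

In the task farm each item touches exactly one service; in the pipeline each item touches every service once.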
8
Triana Workflow
- Triana is inherently flow based
  - Data flow: data arriving at a component triggers execution
  - Control flow: control commands trigger execution
- Decentralised execution
  - Data or control messages sent along communication “pipes” from sender to receiver cause the receiver to execute
  - Synchronous or asynchronous messaging (implementation dependent)
  - Multiple inputs can block or trigger immediately (defined by the component designer)
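A minimal sketch of data-driven execution, assuming synchronous message delivery: a component fires when data arrives, and a flag decides whether it blocks until all inputs are present or triggers immediately. The `Component` class and its methods are hypothetical and are not Triana's actual classes.

```python
class Component:
    """Toy flow-based unit: receiving data on a port may trigger execution."""

    def __init__(self, name, n_inputs, func, wait_for_all=True):
        self.name = name
        self.func = func                  # called with the list of available inputs
        self.wait_for_all = wait_for_all  # True: block until every port has data
        self.inputs = [None] * n_inputs
        self.filled = [False] * n_inputs
        self.pipes = []                   # (receiver, receiver_port) pairs
        self.outputs = []                 # record of produced results

    def connect(self, receiver, port):
        self.pipes.append((receiver, port))

    def receive(self, port, value):
        self.inputs[port] = value
        self.filled[port] = True
        if self.wait_for_all and not all(self.filled):
            return                        # blocking policy: keep waiting
        self.fire()

    def fire(self):
        args = [v for v, ok in zip(self.inputs, self.filled) if ok]
        result = self.func(args)
        self.outputs.append(result)
        self.filled = [False] * len(self.filled)
        for receiver, port in self.pipes:  # push result down each pipe
            receiver.receive(port, result)


adder = Component("adder", 2, sum)
sink = Component("sink", 1, lambda args: args[0])
adder.connect(sink, 0)

adder.receive(0, 3)                       # only one input present: adder blocks
outputs_after_first = list(sink.outputs)
adder.receive(1, 4)                       # second input arrives: adder fires

eager = Component("eager", 2, sum, wait_for_all=False)
eager.receive(0, 5)                       # triggers immediately on first input
```

No central scheduler appears in the sketch: each component runs because its sender pushed data to it, which is the decentralised behaviour the slide describes.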
9
Components and Definitions
- A component is the unit of execution
- Components are defined in XML files:
  - Naming information
  - Input and output ports
  - Parameter information
- Why components?
  - To simplify the application design process and to speed up application development
  - The component model provides an infrastructure for the interaction of components
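To make the three kinds of information concrete, here is a small sketch that reads a component definition from XML. The element and attribute names (`tool`, `inport`, `outport`, `param`) and the example values are invented for illustration; the actual Triana schema is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical component definition: naming, ports, and parameters.
COMPONENT_XML = """
<tool name="FFT" package="signalproc">
  <inport index="0" type="SampleSet"/>
  <outport index="0" type="Spectrum"/>
  <param name="window" value="hanning"/>
</tool>
"""


def load_component(xml_text):
    """Parse one component definition into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "package": root.get("package"),
        "inputs": [port.get("type") for port in root.findall("inport")],
        "outputs": [port.get("type") for port in root.findall("outport")],
        "params": {p.get("name"): p.get("value") for p in root.findall("param")},
    }


component = load_component(COMPONENT_XML)
```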
10
Taskgraph
- Internal object-based workflow graph representation
  - Taskgraph: DAG
  - Tasks
  - Connections
- External XML representation
  - Simple XML syntax
  - List of participating Task definitions
  - Parent/Child connection
  - Hierarchical (Compound components)
- Alternative Languages & Syntax, e.g. BPEL4WS
  - Available through pluggable readers & writers
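A taskgraph of tasks and parent/child connections can be sketched as a plain list of names and a list of pairs. The function below derives a valid execution order with Kahn's topological sort and rejects graphs that are not DAGs. The task names are illustrative, and `execution_order` is a hypothetical helper rather than part of Triana.

```python
from collections import deque


def execution_order(tasks, connections):
    """Return tasks ordered so that every parent precedes its children."""
    indegree = {task: 0 for task in tasks}
    children = {task: [] for task in tasks}
    for parent, child in connections:
        children[parent].append(child)
        indegree[child] += 1

    ready = deque(task for task in tasks if indegree[task] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("taskgraph contains a cycle, so it is not a DAG")
    return order


tasks = ["Wave", "Gaussian", "FFT", "Grapher"]
connections = [("Wave", "Gaussian"), ("Gaussian", "FFT"), ("FFT", "Grapher")]
order = execution_order(tasks, connections)
```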
11
Workflow
- No explicit language support for control constructs
- Loops and execution branching handled by components
  - Loop component: controls loop over sub-workflow
  - Logical component: controls workflow branching
- Unlike BPEL4WS or similar
- Flexibility of control: constraint-based loops etc.
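Handling control flow in components rather than in the workflow language might look like the sketch below, where a sub-workflow is modelled as a function and the loop constraint as a predicate. `loop_component`, `logical_component` and the `max_iterations` guard are assumptions made for the example, not Triana tools.

```python
def loop_component(sub_workflow, keep_going, data, max_iterations=1000):
    """Re-run a sub-workflow on its own output while the constraint holds."""
    iterations = 0
    while keep_going(data) and iterations < max_iterations:
        data = sub_workflow(data)
        iterations += 1
    return data, iterations


def logical_component(condition, if_true, if_false, data):
    """Route data down one of two branches depending on a test."""
    branch = if_true if condition(data) else if_false
    return branch(data)


# Constraint-based loop: keep halving until the value drops to 1.0 or below.
value, count = loop_component(lambda x: x / 2, lambda x: x > 1.0, 40.0)

# Branch on the loop result.
label = logical_component(lambda x: x < 1.0,
                          lambda x: "small",
                          lambda x: "large",
                          value)
```

Because the constraint is an ordinary callable, any stopping rule can be plugged in without changing the workflow language itself.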
12
Distributing Triana Workflow
- Deploying Remote Services on Resources
  - Service application installation
  - Service execution
  - Service discovery
- Mapping tasks or groups of tasks to Services
- Workflow rewiring: XML definition for connections modified for remote location; sub-workflows duplicated
- Data distribution: annotated sub-sections of taskgraph passed to resources
13
GEO 600 Inspiral Search
Background
- Compact binary stars orbiting each other in a close orbit are among the most powerful sources of gravitational waves
- As the orbital radius decreases, a characteristic chirp waveform is produced: amplitude and frequency increase with time until eventually the two bodies merge together
Computing
- Need 10 Gigaflops to keep up with real time data (modest search..)
Data
- 8 kHz in 24-bit resolution (stored in 4 bytes) -> signal contained within 1 kHz = 2000 samples/second
- Divided into chunks of 15 minutes in duration (i.e. 900 seconds) = 8MB
Algorithm
- Data is transmitted to a node
- Node initialises, i.e. generates its templates (around 10000)
- Node fast correlates its templates with data
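The chunk size can be checked with a few lines of arithmetic. The sketch assumes that the 2000 samples/second figure is the Nyquist rate for a 1 kHz band and that each sample occupies 4 bytes, as the slide states. Under those assumptions a 15-minute chunk comes to about 7.2 MB, the same order as the 8MB quoted on the slide.

```python
SIGNAL_BANDWIDTH_HZ = 1_000                # signal contained within 1 kHz
SAMPLE_RATE_HZ = 2 * SIGNAL_BANDWIDTH_HZ   # Nyquist rate: 2000 samples/second
BYTES_PER_SAMPLE = 4                       # 24-bit samples stored in 4 bytes
CHUNK_SECONDS = 15 * 60                    # 15 minutes = 900 seconds

samples_per_chunk = SAMPLE_RATE_HZ * CHUNK_SECONDS
chunk_bytes = samples_per_chunk * BYTES_PER_SAMPLE
chunk_megabytes = chunk_bytes / 1e6        # decimal megabytes
```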
14
Coalescing Binary Search
- GEO 600 Coalescing Binary Search Algorithm implemented as a Triana workflow
15
Coalescing Binary Scenario
[Diagram labels: GridLab Test-bed; GW Data; Distributed Storage; Logical File Name; CB Search Controller; GAT (GRMS, Adaptive); GAT (Data Management); Submit Job; Optimised Mapping; Email, SMS notification]
16
[Diagram labels: GRMS Web Service; rage1.man.poznan.pl; GridLab Testbed; GAP; Triana Service; Job Submission]
17
Triana GRMS Component
- Front end to GridLab GRMS Web Service
  - Job Submission Service: interfaces with GRAM
- GAP Web Service binding + GSI Authentication
  - Java CoG Kit X509 Certificate handling
  - Axis authentication & communication
- GRMS executes applications on GridLab Testbed
  - Heterogeneous hardware platforms
  - Default software: Globus 2.4, GSISSH, cc, cvs, c++, F90, make, perl, mpicc
18
Service Composition Workflow
- Multiple GRMS Components
  - Install Applications (ftp, tar, ant)
  - Start installed Triana Services
19
Dynamic Distributed Workflow
- Distribution units are standard Triana tools, enabling users to create their own custom distributions
- The workflow is cloned/split/rewired to achieve the required distribution topology
- Custom distribution units allow sub-workflows to be distributed in parallel or pipelined
[Diagram labels: Distribution Unit; WaveGrapher; Gaussian; FFT; Gaussian; FFT; Remote Services; Local Triana]
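Cloning and rewiring for a parallel distribution could be sketched as follows. A group of tasks is copied once per remote service, and every connection that touches the group is duplicated to point at each copy. The `distribute_parallel` function, the `task@service` naming scheme, and the service names are all hypothetical; Triana's actual distribution units are not reproduced here.

```python
def clone_name(task, service):
    """Name given to the copy of a task placed on a service (illustrative)."""
    return f"{task}@{service}"


def distribute_parallel(tasks, connections, group, services):
    """Clone `group` onto every service and rewire connections to the clones."""
    in_group = set(group)
    new_tasks = [task for task in tasks if task not in in_group]
    placement = {}
    for service in services:
        for task in group:
            clone = clone_name(task, service)
            new_tasks.append(clone)
            placement[clone] = service

    new_connections = []
    for parent, child in connections:
        if parent not in in_group and child not in in_group:
            new_connections.append((parent, child))   # untouched local link
            continue
        for service in services:                      # one link per clone
            new_connections.append((
                clone_name(parent, service) if parent in in_group else parent,
                clone_name(child, service) if child in in_group else child,
            ))
    return new_tasks, new_connections, placement


tasks = ["Wave", "Gaussian", "FFT", "Grapher"]
connections = [("Wave", "Gaussian"), ("Gaussian", "FFT"), ("FFT", "Grapher")]
new_tasks, new_connections, placement = distribute_parallel(
    tasks, connections, group=["Gaussian", "FFT"], services=["svc1", "svc2"]
)
```

The local tasks stay where they are, and only the connections that cross the group boundary fan out to, or in from, the remote copies.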
20
Conclusion
[Diagram labels: GridLab Test-bed; GW Data; Distributed Storage; Logical File Name; CB Search Controller; GAT (GRMS, Adaptive); GAT (Data Management); Submit Job; Optimised Mapping; Email, SMS notification]
21
Conclusion
- Shown three distinct workflows
  - Service composition workflow to submit grid jobs that deploys multiple Triana Services on remote resources
  - Local scientific workflow representing the algorithm
  - Dynamic distributed workflow: rewire local workflow for data parallelism across multiple Triana Services
- GAP API
  - Web Service binding + GSI: Grid Job Submission
  - P2PS binding: service discovery + service communication
- Combined to perform parallel scientific computation
22
Thanks!
- The Astronomers: Prof. B Sathyaprakash, David Churches, Roger Philp and Craig Robinson
- The Triana team: Ian Wang, Andrew Harrison, Omer Rana, Diem Lam and Shalil Majithia
- All the partners in the GridLab project
23
Thanks!
Information & Software
- http://www.trianacode.org/
- http://www.gridlab.org/