Programming Scientific and Distributed Workflow with Triana Services Matthew Shields, GGF10 Workflow Workshop, 9th March.

Slides:



Advertisements
Similar presentations
Pulan Yu School of Informatics Indiana University Bloomington Web service based Varuna.Net.
Advertisements

Abstraction Layers Why do we need them? –Protection against change Where in the hourglass do we put them? –Computer Scientist perspective Expose low-level.
A Workflow Engine with Multi-Level Parallelism Supports Qifeng Huang and Yan Huang School of Computer Science Cardiff University
Under the Hood of a Workflow Manager Matthew Shields, BiodiversityWorld GRID workshop, NeSC, 30 June - 1 July T r a ai n.
Programming Workflow with Triana Services Matthew Shields, SC4DEVO Workshop, July 2004.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 12 Slide 1 Distributed Systems Design 2.
BiodiversityWorld GRID Workshop NeSC, Edinburgh – 30 June and 1 July 2005 Resource wrappers, web services, grid services Jaspreet Singh School of Computer.
GridFlow: Workflow Management for Grid Computing Kavita Shinde.
CompuNet Grid Computing Milena Natanov Keren Kotlovsky Project Supervisor: Zvika Berkovich Lab Chief Engineer: Dr. Ilana David Spring, /
4b.1 Grid Computing Software Components of Globus 4.0 ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4b.
The Open Grid Service Architecture (OGSA) Standard for Grid Computing Prepared by: Haoliang Robin Yu.
Instrumentation and Measurement CSci 599 Class Presentation Shreyans Mehta.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
GMD German National Research Center for Information Technology Innovation through Research Jörg M. Haake Applying Collaborative Open Hypermedia.
SUN HPC Consortium, Heidelberg 2004 Grid(Lab) Resource Management System (GRMS) and GridLab Services Krzysztof Kurowski Poznan Supercomputing and Networking.
SOA, BPM, BPEL, jBPM.
Grid Workflow within Triana Ian Wang Cardiff University.
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
Triana Dr Matthew Shields, School of Computer Science, Cardiff University A Problem Solving Environment and Grid Workflow Tool.
 Cloud computing  Workflow  Workflow lifecycle  Workflow design  Workflow tools : xcp, eucalyptus, open nebula.
RUNNING PARALLEL APPLICATIONS BEYOND EP WORKLOADS IN DISTRIBUTED COMPUTING ENVIRONMENTS Zholudev Yury.
DISTRIBUTED COMPUTING
Cluster Reliability Project ISIS Vanderbilt University.
Flexibility and user-friendliness of grid portals: the PROGRESS approach Michal Kosiedowski
WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski Poznan Supercomputing.
Triana: Service-Oriented Examples Ian Taylor Cardiff University, and the Center for Computation and Technology LSU.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
NeSC Apps Workshop July 20 th, 2002 Customizable command line tools for Grids Ian Kelley + Gabrielle Allen Max Planck Institute for Gravitational Physics.
Grids and Portals for VLAB Marlon Pierce Community Grids Lab Indiana University.
23:48:11Service Oriented Cyberinfrastructure Lab, Grid Portals Fugang Wang April 29
COMP3019 Coursework: Introduction to GridSAM Steve Crouch School of Electronics and Computer Science.
© 2008 Open Grid Forum Independent Software Vendor (ISV) Remote Computing Primer Steven Newhouse.
Workflow and Triana Services Matthew Shields, e-Science Workflow Services, 3-5 December T r a ai n.
11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)
Interoperability between Scientific Workflows Ahmed Alqaoud, Ian Taylor, and Andrew Jones Cardiff University 10/09/2008.
Evaluation of Agent Teamwork High Performance Distributed Computing Middleware. Solomon Lane Agent Teamwork Research Assistant October 2006 – March 2007.
Issues in (Financial) High Performance Computing John Darlington Director Imperial College Internet Centre Fast Financial Algorithms and Computing 4th.
Constructing Data Mining Applications based on Web Services Composition Ali Shaikh Ali and Omer Rana
Resource Brokering in the PROGRESS Project Juliusz Pukacki Grid Resource Management Workshop, October 2003.
PROGRESS: ICCS'2003 GRID SERVICE PROVIDER: How to improve flexibility of grid user interfaces? Michał Kosiedowski.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
Tool Integration with Data and Computation Grid GWE - “Grid Wizard Enterprise”
GSFL: A Workflow Framework for Grid Services Sriram Krishnan Patrick Wagstrom Gregor von Laszewski.
Ian Taylor, Cardiff Work-Flow Application Toolkit Eger Meeting Ian Taylor & Ian Wang Cardiff University, UK.
Holding slide prior to starting show. A Portlet Interface for Computational Electromagnetics on the Grid Maria Lin and David Walker Cardiff University.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
What is Triana?. GAPGAP Triana Distributed Work-flow Network Action Commands Workflow, e.g. BPEL4WS Triana Engine Triana Controlling Service (TCS) Triana.
ICCS WSES BOF Discussion. Possible Topics Scientific workflows and Grid infrastructure Utilization of computing resources in scientific workflows; Virtual.
Distributed Computing With Triana A Short Course Matthew Shields, Ian Taylor & Ian Wang.
Enabling Grids for E-sciencE Astronomical data processing workflows on a service-oriented Grid architecture Valeria Manna INAF - SI The.
Using Federated Services with Triana Matthew Shields Cardiff University.
A Demonstration of Collaborative Web Services and Peer-to-Peer Grids Minjun Wang Department of Electrical Engineering and Computer Science Syracuse University,
Glen Dobson, Lancaster University Service Grids Workshop NeSC Edinburgh 23/7/04 Endpoint Services Glen Dobson Lancaster University,
Dr. Rebhi S. Baraka Advanced Topics in Information Technology (SICT 4310) Department of Computer Science Faculty of Information Technology.
August 2003 At A Glance The IRC is a platform independent, extensible, and adaptive framework that provides robust, interactive, and distributed control.
International Symposium on Grid Computing (ISGC-07), Taipei - March 26-29, 2007 Of 16 1 A Novel Grid Resource Broker Cum Meta Scheduler - Asvija B System.
Globus and PlanetLab Resource Management Solutions Compared M. Ripeanu, M. Bowman, J. Chase, I. Foster, M. Milenkovic Presented by Dionysis Logothetis.
On Using BPEL Extensibility to Implement OGSI and WSRF Grid Workflows Aleksander Slomiski Presented by Onyeka Ezenwoye CIS Advanced Topics in Software.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
Holding slide prior to starting show. Lessons Learned from the GECEM Portal David Walker Cardiff University
© Geodise Project, University of Southampton, Workflow Support for Advanced Grid-Enabled Computing Fenglian Xu *, M.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Holding slide prior to starting show. Processing Scientific Applications in the JINI-Based OGSA-Compliant Grid Yan Huang.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Overview on the work performed during EPIKH Training Faiza MEDJEK /INFN, CATANIA 1.
The Open Grid Service Architecture (OGSA) Standard for Grid Computing
Unified Modeling Language
A General Approach to Real-time Workflow Monitoring
GGF10 Workflow Workshop Summary
Presentation transcript:

Programming Scientific and Distributed Workflow with Triana Services Matthew Shields, GGF10 Workflow Workshop, 9th March

Matthew Shields, Cardiff University Presentation Outline Triana Overview Triana services and their distribution Distribution policies The GAP interface and its relation to the Gridlab GAT Scientific Workflow Binary Inspiral Algorithm Example Dynamic Distributed Workflow Service Composition on the Grid Service Usage, dynamically distributing a Triana workflow Conclusion

Matthew Shields, Cardiff University What is Triana?

Matthew Shields, Cardiff University GAPGAP Any GAP service e.g. Web service Triana Distributed Work-flow Network Action Commands Workflow, e.g. BPEL4WS Triana Engine Triana Controlling Service (TCS) Triana Service & Engine Triana Service & Engine Other Engine Distributed Triana Work-flow - flexible distribution: based around Triana Groups - HPC and Pipelined distribution Triana Gateway

Matthew Shields, Cardiff University GAP Overview based around a series of Java interface classes Concrete implementations that form the GAP bindings The core interface is the Service Creation and Discovery Pipe Creation and Discovery Message Communication Information Job Submmission Data Management - transfers - logical lookup Will be become an adapter for the GridLab Java GAT, providing: Advertisement, Discovery, deployment and communication of services GRMS job submission adapter Data Management Services

Matthew Shields, Cardiff University Jxtaserve GSI Enabled NS-2 And more.. Java GAT Prototype Jxta GridLab GAT ( Advertising Discovery Communication GAP (Java Prototype) Web Services P2PS Job Submission (GRMS) Generic Job Submission Virtual filename data access Data Management Set of generic Java interfaces high level abstractions to Grid services Factory design – dynamic pluggable services OGSA (planned)

Matthew Shields, Cardiff University Triana Prototype Distributed Triana Prototype Based around Triana Groups i.e. aggregate tools Each group can be distributed Distribution policies: HTC - high throughput/task farming Pipeline - allow node to node communication Each service can be a gateway to finer granularities of distribution: Pipeline Distribution Task-Farming Distribution Triana Service

Matthew Shields, Cardiff University Triana Workflow Triana is inherently flow based Data flow - data arriving at component triggers execution Control flow - control commands trigger execution Decentralised execution Data or Control messages sent along communication “pipes” from sender to receiver causes receiver to execute Synchronous or Asynchronous messaging (Implementation dependant) Multiple inputs can block or trigger immediately (Component designer defined)

Matthew Shields, Cardiff University Components and Definitions Component is unit of execution Components are defined in XML files: Naming information Input and output ports Parameter information Why Components? To simplify the application design process and to speed up application development The component model provides an infrastructure for the interaction of components

Matthew Shields, Cardiff University Taskgraph Internal object based workflow graph representation Taskgraph - DAG Tasks Connections External XML representation Simple XML syntax List of participating Task definitions Parent/Child connection Hierarchical (Compound components) Alternative Languages & Syntax e.g. BPEL4WS Available through pluggable readers & writers.

Matthew Shields, Cardiff University Workflow No explicit language support for control constructs Loops and execution branching handled by components Loop component - controls loop over sub-workflow Logical component - control workflow branching Unlike BPEL4WS or similar Flexibility of control - constraint based loops etc…

Matthew Shields, Cardiff University Distributing Triana Workflow Deploying Remote Services on Resources Service application installation Service execution Service discovery Mapping tasks or groups of tasks to Services Workflow rewiring, XML definition for connections modified for remote location - sub-workflows duplicated Data distribution, annotated sub-sections of taskgraph passed to resources

Matthew Shields, Cardiff University GEO 600 Inspiral Search Background Compact binary stars orbiting each other in a close orbit among the most powerful sources of gravitational waves As the orbital radius decreases a characteristic chirp waveform is produced - amplitude and frequency increase with time until eventually the two bodies merge together Computing Need 10 Gigaflops to keep up with real time data (modest search..) Data 8kHz in 24-bit resolution (stored in 4 bytes) -> Signal contained within 1 kHz = 2000 samples/second divided into chunks of 15 minutes in duration (i.e. 900 seconds) = 8MB Algorithm Data is transmitted to a node Node initialises i.e. generates its templates (around 10000) fast correlates its templates with data

Matthew Shields, Cardiff University Coalescing Binary Search GEO 600 Coalescing Binary Search Algorithm implemented as a Triana workflow

Matthew Shields, Cardiff University Coalescing Binary Scenario Gridlab Test-bed GW Data Distributed Storage Logical File Name CB Search Controller GAT (GRMS, Adaptive) GW Data GAT (Data Management) Submit Job Optimised Mapping , SMS notification

Matthew Shields, Cardiff University GRMS Web Service rage1.man.poznan.pl Gridlab Testbed GAP Triana Service Job Submission

Matthew Shields, Cardiff University Triana GRMS Component Front end to GridLab GRMS Web Service Job Submission Service - interfaces with GRAM GAP Web Service binding + GSI Authentication Java CoG Kit X509 Certificate handling Axis authentication & communication GRMS executes applications on GridLab Testbed Heterogeneous hardware platforms Default software - Globus 2.4, GSISSH, cc, cvs, c++, F90, make, perl, mpicc

Matthew Shields, Cardiff University Service Composition Workflow Multiple GRMS Components Install Applications (ftp, tar, ant) Start installed Triana Services

Matthew Shields, Cardiff University Dynamic Distributed Workflow Distribution units are standard Triana tools, enabling users to create their own custom distributions Distribution Unit WaveGrapher Gaussian FFT Gaussian FFT Remote Services Local Triana The workflow is cloned/split/rewired to achieve the required distribution topology Custom distribution units allow sub- workflows to be distributed in parallel or pipelined

Matthew Shields, Cardiff University Conclusion Gridlab Test-bed GW Data Distributed Storage Logical File Name CB Search Controller GAT (GRMS, Adaptive) GW Data GAT (Data Management) Submit Job Optimised Mapping , SMS notification

Matthew Shields, Cardiff University Conclusion Shown three distinct workflows Service composition workflow to submit grid jobs that deploys multiple Triana Services on remote resources Local scientific workflow representing the algorithm Dynamic distributed workflow - rewire local workflow for data parallelism across multiple Triana Services GAP API Web Service binding + GSI - Grid Job Submission P2PS binding - service discovery + service communication Combined to perform parallel scientific computation

Matthew Shields, Cardiff University Thanks ! The Astronomers: Prof. B Sathyaprakash, David Churches, Roger Philp and Craig Robinson The Triana team: Ian Wang, Andrew Harrison, Omer Rana, Diem Lam and Shalil Majithia All the partners in the GridLab project

Matthew Shields, Cardiff University Thanks ! Information & Software