18 May 2006 CCGrid2006 Dynamic Workflow Management Using Performance Data Lican Huang, David W. Walker, Yan Huang, and Omer F. Rana Cardiff School of Computer.

Presentation transcript:

18 May 2006, CCGrid2006
Dynamic Workflow Management Using Performance Data
Lican Huang, David W. Walker, Yan Huang, and Omer F. Rana
Cardiff School of Computer Science
Presented by Omer F. Rana

Outline of Talk
Background and introduction.
The WOSE architecture for dynamic Web services.
Performance experiments and results.
Summary and conclusions.

The WOSE Project
The Workflow Optimisation Services for e-Science Applications (WOSE).
Funded by EPSRC Core e-Science Programme.
Collaboration between:
– Cardiff University
– Imperial College (Prof John Darlington)
– Daresbury Lab (Drs Martyn Guest and Robert Allan)

Workflow Optimisation
Types of workflow optimisation:
– Through service selection
– Through workflow re-ordering
– Through exploitation of parallelism
When is optimisation performed?
– At design time (early binding)
– Upon submission (intermediate binding)
– At runtime (late binding)

Service Binding Models
Late binding of an abstract service to a concrete service instance means:
– We use up-to-date information to decide which service to use when there are multiple semantically equivalent services.
– We are less likely to try to use a service that is unavailable.

Late Binding Case
Search registry for all services that are consistent with the abstract service description.
Select optimal service based on current information, e.g., host load.
Execute this service. If it is not currently available then try the next best service.
Doesn’t take into account time to transfer inputs to the service.
In early and late binding cases we can optimise overall workflow.
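
The late binding steps above can be sketched as follows. This is a minimal illustration, not the WOSE implementation: the registry is assumed to be a plain dictionary keyed by service name, and `load_of`, `invoke`, and `ServiceUnavailable` are hypothetical stand-ins for the performance query, the service call, and the failure signal.

```python
from typing import Callable, Dict, List, Optional


class ServiceUnavailable(Exception):
    """Hypothetical error raised when a service instance cannot be reached."""


def late_bind_and_invoke(
    abstract_name: str,
    registry: Dict[str, List[str]],
    load_of: Callable[[str], float],
    invoke: Callable[[str], str],
) -> Optional[str]:
    """Select the least-loaded matching service, falling back to the next best."""
    # Step 1: find every concrete service registered under the abstract name.
    candidates = registry.get(abstract_name, [])
    # Step 2: rank candidates by current load, lowest load first.
    ranked = sorted(candidates, key=load_of)
    # Step 3: invoke in ranked order; skip any instance that is unavailable.
    for endpoint in ranked:
        try:
            return invoke(endpoint)
        except ServiceUnavailable:
            continue
    # No candidate could be invoked (or none was registered).
    return None
```

Note that, as on the slide, the ranking looks only at host load and ignores the time needed to transfer inputs to the chosen host.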

WOSE Architecture
Work at Cardiff has focused on implementing a late binding model for dynamic service discovery, based on a generic service proxy, and service discovery and optimisation services.
[Architecture diagram. Components shown: User, Workflow script, Converter, ActiveBPEL workflow engine, Proxy, Configuration script, History database, Web service instance, Discovery Service, Optimization Service, Performance Monitor Service, Registry services (such as UDDI).]

Service Discovery Issues
Discovery of equivalent services could be based on:
– Service name. Applicable when all service providers agree on the naming of services.
– Service metadata.
– Service ontology.
So far we have used the service name.

Performance-Based Service Selection
In general, “performance” could refer to:
– Service response time.
– The availability of the service.
– The accuracy of the results returned by the service.
– The security of the service.
In our work we have used service response time as the basis for service selection.
Our approach can be readily adapted for other performance metrics.

Estimating Service Response Time
Two methods for estimating the expected service response time:
1. Based on current performance metrics from the service hosts, e.g., load averages.
2. Based on the history of previous service invocations on the service hosts. In general, this requires a model that, for a given set of service inputs on a given service host, will return the expected service response time.
So far we have used current (or very recent) performance metrics returned by the Ganglia monitoring system.
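
To illustrate the second method, here is a minimal sketch of a history-based model. It is an assumption made for illustration only: it supposes that response time scales linearly with a single positive "input size" number, which the talk does not claim. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple


class HistoryModel:
    """Toy model: expected time = mean seconds-per-unit-of-input * input size."""

    def __init__(self) -> None:
        # Per-host list of (input_size, response_seconds) from past invocations.
        self._history: DefaultDict[str, List[Tuple[float, float]]] = defaultdict(list)

    def record(self, host: str, input_size: float, seconds: float) -> None:
        """Store one completed invocation (input_size must be positive)."""
        self._history[host].append((input_size, seconds))

    def estimate(self, host: str, input_size: float) -> Optional[float]:
        """Return the expected response time, or None if the host has no history."""
        runs = self._history.get(host)
        if not runs:
            return None
        # Average the observed seconds per unit of input on this host.
        rate = sum(seconds / size for size, seconds in runs) / len(runs)
        return rate * input_size
```

A real model would be specific to each service and its inputs, which is the cost the next slide seeks to avoid.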

Estimating Service Response Time (Continued)
Distributed job management systems such as Nimrod use the rate at which a computer completes jobs as an indicator of how “good” the computer is.
Nimrod doesn’t distinguish between different jobs.
This approach requires a substantial long-term record of job statistics in order to give satisfactory results.
The same approach could be applied to dynamic invocation of Web services. This avoids the need for a performance model for each Web service.
Such an approach will sometimes make bad decisions in individual cases, but overall should be effective.
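
The completion-rate idea can be sketched as below. This is not Nimrod's actual scheduler, only an illustration of ranking hosts by jobs completed per unit time; the function name and the two input dictionaries are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_hosts_by_completion_rate(
    completed: Dict[str, int], observed_hours: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (host, jobs per hour) pairs, best host first.

    All jobs are counted alike, so no per-service performance model is needed.
    Hosts with no observation time are left out of the ranking.
    """
    rates = {
        host: completed[host] / observed_hours[host]
        for host in completed
        if observed_hours.get(host, 0.0) > 0.0
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

With only a short record, the rates are noisy, which is why the slide stresses the need for long-term statistics.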

WOSE Sequence Diagram
[Sequence diagram. Elements shown: WOSE client, Workflow script, XSLT converter, Workflow deploy, Workflow engine, Proxy Service, Discovery Service, Optimisation Service, Performance Service, Web service.]
Steps shown:
1. Request
2. Dynamic invocation through proxy
3. Service query
4. List of services
5. Performance query
6. Performance data
7. List of services
8. Selected service
9. Invoke service
10. Result
11. Result through proxy
12. Result
2A. Direct invocation
3A. Direct result
WOSE can either invoke a static Web service directly (steps 2A and 3A), or a dynamic Web service (steps 2 to 11).

Dynamic Service Selection within a Workflow
[Diagram showing Service A, the proxy service, Service B, and candidate services B1 to B5.]
Dynamic invocation is worthwhile only for sufficiently long-running services since the performance gained must offset the overhead of service discovery and selection.
Select from one of the services B1 to B5. If the selected service is not available, WOSE will automatically try the next best one.
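
The trade-off between performance gained and selection overhead can be written as a simple check. This is a sketch with hypothetical parameter names; the talk gives no figures for the overhead.

```python
def dynamic_selection_pays_off(
    static_seconds: float, selected_seconds: float, overhead_seconds: float
) -> bool:
    """True when the time saved by picking a better host exceeds the overhead.

    static_seconds:   expected response time on the statically bound service
    selected_seconds: expected response time on the dynamically selected service
    overhead_seconds: cost of service discovery plus selection
    """
    return static_seconds - selected_seconds > overhead_seconds
```

For a short service, for example 5 seconds statically against 4 seconds on the best host, a 30 second overhead makes dynamic invocation a loss; for a service saving 200 seconds the same overhead is easily recovered.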

Performance Experiments
Is there any relationship between the current load and service response time?
This will depend on how variable the load is over the duration of the service execution, as well as how the OS schedules jobs.
In general, we would expect the load-response time relationship to be stronger when the service hosts are lightly loaded.

Experiment 1
Try to keep load constant during service execution by running N instances of a long-running computation to create a background workload.
Then invoke Web service and measure response time, i.e., time from invoking dynamic service to receiving back the result.
The blastall Web service was used.
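
The response-time measurement described here, from invocation to receipt of the result, can be sketched with a small timing wrapper. The function name is hypothetical, and `invoke` stands in for the actual call to the blastall Web service through WOSE, which is not reproduced.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_response_time(invoke: Callable[[], T]) -> Tuple[T, float]:
    """Call invoke() and return (result, elapsed wall-clock seconds)."""
    # perf_counter is a monotonic clock suited to measuring elapsed intervals.
    start = time.perf_counter()
    result = invoke()
    elapsed = time.perf_counter() - start
    return result, elapsed
```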

Experiment 1: Results

Experiment 1: Discussion
Plot shows that a higher load average results in a longer service response time.
The scatter in results for any particular value of the load average is probably due to the fact that the experiments were done on a machine used by others so we could not fully control the load.

Experiment 2
Create a synthetic, varying background workload.
Then invoke Web service and measure response time.
The blastall Web service was used.

Experiment 2: Results

Experiment 2: Discussion
Both experiments show a general tendency for high load averages to result in longer service response times.
The large amount of scatter results from the fact that the load changes while the Web service is running.
No method can predict what the future load will be, and hence any method of estimating which service host will complete execution the soonest will give the wrong answer sometimes.

Experiment 3
Is selection based on the current load average better than making a random selection?
If services are hosted on heterogeneous machines we also have to take into account the processing speed.
Thus, we base service selection on the performance factor, P, defined as:

Experiment 3 (continued)
Run synthetic workload on one computer. Record service response time for several executions of the workflow, and compute the average.
Run synthetic workload on N computers each hosting the service. Run the workflow and dynamically select the service host based on the performance factor. Do this several times and compute the average.

Experiment 3: Results
The average service response time for the single machine was 4252 seconds.
The average service response time when selecting the optimal service from three hosts was 932 seconds.
Since all the machines used are of the same type, this indicates that dynamic selection based on the current load average does result in better performance.
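
For scale, the two averages reported above can be turned into a speedup and a percentage reduction. The arithmetic below uses only the two figures from the slide; the variable names are illustrative.

```python
# Average response times reported for Experiment 3, in seconds.
single_host_seconds = 4252.0
dynamic_seconds = 932.0

# Ratio of the single-machine time to the dynamically selected time.
speedup = single_host_seconds / dynamic_seconds
# Fraction of the single-machine time that dynamic selection saved.
reduction = 1.0 - dynamic_seconds / single_host_seconds

print(f"speedup: {speedup:.2f}x, reduction: {reduction:.1%}")
# prints "speedup: 4.56x, reduction: 78.1%"
```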

Conclusions and Future Work
Dynamic service selection based on the load and CPU speed can result in faster execution of a workflow.
We are currently repeating the experiments using a service that performs a molecular dynamics simulation.
In the future we will also investigate dynamic service selection based on performance history data, such as the rate at which a host completes service requests.
We would like to develop a statistical model of dynamic service selection for different types of background workload.

Thank you. Questions?