X. Liu, Z. Ni, Z. Wu, D. Yuan, J. Chen, Y. Yang, ICPADS10, 10-12-2010, Shanghai, China
An Effective Framework for Handling Recoverable Temporal Violations in Scientific Workflows


An Effective Framework for Handling Recoverable Temporal Violations in Scientific Workflows
Xiao Liu (1), Zhiwei Ni (2), Zhangjun Wu (2), Dong Yuan (1), Jinjun Chen (1), Yun Yang (1)
(1) SUCCESS (Centre for Computing and Engineering Software Systems), Swinburne University of Technology, Melbourne, Australia
(2) Institute of Intelligent Management, Hefei University of Technology, Hefei, China

Outline
> Background
– Workflow Technology Group
– SwinDeW Family, SwinGrid, SwinCloud
> Brief Overview: Workflow Temporal QoS Support
> Handling Temporal Violations in Scientific Workflows
– Problem Analysis
– An Effective Light-Weight Handling Framework
– Two-Stage Local Workflow Rescheduling Strategy
> Evaluation
> Summary

Workflow Technology Group Overview
> The WT group is part of SUCCESS (Centre for Computing and Engineering Software Systems), a Tier-1 university research centre at Swinburne University of Technology. Our group conducts research into workflow technologies for complex software systems and services, including peer-to-peer, grid, and cloud computing based e-science, e-business, transactional and inter-organisational workflows.
Leader: Prof Yun Yang
Visitors (7-8/09): Prof Lee Osterweil, Prof Lori Clarke
Researchers: Dr Jinjun Chen (Senior Lecturer), Xiao Liu (PostDoc), Dong Yuan (PhD), Gaofeng Zhang (PhD), Wenhao Li (PhD), Dahai Cao (PhD), Xuyun Zhang (PhD)
Others: Prof Ryszard Kowalczyk, Prof Chengfei Liu, Dr Jun Yan (Wollongong), Prof Hai Jin (HUST), Prof Mingshu Li (ISCAS), Prof Qing Wang (ISCAS), Prof Zhiwei Ni (HFUT), Prof Jinpeng Huai (BUAA)

SwinDeW Family
> SwinDeW (Swinburne Decentralised Workflow): foundation prototype based on p2p
– SwinDeW – past
– SwinDeW-A (for Agents) – ARC DP06
– SwinDeW-G (for Grid) – past
– SwinDeW-V (for Verification) – current (ARC DP)
– SwinDeW-C (for Cloud) – current (ARC LP)
– Others: SwinDeW-B / -S / -P / -G – past
> Current Projects:
– ARC DP, Cost effective storage of massive intermediate data in cloud computing applications, Duration:
– ARC LP, Novel cloud computing based on workflow technology for managing large numbers of process instances, Duration:

SwinGrid to SwinCloud [figure]


Scientific Workflows
> Scientific workflows often underlie large-scale, complex e-science applications such as climate modeling, astrophysics, structural biology and chemistry, earthquake simulation and disaster recovery.
> Scientific workflows are usually deployed on distributed high performance computing infrastructures such as clusters, grids and clouds.
> Compared with conventional business workflows, most scientific workflows are more data and/or computation intensive, involve less human interaction, and have larger scale and more complex process structures.

Temporal QoS Support for Scientific Workflows
> Motivation: most e-science applications are time constrained, with global temporal constraints (deadlines) and local temporal constraints (milestones), to achieve some pre-defined goals on schedule.
> Basic requirements: automation and cost-effectiveness.
> Challenges: highly dynamic system environments, changing process structures, charges for the usage of resources.
> Solution: A Novel Probabilistic Temporal Framework and Its Strategies for Cost-Effective Delivery of High QoS in Scientific Cloud Workflow Systems [PhD Thesis - Xiao Liu]

Lifecycle Support of Temporal QoS [figure]

Lifecycle Support of Temporal QoS
> At workflow build-time modeling stage
– Component 1: temporal constraint setting
  Forecasting activity durations [eScience08], [JSS10b]
  Setting both coarse-grained and fine-grained temporal constraints [BPM08], [CCPE09], [JCSS10]
– Component 2: temporal consistency monitoring
  Temporal checkpoint selection [ICSE08], [TAAS07]
  Temporal verification [CCPE07], [ToSEM09]
– Component 3: temporal violation handling
  Temporal violation handling point selection [TSE]
  Temporal violation handling [CCGrid], [JSS10a], [TSE], [ICPADS]


Problem Analysis
> Basic requirements: automation and cost-effectiveness
> 1) How to define fine-grained recoverable temporal violations
– Define statistically recoverable and non-recoverable temporal violations, to avoid heavy-weight exception handling strategies and facilitate light-weight ones
– Divide recoverable temporal violations into fine-grained levels, to facilitate the choice among handling strategies of different capability (higher capability, higher cost)
> 2) Which light-weight, effective exception handling strategies to employ
– Employ or design a set of light-weight handling strategies, ranging from low capability to high capability (low cost to high cost)

An Effective Light-Weight Handling Framework
> Three levels of temporal violations
– Level I, Level II and Level III
> Corresponding three levels of temporal violation handling strategies
– TDA, ACOWR and TDA+ACOWR
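
The level-to-strategy correspondence on this slide can be sketched as a simple lookup. This is only an illustration: the names ViolationLevel, STRATEGY_FOR_LEVEL and select_strategy are hypothetical, and the slides do not say how a violation is assigned to a level, so that step is left out.

```python
from enum import Enum


class ViolationLevel(Enum):
    """The three levels of recoverable temporal violations named on the slide."""
    LEVEL_I = 1
    LEVEL_II = 2
    LEVEL_III = 3


# Pairing taken from the slide's "corresponding" ordering:
# Level I -> TDA, Level II -> ACOWR, Level III -> TDA+ACOWR.
STRATEGY_FOR_LEVEL = {
    ViolationLevel.LEVEL_I: "TDA",
    ViolationLevel.LEVEL_II: "ACOWR",
    ViolationLevel.LEVEL_III: "TDA+ACOWR",
}


def select_strategy(level: ViolationLevel) -> str:
    """Return the name of the handling strategy for a violation level."""
    return STRATEGY_FOR_LEVEL[level]
```

Keeping the mapping in a table rather than in branching code reflects the idea of the framework: the cheapest strategy that is capable enough for the detected level is chosen.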

Three Levels of Handling Strategies
> TDA (Time Deficit Allocation) [CCPE07]
– TDA actively propagates small time deficits to the subsequent workflow activities so that they may be compensated by the execution time those activities save.
> ACOWR (Ant Colony Optimisation based Workflow Rescheduling) [CCGrid10]
– Based on our general two-stage local workflow rescheduling strategy
– Using ACO as the metaheuristic algorithm
> TDA+ACOWR (the hybrid strategy of TDA and ACOWR)
– One round of TDA and multiple rounds of ACOWR (normally fewer than 3)
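
A minimal sketch of the TDA idea and of the hybrid loop, under stated assumptions. The slides do not give the allocation rule, so this example assumes each subsequent activity has a known "slack" (time it is expected to save) and splits the deficit in proportion to it. The function names, the slack input, the reschedule callback and the default cap of 2 rounds are all hypothetical; the cap is loosely based on the slide's "normally fewer than 3" remark.

```python
def allocate_time_deficit(deficit, slacks):
    """Split a time deficit across subsequent activities.

    Assumption: shares are proportional to each activity's slack and
    never exceed that slack.
    """
    total = sum(slacks)
    if total <= 0:
        return [0.0] * len(slacks)  # no activity can absorb anything
    return [min(s, deficit * s / total) for s in slacks]


def handle_with_hybrid(deficit, slacks, reschedule, max_rounds=2):
    """One round of TDA, then repeated rescheduling while a deficit remains.

    `reschedule(remaining)` is a hypothetical callback standing in for
    ACOWR; it returns the amount of time it managed to compensate.
    Returns (remaining deficit, number of rescheduling rounds used).
    """
    shares = allocate_time_deficit(deficit, slacks)
    remaining = deficit - sum(shares)
    rounds = 0
    while remaining > 0 and rounds < max_rounds:
        remaining -= reschedule(remaining)
        rounds += 1
    return max(remaining, 0.0), rounds
```

For example, a deficit of 4 time units over two activities with slacks 2 and 6 is split as 1 and 3, so no rescheduling round is needed; a deficit larger than the total slack leaves a remainder for the rescheduling rounds.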

A General Two-Stage Workflow Local Rescheduling Strategy
> Handling temporal violations with workflow rescheduling
> Key objective: reduce, or ideally remove, the time deficit at the current checkpoint, i.e. reduce the execution time of the activities after the checkpoint in the violated workflow segment as much as possible
> Requirement 1: strike a good balance between time deficit compensation and the completion time of other activities (workflow activities and general tasks, with or without temporal constraints) – from the overall makespan perspective
> Requirement 2: utilise the resources already available in the system rather than recruiting additional resources – from the overall cost perspective
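
The objective and the two requirements above can be illustrated with a small sketch. It is not the authors' pseudo-code: plain random sampling is swapped in for the metaheuristic (ACO, GA, PSO), and the task model (a workload per task, a speed per existing resource, tasks on one resource run in list order) as well as every function and parameter name are assumptions made for this example. Stage 1 shortlists candidate schedules with a small overall makespan; stage 2 picks the shortlisted schedule that finishes the violated segment earliest.

```python
import random


def evaluate(assignment, workloads, speeds, segment):
    """Return (overall makespan, finish time of the violated segment)."""
    clock = [0.0] * len(speeds)  # next free time of each resource
    finish = {}
    for task, res in enumerate(assignment):
        clock[res] += workloads[task] / speeds[res]
        finish[task] = clock[res]
    return max(clock), max(finish[t] for t in segment)


def two_stage_reschedule(workloads, speeds, segment, current,
                         candidates=200, keep=10, seed=0):
    """Reassign tasks to the existing resources only (Requirement 2).

    Returns (best assignment, time compensated on the violated segment).
    """
    rng = random.Random(seed)
    n_res = len(speeds)

    # Stage 1: sample candidate assignments (stand-in for a metaheuristic)
    # and shortlist those with the smallest overall makespan (Requirement 1).
    pool = [[rng.randrange(n_res) for _ in workloads]
            for _ in range(candidates)]
    pool.sort(key=lambda a: evaluate(a, workloads, speeds, segment)[0])
    # Keep the current schedule so the result is never worse than it.
    shortlist = pool[:keep] + [list(current)]

    # Stage 2: among the shortlist, pick the schedule that completes the
    # violated segment earliest, i.e. compensates the most time.
    best = min(shortlist,
               key=lambda a: evaluate(a, workloads, speeds, segment)[1])
    before = evaluate(current, workloads, speeds, segment)[1]
    after = evaluate(best, workloads, speeds, segment)[1]
    return best, before - after
```

A call such as two_stage_reschedule([4, 4, 4, 4], [1.0, 2.0], [2, 3], [0, 0, 0, 0]) moves work from the slow resource to the faster one and reports how much earlier the violated segment (tasks 2 and 3) finishes.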

Integrated Task Resource List [figure]

Pseudo-code for an Abstract Strategy [figure]


Evaluation
> Performance analysis and comparison (with GA) for ACOWR
– Optimisation on Total Makespan
– Optimisation on Total Cost
– Time Compensation on Violated Workflow Segment
– CPU Time
> Effectiveness evaluation of the three-level handling framework
– Violation Rate of Global Temporal Constraints and Local Temporal Constraints
– Cost Analysis

Optimisation on Total Makespan [figure]

Optimisation on Total Cost [figure]

Time Compensation on Violated Workflow Segment [figure]

CPU Time [figure]

Experiment Results on Temporal Violation Rates [figure]

Cost Analysis [figure]


Summary
> Temporal QoS Support is Critical in e-Science Applications
> Temporal Violation Handling in Scientific Workflows
– Automatic, Cost-Effective
– Level I, Level II and Level III
– TDA, ACOWR, TDA+ACOWR
> A Two-Stage Workflow Local Rescheduling Strategy
– ACO, GA, PSO, many other metaheuristics
> Future Work
– Data movement cost
– More scheduling algorithms

The End – Thank You!
> Any questions or comments?
> Website:
> An extension of this paper, titled “A Novel General Framework for Automatic and Cost-Effective Handling of Recoverable Temporal Violations in Scientific Workflow Systems,” has been accepted by Journal of Systems and Software (JSS).