Jérémie Sublime, Sonia Yassa. Development of meta-heuristics for workflow scheduling based on quality of service requirements



Plan
- Introduction
- Related works
- Problem modeling
- Work on algorithms
- Ongoing work

Introduction: Cloud computing
- Grid computing enables the sharing, exchange, selection and aggregation of geographically distributed "autonomous" resources that exist under different domains.
- Cloud computing provides infrastructure, platforms and software to consumers as subscription-based services in a pay-as-you-go model.
- The functional and non-functional properties expected of these resources are called Quality of Service (QoS) requirements.
- QoS requirements are negotiated and expressed by means of Service Level Agreements (SLAs).

Introduction: Cloud computing
- A workflow is a set of tasks and the dependencies between them.
- Workflow scheduling is the process that maps interdependent tasks onto different resources and manages their execution.

Introduction: Motivations & challenges
- Workflow technology was introduced to help scientists perform their work.
- Scientific workflows usually contain a large number of tasks and complex data. They require high computation power, which grids and, more recently, cloud computing can provide.
- Deciding on an effective schedule for a scientific workflow on a grid or a cloud is a difficult problem.
- The problem becomes even harder when several criteria have to be taken into account:
  - Heterogeneity, dynamicity and elasticity of resources
  - Performance constraints (minimum execution time)
  - Resources shared between multiple users
  - Transfer of large volumes of data
  - Scalability, security, etc.

Introduction: Motivations & challenges
Workflow scheduling algorithms are classified into two types:
- Best-effort scheduling attempts to minimize the execution time and ignores other factors, such as the cost of access to resources and the users' QoS satisfaction level.
- QoS-based scheduling tries to improve performance under QoS constraints, for example by minimizing time under a budget constraint or minimizing cost under a time constraint.
Several algorithms have already been proposed for the first category, but the second one has been studied less.

Introduction: Objectives
- Merge and adapt the existing work on single-objective and best-effort optimization, and extend it to larger workflows.
- Develop multi-objective optimization algorithms for SLA-based workflow scheduling in cloud environments. The scheduling algorithms will be based on either Particle Swarm Optimization (PSO) or Genetic Algorithms (GA) and must be able to:
  - Analyze the users' QoS parameters.
  - Negotiate with service providers to establish an SLA.
  - Map the workflow tasks onto the appropriate resources so that the execution completes, the users' QoS constraints are satisfied, and the use of cloud resources is optimized.
- Try algorithm hybridization, as well as combinations of different algorithms, to improve the performance of the meta-heuristics.
- Compare the performance of the algorithms for SLA-based workflow scheduling.

Related works
- R. Buyya, "Economic-based Distributed Resource Management and Scheduling for Grid Computing," Ph.D. dissertation, 2002.
- M. Rahman, S. Venugopal, R. Buyya, "A Dynamic Critical Path Algorithm for Scheduling Scientific Workflow Applications on Global Grids," Proc. of the 3rd IEEE International Conference on e-Science and Grid Computing, 2007.
- J. Yu, R. Buyya, "Scheduling scientific workflow applications with deadline and budget constraints using genetic algorithms," Sci. Program., 2006.
- W. Chen, J. Zhang, "An Ant Colony Optimization Approach to a Grid Workflow Scheduling Problem With Various QoS Requirements," IEEE Transactions on Systems,

Problem modeling: Workflow model
A workflow can be modeled as a Directed Acyclic Graph (DAG) $G = (V, E)$, where:
- $V = \{T_1, T_2, \dots, T_n\}$ is the set of workflow tasks.
- $E$ represents the data dependencies between these tasks.
- $F_{j,k} = (T_j, T_k) \in E$ is the data produced by $T_j$ and consumed by $T_k$.
Assumptions:
- A child task waits until its parent tasks have completed and their data transfers have finished.
- A resource can handle one task at a time.
- The time needed to compute a given task on a given resource at a given frequency is known.
- The volume of data transferred between two tasks is known.
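The workflow model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `topological_order`, the dictionary layout for the edges, and the four-task example are hypothetical and do not come from the presentation. The sketch stores each dependency $(T_j, T_k)$ with its data volume and derives an execution order in which every parent precedes its children, which is the ordering constraint stated in the assumptions.

```python
from collections import deque


def topological_order(tasks, edges):
    """Return the tasks in an order where every parent precedes its children.

    tasks: list of task names.
    edges: dict mapping (parent, child) -> volume of data transferred.
    Raises ValueError if the graph has a cycle, i.e. it is not a DAG.
    """
    children = {t: [] for t in tasks}
    pending_parents = {t: 0 for t in tasks}
    for parent, child in edges:
        children[parent].append(child)
        pending_parents[child] += 1

    # Tasks without parents can start immediately.
    ready = deque(t for t in tasks if pending_parents[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            pending_parents[child] -= 1
            if pending_parents[child] == 0:
                ready.append(child)

    if len(order) != len(children):
        raise ValueError("workflow graph contains a cycle")
    return order


# Hypothetical workflow: T1 feeds T2 and T3, which both feed T4.
TASKS = ["T1", "T2", "T3", "T4"]
EDGES = {("T1", "T2"): 10.0, ("T1", "T3"): 5.0,
         ("T2", "T4"): 2.0, ("T3", "T4"): 8.0}
print(topological_order(TASKS, EDGES))
```

A scheduler could walk the tasks in this order and be sure that the inputs of each task are available before it starts.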

Problem modeling: Resource grid
The target environment is a set of heterogeneous resources linked with each other. It is modeled as a DAG too.
Assumptions: for each resource, the following parameters are known:
- Reliability
- Ranges of available frequencies and voltages (working in pairs)
- Speed of transfer to and from the other resources
[Figure: a single resource (frequency, voltage, reliability, cost of usage) and a network of resources whose data links are labeled Speed 1,2, Speed 1,3 and Speed 2,3.]

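Problem modeling: Formulas
- Makespan: $T = \mathrm{func}\left(\sum (T_{exec} + T_{transfer})\right)$
- Transfer time between resources $R_a$ and $R_b$: $T_{transfer}(R_a, R_b) = \mathrm{DataSize} / \mathrm{Speed}_{R_a,R_b}$
- Energy cost of the executions on a resource $i$: $E^{i}_{exec} = \sum(\mathrm{Time}_{exec}) \cdot f \cdot V^{2} + \lambda \cdot \left(\mathrm{Makespan} - \sum(\mathrm{Time}_{exec})\right)$
- Total energy: $E = \sum_i \left(E^{i}_{exec} + E^{i}_{trsf}\right)$
- Speed factor of a given resource: $\mathrm{SpeedFactor} = f / f_{nom}$
- Execution time of task $T_i$ on resource $R_j$: $T_{exec}(T_i/R_j) = \mathrm{NominalExecTime}_{T_i/R_j} \cdot \mathrm{SpeedFactor}_j$
- Error coefficient of a given resource $i$: $\mathrm{Coeff}_i = \sum(\mathrm{Time}_{exec}) \cdot \mathrm{Reliability}_i$
- Overall reliability (%): $\mathrm{Reliability} = \exp\left(-\sum_i \mathrm{Coeff}_i\right)$
- Theoretical availability of a resource $K$: $Av_K = \left(\mathrm{Makespan} - \sum \mathrm{Time}_{exec}(T_i/R_K)\right) / \mathrm{Makespan}$
- Theoretical global workflow availability: $A = \prod_K Av_K$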
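Several of these formulas translate directly into small functions. The sketch below is illustrative only: the function and parameter names are assumptions, `idle_coeff` stands for the $\lambda$ factor applied to idle time, and `reliability_coeffs` stands for the per-resource $\mathrm{Reliability}_i$ values used in the error coefficient. The makespan and the execution-time scaling are left out because the slide does not define `func` precisely.

```python
import math


def transfer_time(data_size, speed):
    """T_transfer(Ra, Rb) = DataSize / Speed(Ra, Rb)."""
    return data_size / speed


def execution_energy(exec_times, freq, volt, idle_coeff, makespan):
    """Energy of one resource: busy time * f * V^2 plus lambda * idle time."""
    busy = sum(exec_times)
    return busy * freq * volt ** 2 + idle_coeff * (makespan - busy)


def overall_reliability(busy_times, reliability_coeffs):
    """exp(-sum_i Coeff_i), with Coeff_i = busy time of i * Reliability_i."""
    total = sum(b * r for b, r in zip(busy_times, reliability_coeffs))
    return math.exp(-total)


def availability(makespan, busy_time):
    """Av_K: fraction of the makespan during which resource K is idle."""
    return (makespan - busy_time) / makespan


def global_availability(makespan, busy_times):
    """A: product of the availabilities of all resources."""
    return math.prod(availability(makespan, b) for b in busy_times)


# Hypothetical numbers: two tasks of 2 and 3 time units on one resource,
# within a workflow whose makespan is 10 time units.
print(execution_energy([2.0, 3.0], freq=1.0, volt=2.0,
                       idle_coeff=0.5, makespan=10.0))
print(global_availability(10.0, [5.0, 2.0]))
```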

Problem modeling: Structure of the solutions

| Task (index) | T1 | T2 | T3 | T4 | … | TN |
|---|---|---|---|---|---|---|
| Resource | R3 | R2 | R1 | | … | R3 |
| (Volt, Freq) | (V1,f1) | (V3,f3) | (V1,f1) | (V2,f2) | … | (V1,f1) |
| Ranking (optional) | 5 | 4 | 3 | 2 | … | 1 |

A solution is composed of a series of substructures, one for each task. A substructure contains a task, the resource assigned to the task, the voltage and frequency pair associated with the resource and, for some algorithms, a ranking priority for the task. For a solution to be valid, every task must be assigned to exactly one resource. A scheduler then applies the solution and computes the various QoS values:
- Makespan
- Reliability
- Cost
- Energy consumption
- …
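One possible encoding of this solution structure is sketched below. The class name `Assignment`, the function `is_valid` and the `operating_points` dictionary are hypothetical names chosen for the illustration. The check that a (voltage, frequency) pair belongs to the assigned resource is an added assumption, based on the earlier statement that frequencies and voltages work in pairs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Assignment:
    """One substructure of a solution: where and how a task runs."""
    task: str
    resource: str
    volt_freq: Tuple[float, float]   # (voltage, frequency) pair
    rank: Optional[int] = None       # priority, used only by some algorithms


def is_valid(solution, tasks, operating_points):
    """Check that every task appears exactly once and that each
    (voltage, frequency) pair is offered by the assigned resource.

    operating_points: dict mapping a resource name to its allowed pairs.
    """
    assigned = sorted(a.task for a in solution)
    if assigned != sorted(tasks):
        return False
    return all(a.volt_freq in operating_points.get(a.resource, ())
               for a in solution)


# Hypothetical two-task example.
POINTS = {"R1": [(1.0, 1.0)], "R2": [(1.2, 2.0)]}
SOLUTION = [Assignment("T1", "R1", (1.0, 1.0)),
            Assignment("T2", "R2", (1.2, 2.0), rank=1)]
print(is_valid(SOLUTION, ["T1", "T2"], POINTS))
```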

Work on algorithms: GA
- Generate a random population of solutions (or retrieve a population generated by another algorithm)
While (constraint conditions are not met && n < NbMaxLoop) loop:
  - Select the X% best solutions according to a fitness function
  - Apply random cross-over between the Y% best solutions to create a new population
  - Apply random mutations to the new population
  - Replace the old population with the new one
End loop
Work to be done:
- Choice of the fitness function model for multi-objective optimization: penalties, Pareto front, weighted sum of the fitness functions of the different parameters, …
- Choice of a cross-over method: how should solutions be mixed? Elitism or not? …
- Mutations: how, and at what rate?
- Replacing the old population: keep the old best solution or not?
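A minimal Python sketch of this loop follows. It is not the authors' implementation: it makes its own choices for the points the slide leaves open (single-point cross-over, per-gene mutation, and elitism that copies the best parent unchanged), it stops only on the loop limit, and the names `genetic_algorithm`, `random_solution` and `mutate_gene` are hypothetical. Solutions are lists of at least two genes, and higher fitness is better.

```python
import random


def genetic_algorithm(fitness, random_solution, mutate_gene, pop_size=20,
                      select_rate=0.5, mutation_rate=0.05, max_loops=25,
                      seed=0):
    """Evolve a population of list-encoded solutions and return the best."""
    rng = random.Random(seed)
    population = [random_solution(rng) for _ in range(pop_size)]
    n_parents = max(2, int(pop_size * select_rate))

    for _ in range(max_loops):
        # Selection: keep the best fraction of the population as parents.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        # Elitism (an assumption): the best parent survives unchanged.
        children = [list(parents[0])]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))          # single-point cross-over
            child = a[:cut] + b[cut:]
            # Mutation: each gene is redrawn with probability mutation_rate.
            child = [mutate_gene(rng, g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children                        # replace old population

    return max(population, key=fitness)


# Toy problem (hypothetical): 6 tasks, 3 resources, and a fitness that
# simply counts how many tasks run on resource 0.
def toy_fitness(solution):
    return sum(1 for gene in solution if gene == 0)


def toy_random_solution(rng):
    return [rng.randrange(3) for _ in range(6)]


def toy_mutate(rng, gene):
    return rng.randrange(3)


print(genetic_algorithm(toy_fitness, toy_random_solution, toy_mutate))
```

Because the best parent is always copied into the next generation, the best fitness in the population never decreases from one loop to the next.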

Work on algorithms: PSO/DPSO
- Generate a random population of solutions (or retrieve a population generated by another algorithm)
While (constraint conditions are not met && n < NbMaxLoop) loop:
  For each solution $i$:
  - Compute the velocity: $V_{i,n} = \omega V_{i,n-1} + \varphi_p r_p (xbest_i - X_{i,n-1}) + \varphi_g r_g (swarmbest - X_{i,n-1})$
  - Update the position: $X_{i,n} = X_{i,n-1} + V_{i,n}$
  - If $fitness(X_{i,n}) > fitness(xbest_i)$, then update $xbest_i$
  - If $fitness(X_{i,n}) > fitness(swarmbest)$, then update $swarmbest$
  End for
End loop
Work to be done:
- Choice of the fitness function model for multi-objective optimization: penalties, Pareto front, weighted sum of the fitness functions of the different parameters, …
- Choice of a discrete model for PSO: overloading the operators +, *, - and adapting the algorithm to the solution model
- Choice of parameters: $\omega$, $\varphi_p$, $\varphi_g$
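The update rules can be illustrated with a small continuous PSO in Python. This sketch does not address the discrete variant that the slide lists as work to be done: positions are plain vectors of floats, the parameter values are arbitrary assumptions, and the function name `pso` is hypothetical. Higher fitness is better, and the loop stops only on the iteration limit.

```python
import random


def pso(fitness, dim, n_particles=20, max_loops=50, omega=0.7,
        phi_p=1.5, phi_g=1.5, low=-5.0, high=5.0, seed=0):
    """Continuous particle swarm optimisation; returns the best position."""
    rng = random.Random(seed)
    xs = [[rng.uniform(low, high) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    best = [list(x) for x in xs]                  # xbest_i for each particle
    swarm_best = list(max(best, key=fitness))     # best position seen so far

    for _ in range(max_loops):
        for i in range(n_particles):
            for d in range(dim):
                r_p, r_g = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                vs[i][d] = (omega * vs[i][d]
                            + phi_p * r_p * (best[i][d] - xs[i][d])
                            + phi_g * r_g * (swarm_best[d] - xs[i][d]))
                xs[i][d] += vs[i][d]              # position update
            if fitness(xs[i]) > fitness(best[i]):
                best[i] = list(xs[i])
            if fitness(xs[i]) > fitness(swarm_best):
                swarm_best = list(xs[i])

    return swarm_best


# Toy fitness (hypothetical): maximal at the origin.
def sphere(position):
    return -sum(v * v for v in position)


print(pso(sphere, dim=2))
```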

Work on algorithms: associations and hybridization
Associations of algorithms: with many parameters to optimize, the work can be divided among different algorithms according to their performance on a given problem:
- Generate a preliminary set of solutions with a classical algorithm such as HEFT, then refine it with GA or PSO to optimize several criteria.
- Run several GA or PSO instances in parallel, each working on a specific criterion, and merge their work by migrating populations between them.
- Use GA or PSO to obtain a set of potential solutions, then refine the results with other algorithms such as the Nelder-Mead polytope method.
Hybridization:
- Add a mutation factor to PSO by using some GA functions.
- Add a memory to GA by using the same features as PSO.
- Adapt and merge the ACO and PSO algorithms.
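Two of the association ideas can be sketched as small helpers in Python. Both functions and their names (`seeded_population`, `migrate`) are hypothetical illustrations rather than the presentation's design. The first places a heuristic solution (for example one produced by HEFT) into an otherwise random starting population. The second exchanges the best individuals between two populations that each optimize their own criterion.

```python
import random


def seeded_population(seed_solution, random_solution, pop_size, rng):
    """Start from one heuristic solution and fill the rest at random."""
    return [list(seed_solution)] + [random_solution(rng)
                                    for _ in range(pop_size - 1)]


def migrate(pop_a, fit_a, pop_b, fit_b, k=1):
    """Swap the k best individuals of each population into the other,
    replacing that population's k worst. Higher fitness is better."""
    ranked_a = sorted(pop_a, key=fit_a, reverse=True)
    ranked_b = sorted(pop_b, key=fit_b, reverse=True)
    new_a = ranked_a[:len(pop_a) - k] + [list(s) for s in ranked_b[:k]]
    new_b = ranked_b[:len(pop_b) - k] + [list(s) for s in ranked_a[:k]]
    return new_a, new_b


# Hypothetical islands: A prefers large values, B prefers small ones.
island_a = [[1], [2], [3], [4]]
island_b = [[10], [20], [30], [40]]
new_a, new_b = migrate(island_a, lambda s: s[0],
                       island_b, lambda s: -s[0], k=1)
print(new_a, new_b)

rng = random.Random(1)
print(seeded_population([0, 1], lambda r: [r.randrange(2), r.randrange(2)],
                        pop_size=5, rng=rng))
```

Migration keeps both population sizes unchanged, so it can be called between two iterations of either optimizer.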

Ongoing work: Simulation environment
A C++ application with a Qt GUI will be used to simulate workflow scheduling operations and to test several meta-heuristics. The application will include:
- A basic implementation of the workflow data structures
- The implementation of several heuristics and meta-heuristics
- Interfaces to:
  - Set up the algorithms
  - Show and modify workflow parameters
  - Choose appropriate QoS restrictions
  - Display results
The task graph, the resource graph and the resource characteristics will be represented as matrices. The reliability of the software will be tested by comparing the results of the already implemented algorithms with those obtained by other universities.

Ongoing work: Current state of the test program
Work done:
- Basic interfaces: implemented
- Scheduler: implemented
- Currently implemented and stable algorithms: HEFT, NGAII
- Current GA cross-over methods: random shuffle, double-point cross-over, elitism-based cross-over
- Current fitness function models: penalties, weighted QoS
- Implemented QoS variables: time, energy, reliability. The cost model is not yet clearly defined.
To be done:
- Implementation of the missing QoS
- Interfaces for complex combinations of algorithms
- Multithreading of the application
- Implementation of the missing algorithms: PSO and variants
- Documentation
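The two fitness models named above can be sketched as cost functions in Python. The exact formulas used in the test program are not given in the presentation, so the functions below, their names and the penalty factor are assumptions. Both return a cost where lower is better; an optimizer that maximizes fitness would use the negated value.

```python
def weighted_cost(qos, weights):
    """Weighted sum of QoS values; each weight sets a criterion's importance."""
    return sum(weights[name] * qos[name] for name in weights)


def penalised_cost(qos, objective, limits, penalty=1000.0):
    """Value of one objective plus a penalty for each exceeded QoS limit.

    limits: dict mapping a QoS name to its maximum allowed value.
    """
    violation = sum(max(0.0, qos[name] - limit)
                    for name, limit in limits.items())
    return qos[objective] + penalty * violation


# Hypothetical QoS values computed by a scheduler for one solution.
QOS = {"time": 10.0, "energy": 4.0}
print(weighted_cost(QOS, {"time": 0.5, "energy": 0.5}))
print(penalised_cost(QOS, "time", {"energy": 5.0}))   # limit respected
print(penalised_cost(QOS, "time", {"energy": 3.0}))   # limit exceeded by 1
```

The penalty form matches the QoS-based scheduling goal of minimizing one criterion under a constraint on another, while the weighted form trades all criteria against each other in a single number.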

Ongoing work: Current tests
Comparison of the performance of different GA versions for different workflow sizes.
Analysis process:
- Generation of a pool of solutions with the HEFT algorithm
- Improvement of the set using the NGAII algorithm:
  - Population size: 100
  - Mutation rate: 5%
  - Selection rate: 50%
  - NbMaxLoop: 25
The process is repeated several times to obtain an average result. HEFT can be used as a reference to compare the performance of the different algorithms.

Questions?