
Scheduling in computational grids with reservations Denis Trystram LIG-MOAIS Grenoble University, France AEOLUS, March 9, 2007

General Context Recently there has been a rapid and deep evolution of high-performance execution platforms: supercomputers, clusters, computational grids, global computing, … Efficient resource-management tools are needed to deal with these new systems. This talk investigates some scheduling problems, with a focus on reservations.

Parallel computing today Different kinds of platforms: clusters, collections of clusters, grids, global computing. Sets of temporarily unused resources. Autonomous nodes (P2P). Our view of grid computing (a reasonable trade-off): a set of computing resources under control (no hard authentication problems, no random addition of computers, etc.)

Content Some preliminaries (Parallel tasks model) Scheduling and packing problems On-line versus off-line: batch scheduling Multi-criteria Reservations

A national French initiative: GRID5000 Several local computational grids (like CiGri). A national project with shared resources and competences: almost 4000 processors today, with local administration but centralized control.

Target Applications New execution platforms have created new applications (data mining, bio-computing, coupling of codes, interactive applications, virtual reality, …). Interactive computations (human in the loop), adaptive algorithms, etc. See the MOAIS project for more details.

Scheduling problem (informally) Given a set of tasks, the problem is to determine when and where to execute the tasks (according to the precedence constraints - if any - and to the target architecture).

Central Scheduling Problem The basic problem P | prec, pj | Cmax is NP-hard [Ullman 75]. Thus, we are looking for « good » heuristics: low cost, and based on theoretical analysis (a good approximation factor).

Available models Extension of « old » existing models (delay) Parallel Tasks Divisible load

Delay: if two consecutive tasks are allocated on different processors, we have to pay a communication delay L. If L is large, the problem is very hard (no approximation algorithm is known).

Extensions of delay Some extensions have been proposed (like LogP), but they are not adequate for grids (heterogeneity, large delays, hierarchy, uncertainties)…

Parallel Tasks Extension of classical sequential tasks: each task may require more than one processor for its execution [Feitelson and Rudolph].

[figure: a job, represented by its computational area plus overhead]

Classification Rigid tasks; moldable tasks (decreasing time, increasing area); malleable tasks (extra overhead).

Divisible load Also known as « bag of tasks »: a big amount of arbitrarily small computational units.

Divisible load (Asymptotically) optimal for some criteria (throughput). Valid only for specific applications with regular patterns. Popular for best-effort jobs.

Resource management in clusters

[animation: users submit jobs to a queue over time, and jobs are taken from the queue to be executed]

Integrated approach

[animation: the integrated approach, jobs being placed on the m processors over time]

(Strip) packing problems The schedule is divided into two successive steps: 1. Allocation problem; 2. Scheduling with preallocation (NP-hard in general [Rayward-Smith 95]).

Scheduling: on-line vs off-line On-line: no knowledge about the future. We take the scheduling decision while other jobs arrive.

Scheduling: on-line vs off-line Off-line: we have a finite set of jobs. We try to find a good arrangement.

Off-line scheduler Problem: schedule a set of independent moldable jobs (clairvoyant). The penalty functions have to be estimated somehow (by complexity analysis, or by a prediction/measurement method such as log analysis).

Example Let us consider 7 moldable tasks to be scheduled on m = 10 processors.

Canonical Allotment Maximal number of processors needed for executing the tasks in time lower than 1. [figure: the canonical allotment on m processors, with time bound 1 and average work W/m]
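A minimal sketch of how this canonical allotment could be computed, assuming a hypothetical cost model time(i, p) giving the execution time of moldable task i on p processors (non-increasing in p); minalloc and canonical_allotment are illustrative names, not from the talk:

def minalloc(time, i, bound, m):
    # smallest number of processors letting task i finish within `bound`
    # (None if even m processors are not enough)
    for p in range(1, m + 1):
        if time(i, p) <= bound:
            return p
    return None

def canonical_allotment(time, n, m, bound=1.0):
    # give each task its minimal allotment for the time bound; the total is
    # the maximal number of processors needed to run all tasks within it
    allots = [minalloc(time, i, bound, m) for i in range(n)]
    return allots, sum(p for p in allots if p is not None)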

2-shelves scheduling Idea: analyze the structure of the optimum, where tasks either have execution time greater than 1/2 or not. Thus, we will try to fill two shelves (of heights 1 and 1/2) with these tasks.

2 shelves partitioning Knapsack problem: minimize the global surface under the constraint of using at most m processors in the first shelf. [figure: two shelves of heights 1 and 1/2 over m processors]

Dynamic programming For i = 1..n (tasks) and j = 0..m (processors used in the first shelf): W(i,j) = min( W(i-1, j - minalloc(i,1)) + work(i, minalloc(i,1)), W(i-1, j) + work(i, minalloc(i,1/2)) ), where minalloc(i,t) is the minimal number of processors needed to run task i in time at most t, and work(i,p) is its surface p·ti(p). W(n,m) <= work of an optimal solution, but the half-sized shelf may be overloaded.
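A minimal sketch of this dynamic program, reusing the hypothetical time(i, p) model and the minalloc helper from the previous sketch (all names illustrative):

import math

def work(time, i, p):
    # work (surface) of task i when executed on p processors
    return p * time(i, p)

def two_shelf_partition(time, n, m):
    # W[i][j]: minimal total work of the first i tasks when those placed in
    # the first shelf (height 1) use at most j processors; the others go to
    # the second shelf (height 1/2)
    INF = math.inf
    W = [[0] * (m + 1)] + [[INF] * (m + 1) for _ in range(n)]
    for i in range(1, n + 1):
        p1 = minalloc(time, i - 1, 1.0, m)  # allotment in the first shelf
        p2 = minalloc(time, i - 1, 0.5, m)  # allotment in the second shelf
        for j in range(m + 1):
            best = INF
            if p1 is not None and j >= p1:
                best = W[i - 1][j - p1] + work(time, i - 1, p1)
            if p2 is not None:
                best = min(best, W[i - 1][j] + work(time, i - 1, p2))
            W[i][j] = best
    # lower bound on the optimal work; the half-sized shelf may be overloaded
    return W[n][m]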

Drop down [figure: the two-shelf partition after the « drop down » transformation]

Insertion of small tasks [figure: small sequential tasks inserted into the free space]

Analysis These transformations do not increase the work. If the second shelf uses more than m processors, it is always possible to apply one of the transformations (using a global surface argument). It is always possible to insert the « small » sequential tasks (again by a surface argument).

Guarantee The 2-shelves algorithm has a performance guarantee of 3/2 + ε (SIAM J. on Computing, to appear). Rigid case: 2-approximation algorithm (Graham resource constraints).

Batch scheduling Principle: several jobs are treated at once using off-line scheduling.

Principle of batch [figure: jobs and their arrival times]

Batch chaining Jobs arriving during batch i are queued and scheduled together as batch i+1 once batch i completes. [animation: batch i runs while new jobs accumulate, then batch i+1 starts]

Constructing a batch scheduling Analysis: there exists a nice (simple) result which gives a guarantee for an execution in batch mode, using the guarantee of the off-line scheduling policy inside the batches.
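A minimal sketch of the batch-chaining loop, where offline_makespan(jobs) is a hypothetical stand-in for the off-line policy used inside each batch (e.g. the 2-shelves algorithm for moldable jobs):

def batch_chaining(arrivals, offline_makespan):
    # arrivals: list of (release_time, job) pairs sorted by release time
    clock = 0.0
    pending = list(arrivals)
    while pending:
        batch = [job for (r, job) in pending if r <= clock]
        if not batch:
            clock = pending[0][0]  # idle until the next arrival
            continue
        pending = [(r, job) for (r, job) in pending if r > clock]
        clock += offline_makespan(batch)  # run the whole batch off-line
    return clock  # completion time of the last batch (Cmax)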

Analysis [Shmoys] [figure: the previous-to-last batch and the last batch, with Cmax, the release rn of the last job, batch durations Tk and batch completion times Dk, DK-1]

Proposition If the off-line policy used inside the batches has an approximation ratio ρ for Cmax, the chained batch algorithm has ratio 2ρ.

Analysis TK is the duration of the last batch K. Every job of the last batch arrived after DK-2 (the start of batch K-1), so Cmax* >= DK-2 + TK/ρ, hence DK-2 + TK <= ρ·Cmax*. On the other hand, TK-1 <= ρ·Cmax*. Thus: Cmax = DK-2 + TK-1 + TK <= 2ρ·Cmax*.

Application Applied to the best off-line algorithm for moldable jobs (3/2-approximation), we obtain a 3-approximation on-line batch algorithm for Cmax. This result also holds for rigid jobs (using the 2-approximation of Graham's resource-constraints analysis), leading to a 4-approximation algorithm.

Multi criteria Cmax is not always the adequate criterion. User point of view: Average completion time (weighted or not) Other criteria: Stretch, Asymptotic throughput, fairness, …

How to deal with this problem? Hierarchical approach: one criterion after the other. (Convex) combination of criteria. Transforming one criterion into a constraint. Better - but harder - ad hoc algorithms.

A first solution Construct a feasible schedule from two schedules with guarantee r for minsum and r' for makespan, yielding a guarantee of (2r, 2r') [Stein et al.]. Instance: 7 jobs (moldable tasks) to be scheduled on 5 processors.

Schedules s and s’ Schedule s (minsum) Schedule s’ (makespan)

New schedule r’Cmax

New schedule

r’Cmax

New schedule r’Cmax Similar bound for the first criterion

Analysis The best known guarantees are 8 [Schwiegelshohn] for minsum and 3/2 [Mounié et al.] for makespan, leading to (16; 3). Similarly for the weighted minsum (ratio 8.53 for minsum).

Improvement We can improve this result by determining the Pareto curve (of the best compromises): guarantees of (1+λ)/λ · r and (1+λ) · r'. Idea: take the first part of schedule s up to r'·Cmax.

Pareto curve

Another way of designing better schedules We proposed [SPAA 2005] a new solution with a better bound, which does not have to consider explicitly the schedule for minsum (it is based on a dynamic framework). Principle: recursive doubling with a smart selection (using a knapsack) inside each interval. Starting from the previous algorithm for Cmax, we obtain a (6; 6) approximation.

Bi-criteria: Cmax and Σ wi Ci Generic on-line framework [Shmoys et al.]: exponentially increasing time intervals. Uses a max-weight ρ-approximation algorithm: if the optimal schedule of length d has weight w*, it provides a schedule of length ρ·d and weight w*. Yields a (4ρ, 4ρ) approximation algorithm. For moldable tasks, yields a (12, 12) approximation; with the 2-shelf algorithm, yields a (6, 6) approximation [Dutot et al.].

Example for ρ = 2 [figures: the schedule for makespan, with the shortest job highlighted; at a later time t, an interval that "contains more weight"]

A last trick The intervals are shaken (as in 2-opt local optimization techniques). This algorithm has been adapted to rigid tasks. It is quite good in practice, but there is no theoretical guarantee…

Reservations Motivation: execute large jobs that require more than m processors. [figure: a reservation over time]

The problem is to schedule n independent parallel rigid tasks (task i requiring qi processors) such that the last finishing time is minimum. At each time t, r(t) ≤ m processors are not available. [figure]

State of the art Most existing results deal with sequential tasks (qj = 1). Without preemption: decreasing reservations; only one reservation per machine. With preemption: optimal algorithms for independent tasks; optimal algorithms for some simple task graphs.

Without reservation FCFS with backfilling. [animation: jobs backfilled into the holes of the schedule] List algorithms: use the available processors for executing the first possible task in the list.

Without reservation Proposition: any list algorithm is a (2 - 1/m)-approximation. This is a special case of Graham 1975 (resource constraints), revisited by [Eyraud et al., IPDPS 2007]. The bound is tight (same example as in the well-known 1969 analysis for sequential tasks).
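A minimal sketch of such a list algorithm for rigid tasks, each given as a pair (q, p) with q ≤ m processors required during p time units (illustrative code, not the exact algorithm analyzed above):

import heapq

def list_schedule(tasks, m):
    free = m          # processors currently idle
    clock = 0.0
    running = []      # min-heap of (finish_time, q)
    waiting = list(tasks)
    while waiting or running:
        # start the first task in the list that fits, repeatedly
        progressed = True
        while progressed:
            progressed = False
            for k, (q, p) in enumerate(waiting):
                if q <= free:
                    free -= q
                    heapq.heappush(running, (clock + p, q))
                    del waiting[k]
                    progressed = True
                    break
        # advance time to the next completion and release its processors
        if running:
            clock, q = heapq.heappop(running)
            free += q
    return clock  # makespan of the list schedule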

With reservation The previous guarantee is not valid anymore.

Complexity The problem is already NP-hard with no reservation. Even worse, an optimal solution with arbitrary reservations may be delayed as long as we want. [figure: Cmax* pushed arbitrarily far by the reservations] Conclusion: the problem cannot be approximated unless P = NP, even for m = 1.

Two preliminary results 1. Decreasing number of available processors. 2. Restricted reservation problem: a fixed fraction α of the processors is always available, i.e. r(t) ≤ (1-α)m, and every task i satisfies qi ≤ αm. [figure: m processors, of which αm are always available]

Analysis Case 1: the same approximation bound 2 - 1/m is still valid. Case 2: the list algorithm has a guarantee of 2/α. Insight of the proof: while the optimum may use all m processors, the list algorithm uses only αm of them, with an approximation factor of 2… There exists a lower bound arbitrarily close to this bound: 2/α - 1 + α/2 if 2/α is an integer.
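A worked check of the formulas above: with α = 1/2, 2/α = 4 is an integer; the list algorithm then guarantees 2/α = 4, while the lower bound is 2/α - 1 + α/2 = 4 - 1 + 0.25 = 3.25.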

Conclusion Many interesting open problems remain around reservations: using preemption; non-rigid reservations; better (but more costly) approximations.