Pruhs, Woeginger, Uthaisombut 2004
 QoS objective: minimize total flow time
 The flow time f_i of a job i is its completion time minus its release time: C_i − r_i
 Power objective: constraint that at most E energy is used
 We make the simplifying assumption that all jobs have the same (unit) amount of work
 Optimal job selection policy?

Pruhs, Woeginger, Uthaisombut 2004
 QoS objective: minimize total flow time
 The flow time f_i of a job i is its completion time minus its release time: C_i − r_i
 Power objective: constraint that at most E energy is used
 We make the simplifying assumption that all jobs have the same (unit) amount of work
 In this case the optimal job selection policy is First Come First Served (FCFS)
 We thus focus on the speed setting policy
 WLOG assume r_1 ≤ r_2 ≤ … ≤ r_n

Warm-up Exercise: n unit jobs released at time 0
 How much energy to devote to each job?

Warm-up Exercise: n unit jobs released at time 0
 How much energy to devote to each job?
 Min Σ_i (n−i+1)/s_i
 Subject to Σ_i s_i^2 ≤ E
 Either by Lagrange multipliers or intuitive reasoning, s_i ∝ (n−i+1)^(1/3)
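The closed form above can be checked in code. The following is a minimal sketch (function names are mine, not the paper's); it assumes the cube power function P(s) = s^3 used elsewhere in the talk, so a unit job run at speed s takes time 1/s and uses energy s^3 · (1/s) = s^2, and under FCFS the term 1/s_i is paid by the n−i+1 jobs that wait for job i.

```python
import math

def optimal_speeds(n, E):
    """Closed-form speeds for n unit jobs released at time 0.

    Minimizes sum_i (n-i+1)/s_i subject to sum_i s_i^2 <= E.
    Lagrange conditions give s_i proportional to (n-i+1)^(1/3);
    the constant is fixed by spending the full energy budget E.
    """
    weights = [(n - i + 1) ** (1.0 / 3.0) for i in range(1, n + 1)]
    c = math.sqrt(E / sum(w * w for w in weights))
    return [c * w for w in weights]

def total_flow(speeds):
    # Under FCFS, 1/s_i is counted once for each of the n-i+1 jobs i..n.
    n = len(speeds)
    return sum((n - i) / s for i, s in enumerate(speeds))
```

A quick sanity check: the returned speeds are decreasing (earlier jobs run faster), they use exactly E energy, and shifting a bit of energy between jobs does not reduce the total flow.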

Convex Program with Release Times
 Min Σ_i (C_i − r_i)
 Subject to
 Σ_i 1/(C_i − max(r_i, C_{i−1}))^2 ≤ E
 (a unit job run in an interval of length t at constant speed 1/t uses energy t · (1/t)^3 = 1/t^2)
 C_i > C_{i−1}

KKT Optimality Conditions
 Consider a strictly feasible convex differentiable program: minimize f(x) subject to g_i(x) ≤ 0
 A sufficient condition for a solution x to be optimal is the existence of Lagrange multipliers λ_i ≥ 0 such that ∇f(x) + Σ_i λ_i ∇g_i(x) = 0 and λ_i g_i(x) = 0 for all i (complementary slackness)

KKT Optimality Conditions
 Total energy of E is used
 C_i < r_{i+1} implies p_i = p_n
 C_i > r_{i+1} implies p_i = p_{i+1} + p_n
 C_i = r_{i+1} implies p_n ≤ p_i ≤ p_{i+1} + p_n
 [Figure: an example schedule with power levels p_n, 2p_n, 3p_n, p_2, and p_2 + p_n around the release times r_2, r_3, r_6, r_7]

Offline Algorithm

KKT Optimality Conditions
 Algorithmic difficulties:
 The conditions don't tell us the value of p_n
 Solution: binary search
 Don't know the value of p_i when C_i = r_{i+1}
 Solution: can be calculated, since the interval in which the job runs is known
 Don't know whether C_i < r_{i+1}, C_i = r_{i+1}, or C_i > r_{i+1}
 For high energy E this is easy: C_i < r_{i+1}
 Solution: trace out the optimal schedules as E decreases
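The "binary search on an unknown value" idea can be illustrated in the simpler warm-up setting without release times, where the KKT stationarity condition expresses every speed in terms of a single unknown Lagrange multiplier λ, and the energy used is monotone in λ. This is a hedged sketch of the idea only, not the paper's full offline algorithm (which must also trace configurations as E changes); names are mine.

```python
def speeds_for_multiplier(n, lam):
    # KKT stationarity for min sum_i (n-i+1)/s_i s.t. sum_i s_i^2 <= E:
    # (n-i+1)/s_i^2 = 2*lam*s_i  =>  s_i = ((n-i+1)/(2*lam))^(1/3)
    return [((n - i + 1) / (2.0 * lam)) ** (1.0 / 3.0) for i in range(1, n + 1)]

def find_multiplier(n, E, iters=100):
    # Energy used is strictly decreasing in lam, so binary search for the
    # multiplier at which the budget E is spent exactly.
    lo, hi = 1e-9, 1e9
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        energy = sum(s * s for s in speeds_for_multiplier(n, mid))
        if energy > E:
            lo = mid      # speeds too high: raise the price of energy
        else:
            hi = mid
    return (lo + hi) / 2.0
```

The search converges to the multiplier whose speeds spend exactly E, and the resulting speeds have the (n−i+1)^(1/3) proportionality from the warm-up slide.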

Algorithmic Evolution
 [Figure: as the energy budget moves from high to low, the configuration of relations (<, =, >) between each C_i and r_{i+1} evolves through a sequence of configurations]

Intuition
 Intuitively, as you lose energy, should jobs run faster or slower?

Intuition
 Intuitively, as you lose energy, jobs should run slower, but this intuition is false
 Example:
 Higher energy: p_1 = 2p_3 and p_2 = p_3
 Lower energy: p_1 = 3p_3 and p_2 = 2p_3
 p_1/p_2 decreases and job 2 speeds up as we lose energy

What Goes Wrong With Arbitrary-Work Jobs
 The optimal schedule is not a continuous function of the energy E
 Open question: What is the complexity of finding optimal flow time schedules when jobs have arbitrary work?

 Theorem: There is no O(1)-competitive online algorithm for the bounded energy problem
 Proof idea: How much energy do you give the first job that arrives?
 If it is not Ω(E), then you are not O(1)-competitive

Energy/Flow Trade-off Problem Definition [AF06]
 Job i has release date r_i and work y_i
 Optimize total flow + ρ × (energy used)
 Natural interpretation: the user specifies how much improvement in response time, ρ, justifies spending a unit of energy
 e.g., if the user is willing to spend 1 erg of energy for a 3-microsecond improvement in response time, then ρ = 3
 WLOG, ρ = 1

One Job Example
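As a sketch of the one-job case: for a single unit-work job released at time 0, run at constant speed s, the flow is 1/s and (assuming the cube power function P(s) = s^3, as elsewhere in the talk) the energy is s^3 · (1/s) = s^2. With ρ = 1 the objective 1/s + s^2 is convex and is minimized where −1/s^2 + 2s = 0, i.e. at s = (1/2)^(1/3) ≈ 0.794. A quick numeric check:

```python
def cost(s):
    # One unit-work job released at time 0, power P(s) = s^3:
    # flow = 1/s, energy = s^3 * (1/s) = s^2.
    return 1.0 / s + s * s

# Grid search for the minimizer; calculus gives s* = (1/2)^(1/3).
best = min((cost(0.001 * k), 0.001 * k) for k in range(1, 5000))
```

Running faster than s* wastes energy for little flow gain; running slower wastes flow for little energy saving.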

Natural Policies
 Job selection?
 Speed scaling?

Natural Policies
 Job selection: SRPT (Shortest Remaining Processing Time)
 Speed scaling: power = number of unfinished jobs
 Then the increase in the energy objective equals the increase in the flow objective
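That last bullet can be seen in a small simulation. This is a sketch with hypothetical names, not the paper's code; it assumes P(s) = s^3, so "power = number of unfinished jobs" means running at speed N(t)^(1/3). Since total flow increases at rate N(t) and energy at rate P(t) = N(t), the two accumulate identically.

```python
def simulate(jobs, dt=1e-3, horizon=50.0):
    """jobs: list of (release_time, work) pairs. Natural policy sketch:
    SRPT job selection; power = number of unfinished jobs, i.e. s = N^(1/3)
    under the assumed power function P(s) = s^3."""
    remaining = {i: w for i, (r, w) in enumerate(jobs)}
    flow = 0.0
    energy = 0.0
    t = 0.0
    while remaining and t < horizon:
        active = [i for i in remaining if jobs[i][0] <= t]
        if active:
            n = len(active)
            s = n ** (1.0 / 3.0)          # power = s^3 = n unfinished jobs
            flow += n * dt                # flow grows at rate N(t)
            energy += (s ** 3) * dt       # energy grows at rate P = N(t)
            j = min(active, key=lambda i: remaining[i])   # SRPT
            remaining[j] -= s * dt        # (tiny overshoot at completion ignored)
            if remaining[j] <= 0.0:
                del remaining[j]
        t += dt
    return flow, energy
```

On any instance the two returned totals agree up to floating-point error, which is exactly the "increase in energy = increase in flow" property.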

Bansal, Chan, Pruhs, SODA 2009
 We consider allowing an arbitrary power function P(s)
 We only require something mild, e.g. that P is piecewise smooth
 [Figure: an arbitrary power function, power P(s) plotted against speed s]

Our Results
 Main theorem: SRPT + a variation of the natural speed scaling algorithm is 3-competitive for the objective of flow + energy, for an arbitrary power function P(s) and for arbitrary-work jobs
 Later improved to 2-competitive by Lachlan L. H. Andrew, Adam Wierman, and Ao Tang
 Second theorem: HDF + a variation of the natural speed scaling algorithm is 2-competitive for the objective of fractional weighted flow + energy, for essentially any power function P(s) and for arbitrary work and weight jobs
 Such a result is probably not possible for integral weighted flow, since being competitive for weighted flow requires resource/speed augmentation [BC09]

Equation for an Amortized Local Competitiveness Argument
 P_on(t) + N_on(t) + dΦ(t)/dt ≤ c [ P_opt(t) + N_opt(t) ]
 P(t) is the power at time t
 N(t) is the number of unfinished jobs at time t
 "on" is the online algorithm
 "opt" is the adversary/optimal
 Φ(t) is the potential function
 c is the competitive ratio

Derivation of the Potential Function for Unit-Work Jobs
 Initial equation: P_on + N_on + dΦ/dt ≤ 2 [ P_opt + N_opt ]
 Since P_on = N_on: 2P_on + dΦ/dt ≤ 2 [ P_opt + N_opt ]
 Guess that the potential Φ is a function of N = N_on − N_opt, so job arrivals do not affect the potential function
 The worst case is when N_opt = 0, or equivalently when N = N_on
 By the chain rule dΦ/dt = (dΦ/dN)(dN/dt). Note dN/dt = s_opt − s_on, so: 2P_on + (dΦ/dN)(s_opt − s_on) ≤ 2 P_opt
 Solving for Φ gives dΦ/dN ≤ 2 [ P_opt − P_on ] / (s_opt − s_on)
 The term [ P_opt − P_on ] / (s_opt − s_on) looks like the slope of P(s) around the point s = s_on
 So set dΦ/dN = 2 dP(s_on)/ds = 2 dP(P^{−1}(N))/ds
 Or equivalently, Φ = 2 ∫_0^N dP(P^{−1}(y))/ds dy
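For the concrete power function P(s) = s^3 (an assumption for illustration; the slide's point is that the construction works for arbitrary P), the integral has a closed form: P^{−1}(y) = y^(1/3) and dP/ds = 3s^2, so the integrand is 2 · 3y^(2/3) and Φ(N) = (18/5) N^(5/3). A direct numeric integration confirms this:

```python
def phi_numeric(N, steps=100000):
    # Phi(N) = 2 * integral_0^N dP/ds|_{s = P^{-1}(y)} dy, for P(s) = s^3.
    # Here P^{-1}(y) = y^(1/3) and dP/ds = 3 s^2, so the integrand is 3*y^(2/3).
    total = 0.0
    dy = N / steps
    for k in range(steps):
        y = (k + 0.5) * dy            # midpoint rule
        total += 2.0 * 3.0 * y ** (2.0 / 3.0) * dy
    return total

def phi_closed_form(N):
    # 2 * (9/5) * N^(5/3) = (18/5) * N^(5/3)
    return 3.6 * N ** (5.0 / 3.0)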

Chan, Edmonds, Pruhs
 1. Nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor
 2. Speed scaling on a uniprocessor
 3. SPAA 2009: Speed scaling and nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor
 We define and address one algorithmic multiprocessor power management problem

Outline of the Talk
 1. Nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor

Speed-up Curves
 Each portion of work/code has a speed-up function that specifies how fast work is processed as a function of the number of processors assigned
 [Figure: rate at which work is processed vs. number of processors, showing a parallel speed-up curve, a sequential speed-up curve, and an arbitrary speed-up curve]

 Definition: Nonclairvoyant means that the scheduling algorithm doesn't know the speed-up curves or the work of the jobs
 Question: What is the most natural scheduling algorithm if one doesn't know the speed-up curves?

 Definition: Nonclairvoyant means that the scheduling algorithm doesn't know the speed-up curves or the work of the jobs
 Question: What is the most natural scheduling algorithm if one doesn't know the speed-up curves?
 Answer: Equipartition (round-robin, processor sharing), which assigns an equal number of processors to each job

 Theorem [E99]: Equipartition is (2+ε)-speed O(1)-competitive
 That is, the average waiting time using Equipartition is at most a constant factor larger than that of the optimal schedule on processors that are slightly slower than half as fast

 Question: What else can a nonclairvoyant algorithm do, besides sharing the processors equally among the jobs, that might be better?

 Question: What else can a nonclairvoyant algorithm do, besides sharing the processors equally among the jobs, that might be better?
 Answer: Late Arrival Processor Sharing (LAPS), which shares the processors equally among the latest-arriving δn jobs
 Theorem [EP09]: LAPS is (1+ε)-speed O(1)-competitive
 Note that (1+ε)-speed is required to be competitive even if the online algorithm knows the speed-up curves and the work of each job
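LAPS's allocation rule can be sketched as follows. This is a simplified, hypothetical implementation (names are mine): it assumes fractional processor shares are allowed and only computes who gets what at one instant; with δ = 1 it reduces to Equipartition.

```python
import math

def laps_allocation(unfinished, m, delta=0.5):
    """unfinished: list of (job_id, release_time) pairs for unfinished jobs.
    Shares the m processors equally among the ceil(delta * n) latest-arriving
    unfinished jobs; everyone else gets nothing."""
    n = len(unfinished)
    if n == 0:
        return {}
    k = max(1, math.ceil(delta * n))
    latest = sorted(unfinished, key=lambda jr: jr[1])[-k:]   # k latest arrivals
    share = m / k
    return {job_id: share for job_id, _ in latest}
```

The design point is that favoring recent arrivals (rather than sharing with everyone) is what buys the improvement from (2+ε)-speed down to (1+ε)-speed.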

Analysis of Equipartition and LAPS
 Key lemma: The worst case is when all work is either parallel or sequential


Outline of the Talk
 2. Speed scaling on a uniprocessor

 The second most natural nonclairvoyant online algorithm:
 Job selection: LAPS
 Recall LAPS shares the processing power equally among the latest-arriving constant fraction of the jobs
 Speed scaling: power = number of unfinished jobs
 Theorem [CELLMP09]: The above algorithm is O(1)-competitive for total flow time plus energy
 Proof: an amortized local competitiveness argument


Outline of the Talk
 3. SPAA 2009: Speed scaling and nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor

Problem We Address in the SPAA 2009 Paper
 Nonclairvoyantly scheduling and speed scaling on a multiprocessor so as to minimize total flow time plus energy
 [Figure: a schedule of Job 1 across Processor 1 and Processor 2, with height indicating speed]

Warm-up Problem
 Question: If you are running a single parallel job on m processors, what should the speed s be?

Warm-up Problem
 Question: If you are running a single parallel job on m processors, what should the speed s be?
 Answer:
 To optimize flow + energy you should equate flow and energy, since x + y = Θ(max(x, y))
 Thus you want the rate of increase of flow (which is the number of unfinished jobs) to equal the rate of increase of energy (which is the total power)
 Therefore total power mP = 1, so P = 1/m per processor, or equivalently s = 1/m^{1/3} since P = s^3
 Note that the total power used is independent of the number of machines
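This can be checked numerically. For a unit-work fully parallel job on m processors, each at speed s with P(s) = s^3, the job finishes at time 1/(ms), so the flow is 1/(ms) and the energy is m · s^3 · 1/(ms) = s^2. A grid search (a sketch; names are mine) finds the exact minimizer of 1/(ms) + s^2, which is (2m)^{−1/3}, within a constant factor of the slide's s = 1/m^{1/3} obtained by equating the two terms:

```python
def cost(m, s):
    # Unit-work fully parallel job on m processors, each at speed s, P(s) = s^3:
    # completion time 1/(m*s); flow = 1/(m*s); energy = m * s^3 * 1/(m*s) = s^2.
    return 1.0 / (m * s) + s * s

def best_speed(m, steps=20000):
    # cost(m, .) is convex in s, so a fine grid over (0, 1] finds the minimizer.
    grid = [k / steps for k in range(1, steps + 1)]
    return min(grid, key=lambda s: cost(m, s))
```

Doubling m from 8 to 64 to 512 shrinks the best speed by a factor of 2 each time, confirming the Θ(m^{−1/3}) scaling.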

 Impossibility theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors
 Proof: Consider an instance of a single job that is either parallel or sequential

 Impossibility theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors
 Proof: Consider an instance of a single job that is either parallel or sequential
 If you run the job on few processors, then either the flow or the energy must be much higher than optimal in the case that the job is parallel
 If you run the job on many processors, then either the flow or the energy must be much higher than optimal in the case that the job is sequential

 Question: If you have only one job, which you know is either parallel or sequential, how would you schedule it?

 Question: If you have only one job, which you know is either parallel or sequential, how would you schedule it?
 Answer:
 Run one copy at speed 1 on one processor and one copy at speed 1/m^{1/3} on the rest of the processors
 This is O(1)-competitive
 This is allowable if the job has no side effects

 Impossibility theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors if jobs can have side effects

 Candidate algorithm for one-job instances: run one copy on 2^i processors at power 1/2^i (for each i)
 This candidate algorithm works if all portions of the job have a single speed-up curve

 Candidate algorithm for one-job instances: run one copy on 2^i processors at power 1/2^i (for each i)
 Impossibility theorem: Neither the candidate algorithm nor any other nonclairvoyant algorithm has a competitive ratio that scales reasonably with the number of processors
 Proof: Consider a job where the speed-up curves of the different portions of the job change over time

 Question: If you have only one job, where different portions have different speed-up curves, how would you schedule it?

 Question: If you have only one job, where different portions have different speed-up curves, how would you schedule it?
 Answer:
 Run one copy on 2^i processors at power 1/2^i (for each i), checkpointing constantly
 This is O(log m)-competitive

 Impossibility theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors if jobs can have side effects or cannot be checkpointed
 But we can get a positive result if jobs don't have side effects and we can checkpoint cheaply

 Main theorem: If jobs have no side effects and are checkpointable, then there is a nonclairvoyant scheduling and speed scaling algorithm that is O(log m)-competitive for the objective of flow + energy
 Corollary: O(1)-competitiveness is possible for clairvoyant online algorithms

 Main theorem: If jobs have no side effects and are checkpointable, then there is a nonclairvoyant scheduling and speed scaling algorithm that is O(log m)-competitive for the objective of flow + energy
 Algorithm description:
 Scheduling: LAPS, plus run copies at a faster rate on fewer processors in case the jobs are not parallel, checkpointing constantly
 Speed scaling: the natural algorithm that equates power and the number of unfinished jobs

Algorithm Analysis
 Key lemma: The worst case is when every speed-up curve is parallel up to some number of processors and then sequential
 Contrast this with the fixed-speed processor case, where the worst case is when all work is either parallel or sequential
 The rest of the analysis is a reduction to the single-processor case
 [Figure: a speed-up curve, plotting the rate at which work is processed against the number of processors]