1 Pruhs, Woeginger, Uthaisombut 2004
- QoS objective: minimize total flow time
- The flow time f_i of a job i is its completion time minus its release time: f_i = C_i - r_i
- Power objective: constraint that at most E energy is used
- We make the simplifying assumption that all jobs have the same (unit) amount of work
- Optimal job selection policy?

2 Pruhs, Woeginger, Uthaisombut 2004
- QoS objective: minimize total flow time
- The flow time f_i of a job i is its completion time minus its release time: f_i = C_i - r_i
- Power objective: constraint that at most E energy is used
- We make the simplifying assumption that all jobs have the same (unit) amount of work
- In this case the optimal job selection policy is First Come First Served
- We thus focus on the speed setting policy
- WLOG, assume r_1 ≤ r_2 ≤ … ≤ r_n

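4 Warm-up Exercise: n unit jobs released at time 0
- How much energy to devote to each job?
- Min Σ_i (n-i+1)/s_i
- Subject to Σ_i s_i^2 ≤ E
- Here s_i is the speed of the i-th job to run: it delays itself and the n-i jobs behind it, and with power P = s^3 a unit job run at speed s_i uses energy s_i^2
- Either by Lagrange multipliers or by intuitive reasoning, s_i ∝ (n-i+1)^(1/3)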
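The warm-up solution can be checked numerically. The sketch below is illustrative only: it assumes power = speed^3 (so a unit job at speed s costs energy s^2), and the function names `warmup_speeds`, `total_flow`, and `energy_used` are hypothetical helpers, not part of the talk.

```python
def warmup_speeds(n, energy):
    """Speeds s_1..s_n for n unit jobs released at time 0.

    Assumes power = speed**3, so s_i is proportional to (n-i+1)**(1/3),
    scaled so that the energy budget is used exactly.
    """
    weights = [(n - i + 1) ** (1.0 / 3.0) for i in range(1, n + 1)]
    scale = (energy / sum(w * w for w in weights)) ** 0.5
    return [scale * w for w in weights]


def total_flow(speeds):
    """Total flow time when the unit jobs run back to back in the given order."""
    n = len(speeds)
    # The i-th job takes time 1/s_i and delays itself plus the n-i jobs behind it.
    return sum((n - i + 1) / s for i, s in enumerate(speeds, start=1))


def energy_used(speeds):
    """Energy of unit jobs under power = speed**3: each job costs s**2."""
    return sum(s * s for s in speeds)
```

Comparing `total_flow(warmup_speeds(n, E))` with the flow of any other speed vector that uses the same energy (for example, all speeds equal) shows the effect of running the early jobs faster.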

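5 Convex Program with Release Times
- Min Σ_i (C_i - r_i)
- Subject to
  - Σ_i 1/(C_i - max(r_i, C_{i-1}))^2 ≤ E
  - C_i > C_{i-1}
- Job i runs during the interval from max(r_i, C_{i-1}) to C_i, so its speed is the reciprocal of that interval's length and its energy is the speed squared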
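To make the objective and the energy constraint of this program concrete, here is a minimal sketch that evaluates both for a given vector of completion times. It assumes unit-work jobs, First Come First Served order, and power = speed^3; `schedule_cost` is a hypothetical helper name.

```python
def schedule_cost(release, completion):
    """Return (total_flow, total_energy) for unit jobs run in FCFS order.

    release and completion are lists indexed by job, sorted by release time.
    Assumes power = speed**3, so a unit job run for duration x uses energy 1/x**2.
    """
    flow = 0.0
    energy = 0.0
    prev_completion = float("-inf")
    for r, c in zip(release, completion):
        start = max(r, prev_completion)  # job starts at its release or when the previous job ends
        duration = c - start
        if duration <= 0:
            raise ValueError("each job needs a positive-length execution interval")
        speed = 1.0 / duration           # unit work finished in `duration` time
        flow += c - r
        energy += speed ** 2
        prev_completion = c
    return flow, energy
```

For example, two jobs released at time 0 and completed at times 1 and 2 both run at speed 1, giving total flow 3 and total energy 2.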

6 KKT Optimality Conditions (2)
- Consider a strictly feasible, convex, differentiable program
- A sufficient condition for a solution x to be optimal is the existence of Lagrange multipliers λ_i satisfying the KKT conditions
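For reference, the textbook form of the KKT conditions is sketched below. The notation (objective f_0, constraint functions f_i, k constraints) is generic and assumed here; it is not taken from the slide.

```latex
\begin{aligned}
&\text{minimize } f_0(x) \quad \text{subject to } f_i(x) \le 0,\quad i = 1,\dots,k \\[4pt]
&f_i(x) \le 0, \qquad \lambda_i \ge 0, \qquad \lambda_i\, f_i(x) = 0, \qquad i = 1,\dots,k \\
&\nabla f_0(x) + \sum_{i=1}^{k} \lambda_i\, \nabla f_i(x) = 0
\end{aligned}
```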

7 KKT Optimality Conditions
- Total energy of E is used
- C_i < r_{i+1} implies ρ_i = ρ_n
- C_i > r_{i+1} implies ρ_i = ρ_{i+1} + ρ_n
- C_i = r_{i+1} implies ρ_n ≤ ρ_i ≤ ρ_{i+1} + ρ_n
- Example: [Figure: a schedule with release times r_2, r_3, r_6, r_7 marked, where p_n = p_7 and the job powers are labelled p_n, 2p_n, 3p_n, p_2, and p_2 + p_n]

8 Offline Algorithm

9 KKT Optimality Conditions
- Algorithmic difficulties:
  - This doesn't tell us the value of ρ_n
    - Solution: binary search
  - Don't know the value of p_i when C_i = r_{i+1}
    - Solution: can calculate it, since you know the interval in which the job runs
  - Don't know whether C_i <, =, or > r_{i+1}
    - Easy for high energy E: C_i < r_{i+1}
    - Solution: trace out the optimal schedules as E decreases
[Figure: schedule around release times r_2 and r_3 with powers p_2 and p_2 + p_n]

10 Algorithmic Evolution
[Figure: the sequence of configurations (relations <, =, > between completion times and the release times r_2, r_3) as the energy goes from high to low]

11 Intuition
- Intuitively, as you lose energy, should jobs run faster or slower?

12 Intuition
- Intuitively, as you lose energy, jobs should run slower, but this intuition is false
- Example:
  - Higher energy: p_1 = 2p_3 and p_2 = p_3
  - Lower energy: p_1 = 3p_3 and p_2 = 2p_3
  - p_1/p_2 decreases and job 2 speeds up as we lose energy

13 What Goes Wrong With Arbitrary Work Jobs
[Figure: configurations (relations ≤, =, ≥) for jobs of arbitrary length]
- Open question: What is the complexity of finding optimal flow time schedules when jobs have arbitrary work?
- The optimal schedule is not a continuous function of the energy E

14
- Theorem: There is no O(1)-competitive online algorithm for the bounded energy problem
- Proof idea: How much energy do you give the first job that arrives?
  - If it is not Ω(E), then you are not O(1)-competitive

15 Energy/Flow Trade-Off Problem Definition [AF06]
- Job i has release date r_i and work y_i
- Optimize total flow + ρ * energy used
- Natural interpretation: the user specifies the improvement in response, ρ, for which he is willing to spend one unit of energy
  - E.g., if the user is willing to spend 1 erg of energy for a 3 microsecond improvement in response, then ρ = 3
- WLOG, ρ = 1

16 One job example
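As an illustration of the flow + energy objective for a single job, the worked calculation below assumes ρ = 1, the power function P(s) = s^3, and a job of work y run at a constant speed s. These assumptions, and the name cost(s), are chosen for this sketch and may differ from the example shown in the talk.

```latex
\begin{aligned}
\text{flow} &= \frac{y}{s}, \qquad \text{energy} = P(s)\cdot\frac{y}{s} = y\,s^{2}, \qquad
\mathrm{cost}(s) = \frac{y}{s} + y\,s^{2}, \\
\frac{d\,\mathrm{cost}}{ds} &= -\frac{y}{s^{2}} + 2\,y\,s = 0
\;\Longrightarrow\; s^{*} = 2^{-1/3}, \qquad
\mathrm{cost}(s^{*}) = \tfrac{3}{2}\,2^{1/3}\,y \approx 1.89\,y .
\end{aligned}
```

Under the same assumptions, running at speed 1 (power 1) costs y + y = 2y, which is within a small constant factor of this optimum.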

17 Natural Policies
- Job selection?
- Speed scaling?

18 Natural Policies
- Job selection: SRPT (Shortest Remaining Processing Time)
- Speed scaling: power = number of unfinished jobs
  - The energy objective then increases at the same rate as the flow objective
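The two natural policies can be sketched as a small event-driven simulation. This is an illustrative sketch only: it assumes P(s) = s^alpha with alpha = 3, release times at or after time 0, and integral job counts; `simulate_srpt_natural` is a hypothetical name.

```python
def simulate_srpt_natural(jobs, alpha=3.0):
    """Simulate SRPT with power = number of unfinished jobs.

    jobs: list of (release_time, work) pairs with release_time >= 0.
    Assumes P(s) = s**alpha, so with n unfinished jobs the speed is n**(1/alpha).
    Returns (total_flow, total_energy).
    """
    pending = sorted(jobs)   # jobs not yet released, ordered by release time
    remaining = []           # remaining work of released, unfinished jobs
    t = flow = energy = 0.0
    next_idx = 0
    while next_idx < len(pending) or remaining:
        if not remaining:
            # Nothing to run: idle until the next release.
            t = max(t, pending[next_idx][0])
        while next_idx < len(pending) and pending[next_idx][0] <= t:
            remaining.append(pending[next_idx][1])
            next_idx += 1
        remaining.sort()     # SRPT: the job with least remaining work runs
        n = len(remaining)
        speed = n ** (1.0 / alpha)
        time_to_finish = remaining[0] / speed
        if next_idx < len(pending):
            time_to_release = pending[next_idx][0] - t
        else:
            time_to_release = float("inf")
        dt = min(time_to_finish, time_to_release)
        flow += n * dt                   # each unfinished job accrues flow at rate 1
        energy += speed ** alpha * dt    # power equals n by the choice of speed
        if time_to_finish <= time_to_release:
            remaining.pop(0)
            t += dt
        else:
            remaining[0] -= speed * dt
            t = pending[next_idx][0]
    return flow, energy
```

Because the power always equals the number of unfinished jobs, the two returned totals coincide up to rounding, which is the balance property stated on the slide.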

19 Bansal, Chan, Pruhs, SODA 2009
- We consider allowing an arbitrary power function P(s)
- Only require something like P being piecewise smooth, e.g.
[Figure: an example power function; axes: speed s, power P(s)]

20 Our Results
- Main Theorem: SRPT + a variation of the natural speed scaling algorithm is 3-competitive for the objective of flow + energy, for an arbitrary power function P(s) and for arbitrary work jobs
  - Later improved to 2-competitive by Lachlan L. H. Andrew, Adam Wierman and Ao Tang
- Second Theorem: HDF (Highest Density First) + a variation of the natural speed scaling algorithm is 2-competitive for the objective of fractional weighted flow + energy, for essentially any power function P(s) and for arbitrary work and weight jobs
  - Probably not possible to get such a result for integral weighted flow, since being competitive for weighted flow requires resource/speed augmentation [BC09]

21 Equation for Amortized Local Competitiveness Argument
- P_on(t) + N_on(t) + dΦ(t)/dt ≤ c [P_opt(t) + N_opt(t)]
  - P(t) is the power at time t
  - N(t) is the number of unfinished jobs at time t
  - on is the online algorithm
  - opt is the adversary/optimal
  - Φ(t) is the potential function
  - c is the competitive ratio

22 Derivation of Potential Function for Unit Work Jobs
- Initial equation: P_on + N_on + dΦ/dt ≤ 2 [P_opt + N_opt]
- Since P_on = N_on: 2 P_on + dΦ/dt ≤ 2 [P_opt + N_opt]
- Guess that the potential Φ is a function of N = N_on - N_opt, so that job arrivals do not affect the potential function
- The worst case is when N_opt = 0, or equivalently when N = N_on
- By the chain rule, dΦ/dt = (dΦ/dN)(dN/dt). Note that dN/dt = s_opt - s_on, so 2 P_on + (dΦ/dN)(s_opt - s_on) ≤ 2 P_opt
- Solving for Φ gives dΦ/dN ≤ 2 [P_opt - P_on]/(s_opt - s_on)
- The term [P_opt - P_on]/(s_opt - s_on) looks like the slope of P(s) around the point s = s_on
- So set dΦ/dN = 2 dP(s_on)/ds = 2 dP(P^{-1}(N))/ds
- Or equivalently, Φ(N) = 2 ∫_0^N dP(P^{-1}(y))/ds dy
[Figure: power function; axes: speed s, power P]
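As a worked instance of this potential function, assume the cube-law power function P(s) = s^3 (an assumption made for illustration; the derivation itself is for a general P). The integral then evaluates in closed form:

```latex
\begin{aligned}
P(s) &= s^{3}, \qquad P^{-1}(y) = y^{1/3}, \qquad
\frac{dP}{ds}\bigl(P^{-1}(y)\bigr) = 3\,y^{2/3}, \\
\Phi(N) &= 2\int_{0}^{N} 3\,y^{2/3}\,dy = \frac{18}{5}\,N^{5/3}.
\end{aligned}
```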

1. Nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor
2. Speed scaling on a uniprocessor
SPAA 2009: Speed scaling and nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor
- We define and address one algorithmic multiprocessor power management problem

24 Outline of the Talk
1. Nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor

25 Speed-up Curves
- Each portion of work/code has a speed-up function that specifies how fast work is processed as a function of the number of processors assigned
[Figure: rate at which work is processed versus number of processors, showing a parallel speed-up curve, a sequential speed-up curve, and an arbitrary speed-up curve]

27
- Definition: Nonclairvoyant means that the scheduling algorithm doesn't know the speed-up curves or the work of the jobs
- Question: What is the most natural scheduling algorithm if one doesn't know the speed-up curves?
- Answer: Equipartition (round-robin, processor sharing), which assigns an equal number of processors to each job

30
- Question: What else can a nonclairvoyant algorithm do, besides sharing the processors equally among the jobs, that might be better?
- Answer: Late Arrival Processor Sharing (LAPS), which shares the processors equally among the latest-arriving δn jobs
- Theorem [EP09]: LAPS is (1+ε)-speed O(1)-competitive
- Note that (1+ε)-speed is required to be competitive even if the online algorithm knows the speed-up curves and the work of each job
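The LAPS allocation rule can be sketched as follows. This is a minimal illustration: rounding δn up to an integer and allowing fractional processor shares are assumptions of the sketch, and `laps_shares` is a hypothetical name.

```python
import math


def laps_shares(arrival_times, processors, delta):
    """Processor share per job under LAPS.

    arrival_times: arrival time of each unfinished job (index = job id).
    delta: fraction of the latest-arriving jobs that get service (0 < delta <= 1).
    Returns {job_id: share}; jobs not listed receive no processors.
    Rounding delta * n up is an assumption of this sketch.
    """
    n = len(arrival_times)
    if n == 0:
        return {}
    k = max(1, math.ceil(delta * n))
    # Pick the k jobs that arrived most recently.
    latest = sorted(range(n), key=lambda j: arrival_times[j], reverse=True)[:k]
    share = processors / k
    return {j: share for j in latest}
```

With delta = 1 every unfinished job gets the same share, which is Equipartition; smaller values of delta concentrate the processors on the most recent arrivals.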

31 Analysis of Equipartition and LAPS
- Key Lemma: The worst case is if all work is parallel or sequential

33 Outline of the Talk
2. Speed scaling on a uniprocessor

34
- Second most natural nonclairvoyant online algorithm
  - Job selection: LAPS
    - Recall that LAPS shares the processing power equally among the latest-arriving constant fraction of the jobs
  - Speed scaling: power = number of unfinished jobs
- Theorem [CELLMP09]: The above algorithm is O(1)-competitive for total flow time plus energy
- Proof: amortized local competitiveness argument

SPAA 2009: Speed scaling and nonclairvoyant scheduling of jobs with arbitrary speed-up curves on a multiprocessor

37 Problem we address in SPAA 2009 paper
- Nonclairvoyant scheduling and speed scaling on a multiprocessor so as to minimize total flow time plus energy
[Figure: a schedule of Job 1 on Processor 1 and Processor 2, where the height of each block is the speed]

39 Warm-up Problem
- Question: If you are running a single parallel job on m processors, what should the speed s be?
- Answer:
  - To optimize flow + energy you should equate flow and energy, since x + y = Θ(max(x, y))
  - Thus you want the rate of increase of flow (which is the number of unfinished jobs) to equal the rate of increase in energy (which is the power P)
  - Therefore total power = mP = 1, so P = 1/m, or equivalently s = 1/m^(1/3), since P = s^3
  - Note that the total power used is independent of the number of machines
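The balance argument can be checked with a short calculation. The sketch below assumes a fully parallel job (processing rate m times the per-processor speed) and P = s^alpha with alpha = 3; `balanced_speed` and `flow_and_energy` are hypothetical names.

```python
def balanced_speed(m, alpha=3.0):
    """Per-processor speed at which total power m * s**alpha equals 1 (one unfinished job)."""
    return (1.0 / m) ** (1.0 / alpha)


def flow_and_energy(work, m, speed, alpha=3.0):
    """Flow and energy of one fully parallel job run on m processors at the given speed."""
    duration = work / (m * speed)           # parallel work is processed at rate m * speed
    flow = duration                         # one unfinished job accrues flow at rate 1
    energy = m * speed ** alpha * duration  # total power times duration
    return flow, energy
```

At `balanced_speed(m)` the two returned values agree, whatever m is, which matches the observation that the total power is independent of the number of machines.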

41
- Impossibility Theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors
- Proof: Consider an instance of a single job that is either parallel or sequential
  - If you run the job on few processors, then either the flow or the energy must be much higher than optimal in the case that the job is parallel
  - If you run the job on many processors, then either the flow or the energy must be much higher than optimal in the case that the job is sequential

43
- Question: If you have only one job that you know is either parallel or sequential, how would you schedule it?
- Answer:
  - Run one copy at speed 1 on one processor and one copy at speed 1/m^(1/3) on the rest of the processors
  - This is O(1)-competitive
  - This is allowable if the job has no side effects

44
- Impossibility Theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors if jobs can have side effects

45
- Candidate algorithm for a 1-job instance: Run one copy on 2^i processors at power 1/2^i
- This candidate algorithm works if all portions of the job have a single speed-up curve

46
- Candidate algorithm for a 1-job instance: Run one copy on 2^i processors at power 1/2^i
- Impossibility Theorem: Neither the candidate algorithm nor any other nonclairvoyant algorithm has a competitive ratio that scales reasonably with the number of processors
- Proof: Consider a job where the speed-up curves of its different portions change over time

48
- Question: If you have only one job where different portions have different speed-up curves, how would you schedule it?
- Answer:
  - Run one copy on 2^i processors at power 1/2^i, checkpointing constantly
  - This is O(log m)-competitive
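One possible reading of this rule is sketched below: for each i = 0, 1, 2, … one copy of the job runs on its own group of 2^i processors, each at power 1/2^i, so every group draws total power 1 and the number of groups is logarithmic in m. The per-group interpretation, the disjoint groups, and P = s^alpha with alpha = 3 are assumptions of this sketch; `copy_groups` is a hypothetical name.

```python
def copy_groups(m, alpha=3.0):
    """Partition up to m processors into groups of size 1, 2, 4, ...

    Returns a list of (group_size, power_per_processor, speed_per_processor).
    Assumes one copy of the job per group and P(s) = s**alpha.
    """
    groups = []
    used = 0
    i = 0
    while used + 2 ** i <= m:
        size = 2 ** i
        power = 1.0 / size                # each group's total power is size * power = 1
        groups.append((size, power, power ** (1.0 / alpha)))
        used += size
        i += 1
    return groups
```

Under this reading the total power is the number of groups, roughly log2(m), which is consistent with the O(log m) factor in the competitive ratio.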

49
- Impossibility Theorem: There is no algorithm whose competitiveness scales reasonably with the number of processors if jobs can have side effects or cannot be checkpointed
- But we can get a positive result if jobs don't have side effects and we can checkpoint cheaply

50
- Main Theorem: If jobs have no side effects and are checkpointable, then there is a nonclairvoyant scheduling and speed scaling algorithm that is O(log m)-competitive for the objective of flow + energy
- Corollary: O(1)-competitiveness is possible for clairvoyant online algorithms

51
- Main Theorem: If jobs have no side effects and are checkpointable, then there is a nonclairvoyant scheduling and speed scaling algorithm that is O(log m)-competitive for the objective of flow + energy
- Algorithm description:
  - Scheduling: LAPS, plus run copies at a faster rate on fewer processors in case the jobs are not parallel, checkpointing constantly
  - Speed scaling: natural algorithm that equates power and number of unfinished jobs

52 Algorithm Analysis
- Key Lemma: The worst case is if every speed-up curve is parallel up to some number of processors and then is sequential
  - Contrast this with the fixed-speed processor case, where the worst case is if all work is either parallel or sequential
- The rest of the analysis is a reduction to the single-processor case
[Figure: speed-up curve; axes: number of processors, rate at which work is processed]

