Server Assignment Policies

Task Assignment Problem – (1)
- Task assignment: incoming jobs must be assigned to one of many (independent) servers, each with its own queue
- Compare to a single queue feeding multiple servers (M/G/k)
- Task assignment policy ≠ scheduling policy, which determines the order in which a given server selects jobs for service from its own queue
[Figure: Poisson(λ) arrivals dispatched by a task assignment policy to k queues, each served under its own scheduling policy, vs. a single central M/G/k queue]

Task Assignment Problem – (2)
- Goal is typically to minimize the mean system response time
- The best assignment policy typically depends on the servers' scheduling policy, e.g., FCFS or PS
- Two main settings:
  - Optimize task assignment given a server scheduling policy
  - Jointly optimize the task assignment and scheduling policies
- Task assignment matters most when job service times are highly variable

Task Assignment to FCFS Servers
- Given k FCFS servers, Poisson arrivals of rate λ, and a general service time distribution G with mean 1/μ
- How do we assign jobs to servers to minimize response time?
- Main parameters:
  - Job sizes are known, or unknown (only the distribution is known)
  - Policy is static, or dynamic (based on current server states)

Task Assignment Policies (FCFS Servers)
Policies when job sizes are unknown:
- Random (static): each job is sent to server i with probability 1/k (yields k independent M/G/1 queues, each with Poisson arrivals of rate λ/k)
- Round-Robin (static): job j is sent to server i, job j+1 to server (i+1) mod k (yields k independent E_k/G/1 queues with rate λ/k; E_k arrivals are more regular than Poisson) – equalizes the expected number of jobs at each queue
- Join-the-Shortest-Queue, JSQ (dynamic): each job is sent to the server with the fewest jobs in its queue – equalizes the instantaneous number of jobs at each queue
- M/G/k: like JSQ it is dynamic, but has the added advantage of deferring assignment decisions until the last possible moment
Policies when job sizes are known:
- Least-Work-Left, LWL (dynamic): assign each job to the server with the least unfinished work (can be shown equivalent to M/G/k) – equalizes the total work at each server
- Size-Interval-Task-Assignment, SITA (static): all jobs whose size falls in a given interval are sent to the same server (reduces service time variability at individual servers)
  - Finding the best size thresholds is hard, and the optimum often favors highly unbalanced loads (a bounded Pareto with α < 1 calls for a lightly loaded "small jobs" server, while a bounded Pareto with α > 1 instead favors keeping the "large jobs" server lightly loaded)
  - Typically superior to the other policies, including LWL, but not always (keeping servers busy, as LWL does, can yield better outcomes)
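The policies above can be compared with a small simulation. This is a minimal sketch, not from the original slides: it assumes exponential(1) job sizes, k = 2 FCFS servers, and a total Poisson arrival rate λ = 0.5, and measures mean response time under each assignment policy.

```python
import heapq
import random

def simulate(policy, n_jobs=100_000, k=2, lam=0.5, seed=1):
    """Mean response time at k FCFS servers under a task-assignment policy.
    Sketch only: exponential(1) job sizes are an assumption for illustration."""
    rng = random.Random(seed)
    free = [0.0] * k                  # time at which each server clears its work
    depart = [[] for _ in range(k)]   # heaps of outstanding departure times (for JSQ)
    t, total_resp, rr = 0.0, 0.0, 0
    for _ in range(n_jobs):
        t += rng.expovariate(lam)     # Poisson arrivals of total rate lam
        size = rng.expovariate(1.0)   # job size
        if policy == "random":
            i = rng.randrange(k)
        elif policy == "round-robin":
            i, rr = rr, (rr + 1) % k
        elif policy == "lwl":         # least unfinished work
            i = min(range(k), key=lambda m: max(0.0, free[m] - t))
        else:                         # "jsq": fewest jobs still in the system
            for q in depart:          # drop jobs that have already departed
                while q and q[0] <= t:
                    heapq.heappop(q)
            i = min(range(k), key=lambda m: len(depart[m]))
        start = max(t, free[i])       # FCFS: wait behind earlier jobs at server i
        free[i] = start + size
        heapq.heappush(depart[i], free[i])
        total_resp += free[i] - t
    return total_resp / n_jobs
```

At these settings the dynamic policies should come out ahead: mean response time under "random" exceeds "round-robin", which exceeds "jsq" and "lwl" (the latter two nearly coincide for exponential sizes).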

SITA vs. LWL
- SITA typically outperforms LWL, sometimes by an order of magnitude, when the squared coefficient of variation C2 of the service times is sufficiently large
- But the outcome can reverse as C2 becomes extremely large (C2 → ∞)
  - In a 2-server system, under any SITA cutoff the first server sees a finite service time variance, but the second still sees an infinite variance; hence the overall mean response time remains infinite
  - Under LWL at light load, the odds of two big jobs arriving close enough together to interfere are very low; one server effectively remains available to serve short jobs, which keeps response times bounded
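The role of C2 here can be made concrete. A small sketch (the parameter values are illustrative, not from the slides) computes C2 for a bounded Pareto distribution on [k, p] with shape α, and shows C2 growing without bound as the upper cutoff p increases whenever α < 2:

```python
def bp_moment(n, k, p, alpha):
    """n-th moment of a bounded Pareto(k, p, alpha), valid for n != alpha."""
    norm = alpha * k**alpha / (1 - (k / p) ** alpha)
    return norm * (p ** (n - alpha) - k ** (n - alpha)) / (n - alpha)

def squared_cv(k, p, alpha):
    """C^2 = Var[X] / E[X]^2."""
    m1 = bp_moment(1, k, p, alpha)
    m2 = bp_moment(2, k, p, alpha)
    return m2 / m1**2 - 1

# With alpha = 1.5 < 2, E[X^2] diverges as p -> infinity, so C^2 blows up:
for p in (1e2, 1e4, 1e6):
    print(f"p = {p:.0e}: C^2 = {squared_cv(1.0, p, 1.5):.1f}")
```

As p grows the unbounded Pareto limit is approached, which is exactly the C2 → ∞ regime where SITA's fixed cutoff leaves one server with infinite variance.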

Exploring SITA (Problem 24.1)
- Two-server system with Poisson arrivals of rate λ and Pareto service times with parameter α = 2.5
- SITA threshold x: jobs of size < x go to server 1, and the remaining jobs go to server 2
- Initially, as x increases, more jobs are assigned to server 1 (increasing its load while decreasing both the load and the service time variability at server 2)
- Consider the scenario with λ = 0.5, i.e., ρ ≈ 0.417 per server:
  - Optimal x ≈ 1.727
  - ρ(S < 1.727) ≈ 0.466, ρ(S ≥ 1.727) ≈ 0.36
[Figure: mean response time as a function of the threshold x]
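The load split in this example can be checked directly. Assuming a Pareto distribution with shape α = 2.5 and minimum job size 1 (the minimum is inferred from ρ ≈ 0.417, since then E[S] = 5/3 and λ·E[S]/2 ≈ 0.417), the load sent to each server under cutoff x follows from the partial mean E[S·1{S<x}]:

```python
ALPHA = 2.5                            # Pareto shape from the problem
X_MIN = 1.0                            # assumed minimum job size (matches rho ≈ 0.417)
LAM = 0.5                              # total arrival rate
MEAN_S = ALPHA * X_MIN / (ALPHA - 1)   # E[S] = 5/3

def partial_mean(x):
    """E[S * 1{S < x}] for a Pareto(ALPHA, X_MIN) job size S."""
    return MEAN_S * (1 - (X_MIN / x) ** (ALPHA - 1))

def sita_loads(x):
    """Offered load at servers 1 and 2 under SITA with cutoff x."""
    rho1 = LAM * partial_mean(x)       # work from jobs of size < x
    rho2 = LAM * MEAN_S - rho1         # remaining work goes to server 2
    return rho1, rho2

rho1, rho2 = sita_loads(1.727)
print(f"rho1 = {rho1:.3f}, rho2 = {rho2:.3f}")  # ≈ 0.466 and 0.367
```

The result reproduces the unbalanced split from the slide: the "small jobs" server carries noticeably more load than the "large jobs" server at the optimal cutoff.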

Task Assignment to PS Servers
- All the policies used for FCFS servers are still available, but the outcomes are often different
- Random and SITA perform similarly when SITA balances load (balancing is then the best option); SITA's main advantage (reducing service time variability) disappears under PS
- JSQ will typically outperform LWL under highly variable service times
- JSQ is greedy: it assigns each job to the server likely to yield the lowest response time
- JSQ turns out to be close to "optimal": within a few percent of a policy that assigns each job to the server minimizing the response time of all jobs currently in the system, assuming that there are no further arrivals