
1 Application of Methods of Queuing Theory to Scheduling in GRID. A Queuing Theory-based mathematical model is presented, and an explicit form of the optimal control procedure is obtained as the solution to the problem of maximizing the system throughput.

2 Why Queuing Theory? There are indeed queues in real GRIDs. The services GRIDs offer to end users closely resemble the services offered by telephone networks, the typical subject of study in Queuing Theory. The complexity of the associated processes leaves little option but to use probabilistic techniques.

3 Complexity: The Principal Limiting Factor to Modeling. GRIDs are very complicated systems themselves. GRIDs are composed of smaller complicated systems: computer hardware, networks, software. GRIDs are embedded into larger complicated systems: scientific organizations, R&D activities, globalization processes.

4 Stopping Decomposition as Soon as Possible to Avoid Unnecessary Complexity. Demarcate the phenomena specific to scheduling in GRID from the generic phenomena. Model the complicated behavior of the components with probabilistic techniques. Find the most general expression of the effects.

5 Ultimate Stopper of Decomposition. No entity in the modeled system should be decomposed if the system persists when that entity is replaced with another similar one.

6 Implications. There is no need to develop detailed models of computers, networks, software, or interactions external to GRID. There is no need to model intra-GRID interaction that does not directly affect scheduling. Information about how long it will take to process a demand on each node is all we need to know about the demand.

7 Mathematical Concepts Involved. Probability; Poisson process; multivariate distribution; linear programming; convergence “by law” (that is, in distribution).

8 Simplified Model. There is a finite number of classes of demands (all demands from the same class have equal complexity). Sub-model of structure: a set of N nodes with queues. Sub-model of flow of demands: a Poisson process of arrivals with intensity $\lambda$; M classes of demands. Sub-model of scheduling procedure: recognizes distinct classes of demands and routes the demands to the nodes it chooses.

9 Sub-Model: Structure

10 Sub-Model: Flow of Demands. Demands from class j arrive with intensity $\lambda_j = p_j \lambda$ ($\lambda_1 + \dots + \lambda_M = \lambda$). Upon arrival, a demand from class j is routed to node i with probability $s_{i,j}$. A demand from class j requires $\tau_{i,j}$ units of processing time if routed to node i. The computing time is “incompressible”: processing two demands with complexities $T_1$ and $T_2$ at a particular node requires $T_1 + T_2$ time units, independently of the order (or level of parallelism) in which they are processed.

11 Two Important Facts About Poisson Processes. Let $X_1$ and $X_2$ be independent Poisson processes with intensities $\lambda_1$ and $\lambda_2$. Then $X_1 + X_2$ is a Poisson process with intensity $\lambda_1 + \lambda_2$. Suppose a Poisson process $X$ with intensity $\lambda$ is split into $X_1$ and $X_2$: each event is passed to $X_1$ with probability $p$ and to $X_2$ otherwise. Then $X_1$ and $X_2$ are Poisson processes with intensities $p\lambda$ and $(1-p)\lambda$.
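
The two facts can be checked empirically with a short simulation. The sketch below is an illustration only: the helper names, the intensities, the horizon, and the seed are assumptions, and it compares empirical event rates rather than proving that the resulting streams are Poisson.

```python
import random

def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process with the given rate on [0, horizon)."""
    times, t = [], rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)   # exponential gaps between events
    return times

def split(times, p, rng):
    """Pass each event to the first stream with probability p, else to the second."""
    first, second = [], []
    for t in times:
        (first if rng.random() < p else second).append(t)
    return first, second

rng = random.Random(0)               # fixed seed so the run is repeatable
lam, p, horizon = 5.0, 0.3, 2000.0   # hypothetical values

x = poisson_arrivals(lam, horizon, rng)
x1, x2 = split(x, p, rng)
rate_x1 = len(x1) / horizon          # expected near p * lam = 1.5
rate_x2 = len(x2) / horizon          # expected near (1 - p) * lam = 3.5

# Superposition: merging two independent streams adds their intensities.
y = poisson_arrivals(2.0, horizon, rng)
rate_merged = len(x1 + y) / horizon  # expected near 1.5 + 2.0 = 3.5
```

Splitting is what the scheduling procedure does to the incoming flow, and merging is what each node sees when several classes are routed to it.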

12 Flow of Demands & Scheduling Procedure

13 Sub-Model: Scheduling Procedure. The GRID operates in a stable environment. The routing of any demand at each moment depends only on the current state of the system. If the load $\rho_i < 1$ for all nodes, the system can operate in the stationary mode. The stationary mode is stable.
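
Why the load has to stay below 1 can be seen on a single node. The sketch below assumes one FIFO node, Poisson arrivals, and a fixed processing time per demand (a simplification chosen for illustration, not part of the slides), and tracks the queueing delay with the standard Lindley recursion.

```python
import random

def mean_wait(lam, service_time, n_demands, seed=0):
    """Average queueing delay at one FIFO node, via the Lindley recursion."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_demands):
        total += wait
        gap = rng.expovariate(lam)                   # time until the next arrival
        wait = max(0.0, wait + service_time - gap)   # delay seen by the next demand
    return total / n_demands

# Load = lam * service_time (hypothetical values).
stable = mean_wait(lam=1.0, service_time=0.5, n_demands=20000)      # load 0.5
overloaded = mean_wait(lam=1.0, service_time=1.2, n_demands=20000)  # load 1.2
```

With load 0.5 the average delay settles at a small value, while with load 1.2 the backlog keeps growing with the number of demands, so no stationary mode exists.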

14 Stationary Mode

15 Implications of Stationary Operation. Incoming demands of class j are routed to node i with stationary probability $s_{i,j}$. The load of node i has the form $\rho_i = \lambda \sum_j s_{i,j} \tau_{i,j} p_j < 1$.
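
As a small worked example of the load formula, the sketch below computes the load of each node as the arrival intensity times the sum over classes of $s_{i,j} \tau_{i,j} p_j$. The two-node, two-class system and all numbers are hypothetical.

```python
def node_loads(lam, p, s, tau):
    """Load rho_i = lam * sum_j s[i][j] * tau[i][j] * p[j] for every node i."""
    return [
        lam * sum(s_ij * t_ij * p_j for s_ij, t_ij, p_j in zip(s_i, tau_i, p))
        for s_i, tau_i in zip(s, tau)
    ]

# Hypothetical system: 2 nodes, 2 classes; each node is fast on one class.
p = [0.5, 0.5]        # class probabilities p_j
tau = [[1.0, 4.0],    # tau[i][j]: processing time of class j on node i
       [4.0, 1.0]]
s = [[0.5, 0.5],      # s[i][j]: probability that class j is routed to node i
     [0.5, 0.5]]

rho = node_loads(lam=0.5, p=p, s=s, tau=tau)   # [0.625, 0.625]
stationary = all(r < 1 for r in rho)           # True: every load is below 1
```

Each column of `s` sums to 1 because every demand of a class has to be routed to some node.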

16 Optimization Problem
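
The body of this slide did not survive in the transcript. Judging from the title slide (maximizing the system throughput) and the load formula of slide 15, a plausible reading of the problem is the following. The normalization and non-negativity constraints on $s_{i,j}$ are an assumption, added because the $s_{i,j}$ are routing probabilities.

```latex
\begin{aligned}
&\lambda \to \max, \\
&\rho_i = \lambda \sum_{j=1}^{M} s_{i,j}\,\tau_{i,j}\,p_j < 1, \quad i = 1,\dots,N, \\
&\sum_{i=1}^{N} s_{i,j} = 1, \quad j = 1,\dots,M, \\
&s_{i,j} \ge 0 .
\end{aligned}
```

In this form the unknowns $\lambda$ and $s_{i,j}$ appear as a product, so the problem is not yet linear.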

17 Linear Programming. It is possible to rewrite the constraints in the following form: $\rho'_i = \sum_j s_{i,j} \tau_{i,j} p_j$, $\rho'_i \le \rho'$, $\rho' \to \min$. Now it is an LP problem.
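
One way to read the rewritten constraints: the normalized load of node i is the sum over classes of $s_{i,j} \tau_{i,j} p_j$, that is, the load per unit of arrival intensity. The stationary condition then bounds the intensity by 1 divided by the largest normalized load, so minimizing that largest value maximizes throughput. The sketch below only evaluates this bound for a given routing matrix. It does not solve the LP, which would need a solver such as the GNU Linear Programming Kit listed in the references. The example system is hypothetical.

```python
def throughput_bound(p, s, tau):
    """Upper bound 1 / max_i rho'_i on the arrival intensity for routing s."""
    normalized = [
        sum(s_ij * t_ij * p_j for s_ij, t_ij, p_j in zip(s_i, tau_i, p))
        for s_i, tau_i in zip(s, tau)
    ]
    return 1.0 / max(normalized)

p = [0.5, 0.5]                            # class probabilities
tau = [[1.0, 4.0], [4.0, 1.0]]            # tau[i][j]: class j on node i
trivial = [[0.5, 0.5], [0.5, 0.5]]        # equal routing probabilities
specialized = [[1.0, 0.0], [0.0, 1.0]]    # hand-picked: each class to its fast node

bound_trivial = throughput_bound(p, trivial, tau)          # 1 / 1.25 = 0.8
bound_specialized = throughput_bound(p, specialized, tau)  # 1 / 0.5 = 2.0
```

In this tiny instance the hand-picked routing more than doubles the sustainable intensity compared with equal-probability routing.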

18 From Simplified to Real-World Model. How to handle non-discrete distributions of demands? How to handle errors in classification (imperfect information)? What about non-stationary modes? Short-term excesses are not fatal because of stability. Long-term changes in the distribution of demands can render the scheduling procedure non-optimal.

19 Approximating Actual Distribution of Demands with A Discrete Distribution

20 A Better Approximation
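
The figures for slides 19 and 20 are not in the transcript. As a rough one-dimensional illustration of the idea, the sketch below replaces a sample of demand complexities with M equally likely classes, each represented by its class mean, and shows that a finer partition gives a smaller approximation error. The equal-probability binning, the sample distribution, and the function name are all assumptions, not the presentation's method.

```python
import random

def discretize(samples, m):
    """Approximate samples by m equally likely classes.

    Returns (classes, mse): classes is a list of (p_j, representative value),
    mse is the mean squared distance from a sample to its class representative.
    Assumes len(samples) is a multiple of m.
    """
    ordered = sorted(samples)
    size = len(ordered) // m
    classes, sq_err = [], 0.0
    for j in range(m):
        group = ordered[j * size:(j + 1) * size]
        rep = sum(group) / size                 # class mean as representative
        classes.append((1.0 / m, rep))
        sq_err += sum((x - rep) ** 2 for x in group)
    return classes, sq_err / len(ordered)

rng = random.Random(1)
# Hypothetical complexities: 1 plus an exponential value with mean 1.
samples = [1.0 + rng.expovariate(1.0) for _ in range(2000)]

coarse, err_coarse = discretize(samples, 4)     # 4 classes
fine, err_fine = discretize(samples, 20)        # 20 classes, smaller error
```

Using class means keeps the overall average complexity unchanged, whatever the number of classes.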

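21 What Happens When $M \to \infty$? Simplified: $s$ is a matrix, $s: N \times M \to [0,1]$; $\tau: N \times M \to [0,\infty)$; $\rho_i = \lambda \sum_j s_{i,j} \tau_{i,j} p_j$. Marginal: $s$ is a function, $s_i: \mathbb{R}^M \to [0,1]$; $\tau$ is a multivariate random value (in $\mathbb{R}^M$); $\rho_i = \lambda\, \mathrm{E}[\tau_i s_i(\tau)]$.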
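
In the limiting case the sum over classes becomes an expectation, which can be estimated by sampling. The sketch below is a Monte Carlo estimate of the expected value of $\tau_i s_i(\tau)$ for each node; multiplying by the arrival intensity gives the load. It assumes that a demand is described by a vector with one processing time per node, and both routing functions and the distribution are hypothetical examples.

```python
import random

def estimate_normalized_loads(draw_tau, route, n_nodes, n_samples, seed=0):
    """Monte Carlo estimate of E[tau_i * s_i(tau)] for every node i."""
    rng = random.Random(seed)
    totals = [0.0] * n_nodes
    for _ in range(n_samples):
        tau = draw_tau(rng)     # processing times of one demand on each node
        probs = route(tau)      # routing probabilities s_i(tau), summing to 1
        for i in range(n_nodes):
            totals[i] += tau[i] * probs[i]
    return [t / n_samples for t in totals]

N = 3

def draw_tau(rng):
    # Assumed distribution: independent times 1 + Exp(1) on each node.
    return [1.0 + rng.expovariate(1.0) for _ in range(N)]

def uniform(tau):
    """Ignore the demand and pick any node with equal probability."""
    return [1.0 / len(tau)] * len(tau)

def fastest_node(tau):
    """Send the demand to the node with the smallest processing time."""
    best = tau.index(min(tau))
    return [1.0 if i == best else 0.0 for i in range(len(tau))]

loads_uniform = estimate_normalized_loads(draw_tau, uniform, N, 20000)
loads_fastest = estimate_normalized_loads(draw_tau, fastest_node, N, 20000)
```

Under these assumptions the estimates should come out near 2/3 per node for uniform routing and near 4/9 per node for routing to the fastest node.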

22 Handling Imperfect Information. Average values of $\tau_{i,j}$ can be used. The scheduling procedure should be iteratively re-evaluated as more information becomes available. In real-world applications, the exact distribution of demands is unknown, but it can be approximated from the history of the system operation.
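
A minimal sketch of the estimation step, assuming the system keeps a log of completed demands as (class, node, processing time) records. The record layout and function name are hypothetical. Class probabilities are estimated by frequencies and the processing times by per-pair averages.

```python
from collections import defaultdict

def estimate_from_history(history):
    """Estimate p_j and the average tau_{i,j} from (class, node, time) records."""
    class_counts = defaultdict(int)
    time_sums = defaultdict(float)
    pair_counts = defaultdict(int)
    for cls, node, elapsed in history:
        class_counts[cls] += 1
        time_sums[(node, cls)] += elapsed
        pair_counts[(node, cls)] += 1
    total = len(history)
    p = {cls: n / total for cls, n in class_counts.items()}
    tau = {key: time_sums[key] / pair_counts[key] for key in pair_counts}
    return p, tau

# Hypothetical log: (class label, node id, measured processing time).
history = [
    ("short", 0, 1.0), ("short", 0, 3.0), ("short", 1, 2.0),
    ("long", 1, 10.0),
]
p_hat, tau_hat = estimate_from_history(history)
```

Node and class pairs that never occur in the log get no estimate, so a scheduler that never routes a class to a node learns nothing about that pairing. Re-running the estimate as the log grows corresponds to the iterative re-evaluation mentioned on the slide.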

23 A Comparison. Let $\xi$ be an exponentially distributed random value with average 1, and let $\tau_{i,j} = 1 + \xi$. The trivial procedure distributes demands with equal probability to any node. The optimized procedure is obtained as shown.

24 Scheduling: Trivial vs. Optimized. [Figure: maximum throughput versus number of nodes, for the optimized and the trivial procedure.]
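
The plotted data is not recoverable from the transcript. As a hedged stand-in, the sketch below computes closed-form throughput bounds under an assumed continuous version of the comparison: every demand has an independent processing time of 1 plus an exponential value with mean 1 on each of n nodes. The "fastest node" rule is an illustrative substitute for the optimized procedure, not the LP solution from the slides. With equal-probability routing each node receives 1/n of the demands with mean time 2. With routing to the fastest node the expected time is 1 + 1/n (the mean of the minimum of n such times), shared evenly by symmetry.

```python
def trivial_bound(n):
    """Equal-probability routing: normalized load per node is 2 / n."""
    return n / 2.0

def fastest_node_bound(n):
    """Routing to the fastest node: normalized load per node is (1 + 1/n) / n."""
    return n / (1.0 + 1.0 / n)

# Throughput bounds for a few system sizes (values follow from the formulas).
for n in (1, 2, 4, 8, 16):
    print(n, trivial_bound(n), round(fastest_node_bound(n), 3))
```

Under these assumptions the trivial bound grows as n/2, while the fastest-node bound approaches n as the number of nodes increases.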

25 Conclusions. The exact upper bound of throughput for a given GRID can be estimated. A scheduling procedure that achieves this limit can be constructed from a solution of an LP problem. The optimal scheduling procedure should be non-deterministic. Trivial and deterministic schedulers are generally unlikely to achieve the theoretical limit.

26 References. L. Kleinrock, “Queueing Systems”, 1976. Andrei Dorokhov, “Simulation simple models and comparison with queueing theory”. http://csdl.computer.org/comp/proceedings/hpdc/2003/1965/00/19650034abs.htm. Atsuko Takefusa, Osamu Tatebe, Satoshi Matsuoka, Youhei Morita, “Performance Analysis of Scheduling and Replication Algorithms on Grid Datafarm Architecture for High-Energy Physics Applications”. GNU Linear Programming Kit, http://www.fsf.org

27 My Special Thanks To: Dr. V.A. Ilyin for directing my work in the field of GRID systems; Prof. A.N. Shiryaev for directing my work in the Theory of Probability.

