1 / 27

Application of Methods of Queuing Theory to Scheduling in GRID

Explore the use of queuing theory in scheduling for GRID systems, highlighting its relevance and benefits. Simplified models, mathematical concepts, and optimization problems are discussed.

Download Presentation

Application of Methods of Queuing Theory to Scheduling in GRID

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Application of Methods of Queuing Theory to Scheduling in GRID A Queuing Theory-based mathematical model is presented, and an explicit form of the optimal control procedure obtained as the solution to the problem of maximizing the system throughput.

  2. Why Queuing Theory? • Indeed, there are queues in real GRIDs • The services GRIDs offer to end users much resemble the services offered by telephone networks, the typical subject of study in Queuing Theory • The complexity of the associated processes leaves little options but to use the probabilistic techniques

  3. Complexity: The Principal Limiting Factor to Modeling • GRIDs are very complicated systems themselves • GRIDs are composed of smaller complicated systems • Computer hardware • Networks • Software • GRIDs are embedded into the larger complicated systems: • Scientific organizations • R&D activities • Globalization processes

  4. Stopping Decomposition as Soon as Possible to Avoid Unnecessary Complexity • Demarcate the phenomena specific to scheduling in GRID, and the generic phenomena • Model complicated behavior of the components with probabilistic techniques • Find the most general expression of the effects

  5. Ultimate Stopper of Decomposition No entity in the modeled system should be decomposed, if the system persists when that entity is replaced with another similar one.

  6. Implications • There is no need to develop detailed models of computers, networks, software or interaction external to GRID • There is no need to model the intra-GRID interaction, which does not directly affect scheduling • Information about how long it will take to process a demand on each node is all we need to know about the demand.

  7. Mathematical Concepts Involved • Probability • Poisson Process • Multivariate Distribution • Linear Programming • Convergence “By Law”

  8. Simplified Model: • There is a finite number of classes of demands (all demands from the same class have equal complexity) • Sub-Model of Structure: • Set of N nodes with queues • Sub-Model of Flow of Demands • Poisson process of arrivals with intensity  • M classes of demands • Sub-Model of Scheduling Procedure • Recognizes distinct classes of demands and routes the demands to the nodes it chooses

  9. Sub-Model: Structure

  10. Sub-Model: Flow of Demands • Demands from class j arrive with intensity j= pj (1 +…+m= ) • Upon arrival, a demand from class j is routed to node i with probability si,j • A demand from class j requires i,junits of processing time, if routed to node i • The computing time is “incompressible”: processing two demands with complexities T1 and T2 at a particular node requires T1+T2 time units independently of the order (or level of parallelism) in which they are processed

  11. Two Important Facts About Poisson Processes • Let X1 and X2 be independent Poisson processes with intensity 1 and 2.Then X1+ X2 is a Poisson process with intensity 1+2. • Suppose a Poisson process X with intensity  is split into X1 and X2. With probability p events are passed to X1 and otherwise to X2. Then X1 and X2 are Poisson processes with intensities p and (1-p).

  12. Flow of Demands & Scheduling Procedure

  13. Sub-Model: Scheduling Procedure • The GRID operates in a stable environment • Routing of any demand in each moment depends on the current state of the system only • For all nodes load i<1  • The system can operate in the stationary mode • The stationary mode is stable

  14. Stationary Mode

  15. Implications of Stationary Operation • Incoming demands of class j are routed to node i with stationary probability si,j • Load of node i has the form i =   si,j i,j pj < 1

  16. Optimization Problem

  17. Linear Programming • It is possible to rewrite the constraints in the folowing form: • ’i =  si,j i,j pj • ’i  ’ • ’min • Now it is an LP problem

  18. From Simplified to Real-World Model • How to handle non-discrete distributions of demands? • How to handle errors in classification (imperfect information)? • What about non-stationary modes? • Short-term excesses are not fatal because of stability • Long-term changes in distribution of demands can render the S.P. non-optimal

  19. Approximating Actual Distribution of Demands with A Discrete Distribution

  20. A Better Approximation

  21. Simplified s is a matrix s: NxM[0,1] : NxM[0,) i =   si,j i,j pj Marginal s is a function si: RM[0,1] : multivariate random value (in RM ) i = E isi() What Happens When M?

  22. Handling Imperfect Information • Average values of i,j can be used • The scheduling procedure should be iteratively re-evaluated when more information becomes available • In the real world applications, the exact distribution of demands is unknown, but can be approximated from the history of the system operation

  23. A Comparison • Let  be an exponentially distributed random value with average 1 • i,j =1+ • Trivial procedure distributes demands with equal probability to any node • An optimized procedure is obtained as shown

  24. Scheduling: Trivial vs. Optimized Maximum Throughput Optimized Trivial Num. of Nodes

  25. Conclusions • The exact upper bound of throughput for a given GRID can be estimated • A scheduling procedure which achieves this limit can be constructed from a solution of an LP problem • The optimal scheduling procedure should be non-deterministic • Trivial and deterministic schedulers are generally unlikely to achieve the theoretical limit

  26. References • L. Kleinrock, “Queueing Systems”, 1976 • Andrei Dorokhov, “Simulation simple models and comparison with queueing theory”http://csdl.computer.org/comp/proceedings/hpdc/2003/1965/00/19650034abs.htm • Atsuko Takefusa, Osamu Tatebe, Satoshi Matsuoka, Youhei Morita, “Performance Analysis of Scheduling and Replication Algorithms on Grid Datafarm Architecture for High-Energy Physics Applications” • GNU Linear Programming Kit, http://www.fsf.org

  27. My Special Thanks To: • Dr. V.A. Ilyin for directing my work in the field of GRID systems • Prof. A.N. Shiryaev for directing my work in the Theory of Probability

More Related