
RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor

Paul Regnier†, George Lima†, Ernesto Massa†, Greg Levin‡, Scott Brandt‡ (†Federal University of Bahia, Brazil; ‡University of California, Santa Cruz)



Presentation Transcript


  1. RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor. Paul Regnier†, George Lima†, Ernesto Massa†, Greg Levin‡, Scott Brandt‡ (†Federal University of Bahia, Brazil; ‡University of California, Santa Cruz)

  2. Multiprocessors • Most high-end computers today have multiple processors • In a busy computational environment, multiple processes compete for processor time • More processors means more scheduling complexity

  3. Real-Time Multiprocessor Scheduling • Real-time tasks have workload deadlines • Hard real-time = “Meet all deadlines!” • Problem: Schedule a set of periodic, hard real-time tasks on a multiprocessor system so that all deadlines are met.

  4. Example: EDF on One Processor • On a single processor, Earliest Deadline First (EDF) is optimal (it can schedule any feasible task set) • Task 1: 2 units of work for every 10 units of time • Task 2: 6 units of work for every 15 units of time • Task 3: 10 units of work for every 25 units of time [Figure: EDF schedule of the three tasks on one CPU, time 0–30]
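
The slide's claim can be checked with a small discrete-time simulation (a sketch of the idea, not the presenters' code; the helper name `edf_unit_sim` and the unit-step model are my assumptions): jobs are released at period boundaries, and at every time unit the pending job with the earliest deadline runs.

```python
from math import lcm

def edf_unit_sim(tasks, horizon):
    """Unit-step EDF simulation. tasks: list of (period, workload) pairs
    with implicit deadlines at the end of each period. Returns True if
    no deadline is missed before `horizon`."""
    jobs = []  # each pending job is [deadline, remaining work]
    for t in range(horizon):
        for p, e in tasks:
            if t % p == 0:
                jobs.append([t + p, e])     # release a new job
        if any(d <= t for d, r in jobs):
            return False                    # a pending job passed its deadline
        if jobs:
            jobs.sort()                     # earliest deadline first
            jobs[0][1] -= 1                 # run it for one time unit
            jobs = [j for j in jobs if j[1] > 0]
    return not any(d <= horizon for d, r in jobs)
```

Running the slide's task set over one hyperperiod (LCM of the periods, 150) meets every deadline; the set has total utilization exactly 1, so EDF keeps the processor fully busy.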

  5. Scheduling Three Tasks • Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units [Figure: job releases and deadlines for Tasks 1–3, time 0–3]

  6. Global Schedule • Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units • Task 1 migrates between processors [Figure: the two-processor schedule, time 0–3]

  7. Taxonomy of Multiprocessor Scheduling Algorithms [Diagram: uniprocessor algorithms (EDF, LLF); partitioned algorithms (Partitioned EDF); globalized uniprocessor algorithms (Global EDF); optimal multiprocessor algorithms (pfair, LLREF, EKG, DP-Wrap)]

  8. Problem Model • n tasks running on m processors • A periodic task T = (p, e) requires a workload e to be completed within each period of length p • T's utilization u = e / p is the fraction of each period that the task must execute [Figure: job releases at times 0, p, 2p, 3p, with workload e due within each period p]

  9. Assumptions • Processor Identity: All processors are equivalent • Task Independence: Tasks are independent • Task Unity: Tasks run on one processor at a time • Task Migration: Tasks may run on different processors at different times • No Overhead: free context switches and migrations (in practice, built into WCET estimates)

  10. The Big Goal (Version 1) • Design an optimal scheduling algorithm for periodic task sets on multiprocessors • A task set is feasible if there exists a schedule that meets all deadlines • A scheduling algorithm is optimal if it can always schedule any feasible task set

  11. Necessary and Sufficient Conditions • Any set of tasks needing at most one processor for each task (ui ≤ 1 for all i) and at most m processors for all tasks (Σ ui ≤ m) is feasible • Status: Solved. pfair (1996) was the first optimal algorithm
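
The condition on this slide translates directly into code; a minimal sketch using exact rational arithmetic (the function name and the (period, workload) task representation are mine, not from the talk):

```python
from fractions import Fraction

def feasible(tasks, m):
    """Feasibility test for periodic tasks on m identical processors:
    no task needs more than one processor (u_i <= 1) and the total
    utilization fits within m processors (sum of u_i <= m)."""
    utils = [Fraction(e, p) for p, e in tasks]  # u = e / p
    return all(u <= 1 for u in utils) and sum(utils) <= m
```

Using `Fraction` avoids float round-off at the boundary cases (e.g. total utilization exactly m), which is precisely where optimal algorithms like pfair operate.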

  12. The Big Goal (Version 2) • Design an optimal scheduling algorithm with fewer context switches and migrations (finding a feasible schedule with the fewest migrations is NP-complete)

  13. The Big Goal (Version 2) • Design an optimal scheduling algorithm with fewer context switches and migrations • Status: Ongoing… All existing improvements over pfair use some form of deadline partitioning…

  14. Deadline Partitioning [Figure: time divided into windows at the deadlines of Tasks 1–4]

  15. Deadline Partitioning [Figure: each window's work allocated across CPUs 1 and 2]

  16. The Big Goal (Version 2) • Design an optimal scheduling algorithm with fewer context switches and migrations • Status: Ongoing… All existing improvements over pfair use some form of deadline partitioning… but all the activity in each time window still leads to a large number of preemptions and migrations • Our Contribution: The first optimal algorithm that does not depend on deadline partitioning, and which therefore has much lower overhead

  17. The RUN Scheduling Algorithm • RUN uses a sequence of reduction operations to transform a multiprocessor problem into a collection of uniprocessor problems • Uniprocessor scheduling is solved optimally by Earliest Deadline First (EDF) • Uniprocessor schedules are transformed back into a single multiprocessor schedule • A reduction operation is composed of two steps: pack and dual

  18. The Packing Operation • A collection of tasks with total utilization at most one can be packed into a single fixed-utilization task • Example: Task 1 (u = 0.1), Task 2 (u = 0.3), and Task 3 (u = 0.4) pack into a single task with u = 0.8
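
The packing step is a bin-packing problem; here is a minimal sketch using first-fit into unit-capacity bins (one possible heuristic; the talk does not prescribe a particular packing strategy, and the function name is mine):

```python
from fractions import Fraction

def pack(utils):
    """First-fit packing: group task utilizations into packed tasks,
    each with total utilization at most one."""
    bins = []
    for u in utils:
        for b in bins:                 # try an existing packed task first
            if sum(b) + u <= 1:
                b.append(u)
                break
        else:
            bins.append([u])           # open a new packed task
    return bins
```

On the slide's example, the three utilizations fit into a single packed task of utilization 0.8.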

  19. Scheduling a Packed Task’s Clients • We divide time at the packed task’s clients’ deadlines and schedule the clients with EDF • In each segment, the packed task consumes time according to its total utilization • Example: Task 1 (p=5, u=.1), Task 2 (p=3, u=.3), Task 3 (p=2, u=.4) [Figure: EDF schedule of the three client tasks]

  20. Defining a Packed Task • The packed task temporarily replaces (acts as a proxy for) its clients • It may not be periodic, but it has a fixed utilization in each segment • Example: Task 1 (p=5, u=.1), Task 2 (p=3, u=.3), Task 3 (p=2, u=.4) [Figure: EDF schedule of the clients and the resulting packed task]

  21. The Dual of a Task • The dual of a task T is a task T* with the same deadlines, but complementary utilization/workload • Example: T: u = 0.4, p = 3 has dual T*: u = 0.6, p = 3

  22. The Dual of a System • Given a system of n tasks and m processors, assume full utilization (Σ ui = m) • The dual system consists of n dual tasks running on n − m processors • Note that the dual utilizations sum to Σ(1 − ui) = n − Σ ui = n − m • If n < 2m, the dual system has fewer processors • The dual task T* represents the idle time of T: T* executes in the dual system precisely when T is idle, and vice versa
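
The identity on this slide is easy to check numerically; a sketch with illustrative names of my own choosing:

```python
from fractions import Fraction

def dual_system(utils, m):
    """Dual of a fully utilized system: n dual tasks with utilizations
    1 - u_i, running on n - m processors."""
    assert sum(utils) == m, "RUN assumes full utilization (sum of u_i == m)"
    return [1 - u for u in utils], len(utils) - m
```

For instance, three packed tasks with utilizations .8, .7, .5 on 2 processors dualize to tasks of utilization .2, .3, .5 on a single processor, and the dual utilizations sum to n − m = 1.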

  23. The Dual of a System [Figure: Tasks 1–3 and their workloads]

  24. The Dual of a System [Figure: the dual system’s schedule for Tasks 1–3, time 0–3]

  25. The Dual of a System • Because each dual task represents the idle time of its primal task, scheduling the dual system is equivalent to scheduling the original system [Figure: original and dual schedules side by side, time 0–3]

  26. The RUN Algorithm • Reduction = Packing + Dual • Keep packing until the remaining tasks satisfy ui + uj > 1 for all tasks Ti, Tj • From the schedule of the dual system, recover the schedule for the “primal” system • Replace packed tasks in the schedule with their clients, ordered by EDF
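
The off-line part of this loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: first-fit packing in the given order is one possible heuristic, the function name is mine, full utilization is assumed, and the sketch omits RUN's special handling of packed tasks that reach utilization exactly 1 (those get a dedicated processor).

```python
from fractions import Fraction

def reduce_to_uniprocessor(utils, m):
    """Alternately pack and dual until the system fits on one processor.
    Returns (number of reductions, final uniprocessor utilizations)."""
    assert sum(utils) == m, "assumes full utilization"
    reductions = 0
    while m > 1:
        bins = []                           # pack: groups with utilization <= 1
        for u in utils:
            for b in bins:
                if sum(b) + u <= 1:
                    b.append(u)
                    break
            else:
                bins.append([u])
        packed = [sum(b) for b in bins]
        utils = [1 - u for u in packed]     # dual: n packed tasks ...
        m = len(packed) - m                 # ... on n - m processors
        reductions += 1
    return reductions, utils
```

On the upcoming five-task example (u = .2, .6, .3, .4, .5 on 2 processors), this packs to {.8, .7, .5} and duals to a one-processor system in a single reduction.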

  27. RUN: A Simple Example • Step 1: Pack tasks until all pairs have utilization > 1 • Task 1 (u=.2) and Task 2 (u=.6) pack into Task 12 (u=.8); Task 3 (u=.3), Task 4 (u=.4), and Task 5 (u=.5) remain

  28. RUN: A Simple Example • Step 1 (continued): Task 3 (u=.3) and Task 4 (u=.4) pack into Task 34 (u=.7), leaving Task 12 (u=.8), Task 34 (u=.7), and Task 5 (u=.5)

  29. RUN: A Simple Example • Step 2: Find the dual system on n − m = 3 − 2 = 1 processor • Packed tasks: Task 12 (u=.8), Task 34 (u=.7), Task 5 (u=.5)

  30. RUN: A Simple Example • Step 2 (continued): the duals are Task 12* (u=.2), Task 34* (u=.3), and Task 5* (u=.5)

  31. RUN: A Simple Example • Step 3: Schedule the uniprocessor dual system with EDF • Dual tasks: Task 12* (u=.2), Task 34* (u=.3), Task 5* (u=.5) [Figure: EDF schedule on the dual CPU, time 0–6]

  32. RUN: A Simple Example • Step 4: Schedule packed tasks from the dual schedule [Figure: dual CPU schedule, time 0–6]

  33. RUN: A Simple Example • Step 4 (continued): each packed task runs on CPU 1 or CPU 2 whenever its dual is idle [Figure: CPU 1, CPU 2, and dual CPU schedules, time 0–6]

  34. RUN: A Simple Example • Step 5: Packed tasks schedule their clients with EDF [Figure: CPU 1, CPU 2, and dual CPU schedules, time 0–6]

  35. RUN: A Simple Example • The original task set has been successfully scheduled! [Figure: Tasks 1–5 (u = .2, .6, .3, .4, .5) placed on CPUs 1 and 2, time 0–6]

  36. RUN: A Few Details • Several reductions may be necessary to produce a uniprocessor system • Each reduction cuts the number of processors at least in half • On random task sets with 100 processors and hundreds of tasks, fewer than 1 in 600 task sets required more than 2 reductions • RUN requires Σ ui = m; when this is not the case, dummy tasks may be added during the first packing step to create a partially or entirely partitioned system

  37. Proven Theoretical Performance • RUN is optimal • Reduction is done off-line, prior to execution, and takes O(n log n) time • Each scheduler invocation is O(n), with total scheduling overhead O(jn log m) when j jobs are scheduled • RUN suffers at most (3r + 1)/2 preemptions per job on task sets requiring r reductions; since r ≤ 2 for most task sets, this gives a theoretical upper bound of 4 preemptions per job • In practice, we never observed more than 3 preemptions per job on our randomly generated task sets
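
Reading the (3r + 1)/2 bound with a ceiling (preemption counts are integers) reproduces the slide's arithmetic; a worked check with an assumed helper name:

```python
import math

def preemption_bound(r):
    """Upper bound on preemptions per job for a task set needing r reductions,
    per the (3r + 1)/2 bound, rounded up to a whole preemption count."""
    return math.ceil((3 * r + 1) / 2)
```

For r = 2, (3·2 + 1)/2 = 3.5, which rounds up to the slide's stated bound of 4 preemptions per job.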

  38. Simulation and Comparison • We evaluated RUN in side-by-side simulations with the existing optimal algorithms LLREF, DP-Wrap, and EKG • Each data point is the average of 1000 task sets, generated uniformly at random with utilizations in the range [0.1, 0.99] and integer periods in the range [5, 100] • Simulations were run for 1000 time units each • Values shown are average migrations and preemptions per job

  39. Comparison: Varying Processors • Number of processors varies from m = 2 to 32, with 2m tasks and 100% utilization

  40. Comparison: Varying Utilization • Total utilization varies from 55% to 100%, with 24 tasks running on 16 processors

  41. Ongoing Work • Heuristic improvements to RUN • Extend RUN to broader problems: sporadic arrivals, arbitrary deadlines, non-identical multiprocessors, etc. • Develop general theory and new examples of non-deadline-partitioning algorithms

  42. Summary: The RUN Algorithm • is the first optimal algorithm without deadline partitioning • outperforms existing optimal algorithms by a factor of 5 on large systems • scales well to larger systems • reduces gracefully to the very efficient Partitioned EDF on any task set that Partitioned EDF can schedule

  43. Thanks for Listening Questions?
