Mor Harchol-Balter Carnegie Mellon University Computer Science

Heavy Tails: Performance Models & Scheduling Disciplines. Part IV: Scheduling in Practice: The SYNC Project. Mor Harchol-Balter, Carnegie Mellon University, Computer Science.

Presentation Transcript


  1. Heavy Tails: Performance Models & Scheduling Disciplines. Part IV: Scheduling in Practice: The SYNC Project. Mor Harchol-Balter, Carnegie Mellon University, Computer Science.

  2. Q: Which minimizes mean response time: FCFS, PS, or SRPT? (“size” = service requirement; load $\rho < 1$.)

  3. Q: Which best represents scheduling in Web servers: FCFS, PS, or SRPT? (“size” = service requirement; load $\rho < 1$.)

  4. IDEA: Use SRPT instead of PS in Web servers. [Diagram: clients 1, 2, and 3 send requests (“Get File 1”, “Get File 2”, “Get File 3”) over the Internet to an Apache Web server running on the Linux O.S.; the requested files are returned.]

  5. Objections to SRPT: ✓ • Need to know “size” of request • Unfairness to requests for big files

  6. Outline of Talk. I (THEORY): Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]. II (THEORY): Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02], [Sigmetrics 03]. III (IMPLEMENT): Implementation of SRPT in Web servers [Transactions on Computer Systems 03], [ITC03]. Papers are joint with Adam Wierman & Bianca Schroeder.

  7. Outline of Talk. I (THEORY): Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]. II (THEORY): Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02], [Sigmetrics 03]. III (IMPLEMENT): Implementation of SRPT in Web servers [Transactions on Computer Systems 03], [ITC03]. www.cs.cmu.edu/~harchol/

  8. SRPT has a long history ... 1966: Schrage & Miller derive M/G/1/SRPT response time. 1968: Schrage proves optimality. 1979: Pechinkin & Solovyev & Yashkov generalize. 1990: Schassberger derives distribution on queue length. BUT WHAT DOES IT ALL MEAN?

  9. SRPT has a long history (cont.) 1990-97: 7-year-long study at Univ. of Aachen under Schreiber: SRPT WINS BIG ON MEAN! 1998, 1999: Slowdown for SRPT under adversary (Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc.): SRPT STARVES BIG JOBS! Various O.S. books (Silberschatz, Stallings, Tanenbaum) warn about starvation of big jobs ... Kleinrock’s Conservation Law: “Preferential treatment given to one class of customers is afforded at the expense of other customers.”

  10. Real-world job sizes are Heavy Tailed [Sigmetrics 96]. [Figure: log-log plot; x-axis: job size (x seconds).] Heavy-tailed Property: “Largest 1% of jobs comprise half the load.”
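
The heavy-tailed property can be checked numerically for a Bounded Pareto distribution. The sketch below is only an illustration: it borrows $\alpha = 1.1$ and the upper bound $10^{10}$ from the later slides, and the lower bound `K_MIN = 1.0` is an assumption, since the slides do not give one.

```python
ALPHA = 1.1    # tail index (taken from the later slides)
P_MAX = 1e10   # upper bound on job size (taken from the later slides)
K_MIN = 1.0    # lower bound on job size: an assumption, not given in the slides


def quantile(q, k=K_MIN, p=P_MAX, a=ALPHA):
    """Job size x with F(x) = q for a Bounded Pareto distribution on [k, p]."""
    return k * (1.0 - q * (1.0 - (k / p) ** a)) ** (-1.0 / a)


def load_above(x, k=K_MIN, p=P_MAX, a=ALPHA):
    """Fraction of the total load (work) contributed by jobs larger than x.

    The partial mean over [lo, hi] is proportional to lo**(1-a) - hi**(1-a),
    so the normalizing constant cancels in the ratio (valid for a != 1).
    """
    return (x ** (1 - a) - p ** (1 - a)) / (k ** (1 - a) - p ** (1 - a))


x99 = quantile(0.99)            # size of the 99th-percentile job
top1_share = load_above(x99)    # load carried by the largest 1% of jobs
print(f"99th-percentile size: {x99:.1f}; load share of largest 1%: {top1_share:.2f}")
```

Under these assumed parameters the largest 1% of jobs should carry more than half of the load, in line with the property quoted on the slide.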

  11. Unfairness Question. Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$). Question: Which queue does the biggest job prefer: M/G/1/PS or M/G/1/SRPT?

  12. Results on Unfairness. Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$). [Figure: PS versus SRPT.]

  13. Results on Unfairness. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$). [Figure: PS versus SRPT.]

  14. Unfairness – General Distribution. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.

  15. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. Proof: with $\rho(x) = \lambda \int_0^x t f(t)\,dt$ and $\bar F(x) = 1 - F(x)$, the claim $E[Wait(x)]^{SRPT} + E[Res(x)]^{SRPT} \le E[T(x)]^{PS}$ reads
      $$\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{2(1-\rho(x))^2} + \int_0^x \frac{dt}{1-\rho(t)} \le \frac{x}{1-\rho}.$$

  16. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. Proof cont. Need sufficient condition s.t.
      $$\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{2(1-\rho(x))^2} + \int_0^x \frac{dt}{1-\rho(t)} \le \frac{x}{1-\rho}.$$

  17. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. Proof cont. Need sufficient condition s.t.
      $$\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{2(1-\rho(x))^2} \le \frac{x}{1-\rho} - \int_0^x \frac{dt}{1-\rho(t)}.$$

  18. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. Proof cont. Need sufficient condition s.t.
      $$\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{2(1-\rho(x))^2} \le \frac{x}{1-\rho} - \int_0^x \frac{dt}{1-\rho(t)}.$$
      Observe:

  19. All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. Proof cont.
      $$\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{2(1-\rho(x))^2} \le \frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{1-\rho}.$$
      Suffices that: $2(1-\rho(x))^2 > 1-\rho$. Suffices that $\rho(x) < 1/2$.
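
One way to fill in the step between the inequality to be proved and the sufficient condition is sketched below. It is offered as a reconstruction, not as the argument shown on the slides, and it assumes the standard M/G/1 definitions $\rho(x) = \lambda \int_0^x t f(t)\,dt$, $\rho = \rho(\infty)$, and $\bar F(x) = 1 - F(x)$.

```latex
\begin{align*}
\frac{x}{1-\rho} - \int_0^x \frac{dt}{1-\rho(t)}
  &= \int_0^x \frac{\rho-\rho(t)}{(1-\rho)\,(1-\rho(t))}\,dt
   \;\ge\; \frac{1}{1-\rho}\int_0^x \bigl(\rho-\rho(t)\bigr)\,dt \\
  &= \frac{\lambda}{1-\rho}\int_0^x\!\!\int_t^\infty s f(s)\,ds\,dt
   = \frac{\lambda}{1-\rho}\left(\int_0^x s^2 f(s)\,ds + x\int_x^\infty s f(s)\,ds\right) \\
  &\ge \frac{\lambda\int_0^x t^2 f(t)\,dt + \lambda x^2 \bar F(x)}{1-\rho}.
\end{align*}
% Hence it suffices that the waiting-time term, whose denominator is
% 2(1-\rho(x))^2, is at most the same numerator over 1-\rho:
\[
2\,(1-\rho(x))^2 \;\ge\; 2\,(1-\rho(x))\,(1-\rho) \;>\; 1-\rho
\quad\text{whenever } \rho(x) \le \rho \text{ and } \rho(x) < \tfrac12 .
\]
```

The first inequality uses $1-\rho(t) \le 1$, and the last uses $\int_x^\infty s f(s)\,ds \ge x \bar F(x)$.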

  20. 1) All-can-win theorem: For all distributions, if $\rho < 1/2$, $E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$. 2) Under Bounded Pareto distribution ($\alpha = 1.1$) with $\rho < 0.96$, $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$. Intuition (*HT Property*): $E[Wait(x)]$: small; $E[Res(x)]$: small.
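
The theorem can be spot-checked numerically. The sketch below evaluates the standard M/G/1 expressions $E[T(x)]^{PS} = x/(1-\rho)$ and $E[T(x)]^{SRPT} = E[Wait(x)] + E[Res(x)]$ for a Bounded Pareto job-size distribution. The bounds `K` and `P` are assumptions chosen to keep the numerics simple (the slides use max $= 10^{10}$ and give no lower bound), and the residence-time integral is approximated with a trapezoid rule on a log-spaced grid.

```python
import math

ALPHA, K, P = 1.1, 1.0, 1e6   # K and P are hypothetical; slides use max = 1e10
NORM = ALPHA * K**ALPHA / (1.0 - (K / P) ** ALPHA)  # density: NORM * t**(-ALPHA-1)


def partial_moment(j, x):
    """Integral of t**j * f(t) over [K, x] for the Bounded Pareto density f."""
    x = min(max(x, K), P)
    return NORM * (x ** (j - ALPHA) - K ** (j - ALPHA)) / (j - ALPHA)


def tail(x):
    """P(X > x) for x in [K, P]."""
    x = min(max(x, K), P)
    return 1.0 - (1.0 - (K / x) ** ALPHA) / (1.0 - (K / P) ** ALPHA)


def response_ps(x, rho):
    """Mean response time of a size-x job under M/G/1/PS."""
    return x / (1.0 - rho)


def response_srpt(x, rho, steps=2000):
    """Mean response time of a size-x job (K <= x <= P) under M/G/1/SRPT."""
    lam = rho / partial_moment(1, P)          # arrival rate giving load rho

    def rho_up_to(t):                          # load from jobs of size <= t
        return lam * partial_moment(1, t)

    wait = lam * (partial_moment(2, x) + x * x * tail(x)) / (
        2.0 * (1.0 - rho_up_to(x)) ** 2
    )
    # Residence time: integral of dt / (1 - rho(t)) over [0, x].
    # rho(t) = 0 below K, so that part contributes exactly K.
    a, b = math.log(K), math.log(x)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        t = math.exp(a + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t / (1.0 - rho_up_to(t))   # dt = t du in log space
    return wait + K + total * h


for size in (1.0, 10.0, 1e3, 1e5, 1e6):
    print(size, response_srpt(size, 0.4), response_ps(size, 0.4))
```

At load 0.4, below the theorem's threshold of 1/2, every printed SRPT value should be no larger than the PS value next to it.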

  21. Outline of Talk. I (THEORY): Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]. II (THEORY): Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02], [Sigmetrics 03]. III (IMPLEMENT): Implementation of SRPT in Web servers [Transactions on Computer Systems 03], [ITC03]. www.cs.cmu.edu/~harchol/

  22. What is fair? Response time for job of size $x$: $T(x)$. Slowdown for job of size $x$: $S(x) = T(x)/x$. Under PS, slowdown is independent of size: PS does not bias towards any particular job size. Definition: A policy P is fair if $E[S(x)]^{P} \le E[S(x)]^{PS}$ for all $x$. Otherwise, P is unfair.
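
The claim that PS slowdown does not depend on job size follows from the standard M/G/1/PS mean response time; a short worked version, using $T(x)$ for response time and $S(x) = T(x)/x$ for slowdown:

```latex
\[
E[T(x)]^{PS} = \frac{x}{1-\rho}
\quad\Longrightarrow\quad
E[S(x)]^{PS} = \frac{E[T(x)]^{PS}}{x} = \frac{1}{1-\rho}.
\]
% Example: at load \rho = 0.9 every job, whatever its size, sees an
% expected slowdown of 1/(1 - 0.9) = 10 under PS.
```

So a policy P is fair in the sense above exactly when $E[S(x)]^{P} \le 1/(1-\rho)$ for every job size $x$.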

  23. Classification of Scheduling Policies. Always Fair: fair for all loads and distributions. Sometimes Unfair: fair for some loads, and unfair for other loads*. Always Unfair: unfair for all loads and distributions. (* and distributions)

  24. Testing your intuition. Always Fair: fair for all loads and distributions. Sometimes Unfair: fair for some loads, and unfair for other loads*. Always Unfair: unfair for all loads and distributions. (* and distributions) • Where does SRPT lie? __________ • What about other policies that prioritize based on remaining size? ____ • What about preemptive policies that prioritize based on age? _____ • What about preemptive policies that prioritize based on size? _____

  25. Classification of Scheduling Policies. [Diagram: the policies SJF, LJF, FCFS, PS, FB, FSP, PLCFS, PSJF, SRPT, and LRPT, grouped as non-preemptive, age-based, preemptive size-based, and preemptive remaining-size-based policies, placed among the classes Always Unfair, Sometimes Unfair, and Always FAIR.] Lots of open problems…

  26. Always Unfair (unfair for all loads and distributions). Theorem: Any preemptive, size-based policy, P, is Always Unfair. Case 1: A finite size, y, receives lowest priority. Case 2: No finite size receives lowest priority. (2a) Priorities decrease monotonically (PSJF). (2b) Priorities decrease non-monotonically.

  27. Always Unfair (unfair for all loads and distributions). Theorem: Any preemptive, size-based policy, P, is Always Unfair. Case 1: A finite size, y, receives lowest priority. [Figure: job of size y; V = work in system.]

  28. Always Unfair (unfair for all loads and distributions). Theorem: Any preemptive, size-based policy, P, is Always Unfair. Case 2a: Priorities decrease monotonically (PSJF). Infinite-sized job has lowest priority ... but that job is treated fairly?

  29. Always Unfair (unfair for all loads and distributions). Theorem: Any preemptive, size-based policy, P, is Always Unfair. Case 2a: Priorities decrease monotonically (PSJF). [Figure: $E[S(x)]$ versus $x$ for PS and PSJF.] Finding a hump shows PSJF is Always Unfair.

  30. Always Unfair (unfair for all loads and distributions). Theorem: Any preemptive, size-based policy, P, is Always Unfair. Case 2b: Priorities decrease non-monotonically. [Figure: $E[S(x)]$ versus $x$ for PS and PSJF.] Find y beyond which PSJF treats all jobs unfairly. Find x > y, where x has lower priority than y. => x is treated unfairly.

  31. The mysterious hump. [Figure: $E[S(x)]$ versus $x$ for PS and PSJF.] This hump appears in many common policies: • PSJF • FB • SRPT • SJF
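
The hump can be located numerically for PSJF. The sketch below uses the standard M/G/1/PSJF mean response time, $E[T(x)] = x/(1-\rho(x)) + \lambda \int_0^x t^2 f(t)\,dt \,/\, (2(1-\rho(x))^2)$, which is not written out on the slides, together with a Bounded Pareto job-size distribution whose bounds `K` and `P` are assumptions made for illustration.

```python
ALPHA, K, P = 1.1, 1.0, 1e6   # K and P are hypothetical illustration values
NORM = ALPHA * K**ALPHA / (1.0 - (K / P) ** ALPHA)  # density: NORM * t**(-ALPHA-1)


def partial_moment(j, x):
    """Integral of t**j * f(t) over [K, x] for the Bounded Pareto density f."""
    x = min(max(x, K), P)
    return NORM * (x ** (j - ALPHA) - K ** (j - ALPHA)) / (j - ALPHA)


def slowdown_ps(rho):
    """Expected slowdown under PS; the same for every job size."""
    return 1.0 / (1.0 - rho)


def slowdown_psjf(x, rho):
    """Expected slowdown of a size-x job (K <= x <= P) under M/G/1/PSJF."""
    lam = rho / partial_moment(1, P)            # arrival rate giving load rho
    rho_x = lam * partial_moment(1, x)          # load from jobs of size <= x
    residence = x / (1.0 - rho_x)
    waiting = lam * partial_moment(2, x) / (2.0 * (1.0 - rho_x) ** 2)
    return (residence + waiting) / x


rho = 0.5
for exponent in range(0, 7):
    size = 10.0 ** exponent
    verdict = "unfair" if slowdown_psjf(size, rho) > slowdown_ps(rho) else "fair"
    print(f"x = 1e{exponent}: E[S(x)] PSJF = {slowdown_psjf(size, rho):.3f} ({verdict})")
```

Job sizes where the PSJF slowdown exceeds $1/(1-\rho)$ are the ones treated unfairly under the definition used in this talk; for a bounded distribution the largest job size is always among them, because its waiting-time term is strictly positive.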

  32. Outline of Talk. I (THEORY): Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]. II (THEORY): Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02], [Sigmetrics 03]. III (IMPLEMENT): Implementation of SRPT in Web servers [Transactions on Computer Systems 03], [ITC03]. www.cs.cmu.edu/~harchol/

  33. From theory to practice: What does SRPT mean within a Web server? • Many devices: Where to do the scheduling? • Many jobs at once.

  34. Server’s Performance Bottleneck. [Diagram: clients 1, 2, and 3 (“Get File 1”, “Get File 2”, “Get File 3”) reach the Web server (Apache on Linux O.S.) through the rest of the Internet and the site’s ISP.] Site buys limited fraction of ISP’s bandwidth. We schedule bandwidth at server’s uplink.

  35. Network/O.S. insides of traditional Web server. [Diagram: inside the Web server, Sockets 1, 2, and 3 (one each for Client1, Client2, and Client3) feed the Network Card, which is the BOTTLENECK.] Sockets take turns draining --- FAIR = PS.

  36. Network/O.S. insides of our improved Web server. [Diagram: inside the Web server, Sockets 1, 2, and 3 (one each for Client1, Client2, and Client3) feed priority queues S, M, and L, served 1st, 2nd, and 3rd, in front of the Network Card (BOTTLENECK).] Socket corresponding to file with smallest remaining data gets to feed first.
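
The difference between the two socket-draining disciplines can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the Linux/Apache implementation from the talk: all responses are queued at time 0, the uplink sends one unit of data per time step, and the file sizes are made up.

```python
import heapq
from collections import deque


def drain_round_robin(sizes, chunk=1):
    """Sockets take turns sending `chunk` units (the FAIR, PS-like discipline).

    Returns the finish time of each socket, indexed like `sizes`.
    """
    queue = deque(enumerate(sizes))
    finish = [0] * len(sizes)
    clock = 0
    while queue:
        index, remaining = queue.popleft()
        sent = min(chunk, remaining)
        clock += sent
        remaining -= sent
        if remaining > 0:
            queue.append((index, remaining))   # back of the line
        else:
            finish[index] = clock
    return finish


def drain_srpt(sizes, chunk=1):
    """The socket with the smallest remaining data always feeds first."""
    heap = [(size, index) for index, size in enumerate(sizes)]
    heapq.heapify(heap)
    finish = [0] * len(sizes)
    clock = 0
    while heap:
        remaining, index = heapq.heappop(heap)
        sent = min(chunk, remaining)
        clock += sent
        remaining -= sent
        if remaining > 0:
            heapq.heappush(heap, (remaining, index))
        else:
            finish[index] = clock
    return finish


file_sizes = [1, 2, 1000]                      # hypothetical response sizes
fair_finish = drain_round_robin(file_sizes)
srpt_finish = drain_srpt(file_sizes)
print("FAIR:", fair_finish, "mean", sum(fair_finish) / len(fair_finish))
print("SRPT:", srpt_finish, "mean", sum(srpt_finish) / len(srpt_finish))
```

In this toy case the largest response finishes at the same time under both disciplines, while the medium one finishes earlier under SRPT, so the mean response time drops.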

  37. Experimental Setup. [Diagram: three Linux client machines, each emulating clients 1 to 200 behind a WAN emulator (WAN EMU), connected through a switch to the Apache Web server on Linux O.S.] Implementation of SRPT-based scheduling: 1) Modifications to Linux O.S.: 6 priority levels. 2) Modifications to Apache Web server. 3) Priority algorithm design.
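
With only six priority levels available, exact SRPT has to be approximated by mapping each response's remaining size to a priority band. The sketch below shows one way such a mapping could look; the cutoff values and the function name `priority_band` are hypothetical, since the slides do not give the cutoffs or the priority algorithm itself.

```python
import bisect

# Hypothetical cutoffs in bytes (five cutoffs give six bands); the slides do
# not state the values actually used.
CUTOFFS = [1_000, 5_000, 20_000, 100_000, 500_000]


def priority_band(remaining_bytes, cutoffs=CUTOFFS):
    """Map remaining response size to a band: 0 is served first, 5 last."""
    return bisect.bisect_left(cutoffs, remaining_bytes)


# A large response starts in the lowest-priority band and is promoted as it
# drains, which is how a banded scheme approximates "smallest remaining first".
for remaining in (2_000_000, 400_000, 50_000, 10_000, 3_000, 41):
    print(remaining, "->", priority_band(remaining))
```

Re-evaluating the band as data is sent matters: a scheme keyed on original size only would behave like a size-based policy, not a remaining-size-based one.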

  38. Experimental Setup: variations. Server: Flash, Apache. Uplink: 10Mbps, 100Mbps. Workload: Surge, trace-based. Arrivals: open system, partly-open. Network: WAN EMU, geographically-dispersed clients. Load: load < 1, transient overload. Trace-based workload: number of requests made: 1,000,000; size of file requested: 41B -- 2 MB; distribution of file sizes requested has HT property. + Other effects: initial RTO; user abort/reload; persistent connections, etc.

  39. Results: Mean Response Time. [Figure: mean response time (sec) versus load, for FAIR and SRPT.]

  40. Mean Response Time vs. Size Percentile, Load = 0.8. [Figure: mean response time (ms) versus percentile of request size, for FAIR and SRPT.]

  41. Transient Overload -- Mean response time. [Figure: FAIR versus SRPT.]

  42. Transient overload: Response time as function of job size. [Figure: FAIR versus SRPT.] Small jobs win big! Big jobs aren’t hurt!

  43. New project: Scheduling dynamic web requests. [Diagram: a high-priority client (“$$$buy$$$”) and low-priority clients (“buy”) send requests over the Internet to a Web server (e.g., Apache/Linux) backed by a database with locks, CPU(s), and disks.] Need to schedule the database ...

  44. Conclusion. • Misconceptions about unfairness. • Discrimination against high-performance scheduling policies. • Classifying policies with respect to unfairness is counter-intuitive. • Good news: Many high-performing policies are also fair in practice!
