THE RESEARCH PROCESS

OPEN VERSUS CLOSED: A CAUTIONARY TALE. Bianca Schroeder, Adam Wierman, Mor Harchol-Balter. Computer Science Department, Carnegie Mellon University. To appear at NSDI 2006. Presenter: 吳泰廷.


Presentation Transcript


  1. OPEN VERSUS CLOSED: A CAUTIONARY TALE. Bianca Schroeder, Adam Wierman, Mor Harchol-Balter. Computer Science Department, Carnegie Mellon University. To appear at NSDI 2006. Presenter: 吳泰廷

  2. THE RESEARCH PROCESS: "The new system has a smaller response time!" This comparison of the standard (old) system with the new system requires testing the two systems on realistic workloads.

  3. INTRODUCTION • We need system models that "accurately represent" the real system. • Representing a system accurately involves many things: bottleneck resource behavior, the scheduling of requests at that bottleneck, and workload parameters such as the distribution of service request demands. • One factor that researchers typically pay little attention to is whether the job arrivals obey a closed or an open system model.

  4. • We show that closed and open system models yield significantly different results, even when both models are run with the same load and service demands. • We conclude with guidelines for choosing a system model.

  5. MANY WAYS TO GENERATE REALISTIC WORKLOADS: CLOSED SYSTEM MODEL. Each of N users repeats a think, send, receive cycle against the server: the user requests a web page, receives the page, reads the page, and clicks on a new link. N = MPL (multiprogramming level).

  6. MANY WAYS TO GENERATE REALISTIC WORKLOADS: OPEN SYSTEM MODEL (trace driven). New arrivals enter the server queue; the next arrival time is read from the trace, and the service demands are the file sizes from the trace.
     Trace (arrival time, client, request):
       1:01.12 ip1 GET a.gif HTTP/1.0
       1:01.20 ip2 GET b.htm HTTP/1.0
       1:01.25 ip1 GET c.jpg HTTP/1.0
       1:01.27 ip1 GET d.txt HTTP/1.0
       1:01.28 ip3 GET a.htm HTTP/1.0
       1:01.35 ip4 GET d.gif HTTP/1.0
       1:01.45 ip2 GET e.htm HTTP/1.0
       : :

  7. MANY WAYS TO GENERATE REALISTIC WORKLOADS: OPEN SYSTEM MODEL (distribution driven). Use distributions of interarrival times and service demands (typically fitted using trace info): each new arrival at the server is generated by sampling the interarrival time distribution and the service demand distribution.
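
A distribution-driven open workload, as described on this slide, could be sketched roughly as follows. This is only an illustration: the function name `open_arrivals` and the choice of exponential distributions for both interarrival times and service demands are assumptions made here, not something the slides specify. A real study would fit both distributions to trace data.

```python
import random

def open_arrivals(rate, mean_demand, n, seed=0):
    """Return n (arrival_time, service_demand) pairs for an open system.

    Assumption for illustration: both interarrival times and service
    demands are exponential. In practice both would be fitted to a trace.
    """
    rng = random.Random(seed)
    t = 0.0
    jobs = []
    for _ in range(n):
        # The next arrival is scheduled without looking at completions,
        # which is what makes this an open system.
        t += rng.expovariate(rate)
        demand = rng.expovariate(1.0 / mean_demand)
        jobs.append((t, demand))
    return jobs

jobs = open_arrivals(rate=2.0, mean_demand=0.25, n=1000)
```

With these hypothetical parameters the offered load is the arrival rate times the mean demand, 2.0 × 0.25 = 0.5.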

  8. OPEN MODEL: Arrivals are independent of completions. There is no maximum number of simultaneous users. CLOSED MODEL: Arrivals are completely dependent on departures. There is a fixed population of users, whose size is called the multiprogramming level (MPL).

  9. WEB WORKLOAD GENERATORS: Do you use an open or closed model? • Workload generators for the same purpose use different system models! • It's often not clear which model workload generators use! Generators shown (grouped under OPEN MODEL and CLOSED MODEL in the original figure): Surge, SPECWeb, TPC-W, Sclient, RUBiS, WebBench, Webjamma.

  10. NEITHER THE OPEN NOR THE CLOSED MODEL IS COMPLETELY REALISTIC

  11. PARTLY-OPEN SYSTEM: New arrivals enter the system and send a request to the server. After receiving the response, the user returns to the system (after a think time) with probability q, and otherwise leaves the system.
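
To make the role of q concrete, here is a small sketch of how many requests one visit generates under the partly-open model. It assumes that the return decision is made independently with the same probability q after every request, which makes the count geometric with mean 1/(1 - q). The function names are hypothetical.

```python
import random

def requests_per_visit(q, rng):
    """Count the requests in one visit: after each request the user
    returns with probability q and leaves otherwise."""
    count = 1
    while rng.random() < q:
        count += 1
    return count

def mean_requests_per_visit(q):
    """Closed-form mean of the geometric distribution sampled above."""
    return 1.0 / (1.0 - q)

rng = random.Random(1)
q = 0.8  # hypothetical value chosen for illustration
samples = [requests_per_visit(q, rng) for _ in range(100_000)]
empirical_mean = sum(samples) / len(samples)
print(empirical_mean, mean_requests_per_visit(q))
```

As q approaches 0 each visit makes a single request, and as q approaches 1 users almost never leave.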

  12. OUR GOAL What is the impact of the choice of an open or closed model?

  13. HOW DO WE COMPARE OPEN AND CLOSED SYSTEMS? • Fix the service distribution across the systems • Fix the load across the systems. OPEN: adjust the load using the arrival rate; load depends only on the mean arrival rate and the mean service demand. CLOSED: adjust the load using the think time; load depends on the MPL, think times, mean of service demands, variability of service demands, …
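
The two notions of load on this slide can be written out as follows for a single server. The symbols are assumptions introduced here for illustration and do not appear on the slides.

```latex
% Assumed notation (not from the slides):
%   \lambda = mean arrival rate, E[S] = mean service demand,
%   N = MPL, E[Z] = mean think time, E[R] = mean response time,
%   X = throughput of the closed system.
\begin{align*}
\rho_{\text{open}} &= \lambda \, E[S] \\
X_{\text{closed}} &= \frac{N}{E[R] + E[Z]}
  \qquad \text{(Little's law applied to the full think/request cycle)} \\
\rho_{\text{closed}} &= X_{\text{closed}} \, E[S]
  = \frac{N \, E[S]}{E[R] + E[Z]}
\end{align*}
```

Because E[R] itself depends on N, E[Z], and the service demand distribution, the closed-system load depends on all of them, whereas the open-system load is fixed by the arrival rate and the mean service demand alone.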

  14. How do open and closed response times compare? FCFS scheduling. Open: Poisson arrival process. Closed: exponential think times.

  15. [Figure: mean response time (log scale, 10 to 1000) versus load (0 to 1) under FCFS scheduling, with a Poisson arrival process in the open system and exponential think times in the closed system. Curves: Open, Closed (MPL=10).] CLOSED << OPEN

  16. [Figure: mean response time (log scale, 10 to 1000) versus load (0 to 1) under FCFS scheduling, with a Poisson arrival process in the open system and exponential think times in the closed system. Curves: Open, Closed (MPL=1000), Closed (MPL=100), Closed (MPL=10).] CLOSED → OPEN

  17. OPEN MODEL VS CLOSED MODEL: CLOSED → OPEN AS MPL GROWS. As the MPL grows, the arrival rate becomes independent of the completion rate.
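
The convergence of the closed system to the open one can be explored numerically. The sketch below is not the authors' method. It assumes exponential service demands at a single FCFS server and uses exact mean value analysis (MVA) for the closed system, tuning the think time by bisection so that the closed system runs at the same load as an open M/M/1 queue. All function names are hypothetical.

```python
def closed_mva(mpl, think, service):
    """Exact MVA for one exponential FCFS server plus a think (delay) stage.

    Returns (load, mean_response_time) for a population of `mpl` users.
    """
    queue = 0.0
    for n in range(1, mpl + 1):
        resp = service * (1.0 + queue)   # arrival theorem
        thru = n / (think + resp)        # Little's law on the whole cycle
        queue = thru * resp              # Little's law at the server
    return thru * service, resp

def closed_response_at_load(mpl, load, service=1.0, iters=100):
    """Bisect on the mean think time until the closed system hits `load`."""
    lo, hi = 0.0, mpl * service / load   # load(lo) = 1 and load(hi) < load
    for _ in range(iters):
        mid = (lo + hi) / 2
        if closed_mva(mpl, mid, service)[0] > load:
            lo = mid                     # too busy: users must think longer
        else:
            hi = mid
    return closed_mva(mpl, hi, service)[1]

def open_response(load, service=1.0):
    """Mean response time of an M/M/1 FCFS queue at the given load."""
    return service / (1.0 - load)

for mpl in (10, 100, 1000):
    print(mpl, closed_response_at_load(mpl, 0.9), open_response(0.9))
```

Under these assumptions the closed response time stays below the open one at equal load and moves toward it as the MPL increases.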

  18. How quickly does closed → open? [Figure: mean response time (0 to 1500) versus service demand variability (low to high), with the range of web workloads marked. Curves: Open, Closed (MPL=1000), Closed (MPL=100), Closed (MPL=10).]

  19. Three principles 1. For a given load, mean response times are significantly lower in closed systems than in open systems. 2. As the MPL grows, closed systems converge to open systems, but the convergence is slow for practical purposes. 3. While variability has a large effect in open systems, the effect is much smaller in closed systems.

  20. OUR GOAL: What is the impact of the choice of an open or closed model? It matters a lot! • What is the impact on the effectiveness of scheduling? • What is the impact in practice?

  21. • FCFS (First-Come-First-Served): Jobs are processed in the same order as they arrive. • PS (Processor-Sharing): The server is shared evenly among all jobs in the system. • PESJF (Preemptive-Expected-Shortest-Job-First): The job with the smallest expected duration (size) is given preemptive priority. • SRPT (Shortest-Remaining-Processing-Time-First): At every moment, the request with the smallest remaining processing requirement is given priority.
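
As a toy illustration of two of the policies defined above, the following sketch computes mean response time under FCFS and under preemptive SRPT for a fixed list of jobs at a single server. It is a minimal sketch under stated assumptions, not the simulator used in the paper: job sizes are assumed to be known exactly, PS and PESJF are omitted, and the function names and the three-job example are hypothetical.

```python
import heapq

def mean_response_fcfs(jobs):
    """jobs: list of (arrival_time, size), sorted by arrival time."""
    clock, total = 0.0, 0.0
    for arrival, size in jobs:
        clock = max(clock, arrival) + size   # wait for the server, then run
        total += clock - arrival
    return total / len(jobs)

def mean_response_srpt(jobs):
    """Preemptive SRPT on one server; jobs sorted by arrival time."""
    heap = []                                # (remaining_work, arrival_time)
    clock, total, i, n = 0.0, 0.0, 0, len(jobs)
    while i < n or heap:
        if not heap:                         # server idle: jump to next arrival
            clock = max(clock, jobs[i][0])
        while i < n and jobs[i][0] <= clock: # admit everything that has arrived
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if remaining <= next_arrival - clock:
            clock += remaining               # job finishes before next arrival
            total += clock - arrival
        else:
            # Run until the next arrival, then re-evaluate priorities.
            heapq.heappush(heap, (remaining - (next_arrival - clock), arrival))
            clock = next_arrival
    return total / n

jobs = [(0.0, 10.0), (1.0, 1.0), (2.0, 1.0)]  # one long job, two short ones
fcfs = mean_response_fcfs(jobs)
srpt = mean_response_srpt(jobs)
print(fcfs, srpt)
```

In this example the two short jobs wait behind the long one under FCFS, while SRPT preempts the long job to serve them first.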

  22. SCHEDULING IS A KEY COMPONENT OF SYSTEM DESIGN: WEB SERVERS. Standard design: Processor Sharing (PS). Improved design: Shortest Remaining Processing Time (SRPT). Compare using a workload generator. Does the effectiveness of scheduling depend on the system model (open vs. closed)?

  23. SCHEDULING IN OPEN SYSTEMS. [Figure: mean response time (0 to 1000) versus load (0 to 1) in an open system under PLJF, FCFS, PS, and SRPT.] How do the closed results compare?

  24. CONTRASTING THE IMPACT OF SCHEDULING. [Figure: mean response time (0 to 1000) versus load (0 to 1) under PLJF, FCFS, PS, and SRPT, in an OPEN panel and a CLOSED panel.] Why? • Limited impact of variability in the closed system • Bounded number of jobs in the closed system • Dependencies between completions and arrivals in the closed system reduce burstiness

  25. Three principles 1. While open systems benefit significantly from scheduling with respect to response time, closed systems improve much less. 2. Scheduling only significantly improves response time in closed systems under very specific parameter settings: moderate load (think times) and high MPL. 3. Scheduling can limit the effect of variability in both open and closed systems.

  26. OUR GOAL: What is the impact of the choice of an open or closed model? It matters a lot, especially when evaluating scheduling policies! What is the impact in practice?

  27. OPEN VS CLOSED IN PRACTICE: 4 CASE STUDIES 1. Serving static web content 2. Database backend of an e-commerce site 3. Auctioning web site. Methods: testbed implementation and trace-based simulation.

  28. OPEN VS CLOSED IN PRACTICE: STATIC WEB SERVER. [Figure: mean response time (up to 300) versus load (0 to 1) under PS and SRPT, in an OPEN panel and a CLOSED panel (MPL=50).] Different models give different conclusions about the benefits of SRPT.

  29. [Figure: mean response time versus load (0 to 1), each in an OPEN panel and a CLOSED panel (MPL=50). E-COMMERCE SITE: PS versus PESJF, response time axis 0 to 10. AUCTION SITE: PS versus SRPT, response time axis 0 to 20.]

  30. OUR GOAL TODAY: What is the impact of the choice of an open or closed model? It matters a lot in practice, especially when evaluating scheduling policies! How can we identify whether to use an open or closed model?

  31. A MORE REALISTIC ALTERNATIVE: PARTLY-OPEN MODEL. [Figure: new arrivals enter the server queue; after each think, send, receive cycle the user returns to the system with probability q or leaves the system.] What parameters affect the load? Does think time affect the load? How do think times affect response times?

  32. FITTING A PARTLY-OPEN MODEL: The service demands are the file sizes from the trace.
     Trace:
       12 ip1 GET a.gif HTTP/1.0
       20 ip2 GET b.htm HTTP/1.0
       25 ip1 GET c.jpg HTTP/1.0
       27 ip1 GET d.txt HTTP/1.0
       28 ip3 GET a.htm HTTP/1.0
       35 ip4 GET d.gif HTTP/1.0
       45 ip2 GET e.htm HTTP/1.0
       : :

  33. FITTING A PARTLY-OPEN MODEL: Fitting the interarrival times (using the same trace as on the previous slide) • Distinguish users, e.g. use the IP address in a web trace • Identify user session boundaries: use periods of inactivity of length > timeout
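
The two fitting steps on this slide could be sketched as below. The timestamps and client labels are copied from the slide's trace, but treating the timestamps as seconds and using a timeout of 20 are assumptions made purely for illustration, and `split_sessions` is a hypothetical helper name.

```python
def split_sessions(trace, timeout):
    """Group requests into user sessions.

    trace: iterable of (timestamp, user_id) pairs sorted by timestamp.
    A user's session ends once that user is inactive for longer than
    `timeout`. Returns a list of sessions, each a list of timestamps.
    """
    last_seen = {}   # user_id -> timestamp of that user's latest request
    current = {}     # user_id -> that user's open session
    sessions = []
    for ts, user in trace:
        if user not in last_seen or ts - last_seen[user] > timeout:
            current[user] = []               # start a new session
            sessions.append(current[user])
        current[user].append(ts)
        last_seen[user] = ts
    return sessions

# Timestamps and clients from the slide's trace (units assumed to be seconds).
trace = [(12, "ip1"), (20, "ip2"), (25, "ip1"), (27, "ip1"),
         (28, "ip3"), (35, "ip4"), (45, "ip2")]
sessions = split_sessions(trace, timeout=20)
mean_requests = len(trace) / len(sessions)
print(len(sessions), mean_requests)
```

The number of sessions, and therefore the mean number of requests per visit, depends on the timeout chosen.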

  34. CHOOSING A TIMEOUT VALUE. [Figure: number of sessions (0 to 2e5) versus timeout length (0 to 30 min) for the financial, world cup, and dept store traces.]

  35. THE EFFECT OF THINK TIME: STATIC WEB SERVER. [Figure: mean response time (0 to 300) versus mean think time (1, 10, 100, 1000) under PS and SRPT.]

  36. A MORE REALISTIC ALTERNATIVE: PARTLY-OPEN MODEL. [Figure: new arrivals enter the server queue; after each think, send, receive cycle the user returns to the system with probability q or leaves the system.] q → 0 (number of requests per visit ↓): OPEN. q → 1 (number of requests per visit ↑): CLOSED. Workload generators are only open or closed!

  37. THE TRANSITION FROM OPEN → CLOSED: STATIC WEB SERVER. [Figure: mean response time (0 to 300) versus mean number of requests per visit (0 to 20) under PS and SRPT, with the PS open and PS closed results marked as the OPEN and CLOSED reference levels.]

  38. THE PARTLY-OPEN SYSTEM IN PRACTICE. [Figure: mean response time versus mean number of requests per visit (0 to 20) in three panels. STATIC WEB: PS versus SRPT, response time axis 0 to 200. E-COMMERCE SITE: PS versus PESJF, response time axis 0 to 9. AUCTIONING: PS versus SRPT, response time axis 0 to 15.]

  39. THESE DIFFERENCES ARE IMPORTANT IN PRACTICE. [Figure: PS versus SRPT under the OPEN, PARTLY-OPEN, and CLOSED models.]

  40. CHOOSING A SYSTEM MODEL: Web workloads: (1) Large corporate web; (2) CMU web server; (3) Online department store; (4) Science institute (USGS); (5) Online gaming site; (6) Financial service provider; (7) Supercomputing web site; (8) Kasparov-DeepBlue match; (9) Site seeing "slashdot effect"; (10) Soccer world cup. Open or closed? Use a partly-open model...

  41. CHOOSING A SYSTEM MODEL: For the same ten web workloads as on the previous slide: open or closed? Use a partly-open model to decide which is more accurate.

  42. HOW TO CHOOSE A SYSTEM MODEL: Gather a trace. How many simultaneous users are there? If >>1000: OPEN ≈ CLOSED. Else: fit a partly-open model to the trace. What is the expected number of visits? <5: OPEN. 5-10: ???. >10: CLOSED. [Figure: mean number of visits (0 to 15) versus timeout length (0 to 30 min) for the world cup, dept store, and financial traces.]

  43. CHOOSING A SYSTEM MODEL: Web workloads: (1) Large corporate web; (2) CMU web server; (3) Online department store; (4) Science institute (USGS); (5) Online gaming site; (6) Financial service provider; (7) Supercomputing web site; (8) Kasparov-DeepBlue match; (9) Site seeing "slashdot effect"; (10) Soccer world cup. <5 expected visits: OPEN. 5-10 expected visits: PARTLY OPEN. >10 expected visits: CLOSED.

  44. CHOOSING A SYSTEM MODEL: Web workloads. OPEN (<5 expected visits): (1) Large corporate web; (2) CMU web server; (4) Science institute (USGS); (6) Financial service provider; (8) Kasparov-DeepBlue match; (9) Site seeing "slashdot effect". PARTLY OPEN (5-10 expected visits): (3) Online department store; (7) Supercomputing web site. CLOSED (>10 expected visits): (5) Online gaming site; (10) Soccer world cup.
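
The thresholds on these slides amount to a simple decision rule, sketched below. The function name is hypothetical, and the treatment of the exact boundary values 5 and 10 (assigned here to the partly-open range) is an assumption, since the slides do not say which side they fall on.

```python
def choose_system_model(expected_visits):
    """Map the expected number of visits (requests per user session,
    estimated from a fitted partly-open model) to a system model."""
    if expected_visits < 5:
        return "open"
    if expected_visits <= 10:   # boundary handling is an assumption
        return "partly-open"
    return "closed"

for visits in (2, 7, 12):
    print(visits, choose_system_model(visits))
```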

  45. CONCLUSION • We examined the differences in behavior of closed, open, and partly-open systems. • These principles underscore the importance of choosing the appropriate system model. • Our findings provide guidelines for choosing whether an open or closed model is the better approximation, based on characteristics of the workload. • Understanding the appropriate system model is essential to understanding the impact of scheduling.
