
Admission Control and Request Scheduling in Dynamic E-Commerce Web Sites

Sameh Elnikety, Erich Nahum, John Tracey, Willy Zwaenepoel. C.S. Dept., EPFL; IBM T.J. Watson Research Center.



  1. Admission Control and Request Scheduling in Dynamic E-Commerce Web Sites
     Sameh Elnikety, Erich Nahum, John Tracey, Willy Zwaenepoel
     C.S. Dept., EPFL; IBM T.J. Watson Research Center

  2. Dynamic Content (figure with three numbered steps)

  3. Increasing Online Commerce
     • $11B in 3rd Quarter 2002 (up 37%)
     • $11B in last 2 months of 2002 (up 40%)
     (Source: News.com)

  4. Two Key Problems
     • Overloaded Web Sites:
       • The “Slashdot Effect”
       • Unanticipated load causes site to crash
     • Unresponsive Web Sites:
       • The “Abandoned Shopping Cart”
       • Unacceptable delays lead to reduced usage
       • Reduced usage leads to reduced $$$
     How can we address these problems for dynamic sites?

  5. Generating Dynamic Content
     (Diagram: Web Server, Dynamic Content Generator, Database Server; clients connect over HTTP)
     • Consists of 3 Components:
       • Web Server: static content
       • Dynamic Content Generator: Java servlets
       • DB Server: state of the business

  6. Outline • Motivation & Background • The Gatekeeper Proxy • Admission Control • Request Scheduling • Experimental Environment • Results • Summary and Conclusions

  7. Admission Control
     (Figure labels: Ideal, Actual, Throughput, Load)
     • To prevent overload, perform admission control:
       • Notion of capacity in the system
       • Identify the job ahead of time & amount of work generated
       • Only let jobs in if they won’t overload system
     • Once you reach full capacity:
       • Make jobs wait
       • Drop jobs
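
The admission rule on this slide can be sketched in a few lines. This is a minimal illustration, not the Gatekeeper's actual code: the class name `AdmissionController`, its methods, and the choice to queue (rather than drop) jobs that do not fit are all assumptions made for the example.

```python
from collections import deque


class AdmissionController:
    """Admit a job only if its estimated work fits in the remaining capacity."""

    def __init__(self, capacity):
        self.capacity = capacity  # max work units before overload (assumed known)
        self.load = 0             # work units of jobs currently admitted
        self.waiting = deque()    # jobs held back until capacity frees up

    def submit(self, job_id, cost):
        """Return True if the job is admitted now, False if it has to wait."""
        if self.load + cost <= self.capacity:
            self.load += cost
            return True
        self.waiting.append((job_id, cost))
        return False

    def complete(self, cost):
        """Release a finished job's work units, then admit waiting jobs that fit."""
        self.load -= cost
        admitted = []
        while self.waiting and self.load + self.waiting[0][1] <= self.capacity:
            job_id, job_cost = self.waiting.popleft()
            self.load += job_cost
            admitted.append(job_id)
        return admitted
```

With a capacity of 100, a job of cost 60 is admitted, a following job of cost 50 waits, and it is released once the first job completes. A dropping policy would simply return a rejection instead of appending to `waiting`.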

  8. The Gatekeeper Transparent Proxy
     (Diagram: Web Server, Dynamic Content Generator, Gatekeeper, Database Server; HTTP)
     • Transparently intercepts DB requests
       • Connections to the DB via the JDBC interface
     • Maintains several measurement-based estimates:
       • Total capacity of the database
       • Current estimate of DB load
       • Work generated by each query type

  9. Estimating Work by Query Type
     • Key Observations:
       • Queries of the same type take (roughly) the same time
       • Different queries differ greatly in execution time
       • Any web site has a finite number of query types
     • Gatekeeper maintains per-query work estimates
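
One way to keep per-query-type estimates is to collapse literal values so that queries differing only in parameters share a key, then average the measured execution times per key. The sketch below is an assumption about how this could be done, not a description of the Gatekeeper: `query_type`, `WorkEstimator`, the regular expression, and the 100 ms default for unseen types are all hypothetical.

```python
import re
from collections import defaultdict

# Matches quoted string literals and standalone numbers (illustrative, not a SQL parser).
_LITERAL = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")


def query_type(sql):
    """Collapse literals so queries that differ only in parameters share a type."""
    return _LITERAL.sub("?", sql).strip().lower()


class WorkEstimator:
    """Running mean of measured execution time per query type."""

    def __init__(self, default_ms=100.0):
        self.default_ms = default_ms       # guess for a type never seen before
        self.total_ms = defaultdict(float)
        self.count = defaultdict(int)

    def record(self, sql, elapsed_ms):
        key = query_type(sql)
        self.total_ms[key] += elapsed_ms
        self.count[key] += 1

    def estimate(self, sql):
        key = query_type(sql)
        if self.count[key] == 0:
            return self.default_ms
        return self.total_ms[key] / self.count[key]
```

Because a site has a finite number of query types, the two dictionaries stay small even under heavy load.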

  10. Service Time Distributions

  11. Service Time Distributions

  12. TPC-W: Execution Times (note: times are on a log scale)

  13. Estimating System Capacity
      • Query execution time = load or work units of a job
      • Database capacity = max # work units before overload
        • Rough approximation
        • Unit approximates resource usage
      • Use binary search to determine capacity
        • More elaborate methods (adaptive, control theoretic, etc)
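
The binary search mentioned on this slide might look like the sketch below. It assumes a hypothetical probe `is_overloaded(limit)` that runs the workload with the given work-unit limit and reports whether the database overloaded, that overload is monotone in the limit, and that `low` itself is a safe limit. None of these details come from the slides.

```python
def find_capacity(is_overloaded, low, high):
    """Return the largest limit in [low, high] that does not overload.

    Assumes is_overloaded(low) is False and that once a limit overloads
    the system, every larger limit does too.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if is_overloaded(mid):
            high = mid - 1           # mid is too much work; search below it
        else:
            low = mid                # mid is safe; the capacity is at least mid
    return low
```

Each probe is a full measurement run, so the logarithmic number of probes is the main attraction over a linear sweep.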

  14. Admission Control - Example
      (Figure: four numbered steps showing queries Q1, Q2, Q3 and the values 700, 695, 195, 200)

  15. Scheduling: Theory and Practice
      • Theory: SRPT scheduling is best
        • SRPT: shortest remaining processing time
        • Proven to have minimum mean response time (Schrage 68)
        • Assumes perfect prediction of work costs
        • Assumes pre-emption has zero overhead and does not affect service time
      • Practice: not so simple
        • Pre-emption isn’t free (context switch costs, cache affinity)
        • Priorities and inheritance
        • Deadlock (e.g., Q1 is holding a lock when pre-empted)
      • Gatekeeper:
        • Use shortest job first (SJF) policy
        • Once a job (query) is admitted, it is never pre-empted
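
A non-preemptive SJF queue can be sketched with a binary heap keyed on the estimated cost. The class name `SJFQueue` and its interface are assumptions for illustration; the slides do not show how the Gatekeeper stores pending queries.

```python
import heapq
import itertools


class SJFQueue:
    """Pending jobs ordered by estimated cost; the cheapest job leaves first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: equal costs leave in arrival order

    def push(self, job_id, est_cost):
        heapq.heappush(self._heap, (est_cost, next(self._seq), job_id))

    def pop(self):
        """Remove and return the job with the smallest estimated cost."""
        _, _, job_id = heapq.heappop(self._heap)
        return job_id

    def __len__(self):
        return len(self._heap)
```

Ordering applies only to jobs still waiting. Once a job is popped and sent to the database it runs to completion, which matches the "never pre-empted" rule on the slide.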

  16. Request Scheduling - Example
      Two jobs with service times 500 and 10:
      • Order 500, 10: (0+500) + (500+10) = 1010 → average 505
      • Order 10, 500: (0+10) + (10+500) = 520 → average 260
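
The arithmetic on this slide can be checked with a short helper. It assumes a single server that runs jobs back to back with all jobs present at time zero; the function name is chosen for the example.

```python
def average_response_time(service_times):
    """Mean response time when jobs run one after another in the given order.

    Each job's response time is its waiting time (the sum of the service
    times ahead of it) plus its own service time.
    """
    clock = 0
    total = 0
    for service in service_times:
        clock += service  # this job finishes at the current clock value
        total += clock
    return total / len(service_times)


long_first = average_response_time([500, 10])   # (0+500) + (500+10) = 1010, mean 505
short_first = average_response_time([10, 500])  # (0+10) + (10+500) = 520, mean 260
```

Running the short job first nearly halves the average, because only one job waits and it waits for 10 rather than 500.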

  17. Outline • Motivation & Background • The Gatekeeper Proxy • Experimental Environment • Software & Hardware • Metrics & Methodology • Results • Summary and Conclusions

  18. Workload Generation
      (Diagram labels: Requests, Responses)
      • Workload generators typically used for experimental server performance evaluation
      • Many available for use with static content:
        • WebStone, SPECweb, SURGE, httperf, WaspClient
      • Only 1 available for e-Commerce: TPC-W

  19. TPC-W
      • Transaction Processing Performance Council (TPC) benchmark TPC-W
        • TPC better known for database workloads like TPC-D
        • Provides specification, not source
        • Use the implementation from Dynaserver project at Rice
      • Models a large e-commerce site: Amazon
        • Web serving, searching, browsing, shopping carts
        • Secure purchasing (SSL), best sellers, new products
        • Customer registration, administrative updates
      • Persistent data
        • Static images on Web Server
        • All others on back-end database

  20. TPC-W: Snapshot
      (Screenshot labels: Image, Promo, Shopping Cart, Next Interaction)

  21. TPC-W: Interactions
      • 14 Interactions, e.g.:
        • Home (read-only query)
        • Best sellers (complex)
        • Secure payment (SSL)
        • Shopping cart (update query)
      • Workload Mixes
        • Browsing (95% read-only)
        • Shopping (80% read-only)
        • Ordering (50% read-only)

  22. TPC-W: Queries
      Query 1 (3 ms):
        SELECT c_uname FROM customer WHERE c_id = 10
      Query 2 (4000 ms):
        SELECT i_id, i_title, a_fname, a_lname
        FROM item, author, order_line
        WHERE item.i_id = order_line.ol_i_id
          AND item.i_a_id = author.a_id
          AND order_line.ol_o_id > (SELECT MAX(o_id)-3333 FROM orders)
          AND item.i_subject = 'ARTS'
        GROUP BY i_id, i_title, a_fname, a_lname
        ORDER BY SUM(ol_qty) DESC
        FETCH FIRST 50 ROWS ONLY

  23. TPC-W: Frequencies

  24. Software
      (Diagram: Web Server, Dynamic Content Generator, Database Server; HTTP)

  25. Hardware
      (Diagram: Apache, Tomcat, MySQL / DB2; HTTP, SQL)

  26. Emulated Clients
      (Diagram: Emulated Clients, Apache, Tomcat, MySQL / DB2; HTTP, SQL)
      • Remote Browser Emulator
        • Session duration
        • Think time
        • Markov model
      • Load is a function of the number of clients

  27. Experiments
      • Performance Metrics:
        • Throughput (interactions/minute)
        • Response time (msec, submission to completion)
        • Examine each as a function of load (# of clients)
      • Examine two locking approaches:
        • Locking in the database (slower, more general)
        • Locking in the application server (faster, less general)
      • Methodology:
        • Average of 5 runs
        • Each run lasts 600 seconds
        • Measurement starts after 100 second warm-up
        • 90% confidence intervals
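
The slide does not say how the 90% confidence intervals were computed. A common choice for a small number of runs is a Student's t interval around the mean, sketched below under that assumption. The sample values in the usage note are made up for illustration and are not measurements from the talk.

```python
import math
import statistics

# Two-sided 90% critical value of Student's t with 4 degrees of freedom (5 runs).
T_90_DF4 = 2.132


def mean_with_ci(samples, t_crit=T_90_DF4):
    """Return (mean, half_width) of a t-based confidence interval.

    The default t_crit is only valid for exactly 5 samples; pass the
    matching critical value for any other sample count.
    """
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, t_crit * std_err
```

For five hypothetical throughput readings `[100, 102, 98, 101, 99]`, this gives a mean of 100 with a half-width of about 1.5.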

  28. Outline • Motivation & Background • The Gatekeeper Proxy • Experimental Environment • Results • Admission Control • Request Scheduling • Summary and Conclusions

  29. Admission Control - Throughput

  30. Admission Control - Throughput

  31. Admission Control - Explanation (Captured using sysstat utility on Linux)

  32. Admission Control - Explanation
      • Memory Pressure
        • Clients 200 to 300
        • Captured using Rabbit (Athlon performance counters)
        • L1 data cache misses increase 24%
        • L1 DTLB miss & L2 DTLB hit increases 25%
        • L1 DTLB miss & L2 DTLB miss increases 23%
      • Database Processes
        • Kernel linear and logarithmic overhead (e.g., maintain the ready queue)
        • Database logarithmic overhead (e.g., list operations, sorting, searching)

  33. Throughput – DB Lock Contention

  34. Throughput - DB2

  35. Outline • Motivation & Background • The Gatekeeper Proxy • Experimental Environment • Results • Admission Control • Request Scheduling • Summary and Conclusions

  36. Request Scheduling - Response Time

  37. Response Time - DB Lock Contention

  38. Request Scheduling - Explanation
      (Figure: a job of size 10000 alongside four jobs of size 1; label "10000/5")

  39. Request Scheduling - Analysis
      • Same throughput, lower response time
        • Response time = Waiting time + Execution (service) time
      • Fairness
        • FIFO: all wait for same amount of time
        • SJF: favors short requests
      Q: How much are long jobs penalized?

  40. Request Scheduling - Explanation
      • Short Job: “Exec Search”
      • Response time breakdown:
        • Service time unchanged: 400 ms
        • Waiting time reduced: 8000 ms -> 100 ms
        • 80x difference!

  41. Request Scheduling - Explanation
      • Long Job: “Admin Response”
      • Response time breakdown:
        • Service time unchanged: 4800 ms
        • Waiting time increases: 12890 ms -> 15621 ms
        • Wait time increases 21%
        • Response time increases 13%

  42. Request Scheduling - Explanation
      • Average over all requests
      • Response time breakdown:
        • Service time unchanged: 428 ms
        • Waiting time decreases: 8856 ms -> 225 ms
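
The breakdown slides can be tied together with the relation response time = waiting time + service time. The sketch below recomputes three of the reported ratios from the slide figures. It assumes the "before" values are FIFO and the "after" values are SJF, which the slides imply but do not label; the variable names are chosen for the example.

```python
def response_time(wait_ms, service_ms):
    """Response time = waiting time + execution (service) time."""
    return wait_ms + service_ms


# Short job ("Exec Search"): waiting time drops from 8000 ms to 100 ms.
short_wait_ratio = 8000 / 100                    # 80x shorter wait

# Long job ("Admin Response"): waiting time grows from 12890 ms to 15621 ms.
long_wait_increase = (15621 - 12890) / 12890     # about 0.21, i.e. 21%

# Average over all requests: the 428 ms service time is unchanged.
avg_before = response_time(8856, 428)            # 9284 ms
avg_after = response_time(225, 428)              # 653 ms
avg_speedup = avg_before / avg_after             # about 14x
```

The average speedup of roughly 14 agrees with the "Improves response time 14 times" figure on the summary slide.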

  43. Preventing Starvation (aging mechanism, locking in App Server)

  44. Preventing Starvation (aging mechanism, locking in DB)
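
The slides name an aging mechanism but do not describe it. One simple form of aging, shown here purely as an assumption, lowers a job's effective cost in proportion to how long it has waited. Since every waiting job ages at the same rate, ordering by `est_cost + aging_rate * arrival_ms` gives the same order as ordering by `est_cost - aging_rate * waited_ms`, so a static heap key is enough. `AgingSJFQueue` and `aging_rate` are hypothetical names.

```python
import heapq
import itertools


class AgingSJFQueue:
    """SJF queue in which waiting time gradually offsets a job's estimated cost."""

    def __init__(self, aging_rate):
        self.aging_rate = aging_rate   # hypothetical: cost units forgiven per ms waited
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal keys

    def push(self, job_id, est_cost, arrival_ms):
        # Static key equivalent to est_cost - aging_rate * (now - arrival_ms),
        # because the "now" term is the same for every job in the queue.
        key = est_cost + self.aging_rate * arrival_ms
        heapq.heappush(self._heap, (key, next(self._seq), job_id))

    def pop(self):
        """Remove and return the job with the smallest aged cost."""
        return heapq.heappop(self._heap)[2]
```

With `aging_rate = 0` this reduces to plain SJF. With a positive rate, a long job that arrived early is eventually served ahead of short jobs that arrive much later, which bounds how long it can be starved.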

  45. Related Work
      • Admission Control/QoS for Static Content Web Servers:
        • Bhatti99, Li00, Voigt01, Abdelzaher02, Pradhan02, Voigt02
        • Identify content via IP addr, URL, Cookie
        • Provide throughput/resp. time/BW guarantees
      • Request Scheduling:
        • Crovella99, Bansal01, Schroeder02
        • Use SRPT scheduling for static content servers
        • Better response time, reasonable fairness, better overload protection
      • Dynamic Content:
        • Dynaserver project at Rice/EPFL
        • Iyengar97, Challenger00: Fragments, dependency graphs, caching
        • Akamai Edge Side Includes

  46. Summary
      • Presented the Gatekeeper Proxy
        • Transparent, DB-independent
      • Admission Control
        • Consistent performance during overload
        • Improves throughput 10%
      • Request Scheduling using SJF
        • Improves response time 14 times
        • Penalizes long jobs only 13%

  47. Future Work
      • Workloads where application server is bottleneck
        • Place Gatekeeper in front of application server
      • Workload characterization
        • Get dynamic site traces from IGS
        • See if TPC-W is representative
      • System support for dynamic content
        • Use Linux profiling support to identify bottlenecks
        • Implement and evaluate improvements
      • Scaling issues in multiple-tiered Web sites
        • Content-aware back-end redirection

  48. Thank You!

  49. TPC-W Queries

  50. TPC-W Resources (Shopping Mix)
      Conclusion: Bottleneck is DB lock contention
