Capriccio: Scalable Threads for Internet Services

Rob von Behren, Jeremy Condit, Feng Zhou, George Necula, and Eric Brewer. University of California at Berkeley. {jrvb, jcondit, zf, necula, brewer}@cs.berkeley.edu. http://capriccio.cs.berkeley.edu


Presentation Transcript


  1. Capriccio: Scalable Threads for Internet Services
     Rob von Behren, Jeremy Condit, Feng Zhou, George Necula, and Eric Brewer
     University of California at Berkeley
     {jrvb, jcondit, zf, necula, brewer}@cs.berkeley.edu
     http://capriccio.cs.berkeley.edu

  2. The Stage
     • Highly concurrent applications
       • Internet servers & frameworks
         • Flash, Ninja, SEDA
       • Transaction processing databases
     • Workload
       • High performance
       • Unpredictable load spikes
     • Operate “near the knee”
       • Avoid thrashing!
     [Figure: performance vs. load (concurrent tasks), showing the ideal curve, the peak (some resource at max), and overload (some resource thrashing)]

  3. The Price of Concurrency
     • What makes concurrency hard?
       • Race conditions
       • Code complexity
       • Scalability (no O(n) operations)
       • Scheduling & resource sensitivity
       • Inevitable overload
     • Performance vs. programmability
       • No current system solves both
       • Must be a better way!
     [Figure: ease of programming vs. performance, locating threads, events, and the ideal]

  4. The Answer: Better Threads
     • Goals
       • Simplify the programming model
         • Thread per concurrent activity
         • Scalability (100K+ threads)
       • Support existing APIs and tools
       • Automate application-specific customization
     • Tools
       • Plumbing: avoid O(n) operations
       • Compile-time analysis
       • Run-time analysis
     • Claim: user-level threads are key

  5. The Case for User-Level Threads
     • Decouple programming model and OS
       • Kernel threads
         • Abstract hardware
         • Expose device concurrency
       • User-level threads
         • Provide clean programming model
         • Expose logical concurrency
     • Benefits of user-level threads
       • Control over concurrency model!
       • Independent innovation
       • Enables static analysis
       • Enables application-specific tuning
     [Figure: layered diagram of App, User Threads, and OS]

  7. Capriccio Internals
     • Cooperative user-level threads
       • Fast context switches
       • Lightweight synchronization
     • Kernel mechanisms
       • Asynchronous I/O (Linux)
     • Efficiency
       • Avoid O(n) operations
       • Fast, flexible scheduling
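The idea of cooperative user-level threads with an O(1) scheduler can be illustrated with a small sketch. This is not Capriccio's implementation (which is a C library); it uses Python generators as stand-in threads, and the names `worker` and `run` are hypothetical. Each `yield` plays the role of a voluntary context switch, and the ready queue supports constant-time enqueue and dequeue regardless of how many threads exist.

```python
from collections import deque


def worker(name, steps, log):
    """Hypothetical thread body: each `yield` is a voluntary context switch."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # give up the CPU cooperatively


def run(threads):
    """Round-robin over the ready queue; enqueue and dequeue are both O(1)."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # resume the thread until its next yield
        except StopIteration:
            continue              # thread finished; do not requeue it
        ready.append(thread)      # still runnable; put it at the back


log = []
run([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Because a thread only loses the CPU at a `yield`, code between yield points runs atomically with respect to other threads on the same CPU, which is what makes synchronization cheap in a cooperative design.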

  8. Safety: Linked Stacks
     • The problem: fixed stacks
       • Overflow vs. wasted space
       • Limits thread numbers
     • The solution: linked stacks
       • Allocate space as needed
       • Compiler analysis
         • Add runtime checkpoints
         • Guarantee enough space until next check
     [Figure: fixed stacks compared with a linked stack]
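A toy model may help show what a runtime checkpoint does. The sketch below is an assumption-laden illustration, not Capriccio's code: the class `LinkedStack`, its methods, and the byte sizes are hypothetical, and `min_chunk` borrows its name from the MinChunk parameter on the algorithm slide. A checkpoint asks for the space needed until the next checkpoint; if the current chunk cannot supply it, a new chunk is linked in, and it is unlinked again when the call returns.

```python
class LinkedStack:
    """Toy model of a linked stack: a list of chunks, each [capacity, used]."""

    def __init__(self, min_chunk):
        self.min_chunk = min_chunk
        self.chunks = []

    def checkpoint(self, needed):
        """Guarantee `needed` free bytes; return True if a chunk was linked."""
        if self.chunks:
            capacity, used = self.chunks[-1]
            if capacity - used >= needed:
                return False      # enough room in the current chunk
        self.chunks.append([max(needed, self.min_chunk), 0])
        return True

    def push(self, size):
        """Allocate a frame in the current chunk."""
        chunk = self.chunks[-1]
        if chunk[1] + size > chunk[0]:
            raise OverflowError("frame does not fit; a checkpoint was missing")
        chunk[1] += size

    def pop(self, size, unlink):
        """Release a frame; unlink the chunk if the call had linked it."""
        self.chunks[-1][1] -= size
        if unlink:
            self.chunks.pop()

    def allocated(self):
        return sum(capacity for capacity, _ in self.chunks)


stack = LinkedStack(min_chunk=4096)
first = stack.checkpoint(1024)    # no chunk yet, so one is linked
stack.push(1024)
second = stack.checkpoint(2048)   # 3072 bytes still free, nothing linked
stack.push(2048)
third = stack.checkpoint(8192)    # only 1024 bytes free, link a new chunk
stack.push(8192)
peak = stack.allocated()
stack.pop(8192, unlink=third)     # returning frees the extra chunk
after = stack.allocated()
print(first, second, third, peak, after)
```

The point of the design is that memory grows with actual call depth rather than with a worst-case fixed stack size per thread.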

  9. Linked Stacks: Algorithm
     • Parameters
       • MaxPath
       • MinChunk
     • Steps
       • Break cycles
       • Trace back
     • Special cases
       • Function pointers
       • External calls
       • Use large stack
     [Figure: weighted call graph with node weights 3, 3, 2, 5, 2, 4, 3, 6; MaxPath = 8]
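The two steps named above (break cycles, trace back) can be sketched on a weighted call graph. This is one plausible reading of the slide, not the authors' exact algorithm: the function `place_checkpoints`, the example graph, and its frame sizes are hypothetical, MinChunk and the special cases are ignored, and every frame is assumed to be no larger than MaxPath. A checkpoint is placed on every recursive (back) edge, and then on any call edge where the caller's frame plus the callee's longest checkpoint-free path would exceed MaxPath.

```python
def place_checkpoints(graph, frame_size, root, max_path):
    """Return the set of call edges (caller, callee) that need a checkpoint."""
    checkpoints = set()
    longest = {}      # node -> longest checkpoint-free path starting there
    on_stack = set()  # nodes on the current DFS path, used to detect cycles

    def visit(node):
        on_stack.add(node)
        best = 0
        for callee in graph.get(node, []):
            if callee in on_stack:
                # Step 1: break cycles with a checkpoint on the back edge.
                checkpoints.add((node, callee))
                continue
            if callee not in longest:
                visit(callee)
            # Step 2: trace back from the leaves toward the root.
            if frame_size[node] + longest[callee] > max_path:
                checkpoints.add((node, callee))
            else:
                best = max(best, longest[callee])
        longest[node] = frame_size[node] + best
        on_stack.discard(node)

    visit(root)
    return checkpoints


# Hypothetical call graph; `eval` is directly recursive.
frame_size = {"main": 3, "parse": 3, "eval": 2, "leaf": 5}
graph = {
    "main": ["parse", "eval"],
    "parse": ["leaf"],
    "eval": ["leaf", "eval"],
}
result = place_checkpoints(graph, frame_size, "main", max_path=8)
print(sorted(result))
```

With MaxPath = 8, `parse` to `leaf` needs no checkpoint because 3 + 5 fits exactly, while both calls out of `main` do, since 3 + 8 and 3 + 7 exceed the bound.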

  15. Scheduling: The Blocking Graph
     • Lessons from event systems
       • Break app into stages
       • Schedule based on stage priorities
       • Allows SRCT scheduling, finding bottlenecks, etc.
     • Capriccio does this for threads
       • Deduce stage with stack traces at blocking points
       • Prioritize based on runtime information
     [Figure: blocking graph for a web server, with nodes labeled Open, Read, Write, Accept, Read, Write, Close]
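As a rough sketch of how a blocking graph might be built at run time: each node is identified by the stack trace at a blocking point, and an edge records that a thread went from one blocking point to the next. The class `BlockingGraph`, the `on_block` hook, and the example traces are hypothetical; Capriccio's actual data structures are not shown in the slides.

```python
from collections import defaultdict


class BlockingGraph:
    """Nodes are blocking points (stack traces); edges are observed transitions."""

    def __init__(self):
        self.edge_counts = defaultdict(int)
        self.last_node = {}  # thread id -> node of its previous blocking point

    def on_block(self, thread_id, stack_trace):
        """Call whenever a thread blocks; the call chain identifies the stage."""
        node = tuple(stack_trace)
        prev = self.last_node.get(thread_id)
        if prev is not None:
            self.edge_counts[(prev, node)] += 1
        self.last_node[thread_id] = node


# Hypothetical traces for a tiny web server handling two connections.
accept = ("main", "serve", "accept")
read = ("main", "serve", "handle", "read")
write = ("main", "serve", "handle", "write")

bg = BlockingGraph()
for thread_id in (1, 2):
    for trace in (accept, read, write):
        bg.on_block(thread_id, trace)

for (src, dst), count in bg.edge_counts.items():
    print(src[-1], "->", dst[-1], count)
```

The graph is learned from execution rather than declared by the programmer, which is how a threaded program can recover the stage structure that event systems make explicit.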

  16. Resource-Aware Scheduling
     • Track resources used along blocking graph (BG) edges
       • Memory, file descriptors, CPU
       • Predict future from the past
     • Algorithm
       • Increase use when underutilized
       • Decrease use near saturation
     • Advantages
       • Operate near the knee without thrashing
       • Automatic admission control
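The increase/decrease rule can be sketched as a scheduling policy over blocking-graph nodes. Everything specific here is an assumption made for illustration: the class `ResourceAwareScheduler`, the exponentially weighted average used to "predict future from the past", and the 90% high-water mark do not come from the slides.

```python
class ResourceAwareScheduler:
    """Pick which blocking-graph node to run from, based on resource pressure."""

    def __init__(self, capacity, high_water=0.9):
        self.capacity = capacity
        self.high_water = high_water  # assumed saturation threshold
        self.avg_delta = {}           # node -> predicted change in resource use

    def record(self, node, delta, alpha=0.5):
        """Fold an observed resource change into the node's running average."""
        old = self.avg_delta.get(node, delta)
        self.avg_delta[node] = (1 - alpha) * old + alpha * delta

    def pick(self, ready_nodes, in_use):
        """Near saturation favor nodes that release the resource; else grow."""
        predicted = lambda node: self.avg_delta.get(node, 0.0)
        if in_use >= self.high_water * self.capacity:
            return min(ready_nodes, key=predicted)  # decrease use
        return max(ready_nodes, key=predicted)      # increase use


sched = ResourceAwareScheduler(capacity=100)
sched.record("accept", +10)  # accepting a connection consumes descriptors
sched.record("close", -10)   # closing one releases them

busy_choice = sched.pick(["accept", "close"], in_use=95)
idle_choice = sched.pick(["accept", "close"], in_use=10)
print(busy_choice, idle_choice)
```

Deprioritizing `accept` under pressure is what yields admission control as a side effect: new work is simply not admitted until existing work has released resources.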

  17. Thread Performance
     • Slightly slower thread creation
     • Faster context switches
       • Even with stack traces!
     • Much faster mutexes
     [Table: time of thread operations (microseconds)]

  18. Runtime Overhead
     • Tested Apache 2.0.44
     • Stack linking
       • 78% slowdown for null call
       • 3-4% overall
     • Resource statistics
       • 2% (on all the time)
       • 0.1% (with sampling)
     • Stack traces
       • 8% overhead

  19. Web Server Performance

  20. Future Work
     • Threading
       • Multi-CPU support
       • Kernel interface
     • (Enabled) compile-time techniques
       • Variations on linked stacks
       • Static blocking graph
       • Atomicity guarantees
     • Scheduling
       • More sophisticated prediction

  21. Conclusions
     • Capriccio simplifies high concurrency
       • Scalable & high performance
       • Control over concurrency model
         • Stack safety
         • Resource-aware scheduling
         • Enables compiler support, invariants
     • Themes
       • User-level threads are key
       • Compiler techniques very promising

  22. Apache Blocking Graph

  23. Microbenchmark: Buffer Cache

  24. Microbenchmark: Disk I/O

  25. Microbenchmark: Producer / Consumer

  26. Microbenchmark: pipetest
