Capriccio: Scalable Threads for Internet Services

Rob von Behren, Jeremy Condit, Feng Zhou, George Necula and Eric Brewer

University of California at Berkeley

{jrvb, jcondit, zf, necula, brewer}@cs.berkeley.edu

http://capriccio.cs.berkeley.edu

The Stage
  • Highly concurrent applications
    • Internet servers & frameworks
      • Flash, Ninja, SEDA
    • Transaction processing databases
  • Workload
    • High performance
    • Unpredictable load spikes
    • Operate “near the knee”
    • Avoid thrashing!

[Figure: performance vs. load (concurrent tasks): the ideal curve, the peak where some resource is at max, and overload where some resource thrashes]

The Price of Concurrency
  • What makes concurrency hard?
    • Race conditions
    • Code complexity
    • Scalability (no O(n) operations)
    • Scheduling & resource sensitivity
    • Inevitable overload
  • Performance vs. Programmability
    • No current system provides both
    • Must be a better way!

[Figure: ease of programming vs. performance, with threads, events, and the ideal point; neither threads nor events reach the ideal on both axes]

The Answer: Better Threads
  • Goals
    • Simplify the programming model
      • Thread per concurrent activity
      • Scalability (100K+ threads)
    • Support existing APIs and tools
    • Automate application-specific customization
  • Tools
    • Plumbing: avoid O(n) operations
    • Compile-time analysis
    • Run-time analysis
  • Claim: User-Level threads are key
The Case for User-Level Threads
  • Decouple programming model and OS
    • Kernel threads
      • Abstract hardware
      • Expose device concurrency
    • User-level threads
      • Provide clean programming model
      • Expose logical concurrency
  • Benefits of user-level threads
    • Control over concurrency model!
    • Independent innovation
    • Enables static analysis
    • Enables application-specific tuning

[Figure: user-level threads layered between the application and the OS]

Capriccio Internals
  • Cooperative user-level threads
    • Fast context switches
    • Lightweight synchronization
  • Kernel Mechanisms
    • Asynchronous I/O (Linux)
  • Efficiency
    • Avoid O(n) operations
    • Fast, flexible scheduling
Safety: Linked Stacks


  • The problem: fixed stacks
    • Overflow vs. wasted space
    • Limits thread numbers
  • The solution: linked stacks
    • Allocate space as needed
    • Compiler analysis
      • Add runtime checkpoints
      • Guarantee enough space until next check

[Figure: a fixed stack versus a chain of linked stack chunks]

Linked Stacks: Algorithm
  • Parameters
    • MaxPath
    • MinChunk
  • Steps
    • Break cycles
    • Trace back
  • Special Cases
    • Function pointers
    • External calls
    • Use large stack

[Figure: example call graph annotated with per-function frame sizes (3, 3, 2, 5, 2, 4, 3, 6), analyzed with MaxPath = 8]

Scheduling: The Blocking Graph


  • Lessons from event systems
    • Break app into stages
    • Schedule based on stage priorities
    • Allows SRCT scheduling, finding bottlenecks, etc.
  • Capriccio does this for threads
    • Deduce stage with stack traces at blocking points
    • Prioritize based on runtime information

[Figure: blocking graph for a web server, with nodes including Accept, Open, Read, Write, and Close]

Resource-Aware Scheduling
  • Track resources used along BG edges
    • Memory, file descriptors, CPU
    • Predict future from the past
    • Algorithm
      • Increase use when underutilized
      • Decrease use near saturation
  • Advantages
    • Operate near the knee w/o thrashing
    • Automatic admission control
Thread Performance
  • Slightly slower thread creation
  • Faster context switches
    • Even with stack traces!
  • Much faster mutexes

[Table: time of thread operations (microseconds)]

Runtime Overhead
  • Tested Apache 2.0.44
  • Stack linking
    • 78% slowdown for null call
    • 3-4% overall
  • Resource statistics
    • 2% (enabled all the time)
    • 0.1% (with sampling)
  • Stack traces
    • 8% overhead
Future Work
  • Threading
    • Multi-CPU support
    • Kernel interface
  • Compile-time techniques (enabled by user-level threads)
    • Variations on linked stacks
    • Static blocking graph
    • Atomicity guarantees
  • Scheduling
    • More sophisticated prediction
Conclusions
  • Capriccio simplifies high concurrency
    • Scalable & high performance
    • Control over concurrency model
      • Stack safety
      • Resource-aware scheduling
      • Enables compiler support, invariants
  • Themes
    • User-level threads are key
    • Compiler techniques very promising