Compiler Support for Multithreaded Software

Compiler Support for Multithreaded Software. Jeremy Condit Rob von Behren Feng Zhou Eric Brewer George Necula. Designing Concurrent Systems. The great debate: threads vs. events Thread model Each logically concurrent task is represented by a thread Modules communicate via call/return


Presentation Transcript


  1. Compiler Support for Multithreaded Software • Jeremy Condit, Rob von Behren, Feng Zhou, Eric Brewer, George Necula

  2. Designing Concurrent Systems • The great debate: threads vs. events • Thread model • Each logically concurrent task is represented by a thread • Modules communicate via call/return • Each thread gets its own stack • Event model • Each logically concurrent task is represented by an event • Modules communicate via event passing • Event handlers unwind stack after each event

  3. Conventional Wisdom • Recent research favors event-based model • Event handlers execute atomically • Lower overhead for managing state • Better scheduling and locality • More flexible control flow • TinyOS, SEDA, Flash, … • We argue that all of these benefits can be achieved in thread-based systems • Thread systems and event systems are duals • Duality proposed by Lauer and Needham in 1978 (message passing vs. process-based systems)

  4. The Stack Problem • How do we limit stack space? • Event systems: stacks are empty at end of handler • Thread systems: stacks can be arbitrarily large at (or between) blocking points • Old solution: preallocate large stacks • Inappropriate when memory available to each thread is limited • New solution: linked stack frames • This talk!

  5. Linked Stack Goals • Limit amount of preallocated memory • Enable stacks of arbitrary size • Recursive functions • Temporary buffers • Short-lived “spikes” in stack size • Provide development tools • Existing debuggers • Profiling tools to tune stack allocation

  6. Our Options • Preallocate whole stack • Preallocate small chunks • Never preallocate

  7. Instrumenting Call Sites • Add instrumentation to some call sites • Check for sufficient stack space • Allocate and link new chunk if necessary • How much space is sufficient? • Largest amount of stack space used until another instrumented call site is reached • How do we get this information? • Analyze call graph at compile time • Dynamic programming • Instrumenting call site ⇔ removing graph edge
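
The check performed at an instrumented call site can be modeled roughly as follows. This is only a sketch in Python, not the compiler-generated code the talk describes: the `Chunk` and `LinkedStack` classes, the `check` and `push` methods, and the byte counts are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    capacity: int      # bytes available in this chunk
    used: int = 0      # bytes currently occupied by frames


@dataclass
class LinkedStack:
    min_chunk: int                              # assumed MinChunk-style parameter
    chunks: list = field(default_factory=list)  # chunks in link order

    def __post_init__(self):
        self.chunks.append(Chunk(self.min_chunk))

    def free(self) -> int:
        top = self.chunks[-1]
        return top.capacity - top.used

    def check(self, needed: int) -> bool:
        """Instrumented call site: link a new chunk if `needed` bytes do not fit.

        Returns True when a new chunk was linked.
        """
        if self.free() >= needed:
            return False
        self.chunks.append(Chunk(max(self.min_chunk, needed)))
        return True

    def push(self, size: int) -> None:
        """Uninstrumented frame push; relies on an earlier check."""
        assert size <= self.free(), "an earlier check should have reserved this"
        self.chunks[-1].used += size


stack = LinkedStack(min_chunk=8192)
stack.push(6000)
first = stack.check(4096)    # only 2192 bytes free, so a chunk is linked
second = stack.check(1024)   # fits in the fresh chunk, nothing is linked
```

Here `needed` stands for the compile-time bound on stack use until the next instrumented call site, which is why the uninstrumented `push` can skip the test. Unlinking a chunk on return is omitted to keep the sketch short.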

  8. Call Graph Analysis Input: • Call graph • MaxPath parameter Output: • Set of edges to instrument • Stack bound for each node • Instrument all back edges • Process each node in call graph, bottom-up • For each successor • Let bound = successor’s bound + current node’s stack • If bound > MaxPath, instrument edge • Set node’s stack bound
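
The steps on this slide can be sketched as a post-order traversal of the call graph. This is one possible reading of the bullets, not the authors' implementation: the function name `analyze_call_graph`, the dictionary-based graph encoding, the rule that a node's bound is the maximum over its uninstrumented successors, and the example functions and sizes are all assumptions made for illustration.

```python
def analyze_call_graph(frame_size, callees, roots, max_path):
    """Return (instrumented edges, stack bound per function).

    frame_size: dict mapping function name -> its own stack use
    callees:    dict mapping function name -> list of called functions
    """
    instrumented = set()   # (caller, callee) edges that get a stack check
    bound = {}             # function -> worst-case use until the next check
    state = {}             # "active" while on the DFS path, "done" afterwards

    def visit(fn):
        state[fn] = "active"
        best = frame_size[fn]
        for callee in callees.get(fn, []):
            if state.get(callee) == "active":
                instrumented.add((fn, callee))   # back edge: always instrument
                continue
            if callee not in state:
                visit(callee)                    # successors first (bottom-up)
            candidate = frame_size[fn] + bound[callee]
            if candidate > max_path:
                instrumented.add((fn, callee))   # cut the path at this edge
            else:
                best = max(best, candidate)
        bound[fn] = best
        state[fn] = "done"

    for root in roots:
        if root not in state:
            visit(root)
    return instrumented, bound


# Hypothetical graph; sizes are in KB.
frame_size = {"main": 5, "parse": 4, "log": 3, "fmt": 6}
callees = {"main": ["parse", "log"], "parse": ["fmt"], "log": ["fmt", "log"]}
instrumented, bound = analyze_call_graph(frame_size, callees, ["main"], max_path=10)
```

In this example both calls out of `main` are instrumented because either path would exceed the limit of 10, and the recursive call in `log` is instrumented because it is a back edge. The sketch uses Python recursion, so very deep call graphs would need an explicit work list.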

  9. Call Graph Example • [Figure: example call graph with labels ranging from 1k to 10k] • MaxPath = 10k

  10. Wasted Space • Two kinds of wasted space: • Internal: unused space at the end of interior chunks • External: unused space at the end of the final chunk
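
As a small worked example of the two definitions, the function below totals both kinds of waste for a stack described as a list of chunks. The `(capacity, used)` tuple encoding and the byte counts are assumptions for illustration only.

```python
def wasted_space(chunks):
    """Return (internal, external) waste in bytes.

    chunks: list of (capacity, used) tuples in link order; the last
    entry is the final chunk, all earlier entries are interior chunks.
    """
    if not chunks:
        return 0, 0
    internal = sum(cap - used for cap, used in chunks[:-1])
    last_cap, last_used = chunks[-1]
    external = last_cap - last_used
    return internal, external


# Two 8192-byte chunks: the first was abandoned with 6000 bytes in use,
# the second currently holds 3000 bytes.
internal, external = wasted_space([(8192, 6000), (8192, 3000)])
```

With these numbers the interior chunk leaves 2192 bytes unused (internal waste) and the final chunk leaves 5192 bytes unused (external waste).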

  11. Tuning • Two parameters: • MaxPath: maximum desired path length • MinChunk: minimum allowable chunk size • Tradeoffs: • Amount of instrumentation • Internal wasted space • External wasted space • [Figure: diagram relating these tradeoffs to MaxPath and MinChunk]

  12. Results: Apache 2.0.44 • MaxPath = 4 KB, MinChunk = 8 KB • Compile-time statistics • 7500 call sites • 17% instrumented external calls • 5% instrumented internal calls • Run-time statistics (one request) • 1300 function calls • 25% instrumented external calls • instrumentation can be eliminated in 95% of cases • 8% instrumented internal calls • new chunk linked in 25% of cases • 1000 instructions per request

  13. Conclusions • Threads and events are duals • With proper compiler support, threads can perform just as well as events • Threads provide a more appropriate abstraction for many concurrent applications • Linked stacks can reduce stack waste • One example of compiler support for threads
