
CSE 532 Fall 2013 Take Home Final Exam


  1. CSE 532 Fall 2013 Take Home Final Exam • Posted on the course web site in Word and Acrobat • Complete electronically then email to cdgill@cse.wustl.edu • Alternatively, you can print, complete, and submit hard copy • Please complete the exam without consulting others • Except Prof. Gill, to whom you can individually e-mail any clarifying questions, etc. that you may have as you go • You are free to use compilers, web site slides, etc. • Copies of the required and optional text books will be on reserve in the CSE Department office (Bryan 509C) • You may search for and read other sources of information on the internet, but you may not post questions for others there

  2. Construct a Thread to Launch It • The std::thread constructor takes any callable type • A function, function pointer, function object, or lambda • This includes things like member function pointers, etc. • Additional constructor arguments passed to callable instance • Watch out for argument passing semantics, though • Constructor arguments are copied locally without conversion • Need to wrap references in std::ref, force conversions, etc. • Default construction (without a thread) also possible • Can transfer ownership of it via C++11 move semantics, i.e., using std::move from one std::thread object to another • Be careful not to move thread ownership to a std::thread object that already owns one (terminates the program)

  3. Always Join or Detach a Launched Thread • Often you should join with each launched thread • E.g., to wait until a result it produces is ready to retrieve • E.g., to keep a resource it needs available for its lifetime • However, for truly independent threads, can detach • Relinquishes parent thread’s option to rendezvous with it • Need to copy all resources into the thread up front • Avoids circular wait deadlocks (if A joins B and B joins A) • Need to ensure join or detach for each thread • E.g., if an exception is thrown, still need to make it so • The guard (a.k.a. RAII) idiom helps with this, since guard’s destructor always joins or detaches if needed • The std::thread::joinable() method can be used to test that
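The guard (RAII) idiom mentioned above can be sketched like this: the destructor checks joinable() and joins, so the join happens even if an exception unwinds the scope (`thread_guard` and `guarded_work` are illustrative names, not a standard facility):

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Minimal RAII guard: joins in its destructor if the thread is still
// joinable, so every exit path (including exceptions) joins exactly once.
class thread_guard {
    std::thread t_;
public:
    explicit thread_guard(std::thread t) : t_(std::move(t)) {}
    ~thread_guard() { if (t_.joinable()) t_.join(); }
    thread_guard(const thread_guard&) = delete;
    thread_guard& operator=(const thread_guard&) = delete;
};

int guarded_work() {
    std::atomic<int> result{0};
    {
        thread_guard g(std::thread([&result] { result = 7; }));
        // ... code here may throw; g's destructor still joins ...
    }  // join happens here
    return result.load();
}
```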

  4. Design for Multithreaded Programming • Concurrency • Logical (single processor): instruction interleaving • Physical (multi-processor): parallel execution • Safety • Threads must not corrupt objects or resources • More generally, bad interleavings must be avoided • Atomic: runs to completion without being preempted • Granularity at which operations are atomic matters • Liveness • Progress must be made (deadlock is avoided) • Goal: full utilization (something is always running)
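The point about atomicity granularity can be made concrete: each `fetch_add` below is a single atomic step, whereas a separate load followed by a store is two steps whose interleaving with another thread can lose updates (the function name is hypothetical):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Safe counting: fetch_add is one atomic operation, so concurrent
// increments cannot be lost. The commented-out two-step version has
// the same effect in a single thread but races under concurrency.
long count_atomically(int threads, int per_thread) {
    std::atomic<long> total{0};
    std::vector<std::thread> ts;
    for (int i = 0; i < threads; ++i)
        ts.emplace_back([&] {
            for (int k = 0; k < per_thread; ++k)
                total.fetch_add(1);  // atomic at the right granularity
                // vs: long v = total.load(); total.store(v + 1);  // racy
        });
    for (auto& t : ts) t.join();
    return total.load();
}
```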

  5. Atomic Types • Many atomic types in C++11, at least some lock-free • Always lock-free: std::atomic_flag • If it matters, must test others with is_lock_free() • Also can specialize std::atomic<> class template • This is already done for many standard non-atomic types • Can also do this for your own types that implement a trivial copy-assignment operator and are bitwise equality comparable • Watch out for semantic details • E.g., bitwise evaluation of float, double, etc. representations • Equivalence may differ under atomic operations
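A small sketch of the points above: `std::atomic_flag`, `is_lock_free()`, and `std::atomic<>` over a trivially copyable user type (`Point` and `atomic_demo` are illustrative names):

```cpp
#include <atomic>

// A trivially copyable, bitwise-comparable type, so std::atomic<Point>
// is allowed (whether it is lock-free is platform-dependent).
struct Point { int x; int y; };

bool atomic_demo() {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;  // the only guaranteed lock-free type
    bool was_set = flag.test_and_set();        // false on first call

    std::atomic<int> counter{0};
    counter.fetch_add(5);
    bool lock_free = counter.is_lock_free();   // must be queried at run time
    (void)lock_free;

    std::atomic<Point> p{Point{1, 2}};         // user-type specialization
    Point q = p.load();
    return !was_set && counter.load() == 5 && q.x == 1 && q.y == 2;
}
```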

  6. Lock-Free and Wait-Free Semantics • Lock-free behavior never blocks (but may live-lock) • Suspension of one thread doesn’t impede others’ progress • Tries to do something, if cannot just tries again • E.g., while(head.compare_exchange_weak(n->next,n)); • Wait-free behavior never starves a thread • Progress of each is guaranteed (bounded number of retries) • Lock-free data structures try for maximum concurrency • E.g., ensuring some thread makes progress at every step • May not be strictly wait-free but that’s something to aim for • Watch out for performance costs in practice • E.g., atomic operations are slower, may not be worth it • Some platforms may not relax memory consistency well
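The compare_exchange_weak retry loop quoted above is the push of a lock-free stack; a minimal sketch follows. Note this ignores memory reclamation and the ABA problem (covered on the next slide), so the `delete` in pop is only safe when no other thread can still be reading the node:

```cpp
#include <atomic>
#include <utility>

// Lock-free stack: push retries until the CAS succeeds; on failure,
// compare_exchange_weak reloads n->next with the current head.
template <typename T>
class lock_free_stack {
    struct node { T value; node* next; };
    std::atomic<node*> head{nullptr};
public:
    void push(T value) {
        node* n = new node{std::move(value), head.load()};
        while (!head.compare_exchange_weak(n->next, n)) { /* retry */ }
    }
    bool pop(T& out) {
        node* old = head.load();
        while (old && !head.compare_exchange_weak(old, old->next)) { /* retry */ }
        if (!old) return false;
        out = std::move(old->value);
        delete old;  // unsafe in general without a reclamation scheme
        return true;
    }
};
```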

  7. Lock Free Design Guidelines • Prototype data structures using sequential consistency • Then analyze and test thread-safety thoroughly • Then look for meaningful opportunities to relax consistency • Use a lock-free memory reclamation scheme • Count threads and then delete when quiescent • Use hazard pointers to track threads’ accesses to an object • Reference count and delete in a thread-safe way • Detach garbage and delegate deletion to another thread • Watch out for the ABA problem • E.g., with coupled variables, pop-push-pop issues • Identify busy waiting, then steal or delegate work • E.g., if thread would be blocked, “help it over the fence”

  8. Dividing Work Between Threads • Static partitioning of data can be helpful • Makes threads (mostly) independent, ahead of time • Threads can read from and write to their own locations • Some partitioning of data is necessarily dynamic • E.g., Quicksort uses a pivot at run-time to split up data • May need to launch (or pass data to) a thread at run-time • Can also partition work by task-type • E.g., hand off specific kinds of work to specialized threads • E.g., a thread-per-stage pipeline that is efficient once primed • Number of threads to use is a key design challenge • E.g., std::thread::hardware_concurrency() is only a starting point (blocking, scheduling, etc. also matter)
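Static partitioning as described above can be sketched as a parallel sum: each thread reads its own contiguous block and writes only its own result slot, with `hardware_concurrency()` as the (rough) thread count (`parallel_sum` is a hypothetical helper):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Static partitioning: no locking needed, since each thread writes
// only to its own slot in `partial`.
long parallel_sum(const std::vector<int>& data) {
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 2;  // hardware_concurrency() may report 0
    n = std::min<unsigned>(n, data.empty() ? 1 : data.size());
    std::vector<long> partial(n, 0);
    std::vector<std::thread> threads;
    std::size_t chunk = data.size() / n;
    for (unsigned i = 0; i < n; ++i) {
        std::size_t lo = i * chunk;
        std::size_t hi = (i + 1 == n) ? data.size() : lo + chunk;
        threads.emplace_back([&, i, lo, hi] {
            partial[i] = std::accumulate(data.begin() + lo,
                                         data.begin() + hi, 0L);
        });
    }
    for (auto& t : threads) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```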

  9. Factors Affecting Performance • Need at least as many threads as hardware cores • Too few threads makes insufficient use of the resource • Oversubscription increases overhead due to task switching • Need to gauge for how long (and when) threads are active • Data contention and cache ping-pong • Performance degrades rapidly as cache misses increase • Need to design for low contention for cache lines • Need to avoid false sharing of elements (in same cache line) • Packing or spreading out data may be needed • E.g., localize each thread’s accesses • E.g., separate a shared mutex from the data that it guards
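"Spreading out data" to avoid false sharing can be sketched by padding each thread's counter to its own cache line. The 64-byte line size is an assumption here (C++17 later added std::hardware_destructive_interference_size for this):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each counter gets its own (assumed 64-byte) cache line, so threads
// incrementing different counters do not ping-pong the same line.
struct alignas(64) PaddedCounter {
    std::atomic<long> value{0};
};

long count_in_parallel() {
    PaddedCounter counters[4];
    std::vector<std::thread> ts;
    for (int i = 0; i < 4; ++i)
        ts.emplace_back([&counters, i] {
            for (long k = 0; k < 10000; ++k)
                counters[i].value.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& t : ts) t.join();
    long total = 0;
    for (auto& c : counters) total += c.value.load();
    return total;
}
```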

  10. Additional Considerations • Exception safety • Affects both lock-based and lock-free synchronization • Use std::packaged_task and std::future to allow for an exception being thrown in a thread (see listing 8.3) • Scalability • How much of the code is actually parallelizable? • Various theoretical formulas (including Amdahl’s) apply • Hiding latency • If nothing ever blocks you may not need concurrency • If something does, concurrency makes parallel progress • Improving responsiveness • Giving each thread its own task may simplify, speed up tasks
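The packaged_task/future point can be sketched briefly: an exception thrown inside the task is stored in the future and rethrown at get(), rather than terminating the worker thread (`risky` and `run_task` are illustrative names, not from listing 8.3):

```cpp
#include <future>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

int risky(int x) {
    if (x < 0) throw std::invalid_argument("negative");
    return x * 2;
}

// Run risky(x) on another thread; the future carries either the
// result or the exception back to the calling thread.
std::string run_task(int x) {
    std::packaged_task<int(int)> task(risky);
    std::future<int> f = task.get_future();
    std::thread t(std::move(task), x);
    t.join();
    try {
        return std::to_string(f.get());  // rethrows here on failure
    } catch (const std::invalid_argument& e) {
        return std::string("caught: ") + e.what();
    }
}
```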

  11. Thread Pools • Simplest version • A thread per core, all run a common worker thread function • Waiting for tasks to complete • Promises and futures give rendezvous with work completion • Could also post work results on an active object’s queue, which also may help avoid cache ping-pong • Futures also help with exception safety, e.g., a thrown exception propagates to the thread that calls get on the future • Granularity of work is another key design decision • Too small and the overhead of managing the work adds up • Too coarse and responsiveness and concurrency may suffer • Work stealing lets idle threads relieve busy ones • May need to hand off promise as well as work, etc.
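The "simplest version" above can be sketched as a fixed set of workers all running one common loop over a shared queue; futures, work stealing, and per-thread queues are deliberately omitted (`ThreadPool` is an illustrative class, not a standard facility):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;

    void worker_loop() {  // the common worker thread function
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // done_ set and queue drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }
public:
    explicit ThreadPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~ThreadPool() {  // drain remaining tasks, then join all workers
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> task) {
        { std::lock_guard<std::mutex> lk(m_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }
};
```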

  12. Interrupting Threads (Part I) • Thread with interruption point is (cleanly) interruptible • Another thread can set a flag that it will notice and then exit • Clever use of lambdas, promises, move semantics lets a thread-local interrupt flag be managed (see listing 9.9) • Need to be careful to avoid dangling pointers on thread exit • For simple cases, detecting interruption may be trivial • E.g., event loop with interruption point checked each time • For condition variables interruption is more complex • E.g., using the guard idiom to avoid exception hazards • E.g., waiting with a timeout (and handling spurious wakes) • Can eliminate spurious wakes with a scheme based on a custom lock and a condition_variable_any (listing 9.12)
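A much-simplified interruption point for the "trivial" event-loop case can be sketched as a shared flag polled each iteration; the thread_local flag machinery of listing 9.9 and the condition-variable handling are omitted (the names below are hypothetical):

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> interrupt_flag{false};

// The worker checks the flag once per loop iteration (its interruption
// point) and exits cleanly when another thread sets it.
long interruptible_worker() {
    long iterations = 0;
    while (!interrupt_flag.load()) {  // interruption point
        ++iterations;                 // ... one unit of work ...
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return iterations;  // reached only via a clean interruption
}
```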

  13. Concurrency Related Bugs • Deadlock occurs when a thread never unblocks • Complete deadlock occurs when no thread ever unblocks • Blocking I/O can be problematic (e.g., if input never arrives) • Livelock is similar but involves futile effort • Threads are not blocked, but never make real progress • E.g., if a condition never occurs, or with protocol bugs • Data races and broken invariants • Can corrupt data, dangle pointers, double free, leak data • Lifetime of thread relative to its data also matters • If thread exits without freeing resources they can leak • If resources are freed before thread is done with them (or even gains access to them) behavior may be undefined
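One of the deadlocks described above (two threads acquiring two mutexes in opposite orders) has a standard cure worth sketching: std::lock acquires both mutexes deadlock-free, and adopt_lock hands them to RAII guards (the `transfer` scenario is hypothetical):

```cpp
#include <mutex>
#include <thread>

std::mutex m1, m2;
int shared_a = 0, shared_b = 0;

// Locking m1 then m2 in one thread and m2 then m1 in another can
// deadlock; std::lock takes both in an all-or-nothing step instead.
void transfer(int amount) {
    std::lock(m1, m2);  // deadlock-free acquisition of both
    std::lock_guard<std::mutex> g1(m1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(m2, std::adopt_lock);
    shared_a -= amount;
    shared_b += amount;  // invariant: shared_a + shared_b == 0
}
```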

  14. What is a Pattern Language? • A narrative that composes patterns • Not just a catalog or listing of the patterns • Reconciles design forces between patterns • Provides an outline for design steps • A generator for a complete design • Patterns may leave consequences • Other patterns can resolve them • Generative designs resolve all forces • Internal tensions don’t “pull design apart”

  15. Categories of Patterns (for CSE 532) • Service Access and Configuration • Appropriate programming interfaces/abstractions • Event Handling • Inescapable in networked systems • Concurrency • Exploiting physical and logical parallelism • Synchronization • Managing safety and liveness in concurrent systems

  16. Wrapper Facade • Combines related functions/data (OO, generic) • Used to adapt existing procedural APIs • Offers better interfaces • Concise, maintainable, portable, cohesive, type safe • E.g., wraps pthread_create(thread, attr, start_routine, arg), pthread_exit(status), pthread_cancel(thread), … as C++11 std::thread: thread(), thread(function, args), ~thread(), join(), …

  17. Asynchronous Completion Token Pattern • A service (eventually) passes a “cookie” to client • Examples with C++11 futures and promises • A future (eventually) holds ACT (or an exception) from which initiator can obtain the result • Client thread can block on a call to get the data or can repeatedly poll (with timeouts if you’d like) for it • A future can be packaged up with an asynchronously running service in several ways • Directly: e.g., returned by std::async • Bundled: e.g., via a std::packaged_task • As a communication channel: e.g., via std::promise • A promise can be kept or broken • If broken, an exception is thrown to client
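The three packagings listed above can be sketched with two of them side by side: a future returned directly by std::async, and a promise used as a communication channel, with the future serving as the ACT the client blocks on (`act_demo` is an illustrative name):

```cpp
#include <future>
#include <string>
#include <thread>

// The future is the asynchronous completion token: the initiator holds
// it while the service eventually fulfils (or breaks) the promise.
std::string act_demo() {
    // Directly: std::async returns the future.
    std::future<int> f1 =
        std::async(std::launch::async, [] { return 6 * 7; });

    // As a communication channel: a promise handed to the service thread.
    std::promise<std::string> p;
    std::future<std::string> f2 = p.get_future();
    std::thread service([&p] { p.set_value("done"); });

    // The client may poll with wait_for timeouts, or simply block on get().
    int answer = f1.get();
    std::string status = f2.get();
    service.join();
    return status + " " + std::to_string(answer);
}
```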

  18. Synchronization Patterns • Key issues • Avoiding meaningful race conditions and deadlock • Scoped Locking (via the C++ RAII Idiom) • Ensures a lock is acquired/released in a scope • Thread-Safe Interface • Reduce internal locking overhead • Avoid self-deadlock • Strategized Locking • Customize locks for safety, liveness, optimization
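Scoped Locking and the Thread-Safe Interface pattern combine naturally, as sketched below: public methods lock once via RAII, then call unlocked private helpers, so the class never re-acquires its own non-recursive mutex (`SafeLog` is an illustrative class):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

class SafeLog {
    mutable std::mutex m_;
    std::vector<int> entries_;
    void add_unlocked(int v) { entries_.push_back(v); }  // no locking here
public:
    void add(int v) {
        std::lock_guard<std::mutex> lock(m_);  // released on any exit path
        add_unlocked(v);
    }
    void add_twice(int v) {
        std::lock_guard<std::mutex> lock(m_);
        add_unlocked(v);  // calling the public add() here would self-deadlock
        add_unlocked(v);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_);
        return entries_.size();
    }
};
```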

  19. Concurrency Patterns • Key issue: sharing resources across threads • Thread Specific Storage Pattern • Separates resource access to avoid contention among threads • Monitor Object Pattern • One thread at a time can access the object’s resources • Active Object Pattern • One worker thread owns the object’s resources • Half-Sync/Half-Async (HSHA) Pattern • A thread collects asynchronous requests and works on the requests synchronously (similar to Active Object) • Leader/Followers Pattern • Optimize HSHA for independent messages/threads
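A Monitor Object can be sketched as a queue whose state is private, whose public methods synchronize on the object's own lock, and whose condition variable makes callers wait for the state they need (`MonitorQueue` is an illustrative class):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Monitor Object: one thread at a time inside; take() blocks until
// put() has made an item available.
class MonitorQueue {
    std::queue<int> q_;
    std::mutex m_;
    std::condition_variable not_empty_;
public:
    void put(int v) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(v); }
        not_empty_.notify_one();
    }
    int take() {
        std::unique_lock<std::mutex> lk(m_);
        not_empty_.wait(lk, [this] { return !q_.empty(); });  // handles spurious wakes
        int v = q_.front();
        q_.pop();
        return v;
    }
};
```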

  20. Event Driven Server • Inversion of control • Hollywood principle – Don’t call us, we’ll call you (“there is no main”) • Event dispatching logic is reusable (e.g., from ACE); event handling logic is pluggable (you write it for your application) • [Diagram: a Connection Acceptor (handle_connection_request) and a Data Reader (handle_data_read) are registered as Event Handlers with the event dispatching logic]

  21. Client and Server Roles • Each process plays a single role • E.g., Logging Server example • Logging Server gets info from several clients • Roles affect connection establishment as well as use • Clients actively initiate connection requests • Server passively accepts connection requests • Client/server roles may overlap • Allow flexibility to act as either one • Or, to act as both at once (e.g., in a publish/subscribe gateway) • [Diagram: clients connect to a server’s listening port; a combined server/client plays both roles at once]

  22. Reactor Pattern Solution Approach • [Diagram: the application registers Event Handlers with a Reactor; the Reactor waits synchronously on the event sources, then de-multiplexes and serially dispatches events to the application logic]

  23. Acceptor/Connector Solution Approach • [Diagram: an Acceptor and a Connector perform connection establishment and service instantiation & initialization; a Dispatcher performs event de-multiplexing & dispatching over the event sources; a Handler performs service handling (connection use)]

  24. Proactor in a Nutshell • [Diagram: the application creates Completion Handlers, associates handles with an I/O completion port, registers the handlers, and initiates asynchronous I/O operations carrying ACTs; the OS (or AIO emulation) posts completion events to the port, on which the Proactor waits before dispatching handle_event calls back to the matching Completion Handler]

  25. Compare Reactor vs. Proactor Side by Side • [Diagram: in the Reactor, the application calls handle_events and the Event Handler performs synchronous accept/read/write on a Handle inside handle_event; in the Proactor, the application initiates asynchronous accept/read/write and the Proactor invokes the Completion Handler’s handle_event when the operation completes]

  26. Motivation for Interceptor Pattern • Fundamental concern • How to integrate out-of-band tasks with in-band tasks? • Straightforward (and all too common) approach • Paste the out-of-band logic wherever it’s needed • May be multiple places to paste it • Brittle, tedious, error-prone, costly in time/space (e.g., inlining) • Is there a better and more general approach? • [Diagram: out-of-band (admin) tasks cut across the in-band client and server paths]

  27. Interceptor Solution Approach • Our goal is to find a general mechanism to integrate out-of-band tasks with in-band tasks • In-band tasks • Processed as usual via framework • Out-of-band tasks • register with framework via special interfaces • are triggered by framework on certain events • Some events are generated by in-band processing • are provided access to framework internals (i.e., context) via specific interfaces

  28. Component Configurator Pattern • Motivation • 7x24 server upgrades • “always on, always connected” • Web server load balancing • Work flows (re-)distributed to available endsystems • Mobile agents • Service reconfiguration on agent arrival/departure • Solution Approach • Decouple implementations over time • Allow different behaviors of services at run-time • Offer points where implementations are re-configured • Allow configuration decisions to be deferred until service initiation or while service is running (suspend/resume)

  29. Service Lifecycle • Compare picture to • Thread states • Process states • E.g., Silberschatz & Galvin 4th ed, Fig. 4.1 • Can “park” a service • Users wait for a bit • Or, upgrade a copy • Background reconfig • Hot swap when ready

  30. Mixed Duration Request Handlers (MDRH): Vertical Design of an Architecture • Reactor + HS/HA or LF • Designed to handle streams of mixed duration requests • Focused on interactions among local mechanisms • Concurrency and synchronization concerns • Hand-offs among threads • Well suited for “hub and spokes” or “processing pipeline” style applications • However, in some applications, a distributed view is more appropriate • [Diagram: a reactor thread enqueues requests; the leader thread hands off chains of work to follower threads]

  31. Horizontal Design of an Architecture • Application components are implemented as handlers • Use reactor threads to run input and output methods • Send requests to other handlers via sockets, upcalls • These in turn define key interception points end-to-end • [Diagram: handlers h1–h4 run on reactors r1–r3 and communicate via sockets]
