Transactional Memory


Presentation Transcript


  1. Transactional Memory
     Student Presentation: Stuart Montgomery
     CS5204 – Operating Systems

  2. Why? (Shan Lu et al., Learning From Mistakes)
     • Avoid explicit locking
     • Non-blocking: can’t deadlock
     • Higher-level abstraction for the programmer
     • “The value of STM is that it allows you to focus on designing applications which happen to scale instead of the mechanisms employed to scale those applications.”

  3. What is TM?
     http://intranet.cs.man.ac.uk/apt/projects/TM/LeeRouting/
     • Regions of code that manipulate values (memory) atomically
     • ACID: Atomicity, Consistency, Isolation, Durability
     • All or nothing
     • Atomic Regions - 1977 (Lomet)
     • Hardware TM - 1986/1993 (Knight/Herlihy and Moss)
     • Software TM - 1995 (Shavit and Touitou)
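The "all or nothing" property above can be illustrated with a minimal sketch. The `atomic` helper and the `accounts` dictionary are hypothetical names introduced here, not part of the slides, and the sketch models only atomicity (rollback on failure). It provides no isolation between threads, which a real TM system must also supply.

```python
from contextlib import contextmanager

@contextmanager
def atomic(state):
    """Run a block against `state`; undo every change if the block raises."""
    snapshot = dict(state)          # remember the values before the region starts
    try:
        yield state
    except Exception:
        state.clear()               # roll back: restore the pre-region values
        state.update(snapshot)
        raise

accounts = {"a": 10, "b": 0}        # illustrative data only
try:
    with atomic(accounts) as s:
        s["b"] += 20                # first half of the update is applied...
        s["a"] -= 20
        if s["a"] < 0:              # ...but the region fails partway through
            raise ValueError("insufficient funds")
except ValueError:
    pass                            # accounts is back to its original contents
```

Either both updates become visible or neither does; the partially applied credit to `b` never survives the failed region.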

  4. Hardware Transactional Memory

  5. Hardware TM
     • Transactions operate on the hardware data cache
     • Changes to memory are committed atomically to other caches
     • A natural extension of Load-Linked/Store-Conditional (LL/SC)
       • LL/SC provides an atomic update to a single word of memory
       • HTM extends LL/SC to create groups of atomic instructions
     • Generally, HTM systems are bounded
       • This restricts the size of a transaction!
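To make the LL/SC point concrete, here is a small single-threaded software model of one memory word. The class `LLSCWord`, its version counter, and `atomic_increment` are assumptions made for illustration; real LL/SC is a pair of CPU instructions, and this model is not itself thread-safe.

```python
class LLSCWord:
    """Software model of one memory word that supports LL/SC (hypothetical)."""

    def __init__(self, value=0):
        self.value = value
        self._version = 0           # bumped on every successful store

    def load_linked(self):
        # Return the value plus a token naming the version we observed.
        return self.value, self._version

    def store_conditional(self, token, new_value):
        # Succeed only if nobody has stored to the word since our load_linked.
        if token != self._version:
            return False
        self.value = new_value
        self._version += 1
        return True


def atomic_increment(word):
    """Classic LL/SC retry loop: reload and retry until the store succeeds."""
    while True:
        old, token = word.load_linked()
        if word.store_conditional(token, old + 1):
            return old + 1
```

The limitation the slide refers to is visible here: the token protects exactly one word. HTM generalizes the same "store only if nothing changed" check to a group of loads and stores.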

  6. Hardware TM Concepts
     [Figure: CPUs, each with a cache (address, value, state, tag), connected by a bus to shared memory]
     • Snoopy Cache Coherence – Goodman 1983
       • Each cache listens to the bus to detect changes to addresses it holds a copy of
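The snooping idea can be sketched as follows. This is a deliberately simplified write-through, invalidate-on-write model chosen for brevity, not the specific protocol from Goodman's paper; `Bus`, `SnoopyCache`, and their methods are hypothetical names.

```python
class Bus:
    """Shared bus: holds main memory and the list of attached caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_write(self, source, address):
        # Every cache other than the writer observes ("snoops") the write.
        for cache in self.caches:
            if cache is not source:
                cache.snoop_write(address)


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}             # address -> value; only valid lines are kept
        bus.caches.append(self)

    def read(self, address):
        if address not in self.lines:               # miss: fetch from memory
            self.lines[address] = self.bus.memory.get(address, 0)
        return self.lines[address]

    def write(self, address, value):
        self.bus.broadcast_write(self, address)     # others drop stale copies
        self.lines[address] = value
        self.bus.memory[address] = value            # write-through for simplicity

    def snoop_write(self, address):
        self.lines.pop(address, None)               # invalidate our copy, if any
```

After one cache writes an address, any other cache that held that address misses on its next read and refetches the new value. HTM reuses this same bus traffic to notice conflicting accesses between transactions.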

  7. Hardware TM Instructions
     • LT
       • Read with NON-exclusive access (normal)
     • LTX
       • Read with exclusive access
       • Useful when we anticipate writing to the value soon
     • ST
       • Write to memory
     • Transaction State Instructions: Validate, Commit, Abort
     • Cache entry tags: XCOMMIT = old value (discarded on commit) vs. XABORT = new value (discarded on abort)
     • Processor flags TSTATUS and TACTIVE
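The XCOMMIT/XABORT tagging can be modeled in a few lines. This is a rough single-CPU sketch under simplifying assumptions: there is no conflict detection or Validate step, loads do not allocate entries, and commit writes straight to memory. `TransactionalCache` and its method names are hypothetical stand-ins for the LT/LTX, ST, Commit, and Abort instructions.

```python
NORMAL, XCOMMIT, XABORT = "NORMAL", "XCOMMIT", "XABORT"


class TransactionalCache:
    def __init__(self, memory):
        self.memory = memory        # dict standing in for shared memory
        self.entries = []           # each entry is [tag, address, value]

    def _find(self, address, tag):
        for entry in self.entries:
            if entry[0] == tag and entry[1] == address:
                return entry
        return None

    def load(self, address):        # plays the role of LT / LTX
        entry = self._find(address, XABORT) or self._find(address, NORMAL)
        return entry[2] if entry else self.memory.get(address, 0)

    def store(self, address, value):  # plays the role of ST
        entry = self._find(address, XABORT)
        if entry is None:
            old = self.load(address)
            self.entries = [e for e in self.entries
                            if not (e[0] == NORMAL and e[1] == address)]
            self.entries.append([XCOMMIT, address, old])  # old value
            entry = [XABORT, address, old]                # will hold new value
            self.entries.append(entry)
        entry[2] = value

    def _finish(self, discard_tag):
        kept = [e for e in self.entries if e[0] != discard_tag]
        for e in kept:
            e[0] = NORMAL           # survivors become ordinary cache lines
        self.entries = kept

    def commit(self):
        for tag, address, value in self.entries:
            if tag == XABORT:
                self.memory[address] = value   # publish the tentative values
        self._finish(XCOMMIT)       # old values are discarded on commit

    def abort(self):
        self._finish(XABORT)        # new values are discarded on abort
```

Keeping both the old and the new value in the cache is what makes commit and abort cheap: each is a retagging of entries rather than a copy.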

  8. Quick Example
     Herlihy and Moss, Transactional Memory: Architectural Support for Lock-Free Data Structures

         shared int counter;

         void process(int work) {
             int success = 0, backoff = BACKOFF_MIN;
             unsigned wait;
             while (success < work) {
                 /* Transactionally read the counter with exclusive access and store counter + 1. */
                 ST(&counter, LTX(&counter) + 1);
                 if (COMMIT()) {
                     success++;
                     backoff = BACKOFF_MIN;
                 } else {
                     /* Commit failed: spin for a random time, then widen the backoff window. */
                     wait = random() % (1 << backoff);
                     while (wait--);
                     if (backoff < BACKOFF_MAX)
                         backoff++;
                 }
             }
         }
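The retry structure of this example is independent of the hardware instructions and can be isolated as a small helper. In this sketch `attempt` is a hypothetical callback standing in for "run the transaction body and try to commit", and the `BACKOFF_MIN`/`BACKOFF_MAX` values are assumed, since the slide does not give them.

```python
import random

BACKOFF_MIN, BACKOFF_MAX = 1, 10    # assumed values; not given on the slide


def run_with_backoff(attempt, work, pause=None, rng=None):
    """Call attempt() until it has succeeded `work` times.

    After each failure, wait a random amount drawn from an exponentially
    growing window; after each success, reset the window. Returns the
    list of waits that were chosen, which is handy for inspection.
    """
    rng = rng or random.Random()
    waits = []
    success, backoff = 0, BACKOFF_MIN
    while success < work:
        if attempt():
            success += 1
            backoff = BACKOFF_MIN
        else:
            wait = rng.randrange(1 << backoff)   # same as random() % (1 << backoff)
            waits.append(wait)
            if pause is not None:
                pause(wait)                      # e.g. spin or sleep
            if backoff < BACKOFF_MAX:
                backoff += 1
    return waits
```

Randomizing the wait keeps two conflicting processors from retrying in lockstep, and doubling the window after each failure reduces pressure under sustained contention.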

  9. Hardware TM Concerns
     • Requires new hardware
     • Possibility of starvation with BUSY signals
       • Augment with queuing mechanism
     • Separate Transactional Cache or not
       • 1 cache: set size limits transaction size
         • But could use cache emulation in software
       • Extra abort, etc. logic applies to entire larger cache
     • Tradeoff between strong atomicity and efficiency
     • Hybrid Systems:
       • VTM
       • HASTM
       • Hybrid TM

  10. Notable HTM Implementations
      • Simulations
      • Sun’s Rock SPARC multicore processor
      [Figures: "HTM Test on Rock Surpasses STM"; "SPARC Rock CPU"]

  11. Performance
      • Simulation results show that transactional memory matches or outperforms the best known locking techniques for simple benchmarks, even in the absence of priority inversion, convoying, and deadlock.
      • Fewer accesses to memory
      • Long transactions are more likely to abort
      • Large transactions are more likely to exceed the TM cache size

  12. Software Transactional Memory

  13. Word-Based STM
      • Shavit & Touitou
        • Historical
        • Each word of memory has an ownership record with old values
        • Transactions can “help” each other
      • Harris & Fraser
        • Track the deltas in transaction descriptors
        • Atomically commit transactions to the ownership records
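A loose sketch of the descriptor idea follows. It is a single-threaded model inspired by the bullets above, not the actual Harris and Fraser algorithm: ownership records are reduced to per-word version numbers, and the commit step that a real system performs with compare-and-swap is a plain check here. `WordSTM` and `Descriptor` are hypothetical names.

```python
class WordSTM:
    def __init__(self, size):
        self.heap = [0] * size
        self.orecs = [0] * size     # one ownership record (a version) per word


class Descriptor:
    """Per-transaction record of the words touched and their deltas."""

    def __init__(self, stm):
        self.stm = stm
        self.entries = {}           # address -> [version_seen, old_value, new_value]

    def read(self, address):
        if address not in self.entries:
            value = self.stm.heap[address]
            self.entries[address] = [self.stm.orecs[address], value, value]
        return self.entries[address][2]

    def write(self, address, value):
        self.read(address)          # make sure an entry exists
        self.entries[address][2] = value   # buffered; the heap is untouched

    def commit(self):
        # Fail if any word we touched was committed to by someone else.
        for address, (version, _old, _new) in self.entries.items():
            if self.stm.orecs[address] != version:
                return False
        # Apply the deltas and advance the ownership records.
        for address, (version, old, new) in self.entries.items():
            if new != old:
                self.stm.heap[address] = new
                self.stm.orecs[address] = version + 1
        return True
```

If two descriptors both read and update word 0, whichever commits first advances the ownership record, so the second one fails its check and must retry against the new value.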

  14. Object-Based STM
      Larus and Kozyrakis, Transactional Memory
      • Dynamic STM (Herlihy et al.)
        • Higher-level TM Objects
      • FSTM (Fraser)
        • Similar to Dynamic STM

  15. Software TM Design
      • Closed nesting
        • The child commits into the parent
      • Open nesting
        • The child commits to the world
      • Other considerations:
        • Direct/Deferred Update
        • Early/Late Conflict Detection
        • Conflict Resolution
          • E.g. Abort or Backoff
        • Nesting
        • Exceptions
          • Often ignored in STM designs
      [Figure adapted from [tcc-mcdonald-isca06]]
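The difference between closed and open nesting can be shown with a deferred-update sketch. The `Tx` class is hypothetical and omits conflict detection entirely; it only models where a child's buffered writes go when the child commits.

```python
class Tx:
    """Deferred-update transaction: writes are buffered until commit."""

    def __init__(self, store, parent=None):
        self.store = store          # shared dict standing in for memory
        self.parent = parent
        self.writes = {}            # buffered writes, invisible to the world

    def read(self, key):
        tx = self
        while tx is not None:       # see our own and our ancestors' writes
            if key in tx.writes:
                return tx.writes[key]
            tx = tx.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self, open_nested=False):
        if self.parent is not None and not open_nested:
            self.parent.writes.update(self.writes)  # closed: merge into parent
        else:
            self.store.update(self.writes)          # top level or open: publish
        self.writes = {}

    def abort(self):
        self.writes = {}            # deferred update makes abort trivial
```

With closed nesting, a child's effects vanish if the parent later aborts. With open nesting, they are already public, which is why open-nested designs typically need compensating actions to undo them.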

  16. Case Study: STM.NET
      “But with the wisdom of age and hindsight, I do believe limited forms of TM could be wildly successful at particular tasks and yet would have avoided many of the biggest challenges with unbounded TM.”
      • Microsoft’s experiment 2008-2010, then dropped
        • Hooked the C# JIT compiler; C++ was also investigated
        • Separate Haskell development
      • Joe Duffy’s retrospective on unbounded STM:
        • Applying transactions to intrinsically non-transactional operations (I/O)
          • Reading a block or file from the FS, output to the console, entry in the Event Log, web service calls, etc.
        • Weak vs. Strong Atomicity
          • Weak: non-TM regions can see the intermediate results of TM regions
          • Privatization
        • “Where is the killer App?”… most applications naturally parallel?

  17. Other Notable STM Implementations
      • Intel STM Compiler Prototype (C/C++)
      • Sun DSTM2 (Java factories)
      • Several libraries from Harris & Fraser
      • Other languages: Haskell, LISP, Clojure, C#, OCaml, Perl

  18. Summary
      • Hardware TM
        • Makes memory accesses atomic by holding them in a transactional cache
        • Caches for each CPU cooperate in determining use of memory locations
        • Faster
      • Software TM
        • Allows for larger transactions and more design flexibility than HTM
        • Both word and object level granularity
        • Many possible design choices:
          • Strong/Weak Atomicity
          • Granularity
          • Conflict detection
          • Nested (Open or Closed)
      • Real-world implementations continue
        • Issues with particular design choices of STM.NET

  19. Questions?

  20. Backups

  21. Weak Atomicity Problem

          bool itIsOwned = false;
          MyObj x = new MyObj();
          …
          atomic { // Tx0                          atomic { // Tx1
              // Claim the state for my use:           if (!itIsOwned)
              itIsOwned = true;                            x.field += 42;
          }                                        }
          int z = x.field;
          ...
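One interleaving that exposes the problem can be replayed deterministically. The sketch assumes a direct-update (in-place) STM with an undo log and commit-time validation, which is one design in which this race shows up. `DirectUpdateTx`, the string keys, and `privatization_race` are hypothetical names, and the model is sequential rather than truly concurrent.

```python
class DirectUpdateTx:
    """In-place updates with an undo log; validation happens at commit."""

    def __init__(self, memory, versions):
        self.memory, self.versions = memory, versions
        self.read_set = {}          # key -> version seen at first read
        self.undo_log = []          # (key, old value) pairs

    def read(self, key):
        self.read_set.setdefault(key, self.versions.get(key, 0))
        return self.memory[key]

    def write(self, key, value):
        self.undo_log.append((key, self.memory[key]))
        self.memory[key] = value    # immediately visible to non-TM code

    def commit(self):
        if any(self.versions.get(k, 0) != v for k, v in self.read_set.items()):
            for key, old in reversed(self.undo_log):
                self.memory[key] = old      # roll back the in-place writes
            return False
        for key in {k for k, _ in self.undo_log}:
            self.versions[key] = self.versions.get(key, 0) + 1
        return True


def privatization_race():
    memory = {"it_is_owned": False, "x_field": 0}
    versions = {}
    tx1 = DirectUpdateTx(memory, versions)
    if not tx1.read("it_is_owned"):                      # Tx1 sees "not owned"
        tx1.write("x_field", tx1.read("x_field") + 42)   # dirty in-place write
    tx0 = DirectUpdateTx(memory, versions)
    tx0.write("it_is_owned", True)                       # Tx0 claims the state
    assert tx0.commit()
    z = memory["x_field"]           # non-transactional read after Tx0 commits
    assert tx1.commit() is False    # Tx1 fails validation and rolls back
    return z, memory["x_field"]
```

In this replay the unprotected read observes 42 even though Tx1 never commits and the field ends up back at 0. Strong atomicity would rule this out by also isolating transactions from non-transactional accesses.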
