Transactional Memory: An Overview of Hardware Alternatives

Transactional Memory: An Overview of Hardware Alternatives. David A. Wood, University of Wisconsin. Transactional Memory Workshop, April 8th, 2005.


Presentation Transcript

  1. Transactional Memory: An Overview of Hardware Alternatives
     David A. Wood, University of Wisconsin
     Transactional Memory Workshop, April 8th, 2005

  2. What’s database got to do with it?
     • Atomicity
       • All updates, or none
     • Consistency
       • Correct at begin and end
     • Isolation
       • Partial work not visible
       • Inputs stay stable
     • Durability
       • Survive “system” failures
     All (or some) memory ops, not just database objects
     Despite increasing awareness of failures
     Thread-Level Transactional Memory

  3. 801 Database Storage
     • Lock bits on virtual memory
       • 128-byte granularity
       • Added to page table and TLB
       • Caches user’s lock state
     • Trap on lock conflict
     • No h/w for logging, abort, etc.
     • Only uniprocessors
       • 801 and RS/6000
     [Diagram labels: CPU, Memory, TLB, Tid]
     Was this transactional memory?

  4. SQL/801
     • “The development of SQL/801 was greatly simplified because, with minor exceptions, it considers only a single user. It achieves multiuser concurrency [on a uniprocessor] by running in multiple processes using the shared database storage….” (Chang and Mergen, ’88)
     • Largest transactional memory application
     • Only real hardware transactional memory implementation
     • No one seems to be looking at what they learned

  5. Basic Transactional Mechanisms
     • Isolation
       • Detect when transactions conflict
       • Track read and write sets
     • Version management
       • Record new and old values
     • Atomicity
       • Commit new values
       • Abort back to old values
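The three mechanisms on this slide can be sketched in a few lines of Python. This is a toy software model, not any of the hardware designs surveyed in the talk: the class name `Transaction`, its methods, and the choice to buffer new values until commit are all assumptions made for illustration.

```python
class Transaction:
    """Toy model: tracks read/write sets and buffers new values until commit."""

    def __init__(self, memory):
        self.memory = memory       # shared dict: address -> committed (old) value
        self.read_set = set()      # addresses read so far (isolation)
        self.write_buffer = {}     # address -> new value (version management)

    def read(self, addr):
        self.read_set.add(addr)
        # A transaction must observe its own earlier writes.
        if addr in self.write_buffer:
            return self.write_buffer[addr]
        return self.memory[addr]

    def write(self, addr, value):
        self.write_buffer[addr] = value

    @property
    def write_set(self):
        return set(self.write_buffer)

    def conflicts_with(self, other):
        # Conflict: one write set overlaps the other's read set or write set.
        return bool(
            self.write_set & (other.read_set | other.write_set)
            or other.write_set & self.read_set
        )

    def commit(self):
        # Atomicity: all buffered new values become visible together.
        self.memory.update(self.write_buffer)
        self._reset()

    def abort(self):
        # Old values were never overwritten, so abort just discards the buffer.
        self._reset()

    def _reset(self):
        self.read_set.clear()
        self.write_buffer.clear()


memory = {"x": 1, "y": 2}
t1, t2 = Transaction(memory), Transaction(memory)
t1.write("x", t1.read("x") + 10)      # t1 reads and writes x
t2.read("y")                          # t2 only reads y: no overlap yet
no_conflict = t1.conflicts_with(t2)
t2.read("x")                          # t2's read set now overlaps t1's write set
conflict = t1.conflicts_with(t2)
t2.abort()
t1.commit()
```

Buffering new values makes abort trivial and commit the expensive step; logging old values and updating in place reverses that trade-off.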

  6. H/W Transactional Memory Systems
     • Knight’s Lisp Work
     • Transactional Memory
     • Oklahoma Update
     • SLE/TLR
     • Transactional Coherence and Consistency
     • Unbounded TM
     • Virtual TM
     • Thread-level TM

  7. Knight’s Lisp Work [’86]
     • Parallel execution of sequential code
     • Break program into “transaction blocks”
       • Multiple loads in a transaction
       • Exactly one store ends the transaction
       • No register state passed between transactions
     • Execute transactions in parallel
       • Track dependences (i.e., read set)
       • Abort and restart on conflicting write
     • Transactions commit in sequential order
       • Broadcast writes on commit

  8. Knight’s Hardware
     • Two caches
       • Dependency cache
         • Tracks read set
         • Bus monitor detects conflicts
       • Confirm cache
         • Holds write set
         • Supports multiple writes
     • Commits
       • Check dep. cache
       • Broadcast writes
     • Fast aborts
       • Invalidate Confirm cache
       • Use old values in Dep. cache
       • Immediately restart execution
     [Diagram labels: CPU, Memory, Dependency Cache, Confirm Cache]
     Spawned two threads: TLS & TM

  9. H&M’s Transactional Memory [’93]
     • Targets explicitly parallel (non-functional) codes
       • Motivated by lock-free data structures
     • Transactions:
       • Read and write multiple locations
       • Commit in arbitrary order
       • Implicit begin, explicit commit operations
       • Abort affects memory, not registers
     • Software manages restarting execution
       • Validate instruction detects pending abort
     • Implementation extends cache coherence
       • Read/Write locks correspond to MOESI states
       • Add orthogonal transaction states

  10. H&M’s Transactional Memory
      • Adds Transaction Cache
        • Stores all data accessed by transactions
        • 2 copies of each line
          • Before and after image
          • Even for read-only data
        • Small, fully associative
      • Abort on all conflicts
        • NACK conflicting requests
        • Abort NACKed transaction
      • Fast commit and abort
        • Change trans. cache state
      [Diagram labels: CPU, Memory, Cache, Transaction Cache]

  11. SLE/TLR
      • Hardware exploits speculative processors
        • Read sets tracked by coherence protocol
        • Write set maintained in store queue
        • Abort restarts execution, including register state
      • Speculative Lock Elision (SLE)
        • Elide locks from the dynamic execution stream
        • Convert critical sections to optimistic transactions
        • Concurrently execute non-conflicting transactions
        • Fall back on explicit locks if conflicts
      • Transactional Lock Removal (TLR)
        • Resolve conflicts using priority ordering (timestamps)
        • Delay lower priority transactions
        • Deadlock and starvation free
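The timestamp-based priority ordering in TLR can be illustrated with a small Python sketch. It is a software analogy only: `Txn`, `resolve`, and the global counter `_clock` are hypothetical names, and the sketch assumes a transaction keeps its original timestamp across restarts, which is what lets a delayed transaction eventually become the oldest and win.

```python
from dataclasses import dataclass
from itertools import count

_clock = count()  # hypothetical global timestamp source (assumption)


@dataclass
class Txn:
    name: str
    timestamp: int = -1   # -1 means "no timestamp assigned yet"

    def begin(self):
        # Assumption: the timestamp is taken once and reused on restart,
        # so a repeatedly delayed transaction only gains priority over time.
        if self.timestamp < 0:
            self.timestamp = next(_clock)


def resolve(requester, holder):
    """Return (winner, loser); the earlier timestamp has higher priority."""
    def priority_key(t):
        return (t.timestamp, t.name)   # name breaks ties deterministically

    winner = min(requester, holder, key=priority_key)
    loser = holder if winner is requester else requester
    return winner, loser


a, b = Txn("A"), Txn("B")
a.begin()
b.begin()
# B requests a line that A holds transactionally: A is older, so B is delayed.
winner, loser = resolve(requester=b, holder=a)
first_stamp = a.timestamp
a.begin()   # a restart does not hand out a fresh timestamp
```

Because priorities form a total order that never changes for a live transaction, two transactions cannot each wait on the other, which is the intuition behind the deadlock-free claim.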

  12. Transactional Coherence and Consistency [’04]
      • TCC unifies coherence, memory consistency, and transaction support
        • All transactions, all the time
      • Transaction ordering
        • Ordered, Unordered, Partially Ordered
        • Supports thread-level speculation
      • Optimistic concurrency model
        • Unordered transactions serialize at commit
        • Conflicts detected at commit

  13. TCC
      • On-chip interconnect: broadcast updates at commit
      • Write buffer: ~4 kB, holds new values until commit
      • Shadow register file (SRF): checkpoints architectural registers
      • L1 D cache: tracks read set, bit per line
      • L2 cache: logically shared
      [Diagram labels: CPU, L1 D, SRF, L2 Cache, On-Chip Interconnect]
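Commit-time conflict detection of the kind described for TCC can be sketched as follows. This is a simplified Python model under stated assumptions: `Core`, `snoop_commit`, and the dictionary standing in for the shared L2 are illustrative names, a Python set replaces the per-line read bits, and the commit bus is modeled as a plain loop over the other cores.

```python
class Core:
    """Toy core: read bits in the L1, new values held in a write buffer."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = memory       # stands in for the logically shared L2
        self.read_bits = set()     # stands in for one read bit per L1 line
        self.write_buffer = {}     # new values, invisible until commit
        self.aborted = False

    def load(self, addr):
        self.read_bits.add(addr)
        if addr in self.write_buffer:
            return self.write_buffer[addr]
        return self.memory[addr]

    def store(self, addr, value):
        self.write_buffer[addr] = value

    def snoop_commit(self, addrs):
        # Another core committed these addresses. If this core read any of
        # them, its transaction used stale data and must restart.
        if self.read_bits & addrs:
            self.aborted = True
            self.read_bits.clear()
            self.write_buffer.clear()

    def commit(self, others):
        # Broadcast the addresses of all updates, then make them visible.
        addrs = set(self.write_buffer)
        self.memory.update(self.write_buffer)
        for core in others:
            core.snoop_commit(addrs)
        self.read_bits.clear()
        self.write_buffer.clear()


memory = {"a": 0, "b": 0}
c0, c1, c2 = Core("c0", memory), Core("c1", memory), Core("c2", memory)
c0.store("a", c0.load("a") + 1)
c1.load("a")            # c1 depends on the old value of a
c1.store("b", 5)
c2.load("b")            # c2 never touches a
c0.commit([c1, c2])     # c1 is aborted and its buffered write to b is dropped
```

Nothing is checked while the transactions run; the only synchronization point is the commit broadcast, which is why commits are serialized.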

  14. TCC
      • Commits are sequential
        • Broadcasts addresses of all updates
      • Supports large transactions
        • Serialize all other transactions
        • Grabs and holds the commit bus
      • Cannot abort large transactions
        • Updates affect L2/Mem; no undo
      • Extensions forthcoming
        • Talk to Kunle and Christos

  15. Unbounded Transactional Memory (UTM)
      • Unbounded transactions
        • Arbitrary size
          • Not limited by write buffer, cache, or memory
        • Arbitrary duration
          • Not limited by interrupts, context switch, etc.
      • Complex implementation
        • Not justified by performance
      • Settle for “nearly” unbounded transactions
        • Much simpler hardware

  16. Transactional Linux
      • Almost all of the transactions require < 100 cache lines
        • 99.9% need fewer than 54 cache lines
      • There are, however, some very large transactions!
        • >500k-byte fully-associative cache required
      [Figure: log-log scale plot]

  17. Large Transaction Memory (LTM)
      • Register checkpoints
        • Snapshot of rename maps
      • Cache tracks read and write sets
        • T-bits mark transactional blocks
        • Cache holds new data values “in place”
        • O-bit indicates overflow to in-memory hashtable
      • Memory holds committed state
        • Abort invalidates all modified blocks
        • Miss on re-execution
      • Transactional writes force memory updates
        • Repeated writes (e.g., to local data) are written through

  18. Virtual Transactional Memory (VTM)
      • Only an overflow mechanism
        • No overhead on common in-cache case
        • Check shared overflow counter on cache miss
      • Low overhead when no conflict
        • Shared Bloom filter rules out conflicts
        • Filter resides in virtual memory
      • Higher overhead on possible conflict
        • Hardware table walk to detect actual conflict
        • Table resides in virtual memory
        • Only incurred by large transactions with likely conflict
      • Supports context switches and paging
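The three-level check on a cache miss (overflow counter, then Bloom filter, then table walk) can be sketched in Python. The names `BloomFilter`, `OverflowState`, `record_overflow`, and `check_miss` are hypothetical, as are the filter size and hash count; a dictionary stands in for the in-memory table, and cleanup at commit or abort is left out.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits)   # one byte per bit, for clarity

    def _positions(self, addr):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, addr):
        for pos in self._positions(addr):
            self.array[pos] = 1

    def might_contain(self, addr):
        return all(self.array[pos] for pos in self._positions(addr))


class OverflowState:
    """Shared state consulted on a cache miss (all names are assumptions)."""

    def __init__(self):
        self.overflow_count = 0      # shared overflow counter
        self.filter = BloomFilter()  # cheap test that can rule out conflicts
        self.table = {}              # addr -> owner txn id (in-memory table)

    def record_overflow(self, addr, txn_id):
        self.overflow_count += 1
        self.filter.add(addr)
        self.table[addr] = txn_id

    def check_miss(self, addr, txn_id):
        """Return (path taken, whether a real conflict was found)."""
        if self.overflow_count == 0:
            return "no-overflow", False       # common case: nothing overflowed
        if not self.filter.might_contain(addr):
            return "filter-miss", False       # filter rules out a conflict
        owner = self.table.get(addr)          # slow path: walk the table
        return "table-walk", owner is not None and owner != txn_id


state = OverflowState()
fast = state.check_miss(0x1000, txn_id=1)     # before any overflow
state.record_overflow(0x2000, txn_id=7)
hit = state.check_miss(0x2000, txn_id=1)      # another transaction owns it
own = state.check_miss(0x2000, txn_id=7)      # the owner itself: no conflict
other = state.check_miss(0x1000, txn_id=1)    # unrelated address
```

The unrelated address normally stops at the filter, but a false positive would send it to the table walk, where the lookup still reports no conflict.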

  19. 801 revisited
      • Why didn’t 801 database storage succeed?
        • Lock bits helped performance and simplified software
      • Answer #1:
        • Changing lock bits requires TLB shootdown
        • Too complicated for the benefits?
        • Not a current problem: transaction h/w is easy
      • Answer #2:
        • Not universally available
        • DB2 was (is) multiplatform
        • Can’t rely on feature only available in one architecture
        • Still a relevant concern

  20. Need Standard Transaction Interface
      • Abstract away resource requirements
        • Support large, long transactions
        • Virtualize transactional memory
      • Transaction semantics between threads
        • NOT a hardware property
      • Permit range of implementations
        • Hardware, software, and combinations

  21. Thread-level Transactional Memory
      • Abstract mechanisms
      • Version management
        • Update memory “in place”
        • Log “before images” to thread-level VM
      • Isolation
        • Logically extend memory words with read and write bits
        • Implementations can be conservative (e.g., blocks)
      • Atomicity
        • Commits easy due to in-place updates
        • Aborts trap to user-level software
      • Hardware can accelerate common case
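In-place updates with a log of before images can be sketched as below. This is a minimal Python model, not the interface proposed in the talk: `LoggedTransaction` and its methods are hypothetical, a list stands in for the log kept in thread-level virtual memory, and read/write bits and conflict detection are omitted.

```python
class LoggedTransaction:
    """Toy model: write memory in place, keep before images for abort."""

    def __init__(self, memory):
        self.memory = memory    # shared dict: address -> value
        self.undo_log = []      # (address, before image), in write order
        self.logged = set()     # addresses that already have a log entry

    def write(self, addr, value):
        # Only the first write to an address needs a before image.
        if addr not in self.logged:
            self.undo_log.append((addr, self.memory[addr]))
            self.logged.add(addr)
        self.memory[addr] = value   # update in place

    def commit(self):
        # New values are already in memory, so commit just drops the log.
        self.undo_log.clear()
        self.logged.clear()

    def abort(self):
        # Restore before images in reverse order of logging.
        for addr, before in reversed(self.undo_log):
            self.memory[addr] = before
        self.undo_log.clear()
        self.logged.clear()


memory = {"head": 1, "tail": 2}

t = LoggedTransaction(memory)
t.write("head", 10)
t.write("head", 20)     # second write to head adds no new log entry
t.write("tail", 30)
log_entries = len(t.undo_log)
t.abort()
after_abort = dict(memory)

u = LoggedTransaction(memory)
u.write("head", 5)
u.commit()
after_commit = dict(memory)
```

Commit does almost no work here while abort walks the log, which fits the slide's split between easy commits and aborts handled by user-level software.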

  22. Conclusions
      • Make the common case fast
        • 99+% of transactions fit in hardware
        • Lots of alternatives
        • Make both commits and aborts fast
      • Handle the uncommon case
        • Large transactions will occur, deal with ’em
        • Shouldn’t be limited by hardware
      • Agree on a common abstraction
        • Success requires multi-platform support
        • Let vendors compete on price-performance
