
Transactional Memory: Architectural Support for Lock-Free Data Structures



Presentation Transcript


  1. Transactional Memory: Architectural Support for Lock-Free Data Structures Herlihy & Moss Presented by Robert T. Bauer

  2. Problem • Software implementations of lock-free (i.e., not using locks) data structures do not perform as well as lock-based implementations. • Qualifications: • Lock-based implementations can suffer from: • Priority inversion; • Convoying; • Deadlock; and • Contention & synchronization (memory barriers) • In the absence of these problems, lock-based implementations can outperform lock-free approaches.

  3. “Solution” • If software is the problem, perhaps the solution is hardware. • In this case the solution tendered is transactional memory • Modify cache-coherence protocol • Provide new instructions • Goal: • Make lock-free approaches as efficient and easy to use as conventional lock-based approaches
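A C-level picture of the proposed instructions may help orient the rest of the slides. The instruction names (LT, LTX, ST, VALIDATE, COMMIT, ABORT) are the paper's; treating them as C intrinsics with these particular signatures is an assumption made here purely for illustration, and the later sketches reuse these declarations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Herlihy & Moss's transactional-memory instructions, written as
 * hypothetical C intrinsics. The names come from the paper; the C-level
 * signatures are an assumption for illustration only. */
typedef uintptr_t word_t;

word_t LT(word_t *addr);               /* load-transactional: add the word to the read set         */
word_t LTX(word_t *addr);              /* load-transactional-exclusive: read with intent to write  */
void   ST(word_t *addr, word_t value); /* store-transactional: tentative write, seen only on commit */
bool   VALIDATE(void);                 /* has the current transaction been aborted so far?         */
bool   COMMIT(void);                   /* try to make tentative writes permanent; false on conflict */
void   ABORT(void);                    /* discard all tentative reads and writes                   */
```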

  4. Results • Demonstrate that transactional memory can be more efficient than: • Test-and-test-and-set (TTS) • MCS (software queueing – instead of spinning, waiters wait on a queue) • LL/SC • Hardware queueing – uses cache lines to maintain the “list” • Important: the reported results were obtained from a simulator

  5. About the simulator • 32 processors • Regular cache: direct-mapped, 2048 eight-byte lines • Transactional cache: 64 eight-byte lines • Simulator based on Proteus – doesn’t capture the effects of instruction or data caches. • Simulation assumptions: • Cache (regular or transactional) access = 1 cycle • Single-cycle commit (is this realistic?)

  6. Cache • Memory bus cycles: • Read – (cache-line access: shared) • Read For Ownership (RFO) – private read (cache-line access: exclusive) • Write – (cache-line access: exclusive) • T_Read • T_RFO • An RFO is usually issued by a compiler: read a cache line and gain ownership of the line in anticipation of a subsequent write. • Busy • Abort and retry
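As a quick reference, the bus cycles listed on this slide can be collected in one enum; the comments restate the slide, and the enum itself is only an illustrative summary, not part of the paper's protocol definition.

```c
/* Memory-bus cycles from the slide, summarized as an illustrative enum. */
enum bus_cycle {
    BUS_READ,    /* read a cache line in shared state                                    */
    BUS_RFO,     /* read-for-ownership: fetch the line exclusively, anticipating a write */
    BUS_WRITE,   /* write a line (exclusive access)                                      */
    BUS_T_READ,  /* transactional read (shared)                                          */
    BUS_T_RFO,   /* transactional read-for-ownership (exclusive)                         */
    BUS_BUSY     /* response: line belongs to an active transaction; abort and retry     */
};
```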

  7. Transaction Operations: General • Transactional operations cache two entries: • XCommit (discard on commit) [old value] • XAbort (discard on abort) [new value] • Transaction commits: • XCommit → Empty (contains no data) • XAbort → Normal (contains committed data) • Transaction aborts: • XCommit → Normal • XAbort → Empty • Allocating a new entry: • Search for an Empty entry • Search for a Normal entry → if dirty, it needs to be “evicted” • Search for an XCommit entry (error in paper: this can never be “dirty”, but might be invalid)
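The commit/abort tag transitions can be written out as a tiny state function; the state names mirror the slide, and the helper itself is a hypothetical illustration, not hardware described in the paper.

```c
#include <stdbool.h>

/* Transactional-cache line tags, as named on the slide. */
enum line_tag { EMPTY, NORMAL, XCOMMIT, XABORT };

/* Illustrative only: what happens to one entry's tag when the owning
 * transaction commits or aborts. */
enum line_tag on_transaction_end(enum line_tag t, bool committed)
{
    if (committed) {
        if (t == XCOMMIT) return EMPTY;   /* old value no longer needed        */
        if (t == XABORT)  return NORMAL;  /* tentative value becomes committed */
    } else {
        if (t == XCOMMIT) return NORMAL;  /* old value is restored             */
        if (t == XABORT)  return EMPTY;   /* tentative value is discarded      */
    }
    return t;                             /* EMPTY and NORMAL are unaffected   */
}
```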

  8. Transaction Operations: LT • LT operation: • An XAbort entry exists in the transactional cache → return its value • A Normal entry exists in the transactional cache → • Change Normal to XAbort • Allocate a second entry with the same data, tagged XCommit • Otherwise, issue a T_Read cycle: • Create a transactional-cache entry tagged XCommit • Create a transactional-cache entry tagged XAbort • If the read returns Busy (the cache line is being updated): • Drop all XAbort entries, set all XCommit → Normal, TStatus = False
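The lookup cascade above, written out as C-style pseudocode. The helper names (tcache_find, tcache_alloc, t_read, tx_abort_all) are invented for illustration and do not appear in the paper; the tag names reuse the enum sketched earlier.

```c
/* LT lookup cascade as hypothetical pseudocode; every helper below is an
 * invented stand-in for the transactional-cache hardware. */
typedef struct { enum line_tag tag; word_t value; } tentry_t;

tentry_t *tcache_find(word_t *addr, enum line_tag tag);            /* probe the transactional cache */
tentry_t *tcache_alloc(word_t *addr, enum line_tag tag, word_t v); /* allocate an entry             */
bool      t_read(word_t *addr, word_t *out);                       /* T_Read cycle; false on Busy   */
void      tx_abort_all(void);   /* drop XABORT entries, XCOMMIT -> NORMAL, TStatus = false          */

word_t do_LT(word_t *addr)
{
    tentry_t *e;

    if ((e = tcache_find(addr, XABORT)) != NULL)
        return e->value;                            /* already in the transaction's write set */

    if ((e = tcache_find(addr, NORMAL)) != NULL) {
        e->tag = XABORT;                            /* promote the committed copy             */
        tcache_alloc(addr, XCOMMIT, e->value);      /* keep the old value in case of abort    */
        return e->value;
    }

    word_t v;
    if (!t_read(addr, &v)) {                        /* Busy: line held by another transaction */
        tx_abort_all();
        return 0;
    }
    tcache_alloc(addr, XCOMMIT, v);                 /* two entries with the same data         */
    tcache_alloc(addr, XABORT, v);
    return v;
}
```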

  9. Transaction Operations: ST • ST operation: • Cache hit → the XAbort entry is updated • Cache miss → • Set up two cache lines as before: • XCommit • XAbort • Use T_RFO and set the cache-line state to reserved “Exclusive” (so T_Read / T_RFO from other processors will return Busy) • As before, if the read cycle (T_RFO) returns Busy, we abort the transaction
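ST follows the same shape, reusing the hypothetical helpers from the LT sketch above; again this is pseudocode for the slide, not the paper's hardware.

```c
/* ST as hypothetical pseudocode, reusing the invented helpers from the LT
 * sketch. A T_RFO miss takes the line exclusively, so other processors'
 * T_Read / T_RFO cycles will see Busy and abort. */
bool t_rfo(word_t *addr, word_t *out);              /* T_RFO cycle; false on Busy */

void do_ST(word_t *addr, word_t value)
{
    tentry_t *e = tcache_find(addr, XABORT);
    if (e != NULL) {                                /* hit: update the tentative copy */
        e->value = value;
        return;
    }
    word_t old;
    if (!t_rfo(addr, &old)) {                       /* Busy: abort the transaction    */
        tx_abort_all();
        return;
    }
    tcache_alloc(addr, XCOMMIT, old);               /* old value, discarded on commit */
    tcache_alloc(addr, XABORT, value);              /* new value, discarded on abort  */
}
```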

  10. Transaction Operations: LTX • LTX operation: • Like LT, but uses a T_RFO cycle on a cache miss, since the load is made with the intent to write

  11. Transaction Operations: Validate • Validate • Returns TStatus (False means the transaction has been aborted)

  12. An Example (Counting) • Read (exclusive access) → Write → Commit • In a multiprocessor environment, it is possible for all writes but one to be lost. If each of M processors adds N to the counter (initially 0), the final value of the counter is in the range N ≤ counter ≤ M*N
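The read-exclusive / write / commit sequence on this slide is essentially the paper's counting benchmark; below is a sketch of that loop using the hypothetical intrinsic declarations from earlier (exponential backoff on failure is elided).

```c
/* Transactional counter increment: read with intent to write, tentatively
 * store, then commit; retry if another processor's transaction interfered.
 * Uses the hypothetical C declarations of LTX/VALIDATE/ST/COMMIT above. */
extern word_t counter;

void add_one(void)
{
    for (;;) {
        word_t v = LTX(&counter);    /* read with exclusive (write) intent      */
        if (VALIDATE()) {
            ST(&counter, v + 1);     /* tentative write, invisible until commit */
            if (COMMIT())            /* fails if any other transaction touched  */
                return;              /* the counter's line since the LTX        */
        }
        /* aborted or commit failed: would back off here, then retry */
    }
}
```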

  13. Performance (Counting) • Locking: read lock, write lock, read counter, write counter, write lock to release = 5 memory references • Trans. Mem / LL/SC (a single-word memory operation): no separate commit cost – commit is just a cache write • [Chart comparing Trans. Mem and LL/SC]
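For comparison, a simple test-and-set lock makes the five memory references the slide counts; this sketch is only illustrative and is not the paper's benchmark code.

```c
#include <stdatomic.h>

/* Lock-based counter increment; the comments count the memory references
 * the slide tallies (five, assuming the lock is acquired on the first try). */
extern atomic_flag lock;
extern unsigned long counter;

void add_one_locked(void)
{
    while (atomic_flag_test_and_set(&lock))  /* refs 1-2: read lock, write lock   */
        ;                                    /* (more references while spinning)  */
    unsigned long v = counter;               /* ref 3: read counter               */
    counter = v + 1;                         /* ref 4: write counter              */
    atomic_flag_clear(&lock);                /* ref 5: write lock to release      */
}
```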

  14. Another Example (Doubly Linked List) • Read (exclusive); plan to write • The transaction succeeds only if no other processor has modified anything in the transaction set (read ∪ write) • Commit fails if another processor/transaction modified anything in the transaction set
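The pattern is the same as the counter: read every word you intend to modify with LTX, write tentatively with ST, and let COMMIT fail if any location in the transaction set was touched by another transaction. The node layout and enqueue logic below are an invented illustration of that pattern (again reusing the hypothetical intrinsics), not the paper's code.

```c
/* Sketch of a transactional enqueue at the tail of a doubly linked list.
 * The struct layout and casts are illustrative; LTX/ST/VALIDATE/COMMIT are
 * the hypothetical intrinsics declared earlier. */
typedef struct node { struct node *prev, *next; word_t value; } node_t;
extern node_t *head, *tail;

void enqueue(node_t *n)
{
    for (;;) {
        node_t *t = (node_t *)LTX((word_t *)&tail);   /* tail is read with write intent */
        if (VALIDATE()) {
            ST((word_t *)&n->prev, (word_t)t);
            ST((word_t *)&n->next, (word_t)0);
            if (t == NULL)
                ST((word_t *)&head, (word_t)n);       /* list was empty                  */
            else
                ST((word_t *)&t->next, (word_t)n);
            ST((word_t *)&tail, (word_t)n);
            if (COMMIT())                             /* fails if another transaction    */
                return;                               /* modified the transaction set    */
        }
        /* conflict: retry the whole transaction */
    }
}
```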

  15. Performance (Doubly Linked List) • [Chart comparing LL/SC, MCS, and Trans. Mem]

  16. Observations • Many simplifications: • Small data sets • Single-cycle updates • Sequentially consistent memory (no barriers) • Write-back cache • More complex cache-control logic • Can only snoop on a write, but in a transactional system write-first won’t work, so ownership needs to be “propagated”.
