Explore the principles of Log-Based Transactional Memory (LogTM) as outlined by Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, and David A. Wood, featured in a presentation by Colleen Lewis. Discover the design decisions, conflict detection mechanisms, transaction log management, and more. Dive into examples detailing transaction logging, commit processes, conflict resolutions, and the implications of in-cache and out-of-cache conflicts. Gain insights into handling false positives and learn about lazy cleanup strategies. Unveil the intricacies of LogTM through a comprehensive examination of key components and operational scenarios.
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood Presented by Colleen Lewis
Credits • Animations from the original LogTM HPCA presentation • Original graphs modified for readability
Big Picture • Hardware transaction motivation • Per thread log • Optimize commits (Hardware)
Design Decisions • Version Management • Eager – write in place • Lazy – write on commit • Conflict Detection • Eager – detect at read/write time • Lazy – detect at commit time
Transaction Logs • Pointer to the beginning of the log • Pointer to the end of the log • Read and Write bits for each cache line
Transaction Log Example • Initial State • LogBase = LogPointer • TM count > 0
VA    Data Block         R  W
00    12--------------   0  0
40    --------------23   0  0
C0    34--------------   0  0
Log (at 1000): empty • Log Base = 1000 • Log Ptr = 1000 • TM count = 1
HPCA-12
Transaction Log Example • Store r2, (c0) /* r2 = 56 */ • Set W bit for block (c0) • Store address (c0) and old data on the log • Increment Log Ptr to 1048 • Update memory
VA    Data Block         R  W
00    12--------------   0  0
40    --------------23   0  0
C0    56--------------   0  1
Log (at 1000): c0 34------------ • Log Base = 1000 • Log Ptr = 1048 • TM count = 1
Transaction Log Example • Commit transaction • Clear R & W for all blocks • Reset Log Ptr to Log Base (1000) • Clear TM count
VA    Data Block         R  W
00    12--------------   0  0
40    --------------23   0  0
C0    56--------------   0  0
Log Base = 1000 • Log Ptr = 1000 (was 1048) • TM count = 0 (was 1)
Transaction Log Example • Abort transaction • Replay log entries to “undo” the transaction • Reset Log Ptr to Log Base (1000) • Clear R & W bits for all blocks • Clear TM count
VA    Data Block         R  W
00    12--------------   0  0
40    --------------23   0  0
C0    34--------------   0  0    (restored from the log)
Log Base = 1000 • Log Ptr = 1000 • TM count = 0 (was 1)
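The store, commit, and abort steps in the example above can be sketched in C. This is a minimal software model, not the hardware design: the names (`Thread`, `tx_store`, `tx_commit`, `tx_abort`), the 8-byte block size, and the use of array indices in place of log addresses such as 1000 and 1048 are all assumptions made for illustration. Logging a block only on its first write (using the W bit as a filter) is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 8    /* hypothetical block size, kept small for the sketch */
#define NUM_BLOCKS  4
#define LOG_ENTRIES 16

/* One undo record: which block was overwritten and what it held before. */
typedef struct {
    int     block;
    uint8_t old_data[BLOCK_BYTES];
} LogEntry;

typedef struct {
    uint8_t  mem[NUM_BLOCKS][BLOCK_BYTES]; /* memory, updated in place (eager) */
    uint8_t  r_bit[NUM_BLOCKS];
    uint8_t  w_bit[NUM_BLOCKS];
    LogEntry log[LOG_ENTRIES];
    int      log_ptr;   /* next free entry; index 0 plays the role of Log Base */
    int      tm_count;
} Thread;

void tx_begin(Thread *t) { t->tm_count++; }

/* Transactional load: mark the block as read. */
uint8_t tx_load(Thread *t, int block, int offset) {
    t->r_bit[block] = 1;
    return t->mem[block][offset];
}

/* Transactional store: save the old block on the log, then write in place.
 * Assumption: a block is logged only on its first write (W bit not yet set). */
void tx_store(Thread *t, int block, int offset, uint8_t value) {
    if (!t->w_bit[block]) {
        assert(t->log_ptr < LOG_ENTRIES);
        LogEntry *e = &t->log[t->log_ptr++];
        e->block = block;
        memcpy(e->old_data, t->mem[block], BLOCK_BYTES);
        t->w_bit[block] = 1;
    }
    t->mem[block][offset] = value;
}

/* Shared tail of commit and abort: clear R/W bits, reset the log, leave TM mode. */
static void tx_reset(Thread *t) {
    memset(t->r_bit, 0, sizeof t->r_bit);
    memset(t->w_bit, 0, sizeof t->w_bit);
    t->log_ptr  = 0;
    t->tm_count = 0;
}

/* Commit is cheap: new values are already in place, so the log is just discarded. */
void tx_commit(Thread *t) { tx_reset(t); }

/* Abort replays the log from newest to oldest entry, restoring the old data. */
void tx_abort(Thread *t) {
    for (int i = t->log_ptr - 1; i >= 0; i--)
        memcpy(t->mem[t->log[i].block], t->log[i].old_data, BLOCK_BYTES);
    tx_reset(t);
}
```

With block C0 modeled as block 3 holding 0x34, a `tx_store` of 0x56 followed by `tx_abort` leaves 0x34 in memory, while the same store followed by `tx_commit` leaves 0x56. Commit does no per-block work beyond clearing bits, which is the sense in which commits are the optimized path.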
Conflict Detection • Checked at every read/write • Directory forwards read requests • Directory can have “sticky” data • Individual nodes responsible for detecting conflicts • Needs • Transaction mode bit • Overflow bit
Conflict Detection (example) • P0 store • P0 sends get exclusive (GETX) request • Directory responds with data (old) • P0 executes store
Diagram: messages GETX, DATA • Directory: I [old] → M@P0 [old] • P0 block: I (--) [none] → M (--) [old] → M (-W) [new] • P1 block: I (--) [none]
HPCA-12
Conflict Detection (example) • In-cache transaction conflict • P1 sends get shared (GETS) request • Directory forwards to P0 • P0 detects conflict and sends NACK
Diagram: messages GETS, Fwd_GETS, NACK • Directory: M@P0 [old] • P0 block: M (-W) [new], conflict detected here • P1 block: I (--) [none]
Conflict Detection (example) • Cache overflow • P0 sends put exclusive (PUTX) request • Directory acknowledges • P0 sets overflow bit • P0 writes data back to memory
Diagram: messages PUTX, ACK, DATA • Directory: M@P0 [old] → Msticky@P0 [new] • P0 Overflow bit: 0 → 1 • P0 block: M (-W) [new] → I (--) [none] • P1 block: I (--) [none]
Conflict Detection (example) • Out-of-cache conflict • P1 sends GETS request • Directory forwards to P0 • P0 detects a (possible) conflict • P0 sends NACK
Diagram: messages GETS, Fwd_GETS, NACK • Directory: Msticky@P0 [new] • P0 Overflow bit: 1 • P0 block: I (--) [none], conflict detected here • P1 block: I (--) [none]
Conflict Detection (example) • Commit • P0 clears TM mode and Overflow bits
Diagram: P0 TM mode: 1 → 0 • P0 Overflow bit: 1 → 0 • Directory: still Msticky@P0 [new] • P0 and P1 blocks: I (--) [none]
Conflict Detection (example) • Lazy cleanup • P1 sends GETS request • Directory forwards request to P0 • P0 detects no conflict, sends CLEAN • Directory sends Data to P1
Diagram: messages GETS, Fwd_GETS, CLEAN, DATA • Directory: Msticky@P0 [new] → S(P1) [new] • P0 block: I (--) [none] • P1 block: I (--) [none] → S (--) [new]
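The directory side of the overflow and lazy-cleanup examples can be sketched as a small state machine. This is an illustrative model only: the type and function names (`DirEntry`, `dir_putx`, `dir_gets_sticky`), the `transactional` flag on the writeback, and the single-sharer field are assumptions, not details taken from the slides.

```c
#include <assert.h>

typedef enum { DIR_I, DIR_M, DIR_M_STICKY, DIR_S } DirState;
typedef enum { RESP_NACK, RESP_CLEAN } OwnerResp;
typedef enum { REQ_RETRY, REQ_GOT_DATA } ReqResult;

typedef struct {
    DirState state;
    int      owner;   /* node the directory forwards requests to, or -1 */
    int      sharer;  /* one sharer is enough for this sketch, or -1 */
} DirEntry;

/* Owner evicts a modified block (PUTX). If the block was written inside a
 * transaction, the entry turns sticky so later requests still reach the owner. */
void dir_putx(DirEntry *d, int from, int transactional) {
    assert(d->state == DIR_M && d->owner == from);
    if (transactional) {
        d->state = DIR_M_STICKY;   /* keep the owner pointer */
    } else {
        d->state = DIR_I;
        d->owner = -1;
    }
}

/* GETS on a sticky entry: the request is forwarded to the old owner, and the
 * outcome depends on what the owner answers. */
ReqResult dir_gets_sticky(DirEntry *d, int requester, OwnerResp resp) {
    assert(d->state == DIR_M_STICKY);
    if (resp == RESP_NACK)
        return REQ_RETRY;          /* possible conflict: entry stays sticky */
    d->state  = DIR_S;             /* CLEAN: sticky state is cleaned up lazily */
    d->owner  = -1;
    d->sharer = requester;
    return REQ_GOT_DATA;
}
```

Nothing is cleaned up at commit time in this model. The sticky entry is only removed when some later request arrives and the old owner answers CLEAN, which keeps the commit path free of directory traffic.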
False Positives? • What if P0 has started a new transaction without cleaning the sticky data?
False Positive Example • Cache overflow • P0 sends put exclusive (PUTX) request • Directory acknowledges • P0 sets overflow bit • P0 writes data back to memory
Diagram: messages PUTX, ACK, DATA • Directory: M@P0 [old] → Msticky@P0 [new] • P0 Overflow bit: 0 → 1 • P0 block: M (-W) [new] → I (--) [none] • P1 block: I (--) [none]
False Positive Example • Commit • P0 clears TM mode and Overflow bits • Start new transaction • P0 sets TM mode • Eventually overflows • P0 sets Overflow bit
Diagram: P0 TM mode: 1 → 0 → 1 • P0 Overflow bit: 1 → 0 → 1 • Directory: still Msticky@P0 [new] • P0 and P1 blocks: I (--) [none]
Conflict Detection (example) • Out-of-cache conflict • P1 sends GETS request • Directory forwards to P0 • P0 detects a (possible) conflict • P0 sends NACK
Diagram: messages GETS, Fwd_GETS, NACK • Directory: Msticky@P0 [new] • P0 Overflow bit: 1 • P0 block: I (--) [none], conflict detected here • P1 block: I (--) [none]
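The node-side check used in the examples above, including the false positive, can be sketched as follows. The structure and function names (`Node`, `Block`, `on_forwarded_request`) are hypothetical, and `RESP_OK` stands in for whichever non-NACK reply applies (DATA or CLEAN). The sketch assumes a node can only reason exactly about blocks still in its cache and falls back to the overflow bit otherwise.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { REQ_GETS, REQ_GETX } Request;
typedef enum { RESP_OK, RESP_NACK } Response;  /* RESP_OK stands in for DATA or CLEAN */

typedef struct {
    bool tm_mode;   /* node is inside a transaction */
    bool overflow;  /* some transactional block has been evicted */
} Node;

typedef struct {
    bool present;   /* block is still in this node's cache */
    bool r, w;      /* per-block read and write bits */
} Block;

/* Decide how a node answers a request forwarded by the directory. */
Response on_forwarded_request(const Node *n, const Block *b, Request req) {
    if (!n->tm_mode)
        return RESP_OK;                 /* no transaction, no conflict */
    if (b->present) {
        /* In-cache: exact check. Any request conflicts with a written block;
         * an exclusive request also conflicts with a read block. */
        bool conflict = b->w || (req == REQ_GETX && b->r);
        return conflict ? RESP_NACK : RESP_OK;
    }
    /* Out-of-cache: the R/W bits went away with the block, so the node
     * conservatively reports a possible conflict whenever overflow is set. */
    return n->overflow ? RESP_NACK : RESP_OK;
}
```

The false positive falls out of the last line. If P0 commits, starts a new transaction, and overflows again, a request for the block left sticky by the earlier transaction is still NACKed, even though the new transaction never touched that block.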
Conflict Resolution and Deadlock Avoidance • Options • Wait – risk deadlock? • Abort – risk livelock? • Current Behavior • Wait • Abort if waiting on a logically younger process • Future Behavior? • Software contention manager
Evaluation • 32 SPARC processors • Solaris 9 OS • SIMICS – full system simulator • Magic no-ops • Tests • Micro-benchmarks • SPLASH suite
Microbenchmarks • High Contention / Short Transactions • Comparing: • EXP – TTS (test-and-test-and-set) locks with exponential backoff • MCS – software queue-based locks
    BEGIN_TRANSACTION();
    new_total = total.count + 1;
    private_data[id].count++;
    total.count = new_total;
    COMMIT_TRANSACTION();
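For readers unfamiliar with the EXP baseline, a test-and-test-and-set lock with exponential backoff might look roughly like the C11 sketch below. It is not the code used in the evaluation: the names (`TtsLock`, `tts_lock`, and so on) and the backoff bounds are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

#define BACKOFF_MIN 16      /* hypothetical delay bounds, in spin iterations */
#define BACKOFF_MAX 4096

typedef struct { atomic_int locked; } TtsLock;

void tts_init(TtsLock *l) { atomic_init(&l->locked, 0); }

/* One attempt: test with a plain load first, and only then test-and-set with
 * an atomic exchange. Returns 1 if the lock was acquired, 0 otherwise. */
int tts_try_lock(TtsLock *l) {
    if (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
        return 0;
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Busy-wait for a number of iterations; volatile keeps the loop from being removed. */
static void spin(unsigned iterations) {
    for (volatile unsigned i = 0; i < iterations; i++) { }
}

/* Acquire the lock, doubling the delay after each failed attempt. */
void tts_lock(TtsLock *l) {
    unsigned delay = BACKOFF_MIN;
    while (!tts_try_lock(l)) {
        spin(delay);
        if (delay < BACKOFF_MAX)
            delay *= 2;
    }
}

void tts_unlock(TtsLock *l) {
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

In the lock-based version of the microbenchmark, the three statements between `BEGIN_TRANSACTION()` and `COMMIT_TRANSACTION()` would sit between `tts_lock` and `tts_unlock` on a single shared lock, which is what makes this a high-contention test.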
SPLASH2 Benchmark Results • Data presented as: (PARMACS locks execution time) / (LogTM execution time) • Modified version: 1 - (LogTM execution time) / (PARMACS locks execution time)
Conclusions • Optimize commits • Aborts handled by software • Stall to avoid wasting work • Allow sticky data because overflow is rare • Good performance on microbenchmark • False sharing has a big impact on LogTM