TxLinux: Using and Managing Hardware Transactional Memory in an Operating System

Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel. Presented by Ahmed Badran.

Presentation Transcript


  1. TxLinux: Using and Managing Hardware Transactional Memory in an Operating System Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel Presented by Ahmed Badran

  2. Advantages of TM
     • Atomicity
     • Isolation
     • Longer critical sections
     • True concurrency
     • No deadlocks
     • Composable
     • (for HTM) Efficient, all done in HW

  3. Which is easier…?

         acquire(&lock1)
         /* code fragment #1 */
         release(&lock1)
         /* code fragment #2 */
         acquire(&lock2)
         /* code fragment #3 */
         release(&lock2)
         /* code fragment #4 */
         acquire(&lock3)
         /* code fragment #5 */
         release(&lock3)
         /* code fragment #6 */
         acquire(&lock4)
         /* code fragment #7 */
         release(&lock4)

     VS

         start_transaction();
         /*
          * Some mix of fragments 1
          * to 7 that semantically
          * accomplishes the same job
          */
         commit_transaction();

     Transactions allow for longer critical sections

  4. …Well, yes, but…

         acquire(&lock1)
         /* code fragment #1 */
         release(&lock1)
         /* code fragment #2 */
         acquire(&lock2)
         /* code fragment #3 */
         release(&lock2)
         /* code fragment #4 */
         acquire(&lock3)
         /* code fragment #5 */
         release(&lock3)
         /* code fragment #6 */
         acquire(&lock4)
         /* code fragment #7 */
         release(&lock4)

         start_transaction();
         /*
          * Some mix of fragments 1
          * to 7 that semantically
          * accomplishes the same job
          */
         commit_transaction();

     Longer critical sections mean higher chances for conflict and more work wasted in case of a conflict

  5. However…

         acquire(&lock1)
         /* code fragment #1 */
         release(&lock1)
         /* code fragment #2 */
         acquire(&lock2)
         /* code fragment #3 */
         release(&lock2)
         /* code fragment #4 */
         acquire(&lock3)
         /* code fragment #5 */
         release(&lock3)
         /* code fragment #6 */
         acquire(&lock4)
         /* code fragment #7 */
         release(&lock4)

         start_transaction();
         /*
          * Some mix of fragments 1
          * to 7 that semantically
          * accomplishes the same job
          */
         commit_transaction();

     Slide annotations: "One thread" (×4), "Many threads" (×3)

     Transactions allow higher concurrency, since all threads are in the critical section at the same time

  6. Let’s replace all locks by transactions!

         acquire(&lock1)
         /* code fragment #1 */
         release(&lock1)
         /* code fragment #2 */
         acquire(&lock2)
         /* code fragment #3 */
         release(&lock2)
         /* code fragment #4 */
         acquire(&lock3)
         /* code fragment #5 */
         release(&lock3)
         /* code fragment #6 */
         acquire(&lock4)
         /* code fragment #7 */
         release(&lock4)

         start_transaction();
         /* code fragment #1 */
         /* code fragment #2 */
         /* code fragment #3 */
         /* code fragment #4 */
         /* code fragment #5 */
         /* code fragment #6 */
         /* code fragment #7 */
         commit_transaction();

     How long will it take multiple threads executing this transaction to finish?!

  7. Let’s replace all locks by transactions!

         if (we_are_being_attacked) {
             acquire(&nuclear_lock);
             if (number_of_bombs_activated < 1) {
                 send_signal_to_activate_nuclear_bomb();
                 number_of_bombs_activated++;
             }
             release(&nuclear_lock);
         }

         if (we_are_being_attacked) {
             start_transaction();
             if (number_of_bombs_activated < 1) {
                 send_signal_to_activate_nuclear_bomb();
                 number_of_bombs_activated++;
             }
             commit_transaction();
         }

  8. Let’s replace all locks by transactions!

         if (we_are_being_attacked) {
             acquire(&nuclear_lock);
             if (number_of_bombs_activated < 1) {
                 send_signal_to_activate_nuclear_bomb();
                 number_of_bombs_activated++;
             }
             release(&nuclear_lock);
         }

         if (we_are_being_attacked) {
             start_transaction();
             if (number_of_bombs_activated < 1) {
                 send_signal_to_activate_nuclear_bomb();
                 number_of_bombs_activated++;
             }
             commit_transaction();
         }

     I/O operation !!!

  9. Simple transactions cannot replace locks!
     • For highly contended critical sections, the number of retries will be high
     • As we have seen, we cannot do I/O in a transaction (the “output commit problem”): a restarted transaction can roll back memory, but not I/O that has already been issued
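The output commit problem from the bomb example can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: `struct world`, `attempt`, and `run_with_restarts` are hypothetical names, the "transaction" is emulated by a manual snapshot and rollback, and conflicts are injected by a counter rather than detected by hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not MetaTM primitives. */
struct world {
    int bombs_activated;   /* memory state: restored when an attempt aborts */
    int signals_sent;      /* external side effect: cannot be rolled back */
};

/* Simulates one transactional attempt at the critical section.
 * If `conflict` is true the attempt aborts: memory is restored from the
 * snapshot, but a signal that was already sent stays sent. */
static bool attempt(struct world *w, bool conflict)
{
    int snapshot = w->bombs_activated;      /* checkpoint at "xbegin" */

    if (w->bombs_activated < 1) {
        w->signals_sent++;                  /* I/O escapes the transaction */
        w->bombs_activated++;
    }
    if (conflict) {
        w->bombs_activated = snapshot;      /* roll back memory only */
        return false;
    }
    return true;                            /* commit */
}

/* Retries until an attempt commits; the first `conflicts` attempts abort.
 * Returns how many signals reached the outside world. */
int run_with_restarts(int conflicts)
{
    struct world w = {0, 0};
    int tries = 0;

    assert(conflicts >= 0);
    while (!attempt(&w, tries < conflicts))
        tries++;
    return w.signals_sent;
}
```

With no conflicts exactly one signal is sent. With two injected conflicts the counter in memory still ends at one, but three signals have left the machine, which is why the slide flags the I/O call inside the transaction.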

  10. HTM Primitives in MetaTM
      • xbegin
      • xend
      • xrestart
      • xgettxid
      • xpush
      • xpop
      • xtest
      • xcas

  11. Sample transformation

      Lock-based code:

          spin_lock(&l->list_lock);
          offset = l->colour_next;
          if (++l->colour_next >= cachep->colour)
              l->colour_next = 0;
          spin_unlock(&l->list_lock);
          if (!(objp = kmem_getpages(cachep, flags, nodeid)))
              goto failed;
          spin_lock(&l->list_lock);
          list_add_tail(&slabp->list, &(l->slabs_free));
          spin_unlock(&l->list_lock);

      Transactional code:

          xbegin;
          offset = l->colour_next;
          if (++l->colour_next >= cachep->colour)
              l->colour_next = 0;
          xend;
          if (!(objp = kmem_getpages(cachep, flags, nodeid)))
              goto failed;
          xbegin;
          list_add_tail(&slabp->list, &(l->slabs_free));
          xend;

  12. Cooperative transactional spinlocks (cxspinlock)
      • Allows a critical section to be protected sometimes by a lock and at other times by a transaction
      • Allows the same data structure to be accessed from different critical regions, whether they are protected by transactions or by locks
      • I/O is handled automatically
      • Provides a simple lock replacement in existing code

  13. Cooperative transactional spinlocks (cxspinlock)
      • cx_optimistic
      • cx_exclusive
      • Automatic I/O handling

  14. cxspinlock state diagram
      States: Sfree, Stxnl, Sexcl
      Transition labels (event: action):
      • cx_optimistic: thread proceeds (appears twice)
      • cx_exclusive: thread proceeds
      • cx_unlock: if no other threads (appears twice)
      • io_operation: restart transaction
      • cx_unlock: if waiting excl. threads, release one excl. thread; if waiting atomic threads, release all atomic
      • cx_unlock: if waiting non-tx threads, release one
      • cx_exclusive: thread waits OR restart transaction and thread proceeds
      • cx_exclusive/cx_optimistic: thread waits
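One possible reading of the state diagram can be written down as a tiny state machine. This is a simplified sketch, not the TxLinux implementation: the state names come from the slide, while `struct cx_model` and the `model_*` functions are hypothetical. Wait queues, the io_operation restart, and the "restart transaction and thread proceeds" option are left out, so a request either proceeds or is told to wait.

```c
#include <assert.h>

/* State names follow the slide; everything else is a hypothetical model. */
enum cx_state { S_FREE, S_TXNL, S_EXCL };

struct cx_model {
    enum cx_state state;
    int tx_holders;   /* transactions currently inside the critical section */
};

/* cx_optimistic: returns 1 if the thread proceeds, 0 if it must wait. */
int model_optimistic(struct cx_model *m)
{
    if (m->state == S_EXCL)
        return 0;                 /* an exclusive holder blocks transactions */
    m->state = S_TXNL;
    m->tx_holders++;              /* many transactions may proceed together */
    return 1;
}

/* cx_exclusive: returns 1 if the thread proceeds, 0 if it must wait.
 * Assumption: in this model an exclusive request always waits for running
 * transactions instead of restarting them. */
int model_exclusive(struct cx_model *m)
{
    if (m->state != S_FREE)
        return 0;
    m->state = S_EXCL;
    return 1;
}

/* cx_unlock by a current holder: back to S_FREE once nobody is left. */
void model_unlock(struct cx_model *m)
{
    assert(m->state != S_FREE);   /* unlocking a free lock is a caller bug */
    if (m->state == S_TXNL && --m->tx_holders > 0)
        return;                   /* other transactions are still inside */
    m->state = S_FREE;
}
```

The asymmetry is the point of the design as the slides describe it: any number of optimistic (transactional) threads can share Stxnl, while Sexcl admits exactly one thread.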

  15. cx_optimistic

          void cx_optimistic(lock) {
              status = xbegin;
              // Use mutual exclusion if required
              if (status == NEED_EXCLUSIVE) {
                  xend;
                  // xrestart for closed nesting
                  if (xgettxid)
                      xrestart(NEED_EXCLUSIVE);
                  else
                      cx_exclusive(lock);
                  return;
              }
              // Spin waiting for lock to be free (==1)
              while (xtest(lock, 1) == 0); // spin
              disable_interrupts();
          }

  16. cx_exclusive

          void cx_exclusive(lock) {
              // Only for non-transactional threads
              if (xgettxid)
                  xrestart(NEED_EXCLUSIVE);
              while (1) {
                  // Spin waiting for lock to be free
                  while (*lock != 1); // spin
                  disable_interrupts();
                  // Acquire lock by setting it to 0
                  // Contention manager arbitrates lock
                  if (xcas(lock, 1, 0))
                      break;
                  enable_interrupts();
              }
          }

  17. cx_end

          void cx_end(lock) {
              if (xgettxid) {
                  xend;
              } else {
                  *lock = 1;
              }
              enable_interrupts();
          }
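The exclusive path of cx_exclusive and cx_end can be approximated in portable C11 to show the lock-word convention used on the slides (1 means free, 0 means held). This is a sketch under assumptions, not the MetaTM code: `atomic_compare_exchange_strong` stands in for `xcas`, the names `cx_lock_t`, `try_exclusive`, `exclusive_acquire`, and `exclusive_release` are hypothetical, and the transactional branch, interrupt handling, and contention manager are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Lock word convention from the slides: 1 = free, 0 = held. */
typedef atomic_int cx_lock_t;

/* Software stand-in for xcas(lock, 1, 0): succeeds only if the lock was
 * free. No contention manager arbitrates here. */
bool try_exclusive(cx_lock_t *lock)
{
    int expected = 1;
    return atomic_compare_exchange_strong(lock, &expected, 0);
}

/* Mirrors the loop in cx_exclusive: spin until the word reads free, then
 * try to claim it; if another thread wins the race, go back to spinning. */
void exclusive_acquire(cx_lock_t *lock)
{
    for (;;) {
        while (atomic_load(lock) != 1)
            ;                         /* spin */
        if (try_exclusive(lock))
            break;
    }
}

/* Mirrors the non-transactional branch of cx_end: mark the lock free. */
void exclusive_release(cx_lock_t *lock)
{
    assert(atomic_load(lock) == 0);   /* releasing an unheld lock is a bug */
    atomic_store(lock, 1);
}
```

Spinning on a plain load before attempting the compare-and-swap follows the shape of the slide's loop: the expensive atomic operation is only tried once the lock looks free.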

  18. Scheduling in TxLinux
      • Priority and policy inversion
      • Contention management
        • (conflict priority, size, age)
      • Transaction-aware scheduling
      • Scheduler techniques
        • Dynamic priority based on HTM state
        • Conflict-reactive descheduling
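The three contention-management criteria named on this slide (priority, size, age) can be combined into a single arbitration function. The sketch below is purely illustrative: `struct tx_info` and `pick_winner` are hypothetical, and the ordering of the criteria (priority first, then size, then age as tie-breakers) is an assumption, since the slide does not say how they are combined.

```c
#include <assert.h>

/* Hypothetical per-transaction bookkeeping for illustration. */
struct tx_info {
    int priority;          /* scheduler priority; higher is more important */
    unsigned size;         /* e.g. bytes in the read/write set */
    unsigned long start;   /* start timestamp; smaller means older */
};

/* Decides which of two conflicting transactions continues; the loser
 * would be restarted. Returns 0 if `a` wins, 1 if `b` wins.
 * Assumed order: priority, then larger size, then older age. */
int pick_winner(const struct tx_info *a, const struct tx_info *b)
{
    assert(a && b);
    if (a->priority != b->priority)
        return a->priority > b->priority ? 0 : 1;
    if (a->size != b->size)
        return a->size > b->size ? 0 : 1;   /* restarting it wastes more work */
    return a->start <= b->start ? 0 : 1;    /* the older transaction wins */
}
```

Consulting scheduler priority first is one way to address the priority inversion bullet above: a low-priority transaction does not force a high-priority one to restart merely because it is larger or older.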

  19. Performance (1) Time lost due to restarted transactions and acquiring spin locks in 16 & 32 CPU experiments

  20. Performance (2) Spinlock performance for an unmodified Linux vs. the subsystem kernel TxLinux-SS

  21. Performance (3) Distribution of maximum concurrency across TxLinux-CX critical sections for the benchmark programs on 32 processors

  22. Performance (4)

  23. Performance (5) Cxspinlock usage in TxLinux

  24. Performance (6)

  25. Performance (7)
