Model checking transactional memory with Spin
John O’Leary, Bratin Saha, Mark Tuttle, Intel Corporation

We used the Spin model checker to prove that Intel’s software transactional memory is correct.




What is transactional memory?

A programming abstraction that makes it easier to write concurrent programs.


Concurrent programs are tricky

  • How do you synchronize access to tail of the queue?

    • What keeps two threads from writing the same queue entry?

[Figure: three threads call enqueue(a), enqueue(b), and enqueue(c) on one concurrent queue.]

enqueue(v) =
    if last == max return false;
    last := last + 1;
    queue[last] := v;
    return true;
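The race the slide asks about can be replayed deterministically. The sketch below is an illustration in Python, not code from the talk: `Queue`, `enqueue_steps`, `replay`, and the capacity `MAX` are hypothetical names, and each thread is a generator that pauses between shared-memory accesses so that a schedule string decides the interleaving.

```python
MAX = 4  # hypothetical queue capacity

class Queue:
    def __init__(self):
        self.last = -1              # index of the newest entry, -1 when empty
        self.slots = [None] * MAX

def enqueue_steps(q, v):
    """enqueue(v) as a generator that pauses between shared-memory accesses."""
    if q.last == MAX - 1:
        return                      # queue full
    tmp = q.last                    # read last
    yield
    q.last = tmp + 1                # last := last + 1, from a possibly stale read
    yield
    q.slots[q.last] = v             # queue[last] := v

def replay(schedule):
    """Run threads A and B one step at a time in the given order."""
    q = Queue()
    threads = {"A": enqueue_steps(q, "a"), "B": enqueue_steps(q, "b")}
    for name in schedule:
        next(threads[name], None)   # advance that thread by one step
    return q

serial = replay("AAABBB")           # A finishes before B starts
racy = replay("ABABAB")             # A and B alternate steps
print(serial.slots, serial.last)    # ['a', 'b', None, None] 1
print(racy.slots, racy.last)        # ['b', None, None, None] 0
```

In the alternating schedule both threads read the same value of `last`, so both write entry 0 and the value `a` is lost.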


Locks are hard

  • Locks used badly can lead to many subtle problems:

    • Hot-spots, blocking, dead-lock, priority inversion, preemption …

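[Figure: three threads call enqueue(a), enqueue(b), and enqueue(c) on one concurrent queue.]

The lock brackets the whole body of enqueue, so it has to be released on every return path:

enqueue(v) =
    acquire(lock);
        if last == max return false;
        last := last + 1;
        queue[last] := v;
        return true;
    release(lock);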


Transactional memory is easy

  • A programming abstraction for many-core

    • Makes it easy to implement atomic operations without locks

    • Makes it possible for average programmers to write correct code

[Figure: three threads call enqueue(a), enqueue(b), and enqueue(c) on one concurrent queue.]

enqueue(v) =
    atomic {
        if last == max return false;
        last := last + 1;
        queue[last] := v;
        return true;
    }


Programming is easy

  • Elegant code, properly synchronized, no data races

[Figure: concurrent linked list; head points to a chain of four nodes, each holding 0.]

A takes head off list

A: atomic {
       result := head;
       if head != null then head := head.next;
   }
   use result;

B increments list elements

B: atomic {
       node := head;
       while (node != null) {
           node.value++;
           node := node.next;
       }
   }


Implementation is hard

  • Some implementations expose intermediate states

    • A and B appear sequential, but run concurrently: data races!

    • A and B can exhibit the “privatization” bug:

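[Figure: the linked list with its head and result pointers, shown with node values 0, 0, 0, 0 and with node values 1, 1, 1, 1. Labels: “A reads head”, “B reads head”, “result = 0”, “result = 1!”, “result = 0!!”.]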

  • Proving correctness is not easy

    • We think model checking can help
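One plausible reading of the “result = 1!” and “result = 0!!” labels can be replayed by hand. The sketch below is a guess at the scenario, not taken from the talk: it assumes an update-in-place STM that keeps an undo log, and `Node` and `privatization_replay` are hypothetical names. B writes the first node in place, A commits the removal of that node and reads it as private data, and then B aborts and rolls its write back.

```python
class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def privatization_replay():
    """Replay one interleaving of A and B; return what A sees in its private node."""
    n1 = Node(0)
    n0 = Node(0, n1)
    shared = {"head": n0}
    observed = []

    # B (in a transaction) starts incrementing in place and logs the old value.
    undo_log = [(n0, n0.value)]
    n0.value += 1

    # A (in a transaction) takes the head off the list and commits.
    result = shared["head"]
    shared["head"] = result.next

    # A now treats result as private and reads it outside any transaction.
    observed.append(result.value)   # sees B's uncommitted write

    # B notices that head changed, aborts, and undoes its writes.
    for node, old in reversed(undo_log):
        node.value = old
    observed.append(result.value)   # A's private node changed underneath it

    return observed

print(privatization_replay())       # [1, 0]
```

Neither value is what a serial execution would let A see change after its commit, which is why the implementation has to make A wait for B's rollback.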


Our results

  • McRT is a software transactional memory from Intel

  • Spin is a software model checker from AT&T + NASA

  • We use Spin to prove that McRT is correct

    • “Every execution of every purely-transactional program with two transactions doing three reads and writes is serializable”

    • We validate an implementation model of an industrial product, not just an abstract protocol model

  • We give a Spin accelerator for shared memory programs



That’s it

  • We modeled this pseudocode exactly

    • We even model pointer dereferencing with array indexing

  • We do make the usual simplifying assumptions

    • No partial writes: modeled only whole-block loads and stores

    • No conflict handling: one of two conflicting transactions aborts

  • Timestamps are the key to the protocol


Timestamps are everywhere

  • Global timestamp: global.ts

    • Advances whenever a transaction tries to commit or abort

    • When it changes, memory may have changed, so be careful

  • Transaction timestamp: txn.ts

    • Transaction start time (and current proposal for commit time)

    • Will be read by other transactions when they commit

    • Stored in transaction descriptor

      • Along with transaction read set, write set, undo log (local data)

  • Memory block timestamp: blk.ts

    • Commit time of last transaction writing the block

    • Stored in transaction record

      • Along with a lock needed to write the block


Design rule 1

  • No transaction ever sees inconsistent data

    • Not even an aborting transaction!

    • Requires frequent checks that the read set is still valid

  • Validate() =

    • ts := global.ts

    • for each blk in my read set

      • confirm blk is not locked by another transaction

      • confirm blk.ts ≤ my.ts

      • abort if either confirmation fails

    • my.ts := ts

  • After validation conclude

    • Read set has not changed since transaction start
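The Validate() steps above can be sketched as follows. This is a minimal illustration, not McRT code: `TxnRecord`, `TxnDesc`, `Abort`, and the `clock` dictionary are hypothetical names, and the timestamp test is assumed to be `blk.ts <= my.ts`, matching the `blk.ts > txn.ts` failure condition used for reads and writes elsewhere in the deck.

```python
from dataclasses import dataclass, field

class Abort(Exception):
    """Raised when the transaction must abort."""

@dataclass
class TxnRecord:
    """Per-block record: commit time of the last writer plus the write lock."""
    ts: int = 0
    locked_by: str = ""             # name of the owning transaction, "" if free

@dataclass
class TxnDesc:
    """Transaction descriptor: timestamp and read set (local data)."""
    name: str
    ts: int = 0
    read_set: set = field(default_factory=set)

def validate(txn, records, clock):
    ts = clock["ts"]                          # ts := global.ts
    for blk in txn.read_set:
        rec = records[blk]
        if rec.locked_by not in ("", txn.name):
            raise Abort(f"{blk} is locked by {rec.locked_by}")
        if rec.ts > txn.ts:                   # assumed test: blk.ts <= my.ts
            raise Abort(f"{blk} was written after {txn.name} started")
    txn.ts = ts                               # my.ts := ts

records = {"x": TxnRecord(ts=0), "y": TxnRecord(ts=0)}
clock = {"ts": 3}
t = TxnDesc("T1", ts=0, read_set={"x", "y"})
validate(t, records, clock)
print(t.ts)                                   # 3
```

Reading the global timestamp before the loop means the new `txn.ts` is no later than the moment the read set was confirmed unchanged.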


Design rule 2

  • No transaction commits until conflicting transactions abort

    • Wait for conflicting transactions to undo changes upon abort

    • Avoids linked list privatization bug illustrated in introduction

  • Quiesce(my.ts) =

    • for each active transaction txn

      • block while txn.ts < my.ts and txn remains active

  • After quiescence conclude

    • Every conflicting transaction will validate when it commits

    • Validation will fail, transaction will abort, and undo its changes
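The Quiesce(my.ts) loop above can be sketched with a callback standing in for "let the other threads run". This is an illustration only: the `active` dictionary (transaction name to `txn.ts`) and the `wait` callback are hypothetical stand-ins for the shared transaction descriptors and for real blocking.

```python
def quiesce(my_ts, active, wait):
    """Block until no active transaction has a timestamp older than my_ts."""
    for name in list(active):
        # block while txn.ts < my.ts and txn remains active
        while name in active and active[name] < my_ts:
            wait()

# Hypothetical scenario: B started at time 1, C revalidated at time 7.
active = {"B": 1, "C": 7}
waits = []

def other_threads_run():
    waits.append(dict(active))      # record what we were waiting on
    active.pop("B", None)           # B aborts (or commits) and leaves

quiesce(5, active, other_threads_run)
print(len(waits), active)           # 1 {'C': 7}
```

A transaction can also stop blocking the committer by revalidating, since validation moves its `txn.ts` forward to the current global timestamp.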


Protocol sketch

  • Commit

    • Increment global ts

    • Validate read set

    • Set write set ts to global ts

  • Abort

    • Increment global ts

    • Undo changes to write set

    • Set write set ts to global ts

  • Read

    • Add block to read set

    • … unless block is locked or blk.ts > txn.ts

  • Write

    • Add block to write set

    • Add block value to undo log

    • Update block value

    • … unless block is locked or blk.ts > txn.ts
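The four operations in the protocol sketch can be put together in a small single-threaded toy. This is a hedged illustration, not the McRT implementation or the Spin model: `ToySTM`, `Txn`, and `Abort` are hypothetical names, taking the block lock on the first write is an assumption, and reusing the commit token inside `abort` loosely mirrors the `token` handling in the deck's STMTxnAbort pseudocode.

```python
class Abort(Exception):
    pass

class ToySTM:
    def __init__(self, blocks):
        self.global_ts = 0
        self.memory = dict(blocks)
        self.blk_ts = {b: 0 for b in blocks}      # commit time of last writer
        self.lock = {b: None for b in blocks}     # owning transaction or None

    def tick(self):
        self.global_ts += 1
        return self.global_ts

class Txn:
    def __init__(self, stm, name):
        self.stm, self.name, self.ts = stm, name, stm.global_ts
        self.read_set, self.write_set, self.undo_log = set(), set(), []

    def _usable(self, blk):
        owner = self.stm.lock[blk]
        return owner in (None, self.name) and self.stm.blk_ts[blk] <= self.ts

    def read(self, blk):
        if not self._usable(blk):                 # locked, or blk.ts > txn.ts
            raise Abort(blk)
        self.read_set.add(blk)
        return self.stm.memory[blk]

    def write(self, blk, value):
        if not self._usable(blk):
            raise Abort(blk)
        self.stm.lock[blk] = self.name            # assumption: lock on write
        self.write_set.add(blk)
        self.undo_log.append((blk, self.stm.memory[blk]))
        self.stm.memory[blk] = value              # update in place

    def _stamp(self, token):
        for blk in self.write_set:                # set write set ts, drop locks
            self.stm.blk_ts[blk] = token
            self.stm.lock[blk] = None

    def commit(self):
        token = self.stm.tick()                   # increment global ts
        if not all(self._usable(b) for b in self.read_set):
            self.abort(token)                     # validation failed
            raise Abort("validation")
        self._stamp(token)

    def abort(self, token=None):
        if token is None:
            token = self.stm.tick()               # increment global ts
        for blk, old in reversed(self.undo_log):  # undo changes to write set
            self.stm.memory[blk] = old
        self._stamp(token)

stm = ToySTM({"x": 0, "y": 0})
t1 = Txn(stm, "T1")
t1.write("x", 5)
t1.abort()
print(stm.memory["x"], stm.blk_ts["x"])           # 0 1
```

After the abort, any transaction that started before timestamp 1 fails the `blk.ts > txn.ts` test on block x, which is how it learns that the block was touched.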



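An invocation/response model

[Figure: three program processes (pgm), each paired with its own McRT process (mcrt); all mcrt processes access shared memory.]

  • Invocations sent from pgm to mcrt, and the responses returned

    • StartI, CommitI: StartR, CommitR, AbortR

    • ReadI(x): ReadR(v), AbortR

    • WriteI(x,v): WriteR, AbortR

  • Shared memory holds

    • global timestamp

    • transaction timestamps

    • program memory

    • block timestamps and locks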


McRT environment

[Figure: the Environment process contains pgm1, pgm2, and pgm3. Each pgm k talks to mcrt k over the channels send k and recv k; mcrt1, mcrt2, and mcrt3 all access shared memory.]


Environment generates programs on the fly

[Figure: three snapshots of pgm k above mcrt k; program slots not yet generated are shown as ----. Program contents across the snapshots: read(x,_), then read(x,v) followed by read(y,_). Messages shown: ReadI(x), ReadR(v), ReadI(y).]


Environment simulates programs including aborts

[Figure: three snapshots of pgm k above mcrt k, with a program counter (pc). The program read(x,v); read(y,w); write(z,u) appears with its read results filled in, and then as read(x,_); read(y,_); write(z,u) with the results cleared. Messages shown: WriteI(z,u), AbortR, StartI.]


Environment checks results

[Figure: three completed programs; each receives CommitR (with an ordering hint) from its mcrt.]

  pgm 1: read(x,v); read(y,w); write(z,u)
  pgm 2: read(w,a); read(y,w); write(w,b)
  pgm 3: write(m,l); write(n,p); write(z,v)

  • CommitR carries transaction ordering hint

  • Environment finds a transaction ordering consistent with transaction results and program memory
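The check described above can be sketched as a brute-force search. This is an illustration, not the environment from the Spin model: `serial_order` and the operation tuples are hypothetical, and the sketch ignores the ordering hint carried by CommitR and simply tries every order of the committed transactions.

```python
from itertools import permutations

def serial_order(txns, initial, final):
    """Return an order of txns that explains all read results and the final
    memory when run one after another, or None if no such order exists.

    txns maps a name to a list of ("r", var, value_seen) and
    ("w", var, value_written) operations.
    """
    for order in permutations(txns):
        mem = dict(initial)
        consistent = True
        for name in order:
            for kind, var, val in txns[name]:
                if kind == "r":
                    if mem[var] != val:     # serial run would read another value
                        consistent = False
                        break
                else:
                    mem[var] = val
            if not consistent:
                break
        if consistent and mem == final:
            return order
    return None

initial = {"x": 0, "y": 0}
good = {"T1": [("r", "x", 0), ("w", "y", 1)],
        "T2": [("r", "y", 1), ("w", "x", 2)]}
bad = {"T1": [("r", "x", 0), ("w", "y", 1)],
       "T2": [("r", "y", 0), ("w", "x", 1)]}
print(serial_order(good, initial, {"x": 2, "y": 1}))   # ('T1', 'T2')
print(serial_order(bad, initial, {"x": 1, "y": 1}))    # None
```

With only two or three transactions the factorial search is cheap; a hint from the implementation would let a checker test a single candidate order instead.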


We modeled pseudocode “exactly”

Let’s look at the least “exact” match: Abort

Pseudocode

    STMTxnAbort(TxnDesc* txnDesc, uint32 reason) {
        for ( (addr, val, size) in txnDesc->undoLog ) {
            if (addr is on dead stack frames) continue;
            switch(size) {
                case 4: *(uint32*)addr = val; break;
                ...
            }
        }
        if ((token = txnDesc->token) == 0)
            token = lockedIncrement(globalTimeStamp);
        for ( txnRecPtr in txnDesc->writeSet )
            *txnRecPtr = token;
        txnDesc->localTimeStamp = 0;
        backoff();
        abortInternal(txnDesc); /* longjmp */
    }

Our model

    inline abortTransaction(txnDescPtr, ...) {
        foreach adr in 0..(num_addresses)-1 {
            if
            :: txnDesc(txnDescPtr).undoLog[adr] != null_data ->
                memory[adr] = txnDesc(txnDescPtr).undoLog[adr];
            :: else
            fi
        };
        fetch_and_incr (globalTimeStamp,token,token_new);
        foreach blk in 0..(num_memory_blocks)-1 {
            if
            :: txnDesc(txnDescPtr).writeSet[blk] ->
                txnRecHeap[blk] = token_new;
            :: else
            fi
        };
        /* reset transaction descriptor for restart */
        initTxnDesc(txnDesc(txnDescPtr),...);
        txnDesc(txnDescPtr).localTimeStamp = 0;
    }



Challenges

  • Modeling environment, abort, timestamps, …

  • Code-level models are hard to model check

    • Too much detail, too many interleavings

  • SPIN statement-merging is conservative

    • Intended to reduce detail by creating larger atomic blocks

    • Looping over data structures inhibits statement-merging

  • SPIN partial-order reduction is conservative

    • Intended to identify and ignore “redundant” interleavings

    • Global variables (like shared memory) inhibit partial-order reduction
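A quick count shows why larger atomic blocks matter. The sketch below is a back-of-the-envelope illustration, not a measurement of Spin: `interleavings` is a hypothetical helper that counts the schedules of independent straight-line processes, using the standard multinomial formula, and ignores everything a real model checker does to prune them.

```python
from math import factorial

def interleavings(steps_per_process):
    """Number of ways to interleave processes with the given atomic step counts."""
    total = factorial(sum(steps_per_process))
    for n in steps_per_process:
        total //= factorial(n)
    return total

print(interleavings([12, 12]))   # 2704156 schedules for two 12-step processes
print(interleavings([6, 6]))     # 924 once every two statements are merged
```

Halving the number of atomic steps per process cuts the schedule count by more than three orders of magnitude in this example, which is the effect statement merging and partial-order reduction aim for.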


A SPIN preprocessor

  • Loop unrolling to help statement merging, etc.

    • Loop unrolling alone gives 50% speedup

  • Model rewriting to help partial order reduction (planned)

    • Help Spin find fewer, longer atomic blocks to reorder

    • Rewrite model as a set of transitions of the form

      atomic{ local access; local access; … ; global access}

Before unrolling:

    adr = 0;
    do
    :: adr < num_addresses -> memory[adr] = 0; adr++
    :: else -> break
    od;

After unrolling (with num_addresses = 4):

    memory[0] = 0;
    memory[1] = 0;
    memory[2] = 0;
    memory[3] = 0;
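The unrolling step can be imitated with a few lines of text substitution. This is a toy sketch, not the authors' preprocessor: `unroll` is a hypothetical helper that assumes the loop bound is a known constant and that the loop variable appears in the body only as a whole identifier.

```python
import re

def unroll(var, count, body):
    """Emit one copy of body per iteration, with var replaced by its value."""
    pattern = re.compile(rf"\b{re.escape(var)}\b")
    return [pattern.sub(str(i), body) for i in range(count)]

for line in unroll("adr", 4, "memory[adr] = 0;"):
    print(line)
# memory[0] = 0;
# memory[1] = 0;
# memory[2] = 0;
# memory[3] = 0;
```

The unrolled statements contain no loop counter or guard, which leaves a straight-line sequence for statement merging to work on.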


Related work

A deep result: Model checking TM often reduces to checking 2 threads

  • Deferred update [Guerraoui, Henzinger, Jobstmann, Singh, PLDI’08]

    • Applies to any TM that satisfies four structural properties

    • Clean, elegant result, but doesn’t apply to McRT

  • Update in place [Guerraoui, Henzinger, Singh, CAV’09]

    • Requires a hand proof that the TM satisfies four generalized properties

    • They prove this for an abstract model of McRT

    • Proof not clear for our implementation model of McRT


The abstract model

Our implementation model is 2500+ lines of Spin


Conclusion

We validated Intel’s implementation of STM.

We optimized SPIN’s performance on shared memory protocols.

