Modeling Laws of Concurrent Programming with Tracelets

Tracelets: a Model for the Laws of Concurrent Programming. Tony Hoare, Oxford, Feb 2012

Our Universe • E is a set of events that can occur in or around a computer during execution of a program • drawn as boxes. • D is a set of dependencies between the events • drawn as arrows between boxes • e --> f means f depends on e • source, target: D --> E • source(d) --> target(d)
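A minimal sketch of this universe, assuming events are named by strings and each dependency records its source and target. The names `Dependency`, `E`, `D` and `depends_on`, and the three sample events, are illustrative and not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    source: str  # the event that is depended upon
    target: str  # the event that depends on the source


# Hypothetical universe: three events and two arrows, e --> f --> g.
E = frozenset({"e", "f", "g"})
D = frozenset({Dependency("e", "f"), Dependency("f", "g")})


def depends_on(e, f, deps=D):
    """True when e --> f, i.e. some dependency has source e and target f."""
    return any(d.source == e and d.target == f for d in deps)
```

Here `depends_on("e", "f")` holds, while `depends_on("e", "g")` does not, because `-->` is the single-arrow relation rather than a path.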

Labels • P are sets of properties of D or E • e.g., the type of command executed • the objects involved in the event • the value of the data transmitted on the arrow • labels: E + D --> P • labels(e) are drawn in the box • labels(d) on top of the arrow

A tracelet • is a subset of E, denoted by p, q, r. For example: • a trace of a single execution of a program • or of a single execution of a command • or of a single object used in the execution • I = { }, the tracelet that contains no event

An object • is used by a program to store or transmit data • e.g., a variable, a semaphore, a channel, … • Its behaviour is modelled by a tracelet • containing some or all events in which it has engaged • A trace of a complete program execution • is the union of all the tracelets for every resource that it has used

Pictorially • [figure: a tracelet running from ν to δ] • ν labels the allocation of an object • δ labels its disposal • All other events of its tracelet lie in between

A variable • [figure: a tracelet with events labelled ν, :=4, :=2, :=3, =2, =4, =3, =2, δ] • :=4 labels an assignment of value 4 • =4 labels a fetch of value 4

Object names • [figure: the variable tracelet with labels νx, x:=4, x:=2, x:=3, x=2, x=4, x=3, x=2, δx] • Object names may be added to the labels

A variable • [figure: a tracelet with events labelled ν, :=4, :=2, :=3, =2, =4, =3, =2, δ] • The arrows from each fetch to the next assignment ensure prompt overwriting

Weak memory • [figure: a tracelet with events labelled ν, :=4, :=2, :=3, =2, =4, =3, =2, δ] • … which does not occur in modern weak memories.

A Semaphore • [figure: a tracelet with events labelled ν, P, P, V, V, δ] • P is an acquisition of the semaphore • V is a release (by the same owner)

A buffered channel • [figure: a tracelet with events labelled ν, !4, !2, !3, ?4, ?3, ?2, δ] • !4 labels an output of value 4 • ?4 labels an input of value 4

A single-buffered channel • [figure: a tracelet with events labelled ν, !4, !2, !3, ?4, ?3, ?2, δ] • Each output depends on prior input of the previous message

A complete program trace is the union of the tracelets for every command that it contains. • The tracelet of a command can be analysed into sub-tracelets for each of its immediate sub-commands. • The analysis determines whether the trace is a valid trace for the program.

Concurrent Composition • p | q = p ∪ q, provided that p ∩ q = { } • otherwise the analysis is invalid, because no event is an execution of two distinct commands. • Theorem: | is associative and commutative with unit I
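Concurrent composition is a partial operation, which can be sketched as follows. The sketch assumes tracelets are frozensets of event names and uses `None` to stand for "undefined"; the name `par` and the sample tracelets are illustrative.

```python
def par(p, q):
    """p | q: the union of two disjoint tracelets, or None when undefined."""
    if p is None or q is None:
        return None  # composing with an undefined tracelet stays undefined
    if p & q:
        return None  # a shared event would belong to two distinct commands
    return p | q


EMPTY = frozenset()  # the tracelet that contains no event
p = frozenset({"a", "b"})
q = frozenset({"c"})
r = frozenset({"b", "d"})
```

With these values `par(p, q)` contains a, b and c, while `par(p, r)` is `None` because both tracelets contain the event b.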

Definitions • p --> q = ∃ e ∈ p, f ∈ q . e --> f • p => q = p = q or p is undefined.

Sequential Composition • p ; q = p | q, provided not q --> p • otherwise the analysis is invalid, because no event in execution of the first command can depend on any event in the execution of the second. • Theorem: ; is associative with unit I
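A sketch of sequential composition under the same assumptions as before: tracelets are frozensets of event names, `None` means "undefined", and the dependency relation is a set of (source, target) pairs. The names `ARROWS`, `arrow`, `par` and `seq` are illustrative.

```python
# Hypothetical dependencies: a --> b and b --> c.
ARROWS = frozenset({("a", "b"), ("b", "c")})


def arrow(p, q, arrows=ARROWS):
    """p --> q: some event of q depends directly on some event of p."""
    return any(e in p and f in q for (e, f) in arrows)


def par(p, q):
    """p | q: disjoint union, or None when undefined."""
    if p is None or q is None or p & q:
        return None
    return p | q


def seq(p, q, arrows=ARROWS):
    """p ; q: like p | q, but undefined when q --> p."""
    r = par(p, q)
    if r is None or arrow(q, p, arrows):
        return None
    return r
```

For example, `seq(frozenset({"a"}), frozenset({"b"}))` is defined, whereas swapping the arguments gives `None`, since the first operand would then depend on the second.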

Example • If x is a shared variable • x := 3 ; x := 4 = [figure: boxes x := 3 and x := 4]

Or, if blue arrows are equal, • x := 3 ; x := 4 = [figure: boxes x := 3 and x := 4]

Theorems • p;q => p|q • Proof: they are equal whenever not q --> p • otherwise, lhs is undefined • (p|p’);q => p|(p’;q) • they are equal when they are both defined • if rhs is undefined, then q --> p’ • which implies that q --> (p|p’), • therefore the lhs is also undefined.

Exchange laws • p;(q|q’) => (p;q)|q’ • proof similar • (p|p’);(q|q’) => (p;q)|(p’;q’) • proof similar • All exchange laws are derivable from the last, • by substituting I for p’, or q’, • or for both q and q’
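The full exchange law can be checked by brute force on a small universe. This is only a sanity check under stated assumptions, not a proof: tracelets are frozensets over three hypothetical events, `None` means "undefined", and the (deliberately cyclic) arrow set is made up for the test.

```python
from itertools import combinations, product

EVENTS = ("a", "b", "c")
ARROWS = {("a", "b"), ("b", "c"), ("c", "a")}  # hypothetical dependencies


def arrow(p, q):
    """p --> q: some event of q depends directly on some event of p."""
    return any(e in p and f in q for (e, f) in ARROWS)


def par(p, q):
    """p | q: disjoint union, or None when undefined."""
    if p is None or q is None or p & q:
        return None
    return p | q


def seq(p, q):
    """p ; q: p | q provided not q --> p, else None."""
    r = par(p, q)
    if r is None or arrow(q, p):
        return None
    return r


def refines(lhs, rhs):
    """lhs => rhs: lhs is undefined, or both sides are equal."""
    return lhs is None or lhs == rhs


def subsets(xs):
    """All tracelets over the given events."""
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]


def exchange_law_holds():
    """Check (p|p');(q|q') => (p;q)|(p';q') for every choice of tracelets."""
    ts = subsets(EVENTS)
    return all(
        refines(seq(par(p, p2), par(q, q2)), par(seq(p, q), seq(p2, q2)))
        for p, p2, q, q2 in product(ts, repeat=4)
    )
```

The law is not an equation: with p and q’ empty, p’ = {a} and q = {c}, the right-hand side is defined but the left-hand side is not, because c --> a.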

Separating concurrency? • (r = p||q) = (r = p;q & r = q;p) • there is no arrow between p and q • BUT • this would prohibit shared variables • p||q < p;q • p ; (q||q’) < (p;q)||q’ etc. • the wrong way round! • Let’s postpone this problem

A command • is modelled by the set of tracelets of all its possible executions in all its possible environments. • I = { { } } • x := 3 = { p | ‘x := 3’ ∈ labels(p) } • x := y = { p | ∃ n . {‘x := n’, ‘y = n’} ⊆ labels(p) }
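Since a command is an unbounded set of tracelets, it can be sketched as a membership predicate rather than an enumerated set. The sketch assumes labels(p) is the union of the labels of the events in p, writes labels without spaces, and searches a small finite range of values for n; the table `LABELS` and all function names are illustrative.

```python
# Hypothetical labelling of three events.
LABELS = {
    "e1": {"y=3"},   # a fetch of value 3 from y
    "e2": {"x:=3"},  # an assignment of value 3 to x
    "e3": {"x:=4"},  # an assignment of value 4 to x
}


def labels(p):
    """All labels carried by the events of tracelet p (an assumed reading)."""
    return set().union(*(LABELS[e] for e in p))


def is_x_assign_3(p):
    """Membership test for the command x := 3."""
    return "x:=3" in labels(p)


def is_x_assign_y(p, values=range(10)):
    """Membership test for x := y: some n is fetched from y and assigned to x."""
    ls = labels(p)
    return any({f"x:={n}", f"y={n}"} <= ls for n in values)
```

The tracelet {e1, e2} is an execution of x := y under this reading, while {e1, e3} is not, because the value assigned differs from the value fetched.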

Let P, Q, R be commands • P | Q = { (p|q) | p ∈ P & q ∈ Q } • P ; Q = { (p;q) | p ∈ P & q ∈ Q } • P \/ Q = { r | r ∈ P or r ∈ Q } • P ⊆ Q = ∀ r . r ∈ P => r ∈ Q • All our theorems p => q also hold for P ⊆ Q, • because every variable appears exactly once on each side of the inequation.
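The lifting to commands can be sketched pointwise. One assumption is made loudly here: compositions that are undefined at the tracelet level are simply left out of the lifted set, which the slides do not state explicitly. The names `lift`, `par`, `seq` and the sample data are illustrative.

```python
# Hypothetical dependencies: a --> b and c --> a.
ARROWS = {("a", "b"), ("c", "a")}


def par(p, q):
    """p | q on tracelets: disjoint union, or None when undefined."""
    return None if p & q else p | q


def seq(p, q):
    """p ; q on tracelets: p | q provided no event of p depends on one of q."""
    if p & q or any(e in q and f in p for (e, f) in ARROWS):
        return None
    return p | q


def lift(op, P, Q):
    """Apply op to every pair of tracelets, dropping undefined results (assumption)."""
    result = set()
    for p in P:
        for q in Q:
            r = op(p, q)
            if r is not None:
                result.add(r)
    return result


P = {frozenset({"a"}), frozenset({"b"})}
Q = {frozenset({"b"}), frozenset({"c"})}
```

On this data `lift(seq, P, Q)` is a strict subset of `lift(par, P, Q)`, which is the set-level counterpart of p;q => p|q.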

Separation • Let L be the set of red arrows • they must not cross thread boundaries • Blue arrows must cross thread boundaries • Black arrows may cross either boundary. • Definitions of ; and | must be changed to ensure these rules

Dependency ordering • Let e < f mean that there is a path of arrows from e to f. • e <L f means the path consists of red arrows • p <L q means e --> f, for some e ∈ p, f ∈ q
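The ordering on events is reachability, which can be sketched with a simple search. The sketch assumes a path contains at least one arrow and that each arrow carries a colour; the dictionary `COLOUR`, the set `RED` and the function `reaches` are illustrative names.

```python
# Hypothetical coloured arrows: a --> b (red), b --> c (blue), c --> d (red).
COLOUR = {("a", "b"): "red", ("b", "c"): "blue", ("c", "d"): "red"}
ALL = set(COLOUR)
RED = {arrow for arrow, colour in COLOUR.items() if colour == "red"}  # the set L


def reaches(e, f, arrows):
    """True when a path of one or more of the given arrows leads from e to f."""
    frontier = [e]
    seen = set()
    while frontier:
        x = frontier.pop()
        for (src, tgt) in arrows:
            if src == x and tgt not in seen:
                if tgt == f:
                    return True
                seen.add(tgt)
                frontier.append(tgt)
    return False
```

Here `reaches("a", "d", ALL)` holds, so a < d, but `reaches("a", "d", RED)` fails because the only path uses a blue arrow.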

Interfaces of a tracelet p • arrows(p) = s(p) ∪ t(p) • ins(p) = t(p) - s(p) • outs(p) = s(p) - t(p) • where s(p) = { d | source(d) ∈ p } and t(p) = { d | target(d) ∈ p }
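These interface definitions translate directly into set operations. The sketch assumes each dependency is a (source, target) pair; the arrow set `ARROWS` and the sample tracelet are illustrative.

```python
# Hypothetical dependencies forming a chain a --> b --> c --> d.
ARROWS = {("a", "b"), ("b", "c"), ("c", "d")}


def s(p):
    """Arrows whose source lies in p."""
    return {d for d in ARROWS if d[0] in p}


def t(p):
    """Arrows whose target lies in p."""
    return {d for d in ARROWS if d[1] in p}


def arrows(p):
    """Every arrow that touches p."""
    return s(p) | t(p)


def ins(p):
    """Arrows entering p from outside."""
    return t(p) - s(p)


def outs(p):
    """Arrows leaving p for the outside."""
    return s(p) - t(p)


p = frozenset({"b", "c"})
```

For this p, the internal arrow b --> c belongs to both s(p) and t(p), so it appears in neither ins(p) nor outs(p).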

In pictures… • [figure: a tracelet with its ins and outs marked]

Separating ; • p;q = p|q provided that not q < p & outs(p) ∩ L = ins(q) ∩ L • outs(p;q) ∩ L = outs(q) ∩ L • ins(p;q) ∩ L = ins(p) ∩ L • Theorem: ; is associative and has unit I

ok is a set of tracelets that are always preferred • e.g., no overflow, no races, no divergence, etc. • p => q means p = q or p is not defined or not q ∈ ok • I ∈ ok • p;q ∈ ok = p ∈ ok & q ∈ ok

Separating concurrency • e < f = there is a path of red dependencies from e to f • p*q ∈ ok = p ∈ ok & q ∈ ok & not p < q & not q < p • p*q = p|q if p*q ∈ ok • because that’s the way it will be implemented!

Lift to sets • P => Q means ∀ p ∈ P . p ∈ ok => p ∈ Q, otherwise ∃ q ∈ Q . not q ∈ ok & ins(q) = ins(p)
