1 / 80

Program analysis & Synthesis

Lecture 03 – Background + intro AI + dataflow and connection to AI. Program analysis & Synthesis. Eran Yahav. Previously…. structural operational semantics trace semantics meaning of a program as the set of its traces abstract interpretation abstractions of the trace semantics.

zalika
Download Presentation

Program analysis & Synthesis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 03 – Background + intro AI + dataflow and connection to AI Program analysis & Synthesis EranYahav

  2. Previously… • structural operational semantics • trace semantics • meaning of a program as the set of its traces • abstract interpretation • abstractions of the trace semantics Example: Cormac’s PLDI’08 Example: KupersteinPLDI’11

  3. Today • abstract interpretation • some basic math background • lattices • functions • Galois connection • A little more dataflow analysis • monotone framework

  4. Trace Semantics [z := 1]2 [y := x]1 [y := x]1; [z := 1]2; while [y > 0]3 ( [z := z * y]4; [y := y − 1]5; ) [y := 0]6 < 1,{ x42, y0, z0 } >  < 2,{ x42, y42, z0 } >  < 3,{ x42, y42, z1 } > < 4,{ x42, y42, z1 }> … • note that input (x) is unknown • while loop? • trace semantics is not computable [y > 0]3 [y := x]1 [z := 1]2 < 1,{ x73, y0, z0 } >  < 2,{ x73, y73, z0 } >  < 3,{ x73, y73, z1 } > < 4,{ x73, y73, z1 }> … [y > 0]3 …

  5. Abstract Interpretation abstract state abstract state S  • ’ abstract   C’’  S C C’ concrete set of states set of states

  6. Dataflow Analysis { (x,?), (y,?), (z,?) } { (x,?), (y,1), (z,?) } 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 • Reaching Definitions • The assignment lab: var := exp reaches lab’ if there is an execution where var was last assigned at lab { (x,?), (y,1), (z,2) } { (x,?), (y,1), (z,2) } { (x,?), (y,?), (z,4) } { (x,?), (y,5), (z,4) } { (x,?), (y,1), (z,2) } { (x,?), (y,?), (z,?) } (adapted from Nielson, Nielson & Hankin)

  7. Dataflow Analysis { (x,?), (y,?), (z,?) } { (x,?), (y,1), (z,?) } 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 • Reaching Definitions • The assignment lab: var := exp reaches lab’ if there is an execution where var was last assigned at lab { (x,?), (y,1), (z,2), (y,5), (z,4) } { (x,?), (y,1), (z,2), (y,5), (z,4) } { (x,?), (y,?), (z,4), (y,5) } { (x,?), (y,5), (z,4) } { (x,?), (y,1), (z,2), (y,5), (z,4) } { (x,?), (y,6), (z,2), (z,4) } (adapted from Nielson, Nielson & Hankin)

  8. Dataflow Analysis • Build control-flow graph • Assign transfer functions • Compute fixed point

  9. Control-Flow Graph 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 1: y:=x 2: z:=1 3: y > 0 6: y:=0 4: z=z*y 5: y=y-1

  10. Transfer Functions 1: y:=x in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) in(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } 2: z:=1 out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } 3: y > 0 out(3) = in(3) 6: y:=0 4: z=z*y out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } 5: y=y-1 out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) }

  11. System of Equations in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } F: ((Var x Lab) )12 ((Var x Lab) )12

  12. System of Equations in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } … F: ((Var x Lab) )12 ((Var x Lab) )12 in(1)={(x,?),(y,?),(z,?)} , in(2)=out(1), in(3)=out(2) U out(5), in(4)=out(3), in(5)=out(4),In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) }, out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3), out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) }, out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) }

  13. System of Equations … F: ((Var x Lab) )12 ((Var x Lab) )12 RD  RD’ when i: RD(i)  RD’(i)

  14. Monotonicity F: ((Var x Lab) )12 ((Var x Lab) )12 … out(1) = {(y,1)} when (x,?)  out(1) in(1) \ { (y,l) | l  Lab } U { (y,1) }, otherwise (silly example just for illustration)

  15. Convergence n n … … … 2 2 2 1 1 1 0 0 0

  16. Least Fixed Point • We will see later why it exists • For now, mostly informally… Fn(RD) F: ((Var x Lab) )12 ((Var x Lab) )12 RD  RD’ when i: RD(i)  RD’(i) F is monotone: RD  RD’ implies that F(RD)  F(RD’) RD = (, ,…,) F(RD), F(F(RD) , F(F(F(RD)), … Fn(RD) Fn+1(RD) = Fn(RD)  … F(F(RD))  F(RD)  RD

  17. Partially Ordered Sets (posets) • Partial ordering is a relation  : L  L  { true, false} that is: • Reflexive (l: l  l) • Transitive (l1,l2,l3: l1  l2  l2  l3  l1  l3) • Anti-symmetric ( l1  l2  l2  l1  l1 = l2) • A partially ordered set (L, ) is a set L equipped with a partial ordering  • For example: ((S), ) • Captures intuition of implication between facts • l1  l2 will intuitively mean l1  l2 • Later, in abstract domains , when l1  l2 , we will say that l1 is “more precise” than l2 (represents fewer concrete states)

  18. Bounds in Posets • Given a poset(L, ) • A subset Y of L has l  L as an upper bound if l’  Y : l’  l • A subset Y of L has l  L as a lower bound if l’  Y : l’  l (we write l’  l when l  l’) • A least upper bound (lub) l of Y is an upper bound of Y that satisfies l  l’ whenever l’ is another upper bound of Y • A greatest lower bound (glb) l of Y is a lower bound of Y that satisfies l  l’ whenever l’ is another lower bound of Y • For any subset Y  L • If the lub exists then it is unique, and denoted by Y (join) • We often write l1  l2 for { l1, l2} • If the glb exists then it is unique, and denoted by Y (meet) • We often write l1  l2 for { l1, l2}

  19. Complete Lattices • A complete lattice (L, ) is a poset such that all subsets have least upper bounds as well as greatest lower bounds • In particular •  =  = L is the least element (bottom) •  = L =  is the greatest element (top)

  20. Complete Lattices: Examples {1,2,3}  {1,3} {1,2} {2,3} - + {1} {3} {2} 0  

  21. Complete Lattices: Examples [- ,] … … [- ,2] [-2,] … … [-2,2] [-,1] [-1,] … … [- ,0] [0,] [-2,1] [-1,2] … … [-,-1] [1,] [-1,1] [0,2] [-2,0] … … [- ,-2] [2,] [-1,0] [-2,-1] [0,1] [1,2] … … [-2,-2] [-1,-1] [0,0] [1,1] [2,2] … … … 

  22. Moore Families • A Moore family is a subset Y of a complete lattice (L,) that is closed under greatest lower bounds • Y’  Y: Y’  Y • Hence, a Moore family is never empty • Always contains least element Y and a greatest element  • Why do we care? • We will see that Moore families correspond to choices of abstractions (as targets of upper closure operators) • More on this… later…

  23. Moore Families: Examples {1,2,3} {1,2,3} {1,3} {1,2} {1,2} {2,3} {2,3} {1} {3} {2} {2} {1,2,2}  

  24. Chains • In program analysis we will often look at sequences of properties of the form l0  l1  …  ln • Given a poset (L, ) a subset Y  L is a chain if every two elements in Y are ordered • l1,l2  Y: (l1  l2) or (l2  l1) • When Y is a finite subset we say that the chain is a finite chain • A sequence of elements l0  l1  …  L is an ascending chain • A sequence of elements l0  l1  …  L is a descending chain • L has finite height when all chains in L are finite • A poset (L, ) satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes, that is s N: n N: n  s  ln = ls

  25. Chains: Example Posets   0 … 2 -1 -1 0 1 … - …  1 -2 … 0 -  (a) (b) (c)

  26. Properties of Functions • A function f: L1  L2 between two posets (L1,1) and (L2, 2) is • Monotone (order-preserving) • l,l’  L1: l 1 l’  f(l) 2 f(l’) • Special case f: L  L, l,l’  L: l  l’  f(l)  f(l’) • Distributive (join morphism) • l,l’  L1 f(l  l’) = f(l)  f(l’)

  27. Function Properties: Examples {1,2,3} {1,3} {1,2} {2,3} {1} {3} {2} 

  28. Function Properties: Examples {1,2,3} {1,2,3} {1,2} {1,2} {2,3} {2,3} {2} {2}

  29. Function Properties: Examples {1,2,3} {1,2,3} {1,2} {1,2} {2,3} {2,3} {2} {2}

  30. Function Properties: Examples {1,2,3} {1,2,3} {1,3} {1,3} {1,2} {2,3} {1,2} {2,3} {1} {3} {1} {3} {2} {2}  

  31. Fixed Points  • Consider a complete lattice L = (L,,,,,) and a monotone function f: L  L • An element l  L is • A fixed point iff f(l) = l • A pre-fixedpointiff l  f(l) • A post-fixedpointiff l  f(l) • Fix(f) = set of all fixed points • Ext(f) = set of all pre-FP • Red(f) = set of all post-FP Red(f) fn() gfp Fix(f) lfp fn() Ext(f) 

  32. Tarski’s Theorem • Consider a complete lattice L = (L,,,,,) and a monotone function f: L  L, then lfp(f) and gfp(f) exist and • lfp(f) = Red(f) = Fix(f) • gfp(f) = Ext(f) = Fix(f)

  33. Kleene’sFixedpoint Theorem • Consider a complete lattice L = (L,,,,,) and a continuous function f: L  L thenlfp(f) = n  N fn() • A function f is continuous if for every increasing chain Y  L, f(Y) =  { f(y) | y Y } • Monotone functions on posets satisfying ACC are continuous • Every ascending chain eventually stabilizes l0  l1  …  ln = ln+1 = … hence ln is the least upper bound of { l0,l1,…,ln } , thus f(Y) = f(ln) • From monotonicity of f f(l0)  f(l1)  …  f(ln) = f(ln+1) = … Hence f(ln) is the least upper bound of { f(l0),f(l1), … ,f(ln) } , thus  { f(y) | y Y } = f(ln)

  34. An Algorithm for Computing lfp • Kleene’s fixed point theorem gives a constructive method for computing the lfp lfp(f) = n  N fn() l =  while f(l)  l do l = f(l) • In our case, can compute efficiently using a worklist algorithm for “chaotic iteration”

  35. Finally…

  36. Reaching Definitions… in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } F: ((Var x Lab) )12 ((Var x Lab) )12 l =  while F(l)  l do l = F(l)

  37. Algorithm Revisited • Input: equation system for RD • Output: lfp(F) • Algorithm: • RD1 = , … , RD12 =  • While RDj Fj(RD1,…,RD12) for some j • RDj = Fj(RD1,…,RD12) • Idea: non-deterministically select which component of RD should make a step • Can make selection efficiently based on dependencies • We have information on what equations depend on Fj that has been updated, need only recompute these on every step

  38. But… • where did the transfer functions come from? • e.g., out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } • how do we know transfer functions really over-approximate the concrete operations?

  39. Trace Semantics [z := 1]2 [y := x]1 [y := x]1; [z := 1]2; while [y > 0]3 ( [z := z * y]4; [y := y − 1]5; ) [y := 0]6 < 1,{ x42, y0, z0 } >  < 2,{ x42, y42, z0 } >  < 3,{ x42, y42, z1 } > < 4,{ x42, y42, z1 }> … • note that input (x) is unknown • while loop? • trace semantics is not computable [y > 0]3 [y := x]1 [z := 1]2 < 1,{ x73, y0, z0 } >  < 2,{ x73, y73, z0 } >  < 3,{ x73, y73, z1 } > < 4,{ x73, y73, z1 }> … [y > 0]3 …

  40. Instrumented Trace Semantics • on every assignment, record the label in which the assignment was performed

  41. Abstract Interpretation 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 (x,?)(y,?)(z,?)(y,1)(z,2)(y,6) • Instrumented trace semantics (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(z,4)(y,5)(y,6) …

  42. Collecting Semantics (x,?)(y,?)(z,?) (x,?)(y,?)(z,?) 1: y:=x (x,?)(y,?)(z,?)(y,1) (x,?)(y,?)(z,?)(y,1) 2: z:=1 (x,?)(y,?)(z,?) (y,1)(z,2) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5) 3: y > 0 (x,?)(y,?)(z,?)(y,1)(z,2) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5) 6: y:=0 4: z=z*y (x,?)(y,?)(z,?)(y,1)(z,2) (y,6) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6) 5: y=y-1 (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)

  43. System of Equations in(1) = { (x,?)(y,?)(z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = {  (y,1) |  in(1) } out(2) = {  (z,2) |  in(2) } out(3) = in(3) out(4) = {  (z,4) |  in(4) } out(5) = {  (y,5) |  in(5) } out(6) = {  (y,6) |  in(6) } lfp(G) … Gn(TR)  … G(G(TR))  G(TR)  Trace = (Var x Lab)* TR G: ((Trace) )12 ((Trace) )12

  44. Galois Connection  (A) A   C (C)  (C)  A  C  (A)

  45. Reaching Definitions: Abstraction • DOM() = variables in  • SRD()(v) = l iffrightmost pair for v in  is (v,l) • (C) = { (v,SRD()(v)) | v  DOM()    C } • (A) = {  | v  DOM() : (v,SRD()(v) )  A }  A (A) Set of Traces Set of (var,lab) pairs   C (x,?)(y,?)(z,?)(y,1)(z,2)(y,6) (C) (x,?),(z,2),(y,6) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6) (x,?),(z,2),(z,4),(y,6) (x,?),(z,4),(y,6)  (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6) (x,?),(z,4),(y,6)

  46. Reaching Definitions: Concretization • DOM() = variables in  • SRD()(v) = l iff rightmost pair for v in  is (v,l) • (C) = { (v,SRD()(v)) | v  DOM()    C } • (A) = {  | v  DOM() : (v,SRD()(v) )  A }  A (x,?),(z,4),(y,6) (A)   C (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6) (C) (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6) …  (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)…(z,4)(y,5)(y,6)

  47. Best (Induced) Transformer  C’ S ? A’ C S#  A How do we figure out the abstract effect S#of a statement S ?

  48. Sound Abstract Transformer  A’  C’ Aopt S S# C  A   S   (A)  S#(A)

  49. Soundness of Induced Analysis lfp(G) lfp(F) …    Gn(TR)  … (lfp(G)) … F(F(RD)) G(G(TR))   F(RD) G(TR)   RD TR (lfp(G))  lfp(  G  )  lfp(F)

  50. Back to a bit of dataflow analysis…

More Related