Noam Rinetzky Lecture 9: Abstract Interpretation I

Program Analysis and Verification 0368-4479 http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html Noam Rinetzky Lecture 9: Abstract Interpretation I. Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav. We begin … Mobiles, Scribe, Home assignment – next lesson.


Presentation Transcript


  1. Program Analysis and Verification 0368-4479 http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html Noam Rinetzky Lecture 9: Abstract Interpretation I Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

  2. We begin … • Mobiles • Scribe • Home assignment – next lesson

  3. Previously • Operational Semantics • Large step (Natural) • Small step (SOS) • Denotational Semantics • aka mathematical semantics • Axiomatic Semantics • aka Hoare Logic • aka axiomatic (manual) verification How? What? Why?

  4. From verification to analysis • Manual program verification • Verifier provides assertions • Loop invariants • Automatic program verification (P analysis) • Tool automatically synthesizes assertions • Finds loop invariants

  5. Manual proof for max
     nums : array
     N : int
     { N≥0 }
     x := 0
     { N≥0 ∧ x=0 }
     res := nums[0]
     { x=0 }
     Inv = { x≤N }
     while x < N
       { x=k ∧ k<N }
       if nums[x] > res then
         { x=k ∧ k<N }
         res := nums[x]
         { x=k ∧ k<N }
       { x=k ∧ k<N }
       x := x + 1
       { x=k+1 ∧ k<N }
     { x≤N ∧ x≥N }
     { x=N }

  6. Manual proof for max
     nums : array
     N : int
     { N≥0 }
     x := 0
     { N≥0 ∧ x=0 }
     res := nums[0]
     { x=0 }
     Inv = { x≤N }
     while x < N
       { x=k ∧ k<N }
       if nums[x] > res then
         { x=k ∧ k<N }
         res := nums[x]
         { x=k ∧ k<N }
       { x=k ∧ k<N }
       x := x + 1
       { x=k+1 ∧ k<N }
     { x≤N ∧ x≥N }
     { x=N }

  7. Can we find this proof automatically?
     nums : array
     N : int
     { N≥0 }
     x := 0
     { N≥0 ∧ x=0 }
     res := nums[0]
     { x=0 }
     Inv = { x≤N }
     while x < N
       { x=k ∧ k<N }
       if nums[x] > res then
         { x=k ∧ k<N }
         res := nums[x]
         { x=k ∧ k<N }
       { x=k ∧ k<N }
       x := x + 1
       { x=k+1 ∧ k<N }
     { x≤N ∧ x≥N }
     { x=N }
     Observation: predicates in the proof have the general form ⋀ constraint, where each constraint has the form X - Y ≤ c or ±X ≤ c

  8. Zone Abstract Domain (Analysis) • Developed by Antoine Miné in his Ph.D. thesis • Uses constraints of the form X - Y ≤ c and ±X ≤ c • Built on top of Difference Bound Matrices (DBM) and shortest-path algorithms • O(n³) time • O(n²) space
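
The shortest-path step mentioned above can be sketched as follows. This is a minimal illustration, not Miné's implementation: the matrix layout (entry `dbm[i][j]` bounds `V_i - V_j`), the extra "zero" variable at index 0, and the name `close` are all assumptions made for the example.

```python
import math

INF = math.inf  # "no constraint" between two variables

def close(dbm):
    """Tighten a difference-bound matrix with Floyd-Warshall (O(n^3) time).

    dbm[i][j] = c encodes the constraint V_i - V_j <= c. Index 0 is an
    assumed "zero" variable, so V_i - V_0 <= c encodes V_i <= c.
    Returns None when the constraints are unsatisfiable (negative cycle).
    """
    n = len(dbm)
    m = [row[:] for row in dbm]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # V_i - V_k <= a and V_k - V_j <= b imply V_i - V_j <= a + b
                if m[i][k] + m[k][j] < m[i][j]:
                    m[i][j] = m[i][k] + m[k][j]
    if any(m[i][i] < 0 for i in range(n)):
        return None
    return m

# Variables: index 0 = zero, 1 = x, 2 = N.
# Constraints inside the loop body: x >= 0 (0 - x <= 0) and x < N (x - N <= -1).
dbm = [
    [0,   0,   INF],
    [INF, 0,   -1],
    [INF, INF, 0],
]
closed = close(dbm)
# closed[0][2] is the derived bound on 0 - N, i.e. a lower bound on N.
```

The closed matrix makes the implied constraint `0 - N <= -1` (that is, N ≥ 1) explicit, which is the kind of fact the analysis needs without the user writing it down.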

  9. Analysis with Zone abstract domain
     Static Analysis with Zone Abstraction:
     nums : array
     N : int
     { N≥0 }
     x := 0
     { N≥0 ∧ x=0 }
     res := nums[0]
     { N≥0 ∧ x=0 }
     Inv = { N≥0 ∧ 0≤x≤N }
     while x < N
       { N≥0 ∧ 0≤x<N }
       if nums[x] > res then
         { N≥0 ∧ 0≤x<N }
         res := nums[x]
         { N≥0 ∧ 0≤x<N }
       { N≥0 ∧ 0≤x<N }
       x := x + 1
       { N≥0 ∧ 0<x≤N }
     { N≥0 ∧ 0≤x ∧ x=N }
     Manual Proof:
     nums : array
     N : int
     { N≥0 }
     x := 0
     { N≥0 ∧ x=0 }
     res := nums[0]
     { x=0 }
     Inv = { x≤N }
     while x < N
       { x=k ∧ k<N }
       if nums[x] > res then
         { x=k ∧ k<N }
         res := nums[x]
         { x=k ∧ k<N }
       { x=k ∧ k<N }
       x := x + 1
       { x=k+1 ∧ k<N }
     { x≤N ∧ x≥N }
     { x=N }

  10. Abstract Interpretation [Cousot’77] • Mathematical foundation of static analysis

  11. Abstract Interpretation [Cousot’77] • Mathematical foundation of static analysis • Abstract (semantic) domains (“abstract states”) • Transformer functions (“abstract steps”) • Chaotic iteration (“abstract computation”)

  12. Abstract Interpretation [CC77] • A very general mathematical framework for approximating semantics • Generalizes Hoare Logic • Generalizes weakest precondition calculus • Allows designing sound static analysis algorithms • Usually computed by iterating to a fixed-point • Not specific to any programming language style • Results of an abstract interpretation are (loop) invariants • Can be interpreted as axiomatic verification assertions and used for verification

  13. Abstract Interpretation by Example

  14. Motivating Application: Optimization • A compiler optimization is defined by a program transformation: T : Prog → Prog • The transformation is semantics-preserving: ∀s ∈ State. Ssos⟦C⟧ s = Ssos⟦T(C)⟧ s • The transformation is applied to the program only if an enabling condition is met • We use static analysis for inferring enabling conditions

  15. Common Subexpression Elimination • If we have two variable assignments x := a op b … y := a op b and the values of x, a, and b have not changed between the assignments, rewrite the code as x := a op b … y := x • Eliminates useless recalculation • Paves the way for more optimizations • e.g., dead code elimination • op ∈ {+, -, *, ==, <=}

  16. What do we need to prove?
     { true }
     C1
     x := a op b
     C2
     { x = a op b }
     y := a op b
     C3
     CSE ⇒
     { true }
     C1
     x := a op b
     C2
     { x = a op b }
     y := x
     C3

  17. Available Expressions Analysis • A static analysis that infers for every program point a set of facts of the form AV = { x = y | x, y ∈ Var } ∪ { x = -y | x, y ∈ Var } ∪ { x = y op z | x, y, z ∈ Var, op ∈ {+, -, *, <=} } • For every program with n=|Var| variables the number of possible facts is finite: |AV|=O(n³) • Yields a trivial algorithm … but, is it efficient?

  18. Which proof is more desirable?
     { true } x := a + b { x=a+b } z := a + c { x=a+b } y := a + b ...
     { true } x := a + b { x=a+b } z := a + c { z=a+c } y := a + b ...
     { true } x := a + b { x=a+b } z := a + c { x=a+b ∧ z=a+c } y := a + b ...

  19. Which proof is more desirable?
     { true } x := a + b { x=a+b } z := a + c { x=a+b } y := a + b ...
     { true } x := a + b { x=a+b } z := a + c { z=a+c } y := a + b ...
     { true } x := a + b { x=a+b } z := a + c { x=a+b ∧ z=a+c } y := a + b ...
     More detailed predicate = more optimization opportunities
     x=a+b ∧ z=a+c ⇒ x=a+b
     x=a+b ∧ z=a+c ⇒ z=a+c
     Implication formalizes the “more detailed” relation between predicates

  20. Developing a theory of approximation • Formulae are suitable for many analysis-based proofs but we may want to represent predicates in other ways: • Sets of “facts” • Automata • Linear (in)equalities • … ad-hoc representation • Wanted: a uniform theory to represent semantic values and approximations

  21. A Blast from the Past • Recall denotational semantics • Pre-orders • Partial orders • Complete partial orders CPO, • Pointed CPO • Lattices • Least upper bounds • Chains • Monotonic functions • Fixpoints

  22. Abstract Domains • Mathematical foundations

  23. Preorder • We say that a binary order relation ⊑ over a set D is a preorder if the following conditions hold for every d, d’, d’’ ∈ D • Reflexive: d ⊑ d • Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’ • There may exist d, d’ such that d ⊑ d’ and d’ ⊑ d yet d ≠ d’

  24. Preorder example • Simple Available Expressions • Define SAV = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var } • For D=2^SAV (sets of available expressions) define (for two elements A1, A2 ∈ D) A1 ⊑imp A2 if and only if ⋀A1 ⇒ ⋀A2 • A1 is “more detailed” if it implies all facts of A2 • Compare {x=y ∧ x=a+b} with {x=y ∧ y=a+b} • Which one should we choose? Can we decide A1 ⊑imp A2 ?

  25. The meaning of implication • A predicate P represents the set of states models(P) = { s | s ⊨ P } • P ⇒ Q means models(P) ⊆ models(Q)
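
The reading of implication as set inclusion can be tried out on a small scale. The sketch below is only an approximation: it enumerates states over an assumed finite range of integer values, whereas the real state space is infinite, and the names `models` and `implies` as Python functions are hypothetical.

```python
from itertools import product

def models(pred, variables, values):
    """All states (tuples of values, one per variable) that satisfy pred."""
    return {
        vals
        for vals in product(values, repeat=len(variables))
        if pred(dict(zip(variables, vals)))
    }

def implies(p, q, variables=("x", "y"), values=range(-3, 4)):
    """P => Q, checked as models(P) being a subset of models(Q)."""
    return models(p, variables, values) <= models(q, variables, values)

# Two example predicates over the variables x and y.
p = lambda s: s["x"] == 0 and s["y"] >= 0   # x = 0 and y >= 0
q = lambda s: s["x"] <= s["y"]              # x <= y

p_implies_q = implies(p, q)
q_implies_p = implies(q, p)
```

Here `p` is the more detailed predicate: every state it admits is also admitted by `q`, but not the other way around (for instance x = -1, y = 0 satisfies only `q`).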

  26. Partially ordered sets • A partially ordered set (poset) is a pair (D, ⊑) • D is a set of elements – a (semantic) domain • ⊑ is a partial order between pairs of elements from D. That is ⊑ ⊆ D × D with the following properties, for all d, d’, d’’ in D • Reflexive: d ⊑ d • Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’ • Anti-symmetric: d ⊑ d’ and d’ ⊑ d implies d = d’ (unique “most detailed” element) • Notation: if d ⊑ d’ and d ≠ d’ we write d ⊏ d’

  27. From preorders to partial orders • We can transform a preorder into a poset by • Coarsening the ordering • Switching to a canonical form by choosing a representative d* for the set of equivalent elements { d’ | d ⊑ d’ and d’ ⊑ d }

  28. Coarsening for SAV • For D=2^SAV (sets of available expressions) define (for two elements A1, A2 ∈ D) A1 ⊑coarse A2 if and only if A1 ⊇ A2 • Notice that if A1 ⊇ A2 then ⋀A1 ⇒ ⋀A2 • Compare {x=y ∧ x=a+b} with {x=y ∧ y=a+b} • How about {x=y ∧ x=a+b ∧ y=a+b} ?

  29. Canonical form for SAV • For an available expressions element A define Explicate(A) = minimal set B such that: • A ⊆ B • x=y ∈ B implies y=x ∈ B • x=y ∈ B and y=z ∈ B implies x=z ∈ B • x=y+z ∈ B implies x=z+y ∈ B • x=y ∈ B and x=z+w ∈ B implies y=z+w ∈ B • x=y ∈ B and z=x+w ∈ B implies z=y+w ∈ B • x=z+w ∈ B and y=z+w ∈ B implies x=y ∈ B • Makes all implicit facts explicit • Define A* = Explicate(A) • Define (for two elements A1, A2 ∈ D) A1 ⊑exp A2 if and only if A1* ⊇ A2* • Lemma: A1 ⊑exp A2 if and only if A1 ⊑imp A2. Therefore A1 ⊑imp A2 is decidable
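
The closure rules above can be sketched as a naive fixpoint computation. The encoding is an assumption made for this example: a fact x=y is the tuple `("eq", x, y)` and a fact x=y+z is `("add", x, y, z)`; the function names are hypothetical as well.

```python
def explicate(facts):
    """Close a set of SAV facts under the rules above (naive fixpoint)."""
    closed = set(facts)
    while True:
        new = set()
        for f in closed:
            if f[0] == "eq":
                _, x, y = f
                new.add(("eq", y, x))                    # x=y implies y=x
                for g in closed:
                    if g[0] == "eq" and g[1] == y:
                        new.add(("eq", x, g[2]))         # x=y, y=z implies x=z
                    if g[0] == "add" and g[1] == x:
                        new.add(("add", y, g[2], g[3]))  # x=y, x=z+w implies y=z+w
                    if g[0] == "add" and g[2] == x:
                        new.add(("add", g[1], y, g[3]))  # x=y, z=x+w implies z=y+w
            else:
                _, x, y, z = f
                new.add(("add", x, z, y))                # x=y+z implies x=z+y
                for g in closed:
                    if g[0] == "add" and g[2:] == (y, z):
                        new.add(("eq", x, g[1]))         # same right-hand side
        if new <= closed:        # nothing new: fixpoint reached
            return closed
        closed |= new

def leq_exp(a1, a2):
    """A1 is below A2 when Explicate(A1) contains Explicate(A2)."""
    return explicate(a1) >= explicate(a2)

# The two sets compared on the preorder slide: {x=y, x=a+b} and {x=y, y=a+b}.
a1 = {("eq", "x", "y"), ("add", "x", "a", "b")}
a2 = {("eq", "x", "y"), ("add", "y", "a", "b")}
same_canonical_form = explicate(a1) == explicate(a2)
```

The loop terminates because only finitely many facts exist over the variables that occur in the input. Both example sets explicate to the same canonical set, so they become a single element of the resulting poset.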

  30. Some posets-related terminology • If x ⊑ y we can say • x is lower than y • x is more precise than y • x is more concrete than y • x under-approximates y • y is greater than x • y is less precise than x • y is more abstract than x • y over-approximates x

  31. Visualizing ordering for SAV
     Greater
       {true}
       {x=y}*   {y=z+a}*   {p=q}*
       {x=y ∧ y=z+a}*   {x=y ∧ p=q}*   {y=z+a ∧ p=q}*
       {false}
     Lower
     D = {x=y, y=x, p=q, q=p, y=z+a, y=a+z, z=y+z, x=z+a}

  32. Pointed poset • A poset (D, ⊑) with a least element ⊥ is called a pointed poset • For all d ∈ D we have that ⊥ ⊑ d • The pointed poset is denoted by (D, ⊑, ⊥) • We can always transform a poset (D, ⊑) into a pointed poset by adding a special bottom element: (D ∪ {⊥}, ⊑ ∪ {⊥ ⊑ d | d ∈ D}, ⊥) • Greatest element for SAV = {true = ?} • Least element for SAV = {false = ?}

  33. Annotating conditions
     { P }
     if b then
       { b ∧ P }
       S1
       { Q1 }
     else
       S2
       { Q2 }
     { Q }

     { b ∧ P } S1 { Q },  { ¬b ∧ P } S2 { Q }
     ----------------------------------------- [ifp]
     { P } if b then S1 else S2 { Q }

     Q approximates Q1 and Q2
     We need a general way to approximate a set of semantic elements by a single semantic element

  34. Join operator • Assume a poset (D, ⊑) • Let X ⊆ D be a subset of D (finite/infinite) • The join of X is defined as • ⊔X = the least upper bound (LUB) of all elements in X if it exists • ⊔X = min{ b | for all x ∈ X we have that x ⊑ b } • The supremum of the elements in X • A kind of abstract union (disjunction) operator • Properties of a join operator • Commutative: x ⊔ y = y ⊔ x • Associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z) • Idempotent: x ⊔ x = x

  35. Meet operator • Assume a poset (D, ⊑) • Let X ⊆ D be a subset of D (finite/infinite) • The meet of X is defined as • ⊓X = the greatest lower bound (GLB) of all elements in X if it exists • ⊓X = max{ b | for all x ∈ X we have that b ⊑ x } • The infimum of the elements in X • A kind of abstract intersection (conjunction) operator • Properties of a meet operator • Commutative: x ⊓ y = y ⊓ x • Associative: (x ⊓ y) ⊓ z = x ⊓ (y ⊓ z) • Idempotent: x ⊓ x = x
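
For sets of available-expression facts ordered by reverse inclusion (the coarse ordering, where a larger set of facts is lower), join and meet have a simple concrete reading. The sketch below assumes facts are plain strings and that the collection passed in is non-empty; the function names are hypothetical.

```python
def leq(a1, a2):
    """Coarse ordering: A1 is below A2 when A1 contains every fact of A2."""
    return a1 >= a2

def join(elements):
    """Least upper bound: the facts common to every element."""
    return frozenset.intersection(*elements)

def meet(elements):
    """Greatest lower bound: all facts of all elements together."""
    return frozenset.union(*elements)

# Facts holding at the end of the two branches of a conditional.
q1 = frozenset({"x=a+b", "z=a+c"})
q2 = frozenset({"x=a+b"})

q = join([q1, q2])       # a single element approximating both branches
both = meet([q1, q2])
```

Note the direction: because more facts means lower in this ordering, the join is a set intersection and the meet is a set union. The join of `q1` and `q2` keeps only the fact that holds on both branches, which is what the annotation after a conditional needs.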

  36. Complete lattices • A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is • A set of elements D • A partial order x ⊑ y • A join operator ⊔ • A meet operator ⊓ • A bottom element ⊥ = ? • A top element ⊤ = ?

  37. Complete lattices • A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is • A set of elements D • A partial order x ⊑ y • A join operator ⊔ • A meet operator ⊓ • A bottom element ⊥ = ⊔∅ • A top element ⊤ = ⊔D

  38. Transfer Functions • Mathematical foundations

  39. Towards an automatic proof • Goal: automatically compute an annotated program proving as many facts of the form x = y + z as possible • Decision 1: develop a forward-going proof • Decision 2: draw predicates from a finite set D • “looking under the light of the lamp” • A compromise that simplifies the problem by focusing attention – possibly missing some facts that hold • Challenge 1: handle straight-line code • Challenge 2: handle conditions • Challenge 3: handle loops

  40. Domain for SAV • Define atomic facts (for SAV) as φ = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var } • For n=|Var| the number of atomic facts is O(n³) • Define sav-predicates as Φ = 2^φ • For D ⊆ φ, Conj(D) = ⋀D • Conj({a=b, c=b+d, b=c}) = (a=b) ∧ (c=b+d) ∧ (b=c) • Note: • Conj(D1 ∪ D2) = Conj(D1) ∧ Conj(D2) • Conj({}) ≡ true
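
A small sketch of this domain, assuming facts are represented as strings such as `"x=y"` and `"x=y+z"` (an encoding chosen only for the example; the function names are hypothetical):

```python
from itertools import product

def atomic_facts(variables):
    """All facts x=y and x=y+z over the given variables."""
    copies = {f"{x}={y}" for x, y in product(variables, repeat=2)}
    sums = {f"{x}={y}+{z}" for x, y, z in product(variables, repeat=3)}
    return copies | sums

def conj(facts):
    """Conj(D): the conjunction of the facts in D; the empty set gives true."""
    return " ∧ ".join(sorted(facts)) or "true"

facts = atomic_facts(["a", "b", "c"])   # n = 3 variables
formula = conj({"a=b", "b=c"})
```

For n variables the enumeration produces n² + n³ facts, which is where the O(n³) bound comes from.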

  41. Challenge 1: handling straight-line code

  42. Handling straight-line code: Goal • Given a program of the form x1 := a1; … xn := an • Find predicates P0, …, Pn such that • {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof • sp(xi := ai, Pi-1) ⇒ Pi • Pi = Conj(Di) • Di is a set of simple (SAV) facts

  43. Example
     { }
     x := a + b
     { x=a+b }
     z := a + c
     { x=a+b, z=a+c }
     b := a * c
     { z=a+c }
     • Find a proof that satisfies both conditions

  44. Example
     { }
     x := a + b
     { x=a+b }
     z := a + c
     { x=a+b, z=a+c }
     b := a * c
     { z=a+c }
     Labels: sp, cons, Frame
     • Can we make this into an algorithm?

  45. Algorithm for straight-line code • Goal: find predicates P0, …, Pn such that • {P0} x1 := a1 {P1} … {Pn-1} xn := an {Pn} is a proof. That is: sp(xi := ai, Pi-1) ⇒ Pi • Each Pi has the form Conj(Di) where Di is a set of simple (SAV) facts • Idea: define a function FSAV[x:=a] : Φ → Φ s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’) • We call F the abstract transformer for x:=a • Initialize D0={} • For each i: compute Di = FSAV[xi := ai](Di-1) • Finally Pi = Conj(Di)

  46. Defining an SAV abstract transformer • Goal: define a function FSAV[x:=a] : Φ → Φ s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’) • Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule

  47. Defining an SAV abstract transformer • Goal: define a function FSAV[x:=a] : Φ → Φ s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’) • Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule

  48. Defining an SAV abstract transformer • Goal: define a function FSAV[x:=a] : Φ → Φ s.t. if FSAV[x:=a](D) = D’ then sp(x := a, Conj(D)) ⇒ Conj(D’) • Idea: define rules for individual facts and generalize to sets of facts by the conjunction rule
     [preserve]    { y=z+w } x:=a { y=z+w }
     [kill-rhs-1]  { y=x+w } x:=a { }
     [kill-rhs-2]  { y=w+x } x:=a { }
     [gen]         { } x:=β { x=β }
     [kill-lhs]    { x=β } x:=a { }
     β is either a variable v or an addition expression v+w

  49. SAV abstract transformer example
     { }
     x := a + b
     { x=a+b }
     z := a + c
     { x=a+b, z=a+c }
     b := a * c
     { z=a+c }
     [preserve]    { y=z+w } x:=a { y=z+w }
     [kill-rhs-1]  { y=x+w } x:=a { }
     [kill-rhs-2]  { y=w+x } x:=a { }
     [gen]         { } x:=β { x=β }
     [kill-lhs]    { x=β } x:=a { }
     β is either a variable v or an addition expression v+w
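
The transformer rules and the straight-line algorithm can be put together in a short sketch. The encoding is an assumption made for this example: an expression is a tuple such as `("a",)` or `("a", "+", "b")`, a fact is a pair `(lhs, expr)`, and an assignment is a pair `(x, expr)`. One further assumption that is not on the slides: the gen rule is skipped when x occurs in its own right-hand side, since a fact like x=x+1 would not hold after the assignment.

```python
def variables_of(expr):
    """Variables of an expression tuple such as ("a",) or ("a", "+", "b")."""
    return {tok for tok in expr if tok not in {"+", "*"}}

def f_sav(stmt, facts):
    """Abstract transformer for the assignment stmt = (x, expr)."""
    x, expr = stmt
    # [kill-lhs], [kill-rhs-1], [kill-rhs-2]: drop every fact that mentions x.
    # [preserve]: all other facts are kept unchanged.
    out = {(lhs, rhs) for lhs, rhs in facts
           if lhs != x and x not in variables_of(rhs)}
    # [gen]: record x = expr when expr is a variable or an addition.
    # Assumption (not on the slides): skip it when x occurs in expr.
    is_sav = len(expr) == 1 or (len(expr) == 3 and expr[1] == "+")
    if is_sav and x not in variables_of(expr):
        out.add((x, expr))
    return out

def analyze(program):
    """Return D0, ..., Dn for a straight-line program, starting from D0 = {}."""
    states = [set()]
    for stmt in program:
        states.append(f_sav(stmt, states[-1]))
    return states

# The example program: x := a + b; z := a + c; b := a * c
program = [
    ("x", ("a", "+", "b")),
    ("z", ("a", "+", "c")),
    ("b", ("a", "*", "c")),
]
states = analyze(program)
```

On this program the computed sets match the annotations of the example: after the third assignment only z=a+c survives, because the assignment to b kills x=a+b and a product is not an SAV fact.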

  50. Problem 1: large expressions
     { }
     x := a + b + c
     { }
     y := a + b + c
     { }
     Missed CSE opportunity
     • Large expressions on the right-hand sides of assignments are problematic • Can miss optimization opportunities • Require complex transformers • Solution: transform code to normal form where right-hand sides have bounded size
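
One way to reach such a normal form is to split a long sum into assignments with a single addition each, using fresh temporaries. The sketch below is an illustration under assumptions: it only handles sums of variables, and the temporary names `t1`, `t2`, … and the function name `normalize` are hypothetical.

```python
from itertools import count

def normalize(target, operands, fresh=None):
    """Split target := o1 + o2 + ... + on into assignments with one '+' each.

    Returns a list of (variable, operands) pairs whose right-hand sides have
    at most two operands. `fresh` is an optional iterator of temporary names;
    pass one shared iterator when normalizing several statements.
    """
    fresh = fresh or (f"t{i}" for i in count(1))
    if len(operands) <= 2:
        return [(target, tuple(operands))]
    out = []
    acc = operands[0]
    for operand in operands[1:-1]:
        tmp = next(fresh)
        out.append((tmp, (acc, operand)))   # tmp := acc + operand
        acc = tmp
    out.append((target, (acc, operands[-1])))
    return out

# x := a + b + c
normal_form = normalize("x", ["a", "b", "c"])
```

After this step every right-hand side is a variable or a single addition, so the SAV facts can describe it and the repeated computation of a + b becomes visible to the analysis.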
