
Annotations for (more) Precise Points-to Analysis

Mike Barnett – Microsoft Research, USA; Manuel Fähndrich – Microsoft Research, USA; Diego Garbervetsky – DC, FCEyN, UBA, Argentina; Francesco Logozzo – Microsoft Research, USA. IWACO’07.


Presentation Transcript


  1. Mike Barnett – Microsoft Research, USA Manuel Fähndrich – Microsoft Research, USA Diego Garbervetsky – DC. FCEyN. UBA, Argentina Francesco Logozzo – Microsoft Research, USA - IWACO’07 - Annotations for (more) Precise Points-to Analysis

  2. Original Motivation
     • Objective
       • Have a points-to and effect analysis to reason (among other things) about (weak) purity in .NET programs
     • Approach
       • Try to use the Salcianu-Rinard points-to analysis
     • Problems
       • Their heap model supports neither struct types nor parameter passing by reference
       • It relies on having a complete call graph for the application
       • It is very conservative for non-analyzable methods

  3. Motivating Example
     List<int> Copy(IEnumerable<int> src) {
       List<int> l = new List<int>();
       foreach (int x in src)
         l.Add(x);
       return l;
     }
     Is Copy (weakly) pure?
     • It is difficult to predict the runtime type of src and iter
     • We cannot predict the effect on src of methods applied to iter
     …
     List<int> Copy(IEnumerable<int> src) {
       List<int> l = new List<int>();
       IEnumerator<int> iter = src.GetEnumerator();
       while (iter.MoveNext()) {
         int x = iter.get_Current();
         l.Add(x);
       }
       return l;
     }
     • May GetEnumerator modify src?
     • May MoveNext or get_Current indirectly modify src?

  4. Our work
     • Interprocedural points-to and read/write effects analysis
       • Based on Salcianu’s points-to and purity analysis
       • Supports some .NET features
         • Managed pointers, struct types
       • Extended support for non-analyzable calls
     • A small annotation language
       • Represents points-to and effects information
       • Leverages some Spec# annotations (ownership)
     • Implementation in the Spec# compiler
       • Used to infer/verify method purity
       • Reentrancy analysis
       • Checking specification admissibility in the Boogie methodology

  5. Salcianu’s analysis reminder
     Main abstraction: points-to graphs, PTG = <I, O, L, E>
     Models the part of the heap accessed by the method
     void m1(A p1, A p2) {
       p2.g = p1.f;
     }
     • Inside nodes (objects allocated by m)
     • Load nodes (placeholders for unknown objects read from outside the scope of m)
     • Parameter nodes (represent the object(s) passed as parameters)
     • Inside edges: references created by m
     • Outside edges: references read from outside the scope of m
     • W = set of write effects (n, field). W: {P2.f1}
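
The graph shape on this slide can be sketched in C#. This is a minimal illustration, not the authors' implementation: every type and member name (`PtgNode`, `PointsToGraph`, `Load`, `Store`, `PtgSketch`) is hypothetical, the E component is left out, and the two transfer rules are simplified. The sketch uses the `p2.f1 = p1.f2` form of `m1` that appears later in the deck, which matches the write effect `P2.f1` listed on the slide.

```csharp
using System.Collections.Generic;

// Hypothetical names; a simplified sketch of a points-to graph.
public enum NodeKind { Inside, Load, Parameter }

public sealed class PtgNode
{
    public string Name { get; }
    public NodeKind Kind { get; }
    public PtgNode(string name, NodeKind kind) { Name = name; Kind = kind; }
}

public sealed class PointsToGraph
{
    // I: references created by the method.
    public HashSet<(PtgNode From, string Field, PtgNode To)> InsideEdges { get; }
        = new HashSet<(PtgNode From, string Field, PtgNode To)>();
    // O: references read from outside the method's scope.
    public HashSet<(PtgNode From, string Field, PtgNode To)> OutsideEdges { get; }
        = new HashSet<(PtgNode From, string Field, PtgNode To)>();
    // L: nodes each local variable or parameter may point to.
    public Dictionary<string, HashSet<PtgNode>> Locals { get; }
        = new Dictionary<string, HashSet<PtgNode>>();
    // W: write effects as (node, field) pairs.
    public HashSet<(PtgNode Node, string Field)> WriteEffects { get; }
        = new HashSet<(PtgNode Node, string Field)>();

    private int loadCount;

    public void AddParameter(string variable, string nodeName)
    {
        Locals[variable] = new HashSet<PtgNode> { new PtgNode(nodeName, NodeKind.Parameter) };
    }

    // dst = src.field
    public void Load(string dst, string src, string field)
    {
        var targets = new HashSet<PtgNode>();
        foreach (var n in Locals[src])
        {
            foreach (var e in InsideEdges)
                if (e.From == n && e.Field == field) targets.Add(e.To);
            // Simplification: any non-inside node may hold references we cannot see,
            // so the read is modelled by a fresh load node and an outside edge.
            if (n.Kind != NodeKind.Inside)
            {
                var load = new PtgNode("L" + (++loadCount), NodeKind.Load);
                OutsideEdges.Add((n, field, load));
                targets.Add(load);
            }
        }
        Locals[dst] = targets;
    }

    // dst.field = src
    public void Store(string dst, string field, string src)
    {
        foreach (var n in Locals[dst])
        {
            foreach (var m in Locals[src]) InsideEdges.Add((n, field, m));
            // Simplification: only writes to objects that existed before the call are effects.
            if (n.Kind != NodeKind.Inside) WriteEffects.Add((n, field));
        }
    }
}

public static class PtgSketch
{
    // Analyzes: void m1(A p1, A p2) { p2.f1 = p1.f2; }
    public static PointsToGraph AnalyzeM1()
    {
        var g = new PointsToGraph();
        g.AddParameter("p1", "P1");
        g.AddParameter("p2", "P2");
        g.Load("tmp", "p1", "f2");   // tmp = p1.f2
        g.Store("p2", "f1", "tmp");  // p2.f1 = tmp
        return g;
    }
}
```

Calling `PtgSketch.AnalyzeM1()` yields one outside edge from `P1` through `f2` to a load node, one inside edge from `P2` through `f1` to that load node, and a single write effect on `(P2, f1)`.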

  6. Salcianu’s analysis reminder
     void m0(A a2) {
       A a1 = new A();
       B b = new B();
       m1(a1, a2);
     }
     void m1(A p1, A p2) {
       p2.g = p1.f;
     }
     What happens if m1 is not analyzable?
     W: {a2.f1}
     • μ :: Node → P(Node) relates every node n in the callee’s summary to a set of existing or fresh nodes in the caller (nodes(Pm) ∪ nodes(Pcallee))
     • Fixpoint calculation
       • Match arguments with parameters
       • Match reads from the callee with writes of the caller (outside edge disambiguation)
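
The two matching steps above can be sketched as a small fixpoint loop. This is a simplified illustration under stated assumptions, not Salcianu's full rule set: nodes are plain strings, `MuSketch.ComputeMu` and its parameter names are hypothetical, and the cases where a load node must also be kept as itself are omitted.

```csharp
using System.Collections.Generic;

// Hypothetical helper; edges are (source node, field, target node) triples.
public static class MuSketch
{
    public static Dictionary<string, HashSet<string>> ComputeMu(
        Dictionary<string, HashSet<string>> argumentBinding,
        IReadOnlyList<(string From, string Field, string To)> calleeOutsideEdges,
        IReadOnlyList<(string From, string Field, string To)> callerInsideEdges)
    {
        // Step 1: match arguments with parameters.
        var mu = new Dictionary<string, HashSet<string>>();
        foreach (var kv in argumentBinding)
            mu[kv.Key] = new HashSet<string>(kv.Value);

        // Step 2: match callee reads (outside edges) with caller writes (inside edges)
        // until no new pair can be added.
        bool changed = true;
        while (changed)
        {
            changed = false;
            foreach (var read in calleeOutsideEdges)
            {
                if (!mu.TryGetValue(read.From, out var sources)) continue;
                foreach (var write in callerInsideEdges)
                {
                    if (write.Field != read.Field || !sources.Contains(write.From)) continue;
                    if (!mu.TryGetValue(read.To, out var targets))
                        mu[read.To] = targets = new HashSet<string>();
                    if (targets.Add(write.To)) changed = true;
                }
            }
        }
        return mu;
    }
}
```

For example, if the callee reads `P1.f` into load node `L1`, the caller binds `P1` to an inside node `I1`, and the caller has previously stored `I1.f = I2`, then the loop adds `I2` to μ(`L1`).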

  7. First extension
     • A new set of nodes: address nodes
     • A new level of indirection
       • Every variable or field is represented by its address
       • For objects and primitive values, an outgoing edge (labeled *) means “the contents of”
       • For struct types, the outgoing edges are fields
     void m1(A p1, A p2) {
       p2.g = p1.f;
     }

  8. Second extension
     • Annotations to improve the precision of the analysis of non-analyzable calls
     • Approach
       • Add a few annotations to non-analyzable methods (points-to and effects information)
       • Compute a PTG for every annotated method
         • We need to enrich the PTG model
       • Treat every method as analyzable
       • When the method code is available, check it against the annotations

  9. Why annotations?
     • Useful for interfaces and abstract methods
       • There is no source code
       • Virtual calls
       • Documentation
       • Impose restrictions on implementing classes
     • Useful for native code and 3rd party libraries
       • Source not available
     • Useful for local reasoning in program analysis
       • Used as summaries for method calls
       • Assumed valid for inter-procedural analysis
       • The callee will eventually be checked

  10. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [Fresh] [Escapes(true)] [GlobalAccess(false)]
     IEnumerator GetEnumerator();
     bool MoveNext();

  11. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
     IEnumerator GetEnumerator();
     bool MoveNext();
     • Node: represents the set of objects reachable from the object(s) represented by that node

  12. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
     IEnumerator GetEnumerator();
     bool MoveNext();

  13. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
     IEnumerator GetEnumerator();
     [GlobalAccess(false)]
     bool MoveNext();

  14. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
     IEnumerator GetEnumerator();
     [GlobalAccess(false)]
     bool MoveNext();

  15. GetEnumerator revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Captures(false)]
     IEnumerator GetEnumerator();
     [GlobalAccess(false)]
     bool MoveNext();
     $ fields: read-only reference using non-owned fields

  16. Iterator sample revisited
     int result = 0;
     it = p1.GetEnumerator();
     while (it.MoveNext())
       result += it.Current();
     return result;

     [Pure] [GlobalAccess(false)] [Escapes(true)] [Captures(false)]
     IEnumerator GetEnumerator();
     [WriteConfined]
     bool MoveNext();
     $ fields: read-only reference using non-owned fields
     C: means all nodes reachable in its ownership domain

  17. The Annotation Language
     • Fresh: for out parameters and the return value
       • It is a newly created object
     • Write: for parameters
       • The method may write objects reachable from the parameter
     • WriteConfined: for parameters
       • The method may write objects reachable from the parameter, but only within its ownership domain
     • Escape(bool): for parameters
       • The method may create some link between objects reachable from the parameter and other objects reachable from the return value or another parameter
     • Capture(bool): for parameters
       • The escaping parameter will be owned by some callee argument?
     • GlobalRead/GlobalWrite(bool): for methods
       • Does the method read or write a global?
     • WriteConfined: for methods
       • The method mutates only the objects owned by its parameters
     • Pure: for methods
       • The method cannot mutate the pre-state unless allowed (out parameters)
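
One way to picture the annotation language is as ordinary .NET custom attributes. The declarations below are a hypothetical sketch that only mirrors the names on this slide; they are not the actual Spec# attribute definitions, and the example interfaces (`IIntSource`, `IIntCursor`) and the `AnnotationReader` helper are invented for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute declarations mirroring the slide's annotation names.
[AttributeUsage(AttributeTargets.Method)]
public sealed class PureAttribute : Attribute { }

[AttributeUsage(AttributeTargets.ReturnValue | AttributeTargets.Parameter)]
public sealed class FreshAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Parameter)]
public sealed class WriteAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Parameter | AttributeTargets.Method)]
public sealed class WriteConfinedAttribute : Attribute { }

// Shared base for the annotations that take a bool argument.
public abstract class FlagAttribute : Attribute
{
    public bool Value { get; }
    protected FlagAttribute(bool value) { Value = value; }
}

[AttributeUsage(AttributeTargets.Parameter)]
public sealed class EscapeAttribute : FlagAttribute { public EscapeAttribute(bool value) : base(value) { } }

[AttributeUsage(AttributeTargets.Parameter)]
public sealed class CaptureAttribute : FlagAttribute { public CaptureAttribute(bool value) : base(value) { } }

[AttributeUsage(AttributeTargets.Method)]
public sealed class GlobalReadAttribute : FlagAttribute { public GlobalReadAttribute(bool value) : base(value) { } }

[AttributeUsage(AttributeTargets.Method)]
public sealed class GlobalWriteAttribute : FlagAttribute { public GlobalWriteAttribute(bool value) : base(value) { } }

// Invented example interfaces carrying the annotations.
public interface IIntCursor
{
    [WriteConfined, GlobalWrite(false)]
    bool MoveNext();
}

public interface IIntSource
{
    [Pure, GlobalRead(false), GlobalWrite(false)]
    [return: Fresh]
    IIntCursor GetCursor();

    void CopyTo([Write, Escape(false)] int[] target);
}

// Invented helper showing how an analysis could read the annotations back.
public static class AnnotationReader
{
    public static bool IsDeclaredPure(MethodInfo m) => m.IsDefined(typeof(PureAttribute), false);

    public static bool ReturnsFresh(MethodInfo m) => m.ReturnParameter.IsDefined(typeof(FreshAttribute), false);

    public static bool MayWriteGlobals(MethodInfo m)
    {
        var a = m.GetCustomAttribute<GlobalWriteAttribute>();
        return a == null || a.Value; // conservative assumption for unannotated methods
    }
}
```

The conservative default in `MayWriteGlobals` reflects the deck's premise that unannotated, non-analyzable methods have to be treated pessimistically.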

  18. Conclusions
     • Extend an existing points-to analysis
       • To support the .NET memory model
       • Improve precision for non-analyzable calls
         • Using omega nodes and a small annotation language
     • We find the annotations useful as documentation for the methods
       • Leverages some existing Spec# annotations
         • Pure, ownership
     • Initial experiments are encouraging
       • Interesting improvements in precision (purity, aliasing)

  19. Future Work
     • Integration with Spec#
       • Generate and use modifies and read clauses
     • Improve precision:
       • Recompute $, ? fields when more info becomes available
       • Use type information as annotations to reduce potential aliasing
     • Use omega nodes to “abstract” points-to graphs (scalability)

  20. Questions?

  21. Additional Slides • About annotations • Omega Nodes • Motivation, Definition, integration with annotations • Experiments • PT Analysis step by step

  22. About annotations
     void m(A p1, A p2) {
       A a;
       A b = m2(p1, p2);
       b.v = 20;
       A c = new A();
       m3(c);
     }
     [Pure] A m2(A p1, A p2) { return p2; }
     void m3(A p2) { p2.v = 0; }
     • A [Pure] annotation is not always enough information
       • Pure methods can still impact the caller
     • A pure method can:
       • Return a non-fresh value (a global variable, a parameter)
       • Make a parameter or a global reachable from outside
     • On the other hand…
       • A caller can be pure even if the callee is not
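
The point of this slide can be made concrete with a small runnable version of the example. The class `PurityDemo` and the capitalized method names are assumptions made for the sketch; the bodies follow the slide.

```csharp
public class A
{
    public int v;
}

public static class PurityDemo
{
    // Writes nothing, so it is pure, but its result is not fresh: it aliases p2.
    public static A M2(A p1, A p2)
    {
        return p2;
    }

    // Not pure on its own: it writes a field of its parameter.
    public static void M3(A p2)
    {
        p2.v = 0;
    }

    public static void M(A p1, A p2)
    {
        A b = M2(p1, p2); // b and p2 now refer to the same object
        b.v = 20;         // this write mutates the caller-visible p2
        A c = new A();
        M3(c);            // the impure callee only touches a fresh object
    }
}
```

After `PurityDemo.M(p1, p2)`, `p2.v` is 20 even though the only callee that touched `p2` is pure, while the call to the impure `M3` leaves no visible effect because its argument was freshly allocated.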

  23. PTG for non-analyzable calls
     • Even using annotations, we don’t have total control over method behavior
       • If we allow writing a parameter, how deep may the callee access its contents?
     • Salcianu’s analysis generates the interprocedural mapping by matching operations on the callee side with operations on the caller side
       • One-to-one traversal of both graphs

  24. Some problems
     • Suppose we want to model that an unknown callee might potentially write any object reachable from a parameter
       • p1.f1 = 0;
       • p1.f1.f2 = 0;
       • p1.f1.f2….fn = 0;
     • Attempt 1: Mark the effect directly over all nodes reachable from the caller
       • Problem: When binding caller with callee, we may not have enough context…
         • M1: m2(a1);
         • The effect of the callee must persist beyond the caller’s context
     • Attempt 2: Add to the callee PTG the nodes/edges corresponding to each potential access
       • Problem: There may be infinitely many

  25. Omega nodes
     • Omega nodes:
       • Represent every object reachable from that node
     • ? fields: represent any field
     • When computing μ (binding time)
       • If an omega node appears in μ(n), then add all nodes reachable from n to μ(n)
       • If any of them is a load node, convert it into an omega node
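
The binding-time rule can be sketched as a reachability closure. The slide's wording leaves some room for interpretation, so this sketch assumes one reading: when a callee omega node is bound, its image under μ is closed under reachability in the caller's graph, and any load node in that closure is promoted to an omega node. The names `CallerNode`, `CallerNodeKind`, and `OmegaBinding.CloseUnderReachability` are hypothetical, and field labels are ignored.

```csharp
using System.Collections.Generic;

// Hypothetical types for the sketch.
public enum CallerNodeKind { Inside, Load, Parameter, Omega }

public sealed class CallerNode
{
    public string Name { get; }
    public CallerNodeKind Kind { get; set; }
    public CallerNode(string name, CallerNodeKind kind) { Name = name; Kind = kind; }
}

public static class OmegaBinding
{
    // seed: the caller nodes already in mu(n) for a callee omega node n.
    // callerEdges: successor lists of the caller's points-to graph.
    public static HashSet<CallerNode> CloseUnderReachability(
        HashSet<CallerNode> seed,
        Dictionary<CallerNode, List<CallerNode>> callerEdges)
    {
        var result = new HashSet<CallerNode>(seed);
        var work = new Stack<CallerNode>(seed);
        while (work.Count > 0)
        {
            var current = work.Pop();
            // Assumed reading: load nodes in the closure become omega nodes.
            if (current.Kind == CallerNodeKind.Load) current.Kind = CallerNodeKind.Omega;
            if (!callerEdges.TryGetValue(current, out var successors)) continue;
            foreach (var next in successors)
                if (result.Add(next)) work.Push(next);
        }
        return result;
    }
}
```

Because the closure is finite even when the callee could write arbitrarily deep paths such as `p1.f1.f2….fn`, this avoids the unbounded node set that a naive encoding of every potential access would require.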

  26. Annotations and Omega Nodes
     • Generate a conservative PTG for non-analyzable calls using omega nodes for the parameters and “?” edges between them
     • Clean the PTG using information provided by annotations
       • Remove ? edges (or replace them by $ edges)
       • Add omega confined nodes
       • Add inside nodes (fresh returns)

  27. Some experiments on Boogie

  28. First extension - Language
     • We define a subset of an IL-like language, including managed pointer support
     • Necessary for parameter passing by reference and for dealing with struct types
     a = &b
     d = a.f1

  29. First extension
     • A new set of nodes: address nodes
     • A new level of indirection
       • Every variable or field is represented by its address
       • For objects and primitive values, an outgoing edge (labeled *) means “the contents of”
       • For struct types, the outgoing edges are fields
     void m1(A p1, A p2) {
       p2.f1 = p1.f2;
     }

  30. Dealing with struct types
     • Struct types have value semantics
     • Treated as values
     • Impact on assignments
     • Impact on parameter passing
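
A short C# example shows why value semantics matter for the analysis: assigning or passing a struct copies it, so only by-reference passing (a managed pointer) lets a callee's write reach the caller's storage. The type and method names here (`PointS`, `PointC`, `StructDemo`) are made up for the illustration.

```csharp
public struct PointS
{
    public int X;
}

public class PointC
{
    public int X;
}

public static class StructDemo
{
    public static void BumpByValue(PointS p) { p.X++; }     // mutates a private copy
    public static void BumpByRef(ref PointS p) { p.X++; }   // mutates the caller's storage
    public static void BumpObject(PointC p) { p.X++; }      // mutates the shared heap object

    public static (int S, int Copy, int C) Run()
    {
        var s = new PointS { X = 1 };
        var copy = s;       // assignment copies the whole value
        copy.X = 10;        // s.X is still 1
        BumpByValue(s);     // s.X is still 1
        BumpByRef(ref s);   // s.X becomes 2

        var c = new PointC { X = 1 };
        BumpObject(c);      // c.X becomes 2

        return (s.X, copy.X, c.X);
    }
}
```

`StructDemo.Run()` returns `(2, 10, 2)`: the struct is only changed through the `ref` parameter, whereas the class instance is changed through an ordinary parameter.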

  31. P&PEA intraproc
     void m2(A a) {
       a = this;
       D d = new D();
       a.f = d;
     }

  32. P&PEA intraproc
     void m2(A a) {
       a = this;
       D d = new D();
       a.f = d;
     }

  33. P&PEA intraproc
     void m2(A a) {
       a = this;
       D d = new D();
       a.f = d;
     }

  34. P&PEA intraproc
     void m2(A a) {
       a = this;
       D d = new D();
       a.f = d;
     }
     Write Effects: [PLN.this].f
     • this.f
     • a.f

  35. P&PEA intraproc
     Summary for m2
     void m2(A a) {
       a = this;
       D d = new D();
       a.f = d;
     }
     Write Effects: [PLN.this].f
     • this.f
