1 / 90

Pointer analysis

Pointer analysis. p. t. l. S1. S2. Flow insensitive loss of precision. S1: l := new Cons. Flow-insensitive Soln (Andersen). Flow-sensitive Soln. p := l. p. t. l. S1. S2. S2: t := new Cons. p. t. l. S1. S2. *p := t. p. t. l. S1. S2. p := t. p. t. l. S1. S2.

Download Presentation

Pointer analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Pointer analysis

  2. p t l S1 S2 Flow insensitive loss of precision S1: l := new Cons Flow-insensitive Soln (Andersen) Flow-sensitive Soln p := l p t l S1 S2 S2: t := new Cons p t l S1 S2 *p := t p t l S1 S2 p := t p t l S1 S2

  3. Flow insensitive loss of precision • Flow insensitive analysis leads to loss of precision! main() { x := &y; ... x := &z; } Flow insensitive analysis tells us that x may point to z here! • However: • uses less memory (memory can be a big bottleneck to running on large programs) • runs faster

  4. Worst case complexity of Andersen Worst case: N2 per statement, so at least N3 for the whole program. Andersen is in fact O(N3) x y x y *x = y a b c d e f a b c d e f

  5. New idea: one successor per node • Make each node have only one successor. • This is an invariant that we want to maintain. x y x y *x = y a,b,c d,e,f a,b,c d,e,f

  6. More general case for *x = y x y *x = y

  7. x y x y x y *x = y More general case for *x = y

  8. Handling: x = *y x y x = *y

  9. x y x y x y x = *y Handling: x = *y

  10. Handling: x = y (what about y = x?) x y x = y Handling: x = &y x y x = &y

  11. x y x y x y x = y x y x y x x = &y y,… Handling: x = y (what about y = x?) get the same for y = x Handling: x = &y

  12. Our favorite example, once more! S1: l := new Cons 1 p := l 2 S2: t := new Cons 3 *p := t 4 p := t 5

  13. Our favorite example, once more! l l p 1 2 S1: l := new Cons 1 S1 S1 3 p := l 2 l p t l p t 4 S2: t := new Cons 3 S1 S2 S1 S2 5 *p := t 4 l p t l p t p := t 5 S1 S2 S1,S2

  14. p t l S1 S2 Flow insensitive loss of precision Flow-insensitive Unification- based S1: l := new Cons Flow-sensitive Subset-based Flow-insensitive Subset-based p := l p t l S1 S2 S2: t := new Cons p t l p t l S1 S2 *p := t p t S1,S2 l S1 S2 p := t p t l S1 S2

  15. Another example bar() { i := &a; j := &b; foo(&i); foo(&j); // i pnts to what? *i := ...; } void foo(int* p) { printf(“%d”,*p); } 1 2 3 4

  16. Another example p bar() { i := &a; j := &b; foo(&i); foo(&j); // i pnts to what? *i := ...; } void foo(int* p) { printf(“%d”,*p); } i i j i j 1 2 3 1 2 a a b a b 3 4 4 p p i j i,j a b a,b

  17. Steensgaard & beyond • A well engineered implementation of Steensgaard ran on Word97 (2.1 MLOC) in 1 minute. • One Level Flow (Das PLDI 00) is an extension to Steensgaard that gets more precision and runs in 2 minutes on Word97.

  18. Correctness

  19. Compilers have many bugs Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results: • [Bug middle-end/19650] New: miscompilation of correct code • [Bug c++/19731] arguments incorrectly named in static member specialization • [Bug rtl-optimization/13300] Variable incorrectly identified as a biv • [Bug rtl-optimization/16052] strength reduction produces wrong code • [Bug tree-optimization/19633] local address incorrectly thought to escape • [Bug target/19683] New: MIPS wrong-code for 64-bit multiply • [Bug c++/19605] Wrong member offset in inherited classes • Bug java/19295] [4.0 regression] Incorrect bytecode produced for bitwise AND • … Total of 545 matches… And this is only for one month! On a mature compiler!

  20. if (…) { x := …; } else { y := …; } …; Compiler bugs cause problems Compiler Exec • They lead to buggy executables • They rule out having strong guarantees about executables

  21. Original program Optimized program Optimization The focus: compiler optimizations • A key part of any optimizing compiler

  22. The focus: compiler optimizations • A key part of any optimizing compiler • Hard to get optimizations right • Lots of infrastructure-dependent details • There are many corner cases in each optimization • There are many optimizations and they interact in unexpected ways • It is hard to test all these corner cases and all these interactions

  23. Goals • Make it easier to write compiler optimizations • student in an undergrad compiler course should be able to write optimizations • Provide strong guarantees about the correctness of optimizations • automatically (no user intervention at all) • statically (before the opts are even run once) • Expressive enough for realistic optimizations

  24. The Rhodium work • A domain-specific language for writing optimizations: Rhodium • A correctness checker for Rhodium optimizations • An execution engine for Rhodium optimizations • Implemented and checked the correctness of a variety of realistic optimizations

  25. Broader implications • Many other kinds of program manipulators:code refactoring tools, static checkers • Rhodium work is about program analyses and transformations, the core of any program manipulator • Enables safe extensible program manipulators • Allow end programmers to easily and safely extend program manipulators • Improve programmer productivity

  26. Outline • Introduction • Overview of the Rhodium system • Writing Rhodium optimizations • Checking Rhodium optimizations • Discussion

  27. Rdm Opt Rdm Opt Rdm Opt Rhodium system overview Written by the Rhodium team Rhodium Execution engine Checker Written by programmer

  28. Rdm Opt Rdm Opt Rdm Opt Rhodium system overview Written by the Rhodium team Rhodium Execution engine Checker Written by programmer

  29. Checker Checker Checker Rhodium system overview Rdm Opt Rdm Opt Rdm Opt

  30. if (…) { x := …; } else { y := …; } …; Checker Checker Checker Checker Checker Checker Rhodium system overview Compiler Rhodium Execution engine Exec Rdm Opt Rdm Opt Rdm Opt

  31. The technical problem • Tension between: • Expressiveness • Automated correctness checking • Challenge: develop techniques • that will go a long way in terms of expressiveness • that allow correctness to be checked

  32. Automatic Theorem Prover Solution: three techniques Rdm Opt Verification Task Checker Show that for any original program: behavior of original program = behavior of optimized program Verification Task

  33. Automatic Theorem Prover Solution: three techniques Rdm Opt Verification Task Verification Task

  34. Automatic Theorem Prover Solution: three techniques Rdm Opt Verification Task Verification Task

  35. Automatic Theorem Prover Solution: three techniques Rdm Opt • Rhodium is declarative • declare intent using rules • execution engine takes care of the rest

  36. Automatic Theorem Prover Solution: three techniques Rdm Opt • Rhodium is declarative • declare intent using rules • execution engine takes care of the rest

  37. Automatic Theorem Prover Solution: three techniques Heuristics not affecting correctness Part that must be reasoned about • Rhodium is declarative • Factor out heuristics • legal transformations • vs. profitable transformations Rdm Opt

  38. Automatic Theorem Prover Solution: three techniques Heuristics not affecting correctness Part that must be reasoned about • Rhodium is declarative • Factor out heuristics • legal transformations • vs. profitable transformations

  39. Automatic Theorem Prover Solution: three techniques • Rhodium is declarative • Factor out heuristics • Split verification task • opt-dependent • vs. opt-independent opt-dependent opt-independent

  40. Automatic Theorem Prover Solution: three techniques • Rhodium is declarative • Factor out heuristics • Split verification task • opt-dependent • vs. opt-independent

  41. Automatic Theorem Prover Solution: three techniques • Rhodium is declarative • Factor out heuristics • Split verification task • opt-dependent • vs. opt-independent

  42. Automatic Theorem Prover Solution: three techniques • Rhodium is declarative • Factor out heuristics • Split verification task • Result: • Expressive language • Automated correctness checking

  43. Outline • Introduction • Overview of the Rhodium system • Writing Rhodium optimizations • Checking Rhodium optimizations • Discussion

  44. a b a b c MustPointTo analysis a = &b c = a d = *c d = b

  45. mustPointTo(a, b) mustPointTo(c, b) mustPointTo(a, b) a b a b c MustPointTo info in Rhodium a = &b c = a d = *c

  46. mustPointTo(a, b) mustPointTo(a, b) mustPointTo(c, b) mustPointTo(c, b) mustPointTo(a, b) mustPointTo(a, b) a a b b a a b b c c MustPointTo info in Rhodium a = &b a = &b c = a c = a d = *c d = *c

  47. mustPointTo(a, b) mustPointTo(c, b) mustPointTo(a, b) a b a b c MustPointTo info in Rhodium define fact mustPointTo(X:Var,Y:Var) with meaning «X == &Y¬ a = &b Fact correct on edge if: whenever program execution reaches edge, meaning of fact evaluates to true in the program state c = a d = *c

  48. mustPointTo(a, b) mustPointTo(c, b) mustPointTo(a, b) a b a b c Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning «X == &Y¬ a = &b c = a d = *c

  49. mustPointTo(a, b) mustPointTo(c, b) mustPointTo(a, b) a b a b c Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning «X == &Y¬ a = &b a = &b if currStmt == [X = &Y] then mustPointTo(X,Y)@out c = a d = *c

  50. mustPointTo(a, b) mustPointTo(c, b) mustPointTo(a, b) a b a b c Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning «X == &Y¬ a = &b if currStmt == [X = &Y] then mustPointTo(X,Y)@out c = a d = *c

More Related