1 / 194

Compilation 0368-3133 (Semester A, 2013/14)

Compilation 0368-3133 (Semester A, 2013/14). Lecture 12: Abstract Interpretation Noam Rinetzky. Slides credit: Roman Manevich , Mooly Sagiv and Eran Yahav. What is a compiler?.

kovit
Download Presentation

Compilation 0368-3133 (Semester A, 2013/14)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Compilation 0368-3133 (Semester A, 2013/14) Lecture 12: Abstract Interpretation Noam Rinetzky Slides credit: Roman Manevich, MoolySagiv and EranYahav

  2. What is a compiler? “A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” --Wikipedia

  3. Stages of compilation Source code (program) LexicalAnalysis Syntax Analysis Parsing ContextAnalysis CodeGeneration Target code (executable) Portable/Retargetable code generation Assembly Text AST + Sym. Tab. Token stream AST IR

  4. Stages of Compilation Source code (program) LexicalAnalysis Syntax Analysis Parsing ContextAnalysis CodeGeneration Target code (executable) Portable/Retargetable code generation Assembly register allocation IR (TAC) generation Instruction selection IR Optimization Text AST + Sym. Tab. Token stream AST IR Peephole optimization “Assembly” (symbolic registers) “Naive” IR Assembly

  5. Optimization points sourcecode Frontend Codegenerator targetcode IR Program Analysis Abstract interpretation today

  6. Program Analysis • In order to optimize a program, the compiler has to be able to reason about the properties of that program • An analysis is called sound if it never asserts an incorrect fact about a program • All the analyses we will discuss in this class are sound • (Why?)

  7. Optimization path donewith IRoptimizations CodeGeneration(+optimizations) TargetCode IRoptimizations CFGbuilder Control-FlowGraph ProgramAnalysis IR AnnotatedCFG OptimizingTransformation

  8. Soundness int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, xholds some integer value”

  9. Soundness int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, xis either 137 or 42”

  10. (Un) Soundness int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, xis 137”

  11. Soundness & Precision int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, xis either 137,42, or 271”

  12. Semantics-preserving optimizations • An optimization is semantics-preserving if it does not alter the semantics (meaning) of the original program • Eliminating unnecessary temporary variables • Computing values that are known statically at compile-time instead of computing them at runtime • Evaluating iteration-independent expressions outside of a loop instead of inside • Replacing bubble sort with quicksort (why?) • The optimizations we will consider in this class are all semantics-preserving

  13. Types of optimizations An optimization is local if it works on just a single basic block An optimization is global if it works on an entire control-flow graph

  14. Local optimizations • int main() { • int x; • int y; • int z; • y = 137; • if (x == 0) • z = y; • else • x = y; • } start _t0 = 137; y = _t0; IfZ x Goto _L0; _t1 = y; z = _t1; _t2 = y; x = _t2; start

  15. Local optimizations • int main() { • int x; • int y; • int z; • y = 137; • if (x == 0) • z = y; • else • x = y; • } start y = 137; IfZ x Goto _L0; z = y; x = y; End

  16. Global optimizations • int main() { • int x; • int y; • int z; • y = 137; • if (x == 0) • z = y; • else • x = y; • } start y = 137; IfZ x Goto _L0; z = y; x = y; End

  17. Global optimizations • int main() { • int x; • int y; • int z; • y = 137; • if (x == 0) • z = y; • else • x = y; • } start y = 137; IfZ x Goto _L0; z = 137; x = 137; End

  18. Global Analyses • Common subexpression elimination • Copy propagation • Dead code elimination • Live variables • Used for register allocation

  19. Global liveness analysis IN[s]=(OUT[s] – {a})  {b, c} a=b+c OUT[s]=IN[s2]  IN[s3] IN[s2] IN[s3] s2 s3

  20. Live variable analysis- initialization b = c + d;c = c + d; {} Entry a = b + c;d = a + c; {} c = a + b; {} a = a + b;d = b + c; {} Exit {a}

  21. Live variable analysis b = c + d;c = c + d; {} Entry a = b + c;d = a + c; {} c = a + b; {} a = a + b;d = b + c; {} {a} Exit {a}

  22. Live variable analysis b = c + d;c = c + d; {} Entry a = b + c;d = a + c; {} c = a + b; {} a = a + b;d = b + c; {a, b, c} IN[s1]=(OUT[s1]– {a}) {a, b} ? IN[s2]=(OUT[s2]– {d})  {b, c} {a} Exit {a}

  23. Live variable analysis b = c + d;c = c + d; {} Entry a = b + c;d = a + c; {} c = a + b; {} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  24. CFGs with loops - iteration b = c + d;c = c + d; {} Entry a = b + c;d = a + c; {b, c} c = a + b; {} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  25. CFGs with loops - iteration b = c + d;c = c + d; {} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  26. CFGs with loops - iteration b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  27. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  28. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a} Exit {a}

  29. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  30. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  31. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  32. Live variable analysis b = c + d;c = c + d; {c, d} Entry {b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  33. Live variable analysis b = c + d;c = c + d; {c, d} Entry {a, b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  34. Live variable analysis b = c + d;c = c + d; {a, c, d} Entry {a, b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  35. Live variable analysis b = c + d;c = c + d; {a, c, d} Entry {a, b, c} a = b + c;d = a + c; {b, c} c = a + b; {a, b} {a, b, c} {a, b, c} a = a + b;d = b + c; {a, b, c} {a, c, d} Exit {a}

  36. Global liveness analysis: Kill/Gen Gen Kill IN[s]=(OUT[s] – {a})  {b, c} a=b+c OUT[s]=IN[s2]  IN[s3] IN[s2] IN[s3] s2 s3

  37. Formalizing data-flow analyses • Define an analysis of a basic block as a quadruple (D, V, , F, I) where • D is a direction (forwards or backwards) • V is a set of values the program can have at any point •  merging (joining) information • F is a family of transfer functions defining the meaning of any expression as a function f : VV • I is the initial information at the top (or bottom) of a basic block

  38. Liveness Analysis • Direction: Backward • Values: Sets of variables • Transfer functions: Given a set of variable assignments V and statement a = b + c: • Remove a from V (any previous value of a is now dead.) • Add b and c to V (any previous value of b or c is now live.) • Formally: Vin = (Vout \ {a})  {b,c} • Initial value: Depends on semantics of language • E.g., function arguments and return values (pushes) • Result of local analysis of other blocks as part of a global analysis • Merge:  = 

  39. Available Expressions • Direction: Forward • Values: Sets of expressions assigned to variables • Transfer functions: Given a set of variable assignments V and statement a = b + c: • Remove from V any expression containing a as a subexpression • Add to V the expression a = b + c • Formally: Vout = (Vin \ {e | e contains a})  {a = b + c} • Initial value: Empty set of expressions • Merge:  = ?

  40. Why does this work?(Live Var.) • To show correctness, we need to show that • The algorithm eventually terminates, and • When it terminates, it has a sound answer • Termination argument: • Once a variable is discovered to be live during some point of the analysis, it always stays live • Only finitely many variables and finitely many places where a variable can become live • Soundness argument (sketch): • Each individual rule, applied to some set, correctly updates liveness in that set • When computing the union of the set of live variables, a variable is only live if it was live on some path leaving the statement

  41. Abstract Interpretation • Theoretical foundations of program analysis • Cousot and Cousot 1977 • Abstract meaning of programs • “Executed” at compile time • Execution = Analysis

  42. Another view of program analysis We want to reason about some property of the runtime behavior of the program Could we run the program and just watch what happens? Idea: Redefine the semantics of our programming language to give us information about our analysis

  43. Another view of program analysis We want to reason about some property of the runtime behavior of the program Could we run the program and just watch what happens?

  44. Another view of program analysis • The only way to find out exactly what a program will actually do is to run it • Problems: • The program might not terminate • The program might have some behavior we didn't see when we ran it on a particular input

  45. Another view of program analysis • The only way to find out exactly what a program will actually do is to run it • Problems: • The program might not terminate • The program might have some behavior we didn't see when we ran it on a particular input • Inside a basic block, it is simpler • Basic blocks contain no loops • There is only one path through the basic block

  46. Another view of program analysis • The only way to find out exactly what a program will actually do is to run it • Problems: • The program might not terminate • The program might have some behavior we didn't see when we ran it on a particular input • Inside a basic block, it is simpler • Basic blocks contain no loops • There is only one path through the basic block • But still …

  47. Assigning new semantics (Local) Example: Available Expressions Redefine the statement a = b + c to mean “a now holds the value of b + c, and any variable holding the value a is now invalid” “Run” the program assuming these new semantics

  48. Assigning new semantics (Global) Example: Available Expressions Redefine the statement a = b + c to mean “a now holds the value of b + c, and any variable holding the value a is now invalid” “Run” the program assuming these new semantics Merge information from different paths

  49. Abstract Interpretation Example: Available Expressions Redefine the statement a = b + c to mean “a now holds the value of b + c, and any variable holding the value a is now invalid” “Run” the program assuming these new semantics Merge information from different paths Treat the optimizer as an interpreter for these new semantics

  50. Theory of Program Analysis • Building up all of the machinery to design this analysis was tricky • The key ideas, however, are mostly independent of the analysis: • We need to be able to compute functions describing the behavior of each statement • We need to be able to merge several subcomputations together • We need an initial value for all of the basic blocks • There is a beautiful formalism that captures many of these properties

More Related