Introduction to Compilers

Presentation Transcript


  1. Introduction to Compilers • Ben Livshits • Based in part on Stanford class slides from http://infolab.stanford.edu/~ullman/dragon/w06/w06.html

  2. My Lectures • Themes: reliability & bug finding, performance, security • Lecture 1: Introduction to static analysis • Lecture 2: Introduction to runtime analysis • Lecture 3: Applications of static and runtime analysis

  3. L1: Lecture Organization • Really basic stuff • Flow Graphs • Constant Folding • Global Common Subexpression Elimination • Data-flow analysis • Proving Little Theorems • Data-Flow Equations • Major Examples: reaching definitions • Looking forward…

  4. Dawn of Code Optimization • A never-published Stanford technical report by Fran Allen in 1968 • Flow graphs of intermediate code

  5. Compilers Under the Hood

  6. Stages of Compilation • Source code → Lexing → Parsing → IR → Analysis → Code generation → Executable code

  12. Intermediate Code • Source: for (i=0; i<n; i++) A[i] = 1; • Intermediate code: i = 0; if i>=n goto …; t1 = 8*i; A[t1] = 1; i = i+1 • Make flow explicit by breaking the code into basic blocks: sequences of steps with entry at the beginning and exit at the end • Intermediate code exposes optimizable constructs that we cannot see at the source-code level
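The block-splitting step can be sketched in Python. This is a minimal illustration, not the lecture's own algorithm: it uses the usual "leader" rule (the first instruction, every jump target, and every instruction following a jump start a new block). The `Instr` type, the labels `TEST` and `EXIT`, and the trailing `goto TEST` and `halt` instructions are hypothetical additions for the example; the slide elides the branch target.

```python
from typing import List, NamedTuple, Optional


class Instr(NamedTuple):
    text: str                      # the three-address statement
    label: Optional[str] = None    # label attached to this instruction, if any
    target: Optional[str] = None   # label this instruction may jump to, if any


def basic_blocks(code: List[Instr]) -> List[List[Instr]]:
    """Split straight-line intermediate code into basic blocks."""
    if not code:
        return []
    label_index = {ins.label: i for i, ins in enumerate(code) if ins.label}
    leaders = {0}                              # the first instruction is a leader
    for i, ins in enumerate(code):
        if ins.target is not None:
            if ins.target in label_index:      # a jump target is a leader
                leaders.add(label_index[ins.target])
            if i + 1 < len(code):              # so is the instruction after a jump
                leaders.add(i + 1)
    starts = sorted(leaders)
    return [code[s:e] for s, e in zip(starts, starts[1:] + [len(code)])]


# Hypothetical encoding of the loop from the slide (labels are assumptions).
loop = [
    Instr("i = 0"),
    Instr("if i >= n goto EXIT", label="TEST", target="EXIT"),
    Instr("t1 = 8*i"),
    Instr("A[t1] = 1"),
    Instr("i = i+1"),
    Instr("goto TEST", target="TEST"),
    Instr("halt", label="EXIT"),
]
blocks = basic_blocks(loop)
```

Under these assumptions the loop splits into four blocks: the initialization, the test, the loop body, and the exit.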

  13. Constant Folding • Sometimes a variable has a known constant value at a point • If so, replacing the variable by the constant simplifies and speeds up the code • Easy within a basic block; harder across blocks
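The easy case, within one basic block, might look like the following Python sketch. It assumes a hypothetical statement encoding `(dest, lhs, op, rhs)`, where `op` and `rhs` are `None` for a plain copy, and it handles only integer constants and three operators. It is an illustration of the idea, not a complete pass.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_block(block):
    """Replace variables by known constants and evaluate constant expressions."""
    known = {}   # variable name -> constant value known at this point
    out = []
    for dest, lhs, op, rhs in block:
        # Substitute operands whose constant value is known.
        lhs = known.get(lhs, lhs)
        rhs = known.get(rhs, rhs)
        if op is None:
            value = lhs
        elif isinstance(lhs, int) and isinstance(rhs, int):
            value = OPS[op](lhs, rhs)          # fold the whole expression
            lhs, op, rhs = value, None, None
        else:
            value = None
        if isinstance(value, int):
            known[dest] = value                # dest is now a known constant
        else:
            known.pop(dest, None)              # dest no longer has a known value
        out.append((dest, lhs, op, rhs))
    return out


block = [
    ("i", 0, None, None),      # i = 0
    ("n", 100, None, None),    # n = 100
    ("t1", 8, "*", "i"),       # t1 = 8*i
    ("t2", "t1", "+", "m"),    # t2 = t1+m   (m is unknown)
]
folded = fold_block(block)
```

Here `t1 = 8*i` becomes `t1 = 0`, and the later use of `t1` is replaced by `0`, while the unknown `m` is left alone. Doing the same across blocks requires knowing which definitions reach each use.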

  14. Constant Folding Example • Before: i = 0; n = 100; if i>=n goto …; t1 = 8*i; A[t1] = 1; i = i+1 • After: t1 = 0; if t1>=800 goto …; A[t1] = 1; t1 = t1+8

  15. Global Common Subexpressions • Suppose block B has a computation of x+y • Suppose that whenever we reach this computation, we are sure to have: • Computed x+y, and • Not subsequently reassigned x or y • Then we can hold the value of x+y in a temporary and use it in B
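The two conditions (x+y was computed, and neither x nor y was reassigned since) can be illustrated on a single straight-line sequence of statements. The Python sketch below is deliberately local: the global version on the slide needs a data-flow analysis across blocks, which this does not attempt. The `(dest, left, op, right)` encoding and the function name are assumptions made for the example.

```python
def eliminate_common_subexpressions(stmts):
    """Reuse an earlier result when the same expression is still available."""
    available = {}   # (left, op, right) -> variable currently holding that value
    out = []
    for dest, left, op, right in stmts:
        expr = (left, op, right)
        holder = available.get(expr) if op is not None else None
        if holder is not None:
            out.append((dest, holder, None, None))   # reuse: dest = holder
        else:
            out.append((dest, left, op, right))
        # Assigning dest invalidates expressions that read dest or live in dest.
        available = {e: v for e, v in available.items()
                     if dest not in (e[0], e[2]) and v != dest}
        # Record a freshly computed expression unless dest is one of its operands.
        if op is not None and holder is None and dest not in (left, right):
            available[expr] = dest
    return out


stmts = [
    ("a", "x", "+", "y"),     # a = x+y
    ("b", "x", "+", "y"),     # b = x+y  -> b = a
    ("x", 1, None, None),     # x = 1    (kills x+y)
    ("c", "x", "+", "y"),     # c = x+y  must be recomputed
]
rewritten = eliminate_common_subexpressions(stmts)
```

The second computation is replaced by a copy, but the third is not, because x was reassigned in between.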

  16. GCSE Example • a = x+y becomes t = x+y; a = t • b = x+y becomes t = x+y; b = t • c = x+y becomes c = t

  17. Example: Even Better • Before: t = x+y; a = t; t = x+y; b = t; c = t • After: t = x+y; a = t; b = t; t = x+y; b = t; c = t

  18. Data-Flow Analysis • Proving Little Theorems • Data-Flow Equations • Major Examples: reaching definitions

  19. An Obvious Theorem
      boolean x = true;
      while (x) {
        . . . // no change to x
      }
      Let’s prove that this loop doesn’t terminate. Proof: the only assignment to x is at the top, so x is always true.

  20. As a Flow Graph • x = true • if x == true • “body”

  21. Formulation: Reaching Definitions • Each place some variable x is assigned is a definition • Ask: for this use of x, where could x last have been defined? • In our example: only at x = true

  22. Example: Reaching Definitions • Flow graph nodes: d1: x = true; if x == true; d2: a = 10 • Edges are labeled with the definitions that reach them (d1 and d2)

  23. Clincher • Since d1 is the only definition of x that reaches the test x == true, x must be true at that point • The conditional is not really a conditional and can be replaced by a branch

  24. Not Always That Easy
      int i = 2; int j = 3;
      while (i != j) {
        if (i < j) i += 2;
        else j += 2;
      }
      We’ll develop techniques for this problem, but later …

  25. The Flow Graph • Nodes: d1: i = 2; d2: j = 3; if i != j; if i < j; d3: i = i+2; d4: j = j+2 • Reaching-definition sets annotated on the edges include {d2, d3, d4}, {d1, d3, d4}, and {d1, d2, d3, d4}

  26. DFA Is Sometimes Insufficient • In this example, i can be defined in two places, and j in two places • There is no obvious way to discover that i != j is always true • But that is OK, because reaching definitions is sufficient to catch most opportunities for constant folding (replacement of a variable by its only possible value)

  27. Be Conservative! It’s OK to discover a subset of the opportunities to make some code-improving transformation. It’s not OK to think you have an opportunity that you don’t really have. Must we always be conservative?

  28. Example: Be Conservative
      boolean x = true;
      while (x) {
        . . . *p = false; . . .
      }
      Is it possible that p points to x?

  29. As a Flow Graph • d1: x = true • if x == true • d2: *p = false (another def of x) • Edge labels: d1, d2

  30. Possible Resolution • Just as data-flow analysis of “reaching definitions” can tell what definitions of x might reach a point, another DFA can eliminate cases where p definitely does not point to x • Example: in a C program, the only definition of p is p = &y, and there is no possibility that y is an alias of x

  31. Reaching Definitions Formalized • A definition d of a variable x is said to reach a point p in a flow graph if: • There is some path in the flow graph from d to p, and • Along that path, after d, x is not redefined

  32. Data-Flow Equations: (1) • A basic block can generate a definition • A basic block can either • Kill a definition of x if it surely redefines x • Transmit a definition if it may not redefine the same variable(s) as that definition

  33. Data-Flow Equations: (2) • Variables: • IN(B) = set of definitions reaching the beginning of block B. • OUT(B) = set of definitions reaching the end of B.

  34. Data-Flow Equations: (3) • Two kinds of equations: • Confluence equations: IN(B) in terms of the OUTs of the predecessors of B • Transfer equations: OUT(B) in terms of IN(B) and what goes on in block B.

  35. Confluence Equations • IN(B) = ∪ OUT(P) over all predecessors P of B • Example: B has predecessors P1 and P2 with OUT(P1) = {d1, d2} and OUT(P2) = {d2, d3}, so IN(B) = {d1, d2, d3}

  36. Transfer Equations • Generate a definition in the block if its variable is not definitely rewritten later in the basic block. • Kill a definition if its variable is definitely rewritten in the block. • An internal definition may be both killed and generated.
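As a rough illustration of these rules, the Python sketch below computes Gen and Kill for one block. It assumes a simplified model in which every statement assigns exactly one named variable, so stores through pointers such as *p = 10 are not handled. The names `gen_kill`, `block`, and `all_defs` are hypothetical.

```python
def gen_kill(block, all_defs):
    """Compute (Gen, Kill) for one basic block.

    block:    ordered list of (def_id, variable) pairs inside the block
    all_defs: dict mapping every def_id in the program to its variable
    """
    last_def = {}
    for def_id, var in block:
        last_def[var] = def_id          # a later definition overrides an earlier one
    # Gen: definitions not rewritten later in the same block.
    gen = set(last_def.values())
    # Kill: every definition of a variable that the block rewrites.
    # A definition can appear in both sets; Gen wins in the transfer function.
    kill = {d for d, v in all_defs.items() if v in last_def}
    return gen, kill


# Hypothetical program with five definitions; the block contains d2, d3, d4.
all_defs = {"d1": "x", "d2": "y", "d3": "x", "d4": "y", "d5": "z"}
block = [("d2", "y"), ("d3", "x"), ("d4", "y")]
gen, kill = gen_kill(block, all_defs)

in_set = {"d1", "d5"}
out_set = (in_set - kill) | gen         # OUT = (IN - Kill) ∪ Gen
```

In this example d2 is killed by d4 inside the block, d1 is killed by d3, and d5 is transmitted because the block never assigns z.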

  37. Example: Gen and Kill • IN = {d2(x), d3(y), d3(z), d5(y), d6(y), d7(z)} • Block: d1: y = 3; d2: x = y+z; d3: *p = 10; d4: y = 5 • Kill includes {d1(x), d2(x), d3(y), d5(y), d6(y),…} • Gen = {d2(x), d3(x), d3(z),…, d4(y)} • OUT = {d2(x), d3(x), d3(z),…, d4(y), d7(z)}

  38. Transfer Function for a Block • For any block B: OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B)

  39. Iterative Solution to Equations • For an n-block flow graph, there are 2n equations in 2n unknowns • Alas, the solution is not unique • Use iteration to get the least fixed point • Has nice performance guarantees

  40. Iterative Solution: (2)
      IN(entry) = ∅;
      for each block B do OUT(B) = ∅;
      while (changes occur) do
        for each block B do {
          IN(B) = ∪ OUT(P) over predecessors P of B;
          OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B);
        }

  41. Example: Reaching Definitions • Blocks: B1: d1: x = 5; B2: if x == 10; B3: d2: x = 15 • IN(B1) = {}; OUT(B1) = {d1} • IN(B2) = {d1, d2}; OUT(B2) = {d1, d2} • IN(B3) = {d1, d2}; OUT(B3) = {d2}
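The iteration can be sketched in Python and run on the three-block example (d1: x = 5 in B1, the test x == 10 in B2, d2: x = 15 in B3). The edges are an assumption: the transcript does not list them, so the sketch uses B1 → B2, B2 → B3, and a back edge B3 → B2, which is what the slide's IN and OUT sets imply. The function and variable names are hypothetical.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterate the confluence and transfer equations to a fixed point."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}    # OUT(B) = ∅ initially
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Confluence: union of OUT over all predecessors of b.
            in_sets[b] = set().union(*(out_sets[p] for p in preds[b]))
            # Transfer: OUT(B) = (IN(B) - Kill(B)) ∪ Gen(B).
            new_out = (in_sets[b] - kill[b]) | gen[b]
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                changed = True
    return in_sets, out_sets


blocks = ["B1", "B2", "B3"]
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"]}   # assumed edges
gen = {"B1": {"d1"}, "B2": set(), "B3": {"d2"}}
kill = {"B1": {"d2"}, "B2": set(), "B3": {"d1"}}       # each def of x kills the other

in_sets, out_sets = reaching_definitions(blocks, preds, gen, kill)
```

With these inputs the loop stabilizes after three passes: d2 only shows up in IN(B2) on the second pass, once OUT(B3) has been computed, which is why a single pass over the blocks is not enough.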

  42. Notice the Conservatism • Not only the most conservative assumption about when a def is killed or gen’d • Also the conservative assumption that any path in the flow graph can actually be taken • Must we always be conservative?

  43. Where Are We Going From Here? • Inter-procedural analysis • Context-sensitive analysis • Object-sensitive analysis • Pointers or references • Reflection • Dynamic code generation research • Some is covered in Yannis’s lectures…

  44. A Little JavaScript Program
      f = function(x){ return x+1; }
      function compose(){ return function(x){ return f(f(x)) }; }
      compose(f)(3);
      this['compose'](f)(3)
      this['comp' + 'os' + ("E".toLowerCase())](f)(3)
      function(f){ return ; eval("compose(f)(" + arguments[0] + ");"); }()(3);
