
Compilation: A Retrospective



  1. Compilation: A Retrospective CS 671 April 29, 2008

  2. What You’ve Learned This Semester • What happens when you compile your code • Theory of compilers • Internal challenges and solutions: widely-used algorithms • Available tools and their other applications • Traditional and modern challenges [Diagram: high-level programming languages → Compiler → machine code, with error messages as a second output]

  3. An Interface to High-Level Languages • Programmers write in high-level languages • Increases productivity • Easier to maintain/debug • More portable • HLLs also protect the programmer from low-level details • Registers and caches – the register keyword • Instruction selection • Instruction-level parallelism • The catch: HLLs are less efficient [Diagram: high-level programming languages → Compiler → machine code]

  4. An Interface to Computer Architectures • Parallelism • Instruction level: multiple operations at once; minimize dependences • Processor level: multiple threads at once; minimize synchronization • Memory Hierarchies • Register allocation (the only portion explicitly managed in SW) • Code and data layout (helps the hardware cache manager) • Designs driven by how well compilers can leverage new features! [Diagram: high-level programming languages → Compiler → machine code]

  5. Simplified Compiler Structure [Pipeline diagram] Source code (character stream), e.g. if (b==0) a = b; → Lexical Analysis → token stream → Parsing → abstract syntax tree → Intermediate Code Generation → intermediate code (IR) → Code Optimization → IR → Code Generation → assembly code (character stream), e.g. CMP CX, 0 / CMOVZ CX, DX. The front end (lexical analysis through intermediate code generation) is machine independent; the back end (code optimization and code generation) is machine dependent.

  6. Building a Lexer [Diagram] Specification (regular expressions such as "if", "while", [a-zA-Z][a-zA-Z0-9]*, [0-9][0-9]*, "(", ")") → an NFA for each RE → one giant NFA → giant DFA → table-driven code

  7. High Level View • Regular expressions = specification • Finite automata = implementation • Every regex has a FSA that recognizes its language [Diagram: at design time a specification feeds the scanner generator; at compile time the generated scanner turns source code into tokens]
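
The "table-driven code" at the end of that pipeline can be sketched by hand. Below is a minimal C sketch (mine, not from the slides) of a DFA recognizer for the single RE [a-zA-Z][a-zA-Z0-9]* from slide 6; the next_state function stands in for the generated transition table.

    /* States: 0 = start, 1 = in identifier (accepting), 2 = dead (reject). */
    #include <ctype.h>
    #include <stdio.h>

    static int next_state(int state, char c) {
        switch (state) {
        case 0:  return isalpha((unsigned char)c) ? 1 : 2;
        case 1:  return isalnum((unsigned char)c) ? 1 : 2;
        default: return 2;
        }
    }

    /* Returns 1 if s matches [a-zA-Z][a-zA-Z0-9]*, 0 otherwise. */
    int match_identifier(const char *s) {
        int state = 0;
        for (; *s; s++)
            state = next_state(state, *s);
        return state == 1;
    }

    int main(void) {
        printf("%d %d\n", match_identifier("count1"), match_identifier("9lives"));
        return 0;   /* prints: 1 0 */
    }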

  8. Lex: A Lexical Analyzer Generator • Lex produces a C program from a lexical specification • http://www.epaperpress.com/lexandyacc/index.html

    DIGITS      [0-9]+
    ALPHA       [A-Za-z]
    CHARACTER   {ALPHA}|_
    IDENTIFIER  {ALPHA}({CHARACTER}|{DIGITS})*
    %%
    if                                     { return IF; }
    {IDENTIFIER}                           { return ID; }
    {DIGITS}                               { return NUM; }
    ([0-9]+"."[0-9]*)|([0-9]*"."[0-9]+)    { return REAL; }
    .                                      { error(); }
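
A driver for the generated scanner might look like the following (a sketch under the assumption that the spec above is in spec.l and that IF, ID, NUM, and REAL are defined elsewhere; lex emits yylex() into lex.yy.c).

    /* Build (with flex): flex spec.l && cc driver.c lex.yy.c -lfl */
    #include <stdio.h>
    extern int yylex(void);     /* the generated scanner */
    extern char *yytext;        /* text of the current token */

    int main(void) {
        int tok;
        while ((tok = yylex()) != 0)          /* 0 signals end of input */
            printf("token %d: \"%s\"\n", tok, yytext);
        return 0;
    }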

  9. Phases of a Compiler [Pipeline diagram, repeated from slide 5: source code → lexical analysis → parsing → intermediate code generation → code optimization → code generation → assembly code]

  10. Parsing – Syntax Analysis • Convert a linear structure – a sequence of tokens – to a hierarchical tree-like structure – an AST • The parser imposes the syntax rules of the language • Work should be linear in the size of the input → type consistency cannot be checked in this phase • Deterministic context-free languages form the basis • Bison and yacc allow a user to construct parsers from CFG specifications

  11. Context-Free Grammar Terminology • Terminals • Token or ε • Non-terminals • Syntactic variables • Start symbol • A special non-terminal is designated (S) • Productions • Specify how non-terminals may be expanded to form strings • LHS: single non-terminal, RHS: string of terminals or non-terminals • Vertical bar is shorthand for multiple productions • Example: S → (S) S | ε
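
Since the example grammar generates balanced parentheses, a recognizer for it fits in a few lines. This is a minimal recursive-descent sketch (mine, not from the slides) for S → (S) S | ε:

    #include <stdio.h>

    static const char *p;        /* cursor into the input string */

    static int parse_S(void) {
        if (*p == '(') {         /* production S -> ( S ) S */
            p++;
            if (!parse_S()) return 0;
            if (*p != ')') return 0;
            p++;
            return parse_S();
        }
        return 1;                /* production S -> epsilon */
    }

    static int accepts(const char *s) {
        p = s;
        return parse_S() && *p == '\0';   /* must consume all input */
    }

    int main(void) {
        printf("%d %d\n", accepts("(()())"), accepts("(()"));  /* 1 0 */
        return 0;
    }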

  12. Shift-Reduce Parsing • Bottom-up parsing uses two kinds of actions: Shift and Reduce (the marker | separates the left string, held on the stack, from the unread input) • Shift: move | one place to the right • Shifts a terminal onto the left string • E + ( | int )  ⇒  E + ( int | ) • Reduce: apply an inverse production at the right end of the left string • If E → E + ( E ) is a production, then • E + ( E + ( E ) | )  ⇒  E + ( E | )
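
A worked trace may make the two actions concrete (assuming the grammar behind the example is E → int | E + ( E )). Parsing int + ( int ):

    | int + ( int )     shift int
    int | + ( int )     reduce E -> int
    E | + ( int )       shift +, shift (, shift int
    E + ( int | )       reduce E -> int
    E + ( E | )         shift )
    E + ( E ) |         reduce E -> E + ( E )
    E |                 accept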

  13. Shift-Reduce Parsing Table [Diagram: rows are parser states; columns for terminal symbols hold the next action, columns for non-terminal symbols hold the next state on a reduction] • Action table entries: • 1. shift and goto state n • 2. reduce using X → γ: pop the symbols of γ off the stack, then use the state label on top (at the end) of the stack to look X up in the goto table and go to that state • DFA + stack = push-down automaton (PDA)

  14. Parsing Table

  15. Yacc / Bison • Yet Another Compiler Compiler • Automatically constructs an LALR(1) parsing table from a set of grammar rules • A Yacc/Bison specification (file.y) has three sections: parser declarations %% grammar rules %% auxiliary code • Running bison -vd file.y (or yacc -vd file.y) produces y.tab.c, y.tab.h, and y.output
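
For concreteness, here is a minimal Bison specification (mine, not from the slides) for the parenthesis grammar of slide 11; build with bison paren.y && cc paren.tab.c:

    %{
    #include <stdio.h>
    int yylex(void);
    void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }
    %}
    %%
    s : '(' s ')' s
      | /* empty */
      ;
    %%
    /* Single-character tokens; a newline ends the input. */
    int yylex(void) { int c = getchar(); return c == '\n' ? 0 : c; }
    int main(void) { return yyparse(); }  /* exit status 0 iff input parses */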

  16. Parsing - Semantic Analysis • Calculates the program’s “meaning” • Rules of the language are checked (variable declaration, type checking) • Type checking also needed for code generation (code gen for a + b depends on the type of a and b)

  17. Symbol Table Implementation • Can consist of: • a hash table for all names, and • a stack to keep track of scope [Diagram: hash buckets chaining declarations of x, y, and a, plus a scope-change marker on the stack so a scope’s names can be popped when it ends]
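
One way to realize that design is the following C sketch (mine, not from the slides): a chained hash table in which each bucket’s head is the innermost declaration, plus a per-scope list so that end_scope pops exactly the names the scope declared.

    #include <stdlib.h>
    #include <string.h>

    #define NBUCKETS 211
    #define MAXDEPTH 64

    typedef struct Sym {
        const char *name;
        struct Sym *next_in_bucket;  /* hash chain; head = innermost decl */
        struct Sym *next_in_scope;   /* links all names declared in one scope */
    } Sym;

    static Sym *bucket[NBUCKETS];
    static Sym *scope[MAXDEPTH];     /* scope[depth] = this scope's decls */
    static int depth = 0;

    static unsigned hash(const char *s) {
        unsigned h = 0;
        while (*s) h = h * 31 + (unsigned char)*s++;
        return h % NBUCKETS;
    }

    void begin_scope(void) { scope[++depth] = NULL; }

    void declare(const char *name) {
        Sym *s = malloc(sizeof *s);
        unsigned h = hash(name);
        s->name = name;
        s->next_in_bucket = bucket[h];   /* shadows any outer declaration */
        bucket[h] = s;
        s->next_in_scope = scope[depth];
        scope[depth] = s;
    }

    Sym *lookup(const char *name) {      /* finds the innermost declaration */
        for (Sym *s = bucket[hash(name)]; s; s = s->next_in_bucket)
            if (strcmp(s->name, name) == 0)
                return s;
        return NULL;
    }

    void end_scope(void) {               /* pop this scope's names, LIFO */
        for (Sym *s = scope[depth]; s; ) {
            Sym *dead = s;
            bucket[hash(s->name)] = s->next_in_bucket;  /* s is chain head */
            s = s->next_in_scope;
            free(dead);
        }
        depth--;
    }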

  18. Typical Semantic Errors • Traverse the AST created by the parser to find: • Multiple declarations: a variable should be declared (in the same scope) at most once • Undeclared variable: a variable should not be used before being declared • Type mismatch: the type of the left-hand side of an assignment should match the type of the right-hand side • Wrong arguments: methods should be called with the right number and types of arguments
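
Each of these checks can be triggered by a few lines of (deliberately broken, hypothetical) C:

    int f(int a) { return a; }

    void g(void) {
        int x;
        int x;          /* multiple declarations in the same scope */
        y = 3;          /* undeclared variable                     */
        int *p = 3.14;  /* type mismatch in the assignment         */
        f(1, 2);        /* wrong number of arguments               */
    }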

  19. Stack Frames • An activation record or stack frame stores: • local vars • parameters • return address • temporaries • (…etc) • (Frame size is not known until late in the compilation process) [Diagram: the previous frame holds incoming arguments (arg n … arg 1) and a static link above the frame pointer fp; the current frame holds local vars, return address, temporaries, saved registers, outgoing arguments (arg m … arg 1), and a static link, with sp marking where the next frame will start]

  20. Phases of a Compiler [Pipeline diagram, repeated from slide 5]

  21. Intermediate Code Generation • Makes it easy to port the compiler to other architectures • e.g., Pentium to MIPS • Can also be the basis for interpreters • such as in Java • Enables optimizations that are not machine specific [Diagram: one AST lowers to a common IR, which is optimized once and then translated to x86, PowerPC, or Alpha]

  22. Phases of a Compiler • Intermediate Code Optimization • Constant propagation, dead code elimination, common sub-expression elimination, strength reduction, etc. • Based on dataflow analysis – properties that are independent of execution paths [Diagram: source program → lexical analyzer → syntax analyzer → semantic analyzer → intermediate code generator → code optimizer → code generator → target program]

  23. Analysis and Transformation • Most optimizations require some global understanding of program flow • Moving, removing, rearranging instructions • Achieve understanding by discovering the control flow of the procedure • What blocks follow/are reachable from other blocks • Where loops exist (focus optimization efforts) • We call this Control-Flow Analysis • Connect definitions and uses of variables • We call this Data-Flow Analysis

  24. Data-Flow Analysis • Example equations for a forward problem: out[I] = ( in[I] − kill[I] ) ∪ gen[I], and in[B] = ∪ out[B′] over B′ ∈ pred(B); a backward problem computes in as a function of out and meets over succ(B) instead • Properties: • either a forward analysis (out as function of in) or • a backward analysis (in as a function of out) • either an “along some path” problem or • an “along all paths” problem
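
The equations are solved by iterating to a fixed point. Here is a minimal C sketch (mine, not from the slides) for a forward problem on a hypothetical four-block CFG, with sets of definitions stored as bit vectors:

    #include <stdio.h>

    #define NBLOCKS 4

    /* Hypothetical gen/kill sets and predecessor masks, one bit per def. */
    static unsigned gen[NBLOCKS]  = { 0x1, 0x2, 0x4, 0x0 };
    static unsigned kill[NBLOCKS] = { 0x0, 0x1, 0x1, 0x0 };
    static unsigned pred[NBLOCKS] = { 0x0, 0x1, 0x1, 0x6 };  /* B3: B1, B2 */

    int main(void) {
        unsigned in[NBLOCKS] = {0}, out[NBLOCKS] = {0};
        int changed = 1;
        while (changed) {                     /* iterate to a fixed point */
            changed = 0;
            for (int b = 0; b < NBLOCKS; b++) {
                unsigned i = 0;
                for (int p = 0; p < NBLOCKS; p++)   /* in[B] = U out[B'] */
                    if (pred[b] & (1u << p))
                        i |= out[p];
                unsigned o = (i & ~kill[b]) | gen[b];   /* transfer fn */
                if (i != in[b] || o != out[b])
                    changed = 1;
                in[b] = i;
                out[b] = o;
            }
        }
        for (int b = 0; b < NBLOCKS; b++)
            printf("B%d: in=%x out=%x\n", b, in[b], out[b]);
        return 0;
    }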

  25. Static Single Assignment Form [Diagram] Before SSA: B1: if (…); B2: X ← 5; B3: X ← 3; B4: Y ← X. After SSA: B1: if (…); B2: X0 ← 5; B3: X1 ← 3; B4: X2 ← φ(X0, X1); Y0 ← X2

  26. Optimizations • What are they? • Code transformations • Improve some metric • Metrics • Performance: time, instructions, cycles • Space: Reduce memory usage • Code Size • Energy

  27. Optimizations [Diagram: a spectrum from high-level IR to low-level IR] Inlining, constant folding, algebraic simplification, constant propagation, dead code elimination, loop-invariant code motion, common sub-expression elimination, strength reduction, branch prediction/optimization, register allocation, loop unrolling, cache optimization
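
As a small before/after illustration (hypothetical code, not from the slides), constant propagation, constant folding, and dead code elimination combine as follows:

    /* Before optimization */
    int f(void) {
        int x = 4;
        int y = x * 2;     /* constant propagation: y = 4 * 2       */
        int z = y + 1;     /* constant folding: y = 8, so z = 9     */
        if (0) z = 100;    /* dead code elimination: branch removed */
        return z;
    }

    /* After optimization, the whole body folds down to: */
    int f_optimized(void) { return 9; }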

  28. Scope of Optimization • Local (or single block) • Confined to straight-line code • Simplest to analyze • Intraprocedural (or global) • Consider the whole procedure • Interprocedural (or whole program) • Consider the whole program

  29. Phases of a Compiler • Native Code Generation • Intermediate code is translated into native code • Register allocation, instruction selection • Native Code Optimization • Peephole optimizations – a small window is optimized at a time [Diagram: source program → … → code optimizer → code generator → target program]

  30. Instruction Tiling • Source statement: x = x + 1; • IR tree: MOVE( MEM(FP + x), MEM(FP + x) + 1 ) • Tiles t1 and t2 cover the tree and emit: mov t1, [bp+x] / mov t2, t1 / add t2, 1 / mov [bp+x], t2

  31. Register Allocation • Live ranges: s1 ← 2; s2 ← 4; s3 ← s1 + s2; s4 ← s1 + 1; s5 ← s1 * s2; s6 ← s4 * 2 • Build the interference graph over S1–S6 and color it (graph coloring), e.g. r1 ← green, r2 ← blue, r3 ← pink • Result: r1 ← 2; r2 ← 4; r3 ← r1 + r2; r3 ← r1 + 1; r1 ← r1 * r2; r2 ← r3 * 2

  32. Instruction Scheduling • Create a DAG of dependences • Determine priority • Schedule instructions with • Ready operands • Highest priority • Heuristics: if there are multiple possibilities, fall back on other priority functions • Height, slack, register usage, etc. [Diagram: dependence DAG over ten instructions; the nodes marked m are memory operations]

  33. Modern Topics! • Alternatives to the static compilation model • Compiling for the Core2 • Compiling for GPUs • Compiling for power and temperature • Compiling for parallelism

  34. Alternatives to the Traditional Model • Static Compilation (the traditional compilation model) • All work is done “ahead-of-time” • Just-in-Time Compilation • Postpone some compilation tasks • Multiversioning and Dynamic Feedback • Include multiple options in the binary • Dynamic Binary Optimization • Executables can adapt

  35. Just-in-Time Compilation • Ship bytecodes (think IR) rather than binaries • Binaries execute on machines • Bytecodes execute on virtual machines [Diagram: the compiler pipeline split so the front end runs ahead of time and the back end runs in the virtual machine]

  36. Compiling for the Core Architecture • Netburst architecture: has a trace cache holding decoded instructions • Core architecture: instructions are repeatedly fetched and decoded • Problems: • Code alignment! (fetch and branch-prediction effects) • Getting alignment right is worth a 20% speedup!

  37. Compiling for GPUs • CPUs: lots of instructions, little data; out-of-order exec; branch prediction; reuse and locality; task parallel; needs OS; complex sync; latency machines • GPUs: few instructions, lots of data; SIMD; hardware threading; little reuse; data parallel; no OS; simple sync; throughput machines

  38. Bug Report [Diagram: the graphics side of the bug report reads “shadow acne”; the compiler side traces it to round-off error]

  39. Shader Compiler (SC) • Developers ship games in byte code • Each time a game starts, the shader is compiled • The compiler is hidden in the driver • No user gets to set options or flags • The compiler updates with each new driver (about once a month) • The compile is done each time the game is run • About half the code is a traditional compiler – all the usual stuff • Strange features: SC fights the MS compiler

  40. Power-Aware Computing

  41. Power Issues in Microprocessors • Capacitive (dynamic) power • Static (leakage) power • Di/Dt (Vdd/Gnd bounce) • Temperature [Figure residue: a CMOS inverter with Vin, Vout, supply Vdd, and load capacitance CL; a plot of current (A) versus voltage (V) with a minimum voltage marked; a “20 cycles” annotation on the Di/Dt transient]
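
For the capacitive term, the standard first-order model (not spelled out on the slide, but implicit in it) is P_dynamic = α · C · Vdd² · f, where α is the switching activity factor, C the switched capacitance, Vdd the supply voltage, and f the clock frequency; the quadratic dependence on Vdd is why lowering the supply voltage is the dominant power lever.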

  42. Code Optimizations for Low Power • Reorder instructions to reduce switching effect at functional units and I/O buses • Operand swapping • Swap the operands at the input of multiplier • Result is unaltered, but power changes significantly! • Other standard compiler optimizations • Software pipelining, dead code elimination, redundancy elimination • Use processor-specific instruction styles • on ARM the default int type is ~ 20% more efficient than char or short (sign/zero extension)
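
The ARM point can be made concrete with a (hypothetical) pair of loops: with a counter narrower than int the compiler must re-truncate the value every iteration, typically an extra extension instruction; with int it does not.

    /* Counter narrower than int: i must be kept in 8-bit range, */
    /* which costs an extra zero-extension per iteration.        */
    void scale_char(unsigned char n, int *a) {
        for (unsigned char i = 0; i < n; i++)
            a[i] *= 2;
    }

    /* Counter of the default int type: no extension needed. */
    void scale_int(int n, int *a) {
        for (int i = 0; i < n; i++)
            a[i] *= 2;
    }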

  43. Code Optimization of Parallel Programs • Crises lead to paradigm shifts • Most of our algorithms break down in light of parallelism! • Many “band aids” exist, need a unified solution • Need a way to represent synchronization, parallelism in our dataflow analyses • Need a new IR … PIR • CFG + PDG = PPG • Can then update our dataflow analyses

  44. The Big Picture We now know: • All of the components of a compiler • What needs to be done statically vs. dynamically • The potential impact of language or architecture changes • Why Java moved the “back-end” to run time • What compiler researchers are working on in 2008!
