
CS412/413




  1. CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 13: Transforming Intermediate Code

  2. Administration
     • Prelim 1 on Monday in class
       • topics covered: regular expressions, tokenizing, context-free grammars, LL & LR parsers, static semantics
     • No class Wednesday March 3
     • Programming Assignment 2 due Friday March 5
     • Read: Appel 7, 8
     CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

  3. Where we are
     Source code (character stream)
       → Lexical analysis (regular expressions)
     Token stream
       → Syntactic analysis (grammars)
     Abstract syntax tree
       → Semantic analysis (static semantics)
     Abstract syntax tree + types
       → Intermediate code generation (translation functions)
     Intermediate code

  4. Intermediate Code
     • Abstract machine code in tree form
     • Statements
       • MOVE, EXP, JUMP, CJUMP, SEQ, LABEL, RET
     • Expressions
       • CONST, TEMP, OP, MEM, CALL, ESEQ, NAME
     • 13 kinds of tree nodes vs. hundreds of Pentium instructions: easier to generate, reason about

  5. Intermediate Representations
     • High-level IR (HIR): AST + extra node types
     • Medium-level IR (MIR)
       • intermediate between AST and assembly
       • other MIRs exist (quadruples, UCODE)
       • advantage of tree IR: easy to generate, easier to do reasonable instruction selection
     • Low-level IR (LIR): assembly code + extra pseudo-instructions

  6. IR expressions
     • CONST(i) : the integer constant i
     • TEMP(t) : a temporary register t. The abstract machine has an infinite number of these
     • OP(e1, e2) : one of the following operations
       • PLUS, MINUS, MUL, DIV, MOD
       • AND, OR, XOR, LSHIFT, RSHIFT, ARSHIFT
     • MEM(e) : contents of the memory location with address e
     • CALL(f, l) : result of function f applied to arguments l
     • ESEQ(s, e) : result of e after statement s is executed
     • NAME(n) : address of the statement labeled n

  7. IR statements
     • MOVE(e, dest) : move result of e into dest
       • dest = TEMP(t) : assign to temporary t
       • dest = MEM(e) : assign to memory location e
     • EXP(e) : evaluate e, discard result
     • SEQ(s1, s2) : execute s1 and then s2
     • JUMP(e) : jump to address e
     • CJUMP(e, l1, l2) : jump to l1 or l2 depending on whether e is true or false
     • LABEL(n) : a labeled statement (may be used in NAME, JUMP, CJUMP)
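The node kinds on slides 6 and 7 could be modeled roughly as follows. This is a minimal sketch, not the course's actual class library: `IrNodesSketch`, `IRNode`, and the factory method names are assumptions made for illustration, and only a subset of the node kinds is included. Splitting `IRExpr` from `IRStmt` lets the compiler reject ill-formed trees such as a `MEM` applied to a statement.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class IrNodesSketch {
    /** Base class: a node kind plus its child nodes, printed as KIND(child, ...). */
    abstract static class IRNode {
        final String kind;
        final List<IRNode> children;

        IRNode(String kind, IRNode... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }

        @Override
        public String toString() {
            if (children.isEmpty()) return kind;
            return kind + children.stream()
                    .map(IRNode::toString)
                    .collect(Collectors.joining(", ", "(", ")"));
        }
    }

    /** Expressions produce a value; statements do not. */
    static final class IRExpr extends IRNode {
        IRExpr(String kind, IRNode... children) { super(kind, children); }
    }

    static final class IRStmt extends IRNode {
        IRStmt(String kind, IRNode... children) { super(kind, children); }
    }

    // Expression constructors (slide 6).
    static IRExpr constant(int i) { return new IRExpr("CONST(" + i + ")"); }
    static IRExpr temp(String t) { return new IRExpr("TEMP(" + t + ")"); }
    static IRExpr op(String opName, IRExpr e1, IRExpr e2) { return new IRExpr(opName, e1, e2); }
    static IRExpr mem(IRExpr address) { return new IRExpr("MEM", address); }
    static IRExpr eseq(IRStmt s, IRExpr e) { return new IRExpr("ESEQ", s, e); }

    // Statement constructors (slide 7); MOVE takes the source first, as on the slide.
    static IRStmt move(IRExpr e, IRExpr dest) { return new IRStmt("MOVE", e, dest); }
    static IRStmt exp(IRExpr e) { return new IRStmt("EXP", e); }
    static IRStmt seq(IRStmt s1, IRStmt s2) { return new IRStmt("SEQ", s1, s2); }

    public static void main(String[] args) {
        // a = b, assuming b lives at fp+8 and a at fp+4 (offsets chosen for illustration).
        IRStmt assign = move(mem(op("PLUS", temp("fp"), constant(8))),
                             mem(op("PLUS", temp("fp"), constant(4))));
        System.out.println(assign);
    }
}
```

Because `seq` only accepts `IRStmt` arguments and `mem` only accepts an `IRExpr`, mixing the two categories is a compile-time error rather than a bug found during code generation.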

  8. Translation
     • Intermediate code generation is tree translation: abstract syntax tree → IR tree
     • Each subtree of AST translated to subtree in IR tree
     • Translation process described by translation function T[E, A]

  9. Translation Example
     • Source statement: if (b==0) a = b;
     • Rules used:
       • if location v : k ∈ A, then T[v] = MEM(PLUS(FP, CONST(k)))
       • T[E1 == E2, A] = OP(==, T[E1, A], T[E2, A])
     [Figure: the typed AST for the statement (if, boolean ==, =, int a, int b, int 0) next to its IR tree: nested SEQ nodes holding a CJUMP on the == test with targets L1 and L2, LABEL(L1), a MOVE between MEM nodes addressed as fp + 4 and fp + 8, and LABEL(L2).]

  10. Translation Code
      • Function T[E, A] corresponds to a translation method:
          abstract class ASTnode {
              abstract IRnode translate(SymTab A);
          }
      • Note similarity to type-checking method:
          Type typeCheck(SymTab A);
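To make the correspondence between T[E, A] and a `translate` method concrete, here is a small sketch of the two rules from slide 9. It is an illustration under stated assumptions: `TranslateSketch`, `Var`, `IntLiteral`, `Equals`, and the `declare`/`offsetOf` methods on `SymTab` are hypothetical names, the rule for integer literals is inferred from the CONST 0 node in the slide 9 figure, and the IR is returned as text rather than as an `IRnode` tree to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

final class TranslateSketch {
    /** Hypothetical symbol table A: maps a variable name to its frame offset k. */
    static final class SymTab {
        private final Map<String, Integer> offsets = new HashMap<>();

        void declare(String name, int offset) { offsets.put(name, offset); }

        int offsetOf(String name) {
            Integer k = offsets.get(name);
            if (k == null) throw new IllegalArgumentException("undeclared variable: " + name);
            return k;
        }
    }

    /** Each AST node translates itself; IR is plain text here to keep the sketch short. */
    interface ASTNode {
        String translate(SymTab a);
    }

    /** T[v] = MEM(PLUS(FP, CONST(k))) where A gives v the location k. */
    static final class Var implements ASTNode {
        final String name;
        Var(String name) { this.name = name; }
        @Override public String translate(SymTab a) {
            return "MEM(PLUS(FP, CONST(" + a.offsetOf(name) + ")))";
        }
    }

    /** An integer literal becomes a CONST node. */
    static final class IntLiteral implements ASTNode {
        final int value;
        IntLiteral(int value) { this.value = value; }
        @Override public String translate(SymTab a) { return "CONST(" + value + ")"; }
    }

    /** T[E1 == E2, A] = OP(==, T[E1, A], T[E2, A]): translate the subtrees, then combine. */
    static final class Equals implements ASTNode {
        final ASTNode left, right;
        Equals(ASTNode left, ASTNode right) { this.left = left; this.right = right; }
        @Override public String translate(SymTab a) {
            return "OP(==, " + left.translate(a) + ", " + right.translate(a) + ")";
        }
    }

    public static void main(String[] args) {
        SymTab a = new SymTab();
        a.declare("b", 8); // offset chosen for illustration
        System.out.println(new Equals(new Var("b"), new IntLiteral(0)).translate(a));
    }
}
```

Each `translate` body is a direct transcription of one rule: the recursive calls correspond to the T[...] terms on the right-hand side, and the symbol table is passed down unchanged, just as in a `typeCheck` method.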

  11. Translating control structure
      • If, while, return statements cause transfer of control within program
      • Idea: manage flow of control by introducing labels for statements; use CJUMP and JUMP statements to transfer control to the labels

  12. Translating if
      T[ if (E) S ] =
             CJUMP(T[E], t, f)
          t: T[S]
          f:
      = SEQ(CJUMP(T[E], NAME(t), NAME(f)), SEQ(LABEL(t), SEQ(T[S], LABEL(f))))   (if t, f fresh)
      [Figure: the same translation drawn as an IR tree of nested SEQ nodes]
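The "t, f fresh" side condition on slide 12 is what keeps nested if statements from colliding on label names. A minimal sketch of the rule follows, assuming a per-translator counter as the source of fresh labels; `IfTranslationSketch`, `freshLabel`, and `translateIf` are hypothetical names, and the already translated T[E] and T[S] are passed in as text.

```java
final class IfTranslationSketch {
    private int nextLabel = 0;

    /** Returns a label name not handed out before by this translator. */
    String freshLabel(String hint) {
        return hint + (nextLabel++);
    }

    /** condIR and bodyIR stand for the already translated T[E] and T[S]. */
    String translateIf(String condIR, String bodyIR) {
        String t = freshLabel("t");
        String f = freshLabel("f");
        return "SEQ(CJUMP(" + condIR + ", NAME(" + t + "), NAME(" + f + ")), "
             + "SEQ(LABEL(" + t + "), "
             + "SEQ(" + bodyIR + ", LABEL(" + f + "))))";
    }

    public static void main(String[] args) {
        // Nested ifs: the inner statement is translated first and gets its own labels.
        IfTranslationSketch tr = new IfTranslationSketch();
        String inner = tr.translateIf("c2", "s");
        System.out.println(tr.translateIf("c1", inner));
    }
}
```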

  13. Translating while
      T[ while (E) S ] =
          loop: CJUMP(T[E], t, f)
          t:    T[S]
                JUMP loop
          f:
      = SEQ(LABEL(loop), CJUMP(T[E], NAME(t), NAME(f)), LABEL(t), T[S], JUMP(NAME(loop)), LABEL(f))
      [Figure: the same translation drawn as an IR tree of nested SEQ nodes]
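The while rule on slide 13 can be sketched the same way. This version emits a flat list of statements, mirroring the n-ary SEQ(...) notation used on the slide. As before, the class and method names are hypothetical and the IR is represented as text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WhileTranslationSketch {
    private int nextLabel = 0;

    String freshLabel(String hint) {
        return hint + (nextLabel++);
    }

    /**
     * Emits the statements for: while (E) S.
     * condIR is the translated T[E]; bodyStmts is the translated T[S] as a list.
     */
    List<String> translateWhile(String condIR, List<String> bodyStmts) {
        String loop = freshLabel("loop");
        String t = freshLabel("t");
        String f = freshLabel("f");
        List<String> out = new ArrayList<>();
        out.add("LABEL(" + loop + ")");                                     // loop:
        out.add("CJUMP(" + condIR + ", NAME(" + t + "), NAME(" + f + "))"); // test E
        out.add("LABEL(" + t + ")");                                        // t:
        out.addAll(bodyStmts);                                              // T[S]
        out.add("JUMP(NAME(" + loop + "))");                                // back edge
        out.add("LABEL(" + f + ")");                                        // f: loop exit
        return out;
    }

    public static void main(String[] args) {
        // A loop nested in a loop: each level gets its own loop/t/f labels.
        WhileTranslationSketch tr = new WhileTranslationSketch();
        List<String> inner = tr.translateWhile("E2", Arrays.asList("S"));
        for (String stmt : tr.translateWhile("E1", inner)) {
            System.out.println(stmt);
        }
    }
}
```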

  14. Function calls, returns
      • Translate to corresponding IR node
        • if label id : l_id ∈ A, then T[id(E1, …, En), A] = CALL(l_id, T[E1], …, T[En])
        • T[return E, A] = RET(T[E, A])
        • alternatively, T[return E, A] = SEQ(MOVE(T[E], RV), JUMP(NAME(end)))

  15. Progress
      • Now have rules for transforming AST into intermediate representation
      • Can apply this to AST of each function definition to get IR for function
      • Intermediate representation has many features not found in real assembly code
        • arbitrarily deep expression trees vs. 1-2 deep
        • ability to perform statements with side-effects as part of an expression (ESEQ, CALL); undefined behavior
        • CJUMP is two-way jump rather than fall-through
      • Why do we allow this in IR at all?

  16. Canonical form
      • Idea: rewrite trees to get rid of constructs incompatible with assembly
        • arbitrarily deep expression trees: deal with this later as part of instruction tiling
        • ESEQ & CALL nodes: push ESEQ nodes upward in tree until they become SEQ nodes, push all CALL nodes up, make top-level backbone of SEQ nodes
        • CJUMP is two-way jump rather than fall-through: rewrite so jump on false is always to the very next instruction

  17. Canonical form
      • In canonical form, all SEQ nodes go down right chain: SEQ(s1, SEQ(s2, SEQ(s3, SEQ(s4, SEQ(s5, ...)))))
      • Function is just one big SEQ containing all statements: SEQ(s1,s2,s3,s4,s5,…)
      • Can translate to assembly more directly
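Once ESEQ nodes are gone, turning an arbitrarily nested SEQ tree into the right-leaning chain described on slide 17 is a simple in-order walk. The sketch below shows one way to do it; `LinearizeSketch`, `Stmt`, `Seq`, `linearize`, and `rightChain` are names invented for this example, and non-SEQ statements are kept as opaque text.

```java
import java.util.ArrayList;
import java.util.List;

final class LinearizeSketch {
    /** A non-SEQ statement, kept opaque for this sketch. */
    static class Stmt {
        final String text;
        Stmt(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** SEQ(first, second). */
    static final class Seq extends Stmt {
        final Stmt first, second;
        Seq(Stmt first, Stmt second) {
            super("SEQ(" + first + ", " + second + ")");
            this.first = first;
            this.second = second;
        }
    }

    /** Collects the non-SEQ statements of a nested SEQ tree, in execution order. */
    static List<Stmt> linearize(Stmt s) {
        List<Stmt> out = new ArrayList<>();
        collect(s, out);
        return out;
    }

    private static void collect(Stmt s, List<Stmt> out) {
        if (s instanceof Seq) {
            Seq seq = (Seq) s;
            collect(seq.first, out);   // first child executes first
            collect(seq.second, out);
        } else {
            out.add(s);
        }
    }

    /** Rebuilds the right-leaning chain SEQ(s1, SEQ(s2, ...)); needs at least one statement. */
    static Stmt rightChain(List<Stmt> stmts) {
        if (stmts.isEmpty()) throw new IllegalArgumentException("empty statement list");
        Stmt chain = stmts.get(stmts.size() - 1);
        for (int i = stmts.size() - 2; i >= 0; i--) {
            chain = new Seq(stmts.get(i), chain);
        }
        return chain;
    }

    public static void main(String[] args) {
        Stmt tree = new Seq(new Seq(new Stmt("s1"), new Stmt("s2")),
                            new Seq(new Stmt("s3"), new Stmt("s4")));
        System.out.println(rightChain(linearize(tree)));
    }
}
```

The flat list returned by `linearize` is usually the more convenient form for later passes; `rightChain` is included only to show that the list and the right-leaning SEQ tree carry the same information.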

  18. Non-canonical features
      • ESEQ nodes put a statement node underneath an expression: int x = 1 + { while (y > 0) { … } z; }
      • CALL nodes have side effects; must move to top level as EXP(CALL(…)) or MOVE(CALL(…)) to define behavior
      [Figure: an ESEQ node with a statement child S and an expression child E]

  19. ESEQ rewriting
      • Want to move ESEQ nodes up to top of tree where they can become SEQ nodes
      • Idea: define transformation rules that take an IR tree and move ESEQ nodes to top
      • Goal: move side-effecting statements to top of tree without ripping apart expressions more than necessary. This leads to better code because expression patterns can be recognized and mapped to instruction set

  20. ESEQ Transformations
      • Example transformations:
        ESEQ(s1, ESEQ(s2, e)) ⇒ ESEQ(SEQ(s1, s2), e)
        MOVE(ESEQ(s1, e), dest) ⇒ SEQ(s1, MOVE(e, dest))
        OP(ESEQ(s1, e1), e2) ⇒ ESEQ(s1, OP(e1, e2))
        OP(e1, ESEQ(s1, e2)) ⇒ ?

  21. Rewriting expressions
      • OP(e1, ESEQ(s1, e2)) ⇒ ESEQ(s1, OP(e1, e2)) ?
      • In source terms: is e1 + { a=0; e2 } the same as { a=0; e1 + e2 } ?
      [Figure: the two IR trees side by side]

  22. Introducing temporaries
      • If e1 does not commute with s1
        • i.e., {s1; e1; e2} ≠ {e1; s1; e2}
      • Must save value of e1 in temporary:
        OP(e1, ESEQ(s1, e2)) ⇒ ESEQ(SEQ(MOVE(e1, TEMP(t)), s1), OP(TEMP(t), e2))
      [Figure: the IR trees before and after the rewrite]
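The rewrite for OP(e1, ESEQ(s1, e2)) might be coded along these lines. This is a sketch, with two assumptions worth flagging: the generic `Node` class and the names `EseqRewriteSketch`, `commutes`, and `hoistRight` are invented here, and the commutation test is deliberately conservative (only a constant is assumed to commute with an arbitrary statement). A real compiler could use a sharper analysis.

```java
final class EseqRewriteSketch {
    /** Generic IR node: a kind such as "PLUS" or "ESEQ" plus children. */
    static final class Node {
        final String kind;
        final Node[] kids;

        Node(String kind, Node... kids) {
            this.kind = kind;
            this.kids = kids;
        }

        @Override
        public String toString() {
            if (kids.length == 0) return kind;
            StringBuilder sb = new StringBuilder(kind).append('(');
            for (int i = 0; i < kids.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(kids[i]);
            }
            return sb.append(')').toString();
        }
    }

    private int nextTemp = 0;

    /** Conservative assumption: only a constant commutes with every statement. */
    static boolean commutes(Node stmt, Node expr) {
        return expr.kind.startsWith("CONST");
    }

    /**
     * Rewrites OP(e1, ESEQ(s1, e2)) so the ESEQ ends up on top.
     * The caller is expected to pass a binary OP node; other shapes are returned unchanged.
     */
    Node hoistRight(Node op) {
        if (op.kids.length != 2 || !op.kids[1].kind.equals("ESEQ")) return op;
        Node e1 = op.kids[0];
        Node s1 = op.kids[1].kids[0];
        Node e2 = op.kids[1].kids[1];
        if (commutes(s1, e1)) {
            // Safe to evaluate s1 before e1.
            return new Node("ESEQ", s1, new Node(op.kind, e1, e2));
        }
        // Otherwise evaluate e1 first and keep its value in a fresh temporary.
        Node t = new Node("TEMP(t" + (nextTemp++) + ")");
        Node save = new Node("MOVE", e1, t); // MOVE(e, dest), as on slide 7
        return new Node("ESEQ", new Node("SEQ", save, s1), new Node(op.kind, t, e2));
    }

    public static void main(String[] args) {
        EseqRewriteSketch r = new EseqRewriteSketch();
        Node eseq = new Node("ESEQ", new Node("s1"), new Node("e2"));
        System.out.println(r.hoistRight(new Node("PLUS", new Node("e1"), eseq)));
    }
}
```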

  23. General case
      • When we move all ESEQ nodes to top, arbitrary expression node looks like an ESEQ whose statement part is a chain of SEQ nodes (s1, s2, s3, ...) and whose expression part is expr
      • ESEQ transformation takes arbitrary expression node, returns list of sub-statements to be executed plus final expression
      • ESEQ node built as shown
      [Figure: ESEQ node with a SEQ chain of s1, s2, s3, ... on one side and expr on the other]

  24. Interface
          class CanonicalExpr {
              IRStmt[] pre_stmts;
              IRExpr expr;
          }
          abstract class IRExpr {
              abstract CanonicalExpr canonical( );
          }
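One possible implementation of the `canonical()` interface from slide 24, for three node kinds, is sketched below. It departs from the slide in a few labeled ways: `pre_stmts` is a `List` named `preStmts` rather than an array, statements are opaque text, `Leaf`, `Eseq`, and `Op` are hypothetical subclasses, and the left operand of an `Op` is saved in a temporary whenever the right operand contributes statements and the left is not a constant (a conservative stand-in for a real commutation test).

```java
import java.util.ArrayList;
import java.util.List;

final class CanonicalSketch {
    private static int nextTemp = 0;

    /** Statements are opaque text in this sketch. */
    static final class IRStmt {
        final String text;
        IRStmt(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** Statements to run first, then an ESEQ-free expression for the value. */
    static final class CanonicalExpr {
        final List<IRStmt> preStmts;
        final IRExpr expr;
        CanonicalExpr(List<IRStmt> preStmts, IRExpr expr) {
            this.preStmts = preStmts;
            this.expr = expr;
        }
    }

    abstract static class IRExpr {
        abstract CanonicalExpr canonical();
    }

    /** CONST or TEMP: already canonical, no statements to hoist. */
    static final class Leaf extends IRExpr {
        final String text;
        Leaf(String text) { this.text = text; }
        @Override CanonicalExpr canonical() { return new CanonicalExpr(new ArrayList<>(), this); }
        @Override public String toString() { return text; }
    }

    static final class Eseq extends IRExpr {
        final IRStmt stmt;
        final IRExpr expr;
        Eseq(IRStmt stmt, IRExpr expr) { this.stmt = stmt; this.expr = expr; }
        @Override CanonicalExpr canonical() {
            CanonicalExpr inner = expr.canonical();
            List<IRStmt> pre = new ArrayList<>();
            pre.add(stmt);               // s runs before anything hoisted out of e
            pre.addAll(inner.preStmts);
            return new CanonicalExpr(pre, inner.expr);
        }
        @Override public String toString() { return "ESEQ(" + stmt + ", " + expr + ")"; }
    }

    static final class Op extends IRExpr {
        final String op;
        final IRExpr left, right;
        Op(String op, IRExpr left, IRExpr right) { this.op = op; this.left = left; this.right = right; }
        @Override CanonicalExpr canonical() {
            CanonicalExpr l = left.canonical();
            CanonicalExpr r = right.canonical();
            List<IRStmt> pre = new ArrayList<>(l.preStmts);
            IRExpr lhs = l.expr;
            boolean leftIsConst = lhs instanceof Leaf && ((Leaf) lhs).text.startsWith("CONST");
            if (!r.preStmts.isEmpty() && !leftIsConst) {
                // Right side has side effects: save the left value before they run.
                Leaf t = new Leaf("TEMP(t" + (nextTemp++) + ")");
                pre.add(new IRStmt("MOVE(" + lhs + ", " + t + ")"));
                lhs = t;
            }
            pre.addAll(r.preStmts);
            return new CanonicalExpr(pre, new Op(op, lhs, r.expr));
        }
        @Override public String toString() { return op + "(" + left + ", " + right + ")"; }
    }

    public static void main(String[] args) {
        IRExpr e = new Op("PLUS", new Leaf("TEMP(y)"), new Eseq(new IRStmt("s1"), new Leaf("TEMP(x)")));
        CanonicalExpr c = e.canonical();
        System.out.println(c.preStmts + " then " + c.expr);
    }
}
```

Returning the statements as a list plus a final expression matches the "general case" on slide 23: a caller can wrap the result back into ESEQ(SEQ(...), expr) or, at statement level, splice the list straight into the enclosing SEQ chain.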

  25. Conclusions
      • AST statements for structured control flow like “if” and “while” can be translated to unstructured IR nodes using JUMP, CJUMP, LABEL nodes.
      • Simple code transformations can transform the IR into a canonical form that has many of the properties of assembly code.
