
Instruction Selection



  1. Instruction Selection
  Mooly Sagiv
  Schreiber 317
  03-640-7606
  Wed 10:00-12:00
  http://www.math.tau.ac.il/~msagiv/courses/wcc.html

  2. Already Studied
  Source program (string) → lexical analysis → Tokens → syntax analysis → Abstract syntax tree → semantic analysis → Abstract syntax tree → Translate → Tree IR → Canon → Canonical Tree IR

  3. Instruction Selection
  • Input:
    • Canonical IR
    • A description of the translation rules from IR into machine language
  • Output:
    • Machine code
    • With an unbounded number of registers
    • Some prologue and epilogue instructions are still missing

  4. LABEL(l3)
  CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
  LABEL(l1)
  MOVE(TEMP t131, TEMP t128)
  MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
  MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
  LABEL(l2)
  MOVE(TEMP t103, TEMP t129)
  JUMP(NAME lend)
  LABEL(l0)
  MOVE(TEMP t129, CONST 1)
  JUMP(NAME l2)

  5. l3: beq t128, $0, l0
  l1: or t131, $0, t128
      addi t132, t128, -1
      or $4, $0, t132
      jal nfactor
      or t130, $0, $2
      or t133, $0, t131
      mult t133, t130
      mflo t133
      or t129, $0, t133
  l2: or t103, $0, t129
      b lend
  l0: addi t129, $0, 1
      b l2

  6. The Challenge
  • “Clumps” of trees can be translated into a single machine instruction
  • For example, the tree MOVE(TEMP t1, MEM(BINOP(PLUS, TEMP t2, CONST c))) can be covered by the single instruction lw t1, c(t2)

  7. Outline
  • The “Tiling” problem
  • An optimal solution
  • An optimum solution (via dynamic programming)
  • Tree grammars
  • The Pentium architecture
  • Instruction selection for Tiger
  • An abstract data type for machine instructions

  8. Instruction Set in the Jouette Machine
  ADD    ri ← rj + rk
  MUL    ri ← rj * rk
  SUB    ri ← rj - rk
  DIV    ri ← rj / rk
  ADDI   ri ← rj + c
  SUBI   ri ← rj - c
  LOAD   ri ← M[rj + c]
  STORE  M[ri + c] ← rj
  MOVEM  M[ri] ← M[rj]

  9. Tree Patterns for Jouette Machine

  10. Tree Patterns for Jouette Machine (cont.)

  11. The Tiling Problem
  • Cover the tree with non-overlapping tiles from the tree patterns
  • Minimize “the cost” of the generated code

  12. Example
  • Tiger input: a[e] := x
  • The IR tree:
    MOVE(
      MEM(BINOP(PLUS,
                MEM(BINOP(PLUS, TEMP FP, CONST -8)),
                BINOP(TIMES, TEMP te, CONST 4))),
      MEM(BINOP(PLUS, TEMP FP, CONST -4)))

  13. One tiling of the tree from slide 12; the instruction generated for each tile:
  STORE M[r1 + 0] ← r2
  LOAD  r2 ← M[FP + -4]
  ADD   r1 ← r1 + r2
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  LOAD  r1 ← M[FP + -8]

  14. The same tiling, with the instructions in execution order:
  LOAD  r1 ← M[FP + -8]
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADD   r1 ← r1 + r2
  LOAD  r2 ← M[FP + -4]
  STORE M[r1 + 0] ← r2

  15. An alternative tiling whose root tile is MOVEM; the instruction for each tile:
  MOVEM M[r1] ← M[r2]
  ADDI  r2 ← FP + -4
  ADD   r1 ← r1 + r2
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  LOAD  r1 ← M[FP + -8]

  16. The MOVEM tiling, with the instructions in execution order:
  LOAD  r1 ← M[FP + -8]
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADD   r1 ← r1 + r2
  ADDI  r2 ← FP + -4
  MOVEM M[r1] ← M[r2]

  17. The Tiling Problem
  • Cover the tree with non-overlapping tiles from the tree patterns
  • Minimize “the cost” of the generated code
  • To ensure that every tree can be covered, the tree patterns include “tiny” tiles for all the individual node kinds

  18. A tiling of the same tree that uses smaller tiles (nine instructions); the instruction for each tile:
  STORE M[r1 + 0] ← r2
  LOAD  r2 ← M[r2 + 0]
  ADDI  r2 ← FP + -4
  ADD   r1 ← r1 + r2
  LOAD  r1 ← M[r1 + 0]
  ADD   r1 ← FP + r1
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADDI  r1 ← r0 + -8

  19. A tiling built from the smallest tiles (ten instructions, in execution order):
  ADDI  r1 ← r0 + -8
  ADD   r1 ← FP + r1
  LOAD  r1 ← M[r1 + 0]
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADD   r1 ← r1 + r2
  ADDI  r2 ← r0 + -4
  ADD   r2 ← FP + r2
  LOAD  r2 ← M[r2 + 0]
  STORE M[r1 + 0] ← r2

  20. Optimal vs. Optimum Tiling
  • Optimum tiling: the sum of the tile costs is the minimum over all tilings
  • Optimal tiling: no two adjacent tiles can be combined into a single tile of lower cost
  • Every optimum tiling is also optimal, but not the other way around

  21. The six-instruction LOAD/STORE tiling (slide 13) again; the instruction for each tile:
  STORE M[r1 + 0] ← r2
  LOAD  r2 ← M[FP + -4]
  ADD   r1 ← r1 + r2
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  LOAD  r1 ← M[FP + -8]

  22. The nine-instruction tiling (slide 18) again; the instruction for each tile:
  STORE M[r1 + 0] ← r2
  LOAD  r2 ← M[r2 + 0]
  ADDI  r2 ← FP + -4
  ADD   r1 ← r1 + r2
  LOAD  r1 ← M[r1 + 0]
  ADD   r1 ← FP + r1
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADDI  r1 ← r0 + -8

  23. Optimum Tiling
  The LOAD/STORE sequence:
  LOAD  r1 ← M[FP + -8]
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADD   r1 ← r1 + r2
  LOAD  r2 ← M[FP + -4]
  STORE M[r1 + 0] ← r2
  The MOVEM sequence:
  LOAD  r1 ← M[FP + -8]
  ADDI  r2 ← r0 + 4
  MUL   r2 ← te * r2
  ADD   r1 ← r1 + r2
  ADDI  r2 ← FP + -4
  MOVEM M[r1] ← M[r2]

  24. RISC vs. CISC Machines

  25. Architecture and the Tiling Algorithm
  • RISC
    • The cost of the operations is uniform
    • Optimal tiling usually suffices
  • CISC
    • Optimum tiling may be significantly better

  26. Optimal Tiling using “Maximal Munch”
  • Top-down traversal of the IR tree
  • At every node, try the relevant tree patterns in “cost order”, largest tile first
  • Generate the assembly code in reverse order
  • The tiny tiles guarantee that we can never get stuck
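The next slides show the actual munchStm/munchExp fragments from the Tiger compiler. As a warm-up, here is a minimal self-contained sketch of the algorithm on an invented toy tree type; the Node struct, the small tile subset, and the register-numbering convention are made up for this illustration only (they are not the Tiger compiler's T_exp interface). At each node the largest matching tile is tried first, the subtrees the chosen tile leaves uncovered are munched recursively, and one instruction is emitted per tile.

  /* maximal_munch_sketch.c -- toy illustration only */
  #include <stdio.h>
  #include <stdlib.h>

  typedef enum { CONST, TEMP, PLUS, MEM } Kind;
  typedef struct Node {
    Kind kind;
    int value;                  /* constant value, or temp number for TEMP */
    struct Node *left, *right;  /* children (NULL where unused)            */
  } Node;

  static Node *mk(Kind k, int v, Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    n->kind = k; n->value = v; n->left = l; n->right = r;
    return n;
  }

  static int next_reg = 100;    /* fresh result registers r100, r101, ...  */

  /* Munch an expression: biggest tile first, then the uncovered subtrees. */
  static int munchExp(Node *e) {
    if (e->kind == TEMP)        /* toy convention: temp i already lives in r_i */
      return e->value;
    int r = next_reg++;
    if (e->kind == MEM && e->left->kind == PLUS &&
        e->left->right->kind == CONST) {                 /* MEM(+(e1, CONST c)) */
      int r1 = munchExp(e->left->left);
      printf("LOAD  r%d <- M[r%d + %d]\n", r, r1, e->left->right->value);
    } else if (e->kind == MEM) {                          /* MEM(e1)             */
      int r1 = munchExp(e->left);
      printf("LOAD  r%d <- M[r%d + 0]\n", r, r1);
    } else if (e->kind == PLUS && e->right->kind == CONST) { /* +(e1, CONST c)   */
      int r1 = munchExp(e->left);
      printf("ADDI  r%d <- r%d + %d\n", r, r1, e->right->value);
    } else if (e->kind == PLUS) {                         /* +(e1, e2)           */
      int r1 = munchExp(e->left), r2 = munchExp(e->right);
      printf("ADD   r%d <- r%d + r%d\n", r, r1, r2);
    } else {                                              /* tiny tile: CONST c  */
      printf("ADDI  r%d <- r0 + %d\n", r, e->value);
    }
    return r;
  }

  int main(void) {
    /* MEM(+(MEM(+(TEMP 1, CONST -8)), CONST 4)): two LOAD tiles suffice */
    Node *e = mk(MEM, 0,
                 mk(PLUS, 0,
                    mk(MEM, 0, mk(PLUS, 0, mk(TEMP, 1, NULL, NULL),
                                           mk(CONST, -8, NULL, NULL)), NULL),
                    mk(CONST, 4, NULL, NULL)), NULL);
    munchExp(e);  /* prints: LOAD r101 <- M[r1 + -8]  then  LOAD r100 <- M[r101 + 4] */
    return 0;
  }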

  27. static void munchStm(T_stm s) {
        switch (s->kind) {
        case T_MOVE:
          T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
          if (dst->kind == T_MEM)
            if (dst->u.MEM->kind == T_BINOP &&
                dst->u.MEM->u.BINOP.op == T_PLUS &&
                dst->u.MEM->u.BINOP.right->kind == T_CONST) {
              T_exp e1 = dst->u.MEM->u.BINOP.left, e2 = src;
              /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
              munchExp(e1); munchExp(e2); emit("STORE");
            } else if (dst->u.MEM->kind == T_BINOP &&
                       dst->u.MEM->u.BINOP.op == T_PLUS &&
                       dst->u.MEM->u.BINOP.left->kind == T_CONST) {
              T_exp e1 = dst->u.MEM->u.BINOP.right, e2 = src;
              /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
              munchExp(e1); munchExp(e2); emit("STORE");
            }

  28. static void munchStm(T_stm s) {
        switch (s->kind) {
        case T_MOVE:
          T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
          if (dst->kind == T_MEM)
            if (…) {
              /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
              munchExp(e1); munchExp(e2); emit("STORE");
            } else if (…) {
              /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
              munchExp(e1); munchExp(e2); emit("STORE");
            } else if (src->kind == T_MEM) {
              T_exp e1 = dst->u.MEM, e2 = src->u.MEM;
              /* MOVE(MEM(e1), MEM(e2)) */
              munchExp(e1); munchExp(e2); emit("MOVEM");
            } else {
              T_exp e1 = dst->u.MEM, e2 = src;
              /* MOVE(MEM(e1), e2) */
              munchExp(e1); munchExp(e2); emit("STORE");
            }

  29. case T_MOVE:
        T_exp dst = s->u.MOVE.dst, src = s->u.MOVE.src;
        if (dst->kind == T_MEM)
          if (…) {
            /* MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2) */
            munchExp(e1); munchExp(e2); emit("STORE");
          } else if (…) {
            /* MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2) */
            munchExp(e1); munchExp(e2); emit("STORE");
          } else if (…) {
            /* MOVE(MEM(e1), MEM(e2)) */
            munchExp(e1); munchExp(e2); emit("MOVEM");
          } else {
            /* MOVE(MEM(e1), e2) */
            munchExp(e1); munchExp(e2); emit("STORE");
          }
        else if (dst->kind == T_TEMP) {
          T_exp e = src;
          /* MOVE(TEMP t, e) */
          munchExp(e); emit("ADD");
        } else
          assert(0);

  30. static void munchStm(T_stm s) {
        MOVE(MEM(BINOP(PLUS, e1, CONST c)), e2)  →  munchExp(e1); munchExp(e2); emit("STORE");
        MOVE(MEM(BINOP(PLUS, CONST c, e1)), e2)  →  munchExp(e1); munchExp(e2); emit("STORE");
        MOVE(MEM(e1), MEM(e2))                   →  munchExp(e1); munchExp(e2); emit("MOVEM");
        MOVE(TEMP t, e)                          →  munchExp(e); emit("ADD");
        JUMP(e)                                  →  …
        CJUMP(e)                                 →  …
        LABEL(l)                                 →
      }

  31. static void munchExp(T_exp e) {
        MEM(BINOP(PLUS, e, CONST c))  →  munchExp(e); emit("LOAD");
        MEM(BINOP(PLUS, CONST c, e))  →  munchExp(e); emit("LOAD");
        MEM(CONST c)                  →  emit("LOAD");
        MEM(e)                        →  munchExp(e); emit("LOAD");
        BINOP(PLUS, e, CONST c)       →  munchExp(e); emit("ADDI");
        BINOP(PLUS, CONST c, e)       →  munchExp(e); emit("ADDI");
        CONST c                       →  emit("ADDI");
        BINOP(PLUS, e1, e2)           →  munchExp(e1); munchExp(e2); emit("ADD");
        …
        TEMP t                        →
      }

  32. Example (the IR tree for a[e] := x again):
    MOVE(
      MEM(BINOP(PLUS,
                MEM(BINOP(PLUS, TEMP FP, CONST -8)),
                BINOP(TIMES, TEMP te, CONST 4))),
      MEM(BINOP(PLUS, TEMP FP, CONST -4)))

  33. Optimum Tiling
  • Maximal Munch does not necessarily produce optimum results
  • The number of potential code sequences grows very quickly with the size of the tree
  • But dynamic programming yields an optimum solution in linear time
  • Assign an optimum cost to every subtree
  • Two-phase solution:
    • Find the optimum cost of every subtree in a bottom-up traversal
    • Generate the optimum code in a top-down traversal, skipping the subtrees that are covered inside an already chosen tile

  34. Dynamic Programming
  • For each subtree with root n (bottom-up):
    • For each tile t of cost c that matches at n, the cost of choosing t is c + Σ ci, where the ci are the already computed optimum costs of the subtrees at the leaves of t
    • The optimum cost of the subtree rooted at n is the minimum over all matching tiles
  • Generate the optimum code during a top-down traversal
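Here is a minimal self-contained sketch of the bottom-up cost pass, again on an invented toy Node type rather than the Tiger IR (the struct, the tile subset, and the uniform cost of 1 per tile are assumptions made only for this illustration). Each node records the cheapest tile that matches there and the optimum cost of its subtree; the top-down emission phase of the next slides then simply follows the recorded best-tile choices.

  /* dp_tiling_sketch.c -- toy illustration of the bottom-up cost pass.
     Tiles considered, each with cost 1:
       ADDI  matches CONST c                  (no leaves)
       ADDI  matches PLUS(e, CONST c)         (leaf: e)
       ADD   matches PLUS(e1, e2)             (leaves: e1, e2)
       LOAD  matches MEM(e)                   (leaf: e)
       LOAD  matches MEM(PLUS(e, CONST c))    (leaf: e)                      */
  #include <stdio.h>
  #include <stdlib.h>

  typedef enum { CONST, PLUS, MEM } Kind;
  typedef struct Node {
    Kind kind;
    int value;
    struct Node *left, *right;
    int cost;          /* optimum cost of the subtree rooted here            */
    const char *best;  /* name of the cheapest tile that matches here        */
  } Node;

  static Node *mk(Kind k, int v, Node *l, Node *r) {
    Node *n = calloc(1, sizeof *n);
    n->kind = k; n->value = v; n->left = l; n->right = r;
    return n;
  }

  /* One candidate tile at n: its own cost plus the (already computed)
     optimum costs of the subtrees at the tile's leaves.                     */
  static void try_tile(Node *n, const char *name, int tile_cost,
                       Node *leaf1, Node *leaf2) {
    int c = tile_cost + (leaf1 ? leaf1->cost : 0) + (leaf2 ? leaf2->cost : 0);
    if (n->best == NULL || c < n->cost) { n->cost = c; n->best = name; }
  }

  /* Post-order traversal: children first, then every tile that matches n.   */
  static void assign_costs(Node *n) {
    if (n->left)  assign_costs(n->left);
    if (n->right) assign_costs(n->right);
    if (n->kind == CONST)
      try_tile(n, "ADDI", 1, NULL, NULL);
    if (n->kind == PLUS) {
      try_tile(n, "ADD", 1, n->left, n->right);
      if (n->right->kind == CONST)               /* bigger tile, fewer leaves */
        try_tile(n, "ADDI", 1, n->left, NULL);
    }
    if (n->kind == MEM) {
      try_tile(n, "LOAD", 1, n->left, NULL);
      if (n->left->kind == PLUS && n->left->right->kind == CONST)
        try_tile(n, "LOAD", 1, n->left->left, NULL);
    }
  }

  int main(void) {
    /* The tree of slide 35: MEM(BINOP(PLUS, CONST 1, CONST 2)) */
    Node *t = mk(MEM, 0, mk(PLUS, 0, mk(CONST, 1, NULL, NULL),
                                     mk(CONST, 2, NULL, NULL)), NULL);
    assign_costs(t);
    printf("optimum cost = %d, root tile = %s\n", t->cost, t->best);
    return 0;
  }

Run on that tree, this prints an optimum cost of 2 with a LOAD tile at the root, matching the two-instruction result of slide 41 (ADDI r1 ← r0 + 1; LOAD r1 ← M[r1 + 2]).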

  35. Example: the tree MEM(BINOP(PLUS, CONST 1, CONST 2))

  36. CONST 1

  37. CONST 2

  38. The tiles that match the BINOP(PLUS, CONST 1, CONST 2) node: BINOP(PLUS, e1, e2), BINOP(PLUS, e, CONST c), and BINOP(PLUS, CONST c, e)

  39. The MEM node of MEM(BINOP(PLUS, CONST 1, CONST 2))

  40. The tiles that match the MEM node: MEM(e), MEM(BINOP(PLUS, e1, e2)), MEM(BINOP(PLUS, e, CONST c)), and MEM(BINOP(PLUS, CONST c, e))

  41. Top-Down Code Generation
  • Optimum costs and tiles: MEM: LOAD (cost 2), BINOP: ADDI (cost 2), CONST 1: ADDI (cost 1), CONST 2: ADDI (cost 1)
  • Generated code:
    ADDI r1 ← r0 + 1
    LOAD r1 ← M[r1 + 2]

  42. The “Schizo”-Jouette Machine
  • In the spirit of the Motorola 68000
  • Two types of registers: data registers and address registers
  • Arithmetic is performed on data registers
  • Load and Store use address registers
  • Machine instructions convert between addresses and data

  43. Tree Patterns for Schizo-Jouette

  44. Tree Patterns for Schizo-Jouette Machine

  45. Tree Patterns for Schizo-Jouette

  46. Tree Grammars
  • A generalization of dynamic programming
  • Input:
    • A (usually ambiguous) context-free grammar describing the machine's tree patterns
      • Non-terminals correspond to machine types (e.g. register classes)
      • Every production has a machine cost
    • A linearized IR tree
  • Output:
    • A parse tree with the minimum cost

  47. d TEMP t a TEMP t d +(d, d) d +(d, CONST) d +(CONST, d) d MEM(+(a, CONST)) d MEM(+(CONST, a)) d MEM(CONST) d MEM(a) d a a d Partial Grammar for Schizo-Jouette MEM(+(CONST 1, CONST 2))

  48. Simple Instruction Selection in the Pentium Architecture
  • Six general-purpose registers
  • The multiply instruction requires its left argument in eax
  • Two-address instructions
  • Arithmetic on memory operands
  • Several addressing modes
  • Variable-length instructions
  • Instructions with side effects
  • Relies on good register allocation
  For t1 ← t2 * t3:
    mov eax, t2
    mul t3
    mov t1, eax
  For t1 ← t2 + t3:
    mov t1, t2
    add t1, t3
  Arithmetic on memory:
    add [ebp - 8], ecx
  is equivalent to
    mov eax, [ebp - 8]
    add eax, ecx
    mov [ebp - 8], eax

  49. Instruction Selection in the Tiger Compiler
  • Use maximal munch
  • Store the generated code in an abstract data type
  • The following phases are machine-independent
  • The control flow of the program is explicitly represented
  • Special representation of MOVE instructions
  • Register allocation can then remove redundant MOVEs

  50. /* assem.h */
  typedef struct {Temp_labelList labels;} AS_targets;
  AS_targets AS_Targets(Temp_labelList labels);

  typedef struct AS_instr_ *AS_instr;
  typedef enum {I_OPER, I_LABEL, I_MOVE} AS_instr_kind;
  struct AS_instr_ {
    AS_instr_kind kind;
    union {
      struct {string assem; Temp_tempList dst, src; AS_targets jumps;} OPER;
      struct {string assem; Temp_label label;} LABEL;
      struct {string assem; Temp_tempList dst, src;} MOVE;
    } u;
  };
  AS_instr AS_Oper(string a, Temp_tempList d, Temp_tempList s, AS_targets j);
  AS_instr AS_Label(string a, Temp_label label);
  AS_instr AS_Move(string a, Temp_tempList d, Temp_tempList s);
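As a usage sketch of this ADT (not code from the slides): the two helpers below build an OPER and a MOVE instruction for the Jouette ADD. The Temp_TempList constructor, the util.h string typedef, and the `d0/`s0 placeholder convention inside the assem strings are assumptions about the Tiger book's temp.h, util.h and assem modules; only AS_Oper and AS_Move themselves are taken from the header above.

  #include "util.h"   /* assumed: defines the string type used in assem.h    */
  #include "temp.h"   /* assumed: Temp_temp, Temp_TempList                    */
  #include "assem.h"

  /* ADD d <- s1 + s2 as an OPER instruction.  `d0 and `s0/`s1 are assumed
     placeholders for the destination and source temps; the dst/src lists
     tell the later machine-independent phases which temps are defined and
     used, so the register allocator can substitute real registers.          */
  static AS_instr addInstr(Temp_temp d, Temp_temp s1, Temp_temp s2) {
    return AS_Oper("ADD `d0 <- `s0 + `s1\n",
                   Temp_TempList(d, NULL),
                   Temp_TempList(s1, Temp_TempList(s2, NULL)),
                   NULL);             /* no jump targets: control falls through */
  }

  /* A register-to-register move gets the dedicated MOVE form, so that the
     register allocator can later coalesce the two temps and delete the
     instruction (the special representation of MOVE mentioned on slide 49). */
  static AS_instr moveInstr(Temp_temp d, Temp_temp s) {
    return AS_Move("ADD `d0 <- r0 + `s0\n",
                   Temp_TempList(d, NULL),
                   Temp_TempList(s, NULL));
  }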
