Code Generation

CPSC 388

Ellen Walker

Hiram College

Intermediate Representations
  • Source code
    • Parse tree (or abstract syntax tree)
    • Symbol table
    • Intermediate code
  • Target code
Why Intermediate Code?
  • Easier analysis for optimization
  • Multiple target machines
  • Direct interpretation (e.g. Java bytecode, Pascal P-code)
3-Address Code
  • Statements like x = y op z
  • Generous use of temporary variables
    • One for each internal node of (abstract) parse tree
  • Closely related to arithmetic expression
    • Example: a = b*(c+d) becomes:

tmp1 = c+d

a = b*tmp1

Beyond Math Operations
  • There is no standard form of 3-address code
  • Other operators used in the textbook
    • Comparison operators (e.g. x = y == z)
    • I/O (read x and write x)
    • Conditional & unconditional branch operators (if_true x goto L1, goto L2)
    • Label instructions (label L1)
    • Halt instruction (halt)
Representing 3-address code
  • Quadruple implementation
    • 4 fields: (op,y,z,x) for x=y op z
    • Fields are null if not needed, e.g. (rd,x,_,_)
    • Instead of names, put pointers into symbol table
  • Triple implementation
    • The 4th field of a quadruple is almost always a temp, so drop it
    • Don’t name the temp; refer to the triple’s index instead
Example: a = b+(c*d)

    Quadruple        Triple
    (rd,c,_,_)       1: (rd,c,_)
    (rd,d,_,_)       2: (rd,d,_)
    (mul,c,d,t1)     3: (mul,c,d)
    (rd,b,_,_)       4: (rd,b,_)
    (add,b,t1,t2)    5: (add,b,3)
    (asn,a,t2,_)     6: (asn,a,5)

In the triple column, the operands 3 and 5 are triple indices that stand in for t1 and t2.
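The relationship between the two representations can be sketched in C++. This is only an illustration under stated assumptions: the names `Quad`, `Triple`, `example_quads`, `to_triples` and `show` are hypothetical, and triple references are printed as `(3)` rather than a bare `3` so they cannot be confused with integer constants.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A quadruple names its result explicitly: (op, arg1, arg2, result).
struct Quad { std::string op, arg1, arg2, result; };

// A triple has no result field; later triples refer to it by index.
struct Triple { std::string op, arg1, arg2; };

// The quadruples from the example a = b+(c*d); empty string = unused field.
std::vector<Quad> example_quads() {
    return {
        {"rd", "c", "", ""},
        {"rd", "d", "", ""},
        {"mul", "c", "d", "t1"},
        {"rd", "b", "", ""},
        {"add", "b", "t1", "t2"},
        {"asn", "a", "t2", ""},
    };
}

// Drop each result field and replace every later use of that temp
// with the 1-based index of the triple that computed it.
std::vector<Triple> to_triples(const std::vector<Quad>& quads) {
    std::map<std::string, std::string> ref;  // temp name -> "(index)"
    std::vector<Triple> out;
    for (std::size_t i = 0; i < quads.size(); ++i) {
        const Quad& q = quads[i];
        auto sub = [&](const std::string& a) -> std::string {
            auto it = ref.find(a);
            return it == ref.end() ? a : it->second;
        };
        out.push_back({q.op, sub(q.arg1), sub(q.arg2)});
        if (!q.result.empty()) ref[q.result] = "(" + std::to_string(i + 1) + ")";
    }
    return out;
}

// Render a triple, writing "_" for an unused field.
std::string show(const Triple& t) {
    auto field = [](const std::string& s) { return s.empty() ? std::string("_") : s; };
    return "(" + t.op + "," + field(t.arg1) + "," + field(t.arg2) + ")";
}
```

Calling `to_triples(example_quads())` and printing each element with `show` reproduces the triple column, with the temps replaced by index references.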
P-Code
  • Developed for Pascal compilers
  • Code for hypothetical P-machine
  • P-machine is a stack (0-address) machine (load instructions still take one address operand)
    • Load = push, Store = pop
    • Operators act on top element(s) of stack
    • No temp. variable names needed
P-Code operators
  • LDC x - load const. x
  • LDA x - load addr. x
  • LOD x - load var. x
  • STO - store val in addr
  • STN - store & push
  • MPI - multiply integers
  • SBI - subtract integers
  • ADI - add integers
  • RDI - read int
  • WRI - write int
  • LAB - label
  • FJP - jump on false
  • GRT - >
  • EQU - =
  • STP - stop
Example: a = b+(c*d)
  • LDA a
  • LOD d
  • LOD c
  • MPI
  • LOD b
  • ADI
  • STO
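To check that this sequence really computes b+(c*d), it can be run on a toy stack machine. The following is a minimal sketch, not the actual P-machine: it assumes variables live in a name-to-value map, that an address is represented by the variable's name, and it handles only the load, store and integer arithmetic operators. `Cell` and `run_pcode` are hypothetical names.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One stack slot: either an integer value or (if addr is non-empty) an address.
struct Cell { int value; std::string addr; };

// Run whitespace-separated P-code; `mem` maps variable names to values
// and is updated in place by STO/STN.
void run_pcode(const std::string& program, std::map<std::string, int>& mem) {
    std::vector<Cell> stack;
    std::istringstream in(program);
    std::string op, arg;
    auto pop = [&] { Cell c = stack.back(); stack.pop_back(); return c; };
    while (in >> op) {
        if (op == "LDA") { in >> arg; stack.push_back({0, arg}); }                // push address
        else if (op == "LOD") { in >> arg; stack.push_back({mem[arg], ""}); }     // push value
        else if (op == "LDC") { in >> arg; stack.push_back({std::stoi(arg), ""}); }
        else if (op == "ADI" || op == "SBI" || op == "MPI") {
            int rhs = pop().value;   // top of stack
            int lhs = pop().value;   // second element
            int r = op == "ADI" ? lhs + rhs : op == "SBI" ? lhs - rhs : lhs * rhs;
            stack.push_back({r, ""});
        } else if (op == "STO" || op == "STN") {
            Cell v = pop();          // value to store
            Cell target = pop();     // address pushed earlier by LDA
            mem[target.addr] = v.value;
            if (op == "STN") stack.push_back(v);  // STN leaves the value on the stack
        }
    }
}
```

With b = 2, c = 3 and d = 4 in `mem`, running the seven instructions above leaves a = 14.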
P-Code as attribute
  • Include code (so far) as attribute in attribute grammar
    • exp -> id = exp
      • $$.code = LDA $1.name; $3.code; STN
    • aexp -> aexp+factor
      • $$.code = $1.code;$3.code;ADI
    • factor -> id
      • $$.code = LOD $1.name
Generating 3-address code
  • Need a meta-function to generate temp names (newtemp())
    • exp -> id = exp
      • $$.name = $3.name
      • $$.code = $3.code; “$1.name = $3.name”
    • aexp -> aexp+factor
      • $$.name = newtemp()
      • $$.code = $1.code; $3.code; “$$.name = $1.name + $3.name”
    • factor -> id
      • $$.name = $1.name
      • $$.code is empty
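The same attribute equations can be written as a recursive function that returns both synthesized attributes. This is a sketch under assumptions: the tree layout (`Node`), the helper constructors `leaf` and `make`, and the global counter behind `newtemp()` are all hypothetical, and code is built as strings exactly as the attribute grammar does.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Node {
    char op;            // '=', '+', '*', or 'i' for an identifier leaf
    std::string name;   // identifier name (leaf) or assignment target ('=')
    std::unique_ptr<Node> left, right;
};

struct Attr { std::string name, code; };  // the two synthesized attributes

int temp_count = 0;
std::string newtemp() { return "t" + std::to_string(++temp_count); }

std::unique_ptr<Node> leaf(const std::string& id) {
    return std::unique_ptr<Node>(new Node{'i', id, nullptr, nullptr});
}
std::unique_ptr<Node> make(char op, const std::string& name,
                           std::unique_ptr<Node> l, std::unique_ptr<Node> r = nullptr) {
    return std::unique_ptr<Node>(new Node{op, name, std::move(l), std::move(r)});
}

Attr gen_tac(const Node& n) {
    if (n.op == 'i') return {n.name, ""};          // factor -> id: no code
    if (n.op == '=') {                             // exp -> id = exp
        Attr rhs = gen_tac(*n.left);
        return {rhs.name, rhs.code + n.name + " = " + rhs.name + "\n"};
    }
    Attr l = gen_tac(*n.left);                     // binary operator
    Attr r = gen_tac(*n.right);
    std::string t = newtemp();
    return {t, l.code + r.code + t + " = " + l.name + " " + n.op + " " + r.name + "\n"};
}
```

For a = b*(c+d) this scheme emits `t1 = c + d`, `t2 = b * t1`, `a = t2`: one temp per internal node, so one more than the hand-written version shown earlier.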
Why real compilers don’t do this
  • Generating strings is inefficient
    • Lots of copying
    • Code is not written out as it is generated; it is just copied from attribute to attribute until the end
  • Code generation depends on inherited (not just synthesized) attributes
    • E.g. object type for assignment
    • This complicates grammars!
Practical code generation
  • Modified postorder traversal of syntax tree
  • Remember postorder:
    • Act on the children recursively
    • Act on the parent directly
  • In this case, the action is “generate code”
Code Generation

void gen_code(node *n) {
    switch (n->op) {
    case '+':
        gen_code(n->first);         // left operand
        gen_code(n->first->next);   // right operand (next sibling)
        cout << "ADI" << endl;
        break;

More Code Generation

    case '=':
        cout << "LDA " << n->name << endl;   // address of the target first
        gen_code(n->first);                  // then the right-hand side
        cout << "STN" << endl;
        break;
    }
}

Nothing new!
  • Postorder traversal visits nodes in the same order in which an LALR (bottom-up) parser performs its reductions!
  • Code for code generation looks almost like the attribute grammar
    • $n.code --> Generate_code(child N);
    • $$.attr --> n->attr; (where n is param)
Code Gen in YACC
  • Looks like attribute grammar, almost
  • Use an action embedded in the middle of the rule for assignment

exp : id { /* generate LDA code */ } '=' exp { /* generate the rest */ }

  • Can we combine code generation with other attribute computation?
Intermediate -> Target Code
  • Macro expansion
    • Direct replacement of intermediate statement with target statement(s)
    • Prepend a definition file to the code, then assemble
    • But it’s not as easy as it seems
      • Different data types require different code
      • Compiler tracks locations, etc. separately
Intermediate -> Target Code (cont)
  • Static simulation
    • Simulate results of intermediate code (i.e. interpret it)
    • Then generate equivalent assembly code to get results
    • Might include abstract interpretation (e.g. symbolic algebra)
P-code -> 3 address code
  • We must “run” the p-code to see what is on the stack for the 3 address code
  • Use a stack data structure during translation
    • For ADI: “new top” = “old second” + “old top” (operand order matters for SBI)
    • New temp. for “new top”
    • Temp or variable names stored in stack elements
  • Code is generated when stack is popped (only)
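The stack-based translation described above can be sketched as follows. This is an illustrative subset, assuming whitespace-separated P-code and only the load, store and integer arithmetic operators; `pcode_to_tac` is a hypothetical name, and pushing the stored value back after STN is an assumption of this sketch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simulate the P-machine stack, but push *names* instead of values.
// 3-address code is emitted only when operands are popped.
std::string pcode_to_tac(const std::string& pcode) {
    std::vector<std::string> stack;  // variable, temp, constant or address names
    std::ostringstream out;
    std::istringstream in(pcode);
    std::string op, arg;
    int temps = 0;
    auto pop = [&] { std::string s = stack.back(); stack.pop_back(); return s; };
    while (in >> op) {
        if (op == "LDA" || op == "LOD" || op == "LDC") {
            in >> arg;
            stack.push_back(arg);                       // no code yet, just remember the name
        } else if (op == "ADI" || op == "SBI" || op == "MPI") {
            std::string top = pop();
            std::string second = pop();
            std::string t = "t" + std::to_string(++temps);   // new temp for the new top
            const char* sym = op == "ADI" ? " + " : op == "SBI" ? " - " : " * ";
            out << t << " = " << second << sym << top << "\n";
            stack.push_back(t);
        } else if (op == "STO" || op == "STN") {
            std::string value = pop();
            std::string target = pop();
            out << target << " = " << value << "\n";
            if (op == "STN") stack.push_back(value);    // value stays available
        }
    }
    return out.str();
}
```

Applied to the P-code for a = b+(c*d) shown earlier, this produces `t1 = d * c`, `t2 = t1 + b`, `a = t2`.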
3 address code -> pcode
  • Each instruction a = b op c translates to:
    • LDA a
    • LOD b
    • LOD c
    • ADI -- or other operator based on “op”
    • STO
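The template above amounts to a line-by-line translation. The sketch below assumes each 3-address statement is written on its own line with space-separated tokens (`a = b + c` or `a = b`), treats an operand starting with a digit as a constant, and handles only `+`, `-` and `*`; `load`, `opcode` and `tac_to_pcode` are hypothetical names.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// LDC for a numeric literal, LOD for a variable or temp.
static std::string load(const std::string& operand) {
    bool is_num = !operand.empty() &&
                  std::isdigit(static_cast<unsigned char>(operand[0]));
    return (is_num ? "LDC " : "LOD ") + operand + "\n";
}

static std::string opcode(const std::string& op) {
    if (op == "+") return "ADI";
    if (op == "-") return "SBI";
    if (op == "*") return "MPI";
    return "???";  // operators outside this sketch
}

// Translate each statement independently: LDA target, load operands,
// apply the operator (if any), then STO.
std::string tac_to_pcode(const std::string& tac) {
    std::istringstream lines(tac);
    std::string line, out;
    while (std::getline(lines, line)) {
        std::istringstream in(line);
        std::string target, eq, lhs, op, rhs;
        if (!(in >> target >> eq >> lhs)) continue;  // skip blank or malformed lines
        out += "LDA " + target + "\n" + load(lhs);
        if (in >> op >> rhs) {
            out += load(rhs);
            out += opcode(op) + "\n";
        }
        out += "STO\n";
    }
    return out;
}
```

For the two statements `t1 = c * d` and `a = b + t1` this emits ten instructions, including a store to t1 followed immediately by a load of t1.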
Too much Pcode!
  • 3 address code has many temps
  • Temps are simply loaded & stored without changing!
  • Sequence “lda x, lod x, sto” is useless!
  • Similarly, “lda x, lda t1, … sto, sto” doesn’t really need t1
Cleaning it up
  • Instead, use a tree form
    • Parent is op, has label of variable name
    • Children are id, num, or another op
  • Assignment statements generate no code, only an alternative label
  • Pcode generated from the eventual tree (which is essentially an expression tree)
    • Extra tmp names are ignored (p. 416)