
6. Phase 3 : Code Generation Part I


talbot

Presentation Transcript


  1. 6. Phase 3 : Code Generation Part I • Overview of compilation. • The unit directory. • genProg.cxx. • What you must do. • Compiling, assembling, downloading and running C--. • The Gnu assembler and the VxWorks dynamic linker. • Monkey see, monkey do. • gener.cxx. • Structure of an M68K assembly file. • Declarations. • Statements.

  2. Overview • A compiler is lexer + syner + gener. • Written lexer and syner. Now you write the gener. • Easiest of the three once you know what M68K code to generate. • And I tell you that bit. [Figure: the compiler reads C-- code from stdin and writes M68K code, or errors, to stdout.]

  3. The Unit Directory • The unit directory for this phase is /usr/users/staff/aosc/cm049icp/phase3 • Among other things it contains the following : • genprog.cxx : the test bed program for phase 3. • gener.template : a template file for your phase 3 programs. • gener.h : the header file for phase 3. • trueval, falseval : constants to represent true and false in M68K code. • INT_MAX_16_BIT,INT_MIN_16_BIT : maximum and minimum values for 16 bit 2s complement integers. • RPolish : struct for holding the Reverse Polish representation of C-- expressions. • makefile : the makefile for phase 3.

  4. The Unit Directory II • gener : an executable for my phase 3 program. • tests/test*.c-- : testing programs for the demo. • rpolish.cxx : Reverse Polish conversion programs. RPolish *append(RPolish *rp1, RPolish *rp2) RPolish *toRPF(Factor *fact) RPolish *toRPT(Term *term) RPolish *toRPBE(BasicExp *bexp) RPolish *toRPE(Expression *expr) • partialExp.cxx : partial code generator for expressions. void genExpression(SymTab *st, Expression *expr, int &label, int &finalLabel) • Only handles literal constants. • Do a full implementation of genExpression after you’ve got the rest of it to work.

  5. genProg.cxx
     • The test bed program is as follows :
         #include ".../phase2/syner.h"
         #include ".../phase3/gener.h"
         void main()
         {
           SymTab *st = NULL ;
           AST *ast = NULL ;
           int label = 0 ;
           synAnal(st, ast, label) ;
           generate(st, ast, label) ;  // You must write this subprogram.
         }
     • First calls synAnal to parse the C--, then calls generate to produce M68K code.
     • Input/Output is from/to stdin/stdout.
     • A 'real' compiler would use argc/argv and command line arguments to use files for Input/Output.

  6. What You Must Do • Your implementation of generate must be in a file called gener.cxx in your directory. • Take a copy of makefile and gener.template. • Print out a copy of gener.h. • Print out a copy of rpolish.cxx. • Print out a copy of partialExp.cxx. • Useful commands : testphase3, demophase3. • They work as usual. • Your program’s output must be exactly the same as mine to get the marks.

  7. Compiling, Assembling, Downloading & Running C--
     • C-- program in prog.c-- :
         const string s = "Hello\n" ;
         { cout << s ; }
     • Make and run gener (UNIX commands) :
         jaguar> make gener
         jaguar> gener < prog.c-- > a.s
         jaguar> assem a
         jaguar>
     • Connect to VxWorks box and download and run :
         rlogin moloch
         -> ld < a
         -> run
         Hello
         ->

  8. M68K Assembler & VxWorks Dynamic Linker • assem is a shell script (in /usr/users/staff/aosc/bin) which calls the Gnu Motorola 68000 assembler. • Gnu assembler is a high level Macro-Assembler. • Supports medium level memory management. • Makes variables/constants very easy. • VxWorks has a dynamic linker. • Similar to NT except that it works. • M68K programs contain calls to library subroutines. • e.g. scanf, printf. • Run-time addresses of these subroutines are not known to the compiler. • When programs are downloaded the required addresses are automatically linked in.

  9. Monkey See, Monkey Do • My code generator is in a file called gener in the unit directory for this phase. • /usr/users/staff/aosc/cm049icp/phase3 • To work out what assembly code you need to generate run gener on C-- source code and inspect the output that is produced. • More or less the approach I adopted using C source and the GNU C compiler, cc68k. • Took longer than I expected because GNU assembler uses non-standard M68K assembly code mnemonics and assembler directives. • Bloody idiots. • Rest of this lecture is just a few ‘handy hints’.

  10. Monkey See, Monkey Do II • Monkey see, monkey do is standard in the industry. • Usually have to tweak the instruction set of one chip into the instruction set of another. • Tend to stick to a small set of instructions which are common to all chips. • e.g. MOVE, ADD, JMP etc. • Usually about 20% of a CISC chip’s instruction set. • Main reason for RISC chips. • Why provide lots of instructions that no-one uses? • RISC chips have a lot fewer instructions than CISC chips. • Fewer instructions means less tweaking means less work.

  11. Top Level Structure For gener.cxx
      • Contents of gener.template :
          #include <iostream.h>
          #include <fstream.h>
          #include <iomanip.h>
          #include <ctype.h>
          #include <stddef.h>
          #include <stdlib.h>
          #include ".../lib/cstring.h"
          #include ".../phase2/syner.h"
          #include ".../phase3/rpolish.cxx"
          void genHeader()
          { cout << "genHeader\n" ; }
          void genFooter(int finalLabel)
          { cout << "genFooter\n" ; }

  12. Top Level Structure For gener.cxx II
          void genDec(SymTab *st)
          { cout << "genDec\n" ; }
          void genDeclarations(SymTab *st)
          { cout << "genDeclarations\n" ; }
          #include ".../phase3/partialExp.cxx"
          // Forward Declaration.
          void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel) ;
          void genIfSt(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genIfSt\n" ; }

  13. Top Level Structure For gener.cxx III
          void genWhileSt(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genWhileSt\n" ; }
          void genCinSt(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genCinSt\n" ; }
          void genCoutSt(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genCoutSt\n" ; }

  14. Top Level Structure For gener.cxx IV
          void genAssignSt(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genAssignSt\n" ; }
          void genStatements(SymTab *st, AST *ast, int &label, int &finalLabel)
          { cout << "genStatements\n" ; }

  15. Top Level Structure For gener.cxx V
          void generate(SymTab *st, AST *ast, int label)
          {
            int finalLabel = label++ ;
            genHeader() ;
            genDeclarations(st) ;
            genStatements(st, ast, label, finalLabel) ;
            genFooter(finalLabel) ;
          } // generate
      • finalLabel is used to label the error code for integer overflow.
      • Reserving it before any code is generated avoids using a 2-pass generator.

  16. Structure Of A M68K Assembler File • Assembler code file is made up of 3 parts : • Standard header part. • Specific assembly code generated from C-- source. • Standard footer part. • Standard header : #NO_APP _IOinteger: .asciz “%d” _Eintegeroverflow: .asciz “\n\nInteger Overflow!\n” | | Declarations go here. | .even .globl _run _run: To find out what this means RTFM.

  17. Structure Of A M68K Assembler File II • Standard footer : RTS LfinalLabel: LINK A6,#0 PEA _Eintegeroverflow JBSR __printf ADDQ.W #4,SP UNLK A6 RTS • Code after LfinalLabel label is integer overflow handling code. • Obviously, use value of finalLabel not its name. • RTS : VxWorks calls the assembly code as a subprogram. • ‘\t’ at start of all indented lines throughout assembly code. • No ‘\t’ anywhere else (except in strings).
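The footer on slide 17 could be emitted along the following lines. This is only a sketch under stated assumptions: `footerText` is a hypothetical helper that returns the text instead of printing it with cout as the template stubs do, modern `<sstream>` stands in for the template's `<iostream.h>`, and the line layout (one tab before each instruction, labels at the left margin) is inferred from the notes on the slide. Compare against the output of the reference gener before relying on it.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of gener.template): returns the standard
// footer as text. finalLabel is the integer reserved by generate(); its
// value, not its name, follows the 'L'.
std::string footerText(int finalLabel) {
    std::ostringstream out;
    out << "\tRTS\n";                     // normal return to VxWorks
    out << "L" << finalLabel << ":\n";    // integer overflow handler starts here
    out << "\tLINK A6,#0\n";
    out << "\tPEA _Eintegeroverflow\n";   // address of the error message
    out << "\tJBSR __printf\n";           // symbol spelled as on slide 17
    out << "\tADDQ.W #4,SP\n";            // pop the single argument
    out << "\tUNLK A6\n";
    out << "\tRTS\n";
    return out.str();
}
```

In genFooter the same text would simply be written to cout, since the test bed sends all output to stdout.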

  18. Variable And Constant Declarations • genDec handles a single declaration. • C-- : int i1 = 0 ; int i2 ; const string str = “Hello\n” ; bool b1 = false ; bool b2 ; • M68K : .comm i1,4 .comm i2,4 Lstr: .asciz “Hello\n” ; .comm b1,4 .comm b2,4 • Note that strings are initialised on declaration.
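A single declaration from slide 18 might be translated as sketched below. The `Decl` struct and `decText` function are hypothetical stand-ins, since the real SymTab type lives in gener.h and is not shown in these slides; only the tag names INTDATA, BOOLDATA and STRINGDATA are taken from the lecture. The placement of tabs and line breaks is a guess, and the trailing ';' that the slide shows after the string is left out on the assumption that it is a slip.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for the real types declared in gener.h.
enum DataType { INTDATA, BOOLDATA, STRINGDATA };

struct Decl {
    DataType type;
    std::string ident;
    std::string text;   // string literal as written in the source, escapes intact
};

// Returns the assembler text for one declaration.
std::string decText(const Decl &d) {
    std::ostringstream out;
    if (d.type == STRINGDATA) {
        // Strings are initialised on declaration: label, then the bytes.
        out << "L" << d.ident << ":\n";
        out << "\t.asciz \"" << d.text << "\"\n";
    } else {
        // ints and bools just reserve 4 bytes; they are initialised at run time.
        out << "\t.comm " << d.ident << ",4\n";
    }
    return out.str();
}
```

Keeping the string escapes unprocessed means the assembler, not the code generator, turns `\n` into a newline byte.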

  19. Code For genDeclarations
          void genDeclarations(SymTab *st)
          {
            SymTab *stsave = NULL ;
            stsave = st ;
            while (st != NULL) {
              genDec(st) ;
              st = st->next ;
            }
            cout << ".even\n" ;
            cout << ".globl _run\n" ;
            cout << "_run:\n" ;
            st = stsave ;
            while (st != NULL) {
              // Initialise ints and bools.
              st = st->next ;
            }
          }

  20. Initialising ints and bools
      • int and bool constants and variables must be initialised when the program runs.
      • i.e. by M68K MOVE instructions.
      • In genDeclarations :
          if ((st->initialise != NULL) && (st->type != STRINGDATA)) {
            cout << "\tMOVE.W " ;
            if (st->type == INTDATA)
              cout << '#' << st->initialise->litInt ;
            else if (st->type == BOOLDATA) {
              if (st->initialise->litBool == "true")
                cout << '#' << trueval ;
              else if (st->initialise->litBool == "false")
                cout << '#' << falseval ;
            }
            cout << ',' << st->ident << endl ;
          }

  21. genStatements
      • genStatements simply steps through the AST calling other subprograms to generate the code for individual statements :
          void genStatements(...)
          {
            while (ast != NULL) {
              if (ast->tag == IFST) genIfSt(st, ast, label, finalLabel) ;
              else if (ast->tag == WHILEST) genWhileSt(st, ast, label, finalLabel) ;
              else if (ast->tag == CINST) genCinSt(st, ast, label, finalLabel) ;
              else if (ast->tag == COUTST) genCoutSt(st, ast, label, finalLabel) ;
              else if (ast->tag == ASSIGNST) genAssignSt(st, ast, label, finalLabel) ;
              ast = ast->next ;
            }
          } // genStatements

  22. cin Statements • C-- : cin >> invar ; • M68K : LINK A6,#-4 LEA A6@(-4),A0 MOVE.L A0,SP@- PEA _IOinteger JBSR _scanf ADDQ.W #8,SP MOVE.L A6@(-4),invar UNLK A6 MOVE.L invar,D0 CMP.L #INT_MAX_16_BIT,D0 BGT LfinalLabel CMP.L #INT_MIN_16_BIT,D0 BLT LfinalLabel
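The cin sequence on slide 22 can be sketched as a function of the variable name and the overflow label. `cinText` is a hypothetical helper, not part of gener.template. It assumes that the numeric limits are written into the output in place of the names INT_MAX_16_BIT and INT_MIN_16_BIT, in the same way that label values replace label names; whether the reference gener does this is not stated on the slide, so check its output.

```cpp
#include <sstream>
#include <string>

// gener.h defines constants with these names. The values are the standard
// limits of a 16 bit two's complement integer, restated here so that the
// sketch compiles on its own.
const int INT_MAX_16_BIT = 32767;
const int INT_MIN_16_BIT = -32768;

// Hypothetical helper: returns the code for "cin >> invar ;".
std::string cinText(const std::string &invar, int finalLabel) {
    std::ostringstream out;
    out << "\tLINK A6,#-4\n";                       // 4 bytes of scratch space
    out << "\tLEA A6@(-4),A0\n";
    out << "\tMOVE.L A0,SP@-\n";                    // where scanf stores the value
    out << "\tPEA _IOinteger\n";                    // the "%d" format string
    out << "\tJBSR _scanf\n";
    out << "\tADDQ.W #8,SP\n";                      // pop both arguments
    out << "\tMOVE.L A6@(-4)," << invar << "\n";
    out << "\tUNLK A6\n";
    out << "\tMOVE.L " << invar << ",D0\n";         // range check on the input
    out << "\tCMP.L #" << INT_MAX_16_BIT << ",D0\n";
    out << "\tBGT L" << finalLabel << "\n";
    out << "\tCMP.L #" << INT_MIN_16_BIT << ",D0\n";
    out << "\tBLT L" << finalLabel << "\n";
    return out.str();
}
```

Both branches target the single overflow handler in the footer, which is why finalLabel has to be known before any statement code is generated.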

  23. cout Statements
      • C-- : cout << outvar ;
      • M68K for strings :
          LINK A6,#-0
          PEA Loutvar
          JBSR _printf
          ADDQ.W #4,SP
          UNLK A6
      • M68K for ints :
          LINK A6,#-4
          MOVE.L outvar,SP@-
          PEA _IOinteger
          JBSR _printf
          ADDQ.W #4,SP
          UNLK A6
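The two cout forms on slide 23 differ only in what is pushed before the call to printf. A possible sketch follows. `coutText` and its `isString` flag are hypothetical (the real code would look the type up in the symbol table), the subroutine call is spelled JBSR as on the cin slide, and the operands are copied from the slide without adjustment.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the code for printing outvar.
std::string coutText(const std::string &outvar, bool isString) {
    std::ostringstream out;
    if (isString) {
        out << "\tLINK A6,#-0\n";
        out << "\tPEA L" << outvar << "\n";          // the string is its own format
    } else {
        out << "\tLINK A6,#-4\n";
        out << "\tMOVE.L " << outvar << ",SP@-\n";   // value to print
        out << "\tPEA _IOinteger\n";                 // the "%d" format string
    }
    out << "\tJBSR _printf\n";
    out << "\tADDQ.W #4,SP\n";
    out << "\tUNLK A6\n";
    return out.str();
}
```

String constants are referenced through their L-prefixed label, while ints are referenced by their plain name, matching the declarations slide.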

  24. Assignment Statements • C-- : var = expression ; • M68K : | Code to evaluate expression. MOVE.L D0,var • Code for the expression is generated by genExpression. • Convention : result of the expression will be left in D0. • Next lecture on how to write genExpression. For now just use the partial implementation from partialExp.cxx. • Can only use literal constants.

  25. while Statements • C-- : while (condition) { statements } ; • M68K : Lstartlabel: | Code to evaluate condition. CMP.L trueval,D0 BNE Lendlabel | Code to execute statements. JMP Lstartlabel Lendlabel:

  26. while Statements II • Obviously, use the integer values of ast->whilest->startlabel and ast->whilest->endlabel rather than their names after the Ls. • Code to evaluate condition expression is generated by genExpression. • Initially can only use boolean literal constants. • Code to execute statements is generated by genStatements. • Must be forward declared as it is mutually recursive with genWhileSt and genIfSt.
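The shape of the while translation can be sketched with the condition and body code passed in as ready-made text. `whileText` is a hypothetical helper. In the real genWhileSt the two fragments would be produced by calling genExpression and genStatements at the marked points, and the labels would come from the AST node. How trueval is spelled in the output is not shown on the slide, so it is a parameter here.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: assembles the code for a while statement.
// condCode must leave its result in D0, as the lecture's convention requires.
std::string whileText(int startLabel, int endLabel,
                      const std::string &truevalText,
                      const std::string &condCode,
                      const std::string &bodyCode) {
    std::ostringstream out;
    out << "L" << startLabel << ":\n";
    out << condCode;                                // genExpression goes here
    out << "\tCMP.L " << truevalText << ",D0\n";
    out << "\tBNE L" << endLabel << "\n";           // condition false: leave loop
    out << bodyCode;                                // genStatements goes here
    out << "\tJMP L" << startLabel << "\n";         // re-test the condition
    out << "L" << endLabel << ":\n";
    return out.str();
}
```

Because the body may itself contain while and if statements, the call that produces bodyCode is where the mutual recursion with genStatements arises.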

  27. if Statements • C-- : if (condition) { statements } ; • M68K : | Code to evaluate condition. CMP.L trueval,D0 BNE Lendlabel | Code to execute statements. Lendlabel: • Obviously, use the integer value of ast->ifst->endlabel rather than its name after the Ls. • Code to evaluate condition expression is generated by genExpression. • Code to execute statements is generated by genStatements.

  28. if Statements II • C-- : if (condition) { thenstatements } ; else { elsestatements } ; • M68K : | Code to evaluate condition. CMP.L trueval,D0 BNE Lelselabel | Code to execute thenstatements. JMP Lendlabel Lelselabel: | Code to execute elsestatements. Lendlabel: • Obviously, use the integer values of ast->ifst->elselabel and ast->ifst->endlabel rather than their names after the Ls.
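Both if forms (slides 27 and 28) can be covered by one routine that branches to the else label when there is an else part and to the end label otherwise. The sketch below is hypothetical: `ifText` and its `hasElse` flag are illustrative names, the code fragments are passed in as text instead of being generated by genExpression and genStatements, and the spelling of trueval in the output is left to the caller because the slide does not show it.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: assembles the code for an if or if/else statement.
// elseLabel and elseCode are ignored when hasElse is false.
std::string ifText(bool hasElse, int elseLabel, int endLabel,
                   const std::string &truevalText,
                   const std::string &condCode,
                   const std::string &thenCode,
                   const std::string &elseCode) {
    std::ostringstream out;
    out << condCode;                                   // result left in D0
    out << "\tCMP.L " << truevalText << ",D0\n";
    out << "\tBNE L" << (hasElse ? elseLabel : endLabel) << "\n";
    out << thenCode;
    if (hasElse) {
        out << "\tJMP L" << endLabel << "\n";          // skip the else part
        out << "L" << elseLabel << ":\n";
        out << elseCode;
    }
    out << "L" << endLabel << ":\n";
    return out.str();
}
```

The only difference between the two forms is the branch target and the extra JMP, which keeps the then part from falling through into the else part.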

  29. Summary • Copy gener.template, makefile and gener (renamed dhgener) into your directory. • Print out gener.h, rpolish.cxx and partialExp.cxx. • Rename gener.template to gener.cxx. • Complete the stubs in gener.cxx in the following order : • genHeader, genFooter, genDeclarations, genDec, genCinSt, genCoutSt, genAssignSt, genIfSt, genWhileSt. • For now, assume all expressions are simply literal constants. • Use the genExpression in partialExp.cxx. • #included into gener.template.
