1 / 25

Introduction

Introduction. Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). Compiler. Source Code. Compiler. Target Code. ?. Error Messages. What is Involved.

nariko
Download Presentation

Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). D. M. Akbar Hussain: Department of Software & Media Technology

  2. Compiler Source Code Compiler Target Code ? Error Messages D. M. Akbar Hussain: Department of Software & Media Technology

  3. What is Involved • Programming Languages • Components of a Compiler • Applications • Formal Languages D. M. Akbar Hussain: Department of Software & Media Technology

  4. Programming Languages • We use natural languages to communicate • We use programming languages to speak with computers D. M. Akbar Hussain: Department of Software & Media Technology

  5. Component of Compilers • Analysis • Lexical Analysis • Syntax Analysis • Semantic Analysis • Synthesis • Intermediate Code Generation • Code Optimization • Code Generation D. M. Akbar Hussain: Department of Software & Media Technology

  6. The Input • Read string input • sequence of characters • Character set: • ASCII • ISO Latin-1 • ISO 10646 (16-bit = unicode) Ada, Java • Others (EBCDIC, JIS, etc) D. M. Akbar Hussain: Department of Software & Media Technology

  7. The Output • Tokens: kind, name • Punctuation ( ) ; , [ ] • Operators + - ** := • Keywords begin end if while try catch • Identifiers Square_Root • String literals “press Enter to continue” • Character literals ‘x’ • Numeric literals • Integer • Floating_point D. M. Akbar Hussain: Department of Software & Media Technology

  8. Lexical Analysis • Free form languages All modern languages are free form • White space, tabs, new line, carriage return does not matter just Ignore these. • Ordering of token is only important • Fixed format languages • Layout is critical • Fortran, label in cols 1-6 • COBOL, area A B • Lexical analyzer must know about layout to find tokens D. M. Akbar Hussain: Department of Software & Media Technology

  9. Relevant Formalisms • Type 3 (Regular) Grammars • Regular Expressions • Finite State Machines Useful for program construction, even if hand-written D. M. Akbar Hussain: Department of Software & Media Technology

  10. Interface to Lexical Analyzer • Either:Convert entire file to a file of tokens • Lexical analyzer is separate phase • Or:Parser calls lexical analyzer to supply next token • This approach avoids extra I/O • Parser builds tree incrementally, using successive tokens as tree nodes D. M. Akbar Hussain: Department of Software & Media Technology

  11. Performance Issues • Speed • Lexical analysis can become bottleneck • Minimize processing per character • Skip blanks fast • I/O is also an issue (read large blocks) • We compile frequently • Compilation time is important • Especially during development • Communicate with parser through global variables D. M. Akbar Hussain: Department of Software & Media Technology

  12. Syntax Analysis (SA) D. M. Akbar Hussain: Department of Software & Media Technology

  13. Semantic Analyserzer D. M. Akbar Hussain: Department of Software & Media Technology

  14. Intermediate Code Generation D. M. Akbar Hussain: Department of Software & Media Technology

  15. Code Optimization D. M. Akbar Hussain: Department of Software & Media Technology

  16. Code Generation D. M. Akbar Hussain: Department of Software & Media Technology

  17. Data structure tools • Syntax tree: • Literal table: • Symbol table: D. M. Akbar Hussain: Department of Software & Media Technology

  18. Error handler • One of the difficult part of a compiler. • Must handle a wide range of errors • Must handle multiple errors. • Must not get stuck. • Must not get into an infinite loop (typical simple-minded strategy:count errors, stop if count gets too high). D. M. Akbar Hussain: Department of Software & Media Technology

  19. Kinds of errors ? D. M. Akbar Hussain: Department of Software & Media Technology

  20. Sample compiler • TINY:Pages 22-26, 4-pass compiler for the TINY language based on Pascal. • C-Minus based on C :Pages 26-27 and Appendix A. D. M. Akbar Hussain: Department of Software & Media Technology

  21. TINY Example read x; if x > 0 then fact := 1; repeat fact := fact * x; x := x - 1 until x = 0; write fact end D. M. Akbar Hussain: Department of Software & Media Technology

  22. C-Minus Example int fact( int x ) { if (x > 1) return x * fact(x-1); else return 1; } void main( void ) { int x; x = read(); if (x > 0) write( fact(x) ); } D. M. Akbar Hussain: Department of Software & Media Technology

  23. Structure of the TINY Compiler globals.h main.c util.h util.c scan.h scan.c parse.h parse.c symtab.h symtab.c analyze.h analyze.c code.h code.c cgen.h cgen.c D. M. Akbar Hussain: Department of Software & Media Technology

  24. Conditional Compilation Options • NO_PARSE: Builds a scanner-only compiler. • NO_ANALYZE: Builds a compiler that parses and scans only. • NO_CODE: Builds a compiler that performs semantic analysis, but generates no code. D. M. Akbar Hussain: Department of Software & Media Technology

  25. Listing Options (built in - not flags) • EchoSource: Echoes the TINY source program to the listing, together with line numbers. • TraceScan: Displays information on each token as the scanner recognizes it. • TraceParse: Displays the syntax tree in a linearlised format. • TraceAnalyze: Displays summary information on the symbol table and type checking. • TraceCode: Prints code generation-tracing comments to the code file. D. M. Akbar Hussain: Department of Software & Media Technology

More Related