
Compiler Design (40-414)



  1. Compiler Design (40-414) • Main Text Book: Compilers: Principles, Techniques & Tools, 2nd ed., Aho, Lam, Sethi, and Ullman, 2007 • Evaluation: • Midterm Exam 35% • Final Exam 35% • Assignments and Quizzes 10% • Project 20%

  2. Compiler learning • Isn’t it an old discipline? • Yes, it is a well-established discipline: algorithms, methods, and techniques were developed in the early stages of computer science, and there are many compilers around, with many tools to generate them automatically • So, why do we need to learn it? • Although you may never write a full compiler, the techniques you learn are useful in many tasks, such as writing an interpreter for a scripting language, validation checking for forms, and so on

  3. Terminology • Compiler: a program that translates an executable program in a source language (usually high-level) into an equivalent executable program in a target language (usually low-level) • Interpreter: a program that reads an executable program and produces the results of running that program; usually, this involves executing the source program in some fashion • Our course is mainly about compilers, but many of the same issues arise in interpreters

  4. A Compiler • Source Program → Compiler → Target Program (errors reported along the way) • Input → Target Program → Output

  5. An Interpreter • Source Program + Input → Interpreter → Output • Translates line by line • Executes each translated line immediately • Execution is slower because translation is repeated • But it usually gives better error diagnostics than a compiler

  6. A Hybrid Compiler • Source Program → Translator → Intermediate Program (errors reported by the translator) • Intermediate Program + Input → Virtual Machine → Output

  7. Classifications of Compilers • There are different types of compilers: • By construction: Single Pass vs. Multiple Pass • By type of produced code: Absolute (e.g., *.com) vs. Relocatable (e.g., *.exe)

  8. The Many Phases of a Compiler • Source Program → 1 Lexical Analyzer → 2 Syntax Analyzer → 3 Semantic Analyzer → 4 Intermediate Code Generator → 5 Code Optimizer → 6 Code Generator → 7 Peephole Optimization → Target Program • The Symbol-table Manager and the Error Handler interact with all phases • Phases 1–5 perform analysis and form the Front-End; phases 6–7 perform synthesis and form the Back-End

  9. Front-end, Back-end division • Source code → Front end → IR → Back end → Machine code (errors reported by both) • Front end maps legal code into IR • Back end maps IR onto the target machine • This division simplifies retargeting and allows multiple front ends

  10. Front end • Scanner: maps characters into tokens – the basic unit of syntax • x = x + y becomes <id, x> = <id, x> + <id, y> • Typical tokens: number, id, +, -, *, /, do, end • Eliminates white space (tabs, blanks, comments) • A key issue is speed, so instead of using a tool like LEX it is sometimes necessary to write your own scanner • Source code → Scanner → tokens → Parser → Parse Tree (errors reported along the way)
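The scanner described above can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation; the token names beyond id/number and the regex patterns are assumptions for the example.

```python
import re

# Token classes for a tiny language; "skip" covers the white space
# the slide says is eliminated. Patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[+\-*/=]"),
    ("skip",   r"[ \t\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def scan(source):
    """Map characters into (kind, lexeme) tokens, dropping white space."""
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup == "skip":
            continue                     # eliminate blanks/tabs
        tokens.append((m.lastgroup, m.group()))
    return tokens

# "x = x + y" becomes <id,x> = <id,x> + <id,y>
print(scan("x = x + y"))
```

A production scanner would also track line numbers for error messages and strip comments; both fit naturally into the same loop.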

  11. Front end • Parser: • Recognizes context-free syntax • Guides context-sensitive analysis • Constructs IR • Produces meaningful error messages • Attempts error correction • There are parser generators like YACC which automate much of the work • Source code → Scanner → tokens → Parser → Parse Tree (errors reported along the way)

  12. Front end • Context-free grammars are used to represent programming-language syntax: • <expr> ::= <expr> <op> <term> | <term> • <term> ::= <number> | <id> • <op> ::= + | -

  13. Front end • A parser tries to map a program to the syntactic elements defined in the grammar • A parse can be represented by a tree, called a parse tree or syntax tree
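The mapping above can be sketched as a recursive-descent parser for the slide-12 grammar. The left recursion in `<expr> ::= <expr> <op> <term>` is rewritten as a loop; representing tree nodes as tuples is an assumption made for compactness.

```python
# Grammar (from slide 12):
#   <expr> ::= <expr> <op> <term> | <term>
#   <term> ::= <number> | <id>
#   <op>   ::= + | -
# Input is a list of (kind, lexeme) tokens as a scanner would produce.

def parse_expr(tokens):
    pos = 0

    def term():
        nonlocal pos
        kind, value = tokens[pos]
        if kind not in ("number", "id"):
            raise SyntaxError(f"expected number or id, got {value!r}")
        pos += 1
        return (kind, value)

    node = term()
    # Left recursion becomes iteration, yielding a left-associative tree.
    while pos < len(tokens) and tokens[pos][1] in ("+", "-"):
        op = tokens[pos][1]
        pos += 1
        node = ("expr", node, op, term())
    return node

tokens = [("id", "x"), ("op", "+"), ("id", "y"), ("op", "-"), ("number", "1")]
print(parse_expr(tokens))
# → ('expr', ('expr', ('id', 'x'), '+', ('id', 'y')), '-', ('number', '1'))
```

A generator like YACC would produce an equivalent (table-driven) parser from the grammar directly.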

  14. Front end • A parse tree can be represented more compactly; this compact form is referred to as an Abstract Syntax Tree (AST) • The AST can be used as the IR between the front end and the back end
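To see why the AST is more compact, compare the two trees for `initial + rate * 60` (the expression used in later slides). The exact node shapes below are illustrative assumptions; the point is that single-child chain nodes from the grammar disappear in the AST.

```python
# Parse tree: every grammar-rule application becomes a node,
# including the <expr> -> <term> -> <id> chains.
parse_tree = ("expr",
              ("expr", ("term", ("id", "initial"))),
              ("op", "+"),
              ("term", ("term", ("id", "rate")),
                       ("op", "*"),
                       ("number", "60")))

# AST: operators are interior nodes, operands are leaves,
# and the chain nodes are collapsed away.
ast = ("+", "initial", ("*", "rate", "60"))

def count_nodes(t):
    """Count nodes, treating non-tuples (leaf labels) as single nodes."""
    if not isinstance(t, tuple):
        return 1
    return 1 + sum(count_nodes(child) for child in t[1:])

print(count_nodes(parse_tree), count_nodes(ast))
```

The node counts make the compression concrete, which matters when the tree is the IR passed between phases.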

  15. Back end • Translate IR into target machine code • Instruction selection: choose instructions for each IR operation • Register allocation: decide what to keep in registers at each point • IR → Back end → Machine code (errors reported along the way)

  16. Back end • Produce compact, fast code • Use available addressing modes • IR → Code Generation → Peephole Optimization → Machine code (errors reported along the way)

  17. Back end • Registers are a limited resource • Optimal allocation is difficult • IR → Code Generation → Peephole Optimization → Machine code (errors reported along the way)

  18. The Analysis Task For Compilation • Three Phases: • Lexical Analysis: • Left-to-right scan to identify tokens (token: a sequence of characters having a collective meaning) • Syntax Analysis: • Grouping of tokens into meaningful collections • Semantic Analysis: • Checking to ensure correctness of components

  19. Phase 1. Lexical Analysis • Easiest analysis – identify tokens, the basic building blocks • For example, in Position := initial + rate * 60 ; every item (Position, :=, initial, +, rate, *, 60, ;) is a token • Blanks, line breaks, etc. are scanned out

  20. Phase 2. Syntax Analysis or Parsing • For the previous example, we would have a parse tree: an assignment statement whose left child is the identifier position and whose right child is the expression initial + rate * 60, with the * subtree (rate, 60) nested inside the + subtree • Nodes of the tree are constructed using a grammar for the language

  21. Phase 3. Semantic Analysis • Finds semantic errors • One of the most important activities in this phase: type checking – legality of operands • Conversion action on the syntax tree: the integer literal 60 is wrapped in an inttoreal node so it can legally be multiplied with the real-typed rate
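The conversion action above can be sketched as a small type checker over the tuple-shaped syntax tree. The symbol-table types and the node encoding are assumptions for this example; only the inttoreal insertion rule comes from the slide.

```python
# Assumed declared types for the running example's identifiers.
SYMBOLS = {"position": "real", "initial": "real", "rate": "real"}

def typecheck(node):
    """Return (typed_node, type), inserting inttoreal where int meets real."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        return node, SYMBOLS[node]
    op, lhs, rhs = node
    lhs, lt = typecheck(lhs)
    rhs, rt = typecheck(rhs)
    if lt != rt:                       # mixed int/real: convert the int side
        if lt == "int":
            lhs, lt = ("inttoreal", lhs), "real"
        else:
            rhs, rt = ("inttoreal", rhs), "real"
    return (op, lhs, rhs), lt

# initial + rate * 60: the literal 60 gets wrapped in inttoreal.
tree = ("+", "initial", ("*", "rate", 60))
print(typecheck(tree))
```

A fuller checker would also reject genuinely illegal operand combinations instead of only converting, which is the "legality of operands" part of this phase.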

  22. Supporting Phases/ Activities for Analysis • Symbol Table Creation / Maintenance • Contains Info (storage, type, scope, args) on Each “Meaningful” Token, Typically Identifiers • Data Structure Created / Initialized During Lexical Analysis • Utilized / Updated During Later Analysis & Synthesis • Error Handling • Detection of Different Errors Which Correspond to All Phases • What Happens When an Error Is Found?
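The symbol-table life cycle described above (created during lexical analysis, updated and consulted later) can be sketched as follows. The field names and class shape are assumptions for illustration; real compilers add storage, scope nesting, and argument info as the slide lists.

```python
class SymbolTable:
    """Minimal symbol table: insert during scanning, enrich later."""

    def __init__(self):
        self.entries = {}

    def insert(self, name):
        # Lexical analysis: record the identifier the first time it is seen.
        self.entries.setdefault(name, {"type": None, "scope": None})

    def update(self, name, **info):
        # Later phases: fill in type, scope, storage, arguments, ...
        self.entries[name].update(info)

    def lookup(self, name):
        return self.entries.get(name)

table = SymbolTable()
for ident in ("position", "initial", "rate"):
    table.insert(ident)                                # during scanning
table.update("rate", type="real", scope="global")      # during semantic analysis
print(table.lookup("rate"))
```

When an error is found, the entry (or its absence on lookup) is exactly what the error handler reports against, which is why the table is shared across all phases.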

  23. The Synthesis Task For Compilation • Intermediate Code Generation • Abstract Machine Version of Code - Independent of Architecture • Easy to Produce, Simplifying Final, Machine-Dependent Code Generation • Code Optimization • Find More Efficient Ways to Execute Code • Replace Code With More Optimal Statements • Final Code Generation • Generate Relocatable Machine-Dependent Code • Peephole Optimization • Improves the Final Code Using a Very Limited View of It

  24. Reviewing the Entire Process • position := initial + rate * 60 → lexical analyzer → id1 := id2 + id3 * 60 → syntax analyzer → tree := (id1, + (id2, * (id3, 60))) → semantic analyzer → tree := (id1, + (id2, * (id3, inttoreal(60)))) → intermediate code generator • Symbol Table: position …, initial …, rate … • Errors reported throughout

  25. Reviewing the Entire Process • Intermediate code generator produces 3-address code: t1 := inttoreal(60) / t2 := id3 * t1 / t3 := id2 + t2 / id1 := t3 • Code optimizer: t1 := id3 * 60.0 / id1 := id2 + t1 • Final code generator: MOVF id3, R2 / MULF #60.0, R2 / MOVF id2, R1 / ADDF R2, R1 / MOVF R1, id1 • Symbol Table: position …, initial …, rate … • Errors reported throughout
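The optimizer step on this slide, folding the compile-time inttoreal(60) into its use, can be sketched over the 3-address code. The tuple instruction format is an assumption for the example; only the transformation itself comes from the slide.

```python
# 3-address instructions as tuples: (op, dest, operand[, operand]).
# fold_inttoreal drops "t := inttoreal(const)" and substitutes the
# resulting real constant into later uses of t.

def fold_inttoreal(code):
    consts, out = {}, []
    for instr in code:
        if instr[0] == "inttoreal" and isinstance(instr[2], int):
            consts[instr[1]] = float(instr[2])   # dest -> folded constant
            continue                             # drop the conversion
        # Replace any operand that names a folded constant.
        out.append(tuple(consts.get(x, x) for x in instr))
    return out

code = [
    ("inttoreal", "t1", 60),        # t1 := inttoreal(60)
    ("*", "t2", "id3", "t1"),       # t2 := id3 * t1
    ("+", "t3", "id2", "t2"),       # t3 := id2 + t2
    (":=", "id1", "t3"),            # id1 := t3
]
print(fold_inttoreal(code))
```

A real optimizer would follow this with copy propagation (eliminating t3, as the slide's optimized code does) using the same pattern of scanning and substituting.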
