This module provides an in-depth look at the translation process, covering compilers, interpreters, the phases of compilation, syntax analysis, and more. It also explains why compilers are divided into phases: so that each phase can be based on a formal model of computation and implemented efficiently, on the way to synthesizing correct target code.
Overview of Compilation: The Compiler Front End
Module 02.1
COP4020 – Programming Language Concepts
Dr. Manuel E. Bermudez
Overview of Translation
Definition: A translator is an algorithm that converts source programs into equivalent target programs. (Source → Translator → Target)
Definition: A compiler is a translator whose target language is at a “lower” level than its source language.
When is one language’s level “lower” than another’s?
Definition: An interpreter is an algorithm that simulates the execution of programs written in a given source language. (Source program + input → Interpreter → output)
Definition: An implementation of a programming language consists of a translator (or compiler) for that language, together with an interpreter for the corresponding target language. (Source → Compiler → Target; Target + input → Interpreter → output)
A source program may be translated an arbitrary number of times before the target program is generated. (Source → Translator1 → Translator2 → ... → TranslatorN → Target)
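As a minimal sketch of these definitions (my own illustration in Python, not code from the module; the function `implement` and the toy translators below are hypothetical), an implementation can be viewed as a chain of translators followed by an interpreter for the final target language:

```python
# Hypothetical sketch: an "implementation" = a chain of translators
# followed by an interpreter for the final target language.

def implement(source, translators, interpreter, user_input):
    program = source
    for translate in translators:            # Translator1 ... TranslatorN
        program = translate(program)         # each translation step
    return interpreter(program, user_input)  # simulate execution of the target

# Toy example: two string-level "translators" and an "interpreter" for
# a one-instruction target language ("add x y").
result = implement("add 2 3",
                   [str.split, tuple],
                   lambda target, _inp: int(target[1]) + int(target[2]),
                   None)
print(result)  # 5
```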
Each translation is a phase, not to be confused with a pass (i.e., a dump of intermediate results to disk). A compiler is divided into phases so that each phase can be based on a formal model of computation and can be implemented efficiently.
The usual division into phases: two major phases, with many possibilities for further subdivision.
Phase 1: Analysis (determine correctness).
Phase 2: Synthesis (produce target code).
Another criterion for the division:
Phase 1: Syntax (form).
Phase 2: Semantics (meaning).
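The following tiny Python sketch (my own, with trivially simplified stand-ins for the real phases) is only meant to show the analysis/synthesis split as a pipeline:

```python
# Hypothetical two-phase compiler skeleton: analysis, then synthesis.
# The bodies are trivial stand-ins, not real compiler phases.

def analyze(source):
    """Phase 1: determine correctness (here, only a trivial check)."""
    if source.count("(") != source.count(")"):
        raise SyntaxError("unbalanced parentheses")
    return source.split()                    # trivially 'analyzed' program

def synthesize(analyzed):
    """Phase 2: produce target code (here, one 'instruction' per line)."""
    return "\n".join(analyzed)

def compile_program(source):
    return synthesize(analyze(source))

print(compile_program("load x (y)"))
```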
PHASE 1: Scanning (Lexical Analysis)
Group character sequences in the source to form logical atomic units called tokens. (Source → Scanner → Sequence of Tokens)
Examples of tokens: identifiers, keywords, integers, strings, punctuation marks, “white space”, end-of-line characters, comments, etc.
Scanning proceeds sequentially, and the first character usually determines the token. A preliminary classification of tokens is made; for example, ‘program’ and ‘Ex’ are both classified as Identifier. Lexical rules must be provided: Is “_” allowed in identifiers? May comments cross line boundaries? The scanner must also deal with end-of-line and end-of-file characters.
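To make this concrete, here is a small scanner sketch in Python (my own illustration, not the module’s code; the token classes and lexical rules are assumptions). It classifies each token by its first character and keeps white space for the screener to remove later:

```python
# Hypothetical scanner sketch: groups characters into (kind, text) tokens.
# Token classes and lexical rules here are illustrative assumptions.

def scan(source):
    tokens, i = [], 0
    while i < len(source):
        c, j = source[i], i
        if c in " \t\n":                      # white space (kept for the screener)
            while j < len(source) and source[j] in " \t\n":
                j += 1
            tokens.append(("WHITESPACE", source[i:j]))
        elif c.isalpha() or c == "_":         # first character says: Identifier
            while j < len(source) and (source[j].isalnum() or source[j] == "_"):
                j += 1
            tokens.append(("IDENTIFIER", source[i:j]))
        elif c.isdigit():                     # first character says: Integer
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("INTEGER", source[i:j]))
        else:                                 # punctuation and operators
            j += 1
            tokens.append(("PUNCT", c))
        i = j
    return tokens

print(scan("program Ex; x := 42"))
# [('IDENTIFIER', 'program'), ('WHITESPACE', ' '), ('IDENTIFIER', 'Ex'),
#  ('PUNCT', ';'), ('WHITESPACE', ' '), ('IDENTIFIER', 'x'), ...]
```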
PHASE 1, continued: Screening (post-processing)
Remove unwanted tokens (spaces, comments). Recognize keywords. Merge/simplify tokens. Prepare the token list for the next phase, the parser. (Sequence of Tokens → Screener → Sequence of Tokens)
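Continuing the same illustrative sketch (again my own; the keyword set is an assumption), a screener might drop white space and comments, re-classify keyword identifiers, and hand the cleaned-up token list to the parser:

```python
# Hypothetical screener sketch: post-processes the scanner's token list.

KEYWORDS = {"program", "begin", "end", "if", "then", "while"}  # assumed set

def screen(tokens):
    screened = []
    for kind, text in tokens:
        if kind in ("WHITESPACE", "COMMENT"):            # remove unwanted tokens
            continue
        if kind == "IDENTIFIER" and text in KEYWORDS:    # recognize keywords
            screened.append(("KEYWORD", text))
        else:
            screened.append((kind, text))
    return screened                                      # ready for the parser

# Usage with the scan() sketch above:
#   screen(scan("program Ex"))  ->  [('KEYWORD', 'program'), ('IDENTIFIER', 'Ex')]
```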
PHASE 2: Parsing (Syntax Analysis)
Is the token sequence syntactically correct? Group the tokens into the correct syntactic structures: expressions, statements, procedures, functions, modules. Use “rewrite” rules (a.k.a. BNF). Build a “syntax tree” bottom-up, as the rules are applied, using a stack of trees.
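As a toy illustration of the “stack of trees” idea (my own sketch for a single expression grammar, E → E '+' E | INTEGER, not the module’s parser), a shift-reduce loop can build the tree bottom-up, reducing whenever a rule’s right-hand side appears on top of the stack:

```python
# Hypothetical shift-reduce sketch for the toy grammar
#     E -> E '+' E  |  INTEGER
# The stack holds (symbol, tree) pairs; trees are nested tuples.

def parse(tokens):
    stack = []
    for kind, text in tokens + [("EOF", "")]:
        while True:                                      # reduce while possible
            if stack and stack[-1][0] == "INTEGER":
                _, leaf = stack.pop()
                stack.append(("E", leaf))                # E -> INTEGER
            elif (len(stack) >= 3 and stack[-3][0] == "E"
                  and stack[-2][0] == "+" and stack[-1][0] == "E"):
                _, right = stack.pop()
                stack.pop()                              # the '+'
                _, left = stack.pop()
                stack.append(("E", ("+", left, right)))  # E -> E '+' E
            else:
                break
        if kind != "EOF":                                # shift the next token
            stack.append((text if kind == "PUNCT" else kind, text))
    assert len(stack) == 1 and stack[0][0] == "E", "syntax error"
    return stack[0][1]

print(parse([("INTEGER", "1"), ("PUNCT", "+"),
             ("INTEGER", "2"), ("PUNCT", "+"), ("INTEGER", "3")]))
# ('+', ('+', '1', '2'), '3')   -- left-associative grouping
```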
Summary
The first two phases of compilation:
PHASE 1: Scanning and Screening (a.k.a. Lexical Analysis). From characters to tokens; proceeds sequentially.
PHASE 2: Parsing (Syntax Analysis). From tokens to a tree; corresponds to a post-order traversal of the syntax tree.