
Chapter 7


manchu

Presentation Transcript


  1. Chapter 7 Introduction to Languages and Compiler SEG2101 Chapter 7

  2. Contents • Computer architecture • Compiler • Grammars • Formal languages • Parse trees • Ambiguity • Regular expressions

  3. Von Neumann Architecture

  4. Compiler A compiler is a program that reads a program written in one language – the source language – and translates it into an equivalent program in another language – the target language.

  5. The Compilation process

  6. Grammars • A grammar is defined as a 4-tuple: the alphabet Σ, the nonterminals N, the productions P, and a goal symbol S. • (Σ, N, P, S) • Σ, N, and P are sets; S is a particular element of the set N.

  7. Alphabets and Strings • Σ is the alphabet, or set of terminals. • It is a finite set consisting of all the input characters or symbols that can be arranged to form sentences in the language. • English: the letters A to Z plus, in our definition, punctuation and space symbols • Programming language: usually some well-defined computer character set such as ASCII

  8. Alphabets and Strings (II) • A compiler is usually defined with two grammars. • The alphabet for the scanner grammar is ASCII or some subset of it. • The alphabet for the parser grammar is the set of tokens generated by the scanner, not ASCII at all.

  9. An Example of Strings • Σ = {a, b, c, d} • Possible strings of terminals from Σ include aaa, aabbccdd, d, cba, abab, ccccccccccacccc, and so on.

  10. Formal Languages • Σ: the alphabet, a finite set consisting of all input characters or symbols. • Σ*: the closure of the alphabet, the set of all possible strings over Σ, including the empty string ε. • A (formal) language is some specified subset of Σ*.
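
The definitions on this slide can be tried out with a minimal Python sketch. It lists the strings of the closure up to a chosen length (the closure itself is infinite) and then picks out a language as a subset. The alphabet is the one from the earlier strings example; the membership rule `in_language` is a made-up illustration, not something taken from the slides.

```python
from itertools import product

SIGMA = ("a", "b", "c", "d")  # alphabet from the strings example


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)


def in_language(s):
    """Hypothetical language: strings that contain no 'd' and end in 'a'."""
    return "d" not in s and s.endswith("a")


# A finite prefix of the closure: all strings of length 0, 1, and 2.
prefix_of_closure = list(strings_up_to(SIGMA, 2))

# The language is simply the subset of the closure that satisfies the rule.
language_sample = [s for s in prefix_of_closure if in_language(s)]

print(len(prefix_of_closure))  # 1 + 4 + 16 strings
print(language_sample)
```

The first element of `prefix_of_closure` is the empty string, which belongs to the closure of every alphabet.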

  11. Nonterminals • The nonterminal set N is a finite set of symbols not in the alphabet. • A particular nonterminal, the goal symbol S, represents exactly all the strings in the language. • The goal symbol is also often called the start symbol because derivations start with it. • The set of terminals and the set of nonterminals, taken together, are called the vocabulary of the grammar.

  12. Productions • The productions P of a grammar are a set of rewriting rules, each written as two strings of symbols separated by an arrow. • The symbols on each side of the arrow may be drawn from both terminals and nonterminals, subject to certain restrictions imposed by the form of the grammar.

  13. An Example Grammar • G1 = ({a, b, c}, {A, B}, {A → aB, A → bB, A → cB, B → a, B → b, B → c}, A) • The grammar generates 9 two-letter strings.
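
The claim that G1 generates nine strings can be checked mechanically. The sketch below is one possible encoding, not the course's own: it stores the 4-tuple as plain Python values and expands the leftmost nonterminal breadth-first until only terminals remain. The names `generate` and `G1_LANGUAGE` are illustrative, and the search only terminates because G1's language is finite.

```python
from collections import deque

# G1 as a 4-tuple (alphabet, nonterminals, productions, goal symbol).
TERMINALS = {"a", "b", "c"}
NONTERMINALS = {"A", "B"}
PRODUCTIONS = {
    "A": ["aB", "bB", "cB"],
    "B": ["a", "b", "c"],
}
START = "A"
G1 = (TERMINALS, NONTERMINALS, PRODUCTIONS, START)


def generate(grammar):
    """Return every terminal string derivable from the goal symbol.

    Rewrites the leftmost nonterminal of each sentential form.
    Only safe for grammars whose language is finite.
    """
    _, nonterminals, productions, start = grammar
    sentences = set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in nonterminals), None)
        if idx is None:
            sentences.add(form)  # no nonterminal left: a sentence
            continue
        for rhs in productions[form[idx]]:
            queue.append(form[:idx] + rhs + form[idx + 1:])
    return sentences


G1_LANGUAGE = sorted(generate(G1))
print(len(G1_LANGUAGE), G1_LANGUAGE)
```

Each sentence comes from one choice among three A-productions followed by one choice among three B-productions, which is where the count of 3 × 3 = 9 comes from.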

  14. Syntax and Semantics • Syntax: the syntax of a programming language is the form of its expressions, statements, and program units. • Semantics: the meaning of those expressions, statements, and program units. • Example: if (<expr>) <statement>

  15. Sentences, Lexeme, Token • Sentences: the strings of a language are called sentences or statements. • Lexeme: the lexemes of a programming language include its identifiers, literals, operators, and special words. • Token: a token of a language is a category of its lexemes.

  16. Lexeme and Token index = 2 * count + 17;
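
The lexeme/token table for this statement did not survive in the transcript. As a rough illustration of the distinction, here is a small scanner sketch built on Python's `re` module that pairs each lexeme with a token category. The token names (`identifier`, `int_literal`, `equal_sign`, and so on) are assumptions chosen for this example and may differ from the names used on the original slide.

```python
import re

# (token name, pattern) pairs; the names are illustrative assumptions.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for `source`."""
    pos = 0
    pairs = []
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if m.lastgroup != "skip":
            pairs.append((m.group(), m.lastgroup))
        pos = m.end()
    return pairs


PAIRS = tokenize("index = 2 * count + 17;")
for lexeme, token in PAIRS:
    print(f"{lexeme:6} {token}")
```

Note that `index` and `count` are different lexemes of the same token, which is the sense in which a token is a category of lexemes.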

  17. The Role of Grammars • The grammar of a language defines the correct form for sentences in that language. • Grammars are formal language-generation mechanisms that are commonly used to describe the syntax of programming languages.

  18. BNF: Backus-Naur Form • Backus presented a new formal notation for specifying programming language syntax. • Naur modified the notation slightly. • The result is known as Backus-Naur Form, or BNF. • BNF is a very natural notation for describing syntax. • The terms BNF and context-free grammar (or simply grammar) are used interchangeably.

  19. BNF • Metalanguage: a language used to describe another language. BNF is a metalanguage for programming languages. • Abstraction: the symbol on the left-hand side of the arrow • Definition: the text to the right of the arrow • Rule (production): the abstraction and its definition together are called a rule.

  20. BNF Description (A simple C assignment statement)

  21. Nonterminal and Terminal • Nonterminal symbol: the abstraction in a BNF description or grammar • Terminal symbol: the lexemes and tokens of the rules • A BNF description or grammar is simply a collection of rules. • Nonterminals can have two or more distinct definitions. • Multiple definitions can be written as a single rule, with the different definitions separated by |, meaning logical OR. <if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>

  22. List of Syntactic Elements • BNF does not include the ellipsis (…) • BNF uses recursion to describe lists • A rule is recursive if its LHS appears in its RHS. • e.g., <ident_list> → identifier | identifier , <ident_list>
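
The recursive `<ident_list>` rule maps directly onto a recursive function. The sketch below assumes the input has already been split into a list of token strings and treats anything accepted by Python's `str.isidentifier()` as an identifier; both choices are simplifications made for this example.

```python
def is_ident_list(tokens):
    """Recognize  <ident_list> -> identifier | identifier , <ident_list>.

    `tokens` is a list of strings, e.g. ["a", ",", "b"].
    """
    # Both alternatives start with an identifier.
    if not tokens or not tokens[0].isidentifier():
        return False
    # First alternative: a single identifier.
    if len(tokens) == 1:
        return True
    # Second alternative: identifier , <ident_list>  (the recursive case).
    if tokens[1] == ",":
        return is_ident_list(tokens[2:])
    return False


print(is_ident_list(["a"]))
print(is_ident_list(["a", ",", "b", ",", "c"]))
print(is_ident_list(["a", ","]))
print(is_ident_list(["a", "b"]))
```

The function calls itself exactly where the rule mentions its own left-hand side, so lists of any length are handled without an ellipsis.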

  23. A Grammar

  24. A Derivation of a Program

  25. Another Grammar

  26. A Derivation of a Statement

  27. Parse Tree Grammars naturally describe the hierarchical syntactic structure of the sentences of the languages they define. These hierarchical structures are called parse trees.

  28. Ambiguous Grammar • A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous.

  29. Ambiguity
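
The example on the ambiguity slide is not preserved in the transcript. To make the definition concrete, the sketch below counts distinct parse trees under an assumed textbook-style grammar, E → E + E | E * E | id, which is not necessarily the grammar shown on the slide. A sentence with a count above 1 demonstrates that the grammar is ambiguous.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for `tokens` under  E -> E + E | E * E | id.

    The grammar is an assumption made for this illustration.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct trees that derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id"]))
print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

For `id + id * id` the two trees correspond to grouping the addition first or the multiplication first, so this grammar gives no single answer about operator precedence.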

  30. Regular Expressions A regular expression is a notation for describing a set of strings.
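
As a small illustration, the sketch below uses Python's `re` module to write two regular expressions of the kind a scanner might use and tests whether whole strings match them. The patterns and the helper name `describes` are assumptions chosen for this example, not definitions from the slides.

```python
import re

# A letter or underscore followed by any number of letters, digits, or underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# One or more decimal digits.
UNSIGNED_INT = re.compile(r"[0-9]+")


def describes(pattern, s):
    """Return True if the whole string `s` is described by `pattern`."""
    return pattern.fullmatch(s) is not None


print(describes(IDENTIFIER, "count"))
print(describes(IDENTIFIER, "2count"))
print(describes(UNSIGNED_INT, "17"))
print(describes(UNSIGNED_INT, ""))
```

`fullmatch` is used rather than `search` so that the pattern has to account for the entire string, not just some substring of it.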
