1 / 38

Programming Languages 2nd edition Tucker and Noonan

Programming Languages 2nd edition Tucker and Noonan. Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth. Contents. 2.1 Grammars 2.1.1 Backus-Naur Form 2.1.2 Derivations 2.1.3 Parse Trees

lars-byrd
Download Presentation

Programming Languages 2nd edition Tucker and Noonan

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Programming Languages2nd editionTucker and Noonan Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth

  2. Contents 2.1 Grammars 2.1.1 Backus-Naur Form 2.1.2 Derivations 2.1.3 Parse Trees 2.1.4 Associativity and Precedence 2.1.5 Ambiguous Grammars 2.2 Extended BNF 2.3 Syntax of a Small Language: Clite 2.3.1 Lexical Syntax 2.3.2 Concrete Syntax 2.4 Compilers and Interpreters 2.5 Linking Syntax and Semantics 2.5.1 Abstract Syntax 2.5.2 Abstract Syntax Trees 2.5.3 Abstract Syntax of Clite

  3. Review SUMMARY Programming language principles • Grammars, syntax, semantics What makes a language successful? • Reliable, readable, writeable • Supported by simplicity, orthogonality, efficiency, … Paradigms • Imperative, object-oriented, functional, logic Language implementation • Compilers, interpreters, combinations

  4. Thinking about Syntax • Three levels: • Lexical syntax – describes basic language symbols (names, operators, etc.) • Concrete syntax - rules for writing expressions, statements and programs, describes the external representation of a program • Abstract syntax – describes an internal representation of the program, emphasizes content over form, derived during parsing • Clite, a mini-language, used as a teaching tool in the study of syntax and semantics.

  5. 2.1 More About Grammars • A metalanguage is a language used to define other languages. • A grammar is a set of rules, written in a metalanguage, and used to define the concrete syntax of a language. • Programming languages are defined by a context-free grammar (more about this later) • Syntax can be defined in ways other than by formal grammars; e.g., in a natural language or syntax diagrams.

  6. 2.1.1 Backus-Naur Form (BNF) • Notation for describing a context-free grammar • Sometimes called Backus Normal Form • First used to define syntax of Algol 60 • Now used to define (concrete) syntax of most major languages

  7. Elements of a Context-Free Grammar • Set of • productions: P (productions = rules) • terminal symbols: T • nonterminal symbols: N • start symbol: • A production has the form A → ω, where A is a nonterminal symbol and ωis a string from N and T.

  8. Consider the grammar: IntegerDigit | Integer Digit Digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 The productions are the rules for Integer & Digit Non-terminals: Integer, Digit start symbol = Integer Terminals : 0 1 2 3 4 5 6 7 8 9 Metasymbols:  | 2.1.2 Example

  9. Grammars such as the Integer grammar can be used in several ways: Theoretically, to produce (derive) all legal stings, starting with the Start symbol. Practically, to show how to write syntactically correct “sentences” in the language described by the grammar. (Programmers use grammars this way) To show that a particular string is or is not correctly formed. (Language translators – compilers & interpreters – use it this way) 2.1.2 Derivations

  10. A 6-step process, begins with the start symbol Rule 1: Integer Digit | Integer Digit Replace a nonterminal by a RHS of one of its rules: Step 1: Integer Integer Digit Step 2:  Integer Digit Digit Step 3:  Digit DigitDigit Step 4:  3 Digit Digit Step 5:  3 5 Digit Step 6:  3 5 2 Finished when there are only terminals on the RHS This is a leftmost derivation. Derivation of 352 as an Integer

  11. Integer  Integer Digit  Integer 2  Integer Digit 2  Integer 5 2  Digit 5 2  3 5 2 This is called a rightmost derivation, since at each step the rightmost nonterminal is replaced. A Different Derivation of 352

  12. Definition • The language L defined by a BNF grammar G is the set of all terminal strings that can be derived from the start symbol.

  13. 2.1.3 Parse Trees • A parse tree is a graphical representation of a derivation. The root node of the tree is the start symbol. Each internal node of the tree corresponds to a non-terminal The child(ren) of a node represent a right-hand side of a production for which the node is the left-hand side. Each leaf node represents a terminal symbol of the derived string, reading from left to right. Leaves must match the original string.

  14. Integer Integer Digit e.g., the step Integer  Integer Digitappears in the parse tree as:

  15. Parse Tree for 352 as an Integer Figure 2.1

  16. The following grammar defines the language of arithmetic expressions with 1-digit integers, addition, and subtraction. ExprExpr + Term|Expr – Term | Term Term  0 |... | 9 | ( Expr) Nonterminals: Expr, Term Terminals: +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, ) Arithmetic Expression Grammar

  17. Parse of the String 5-4+3 Figure 2.2

  18. 2.1.4 Associativity and Precedence • Grammars define associativity and precedence among the operators in an express • Precedence: which operator is evaluated first ; e.g., in the expression “a + b / c” • Associativity: evaluation order for adjacent operators that have equal precedence; e.g., in the expression “a - b + c”

  19. Grammar G1: • Consider the more interesting grammar G1: Expr  Expr + Term | Expr – Term | Term Term  Term * Factor | Term / Factor | Term % Factor | Factor Factor  Primary ** Factor | Primary Primary  0 | ... | 9 | ( Expr )

  20. Parse of 4**2**3+5*6+7 for Grammar G1 Figure 2.3 • Expr  Expr + Term • |Expr – Term • | Term • Term  Term * Factor • | Term / Factor |Term % Factor • | Factor • Factor  Primary ** Factor • | Primary • Primary  0 | ... | 9 • |( Expr )

  21. Precedence Associativity Operators 3 right ** 2 left * / % 1 left + - The grammar rules define precedence & associativity. The parse tree shows something about the relations: operators lower in the tree are generally evaluated first; an operation can’t be performed until its operands are evaluated Associativity and Precedence for Grammar G1Table 2.1

  22. How do Grammar Rules Indicate Precedence & Associativity ? • An operator’s precedence is determined by the length of the shortest derivation from the start symbol to the operator (see Figure 2.3) • Left- or right- associativity is determined by left- or right- recursion. • compare the operators ** and + in Figure 2.3

  23. 2.1.5 Ambiguous Grammars • A grammar is ambiguous if one of its strings has two or more different parse trees. • C, C++, and Java have a large number of operators and precedence levels • Instead of using a large grammar, we can: • Write a smaller ambiguous grammar, and • Specify precedence and associativity rules separately; e.g., Table 2.1

  24. Expr→ Expr Op Expr | ( Expr ) | Integer Op → + | - | * | / | % | ** G2 is equivalent to G1; i.e., its language is the same; but… G2 has fewer productions and non-terminals than G1. G2 is ambiguous. All operators have same precedence and associativity isn’t specified. An Ambiguous Expression Grammar G2

  25. Ambiguous Parse of 5-4+3 Using Grammar G2 Figure 2.4

  26. IfStatement→ if ( Expression ) Statement | if ( Expression ) Statement else Statement where Statement → Assignment |IfStatement|Block Suppose one of the statements was another If? The Dangling Else

  27. With which ‘if’ does the following ‘else’ associate ? if (x < 0) if (y < 0) y = y - 1; else y = 0; Answer: according to the grammar, either one! Example

  28. The Dangling Else Ambiguity Figure 2.5

  29. Ambiguity Can Be Addressed Syntactically: (see below for dangling if) Algol 68, Modula, Ada: use an explicit delimiter to end every conditional, for example: if (x < 0) if (x < 0) if (y<0) if (y<0) y = y - 1; y = y - 1; else fi; y = x / y; else fi; y = x / y; fi; fi;

  30. Solving The Dangling Else Ambiguity 3. Java: rewrite the grammar to define two different kinds of If statements: IfThenStatement→ if ( Expression ) Statement IfThenElseStatement → if ( Expression )StatementNoShortIf else Statement The category StatementNoShortIf includes all statement types except IfThenStatement.

  31. Or … • C, C++, Pascal, other languages: • Leave the grammar ambiguous • Have a separate rule outside the grammar to explain the usage • Tradeoff: large grammar with no ambiguity, smaller grammar with extra rules

  32. 2.2 Extended BNF (EBNF) • BNF: recursion to represent iteration • EBNF: additional metacharacters represent iteration • { } braces: show a series of zero or more occurrences • ( )parens: pick exactly one from the enclosed list • [ ] brackets: pick zero or one from the enclosed list • Metacharacters are distinguished from terminal symbols by a different typeface.

  33. CompareBNF/EBNF Examples BNF Expr → Term | Expr + Term | Expr – Term IfStatement →if ( Expr ) Statement | if ( Expr ) Statement else Statement EBNF Expr→ Term { (+| -) Term } IfStatement →if (Expr) Statement [else Statement ]

  34. C-style EBNF C-style EBNF lists alternatives on separate lines and uses opt to signify optional parts. e.g., IfStatement: if ( Expression ) Statement ElsePartopt ElsePart: else Statement

  35. We can always rewrite an EBNF grammar as a BNF grammar. e.g., A→ x { y } z can be rewritten: A → x A' z A' → ε | y A' (Rewriting EBNF rules with ( ), [ ] is left as an exercise.) EBNF to BNF

  36. Syntax Diagram for Expressions with Addition – Figure 2.6 • Syntax diagrams are another way to describe grammar rules. • Popularized when they were used to describe Pascal grammar.

  37. All Three are Equally Powerful • BNF is considered equivalent to context-free grammars because it can express any rule in the grammar • EBNF is no more (or less) powerful or expressive than BNF. Its virtue is compactness. • Syntax diagrams are equally expressive.

  38. Summary & Preview • Grammars • BNF notation • Grammars & parse trees • Grammars, parse trees, associativity & precedence • Ambiguity in grammars • Next up: • Clite syntax • Lexical and concrete syntax • More about compilers & interpreters • Abstract syntax

More Related