1 / 26

CMP 339/692 Programming Languages Day 6 Thursday, February 16, 2012

CMP 339/692 Programming Languages Day 6 Thursday, February 16, 2012. Rhys Eric Rosholt. Office: Office Phone: Web Site: Email Address:. Gillet Hall - Room 304 718-960-8663 http://comet.lehman.cuny.edu/rosholt/ rhys.rosholt @ lehman.cuny.edu. Chapter 3. Describing Syntax and Semantics.

weldon
Download Presentation

CMP 339/692 Programming Languages Day 6 Thursday, February 16, 2012

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CMP 339/692Programming LanguagesDay 6Thursday,February 16, 2012 Rhys Eric Rosholt Office: Office Phone: Web Site: Email Address: Gillet Hall - Room 304 718-960-8663 http://comet.lehman.cuny.edu/rosholt/ rhys.rosholt @ lehman.cuny.edu

  2. Chapter 3 Describing Syntax and Semantics

  3. Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs: Dynamic Semantics

  4. Review • Language description includes two main components • Syntax • The form of expressions, statements, and program units • Semantics • The meaning of expressions, statements, and program units • Describing syntax is easier than describing semantics • Universally-accepted notations can be used to describe syntax. (e.g. BNF) • No universally-accepted systems have been created for describing semantics.

  5. The General Problem of Describing Syntax: Terminology Definition: A sentence is an ordered string of characters over some alphabet Definition: A language is a set of sentences Definition: A lexeme is the lowest level syntactic unit of a language (e.g., *, sum,begin) Definition: A token is a category of lexemes (e.g., identifier)

  6. Formal Grammars andFormal Languages A formal grammar G = (N,Σ,P,S) is a quad-tuple such that N is a finite set of nonterminal symbols Σis a finite set of terminal symbols, disjoint from N P is a finite set of production rules of the formαNβ → γ S ЄN the start symbol The language of a formal grammar G, denoted as L(G), is the set of all strings over Σ that can be generated by starting with the start symbol S and then applying the production rules in P until no nonterminal symbols are present.

  7. Formal Methodsof Describing Syntax • The most widely known methods for describing programming language syntax: • Backus-Naur Form (BNF) • Context-Free Grammars • Extended BNF (EBNF) • Improves readability and writability • Grammars and Recognizers

  8. Context-Free Grammars • Developed by Noam Chomsky • mid-1950s • Language generators • meant to describe the syntax of natural languages • Defines a class of languages called context-free languages

  9. Formal Definition of Languages Recognizers A device that reads input strings of the language and decides whether the input strings belong to the language Generators A device that generates sentences of a language which are used to compare with the syntax of a particular sentence

  10. Backus-Naur Form (BNF) • Invented by John Backus and Peter Naur • Equivalent to context-free grammars • A metalanguage used to describe another language • Abstractions are used to represent classes of syntactic structures • act like syntactic variables • called nonterminal symbols

  11. BNF Fundamentals • Non-terminals: abstractions • Terminals: lexemes and tokens • Grammar: a collection of rules • Examples of BNF rules: <id_list> -> ident | ident, <id_list> <if_stmt> -> if <logic_expr> then <stmt>

  12. BNF Rules • A rule has • a left-hand side (LHS) • a right-hand side (RHS) • consists of terminal and nonterminal symbols • A grammar is a finite nonempty set of rules • An abstraction (or nonterminal symbol) can have more than one RHS <stmt> -> <single_stmt> | begin <stmt_list> end

  13. Describing Lists • Syntactic lists are described using recursion <id_list> -> ident | ident, <id_list> • A derivation is • a repeated application of rules, • starting with the start symbol, and • ending with a sentence • all terminal symbols

  14. An Example Grammar <program> -> <stmts> <stmts> -> <stmt> | <stmt> ; <stmts> <stmt> -> <var> = <expr> <var> -> a | b | c | d <expr> -><term> + <term> |<term> - <term> <term> -> <var> | const

  15. An Example Derivation <program> => <stmts> => <stmt> => <var> = <expr> => a = <expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + const

  16. Derivation • Every string of symbols in the derivation is a sentential form • A sentence is a sentential form that has only terminal symbols • A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded • A derivation may be neither leftmost nor rightmost

  17. Parse Tree A hierarchical representation of a derivation <program> <stmts> <stmt> <var> = <expr> a <term> + <term> <var> const b

  18. Ambiguity in Grammars A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees

  19. An Ambiguous Expression Grammar <expr>  <expr> <op> <expr> | const <op>  / | - <expr> <expr> <expr> <op> <op> <expr> <expr> <op> <expr> <expr> <op> <expr> <expr> <op> <expr> const - const / const const - const / const

  20. An Unambiguous Expression Grammar If we use the parse tree to indicate precedence levels of the operators, we cannot have ambiguity <expr>  <expr> - <term> | <term> <term>  <term> / const | const <expr> <expr> - <term> <term> <term> / const const const

  21. Associativity of Operators Operator associativity can also be indicated by a grammar. ambiguous: <expr> -> <expr> + <expr> | const unambiguous: <expr> -> <expr> + const | const <expr> <expr> <expr> + const <expr> + const const

  22. Extended BNF Optional parts are placed in brackets [] <proc_call> -> ident [(<expr_list>)] Alternative parts of RHSs are placed inside parentheses and separated via vertical bars <term> → <term>(+|-) const Repetitions (0 or more) are placed inside braces {} <ident> → letter {letter|digit}

  23. BNF and EBNF BNF <expr> -><expr> + <term> | <expr> - <term> | <term> <term> -><term> * <factor> | <term> / <factor> | <factor> EBNF <expr> -><term> {(+|-) <term> } <term> -><factor> {(*|/) <factor> }

  24. The Chomsky Hierarchy

  25. Semantics • The meaning, not the form • Need a language to describe the semantics of languages • Assorted mathematical formalisms • Static Semantics • Attribute Grammars • Dynamic Semantics • Operational Semantics • Axiomatic Semantics • Denotational Semantics

  26. Next ClassThursdayFebruary 23, 2012 Rhys Eric Rosholt Office: Office Phone: Web Site: Email Address: Gillet Hall - Room 304 718-960-8663 http://comet.lehman.cuny.edu/rosholt/ rhys.rosholt @ lehman.cuny.edu

More Related