CS 331, Principles of Programming Languages

Understand the difference between syntax and semantics in programming languages, including the concepts of expressions, grammars, tree representations, and abstract syntax trees. Learn how to describe a programming language using various approaches, such as tutorials, reference manuals, and formal definitions.

mistyk


  1. CS 331, Principles of Programming Languages Chapter 2

  2. Overview • What’s the difference between syntax and semantics? • Expressions • Grammars • Tree representations • abstract syntax trees • parse trees

  3. How to describe a PL • Tutorials - SNOBOL is still the best example • Reference manuals - Ada • Formal definitions - to describe both syntax and semantics, which is hard • Pascal, Ada, PL/I

  4. Syntax vs. Semantics • Syntax - what is a legal program? • Semantics - what does a (legal) program mean? Three major approaches: • axiomatic, i.e. a set of proof rules • denotational, i.e. mathematical description • operational, i.e. operations on a real or abstract machine

  5. How to describe syntax? • By example? • Possibly ambiguous or incomplete • Used to describe shells, e.g. man pages • By use of a meta-language • Also possibly ambiguous or incomplete • But probably more precise • Possible to give some semantics in the same notation

  6. Expressions • Prefix, postfix, or infix • Issues related to operators • arity (unary, binary, ternary, or whatever) • associativity • exponentiation is right-associative, usually • other ops are usually left-associative • precedence • follows rules from arithmetic
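As a quick illustration of the associativity and precedence bullets above, here is a minimal sketch that uses Python's own infix operators as the example language (an assumption; the slides do not name one). The variable names are made up for illustration.

```python
# Subtraction is left-associative: 2 - 0 - 1 groups as (2 - 0) - 1.
left_assoc = 2 - 0 - 1
grouped_left = (2 - 0) - 1
grouped_right = 2 - (0 - 1)

# Exponentiation is right-associative: 2 ** 3 ** 2 groups as 2 ** (3 ** 2).
right_assoc = 2 ** 3 ** 2
exp_grouped_right = 2 ** (3 ** 2)
exp_grouped_left = (2 ** 3) ** 2

# Precedence follows arithmetic: * binds tighter than +.
precedence = 2 + 3 * 4

print(left_assoc, grouped_left, grouped_right)            # 1 1 3
print(right_assoc, exp_grouped_right, exp_grouped_left)   # 512 512 64
print(precedence)                                          # 14
```

The two groupings give different values in both cases, which is why a language definition has to state the associativity of each operator.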

  7. Abstract Syntax Trees • Useful for indicating how an expression is evaluated • The expression 2-0-1 is represented by a tree whose root is -, with the subtree (- 2 0) and the leaf 1 as its children • Or is it?
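To make the "Or is it?" question concrete, the sketch below encodes both candidate trees for 2-0-1 and evaluates them. The tuple encoding `(op, left, right)` and the names `evaluate`, `left_tree`, and `right_tree` are assumptions made for this illustration, not notation from the slides.

```python
import operator

# Map operator symbols to the functions that implement them.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Evaluate an AST: a leaf is a number, an interior node is (op, left, right)."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node

left_tree = ("-", ("-", 2, 0), 1)    # (2 - 0) - 1, the left-associative reading
right_tree = ("-", 2, ("-", 0, 1))   # 2 - (0 - 1), the right-associative reading

print(evaluate(left_tree))   # 1
print(evaluate(right_tree))  # 3
```

The same token sequence yields two different values depending on the tree shape, so the tree, not the flat string, is what determines evaluation.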

  8. Examples of Prefix and Postfix • Prefix • LISP operators use prefix • Postfix • Postscript operators use postfix • The simple expression 8-(7*3) is represented as: 8 7 3 mul sub • Old HP calculators did, too - no parens keys
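Postfix needs no parentheses because a stack fixes the order of evaluation. The following is a minimal sketch of a stack evaluator, assuming whitespace-separated tokens, integer operands, and only the operator names `add`, `sub`, and `mul`; `eval_postfix` is a hypothetical helper, not part of any PostScript implementation.

```python
import operator

# A small, assumed subset of binary operators, spelled as in the slide's example.
POSTFIX_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def eval_postfix(source):
    """Evaluate a postfix expression with a stack."""
    stack = []
    for token in source.split():
        if token in POSTFIX_OPS:
            right = stack.pop()   # the operand pushed last is the right operand
            left = stack.pop()
            stack.append(POSTFIX_OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop()

print(eval_postfix("8 7 3 mul sub"))  # -13, the same value as 8-(7*3)
```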

  9. To run LISP in emacs • Invoke emacs • M-x lisp-interaction-mode • type control-j at the end of each line • Or using an inferior emacs lisp process, • M-x ielm

  10. (+ 2 2)
      4
      (sqrt 9)
      3.0
      (setq b 6)
      6
      (setq a 2)
      2
      (setq c 3)
      3
      a
      2
      (- b)
      -6
      (+ (- b) (sqrt (- (* b b) (* (* 4 a) c))))
      -2.5358983848622456

  11. Prefix, Infix, Postfix • Given an abstract syntax tree, an expression can be represented in any of the three ways • Consider for example a+b*c/d • What does the abstract syntax tree look like? • What are the prefix and postfix expressions equivalent to the infix form given above?
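One possible way to work through these questions is sketched below. It assumes the usual arithmetic precedence and left associativity, so that a+b*c/d groups as a+((b*c)/d); the tuple encoding and the function names `prefix` and `postfix` are illustrative choices.

```python
# AST for a+b*c/d under the assumed grouping a + ((b * c) / d).
tree = ("+", "a", ("/", ("*", "b", "c"), "d"))

def prefix(node):
    """Pre-order walk: operator first, then the operands."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"{op} {prefix(left)} {prefix(right)}"
    return str(node)

def postfix(node):
    """Post-order walk: the operands first, then the operator."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"{postfix(left)} {postfix(right)} {op}"
    return str(node)

print(prefix(tree))   # + a / * b c d
print(postfix(tree))  # a b c * d / +
```

Both outputs come from the same tree; only the order in which each node's operator is emitted changes.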

  12. Formal Grammars • Set of terminal symbols (or tokens) • Set of non-terminal symbols • A designated start symbol • A set of productions (or rules) that specify how symbols are to be combined to form legal strings • G=<T, N, S, R>

  13. Context-free Grammars • There are lots of varieties of grammars • regular, context-free, context-sensitive, and unrestricted • CFGs are constrained so that exactly one non-terminal can appear on the left-side of a production • but a non-terminal may appear on the left-side of more than one production

  14. CFG Notation • CFG productions have exactly one non-terminal on the left side, and zero or more non-terminals or terminals on the right side • Usually, nonterminals are enclosed in <anglebrackets> • Terminals (aka tokens) may be quoted for clarity
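The four-part definition G=<T, N, S, R> and the context-free restriction can be written down directly as data. This is a minimal sketch; the `Grammar` class, its field names, and the two example grammars are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    start: str                # S
    rules: tuple              # R: pairs (left_side, right_side), each a tuple of symbols

def is_context_free(g):
    """True if every production has exactly one non-terminal on its left side."""
    return all(len(lhs) == 1 and lhs[0] in g.nonterminals for lhs, _ in g.rules)

# <E> ::= <E> - <E> | num
expr = Grammar(
    terminals=frozenset({"num", "-"}),
    nonterminals=frozenset({"E"}),
    start="E",
    rules=((("E",), ("E", "-", "E")), (("E",), ("num",))),
)

# A rule with two symbols on the left side is not context-free.
not_cf = Grammar(
    terminals=frozenset({"b"}),
    nonterminals=frozenset({"A"}),
    start="A",
    rules=((("A", "b"), ("b", "A")), (("A",), ("b",))),
)

print(is_context_free(expr))    # True
print(is_context_free(not_cf))  # False
```

Note that `E` appears on the left side of two productions in `expr`, which the restriction allows.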

  15. Backus-Naur Form (BNF) • BNF is a popular notation for CFGs • from a simple subset of Pascal
      <program> ::= <block> .
      <block> ::= <statement>
      <block> ::= begin <statements> end
      <statements> ::= <statement>
      <statements> ::= <statement>;<statements>
      <statement> ::= <if> | <while> | <repeat> | ...
      <if> ::= if <expr> then <block>
      <if> ::= if <expr> then <block> else <block>

  16. BNF Operators • Sequence <A> ::= <B> c • Alternation <A> ::= <B> | <C> • Optional <A> ::= <B> [<C>] • Zero or more <A> ::= <B>* • One or more <A> ::= <B>+ • note that <B>* is a shorthand for [<B>+]
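Regular expressions use the same `*`, `+`, and optional operators, so the shorthand in the last bullet can be spot-checked with Python's `re` module. This sketch assumes <B> is the single terminal `b`, and it writes the optional brackets `[...]` as the regex `(?:...)?`; it checks a handful of sample strings rather than proving the equivalence.

```python
import re

star = re.compile(r"b*")                 # <B>*
optional_plus = re.compile(r"(?:b+)?")   # [<B>+]

samples = ["", "b", "bbb", "a", "bab"]

# For each sample, record whether each pattern matches the whole string.
results = [(s, bool(star.fullmatch(s)), bool(optional_plus.fullmatch(s)))
           for s in samples]
agree = all(a == b for _, a, b in results)

for s, a, b in results:
    print(repr(s), a, b)
print(agree)  # True
```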

  17. Ambiguity • There may be many (equivalent) grammars for a language. • There may be more than one way to parse a string with respect to a grammar • A grammar is ambiguous if at least one string in the language can be parsed in more than one way.
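Ambiguity can be demonstrated by enumerating parse trees. The sketch below assumes the small grammar <E> ::= <E> - <E> | number, which is not taken from the slides, and lists every tree for the token sequence 2 - 0 - 1; `parses` is a hypothetical brute-force helper, not an efficient parser.

```python
def parses(tokens):
    """Return every parse tree of tokens under <E> ::= <E> - <E> | number."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0].isdigit() else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok == "-":
            # Try this "-" as the root, then parse each side independently.
            for left in parses(tokens[:i]):
                for right in parses(tokens[i + 1:]):
                    trees.append(("-", left, right))
    return trees

trees = parses(["2", "-", "0", "-", "1"])
for t in trees:
    print(t)
print(len(trees))  # 2
```

Finding two distinct trees for one string is enough to show that this grammar is ambiguous.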

  18. Dangling-Else • Suppose a grammar has the production
      <stmt> ::= if <expr> then <stmt> | if <expr> then <stmt> else <stmt>
      • How should we parse this statement?
      if E then if E2 then S1 else S2
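The dangling-else statement can be run through a small enumerator that tries both alternatives of the <stmt> production. This is a sketch under simplifying assumptions: each <expr> is a single token, any token other than `if`, `then`, or `else` counts as a simple statement, and `parse_stmt` is a hypothetical helper written for this example.

```python
def parse_stmt(tokens):
    """Yield (tree, remaining_tokens) for every way a <stmt> can start tokens."""
    if not tokens:
        return
    if tokens[0] == "if" and len(tokens) >= 4 and tokens[2] == "then":
        cond = tokens[1]
        for body, rest in parse_stmt(tokens[3:]):
            # Alternative 1: if <expr> then <stmt>
            yield ("if", cond, body), rest
            # Alternative 2: if <expr> then <stmt> else <stmt>
            if rest and rest[0] == "else":
                for alt, rest2 in parse_stmt(rest[1:]):
                    yield ("if-else", cond, body, alt), rest2
    elif tokens[0] not in ("if", "then", "else"):
        yield tokens[0], tokens[1:]   # a simple statement such as S1

tokens = "if E then if E2 then S1 else S2".split()
complete = [tree for tree, rest in parse_stmt(tokens) if not rest]
for tree in complete:
    print(tree)
print(len(complete))  # 2
```

One tree attaches the else to the outer if and the other attaches it to the inner if. Many languages resolve this by pairing each else with the nearest unmatched if.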

  19. A different ambiguity
      header ::= <header> title (link? | script?) </header>
      title ::= <title> text </title>
      link ::= <link> text </link>
      script ::= <script> text </script>
      This grammar allows the <link> and <script> constructs to appear in either order. The grammar above is then ambiguous!

  20. header ::= <header> title (link? | script?) </header>
      title ::= <title> text </title>
      link ::= <link> text </link>
      script ::= <script> text </script>
      How do we parse this string?
      <header> <title> Some Title </title> </header>
