Note. These notes are based on the Sebesta text. The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett. Introduction: syntax and semantics.

Note
  • These notes are based on the Sebesta text.
  • The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.
Introduction: syntax and semantics
  • Syntax: a formal description of the structure of programs in a given language.
  • Semantics: a formal description of the meaning of programs in a given language.
  • Together the syntax and semantics define a language.
Who uses a language definition?
  • Those who design a language
  • Those who implement a language (e.g. write compilers for it)
  • Those who use the language (i.e. software developers)
  • Those who make tools for developers (e.g. JDT in Eclipse)
Language & grammar
  • A given language can have more than one grammar which describes it.
  • The grammar presented to a user is not necessarily the same as the grammar used in an implementation.
    • implementation requires a very detailed grammar
    • user needs a human-readable grammar
Syntax and semantics of programming languages
  • I have cautioned against getting too hung up on the syntax of a programming language.
  • But, you still need to learn the syntax of any language you work with so that you can read and write programs in the language.
  • To understand the meaning of programs expressed in a language you also have to know the semantics of the language.
General background
  • Chomsky hierarchy
  • Context-free grammars
  • Backus-Naur form
Chomsky hierarchy
  • Noam Chomsky defined a hierarchy of grammars and languages known as the Chomsky hierarchy:
    • regular languages (most restrictive)
    • context-free languages
    • context-sensitive languages
    • unrestricted languages (least restrictive)
Context-free (CF) grammar
  • A CF grammar is formally presented as a 4-tuple G=(T,NT,P,S), where:
    • T is a set of terminal symbols (the alphabet)
    • NT is a set of non-terminal symbols
    • P is a set of productions (or rules), where P ⊆ NT × (T ∪ NT)*
    • S ∈ NT is the start symbol
Example 1: A small formal language

L1 = { 0, 00, 1, 11 }

G1 = ( {0,1}, {S}, { S → 0, S → 00, S → 1, S → 11 }, S )
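
As a rough illustration, the 4-tuple for G1 can be written down as plain data and its language enumerated mechanically. This is only a sketch: the names `T`, `NT`, `P`, `START` and `generate` are chosen here for illustration, and the enumeration terminates only because G1 has no recursive rules.

```python
# G1 = (T, NT, P, S) written as plain Python data (names are illustrative).
T = {"0", "1"}
NT = {"S"}
P = {"S": [["0"], ["0", "0"], ["1"], ["1", "1"]]}
START = "S"

def generate(form, productions, nonterminals):
    """Return every sentence derivable from the sentential form `form`.

    Terminates only for grammars without recursion, such as G1.
    """
    for i, symbol in enumerate(form):
        if symbol in nonterminals:
            sentences = set()
            for rhs in productions[symbol]:
                # Replace the first nonterminal with one right-hand side and recurse.
                sentences |= generate(form[:i] + rhs + form[i + 1:], productions, nonterminals)
            return sentences
    # No nonterminal left: the form is a sentence.
    return {"".join(form)}

L1 = generate([START], P, NT)
print(sorted(L1))  # ['0', '00', '1', '11']
```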

Example 2: A small fragment of English

L2 = { the dog chased the dog, the dog chased a dog, a dog chased the dog, a dog chased a dog, the dog chased the cat, … }

G2 = ( {a, the, dog, cat, chased},
       {S, NP, VP, Det, N, V},
       { S → NP VP, NP → Det N, Det → a | the,
         N → dog | cat, VP → V | VP NP, V → chased },
       S )

Notes: S = Sentence, NP = Noun Phrase, N = Noun, VP = Verb Phrase, V = Verb, Det = Determiner


Language terminology (from Sebesta, p. 125)
  • A language is a set of strings of symbols, drawn from some finite set of symbols (called the alphabet of the language).
  • “The strings of a language are called sentences”
  • “Formal descriptions of the syntax […] do not include descriptions of the lowest-level syntactic units […] called lexemes.”
  • “A token of a language is a category of its lexemes.”
  • Syntax of a programming language is often presented in two parts:
    • regular grammar for token structure (e.g. structure of identifiers)
    • context-free grammar for sentence structure
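
The token/lexeme distinction can be sketched with a small regular-expression scanner. The token names (`ident`, `int_literal`, `assign_op`, `plus_op`) and their patterns are assumptions made for this example, not taken from any particular language definition.

```python
import re

# Each token (a category) is described by a regular expression;
# the text that matches is the lexeme.
TOKEN_SPEC = [
    ("ident", r"[A-Za-z][A-Za-z0-9]*"),
    ("int_literal", r"[0-9]+"),
    ("assign_op", r"="),
    ("plus_op", r"\+"),
    ("skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return (token, lexeme) pairs; raise ValueError on an unexpected character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":      # whitespace separates lexemes but is not a token
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs

print(tokenize("total = count + 17"))
```

Here `total` and `count` are two different lexemes that belong to the same token, `ident`.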
Backus-Naur Form (BNF)
  • Backus-Naur Form (1959)
    • Invented by John Backus to describe ALGOL 58, modified by Peter Naur for ALGOL 60
    • BNF is equivalent to context-free grammars
    • BNF is a metalanguage used to describe another language, the object language
    • Extended BNF: adds syntactic sugar to produce more readable descriptions
BNF Fundamentals
  • Sample rules [p. 128]

<assign> → <var> = <expression>

<if_stmt> → if <logic_expr> then <stmt>

<if_stmt> → if <logic_expr> then <stmt> else <stmt>

  • non-terminals/tokens surrounded by < and >
  • lexemes are not surrounded by < and >
  • keywords in language are in bold
  • → separates LHS from RHS
  • | expresses alternative expansions for LHS

<if_stmt> → if <logic_expr> then <stmt>

| if <logic_expr> then <stmt> else <stmt>

  • = is in this example a lexeme
BNF Rules
  • A rule has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and nonterminal symbols
  • A grammar is often given simply as a set of rules (terminal and non-terminal sets are implicit in rules, as is start symbol)
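
To see how the terminal set, nonterminal set and start symbol can be implicit in the rules, here is a minimal sketch that recovers them from a rule list. The rules for `<var>` and `<expression>` are invented for the example, and taking the first left-hand side as the start symbol is a common convention assumed here rather than something the rules themselves guarantee.

```python
# Rules as (LHS, RHS) pairs; nonterminals are written <like_this>, as in BNF.
RULES = [
    ("<assign>", ["<var>", "=", "<expression>"]),
    ("<var>", ["a"]),
    ("<var>", ["b"]),
    ("<expression>", ["<var>", "+", "<var>"]),
    ("<expression>", ["<var>"]),
]

def infer_grammar(rules):
    """Recover (T, NT, P, S) from a rule list, taking the first LHS as the start symbol."""
    nonterminals = {lhs for lhs, _ in rules}
    # Any symbol that never appears on a left-hand side must be a terminal.
    terminals = {sym for _, rhs in rules for sym in rhs if sym not in nonterminals}
    start = rules[0][0]
    return terminals, nonterminals, rules, start

T, NT, P, S = infer_grammar(RULES)
print(sorted(T), sorted(NT), S)
```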
Describing Lists
  • There are many situations in which a programming language allows a list of items (e.g. parameter list, argument list).
  • Such a list can typically be empty or consist of a single item.
  • Such lists are typically not bounded.
  • How is their structure described?
Describing lists
  • They are described using recursive rules.
  • Here is a pair of rules describing a list of identifiers, whose minimum length is one:

<ident_list> -> ident

| ident , <ident_list>

  • Notice that ‘,’ is part of the object language (the language being described by the grammar).
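
The recursive rule translates almost directly into a recursive function. The sketch below assumes the input has already been split into tokens, with each identifier represented by the token `ident`; the function name is hypothetical.

```python
def is_ident_list(tokens):
    """Recognise <ident_list> -> ident | ident , <ident_list> over a token list."""
    if not tokens or tokens[0] != "ident":
        return False            # both alternatives start with ident
    if len(tokens) == 1:
        return True             # first alternative: a single ident
    # second alternative: ident , <ident_list>
    return tokens[1] == "," and is_ident_list(tokens[2:])

print(is_ident_list(["ident"]))                              # True
print(is_ident_list(["ident", ",", "ident", ",", "ident"]))  # True
print(is_ident_list([]))                                     # False
```

The empty list is rejected because the rule's minimum length is one.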
Derivation of sentences from a grammar
  • A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
  • Example: derivation of the dog chased a cat

S ⇒ NP VP
  ⇒ Det N VP
  ⇒ the N VP
  ⇒ the dog VP
  ⇒ the dog VP NP
  ⇒ the dog V NP
  ⇒ the dog chased NP
  ⇒ the dog chased Det N
  ⇒ the dog chased a N
  ⇒ the dog chased a cat


Recall example 2

G2 = ( {a, the, dog, cat, chased},
       {S, NP, VP, Det, N, V},
       { S → NP VP, NP → Det N, Det → a | the,
         N → dog | cat, VP → V | VP NP, V → chased },
       S )

Example: derivation from G2
  • Example: derivation of the dog chased a cat

S ⇒ NP VP
  ⇒ Det N VP
  ⇒ the N VP
  ⇒ the dog VP
  ⇒ the dog VP NP
  ⇒ the dog V NP
  ⇒ the dog chased NP
  ⇒ the dog chased Det N
  ⇒ the dog chased a N
  ⇒ the dog chased a cat

Example 3

L3 = { 0, 1, 00, 11, 000, 111, 0000, 1111, … }

G3 = ( {0, 1},
       {S, ZeroList, OneList},
       { S → ZeroList | OneList,
         ZeroList → 0 | 0 ZeroList,
         OneList → 1 | 1 OneList },
       S )

Example: derivations from G3
  • Example: derivation of 0 0 0 0

S ⇒ ZeroList
  ⇒ 0 ZeroList
  ⇒ 0 0 ZeroList
  ⇒ 0 0 0 ZeroList
  ⇒ 0 0 0 0

  • Example: derivation of 1 1 1

S ⇒ OneList
  ⇒ 1 OneList
  ⇒ 1 1 OneList
  ⇒ 1 1 1

Observations about derivations
  • Every string of symbols in the derivation is a sentential form.
  • A sentence is a sentential form that has only terminal symbols.
  • A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded.
  • A derivation can be leftmost, rightmost, or neither.
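
These observations can be checked mechanically. The sketch below replays a leftmost derivation of "the dog chased a cat" using the rules of G2 from Example 2. The helper names are illustrative. Note that under G2 getting from VP to V NP takes two rule applications (VP → VP NP, then VP → V).

```python
NONTERMINALS = {"S", "NP", "VP", "Det", "N", "V"}

def expand_leftmost(form, lhs, rhs):
    """Apply rule lhs -> rhs to the leftmost nonterminal of a sentential form."""
    for i, symbol in enumerate(form):
        if symbol in NONTERMINALS:
            if symbol != lhs:
                raise ValueError(f"leftmost nonterminal is {symbol}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("form is already a sentence")

def is_sentence(form):
    """A sentence is a sentential form containing only terminal symbols."""
    return all(symbol not in NONTERMINALS for symbol in form)

steps = [
    ("S", ["NP", "VP"]), ("NP", ["Det", "N"]), ("Det", ["the"]), ("N", ["dog"]),
    ("VP", ["VP", "NP"]), ("VP", ["V"]), ("V", ["chased"]),
    ("NP", ["Det", "N"]), ("Det", ["a"]), ("N", ["cat"]),
]
form = ["S"]
derivation = [form]
for lhs, rhs in steps:
    form = expand_leftmost(form, lhs, rhs)
    derivation.append(form)

for sentential_form in derivation:
    print(" ".join(sentential_form))
```

Every printed line is a sentential form; only the last one is a sentence.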
An example programming language grammar fragment

<program> -> <stmt-list>

<stmt-list> -> <stmt>

| <stmt> ; <stmt-list>

<stmt> -> <var> = <expr>

<var> -> a

| b

| c

| d

<expr> -> <term> + <term>

| <term> - <term>

<term> -> <var>

| const

A leftmost derivation of a = b + const

<program> => <stmt-list>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <term> + <term>
          => a = <var> + <term>
          => a = b + <term>
          => a = b + const

Parse tree
  • A parse tree is a hierarchical representation of a derivation:

<program>
  <stmt-list>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const

Parse trees and compilation
  • A compiler builds a parse tree for a program (or for different parts of a program).
  • If the compiler cannot build a well-formed parse tree from a given input, it reports a compilation error.
  • The parse tree serves as the basis for semantic interpretation/translation of the program.
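
A minimal sketch of this behaviour, using the `<stmt>`, `<var>`, `<expr>` and `<term>` rules of the example grammar fragment: the parser either returns a parse tree (as nested tuples) or raises an error. The class `ParseError` and the function names are assumptions for illustration, and the input is assumed to be already split into tokens.

```python
class ParseError(Exception):
    """Raised when the input cannot be given a well-formed parse tree."""

VARS = {"a", "b", "c", "d"}

def parse_stmt(tokens):
    """Parse <stmt> -> <var> = <expr>, returning a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(check, what):
        nonlocal pos
        token = peek()
        if token is None or not check(token):
            raise ParseError(f"expected {what} at token {pos}, found {token!r}")
        pos += 1
        return token

    def var():                      # <var> -> a | b | c | d
        return ("<var>", expect(lambda t: t in VARS, "a variable"))

    def term():                     # <term> -> <var> | const
        if peek() == "const":
            return ("<term>", expect(lambda t: t == "const", "const"))
        return ("<term>", var())

    def expr():                     # <expr> -> <term> + <term> | <term> - <term>
        left = term()
        op = expect(lambda t: t in {"+", "-"}, "'+' or '-'")
        return ("<expr>", left, op, term())

    tree = ("<stmt>", var(), expect(lambda t: t == "=", "'='"), expr())
    if peek() is not None:
        raise ParseError(f"unexpected token {peek()!r} after statement")
    return tree

print(parse_stmt("a = b + const".split()))
try:
    parse_stmt("a = + b".split())
except ParseError as error:
    print("compilation error:", error)
```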
Extended BNF
  • Optional parts are placed in brackets [ ]

<proc_call> -> ident [(<expr_list>)]

  • Alternative parts of RHSs are placed inside parentheses and separated via vertical bars

<term> -> <term>(+|-) const

  • Repetitions (0 or more) are placed inside braces { }

<ident> -> letter {letter|digit}

Comparison of BNF and EBNF
  • sample grammar fragment expressed in BNF

<expr> -> <expr> + <term>

| <expr> - <term>

| <term>

<term> -> <term> * <factor>

| <term> / <factor>

| <factor>

  • same grammar fragment expressed in EBNF

<expr> -> <term> {(+ | -) <term>}

<term> -> <factor> {(* | /) <factor>}
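
One practical appeal of the EBNF form is that each `{ ... }` repetition maps onto a loop in a hand-written parser. The following sketch evaluates expressions with one function per rule. It assumes `<factor>` is simply an integer constant (the fragment above does not define it), and the tokenizer ignores any characters it does not recognise.

```python
import re

def evaluate(text):
    """Evaluate integer expressions using the EBNF rules above, one function per rule."""
    tokens = re.findall(r"\d+|[-+*/]", text)
    pos = 0

    def factor():            # <factor>: assumed here to be an integer constant
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a constant at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():              # <term> -> <factor> {(* | /) <factor>}
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in "*/":
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():              # <expr> -> <term> {(+ | -) <term>}
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in "+-":
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

print(evaluate("2+5*3"))   # 17
print(evaluate("10-8-6"))  # -4
```

Because each loop folds operands in from the left, the operators come out left-associative, matching the left-recursive BNF version.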

Ambiguity in grammars
  • A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
  • Operator precedence and operator associativity are two examples of ways in which a grammar can provide an unambiguous interpretation.
Operator precedence ambiguity

The following grammar is ambiguous:

<expr> -> <expr> <op> <expr> | const

<op> -> / | -

The grammar treats the '/' and '-' operators equivalently.
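
The ambiguity can be made concrete by enumerating every parse tree the grammar allows for one input. In this sketch numbers stand in for `const` (the values 12, 6 and 3 are arbitrary), a tree is either an operand or a `(left, op, right)` tuple, and the input is assumed to alternate operands and operators.

```python
def parse_trees(tokens):
    """Yield every parse tree of `tokens` under <expr> -> <expr> <op> <expr> | const."""
    if len(tokens) == 1:
        yield tokens[0]
        return
    # Any operator position may serve as the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                yield (left, tokens[i], right)

def value(tree):
    """Evaluate a tree, giving each tree shape its own meaning."""
    if not isinstance(tree, tuple):
        return tree
    left, op, right = tree
    return value(left) / value(right) if op == "/" else value(left) - value(right)

trees = list(parse_trees([12, "-", 6, "/", 3]))
for tree in trees:
    print(tree, "=", value(tree))
```

Two distinct trees are produced for the same sentence, and they evaluate to different results (10.0 and 2.0), which is why the ambiguity matters.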

An ambiguous grammar for arithmetic expressions

<expr> -> <expr> <op> <expr> | const

<op> -> / | -

Two parse trees for const - const / const:

<expr>
  <expr>
    const
  <op>
    -
  <expr>
    <expr>
      const
    <op>
      /
    <expr>
      const

<expr>
  <expr>
    <expr>
      const
    <op>
      -
    <expr>
      const
  <op>
    /
  <expr>
    const


Disambiguating the grammar
  • If we use the parse tree to indicate precedence levels of the operators, we can remove the ambiguity.
  • The following rules give / a higher precedence than -

<expr> -> <expr> - <term> | <term>

<term> -> <term> / const | const

<expr>
  <expr>
    <term>
      const
  -
  <term>
    <term>
      const
    /
    const

Links to BNF-style grammars for actual programming languages

Below are some links to grammars for real programming languages. Look at how the grammars are expressed.

  • http://www.schemers.org/Documents/Standards/R5RS/
  • http://www.sics.se/isl/sicstuswww/site/documentation.html

In the ones listed below, find the parts of the grammar that deal with operator precedence.

  • http://java.sun.com/docs/books/jls/index.html
  • http://www.lykkenborg.no/java/grammar/JLS3.html
  • http://www.enseignement.polytechnique.fr/profs/informatique/Jean-Jacques.Levy/poly/mainB/node23.html
  • http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html
Derivation of 2+5*3 using C grammar

<expression>
  <assignment-expression>
    <conditional-expression>
      <logical-OR-expression>
        <logical-AND-expression>
          <inclusive-OR-expression>
            <exclusive-OR-expression>
              <AND-expression>
                <equality-expression>
                  <relational-expression>
                    <shift-expression>
                      <additive-expression>
                        <additive-expression>
                          <multiplicative-expression>
                            <cast-expression>
                              <unary-expression>
                                <postfix-expression>
                                  <primary-expression>
                                    <constant>
                                      2
                        +
                        <multiplicative-expression>
                          <multiplicative-expression>
                            <cast-expression>
                              <unary-expression>
                                <postfix-expression>
                                  <primary-expression>
                                    <constant>
                                      5
                          *
                          <cast-expression>
                            <unary-expression>
                              <postfix-expression>
                                <primary-expression>
                                  <constant>
                                    3


Recursion and parentheses
  • To generate 2+3*4 or 3*4+2, the parse tree is built so that + is higher in the tree than *.
  • To force an addition to be done prior to a multiplication we must use parentheses, as in (2+3)*4.
  • Grammar captures this in the recursive case of an expression, as in the following grammar fragment:

<expr> → <expr> + <term>

<term> → <term> * <factor>

<factor> → <variable> | <constant> | “(” <expr> “)”
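
The same tree shapes can be observed in a real implementation. As an illustration only, the sketch below uses Python's standard `ast` module to inspect the tree Python builds for each expression. Python's grammar is not the fragment above, but it handles `+`, `*` and parentheses in the same way. The helper `root_operator` is a name made up for this example.

```python
import ast

def root_operator(source):
    """Return the name of the operator at the root of Python's tree for `source`."""
    tree = ast.parse(source, mode="eval").body
    return type(tree.op).__name__

print(root_operator("2+3*4"))    # Add
print(root_operator("3*4+2"))    # Add
print(root_operator("(2+3)*4"))  # Mult
```

In the first two cases the addition is at the root, so the multiplication sits lower in the tree and is evaluated first. With parentheses the multiplication moves to the root.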

Associativity of operators
  • When multiple operators appear in an expression, we need to know how to interpret the expression.
  • Some operators (e.g. +) are associative, meaning that the meaning of an expression with multiple instances of the operator is the same no matter how it is interpreted: (a+b)+c = a+(b+c)
  • Some operators (e.g. -) are not associative: (a-b)-c ≠ a-(b-c). For example, with a=10, b=8, c=6: (10-8)-6 = -4 but 10-(8-6) = 8.
  • - and / are both left-associative, meaning a-b-c is interpreted as (a-b)-c.
  • Exponentiation (**) is right-associative. This means that 2**3**2 is interpreted as 2**(3**2) (i.e. 2**9) rather than (2**3)**2 (i.e. 8**2 or 2**6).
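
The arithmetic in these bullets can be checked with a short sketch that groups the operands explicitly to the left or to the right. The helper names `fold_left` and `fold_right` are illustrative. The last two lines rely on Python's own operators, where `-` is left-associative and `**` is right-associative.

```python
from functools import reduce
import operator

def fold_left(op, operands):
    """Group as ((a op b) op c), the left-associative reading."""
    return reduce(op, operands)

def fold_right(op, operands):
    """Group as a op (b op c), the right-associative reading."""
    return reduce(lambda acc, x: op(x, acc), reversed(operands))

print(fold_left(operator.add, [10, 8, 6]), fold_right(operator.add, [10, 8, 6]))  # 24 24
print(fold_left(operator.sub, [10, 8, 6]), fold_right(operator.sub, [10, 8, 6]))  # -4 8
print(fold_left(operator.pow, [2, 3, 2]), fold_right(operator.pow, [2, 3, 2]))    # 64 512

print(10 - 8 - 6)    # -4, the left-associative reading
print(2 ** 3 ** 2)   # 512, the right-associative reading
```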
Associativity of Operators
  • Operator associativity can be encoded by a grammar. The following grammar fragment does not do this: the left and right operands of '-' are treated symmetrically.

<expr> -> <expr> - <expr> | <term>

<term> -> <var> | <const> | “(” <expr> “)”

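Two parse trees for an expression with two '-' operators, one grouping to the left and one grouping to the right:

<expr>
  <expr>
    <expr>
      <term>
    -
    <expr>
      <term>
  -
  <expr>
    <term>

<expr>
  <expr>
    <term>
  -
  <expr>
    <expr>
      <term>
    -
    <expr>
      <term>
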

Associativity of Operators
  • However, the following rules ensure that '-' is left-associative, because they prevent direct recursion with '-' in the right-hand operand.

<expr> -> <expr> - <term> | <term>

<term> -> <var> | <const> | “(” <expr> “)”

<expr>
  <expr>
    <expr>
      <term>
    -
    <term>
  -
  <term>

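
A minimal sketch of how the left-recursive rule shapes the tree: each additional `- <term>` wraps the tree built so far, so the earliest subtraction ends up deepest. For brevity, terms are plain numbers here and the parenthesised alternative of `<term>` is omitted. The function names are hypothetical.

```python
def parse_left_assoc(tokens):
    """Build the tree for <expr> -> <expr> - <term> | <term> (terms are plain operands)."""
    tree = tokens[0]                         # <expr> -> <term>
    for i in range(1, len(tokens), 2):
        if tokens[i] != "-":
            raise SyntaxError(f"expected '-', found {tokens[i]!r}")
        if i + 1 >= len(tokens):
            raise SyntaxError("missing operand after '-'")
        tree = (tree, "-", tokens[i + 1])    # <expr> -> <expr> - <term>
    return tree

def evaluate(tree):
    """Evaluate a tree made of numbers and (left, '-', right) tuples."""
    if not isinstance(tree, tuple):
        return tree
    left, _, right = tree
    return evaluate(left) - evaluate(right)

tree = parse_left_assoc([10, "-", 8, "-", 6])
print(tree)            # ((10, '-', 8), '-', 6)
print(evaluate(tree))  # -4
```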

Decision timing: Design time vs. Implementation time
  • (to come)
  • Java and precedence/associativity/left-to-right evaluation vs. C++ (?)
Theory vs. Reality

Dealing with fixed-size numeric representations
  • (to come)
  • Java/C# vs. C/C++ (size of representation – but this is not the slide to address this on: see next point).
  • Also, effect of fixed size of representations on associativity:
    • mathematically, (x+y)+z = x+(y+z)
    • in practice (+ is not always associative):
      • (large+small)+small = large
      • large+(small+small) > large
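
This effect is easy to reproduce with 64-bit floating-point numbers. In the sketch below the values 1e16 and 1.0 are chosen for illustration: near 1e16 adjacent doubles are 2 apart, so adding 1.0 on its own is lost to rounding, while adding 2.0 is not.

```python
large, small = 1e16, 1.0

left_grouped = (large + small) + small    # each small addend is rounded away
right_grouped = large + (small + small)   # the two small addends combine first

print(left_grouped == large)         # True
print(right_grouped > large)         # True
print(left_grouped == right_grouped) # False
```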