
Chapter 9

Syntax Analysis

SEG2101 Chapter 9

Contents
  • Context free grammars
  • Top-down parsing
  • Bottom-up parsing
  • Attribute grammars
  • Dynamic semantics
  • Tools for syntax analysis
  • Chomsky’s hierarchy

The Role of Parser

9.1: Context Free Grammars
  • A context free grammar consists of terminals, nonterminals, a start symbol, and productions.
  • Terminals are the basic symbols from which strings are formed.
  • Nonterminals are syntactic variables that denote sets of strings.
  • One nonterminal is distinguished as the start symbol.
  • The productions of a grammar specify the manner in which the terminals and nonterminals can be combined to form strings.
  • A language that can be generated by a grammar is said to be a context-free language.
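The four components just listed map directly onto plain data. A minimal Python sketch (the grammar and helper below are illustrative, not from the slides):

```python
# A context-free grammar as plain data: terminals, nonterminals,
# a start symbol, and productions (each body is a list of symbols).
GRAMMAR = {
    "terminals": {"id", "+", "*", "(", ")", "-"},
    "nonterminals": {"E"},
    "start": "E",
    "productions": {
        "E": [["E", "+", "E"], ["E", "*", "E"],
              ["(", "E", ")"], ["-", "E"], ["id"]],
    },
}

def is_grammar_string(symbols, grammar):
    """True if every symbol is a terminal or a nonterminal of the grammar."""
    vocab = grammar["terminals"] | grammar["nonterminals"]
    return all(s in vocab for s in symbols)

print(is_grammar_string(["-", "(", "id", "+", "id", ")"], GRAMMAR))  # True
```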

Example of Grammar

Notational Conventions
  • Aho P.166
  • Example P.167

EEAE|(E)|-E|id

A+|-|*|/|

Derivations
  • E-E is read “E derives -E”
  • E-E-(E)=-(id)is called aderivation of -(id) from E.
  • If A is a production and  and  are arbitrary strings of grammar symbols, we say A  .
  • If 12... n, we say 1derives n.

Derivations (II)
  • ⇒ means “derives in one step.”
  • ⇒* means “derives in zero or more steps.”
    • α ⇒* α
    • if α ⇒* β and β ⇒ γ then α ⇒* γ
  • ⇒+ means “derives in one or more steps.”
  • If S ⇒* α, where α may contain nonterminals, then we say that α is a sentential form.

Derivations (III)
  • G: grammar, S: start symbol, L(G): the language generated by G.
  • Strings in L(G) may contain only terminal symbols of G.
  • A string of terminals w is said to be in L(G) if and only if S ⇒+ w.
  • The string w is called a sentence of G.
  • A language that can be generated by a grammar is said to be a context-free language.
  • If two grammars generate the same language, the grammars are said to be equivalent.

Derivations (IV)

EEAE|(E)|-E|id

A+|-|*|/|

  • The string -(id+id) is a sentence of the above grammar because

E-E-(E+E)-(id+E)-(id+id)

We write E-(id+id)

*
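Each step of such a derivation replaces one nonterminal occurrence with a production body, which can be checked mechanically. A Python sketch (it expands A explicitly, a step the slide's compressed derivation skips):

```python
# Productions of the example grammar; each body is a list of symbols.
PRODUCTIONS = {
    "E": [["E", "A", "E"], ["(", "E", ")"], ["-", "E"], ["id"]],
    "A": [["+"], ["-"], ["*"], ["/"]],
}

def one_step(src, dst, productions):
    """True if dst follows from src by replacing a single nonterminal
    occurrence with one of its production bodies."""
    for i, sym in enumerate(src):
        for body in productions.get(sym, []):
            if src[:i] + body + src[i + 1:] == dst:
                return True
    return False

steps = [["E"], ["-", "E"], ["-", "(", "E", ")"],
         ["-", "(", "E", "A", "E", ")"],
         ["-", "(", "id", "A", "E", ")"],
         ["-", "(", "id", "+", "E", ")"],
         ["-", "(", "id", "+", "id", ")"]]
print(all(one_step(a, b, PRODUCTIONS) for a, b in zip(steps, steps[1:])))  # True
```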

Parse Tree

EE+E|E*E|(E)|-E|id

Parse Tree (II)

Two Parse Trees

Ambiguity
  • A grammar that produces more than one parse tree for some sentence is said to be ambiguous.

Eliminating Ambiguity
  • Sometimes an ambiguous grammar can be rewritten to eliminate the ambiguity.
    • E.g. “match each else with the closest unmatched then”

Eliminating Left Recursion
  • A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α.
  • A → Aα | β can be replaced by

A → βA’

A’ → αA’ | ε

  • A → Aα₁ | Aα₂ | … | Aαₘ | β₁ | β₂ | … | βₙ can be replaced by

A → β₁A’ | β₂A’ | … | βₙA’

A’ → α₁A’ | α₂A’ | … | αₘA’ | ε
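The transformation is mechanical enough to code directly. A sketch, with ε represented as an empty list and the new nonterminal named by appending a prime:

```python
def remove_immediate_left_recursion(head, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn  as
       A  -> b1 A' | ... | bn A'
       A' -> a1 A' | ... | am A' | eps
    Bodies are lists of symbols; [] stands for the empty string."""
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    rest = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}  # nothing to do
    new = head + "'"
    return {
        head: [beta + [new] for beta in rest],
        new: [alpha + [new] for alpha in recursive] + [[]],
    }

# A -> Ac | Aad | bd | eps  becomes  A -> bdA' | A' ; A' -> cA' | adA' | eps
print(remove_immediate_left_recursion(
    "A", [["A", "c"], ["A", "a", "d"], ["b", "d"], []]))
```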

SAa|b

AAc|Sd|

AAc|Aad|bd|

SAa|b

AbdA’|A’

A’cA’|adA’|

Examples

Left Factoring
  • Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.
  • The basic idea is that when it is not clear which of two alternative productions to use to expand a nonterminal A, we may be able to rewrite the A-productions to defer the decision until we have seen enough of the input to make the right choice.
  • stmt → if expr then stmt else stmt | if expr then stmt
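One round of that transformation can be sketched directly (symbols are strings, alternatives are lists of symbols; the helper names are illustrative):

```python
def longest_common_prefix(seqs):
    """Longest prefix shared by all the given symbol sequences."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, alternatives):
    """One round of left factoring: group alternatives by first symbol;
    if two or more share a prefix alpha, rewrite them as A -> alpha A'
    with A' holding the distinct tails ([] stands for epsilon)."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)
    for start, group in groups.items():
        if start is not None and len(group) > 1:
            alpha = longest_common_prefix(group)
            new = head + "'"
            factored = [alt for alt in alternatives if alt not in group]
            factored.append(alpha + [new])
            return {head: factored, new: [alt[len(alpha):] for alt in group]}
    return {head: alternatives}  # nothing to factor

# Dangling else: S -> iEtS | iEtSeS | a  becomes  S -> a | iEtSS' ; S' -> eps | eS
print(left_factor("S", [["i", "E", "t", "S"],
                        ["i", "E", "t", "S", "e", "S"], ["a"]]))
```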

Algorithm: Left Factoring

Left Factoring (example p178)
  • A → αβ₁ | αβ₂ is left-factored into A → αA’ and A’ → β₁ | β₂.
  • The following grammar abstracts the dangling-else problem:
    • S → iEtS | iEtSeS | a
    • E → b

9.2: Top Down Parsing
  • Recursive-descent parsing
  • Predictive parsers
  • Nonrecursive predictive parsing
  • FIRST and FOLLOW
  • Construction of predictive parsing table
  • LL(1) grammars
  • Error recovery in predictive parsing (if time permits)

Recursive-Descent Parsing
  • Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string.
  • It can also be viewed as an attempt to construct a parse tree for the input string, starting from the root and creating the nodes of the parse tree in preorder.

Grammar:

Input string

w = cad

Predictive Parsers
  • By carefully writing a grammar, eliminating left recursion, and left factoring the resulting grammar, we can obtain a grammar that can be parsed by a recursive-descent parser that needs no backtracking, i.e., a predictive parser.

ScAd

AaA’

A’b|

Predictive Parser (II)
  • Recursive-descent parsing is a top-down method of syntax analysis in which we execute a set of recursive procedures to process the input.
  • A procedure is associated with each nonterminal of a grammar.
  • Predictive parsing is recursive-descent parsing in which the lookahead symbol unambiguously determines the procedure selected for each nonterminal.
  • The sequence of procedures called in processing the input implicitly defines a parse tree for the input.

Parsing Table M

Grammar:

Input:

id + id * id

FIRST and FOLLOW
  • If  is any string of grammar symbols, FIRST() is the set of terminals that begin the strings derived from . If  then  is also in FIRST().
  • FOLLOW(A), for nonternimal A, is the set of terminals a that can appear immediately to the right of A in some sentential form, i.e. the set of terminals a such that there exists a derivation of the form SAa for some  and .
  • If A can be the rightmost symbol in some sentential form, the $ is in FOLLOW(A).

*

Compute FIRST(X)

Compute FOLLOW(A)
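The slide images with the FIRST and FOLLOW rules are not in the transcript, but the standard fixed-point computation can be sketched as follows (the example grammar below is the usual left-factored expression grammar, an assumption since the slides' own example is missing):

```python
EPS = "ε"  # marker for "derives the empty string"; ε-bodies are written []

def compute_first_follow(productions, start):
    """Fixed-point computation of FIRST and FOLLOW.
    productions maps nonterminal -> list of bodies (lists of symbols)."""
    nonterms = set(productions)
    terminals = {s for bodies in productions.values()
                 for body in bodies for s in body} - nonterms
    first = {s: {s} for s in terminals}
    first.update({n: set() for n in nonterms})

    def first_of(seq):
        """FIRST of a sequence of grammar symbols."""
        out = set()
        for sym in seq:
            out |= first[sym] - {EPS}
            if EPS not in first[sym]:
                return out
        return out | {EPS}  # every symbol was nullable (or seq is empty)

    changed = True
    while changed:  # grow FIRST sets until nothing changes
        changed = False
        for head, bodies in productions.items():
            for body in bodies:
                new = first_of(body)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True

    follow = {n: set() for n in nonterms}
    follow[start].add("$")
    changed = True
    while changed:  # grow FOLLOW sets until nothing changes
        changed = False
        for head, bodies in productions.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in nonterms:
                        continue
                    trailer = first_of(body[i + 1:])
                    new = trailer - {EPS}
                    if EPS in trailer:
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow

# Left-factored, non-left-recursive expression grammar:
G = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
first, follow = compute_first_follow(G, "E")
print(sorted(first["E"]))    # ['(', 'id']
print(sorted(follow["F"]))   # ['$', ')', '*', '+']
```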

LL(1) Grammars
  • A grammar whose parsing table has no multiply-defined entries is said to be LL(1).
  • First L: scanning from left to right
  • Second L: producing a leftmost derivation
  • 1: using one input symbol of lookahead at each step to make parsing action decision.

Properties of LL(1)
  • No ambiguous or left-recursive grammar can be LL(1).
  • Grammar G is LL(1) iff whenever A → α | β are two distinct productions of G:
    • For no terminal a do both α and β derive strings beginning with a.

FIRST(α) ∩ FIRST(β) = ∅

    • At most one of α and β can derive the empty string.
    • If β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).

FIRST(αFOLLOW(A)) ∩ FIRST(βFOLLOW(A)) = ∅
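These conditions are exactly what predictive-parsing-table construction detects as a multiply-defined entry. The sketch below fills a table from hand-supplied FIRST and FOLLOW sets and flags conflicts, using the dangling-else grammar from the left-factoring example (which is not LL(1)); the FIRST/FOLLOW sets were worked out by hand:

```python
def build_ll1_table(productions, first, follow, eps="ε"):
    """Predictive parsing table: add A -> body to M[A, a] for every
    a in FIRST(body); if body can derive eps, also for every a in
    FOLLOW(A). A cell with two entries means the grammar is not LL(1)."""
    def first_of(seq):
        out = set()
        for sym in seq:
            f = first.get(sym, {sym})  # terminals: FIRST(a) = {a}
            out |= f - {eps}
            if eps not in f:
                return out
        return out | {eps}

    table, is_ll1 = {}, True
    for head, bodies in productions.items():
        for body in bodies:
            f = first_of(body)
            targets = (f - {eps}) | (follow[head] if eps in f else set())
            for a in targets:
                entries = table.setdefault((head, a), [])
                entries.append(body)
                if len(entries) > 1:
                    is_ll1 = False  # multiply-defined entry
    return table, is_ll1

# Dangling-else grammar: S -> iEtSS' | a ; S' -> eS | eps ; E -> b
G = {"S": [["i", "E", "t", "S", "S'"], ["a"]],
     "S'": [["e", "S"], []],
     "E": [["b"]]}
FIRST = {"S": {"i", "a"}, "S'": {"e", "ε"}, "E": {"b"}}
FOLLOW = {"S": {"$", "e"}, "S'": {"$", "e"}, "E": {"t"}}
table, is_ll1 = build_ll1_table(G, FIRST, FOLLOW)
print(is_ll1)                    # False: M[S', e] is multiply defined
print(len(table[("S'", "e")]))   # 2
```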

LL(1) Grammars: Example

Error recovery in predictive parsing
  • An error is detected during predictive parsing when the terminal on top of the stack does not match the next input symbol, or when nonterminal A is on top of the stack, a is the next input symbol, and parsing table entry M[A,a] is empty.
  • Panic-mode error recovery is based on the idea of skipping symbols on the input until a token in a selected set of synchronizing tokens appears.

How to select synchronizing set?
  • Place all symbols in FOLLOW(A) into the synchronizing set for nonterminal A. If we skip tokens until an element of FOLLOW(A) is seen and pop A from the stack, it is likely that parsing can continue.
  • We might add keywords that begin statements to the synchronizing sets for the nonterminals generating expressions.

How to select synchronizing set? (II)
  • If a nonterminal can generate the empty string, then the production deriving  can be used as a default. This may postpone some error detection, but cannot cause an error to be missed. This approach reduces the number of nonterminals that have to be considered during error recovery.
  • If a terminal on top of the stack cannot be matched, a simple idea is to pop the terminal and issue a message saying that the terminal was inserted.

Example: error recovery

“synch” entries indicate synchronizing tokens obtained from the FOLLOW set of the nonterminal in question.

If the parser looks up entry M[A,a] and finds that it is blank, the input symbol a is skipped.

If the entry is synch, then the nonterminal on top of the stack is popped.

If a token on top of the stack does not match the input symbol, then we pop the token from the stack.

9.3: Bottom Up Parsing and LR Parsers
  • Shift-reduce parsing attempts to construct a parse tree for an input string beginning at the leaves (bottom) and working up towards the root (top).
  • “Reducing” a string w to the start symbol of a grammar.
  • At each reduction step a particular substring matching the right side of a production is replaced by the symbol on the left of that production, and if the substring is chosen correctly at each step, a rightmost derivation is traced out in reverse.

Example
  • Grammar:

SaABe

AAbc|b

Bd

  • Reduction:

abbcde

aAbcde

aAde

aABe

S
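This reduction can be replayed with a naive handle search (longest production body, leftmost match). That heuristic happens to work for this toy grammar; a real shift-reduce parser selects handles with a parsing table:

```python
# Productions of the toy grammar, as (head, body) pairs of strings.
RULES = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]

def reduce_to_start(w, rules, start="S", trace=None):
    """Repeatedly replace the leftmost occurrence of the longest
    matching production body with its head; None on failure."""
    rules = sorted(rules, key=lambda r: -len(r[1]))  # longest body first
    while w != start:
        for head, body in rules:
            i = w.find(body)
            if i != -1:
                w = w[:i] + head + w[i + len(body):]
                if trace is not None:
                    trace.append(w)
                break
        else:
            return None  # no reduction applies: parse error
    return w

steps = []
print(reduce_to_start("abbcde", RULES, trace=steps))  # S
print(steps)  # ['aAbcde', 'aAde', 'aABe', 'S']
```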

Operator-Precedence Parsing

Grammar for expression

Can be rewritten as

With the precedence relations inserted, id + id * id can be written as:

LR(k) Parsers
  • L: left-to-right scanning of the input
  • R: constructing a rightmost derivation in reverse
  • k: the number of input symbols of lookahead that are used in making parsing decisions.

LR Parsing

Shift-Reduce Parser

Example LR Parsing Table

9.4: Attribute Grammars
  • An attribute grammar is a device used to describe more of the structure of a programming language than is possible with a context-free grammar.
  • Some semantic properties can be evaluated at compile time; these are called "static semantics." Other properties are determined at execution time; these are called "dynamic semantics."
  • The static semantics is often represented by semantic attributes which are associated with the nonterminals.

Attribute Grammars
  • Grammars with added attributes, attribute computation functions, and predicate functions.
  • Attributes: similar to variables
  • Attribute computation functions: specify how attribute values are computed
  • Predicate functions: state some of the syntax and static semantic rules of the language

Example (II)

Example (III)

Example (IV)

9.5: Dynamic Semantics
  • Informal definition: Only informal explanations are given (in natural language) which define the meaning of programs (e.g. language reference manuals, etc.).
  • Operational semantics: The meaning of the constructs of the programming language is defined in terms of the translation into another lower-level language and the semantics of this lower-level language. Usually only the translation is defined formally, the semantics of the lower-level language is defined informally.

Axiomatic Semantics
  • Axiomatic semantics was defined to prove the correctness of programs.
  • This approach is related to the approach of defining the semantics of a procedure (independently of its code) in terms of pre- and post-conditions that define properties of input and output parameters and values of state variables.
  • Weakest precondition: For a given statement, and a given postcondition that should hold after its execution, the weakest precondition is the weakest condition which ensures, when it holds before the execution of the statement, that the given postcondition holds afterwards.

Denotational Semantics
  • Denotational semantics is a method for describing the meaning of programs.
  • It is based on recursive function theory.
  • Grammar:

<bin_num>  0

| 1

| <bin_num> 0

| <bin_num> 1

  • Function Mmin:
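The definition of M_bin itself is not reproduced in the transcript; the usual denotation for this grammar (an assumption) maps a numeral to the number it denotes, digit by digit, following the recursive structure of the grammar:

```python
def M_bin(s):
    """Denotational semantics of <bin_num>:
       M('0') = 0, M('1') = 1,
       M(n '0') = 2 * M(n),  M(n '1') = 2 * M(n) + 1."""
    if s in ("0", "1"):
        return int(s)
    return 2 * M_bin(s[:-1]) + (1 if s[-1] == "1" else 0)

print(M_bin("110"))  # 6
```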

Syntax Graphs
  • A graph is a collection of nodes, some of which are connected by lines (edges).
  • A directed graph is one in which the lines are directional.
  • A parse tree is a restricted form of directed graph.
  • A syntax graph is a directed graph representing the information in BNF rules.

9.7: Chomsky Hierarchy

Turing Machine

Turing Machine (II)
  • Unrestricted grammar
  • Recognized by Turing machine
  • It consists of a read-write head that can be positioned anywhere along an infinite tape.
  • It is not a useful class of language for compiler design.

Linear-Bounded Automata

Linear-Bounded Automata
  • Context-sensitive
  • Restrictions
    • Left-hand of each production must have at least one nonterminal in it
    • Right-hand side must not have fewer symbols than the left
    • There can be no empty productions (N → ε)

Push-Down Automata

Push-Down Automata (II)
  • Context-free
  • Recognized by push-down automata
  • Can only read its input tape but has a stack that can grow to arbitrary depth where it can save information
  • An automaton with a read-only tape and two independent stacks is equivalent to a Turing machine.
  • It allows at most a single nonterminal (and no terminal) on the left-hand side of each production.

Finite-State Automata

Finite State Automata (II)
  • Regular language
  • Anything that must be remembered about the context of a symbol on the input tape must be preserved in the state of the machine.
  • It allows only one symbol (a nonterminal) on the left-hand side, and only one or two symbols on the right.
