
Parsing - PowerPoint PPT Presentation


Parsing. Programming Language Concepts Lecture 6. Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida. Context-Free Grammars.

Presentation Transcript


Programming Language Concepts

Lecture 6

Prepared by

Manuel E. Bermúdez, Ph.D.

Associate Professor

University of Florida

Context-Free Grammars

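  • Definition: A context-free grammar (CFG) is a quadruple G = (Φ, Σ, P, S), where Φ is the set of nonterminals, Σ the set of terminals, P the set of productions, and S ∈ Φ the start symbol. All productions are of the form A → α, for A ∈ Φ and α ∈ (Φ ∪ Σ)*.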

  • Re-writing using grammar rules:

    • βAγ => βαγ if A → α (derivation).

String Derivations

  • Left-most derivation: At each step, the left-most nonterminal is re-written.

  • Right-most derivation: At each step, the right-most nonterminal is re-written.
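The two derivation orders can be sketched in a few lines of Python. This is only an illustration: the toy grammar (E → E + T | T, T → i) and the helper name `derive` are assumptions made for the example, not part of the lecture.

```python
# Hypothetical toy grammar for illustration: E -> E + T | T,  T -> i
NONTERMINALS = {"E", "T"}

def derive(start, steps, leftmost=True):
    """Apply each (lhs, rhs) step to the left-most or right-most nonterminal.

    Returns the list of sentential forms, one per derivation step.
    """
    form = [start]
    history = [" ".join(form)]
    for lhs, rhs in steps:
        positions = [k for k, sym in enumerate(form) if sym in NONTERMINALS]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != lhs:
            raise ValueError(f"expected {lhs} at position {pos}, found {form[pos]}")
        form[pos:pos + 1] = rhs          # re-write the chosen nonterminal
        history.append(" ".join(form))
    return history

left = derive("E", [("E", ["E", "+", "T"]), ("E", ["T"]),
                    ("T", ["i"]), ("T", ["i"])])
right = derive("E", [("E", ["E", "+", "T"]), ("T", ["i"]),
                     ("E", ["T"]), ("T", ["i"])], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both runs end in the same sentence, i + i, and use the same productions; only the order of the re-writes differs.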

Derivation Trees

Derivation trees:

Describe re-writes, independently of the order (left-most or right-most).

  • Each tree branch matches a production rule in the grammar.

Derivation Trees


  • Leaves are terminals.

  • Bottom contour is the sentence.

  • Left recursion causes left branching.

  • Right recursion causes right branching.

Goal of Parsing

  • Examine input string, determine whether it's legal.

  • Equivalent to building derivation tree.

  • Added benefit: tree embodies syntactic structure of input.

  • Therefore, tree should be unique.

Ambiguous Grammars

  • Definition: A CFG is ambiguous if there exist two different right-most (or left-most, but not both) derivations for some sentence z.

  • (Equivalent) Definition: A CFG is ambiguous if there exist two different derivation trees for some sentence z.

Ambiguous Grammars

Classic ambiguities:

  • Simultaneous left/right recursion:

    E → E + E

    → i

  • Dangling else problem:

    S → if E then S

    → if E then S else S
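To make the first ambiguity concrete, the following sketch counts the distinct derivation trees that the grammar E → E + E | i admits for a given token sequence. The function name `count_trees` and the whitespace-separated token format are assumptions made for this example.

```python
from functools import lru_cache

def count_trees(tokens):
    """Count derivation trees for E -> E + E | i over the given tokens."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "i" else 0
        total = 0
        # Try every '+' as the root operator of E -> E + E.
        for k in range(lo + 1, hi - 1):
            if tokens[k] == "+":
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))

print(count_trees("i + i".split()))
print(count_trees("i + i + i".split()))
```

Any sentence with more than one derivation tree shows that the grammar is ambiguous; here that already happens for i + i + i.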

Operator Precedence and Associativity

  • Let’s build a CFG for expressions consisting of:

    • elementary identifier i.

    • + and - (binary ops) have lowest precedence, and are left associative.

    • * and / (binary ops) have middle precedence, and are right associative.

    • + and - (unary ops) have highest precedence, and are right associative.

Corresponding Grammar for Expressions

E → E + T        E consists of T's,
  → E - T        separated by -'s and +'s
  → T            (lowest precedence).

T → F * T        T consists of F's,
  → F / T        separated by *'s and /'s
  → F            (next precedence).

F → - F          F consists of a single P,
  → + F          preceded by +'s and -'s
  → P            (next precedence).

P → '(' E ')'    P consists of a parenthesized E,
  → i            or a single i (highest precedence).

Operator Precedence and Associativity

  • Operator precedence:

    • The lower in the grammar, the higher the precedence.

  • Operator Associativity:

    • Tie breaker for precedence.

    • Left recursion in the grammar means

      • left associativity of the operator,

      • left branching in the tree.

    • Right recursion in the grammar means

      • right associativity of the operator,

      • right branching in the tree.

Building Derivation Trees

Sample Input :

- + i - i * ( i + i ) / i + i

(Human) derivation tree construction:

  • Bottom-up.

  • On each pass, scan entire expression, process operators with highest precedence (parentheses are highest).

  • Lowest precedence operators are last, at the top of tree.

Abstract Syntax Trees

  • AST is a condensed version of the derivation tree.

  • No noise (intermediate nodes).

  • String-to-tree transduction grammar:

    • rules of the form A → ω => 's'.

  • Build 's' tree node, with one child per tree from each nonterminal in ω.


E → E + T        => +
  → E - T        => -
  → T

T → F * T        => *
  → F / T        => /
  → F

F → - F          => neg
  → + F          => +
  → P

P → '(' E ')'
  → i            => i


Sample Input: - + i - i * ( i + i ) / i + i
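The transduction grammar above can be tried out with a small hand-written parser. The sketch below is an illustration under stated assumptions, not the lecture's own implementation: tokens are separated by spaces, AST nodes are nested Python tuples, and the left-recursive E rules are handled with a loop (a common recursive-descent workaround) so that the resulting tree still branches to the left.

```python
def parse_expression(text):
    """Build an AST (nested tuples) for the lecture's expression grammar."""
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected {tok!r} at token {pos}")
        pos += 1
        return tok

    def parse_e():
        # E -> E + T | E - T | T : left recursion becomes a loop (left branching).
        node = parse_t()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, parse_t())
        return node

    def parse_t():
        # T -> F * T | F / T | F : right recursion (right branching).
        left = parse_f()
        if peek() in ("*", "/"):
            op = advance()
            return (op, left, parse_t())
        return left

    def parse_f():
        # F -> - F => neg | + F => + | P
        if peek() == "-":
            advance()
            return ("neg", parse_f())
        if peek() == "+":
            advance()
            return ("+", parse_f())
        return parse_p()

    def parse_p():
        # P -> '(' E ')' | i => i ; parentheses leave no node in the AST.
        if peek() == "(":
            advance("(")
            node = parse_e()
            advance(")")
            return node
        advance("i")
        return "i"

    tree = parse_e()
    if peek() is not None:
        raise SyntaxError(f"trailing input at token {pos}")
    return tree

print(parse_expression("- + i - i * ( i + i ) / i + i"))
```

In the printed tree the binary + and - nodes nest to the left and the * and / nodes nest to the right, matching the associativity rules stated earlier.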

String-to-Tree Transduction

  • We transduce from vocabulary of input symbols, to vocabulary of tree node names.

  • Could eliminate construction of unary + node, anticipating semantics.

    F → - F        => neg
      → + F        // no more unary + node
      → P

The Game of Syntactic Dominoes

  • The grammar:

    E → E+T        T → P*T        P → (E)
      → T            → P            → i

  • The playing pieces: An arbitrary supply of each piece (one per grammar rule).

  • The game board:

    • Start domino at the top.

    • Bottom dominoes are the "input."

The Game of Syntactic Dominoes

  • Game rules:

    • Add game pieces to the board.

    • Match the flat parts and the symbols.

    • Lines are infinitely elastic.

  • Object of the game:

    • Connect start domino with the input dominoes.

    • Leave no unmatched flat parts.

Parsing Strategies

  • Same as for the game of syntactic dominoes.

    • “Top-down” parsing: start at the start symbol, work toward the input string.

    • “Bottom-up” parsing: start at the input string, work towards the goal symbol.

  • In either strategy, the input can be processed left-to-right or right-to-left.

Top-Down Parsing

  • Attempt a left-most derivation, by predicting the re-write that will match the remaining input.

  • Use a string (a stack, really) from which the input can be derived.

Top-Down Parsing

Start with S on the stack.

At every step, two alternatives:

  • The stack begins with a terminal t. Match t against the first input symbol.

  • The stack begins with a nonterminal A. Consult an OPF (Omniscient Parsing Function) to determine which production for A would lead to a match with the first symbol of the input.

    The OPF does the “predicting” in such a predictive parser.

Classical Top-Down Parsing Algorithm

Push (Stack, S);
while not Empty (Stack) do
  if Top(Stack) is a terminal
  then if Top(Stack) = Head(input)
       then input := Tail(input);
            Pop(Stack)
       else error (Stack, input)
  else P := OPF (Stack, input);
       Push (Pop(Stack), RHS(P))
od
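The loop above translates almost directly into Python. The sketch below is a minimal illustration, assuming a hypothetical toy grammar (S → A, A → bAd | empty), a hand-written OPF table keyed by the top-of-stack nonterminal and the next input symbol, and "$" as an assumed end-of-input marker. None of these names come from the lecture.

```python
END = "$"  # assumed end-of-input marker

# Hypothetical toy grammar: S -> A,  A -> b A d | (empty)
# OPF table: (nonterminal on top of stack, next input symbol) -> right-hand side
OPF = {
    ("S", "b"): ["A"],
    ("S", END): ["A"],
    ("A", "b"): ["b", "A", "d"],
    ("A", "d"): [],
    ("A", END): [],
}
NONTERMINALS = {"S", "A"}

def parse(symbols, start="S"):
    """Return the list of productions used, or raise SyntaxError."""
    remaining = list(symbols) + [END]
    stack = [start]              # top of stack is the end of the list
    used = []
    while stack:
        top = stack[-1]
        head = remaining[0]
        if top not in NONTERMINALS:
            # Terminal on top: it must match the first input symbol.
            if top != head:
                raise SyntaxError(f"expected {top!r}, found {head!r}")
            stack.pop()
            remaining.pop(0)
        else:
            # Nonterminal on top: ask the OPF which production to predict.
            rhs = OPF.get((top, head))
            if rhs is None:
                raise SyntaxError(f"no production for {top} on {head!r}")
            stack.pop()
            stack.extend(reversed(rhs))  # left-most RHS symbol ends up on top
            used.append((top, tuple(rhs)))
    if remaining != [END]:
        raise SyntaxError(f"unconsumed input: {remaining[:-1]}")
    return used

for production in parse("bbdd"):
    print(production)
```

The productions are reported in the order of a left-most derivation, which is what a top-down parser constructs.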


Top-Down Parsing

  • Most parsing methods impose bounds on the amount of stack lookback and input lookahead. For programming languages, a common choice is (1,1).

  • We must define OPF (A,t), where A is the top element of the stack, and t is the first symbol on the input.

  • Storage requirements: O(n²), where n is the size of the grammar vocabulary (a few hundred).

LL(1) Grammars


A CFG G is LL(1) (Left-to-right, Left-most, one-symbol lookahead)

iff for every nonterminal A, and for every pair of distinct productions A → α, A → β (α ≠ β),

Select (A → α) ∩ Select (A → β) = ∅

  • Previous example: Grammar is not LL(1).

  • More later on why, and what to do about it.


S → A          {b, ⊥}

A → bAd        {b}

  → ε          {d, ⊥}

(Select sets are shown in braces; ⊥ marks the end of the input.)


Grammar is LL(1)!

(At most) one production per entry.
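The LL(1) condition can be checked mechanically. The sketch below computes Select sets with the usual First/Follow fixed-point iteration and then tests pairwise disjointness. It is an illustration only: the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples), the function names, and "$" as the end-of-input marker are all assumptions of this example.

```python
END = "$"  # assumed end-of-input marker

def select_sets(grammar, start):
    """grammar: dict nonterminal -> list of right-hand sides (tuples of symbols)."""
    nonterms = set(grammar)
    first = {a: set() for a in nonterms}
    follow = {a: set() for a in nonterms}
    follow[start].add(END)
    nullable = set()

    def first_of(seq):
        """First set of a symbol sequence, plus whether it can derive empty."""
        out = set()
        for sym in seq:
            if sym not in nonterms:
                out.add(sym)
                return out, False
            out |= first[sym]
            if sym not in nullable:
                return out, False
        return out, True

    changed = True
    while changed:  # iterate until First, Follow and nullable stop growing
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                f, empty = first_of(rhs)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                if empty and a not in nullable:
                    nullable.add(a)
                    changed = True
                for k, sym in enumerate(rhs):
                    if sym in nonterms:
                        f, empty = first_of(rhs[k + 1:])
                        new = f | (follow[a] if empty else set())
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True

    select = {}
    for a, rhss in grammar.items():
        for rhs in rhss:
            f, empty = first_of(rhs)
            select[(a, rhs)] = f | (follow[a] if empty else set())
    return select

def is_ll1(grammar, start):
    """True if the Select sets of each nonterminal's productions are disjoint."""
    select = select_sets(grammar, start)
    for a, rhss in grammar.items():
        for i in range(len(rhss)):
            for j in range(i + 1, len(rhss)):
                if select[(a, rhss[i])] & select[(a, rhss[j])]:
                    return False
    return True

small = {"S": [("A",)], "A": [("b", "A", "d"), ()]}
left_recursive = {"E": [("E", "+", "T"), ("T",)], "T": [("i",)]}
print(is_ll1(small, "S"), is_ll1(left_recursive, "E"))
```

For the small grammar S → A, A → bAd | ε the computed Select sets are pairwise disjoint, so each parse-table entry holds at most one production. The left-recursive fragment E → E + T | T fails the test because both E productions can start with i.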


