

## Parsing

Recognition of strings in a language

### Graph of a Grammar

• Represents leftmost derivations of a CFG.

• A path from node S to a node w is a leftmost derivation of w.
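As a minimal sketch of how the edges of this graph can be generated, the function below expands the leftmost nonterminal of a left sentential form with every applicable rule. The grammar `S -> aSb | ab`, the dictionary encoding, and the name `children` are assumptions made for illustration, not part of the slides.

```python
# Hypothetical toy grammar: S -> aSb | ab.
# Keys are nonterminals; every other symbol is treated as a terminal.
GRAMMAR = {"S": ["aSb", "ab"]}

def children(form, grammar):
    """Return the forms reachable from `form` by one leftmost derivation step."""
    for i, symbol in enumerate(form):
        if symbol in grammar:  # leftmost nonterminal found
            return [form[:i] + rhs + form[i + 1:] for rhs in grammar[symbol]]
    return []  # no nonterminal left: `form` is a terminal string (a leaf)

print(children("S", GRAMMAR))    # ['aSb', 'ab']
print(children("aSb", GRAMMAR))  # ['aaSbb', 'aabb']
print(children("ab", GRAMMAR))   # []
```

Each node has one child per rule of its leftmost nonterminal, which is why the number of children is always finite.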

### Properties of Graph of a Grammar

• Every node has a finite number of children.

• The number of leaves is infinite if the language is infinite (the typical case).

• There can be infinitely long paths (derivations), which cause loops in depth-first traversals.

(Illustrates ambiguity in the grammar.)

• Parser

A program that determines whether a string w is in L(G) by constructing a derivation. Equivalently, it searches the graph of G.

• Top-down parsers

• Construct the derivation tree from root to leaves.

• Build a leftmost derivation.

• Bottom-up parsers

• Construct the derivation tree from leaves to root.

• Build a rightmost derivation in reverse.
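To illustrate what "rightmost derivation in reverse" means, here is a small shift-reduce sketch. It reduces greedily whenever the top of the stack matches a right-hand side, which happens to work for the hypothetical grammar `S -> aSb | ab` but is not a general bottom-up parsing algorithm; the names `RULES` and `shift_reduce` are assumptions.

```python
# Hypothetical grammar: S -> aSb | ab, stored as (lhs, rhs) pairs.
RULES = [("S", "aSb"), ("S", "ab")]

def shift_reduce(w, rules=RULES, start="S"):
    """Greedy shift-reduce recognizer.

    Returns the sentential form obtained after each reduction,
    or None if the input cannot be reduced to the start symbol.
    """
    stack, rest, forms = "", w, []
    while True:
        for lhs, rhs in rules:
            if stack.endswith(rhs):            # reduce: replace rhs by lhs
                stack = stack[:-len(rhs)] + lhs
                forms.append(stack + rest)
                break
        else:                                  # no reduction was possible
            if not rest:
                break                          # input consumed, stop
            stack, rest = stack + rest[0], rest[1:]  # shift one symbol
    return forms if stack == start else None

print(shift_reduce("aabb"))  # ['aSb', 'S']
print(shift_reduce("abab"))  # None
```

Reading the result for `aabb` backwards gives S, aSb, aabb, which is a rightmost derivation of the input.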

[Figure: Derivation trees for a leftmost derivation and a rightmost derivation, rooted at S]

[Figure: Derivation trees for a rightmost derivation in reverse, built up from the leaves to S]

### Search the graph of a grammar breadth-first

Uses: Queue

(+) Always terminates with a shortest derivation when w is in L(G).

(-) Inefficient in general.
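The breadth-first search can be sketched with a queue of partial derivations. This is an illustrative sketch only: the grammar `S -> aSb | ab`, the function name `bfs_parse`, and the `max_steps` safety bound are assumptions, and no pruning is applied, so a string outside the language is rejected only when the bound runs out.

```python
from collections import deque

# Hypothetical toy grammar: S -> aSb | ab.
GRAMMAR = {"S": ["aSb", "ab"]}

def bfs_parse(w, grammar, start="S", max_steps=1000):
    """Breadth-first top-down search for a leftmost derivation of `w`.

    Returns the derivation as a list of sentential forms, or None if the
    queue empties or the step budget (an assumption of this sketch) runs out.
    """
    queue = deque([[start]])            # each entry is a derivation so far
    for _ in range(max_steps):
        if not queue:
            return None
        derivation = queue.popleft()
        form = derivation[-1]
        if form == w:
            return derivation
        for i, symbol in enumerate(form):
            if symbol in grammar:       # expand the leftmost nonterminal
                for rhs in grammar[symbol]:
                    queue.append(derivation + [form[:i] + rhs + form[i + 1:]])
                break
    return None

print(bfs_parse("aabb", GRAMMAR))  # ['S', 'aSb', 'aabb']
print(bfs_parse("abb", GRAMMAR))   # None
```

Because forms are dequeued level by level, the first derivation found uses the fewest rule applications.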

### Search the graph of a grammar depth-first

Uses: Stack

(-) Can get into infinite loops (e.g., left recursion).

(+) Efficient in general.
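A depth-first version replaces the queue with a stack. The sketch below uses a hypothetical left-recursive grammar `S -> Sa | b` to show how rule order decides whether the search loops; the function name `dfs_parse` and the `max_steps` budget (which stands in for "runs forever") are assumptions.

```python
def dfs_parse(w, grammar, start="S", max_steps=1000):
    """Depth-first top-down search for `w`.

    Returns True if `w` is derived, False if the whole graph was explored,
    and None if the step budget ran out (the search was still descending).
    """
    stack = [start]
    for _ in range(max_steps):
        if not stack:
            return False
        form = stack.pop()
        if form == w:
            return True
        for i, symbol in enumerate(form):
            if symbol in grammar:       # expand the leftmost nonterminal
                # Push in reverse so the first rule listed is tried first.
                for rhs in reversed(grammar[symbol]):
                    stack.append(form[:i] + rhs + form[i + 1:])
                break
    return None

LEFT_RECURSIVE = {"S": ["Sa", "b"]}  # tries S -> Sa first: Sa, Saa, Saaa, ...
REORDERED = {"S": ["b", "Sa"]}       # same language, terminal rule first

print(dfs_parse("baa", LEFT_RECURSIVE))  # None
print(dfs_parse("baa", REORDERED))       # True
```

Both grammars generate the same language; only the order in which rules are tried differs, which is enough to send the first search down an infinite path.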

### Determining when a sentential form is a dead end

• The number of terminals in the sentential form is greater than the length of w.

• The prefix of the sentential form preceding the leftmost nonterminal is not a prefix of w.

• No rules are applicable to the sentential form.
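The three conditions above can be sketched as a single pruning predicate. The name `is_dead_end` and the dictionary encoding of the grammar (keys are nonterminals) are assumptions for illustration.

```python
def is_dead_end(form, w, grammar):
    """Return True if `form` can be discarded while searching for `w`."""
    # Condition 1: terminals never disappear, so too many terminals is fatal.
    if sum(1 for s in form if s not in grammar) > len(w):
        return True
    # Find the leftmost nonterminal (None if the form is a terminal string).
    pos = next((i for i, s in enumerate(form) if s in grammar), None)
    prefix = form if pos is None else form[:pos]
    # Condition 2: the terminal prefix must be a prefix of w.
    if not w.startswith(prefix):
        return True
    # Condition 3: no rule applies to the form.
    if pos is None:
        return form != w              # a terminal string other than w
    return not grammar[form[pos]]     # nonterminal with no rules

G = {"S": ["aSb", "ab"]}  # hypothetical grammar: S -> aSb | ab
print(is_dead_end("aaaSbbb", "aabb", G))  # True  (6 terminals > 4)
print(is_dead_end("aaSbb", "abab", G))    # True  ("aa" is not a prefix)
print(is_dead_end("aSb", "aabb", G))      # False (still viable)
```

A search would call this on each form before pushing or enqueuing it, discarding the form when the result is True.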

### Parsing Examples

Queue up left sentential forms level by level.

Parse successful.

### Depth-first top-down parser

Use a stack to pursue an entire path from the left.

Parse fails.

### Summary

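• In the breadth-first top-down (BFTD) version, all leftmost derivations are investigated in parallel.

• In the depth-first top-down (DFTD) version, one specific derivation is pursued to completion.

• Done, if it succeeds.

• Otherwise, backtrack and investigate another path.

(Incomplete strategy: the search may follow an infinite path and never reach an existing derivation.)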

### Practical Parsers

• The language and its grammar are designed to enable deterministic (directed and backtrack-free) searches.

• The parser uses lookahead tokens and/or exploits the context in the sentential form constructed so far.

“Look before you leap.” vs “Procrastination principle.”
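As a small illustration of a backtrack-free search driven by lookahead, here is a recursive-descent sketch for the hypothetical grammar `S -> a S b | c`. One token of lookahead is enough to choose the rule; the grammar and all function names are assumptions.

```python
def parse(tokens):
    """Predictive recursive-descent recognizer for S -> a S b | c."""
    pos = 0

    def peek():
        # The lookahead token, or None at end of input.
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(
                f"expected {token!r} at position {pos}, found {peek()!r}")
        pos += 1

    def S():
        if peek() == "a":        # lookahead 'a' selects S -> a S b
            expect("a")
            S()
            expect("b")
        elif peek() == "c":      # lookahead 'c' selects S -> c
            expect("c")
        else:
            raise SyntaxError(
                f"expected 'a' or 'c' at position {pos}, found {peek()!r}")

    S()
    if peek() is not None:
        raise SyntaxError(f"unexpected input at position {pos}")
    return True

print(parse("aacbb"))  # True
```

Because each choice is fixed by the next token, the parser never revisits a decision, and a failure can be reported at the exact position where the input stops matching.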

• Top-down parsers: LL(k) languages

• Better error diagnosis and recovery.

• Bottom-up parsers: LALR(1), LR(k) languages

• E.g., C/C++, Java, etc.

• Handles left recursion in the grammar.

• Backtracking parsers