Bottom-up parsing

- Synthesize the parse tree from fragments, working up from the input tokens
- The automaton performs two actions:
  - shift: push the next input symbol onto the stack
  - reduce: replace the right-hand side of a production on top of the stack with its left-hand-side non-terminal

- The automaton reduces (synthesizes) when the end of a production is recognized
- States of the automaton encode the synthesis so far and the non-terminals still expected
- The automaton has a potentially large set of states
- The technique is more general than LL(k)
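
The two actions can be sketched on a toy grammar. This is only an illustration: the grammar `S -> ( S ) | x` and the function name `shift_reduce` are assumptions, and the sketch reduces greedily whenever a right-hand side sits on top of the stack, instead of consulting an automaton as a real LR parser does.

```python
# Toy grammar (an assumption for illustration): S -> ( S ) | x
PRODUCTIONS = [("S", ("(", "S", ")")), ("S", ("x",))]


def shift_reduce(tokens):
    """Return the trace of shift/reduce actions, or raise SyntaxError."""
    stack, trace = [], []
    for tok in tokens:
        stack.append(tok)                      # shift: push the next symbol
        trace.append(f"shift {tok}")
        reduced = True
        while reduced:                         # reduce while a right-hand side is on top
            reduced = False
            for lhs, rhs in PRODUCTIONS:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]      # pop the right-hand side
                    stack.append(lhs)          # push the left-hand side
                    trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                    reduced = True
                    break
    if stack != ["S"]:
        raise SyntaxError(f"input does not reduce to S, stack is {stack}")
    return trace


for step in shift_reduce(list("((x))")):
    print(step)
```

Greedy reduction happens to work for this grammar. For the expression grammar used later it would reduce too early, which is why the automaton and its states are needed.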

(C) Edmond Schonberg, New-York University

- LR(k): left-to-right scan, rightmost derivation (built in reverse), with k-token lookahead.
- Most general parsing technique for deterministic grammars.
- Canonical LR(k) is not practical in general: the tables are too large (on the order of 10^6 states for C++ or Ada).
- Common subsets: SLR(1), LALR(1).

- An item is a position within a production, marking how much of the production has been recognized:
A -> α . B β
means we have seen an expansion of α and expect to see an expansion of B.

- A state is a set of items.
- Transitions between states are labeled with terminals and non-terminals.
- Parsing tables are built from the automaton:
  - action: shift or reduce, depending on the next input symbol
  - goto: change state, depending on the non-terminal just synthesized
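
One possible way to represent an item is a production plus a dot position. The class name `Item` and its methods are hypothetical names chosen for this sketch, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A production lhs -> rhs with a dot before position `dot`."""
    lhs: str
    rhs: tuple
    dot: int = 0

    def next_symbol(self):
        """Symbol right after the dot, or None if the dot is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def advance(self):
        """The same production with the dot moved one symbol to the right."""
        return Item(self.lhs, self.rhs, self.dot + 1)

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"{self.lhs} -> {' '.join(symbols)}"


item = Item("E", ("E", "+", "T"), 1)
# A state is a set of items; frozen dataclasses are hashable, so a frozenset works.
state = frozenset({Item("E'", ("E",), 1), item})
print(item, "| next symbol:", item.next_symbol())
```

Making the class frozen keeps items hashable, so states can be compared and stored in sets or used as dictionary keys when the automaton is built.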

- If a state includes:
A -> α . B β

- it also includes every item that starts a production for B:
B -> . X Y Z

- Informally: if I expect to see B next, I expect to see anything that B can start with, and so on:
X -> . G H I

- States are built by closure from individual items.

- E' -> E
- E -> E + T | T    -- left recursion is fine here
- T -> T * F | F
- F -> id | ( E )
- S0 = { E' -> . E, E -> . E + T, E -> . T,
T -> . T * F, T -> . F,
F -> . id, F -> . ( E ) }
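
The closure rule can be sketched as a small worklist algorithm and checked against S0. Items are encoded here as `(lhs, rhs, dot)` tuples; that encoding and the names `closure` and `show` are assumptions of this sketch.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("id",)), ("F", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> . gamma for every non-terminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                new_item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def show(item):
    """Render an item with the dot in place, e.g. 'E -> . E + T'."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


S0 = closure({("E'", ("E",), 0)})
for line in sorted(show(item) for item in S0):
    print(line)
```

Starting from the single item `E' -> . E`, the loop pulls in the productions for E, then T, then F, which gives the seven items of S0.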

- If a state has an item A -> α . a β
and the next symbol in the input is a, we shift a onto the stack and enter a state that contains the item

- A -> α a . β
(as well as all other items brought in by closure).

- If a state has an item A -> α . , this indicates the end of a production: reduce action.
- If a state has an item A -> α . N β, then after a reduction that finds an N, go to a state with A -> α N . β
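
Moving the dot over a symbol and then closing the result gives the successor state. A minimal sketch follows, using the expression grammar and `(lhs, rhs, dot)` tuples as items; the function names `closure` and `goto` are assumptions.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("id",)), ("F", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add B -> . gamma for every non-terminal B that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                new_item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


S0 = closure({("E'", ("E",), 0)})
after_id = goto(S0, "id")       # shift on a terminal
after_E = goto(S0, "E")         # goto on a non-terminal
after_paren = goto(S0, "(")     # shift, with more items brought in by closure
print(len(after_id), len(after_E), len(after_paren))
```

The same function serves both cases on the slide: with a terminal it describes a shift, and with a non-terminal it describes the state entered after a reduction.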

- S1 = { E' -> E . , E -> E . + T }
- S2 = { E -> T . , T -> T . * F }
- S3 = { T -> F . }
- S4 = { F -> ( . E ) } plus every item of S0 except E' -> . E (by closure)
- S5 = { F -> id . }
- S6 = { E -> E + . T, T -> . T * F, T -> . F, F -> . id, F -> . ( E ) }
- S7 = { T -> T * . F, F -> . id, F -> . ( E ) }
- S8 = { F -> ( E . ), E -> E . + T }
- S9 = { E -> E + T . , T -> T . * F }
- S10 = { T -> T * F . }
- S11 = { F -> ( E ) . }
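
The full set of states can be generated mechanically by applying goto to every state until nothing new appears. The sketch below assumes `(lhs, rhs, dot)` tuples as items and a hypothetical function name `canonical_collection`. It numbers states in discovery order, so the numbering need not match S0 to S11 above, but the same twelve item sets should come out.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("id",)), ("F", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = sorted({sym for _, rhs in GRAMMAR for sym in rhs})


def closure(items):
    """Add B -> . gamma for every non-terminal B that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                new_item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over `symbol`, then close."""
    return closure({(lhs, rhs, dot + 1) for lhs, rhs, dot in state
                    if dot < len(rhs) and rhs[dot] == symbol})


def canonical_collection():
    """Return (states, transitions) of the LR(0) automaton."""
    states = [closure({("E'", ("E",), 0)})]
    transitions = {}                     # (state index, symbol) -> state index
    i = 0
    while i < len(states):               # `states` grows as new targets are found
        for sym in SYMBOLS:
            target = goto(states[i], sym)
            if not target:
                continue
            if target not in states:
                states.append(target)
            transitions[(i, sym)] = states.index(target)
        i += 1
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "states,", len(transitions), "transitions")
```

Arcs labeled with terminals in `transitions` become shift actions and arcs labeled with non-terminals become goto entries when the tables are built.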

- An arc between two states labeled with a terminal is a shift action.
- An arc between two states labeled with a non-terminal is a goto action.
- If a state contains an item A -> α . (a reduce item), the action is to reduce by this production for all terminals in Follow(A).
- If there are shift-reduce or reduce-reduce conflicts, the grammar is not SLR(1) and more elaborate techniques are needed.
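
These rules can be turned into a table builder and a driver loop. The sketch below is for the expression grammar only. The Follow sets are written out by hand (an assumption of this sketch; a generator would compute them), and the names `build_tables`, `set_action` and `parse` are hypothetical.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("id",)), ("F", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
# Follow sets worked out by hand for this grammar; "$" marks end of input.
FOLLOW = {"E": {"+", ")", "$"},
          "T": {"+", "*", ")", "$"}, "F": {"+", "*", ")", "$"}}


def closure(items):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot] and (lhs2, rhs2, 0) not in result:
                    result.add((lhs2, rhs2, 0))
                    work.append((lhs2, rhs2, 0))
    return frozenset(result)


def goto(state, symbol):
    return closure({(lhs, rhs, dot + 1) for lhs, rhs, dot in state
                    if dot < len(rhs) and rhs[dot] == symbol})


def set_action(action, state, terminal, entry):
    """Store a table entry; two different entries in one cell are a conflict."""
    old = action.setdefault((state, terminal), entry)
    if old != entry:
        raise ValueError(f"conflict in state {state} on {terminal!r}: {old} vs {entry}")


def build_tables():
    states = [closure({("E'", ("E",), 0)})]
    action, goto_table = {}, {}
    i = 0
    while i < len(states):
        for lhs, rhs, dot in states[i]:
            if dot == len(rhs):                       # reduce item
                if lhs == "E'":
                    set_action(action, i, "$", ("accept",))
                else:
                    for terminal in FOLLOW[lhs]:
                        set_action(action, i, terminal, ("reduce", lhs, rhs))
                continue
            symbol = rhs[dot]
            target = goto(states[i], symbol)
            if target not in states:
                states.append(target)
            j = states.index(target)
            if symbol in NONTERMINALS:
                goto_table[(i, symbol)] = j           # arc labeled with a non-terminal
            else:
                set_action(action, i, symbol, ("shift", j))   # arc labeled with a terminal
        i += 1
    return action, goto_table


def parse(tokens, action, goto_table):
    """Return the productions used, in the order the reductions happen."""
    stack, output, pos = [0], [], 0
    tokens = list(tokens) + ["$"]
    while True:
        entry = action.get((stack[-1], tokens[pos]))
        if entry is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if entry[0] == "shift":
            stack.append(entry[1])
            pos += 1
        elif entry[0] == "reduce":
            _, lhs, rhs = entry
            stack = stack[:len(stack) - len(rhs)]     # pop one state per rhs symbol
            stack.append(goto_table[(stack[-1], lhs)])
            output.append(f"{lhs} -> {' '.join(rhs)}")
        else:
            return output                             # accept


ACTION, GOTO = build_tables()
for step in parse(["id", "+", "id", "*", "id"], ACTION, GOTO):
    print(step)
```

The stack here holds state numbers only, which is enough to drive the tables. `set_action` raises as soon as two different actions land in the same cell, which is how a grammar that is not SLR(1) would show up.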

- Canonical LR(1): annotate each item with its own lookahead set:
- (A -> α . β, f)
- f is a subset of the follow set of A, because it is derived from one specific context in which A is expected
- A state that includes (A -> α . , f) calls for a reduction only if the next symbol is in f: fewer reduce actions, fewer conflicts, so the technique is more powerful than SLR(1)
- Generalization: use sequences of k symbols in f
- Disadvantage: state explosion: impractical in general, even for LR(1)
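
The lookahead annotation can be seen in a closure computed over LR(1) items. This sketch uses a small grammar that is not from the slides (`S -> C C`, `C -> c C | d`, a common textbook example). It encodes each item as `(lhs, rhs, dot, lookahead)` with one lookahead terminal per tuple, and it assumes the grammar has no empty productions so that FIRST of a string is FIRST of its first symbol. The names `first` and `closure_lr1` are hypothetical.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first(symbol, seen=frozenset()):
    """FIRST of a single symbol (valid only without empty productions)."""
    if symbol not in NONTERMINALS:
        return {symbol}
    if symbol in seen:                  # guard against left recursion
        return set()
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == symbol:
            result |= first(rhs[0], seen | {symbol})
    return result


def closure_lr1(items):
    """Closure over items (lhs, rhs, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            rest = rhs[dot + 1:]
            # New lookaheads: FIRST of what follows B, or the parent's own
            # lookahead when nothing follows B.
            lookaheads = first(rest[0]) if rest else {lookahead}
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 != rhs[dot]:
                    continue
                for terminal in lookaheads:
                    new_item = (lhs2, rhs2, 0, terminal)
                    if new_item not in result:
                        result.add(new_item)
                        work.append(new_item)
    return frozenset(result)


I0 = closure_lr1({("S'", ("S",), 0, "$")})
for lhs, rhs, dot, lookahead in sorted(I0):
    print(lhs, "->", " ".join(rhs[:dot] + (".",) + rhs[dot:]), ",", lookahead)
```

In this initial state the items for C carry only the lookaheads c and d, while Follow(C) for this grammar also contains the end marker. That is the sense in which f is a subset of the follow set.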

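- LALR(1): merge the canonical LR(1) states that have the same items apart from lookahead, so lookahead sets are needed only for a small set of states
- Tables no bigger than SLR(1)
- Nearly the power of LR(1) (merging can introduce reduce-reduce conflicts that canonical LR(1) avoids), slightly worse error diagnostics
- Incorporated into yacc, bison, etc.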
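
The merging step can be sketched in isolation. The three LR(1) states below are written by hand for the textbook grammar `S -> C C`, `C -> c C | d` (not from the slides), with items encoded as `(lhs, rhs, dot, lookahead)`. The function name `merge_by_core` is hypothetical. Merging whole LR(1) states is the textbook description only: it ignores transitions, and generators typically compute the lookaheads more efficiently without building the LR(1) states first.

```python
from collections import defaultdict


def merge_by_core(lr1_states):
    """Merge LR(1) states whose items agree once lookaheads are ignored.

    Returns one dict per merged state, mapping each core item
    (lhs, rhs, dot) to the union of its lookaheads.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for state in lr1_states:
        core = frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)
        for lhs, rhs, dot, lookahead in state:
            merged[core][(lhs, rhs, dot)].add(lookahead)
    return [dict(items) for items in merged.values()]


lr1_states = [
    {("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")},   # C -> d . with lookahead c or d
    {("C", ("d",), 1, "$")},                          # C -> d . with lookahead $
    {("S", ("C", "C"), 2, "$")},                      # S -> C C . with lookahead $
]
lalr_states = merge_by_core(lr1_states)
for state in lalr_states:
    for (lhs, rhs, dot), lookaheads in state.items():
        print(lhs, "->", " ".join(rhs[:dot] + (".",) + rhs[dot:]), ",", sorted(lookaheads))
```

The first two states share the core `C -> d .` and collapse into one state whose lookahead set is the union of both. Unions like this are where a reduce-reduce conflict can appear that the separate LR(1) states did not have.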