Learn about LR(k) parsing, LR(0) and LR(1) grammars, shift vs. reduce operations, and converting the NFA of LR(0) items to a DFA for efficient parsing.
LR(k) Parsing CPSC 388 Ellen Walker Hiram College
Bottom Up Parsing • Start with tokens • Build up rule RHS (right side) • Replace RHS by LHS • Done when stack is only start symbol • (Working from leaves of tree to root)
Operations in Bottom-up Parsing • Shift: • Push the terminal from the beginning of the string to the top of the stack • Reduce • Replace the string xyz at the top of the stack by a nonterminal A (assuming A->xyz) • Accept (when stack is $S’; empty input)
Sample Parse • S’ -> S; S-> aSb | bSa | SS | e • String: abba • Stack = $ input = abba$; shift • Stack = $a input = bba$; reduce S->e • Stack = $aS input = bba$; shift • Stack = $aSb input = ba$; reduce S->aSb • Stack = $S input = ba$; shift
Sample Parse (cont) • Stack = $S input = ba$ ; shift • Stack = $Sb input = a$ ; reduce S->e • Stack = $SbS input = a$ ; shift • Stack = $SbSa input = $; reduce S->bSa • Stack = $SS input = $; reduce S->SS • Stack = $S input = $; reduce S’-> S • Stack = $S’ input = $; accept
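The sample parse can be replayed mechanically. The sketch below is only an illustration: the function name `replay` and the way actions are encoded are assumptions made here, not part of the slides. It applies the same shift and reduce steps to abba and checks that each reduction really matches the top of the stack.

```python
# Grammar from the slides: S' -> S ; S -> aSb | bSa | SS | e (e = empty string)
def replay(tokens, actions):
    """Apply shift/reduce actions in order; return the final stack and input."""
    stack = ["$"]
    remaining = list(tokens) + ["$"]
    for action in actions:
        if action == "shift":
            # Move the next input terminal onto the stack.
            stack.append(remaining.pop(0))
        else:
            # action is ("reduce", lhs, rhs); rhs must be on top of the stack.
            _, lhs, rhs = action
            if rhs:
                assert stack[-len(rhs):] == list(rhs), f"cannot reduce by {lhs}->{rhs}"
                del stack[-len(rhs):]
            stack.append(lhs)
    return stack, remaining

ACTIONS = [
    "shift",                   # $a
    ("reduce", "S", ""),       # $aS    (S -> e)
    "shift",                   # $aSb
    ("reduce", "S", "aSb"),    # $S
    "shift",                   # $Sb
    ("reduce", "S", ""),       # $SbS   (S -> e)
    "shift",                   # $SbSa
    ("reduce", "S", "bSa"),    # $SS
    ("reduce", "S", "SS"),     # $S
    ("reduce", "S'", "S"),     # $S'
]

stack, remaining = replay("abba", ACTIONS)
print(stack, remaining)
```

The replay only verifies a given action sequence; choosing the actions is the job of the LR tables.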
LR(k) Parsing • LR(0) grammars can be parsed with no lookahead (stack only) • LR(1) grammars need 1 token of lookahead • LR(k), k>1 use multi-token lookahead • Most “real” grammars are LR(1)
Shift vs. Reduce • First, build NFA of LR(0) items • Transform NFA to DFA • If no DFA state has a conflict, grammar is LR(0) - use DFA directly to parse (states indicate shift vs. reduce) • Otherwise, use SLR(1) algorithm
LR(0) Items • Rules with . between stack & input • For S->(S) | a, the LR(0) items are: S -> .(S) S-> (.S) S->(S.) S->(S). S-> .a S-> a. • S -> .(S) and S-> .a are initial items • S-> (S). and S->a. are complete items
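As a small illustration, the items of a rule can be listed by putting the marker at every position of the right side. The helper name `lr0_items` and the string encoding of items are assumptions made for this sketch.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of the rule lhs -> rhs, with '.' as the marker."""
    return [f"{lhs} -> {rhs[:i]}.{rhs[i:]}" for i in range(len(rhs) + 1)]

# Grammar from the slide: S -> (S) | a
items = lr0_items("S", "(S)") + lr0_items("S", "a")

# Initial items have the marker first; complete items have it last.
initial = [it for it in items if it.split(" -> ")[1].startswith(".")]
complete = [it for it in items if it.endswith(".")]

print(items)
print("initial:", initial)
print("complete:", complete)
```

A rule with n symbols on its right side has n+1 items, so an empty right side gives a single item that is both initial and complete.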
Building NFA • Each LR(0) item is a state • Shift transitions • Change of goal transitions
More on NFA • Initial state is “ S’ -> .S” • No final state, but acceptance happens in S’->S. state • Complete LR(0) items have no outbound transitions • We’ll worry about getting past them later • No “reduce transitions” • “shift” on non-terminal used during reduce
NFA -> DFA • Compute e-closure (closure items) • All are initial items • Use subset construction (kernel items) • Grammar + kernel items are sufficient (closure items can be inferred) • DFA is computed directly by YACC, etc.
DFA Construction Details • For each symbol (terminal or nonterminal) after the marker, create a shift transition. The items it leads to are kernel items. • Example: S’ -> .S goes to S’ -> S. on S
DFA Construction Details • If there are multiple shift transitions on the same symbol, these are combined into the same state. • (Because the NFA will be in all those states at once).
Adding Closure Items • When the marker is immediately before a non-terminal symbol, the closure items are all of the initial forms for the new symbol, e.g. • S’ -> .S (kernel item) • S -> .(S) (closure item) • S -> .Ab (closure item) • These denote the change of goal transitions (which are all epsilon-transitions)
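The kernel/closure construction can be sketched in a few lines. This is a minimal sketch, assuming the grammar S -> (S) | Ab, A -> aA | e from the SLR(1) example; the names `closure`, `goto` and `build_dfa` and the (rule, marker position) encoding of items are choices made here. State numbers depend on visiting order, so they will not necessarily match JFLAP's.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("(", "S", ")")),
    ("S", ("A", "b")),
    ("A", ("a", "A")),
    ("A", ()),               # A -> e (empty right side)
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(kernel):
    """For a marker before nonterminal X, add every initial item X -> .rhs."""
    items = set(kernel)
    work = list(kernel)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for r, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (r, 0) not in items:
                    items.add((r, 0))
                    work.append((r, 0))
    return frozenset(items)

def goto(state, symbol):
    """Move the marker over symbol in every item that allows it, then close."""
    kernel = {(r, d + 1) for r, d in state
              if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol}
    return closure(kernel) if kernel else None

def build_dfa():
    start = closure({(0, 0)})          # kernel item S' -> .S
    states, transitions, work = [start], {}, [start]
    while work:
        state = work.pop(0)
        symbols = {GRAMMAR[r][1][d] for r, d in state if d < len(GRAMMAR[r][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:   # items that shift together share a state
                states.append(target)
                work.append(target)
            transitions[(states.index(state), sym)] = states.index(target)
    return states, transitions

states, transitions = build_dfa()
print(len(states), "states,", len(transitions), "transitions")
```

Working on item sets directly skips the explicit NFA: `closure` plays the role of the epsilon-closure and `goto` the role of the subset construction step.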
DFA “Final” States • The DFA doesn’t actually accept the string, so the concept of “final” isn’t the same • In JFLAP, mark any state where a reduction can take place as final
LR(0) Parsing • At each step, push a state onto the stack, and do an action based on the current state • A->a.Xb (not a complete item): if X is a terminal, shift • A->aXb. (a complete item): reduce by A->aXb
When Not LR(0)? • Shift-reduce conflict • State contains both a complete item and a “shift” item (with leading terminal) • Reduce-reduce conflict • State contains 2 or more complete items. • Previous example is not LR(0)! (Why)?
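Both kinds of conflict can be detected by inspecting the items of one state. The sketch below assumes items are encoded as (lhs, rhs, marker position) tuples; the function name `conflicts` is hypothetical. The first test state is the start state of the grammar S -> (S) | Ab, A -> aA | e.

```python
def conflicts(state, nonterminals):
    """Return the LR(0) conflicts present in one DFA state.

    state is a set of (lhs, rhs, dot) items, where rhs is a tuple of symbols.
    """
    complete = [it for it in state if it[2] == len(it[1])]
    # "Shift" items: the marker sits immediately before a terminal.
    shifts = [it for it in state
              if it[2] < len(it[1]) and it[1][it[2]] not in nonterminals]
    found = []
    if complete and shifts:
        found.append("shift-reduce")
    if len(complete) >= 2:
        found.append("reduce-reduce")
    return found

NONTERMINALS = {"S'", "S", "A"}
start_state = {
    ("S'", ("S",), 0),
    ("S", ("(", "S", ")"), 0),
    ("S", ("A", "b"), 0),
    ("A", ("a", "A"), 0),
    ("A", (), 0),            # A -> . is complete because A -> e has an empty right side
}
two_complete = {("A", ("a",), 1), ("B", ("a",), 1)}   # made-up state for illustration

print(conflicts(start_state, NONTERMINALS))
print(conflicts(two_complete, {"A", "B"}))
```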
Simple LR(1) • If a shift is possible, do it • Else if there is a complete item for A, and the next terminal is in Follow(A), reduce by that rule: pop its right side, then compute the next state by taking the A link from the state now on top of the stack, and push A with that state • Otherwise, there is a parse error
SLR(1) Table • Rows are states, columns are symbols (terminal and nonterminal) • Table entries (3 types): • sn shift & goto state n (only for terminals) • rk reduce using rule k (rule #’s start at 0 in JFLAP) • n Goto state n (only for nonterminals, after reduction)
Transitions and Table Entries • Transition from state m to state n on terminal x • Put sn in table[m][x] • Transition from state m to state n on nonterminal X • Put n in table[m][X] • State m has a complete item for rule k, and terminal x is in Follow of the LHS of rule k • Put rk in table[m][x] • State m contains the complete item “S’->S.” • Put acc (accept) in table[m][$]
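The four rules translate almost line for line into code. The sketch below is illustrative only: `fill_table` is a hypothetical helper, the rule numbering (0: S’->S, 1: S->(S), 2: S->Ab, 3: A->aA, 4: A->e) is an assumption, and the three transitions are just a fragment of a DFA for the grammar S -> (S) | Ab, A -> aA | e.

```python
# Left side of each rule, indexed by rule number (assumed numbering).
LHS = ["S'", "S", "S", "A", "A"]
FOLLOW = {"S": {"$", ")"}, "A": {"b"}}
TERMINALS = {"(", ")", "a", "b", "$"}

def fill_table(transitions, complete_items):
    """transitions: {(m, symbol): n}; complete_items: [(m, k)] = state m completes rule k."""
    table = {}
    for (m, symbol), n in transitions.items():
        # Terminal: shift entry "sn". Nonterminal: plain goto entry "n".
        table[(m, symbol)] = f"s{n}" if symbol in TERMINALS else str(n)
    for m, k in complete_items:
        if k == 0:
            table[(m, "$")] = "acc"      # complete item S' -> S.
            continue
        for x in FOLLOW[LHS[k]]:
            # setdefault keeps an existing entry, so a shift wins over a reduce.
            table.setdefault((m, x), f"r{k}")
    return table

table = fill_table(
    transitions={(5, "b"): 6, (2, "A"): 5, (0, "S"): 1},
    complete_items=[(6, 2), (1, 0)],
)
for key in sorted(table):
    print(key, table[key])
```

Because `setdefault` never overwrites, a reduce-reduce clash would silently keep the first rule written; a real table builder would report it instead.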
SLR(1) Example • Grammar • S-> (S) | Ab A-> aA | e • Firsts • S: (,a,b A: a,e • Follows • S: $,) A: b
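The First and Follow sets above can be checked with the usual fixed-point iteration. This is a minimal sketch; the function names are chosen here, and the letter "e" stands for the empty string, so it assumes the grammar has no terminal spelled e.

```python
EPS = "e"   # marker for the empty string (assumes no terminal is spelled "e")
GRAMMAR = [
    ("S'", ["S"]),
    ("S", ["(", "S", ")"]),
    ("S", ["A", "b"]),
    ("A", ["a", "A"]),
    ("A", []),               # A -> e
]
NONTERMINALS = {"S'", "S", "A"}

def first_of(seq, first):
    """First of a symbol sequence; contains EPS if the whole sequence can vanish."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out

def compute_first():
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:                       # repeat until no set grows
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def compute_follow(first):
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")                # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of(rhs[i + 1:], first)
                new = tail - {EPS}
                if EPS in tail:          # nothing (or only e) after sym
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

first = compute_first()
follow = compute_follow(first)
print(first)
print(follow)
```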
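SLR(1) Example • Trace for input (aab); the last column names the reduction that produced the row
Stack            Input    Reduction applied
$0               (aab)$
$0(2             aab)$
$0(2a7           ab)$
$0(2a7a7         b)$
$0(2a7a7A8       b)$      A->e
$0(2a7A8         b)$      A->aA
$0(2A5           b)$      A->aA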
SLR(1) Example cont. • $0(2A5 b)$ • $0(2A5b6 )$ • $0(2S3 )$ • $0(2S3)4 $ • $0S1 $ • $0S’ $ accept!
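The trace can be reproduced with a small table-driven parser. The table below is an assumption: it was built by hand for S -> (S) | Ab, A -> aA | e using the state numbers that appear in the trace, with rules numbered 0: S’->S, 1: S->(S), 2: S->Ab, 3: A->aA, 4: A->e. Entries the trace never exercises (such as state 0 on a) are inferred, and a JFLAP table may differ in detail. Unlike the trace, the stack here holds states only.

```python
# (left side, length of right side) for each rule, indexed by rule number.
RULES = [("S'", 1), ("S", 3), ("S", 2), ("A", 2), ("A", 0)]

# Hand-built SLR(1) table (assumed, see the note above).
ACTION = {
    (0, "("): "s2", (0, "a"): "s7", (0, "b"): "r4",
    (1, "$"): "acc",
    (2, "("): "s2", (2, "a"): "s7", (2, "b"): "r4",
    (3, ")"): "s4",
    (4, ")"): "r1", (4, "$"): "r1",
    (5, "b"): "s6",
    (6, ")"): "r2", (6, "$"): "r2",
    (7, "a"): "s7", (7, "b"): "r4",
    (8, "b"): "r3",
}
GOTO = {(0, "S"): 1, (0, "A"): 5, (2, "S"): 3, (2, "A"): 5, (7, "A"): 8}

def parse(text):
    """Return True if text is accepted; each character is one token."""
    stack = [0]
    tokens = list(text) + ["$"]
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            return False                 # empty table cell: parse error
        if entry == "acc":
            return True
        n = int(entry[1:])
        if entry[0] == "s":
            stack.append(n)              # shift: push state n, consume the token
            pos += 1
        else:
            lhs, length = RULES[n]       # reduce by rule n
            if length:
                del stack[-length:]      # pop one state per right-side symbol
            stack.append(GOTO[(stack[-1], lhs)])

print(parse("(aab)"), parse("b"), parse("(aab"))
```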
Another SLR(1) Grammar to Try • S -> zMNz • M -> aMa • M -> z • N -> bNb • N -> z
Parsing Conflicts in SLR(1) • Shift-reduce conflict • Prefer shift over reduce • Reduce-reduce conflicts • Error in design of grammar (usually) • Possible to designate a grammar-specific choice
Dangling Else • Remember: if C if C S else S • Shift-preference puts else with inner if! • To put else with outer if, inner “if C S” must be reduced to S first • Good example of how language “evolved” to make it easy for the compiler!
More than SLR(1) • SLR(k) parsing • Multiple-token lookahead (for shifts) and multiple-token follow information (for reductions) • General LR(1) parsing • Include lookaheads in DFA construction • LALR(1) parsing • Simplified state diagram for general LR(1) • What YACC / Bison uses