530 likes | 723 Views
Chapter 3-3. Chang Chi-Chung 2015.07.03. Bottom-Up Parsing. L R methods ( Left-to-right , Rightmost derivation ) LR(0), SLR, Canonical LR = LR(1), LALR 最右推導 沒有右遞迴 明確性文法 Other special cases: Shift-reduce parsing Operator-precedence parsing. Bottom-Up Parsing. reduction.
E N D
Chapter 3-3 Chang Chi-Chung 2015.07.03
Bottom-Up Parsing • LR methods (Left-to-right, Rightmost derivation) • LR(0), SLR, Canonical LR = LR(1), LALR • 最右推導 • 沒有右遞迴 • 明確性文法 • Other special cases: • Shift-reduce parsing • Operator-precedence parsing
Bottom-Up Parsing reduction E→ T → T * F → T * id → F * id → id * id rightmost derivation E → E + T | T T → T * F | F F → ( E ) | id
LR(0) • An LR parser makes shift-reduce decisions by maintaining states to keep track. • An item of a grammar G is a production of G with a dot at some position of the body. • Example A → X Y Z A → .X Y Z A → X.Y Z A → X Y .Z A → X Y Z . A → X.Y Z items stack next derivations with input strings Note that production A has one item [A •]
LR(0) • Canonical LR(0) Collection • One collection of sets of LR(0) items • Provide the basis for constructing a DFA that is used to make parsing decisions. • LR(0) automation • The canonical LR(0) collection for a grammar • Augmented the grammar • If G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S • Closure function • Goto function
Shift-Reduce Parsing • Shift-Reduce Parsing is a form of bottom-up parsing • A stack holds grammar symbols • An input buffer holds the rest of the string to parsed. • Shift-Reduce parser action • shift • reduce • accept • error
Shift-Reduce Parsing • Shift • Shift the next input symbol onto the top of the stack. • Reduce • The right end of the string to be reduced must be at the top of the stack. • Locate the left end of the string within the stack and decide with what nonterminal to replace the string. • Accept • Announce successful completion of parsing • Error • Discover a syntax error and call recovery routine
Conflicts During Shift-Reduce Parsing • Conflicts Type • shift-reduce • reduce-reduce • Shift-reduce and reduce-reduce conflicts are caused by • The limitations of the LR parsing method (even when the grammar is unambiguous) • Ambiguity of the grammar
Function Closure • If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I. • Create closure(I) by the two rules: • add every item in I to closure(I) • IfA→α.Bβ is in closure(I) andB→γis a production, then add the item B→.γtoclosure(I). Applythis rule untill no more new items can be added toclosure(I). • Divide all the sets of items into two classes • Kernel items • initial item S’ → .S, and all items whose dots are not at the left end. • Nonkernel items • All items with their dots at the left end, except for S’ → .S
Example • The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id • LetI = { E’ → .E} , then closure(I) = { E’ →.E E →.E + T E →.T T →.T * F T →.F F →.( E ) F →.id}
Exercise • The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id • LetI = { E → E +.T }
Function Goto • Function Goto(I, X) • I is a set of items • X is a grammar symbol • Goto(I, X) is defined to be the closure of the set of all items [A α X‧β] such that [A α‧Xβ] is in I. • Goto function is used to define the transitions in the LR(0) automation for a grammar.
Example • The grammar G • E’ → E • E → E + T | T • T → T * F | F • F → ( E ) | id • I = { E’ → E . E → E .+ T } • Goto (I, +) = { E → E +.T T →.T * F T →.F F →.(E) F →.id }
Constructing the LR(0) Collection • The grammar is augmented with a new start symbol S’ and production S’S • Initially, set C = closure({[S’•S]})(this is the start state of the DFA) • For each set of items I C and each grammar symbol X (NT) such thatGOTO(I, X) Cand goto(I, X) , add the set of items GOTO(I, X) to C • Repeat 3 until no more sets can be added to C
Structure of the LR Parsing Table • Parsing Table consists of two parts: • A parsing-action function ACTION • A goto function GOTO • The Action function, Action[i, a], have one of four forms: • Shift j, where j is a state. • The action taken by the parser shifts input a to the stack, but use state j to represent a. • Reduce A→β. • The action of the parser reduces β on the top of the stack to head A. • Accept • The parser accepts the input and finishes parsing. • Error
Structure of the LR Parsing Table(1) • The GOTO Function, GOTO[Ii, A], defined on sets of items. • If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.
Configuration ( = LR parser state): ($s0s1s2 … sm,aiai+1 … an$) stack input ($ X1 X2 … Xm,aiai+1 … an$) LR-Parser Configurations Ifaction[sm, ai] = shift sthen push s (ai), and advance input (s0 s1 s2 … sm, aiai+1 … an $)(s0s1s2 … sms, ai+1 … an$) Ifaction[sm, ai] = reduce A and goto[sm-r, A] = s with r = || then pop r symbols, and push s ( push A )( (s0 s1 s2 … sm, ai ai+1 … an $) (s0s1 s2 … sm-rs, aiai+1 … an$) Ifaction[sm, ai] = accept then stop Ifaction[sm, ai] = error then attempt recovery
Example LR Parse Table action goto Grammar:1. E E+T2. E T3. T T*F4. T F5. F (E)6. F id state shift & goto 5 reduce byproduction #1
Example Grammar0. S E 1. E E+T2. E T3. T T*F4. T F5. F (E)6. F id
SLR Grammars • SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing • SLR eliminates some conflicts by populating the parsing table with reductions A on symbols in FOLLOW(A) Shift on + State I2:E id•+EE id• State I0:S •EE •id+EE •id goto(I0,id) goto(I3,+) S EE id+EE id FOLLOW(E)={$}thus reduce on $
SLR Parsing Table • Reductions do not fill entire rows • Otherwise the same as LR(0) 1. S E2. E id+E3. E id Shift on + FOLLOW(E)={$}thus reduce on $
Constructing SLR Parsing Tables • Augment the grammar with S’ S • Construct the set C={I0, I1, …, In} of LR(0) items • State i is constructed from Ii • If [A•a] Iiand goto(Ii, a)=Ij then set action[i, a]=shiftj • If [A•] Iithen set action[i,a]=reduce A for all a FOLLOW(A) (apply only if AS’) • If [S’S•] is in Ii then set action[i,$]=accept • If goto(Ii, A)=Ij then set goto[i, A]=jset goto table • Repeat 3-4 until no more entries added • The initial state iis the Iiholding item [S’•S]
Example SLR Grammar and LR(0) Items Augmentedgrammar:0. C’ C1. C A B2. A a3. B a I0 = closure({[C’ •C]})I1 = goto(I0,C) = closure({[C’ C•]})… State I1:C’ C• State I4:C A B• final goto(I0,C) goto(I2,B) State I0:C’ •CC •A BA •a State I2:C A•BB •a start goto(I0,A) goto(I2,a) State I5:B a• goto(I0,a) State I3:A a•
Example SLR Parsing Table State I0:C’ •CC •A BA •a State I1:C’ C• State I2:C A•BB •a State I3:A a• State I4:C A B• State I5:B a• 1 Grammar:0. C’ C1. C A B2. A a3. B a C 4 B start A 2 0 a 5 a 3
SLR and Ambiguity • Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR, maybe LR(1) • Consider for example the unambiguous grammarS L=R | RL *R | idR L FOLLOW(R) = {=, $} I0:S’ •SS •L=RS •RL •*RL •idR •L I1:S’ S• I2:S L•=RR L• I3:S R• I4:L *•RR •L L •*RL •id I5:L id• I6:S L=•RR •L L •*RL •id I7:L *R• I8:R L• I9:S L=R• action[2,=]=s6action[2,=]=r5 Has no SLRparsing table no
LR(1) Grammars • SLR too simple • LR(1) parsing uses lookahead to avoid unnecessary conflicts in parsing table • LR(1) item = LR(0) item + lookahead LR(0) item[A•] LR(1) item [A•, a]
SLR Versus LR(1) • Split the SLR states by adding LR(1) lookahead • Unambiguous grammarS L = R | RL * R | idR L I2:S L•=RR L• split S L•=R R L• action[2,=]=s6 Should not reduce, because noright-sentential form begins with R=
LR(1) Items • An LR(1) item [A•,a]contains a lookahead terminal a, meaning already on top of the stack, expect to seea • For items of the form[A•,a]the lookahead a is used to reduce A only if the next input is a • For items of the form[A•, a]with the lookahead has no effect
The Closure Operation for LR(1) Items • Start with closure(I) = I • If [A•B, a] closure(I) then for each production B in the grammar and each terminal b FIRST(a) add the item [B•, b] to I if not already in I • Repeat 2 until no new items can be added
The Goto Operation for LR(1) Items • For each item [A•X, a] I, add the set of items closure({[AX•, a]}) to goto(I,X) if not already there • Repeat step 1 until no more items can be added to goto(I,X)
The grammar G • S’ → S • S → C C • C → cC | d Example • Let I= { (S’ → •S, $) } • I0 = closure(I) = { S’ → •S, $ S → • C C, $ C → •cC, c/d C → •d, c/d } • goto(I0, S) = closure( {S’ → S •, $ } ) = {S’ → S •, $ } = I1
Exercise • The grammar G • S’ → S • S → C C • C → cC | d • Let I= { (S → C •C, $) } • I2 = closure(I) = ? • I3 = goto(I2, c) = ?
Construction of the sets of LR(1) Items • Augment the grammar with a new start symbol S’ and production S’S • Initially, set C = closure({[S’•S,$]})(this is the start state of the DFA) • For each set of items I C and each grammar symbol X (NT) such that goto(I, X) Cand goto(I, X) , add the set of items goto(I, X) to C • Repeat 3 until no more sets can be added to C
Construction of the Canonical LR(1) Parsing Tables • Augment the grammar with S’S • Construct the set C={I0,I1,…,In} of LR(1) items • State i of the parser is constructed from Ii • If [A•a, b] Iiand goto(Ii,a)=Ij then set action[i,a]=shift j • If [A•, a] Ii then set action[i,a]=reduce A (apply only if AS’) • If [S’S•, $] is in Ii then set action[i,$]=accept • If goto(Ii,A)=Ij then set goto[i,A]=j • Repeat 3 until no more entries added • The initial state i is the Ii holding item [S’•S,$]
Example • The grammar G • S’ → S • S → C C • C → cC | d
Example Grammar and LR(1) Items • Unambiguous LR(1) grammar:S L=R | RL *R | idR L • Augment with S’ S • LR(1) items (next slide)
I0: [S’ •S, $] goto(I0,S)=I1[S •L=R, $] goto(I0,L)=I2[S •R, $] goto(I0,R)=I3[L •*R, =/$] goto(I0,*)=I4[L •id, =/$] goto(I0,id)=I5[R •L, $] goto(I0,L)=I2[S’ S•, $][S L•=R, $] goto(I0,=)=I6[R L•, $][S R•, $][L *•R, =/$] goto(I4,R)=I7[R •L, =/$] goto(I4,L)=I8[L •*R, =/$] goto(I4,*)=I4[L •id, =/$] goto(I4,id)=I5 [L id•, =/$] I6: [S L=•R, $] goto(I6,R)=I9[R •L, $] goto(I6,L)=I10 [L •*R, $] goto(I6,*)=I11[L •id, $] goto(I6,id)=I12[L *R•, =/$] [R L•, =/$] [S L=R•, $] [R L•, $][L *•R, $] goto(I11,R)=I13[R •L, $] goto(I11,L)=I10[L •*R, $] goto(I11,*)=I11[L •id, $] goto(I11,id)=I12[L id•, $] [L *R•, $] I7: I1: I8: I2: I9: I10: I3: I11: I4: I12: I5: I13:
Example LR(1) Parsing Table Grammar:1. S’ S2. S L=R3. S R4. L *R5. L id6. R L
LALR(1) Grammars • LR(1) parsing tables have many states • LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size • Less powerful than LR(1) • Will not introduce shift-reduce conflicts, because shifts do not use lookaheads • May introduce reduce-reduce conflicts, but seldom do so for grammars of programming languages • SLR and LALR tables for a grammar always have the same number of states, and less than LR(1) tables. • Like C, SLR and LALR >100, LR(1) > 1000
Constructing LALR Parsing Tables • Two ways • Construction of the LALR parsing table from the sets of LR(1) items. • Union the states • Requires much space and time • Construction of the LALR parsing table from the sets of LR(0) items • Efficient • Use in practice.
Example LALR(1) LR(1)
Constructing LALR(1) Parsing Tables • Construct sets of LR(1) items • Combine LR(1) sets with sets of items that share the same first part I4: [L *•R, =][R •L, =] [L •*R, =] [L •id, =] Shorthandfor two itemsin the same set [L *•R, =/$][R •L, =/$][L •*R, =/$][L •id, =/$] I11: [L *•R, $][R •L, $][L •*R, $][L •id, $]
Example LALR(1) Grammar • Unambiguous LR(1) grammar:S L=R | RL *R | idR L • Augment with S’ S • LALR(1) items (next slide)
I0: [S’ •S, $] goto(I0,S)=I1[S •L=R, $] goto(I0,L)=I2[S •R, $] goto(I0,R)=I3[L •*R, =/$] goto(I0,*)=I4[L •id, =/$] goto(I0,id)=I5[R •L, $] goto(I0,L)=I2[S’ S•, $][S L•=R, $] goto(I0,=)=I6[R L•, $][S R•, $][L *•R, =/$] goto(I4,R)=I7[R •L, =/$] goto(I4,L)=I9[L •*R, =/$] goto(I4,*)=I4[L •id, =/$] goto(I4,id)=I5 [L id•, =/$] I6: [S L=•R, $] goto(I6,R)=I8[R •L, $] goto(I6,L)=I9 [L •*R, $] goto(I6,*)=I4[L •id, $] goto(I6,id)=I5[L *R•, =/$] [S L=R•, $][R L•, =/$] I7: I1: I8: I2: I9: Shorthandfor two items I3: I4: [R L•, =][R L•, $] I5:
Example LALR(1) Parsing Table Grammar:1. S’ S2. S L=R3. S R4. L *R5. L id6. R L
LL, SLR, LR, LALR Summary • LL parse tables computed using FIRST/FOLLOW • Nonterminals terminals productions • Computed using FIRST/FOLLOW • LR parsing tables computed using closure/goto • LR states terminals shift/reduce actions • LR states nonterminals goto state transitions • A grammar is • LL(1) if its LL(1) parse table has no conflicts • SLR if its SLR parse table has no conflicts • LR(1) if its LR(1) parse table has no conflicts • LALR(1) if its LALR(1) parse table has no conflicts
LL, SLR, LR, LALR Grammars LR(1) LL(1) LALR(1) SLR LR(0)