1 / 29

More SLR /LR(1)

More SLR /LR(1). CMSC 431 Shon Vick. Remember Computing FIRST (N). If N e First (N) includes e if N aABC First (N) includes a if N X1X2 First (N) includes First (X1) if N X1X2… and X1 e ,

catrin
Download Presentation

More SLR /LR(1)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. More SLR /LR(1) CMSC 431 Shon Vick

  2. RememberComputing FIRST (N) • If N e First (N) includes e • if N aABC First (N) includes a • if N X1X2 First (N) includes First (X1) • if N X1X2… and X1 e, • First (N) includes First (X2) • Obvious generalization to First (a) where a is X1X2...

  3. Computing Follow (N) • Follow (N) is computed from productions in which N appears on the rhs • For the sentence symbol S, Follow (S) includes $ • if A a N b, Follow (N) includes First (b) • because an expansion of N will be followed by an expansion from b • if A a N, Follow (N) includes Follow (A) • because N will be expanded in the context in which A is expanded • if A a N B , B e, Follow (N) includes Follow (A)

  4. Recall our Example • A grammar to generate all palindromes over S = { a, b } 1) S--> P 2) P --> a Pa 3) P --> b P b 4) P --> c • LR parsers work with an augmented grammar in which the start symbol never appears in the right side of a production. Here the original grammar was rules 2-4

  5. Computing the Items • S0: S--> .P , P --> .a P a, P--> .bP b, P-->.c • S1: S--> P. • S2: P --> a.Pa, P-->.aPa,P-->.bPb,P-->.c • S3:P--> b.P b, P-->.aPa,P-->.bPb,P-->.c • S4: P--> c. • S5: P--> aP.a • S6:P--> bP.b • S7: P--> aPa. • S8: P--> bP b.

  6. Finite State Machine • Draw the FSA. The major difference is that transitions can be both terminal and non-terminal symbols. • The Goto and Action Parts of the parsing table come from the FSA as detailed for SLR by Algo 4.8 page 227-228

  7. FSA c Io S-> .P P -> .aPa P -> .bPb P ->.c I1 S-> P. P a I2 P -> a.Pa P -> .aPa P -> .bPb P ->.c a I5 P-> a P.a b b P I4 P-> c. c a I3 P -> b.Pb P -> .aPa P -> .bPb P ->.c a I7 P-> a Pa. c b P 1) P -> aPa 2) P -> bPb 3) P -> c b I6 P-> bP.b I8 P-> bPb.

  8. Parsing Table

  9. Parsing Table Contd • Si means shift the input symbol and goto state I. • Rj means reduce by jth production. Note that we are not storing all the items in the state in our table. • example: abcba$ • if we go thru, parsing algorithm, we get

  10. Example Contd • StateInputAction • $0 abcba$ shift • $0a2 bcba$ shift • $0a2b3 cba$ shift • $0a2b3c4 ba$ reduce • $0a2b3P6 ba$ shift • $0a2b3P6b8 a$ reduce • $0a2P5 a$ shift • $0a2P5a7 $ reduce • $0P 1 $accept

  11. See also • Appel pages 60-62 especially Figure 3.2.1

  12. Shift/Reduce Conflicts • An LR(0) state contains a conflict if its canonical set has two items that recommend conflicting actions. • shift/reduce conflict - when one item prompts a shift action, the other prompts a reduce action. • reduce/reduce conflict - when two items prompt for reduce actions by different production. • A grammar is said be to be LR(0) grammar, if the table does not have any conflicts.

  13. SLR Summary • Uses DFA to recognize viable prefixes of grammar G • Each state in the DFA: • is the set of LR(0 items valid for a viable prefix • “encodes” information about the symbols that have been shifted onto the stack • Valid LR(0) items are computed by applying the closure and goto functions to the initial, valid item [S’ -> .S] (this is called the canonical collection of LR(0) items) • Uses FOLLOW to disambiguate actions

  14. SLR(1) Summary • If A -> aAb is in Ikand goto(Ik, a) = Ij, then set actions[k,a] to sj • If A -> a is in Ikthen set actions[k,b]to rule#, for all be FOLLOW(A) • If S’ -> S. is inIkthen set actions[k,$] to accept Rules 1-3 may define conflicting actions for an entry in the actions table. In this case, the grammar is not SLR(1).

  15. Shift/Reduce Conflict S’ -> .S S -> .A b | d c | b A c A -> .d A very simple language = {db, dc, bdc} Follow(S) = {$}, Follow(A) = {b,c} Form part f the SLR(1) parser: I0 S’ -> .S S -> .A b S -> . d c S -> . b A c A -> .d I1 S -> d .c A -> d. But since c is in Follow(A), we don’t whether to shift or reduce in I1 D1: S’ -> S ->dc D2; S’ ->S ->bAc ->bdc

  16. Reduce/Reduce Conflict S’-> S S -> b A e | b B d | A c A -> d B -> Ec E-> d S’ -> S -> Ac -> dc S’ ->S -> bBd -> bEcd -> bdcd S’ -> S -> bAe -> bde d I2 A -> d. E -> d . I0 S’ -> .S S -> . b A e S -> .b B d S -> .A c A -> .d I1 S -> b .A e S -> b .B d A -> . d B -> .E c E -> .d b Which reduction should be taken? There is not enough context to decide!

  17. LR(1) • Solution: keep more information about context. Namely keep track what next input symbol can be as part of DFA state • Idea: keep an input look-ahead as part of each item - these are called LR(1) items • Always a subset of Follow(A) for any non-terminal A (may not be a proper subset) • Can give rise to larger parsers (i.e. many states) than SLR but recognizes a greater number of constructs

  18. LR(k) Items • The table construction algorithm for an LR(k) parser uses LR(k) items to represent the set of possible states in a parse • An LR(k) item is a pair [a, b], where • ais a production from G with a “.”at some position in the rhs • bis a look-ahead string containing k (where k is typically 1) symbols that are terminals or $ • Example LR(1) item [A -> X . Y Z , a] b a

  19. LR(k) Items • What’s the point of the look-ahead symbols? • Carry them along to allow us to choose correct reduction when there is any choice • Look-ahead symbols are bookkeeping unless item has unless reducing (i.e. has a “.” at the right end) [A -> X . Y Z , a] [A -> X Y Z . , a] Use to Guide Reduction No Use The point: for [A -> a ., a] and [B -> a ., b], we can decide between reducing to A or to B by looking at limited right context

  20. * rm rm Canonical LR(1) Items • The canonical collection of sets of LR(1) items: • Sets of valid items for viable prefixes of the grammar • Sets of items derivable from [S’ -> .S, $] using Goto and Closure • We say that an LR(1) item [A -> a . b, a] is valid for a viable prefix g if there is a derivation S d A w dab w where g = d a and either a is the first symbol of w, or w is e and a is $

  21. LR(1) closure • Given a LR(1) item [A ->a .b, a], its closure contains the item and any other LR(1) items that can generate legal sub-strings to followa . • Thus, if the parser has the suffix a of a viable prefix on its stack, a sub-string of the input should reduce to Bb(or somegfor some other item [B ->. g, b] in the closure).

  22. Computing LR(1) closure • Regard Algo given in Appel on page pg 63 • How is this different than the previous algorithm for computing closure given pg 60?

  23. An Algorithm for Computing Closure function closure(I) repeat new_item <- false for each item [A -> a . B b ,a] e I, each production B -> g e G, and each terminal b e FIRST(ba), if [B -> .g,b] !e I then add [B -> .g,b] to I new_item <- true endif endfor until (new_item = false) return I

  24. LR(1) Goto • Let I be a set of LR(1) items and X be a grammar symbol • Then, Goto(I,X) is the closure of the set of all items [A -> aX. b, a]such that [A -> a .X b, a] • That is, if I is the set of valid items for some viable prefixg , then goto(I,X) is the set of valid items for the viable prefixg X

  25. LR(1) Goto function goto(I, X) J <- set of items [A -> a X.b,a] such that [A -> a .X b,a] e I J’ <- closure(J) return J’

  26. Canonical Collection of Sets of LR(1) Items • We start the construction of the canonical collection of sets of LR(1) items with the item [S’ -> .S, $], where • S’ is the start symbol of the augmented grammar G • S is the start symbol of G • $ is the right end of sentence marker

  27. Computing the Canonical Collection of Sets of LR(1) Items procedure items(G’) C <- closure({[S’ -> .S,$]}) repeat new_item <- false for each set of items I in C and each grammar symbol X such that goto(I,X) != {} and goto(I,X) ! e C add goto(I,X) to C new_item <- true endfor until (new_item = false)

  28. LR(1) Table Construction • Construct the canonical collection of sets of LR(1) items for G • State i of the parser is constructed from Ii • If [A -> a. a b, b] e Ii and Goto(ii ,a) = Ij then set ACTION[i,a] to Shift j – here a must be a terminal • If [A -> a. , a] e Ii then set ACTION[i,a] to Reduce using A -> a • If [S’ -> S. , S] e Ii then set ACTION[i,$] to Accept • If Goto(Ii,A) = Ij then set Goto[i,A] to j • All other entries in ACTION and GOTO are set to Error • The initial state of the parser is the state constructed from the set containing the item [S’ -> .S,$]

  29. References • http://www.cs.nyu.edu/courses/spring02/G22.2130-001/parsing2.ppt • http://www.cs.rpi.edu/~moorthy/Courses/compiler98/Lectures/lecturesinppt/lecture7.ppt • http://www.cs.rutgers.edu/~tdnguyen/classes/cs415/lectures/lecture11.pdf • Modern Compiler Implementation in Java, Andrew Appel, Cambridge University Press

More Related