Compiler Design 11. Table-Driven Bottom-Up Parsing: LALR More Examples for LR0, SLR, LR1, LALR

1. Compiler Design
   11. Table-Driven Bottom-Up Parsing: LALR
   More Examples for LR(0), SLR, LR(1), LALR
   Kanat Bolazar
   February 23, 2010

2. Bottom-Up Parsers
We have been looking at bottom-up parsers. They are also called shift-reduce parsers:
  shift: push the next token onto the stack and move on
  reduce: when the symbols on top of the stack form the RHS of a rule, replace them with the nonterminal on the left
They scan the input from Left to right and produce a Rightmost derivation (in reverse) from the grammar.
Not all context-free languages have LR grammars.

3. Shift-Reduce Parsing Example
Grammar: E -> 1 | E - 1 (rules 1 and 2)
Input: 1 - 1 $ ($: end of file / tape)
Steps (stack contents after each action):
  1          shift
  E          reduce by rule 1
  E -        shift
  E - 1      shift
  E          reduce by rule 2
  E $        accept
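The shift and reduce steps for this grammar can be sketched in a few lines of Python. This is a minimal hand-written recognizer for this one grammar only, not a general parser; the function name shift_reduce and the (stack, action) trace format are assumptions made for illustration.

```python
# Hand-written shift-reduce recognizer for the grammar
#   rule 1: E -> 1
#   rule 2: E -> E - 1
# The function name and the trace format are illustrative assumptions.

def shift_reduce(tokens):
    """Return a list of (stack, action) steps ending in accept or error."""
    stack, steps = [], []
    tokens = list(tokens) + ["$"]      # append the end-of-input marker
    pos = 0
    while True:
        if stack[-3:] == ["E", "-", "1"]:
            stack[-3:] = ["E"]         # RHS of rule 2 is on top of the stack
            steps.append((" ".join(stack), "reduce by rule 2"))
        elif stack == ["1"]:
            stack[:] = ["E"]           # a leading 1 is the RHS of rule 1
            steps.append((" ".join(stack), "reduce by rule 1"))
        elif tokens[pos] == "$":
            ok = stack == ["E"]        # accept only if everything reduced to E
            steps.append((" ".join(stack + ["$"]), "accept" if ok else "error"))
            return steps
        else:
            stack.append(tokens[pos])  # shift the next token
            pos += 1
            steps.append((" ".join(stack), "shift"))


for stack, action in shift_reduce(["1", "-", "1"]):
    print(f"{stack:<8} {action}")
```

The recognizer decides when to reduce by inspecting the stack directly; the table-driven parsers on the following slides replace that ad hoc inspection with state numbers.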

4. LR(0) Table: Reduce on Any Token
s1: S -> . E ,  E -> . E - 1 ,  E -> . 1
    action(s1, '1') = shift2      goto(s1, E) = s3
s2: E -> 1 .
    action(s2, any token) = reduce by rule 1 (E -> 1)
s3: S -> E . ,  E -> E . - 1
    action(s3, EOT) = accept      action(s3, '-') = shift4
s4: E -> E - . 1
    action(s4, '1') = shift5
s5: E -> E - 1 .
    action(s5, any token) = reduce by rule 2 (E -> E - 1)

5. SLR(1) Table: Reduce Depends on Token
s1: S -> . E ,  E -> . E - 1 ,  E -> . 1
    action(s1, '1') = shift2      goto(s1, E) = s3
s2: E -> 1 .
    action(s2, {-, EOT}) = reduce by rule 1 (E -> 1)
s3: S -> E . ,  E -> E . - 1
    action(s3, EOT) = accept      action(s3, '-') = shift4
s4: E -> E - . 1
    action(s4, '1') = shift5
s5: E -> E - 1 .
    action(s5, {-, EOT}) = reduce by rule 2 (E -> E - 1)
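To see how such a table drives a parser, here is a minimal Python sketch that encodes the SLR(1) table for E -> 1 | E - 1 with the state numbers s1 to s5 used on the slide. The dictionary layout, the use of "$" for EOT, and the function name parse are assumptions for illustration; reduce entries store the production's left-hand side and right-hand-side length rather than a rule number.

```python
# SLR(1) table for E -> 1 | E - 1, using the slide's state numbers s1..s5.
# A reduce entry stores (left-hand side, length of right-hand side).
# "$" stands for EOT. The layout of these dictionaries is an assumption.
ACTION = {
    (1, "1"): ("shift", 2),
    (2, "-"): ("reduce", ("E", 1)),   # E -> 1
    (2, "$"): ("reduce", ("E", 1)),
    (3, "-"): ("shift", 4),
    (3, "$"): ("accept", None),
    (4, "1"): ("shift", 5),
    (5, "-"): ("reduce", ("E", 3)),   # E -> E - 1
    (5, "$"): ("reduce", ("E", 3)),
}
GOTO = {(1, "E"): 3}


def parse(tokens):
    """Return True if tokens (given without the end marker) are accepted."""
    tokens = list(tokens) + ["$"]
    states = [1]                      # stack of states; s1 is the start state
    pos = 0
    while True:
        entry = ACTION.get((states[-1], tokens[pos]))
        if entry is None:
            return False              # empty table cell: syntax error
        kind, arg = entry
        if kind == "shift":
            states.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, rhs_len = arg
            del states[-rhs_len:]     # pop one state per RHS symbol
            states.append(GOTO[(states[-1], lhs)])
        else:
            return True               # accept


print(parse(["1", "-", "1"]))
print(parse(["1", "-"]))
```

Only the state stack is kept here, because the states already encode which symbols were pushed. Turning this into the LR(0) variant would mean filling the reduce rows for every token instead of only '-' and EOT.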

6. LR(1) Parsing
Although SLR(1) uses one lookahead symbol, it still does not use all of the information that could be obtained in a parsing state by keeping track of the path that led to an item.
Not every token in Follow(X) can follow X in every context where X appears.
In LR(1) parsing tables, we keep the lookahead in the parsing state and keep such states separate, so that they can have more detailed successor states:
    A -> B C . D E F , a/b/c
  A will eventually be reduced if the lookahead token after F is one of {a, b, c}
  If any other token is seen, some other action may be taken
  If there is no action, it is an error
This leads to a larger number of states (thousands instead of hundreds) for programming language parsers.

7. LALR(1) Parsing
Compromises between the simplicity of SLR and the power of LR(1) by merging similar LR(1) states.
Identify the core of each configurating set and merge states that differ only by lookahead.
This is not just SLR, because LALR will have fewer reduce actions. However, merging may introduce reduce/reduce conflicts that LR(1) did not have.
LALR(1) parsing tables are not usually constructed by brute force, that is, by building all LR(1) sets and then merging them.
Instead, as configurating sets are generated, each new configurating set is examined to see if it can be merged with an existing one.

8. Recap: LR(0), SLR, LR(1), LALR
LR(0): Don't look ahead when reducing according to a rule. When we reach the end of the RHS, we reduce.
SLR = SLR(1): Use the Follow set of the nonterminal on the left. If the lookahead is in that Follow set, we reduce.
LR(1): Add to each item the expected lookahead for which we will eventually reduce. Produces very large tables.
LALR = LALR(1): Use LR(1), but combine states that differ only in lookahead.
Note: LALR is not SLR:
    S -> V = V | V = V + V
  For the first V, SLR would allow the reduction to V when the next token is '+'; LR(1) and LALR would not.

19. Example 1.0
Let's start with a simple grammar:
  1 S -> B b
  2 S -> a a
  3 B -> a
What strings are allowed in this grammar?
  a b (from B b)
  a a
Consider seeing a string that starts with a: a ...
After shifting the first a, should we shift the next token, or reduce a to B according to rule 3?
What would LR(0) parsing do? Conflict: can't parse!
SLR(1)? Follow(B) = {b}. Look ahead: shift on a, reduce on b.
LR(1)? LALR? They will do the right thing if SLR does!

22. Example 1.1
Original simple grammar allows only "a a" and "a b":
  1 S -> B b
  2 S -> a a
  3 B -> a
SLR: Follow(B) = {b}. Look ahead: shift on a, reduce on b.
Can we make the grammar harder for SLR? Of course! Just add 'a' to Follow(B) somehow:
  4 S -> b B a
The grammar also allows "b a a" now. This should be irrelevant for "a a", but SLR can't decide: Follow(B) = {a, b}, so there is a conflict for "a a"!
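The effect of the extra rule on Follow(B) can be checked mechanically. The sketch below computes Follow sets under the assumption that the grammar has no empty (epsilon) right-hand sides, which holds for both versions of this example; the function name follow_sets and the (lhs, rhs) rule encoding are illustrative choices.

```python
def follow_sets(rules, start):
    """Follow sets for a grammar with no empty (epsilon) right-hand sides.

    rules is a list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    Symbols that appear as some lhs are nonterminals; the rest are terminals.
    """
    nonterminals = {lhs for lhs, _ in rules}

    # First sets: with no epsilon rules, only the first RHS symbol matters.
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            head = rhs[0]
            new = first[head] if head in nonterminals else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    # Follow sets: iterate until nothing changes (fixed point).
    follow = {n: set() for n in nonterminals}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for i, sym in enumerate(rhs):
                if sym not in nonterminals:
                    continue
                if i + 1 < len(rhs):
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in nonterminals else {nxt}
                else:
                    new = follow[lhs]   # sym ends the rule: inherit Follow(lhs)
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


EXAMPLE_1_0 = [("S", ("B", "b")), ("S", ("a", "a")), ("B", ("a",))]
EXAMPLE_1_1 = EXAMPLE_1_0 + [("S", ("b", "B", "a"))]

print(sorted(follow_sets(EXAMPLE_1_0, "S")["B"]))
print(sorted(follow_sets(EXAMPLE_1_1, "S")["B"]))
```

With 'a' in Follow(B), the SLR state reached after the first a has both a shift on a (from S -> a . a) and a reduce on a (from B -> a .), which is the conflict described above.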

24. Example 1.1
Modified grammar, reorganized:
  1 S -> B b
  2 S -> a a
  3 S -> b B a
  4 B -> a
Input: "a ... "
SLR: Follow(B) = {a, b}; shift/reduce conflict on a, reduce on b
LR(1): State 0:
  S' -> . S , $
  S -> . B b , $
  S -> . a a , $
  S -> . b B a , $
  B -> . a , b
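State 0 can be reproduced with a small LR(1) closure routine. The following is a sketch that assumes a grammar without epsilon rules, so the lookahead for a newly added item is either First of the symbol after the nonterminal or, if nothing follows it, the lookahead of the parent item. The names RULES, first_of, closure and show are illustrative.

```python
RULES = [
    ("S'", ("S",)),
    ("S", ("B", "b")),
    ("S", ("a", "a")),
    ("S", ("b", "B", "a")),
    ("B", ("a",)),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def first_of(symbol, seen=frozenset()):
    """Terminals that can start a string derived from symbol (no epsilon rules)."""
    if symbol not in NONTERMINALS:
        return {symbol}
    if symbol in seen:                 # guard against left recursion
        return set()
    result = set()
    for lhs, rhs in RULES:
        if lhs == symbol:
            result |= first_of(rhs[0], seen | {symbol})
    return result


def closure(items):
    """LR(1) closure of a set of items (lhs, rhs, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue                   # nothing to expand after the dot
        rest = rhs[dot + 1:]
        lookaheads = first_of(rest[0]) if rest else {la}
        for lhs2, rhs2 in RULES:
            if lhs2 == rhs[dot]:
                for t in lookaheads:
                    item = (lhs2, rhs2, 0, t)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result


def show(item):
    """Format an item the way the slide writes it."""
    lhs, rhs, dot, la = item
    body = " ".join(rhs[:dot] + (".",) + rhs[dot:])
    return f"{lhs} -> {body} , {la}"


state0 = closure({("S'", ("S",), 0, "$")})
for line in sorted(show(i) for i in state0):
    print(line)
```

The item for B gets lookahead b because it is introduced by S -> . B b, where b is the symbol that follows B. The 'a' that S -> b B a contributes to Follow(B) never reaches this state.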




28. LR(1) vs LALR
We went through the previous example with LALR in class. The states and transitions were mostly the same as those of LR(1).

29. Example 2
Assignment statement with variables:
  1 S -> V = V
  2 S -> V = V + V
  3 V -> id
Use LR(0), SLR, LR(1), LALR. Shown in class.
SLR doesn't know that an initial V can't be followed by '+' or $ (EOF).
LR(1) knows it; the '=' that must follow is attached to the V rule:
  s0: S -> . V = V , $
      S -> . V = V + V , $
      V -> . id , =        (due to the previous two lines)
      go to s1 on token "id"
  s1: V -> id . , =
      reduce by rule 3 only if followed by '='

30. Example 3
Another grammar (for regular expression a*ba*b):
  1 S -> X X
  2 X -> a X
  3 X -> b
Create the LR(0), SLR and LR(1) tables for table-driven parsing.
Draw the states and state transitions for one of these tables.
Compare it to the minimal a*ba*b finite state machine below.
[Figure: minimal finite state machine for a*ba*b]
In LR(0), not looking ahead before reducing adds extra states.
In LR(1), considering all types of lookahead adds many extra states.
In all forms of table-driven parsing, we keep a stack, so we can also share some states that an FSM can't (so we might have fewer states).
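As a starting point for this exercise, the LR(0) states and transitions can be generated rather than drawn by hand. The sketch below builds the canonical collection of LR(0) item sets for the augmented grammar (S' -> S added as rule 0); the function names and the (rule index, dot position) item encoding are assumptions for illustration, and the state numbering depends on discovery order.

```python
RULES = [
    ("S'", ("S",)),        # rule 0: augmented start rule
    ("S", ("X", "X")),     # rule 1
    ("X", ("a", "X")),     # rule 2
    ("X", ("b",)),         # rule 3
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def closure(items):
    """LR(0) closure of a set of items (rule index, dot position)."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = RULES[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(RULES):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over symbol in every item that allows it."""
    moved = {(r, d + 1) for r, d in state
             if d < len(RULES[r][1]) and RULES[r][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """All LR(0) states reachable from the start item, plus transitions."""
    states, transitions = [closure({(0, 0)})], {}
    for state in states:               # the list grows while we iterate
        symbols = {RULES[r][1][d] for r, d in state if d < len(RULES[r][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "states,", len(transitions), "transitions")
for (src, symbol), dst in sorted(transitions.items()):
    print(f"  state {src} --{symbol}--> state {dst}")
```

For this grammar the sketch finds 7 LR(0) states. The SLR table uses the same states and differs only in which lookahead tokens get a reduce entry.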

31. Example 4
A harder grammar:
  1 S -> a X c
  2 S -> b Y c
  3 S -> a Y d
  4 S -> b X d
  5 X -> e
  6 Y -> e
Use LR(0), SLR, LR(1), LALR.
We did not yet do this example in class. We will, later, to remember how to do table-driven parsing.
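One way to explore this grammar before working it by hand is to build the LR(1) states, merge those with equal cores (the brute-force LALR construction), and look for reduce/reduce conflicts. The sketch below does that. Its closure uses a shortcut that is valid only because, in this grammar, every nonterminal in a rule body is followed by a terminal or by nothing; all function names are illustrative.

```python
from collections import defaultdict

RULES = [
    ("S'", ("S",)),            # rule 0: augmented start rule
    ("S", ("a", "X", "c")),    # rule 1
    ("S", ("b", "Y", "c")),    # rule 2
    ("S", ("a", "Y", "d")),    # rule 3
    ("S", ("b", "X", "d")),    # rule 4
    ("X", ("e",)),             # rule 5
    ("Y", ("e",)),             # rule 6
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def closure(items):
    """LR(1) closure; an item is (rule index, dot, lookahead).

    Shortcut for this grammar only: the symbol after a nonterminal is always
    a terminal (or absent), so it is the lookahead of the new items.
    """
    result, work = set(items), list(items)
    while work:
        rule, dot, la = work.pop()
        rhs = RULES[rule][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        new_la = rhs[dot + 1] if dot + 1 < len(rhs) else la
        for i, (lhs, _) in enumerate(RULES):
            item = (i, 0, new_la)
            if lhs == rhs[dot] and item not in result:
                result.add(item)
                work.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over symbol in every item that allows it."""
    moved = {(r, d + 1, la) for r, d, la in state
             if d < len(RULES[r][1]) and RULES[r][1][d] == symbol}
    return closure(moved)


def lr1_states():
    """All LR(1) states reachable from the start item."""
    states = [closure({(0, 0, "$")})]
    for state in states:               # the list grows while we iterate
        symbols = {RULES[r][1][d] for r, d, _ in state if d < len(RULES[r][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
    return states


def merge_by_core(states):
    """LALR step: union LR(1) states whose (rule, dot) cores are equal."""
    merged = defaultdict(set)
    for state in states:
        core = frozenset((r, d) for r, d, _ in state)
        merged[core] |= state
    return [frozenset(items) for items in merged.values()]


def reduce_reduce_conflicts(states):
    """Pairs (lookahead, rules) where a state would reduce by several rules."""
    conflicts = []
    for state in states:
        by_lookahead = defaultdict(set)
        for rule, dot, la in state:
            if dot == len(RULES[rule][1]):
                by_lookahead[la].add(rule)
        conflicts += [(la, sorted(rules)) for la, rules in by_lookahead.items()
                      if len(rules) > 1]
    return conflicts


lr1 = lr1_states()
lalr = merge_by_core(lr1)
print("LR(1):", len(lr1), "states, conflicts:", reduce_reduce_conflicts(lr1))
print("LALR:", len(lalr), "states, conflicts:", sorted(reduce_reduce_conflicts(lalr)))
```

The two LR(1) states reached after "a e" and "b e" have the same core {X -> e . , Y -> e .} but opposite lookaheads. Merging them makes both rules 5 and 6 reducible on c and on d, which is the kind of reduce/reduce conflict that LALR merging can introduce.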
