**1. **Compiler Design 11. Table-Driven Bottom-Up Parsing: LALR
More Examples for LR(0), SLR, LR(1), LALR
Kanat Bolazar
February 23, 2010

**2. **Bottom-Up Parsers
We have been looking at bottom-up parsers.
They are also called shift-reduce parsers:
shift: Push the next token onto the stack and move on.
reduce: When the symbols on top of the stack match the RHS of a rule, replace them with the nonterminal on the left.
LR parsers scan the input from Left to right
and produce a Rightmost derivation from the grammar, in reverse.
Not all context-free languages have LR grammars.

**3. **Shift-Reduce Parsing Example
Grammar:
E -> 1 | E - 1 (rules 1 and 2)
Input:
1 - 1 $ ($: end of file / tape)
Steps (stack contents, then the action that produced them):
1 shift
E reduce by rule 1
E - shift
E - 1 shift
E reduce by rule 2
E $ accept
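
The trace above can be reproduced with a few lines of Python. This is only a naive sketch: it reduces greedily whenever the top of the stack matches a right-hand side (longest rule first), which happens to work for this tiny grammar but is not how a real LR parser decides. The name `shift_reduce` is made up for this illustration.

```python
# Grammar from the slide: rule 1 is E -> 1, rule 2 is E -> E - 1.
RULES = {1: ("E", ["1"]), 2: ("E", ["E", "-", "1"])}


def shift_reduce(tokens):
    """Return (stack, action) steps for a token list that ends in '$'."""
    stack, steps, pos = [], [], 0
    while True:
        # Try the longer right-hand side first so "E - 1" is not mistaken for "1".
        for number in (2, 1):
            lhs, rhs = RULES[number]
            if stack[-len(rhs):] == rhs:
                stack[-len(rhs):] = [lhs]          # replace the handle by the LHS
                steps.append((" ".join(stack), f"reduce by rule {number}"))
                break
        else:
            # No reduction applies: accept/reject at end of input, otherwise shift.
            if tokens[pos] == "$":
                verdict = "accept" if stack == ["E"] else "error"
                steps.append((" ".join(stack) + " $", verdict))
                return steps
            stack.append(tokens[pos])
            pos += 1
            steps.append((" ".join(stack), "shift"))


for stack, action in shift_reduce(["1", "-", "1", "$"]):
    print(f"{stack:8} {action}")
```

Running it prints one line per step, with the stack on the left and the action on the right.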

**4. **LR(0) Table: Reduce on Any Token
(Rule numbers in this table: rule 1 is E -> E - 1, rule 2 is E -> 1.)
s1: S -> . E , E -> . E - 1 , E -> . 1
action(s1, '1') = shift2, goto(s1, E) = s3
s2: E -> 1 .
action(s2, any token) = reduce by rule 2
s3: S -> E . , E -> E . - 1
action(s3, EOT) = accept, action(s3, '-') = shift4
s4: E -> E - . 1
action(s4, '1') = shift5
s5: E -> E - 1 .
action(s5, any token) = reduce by rule 1

**5. **SLR(1) Table: Reduce Depends on Token
(Rule numbers in this table: rule 1 is E -> E - 1, rule 2 is E -> 1.)
s1: S -> . E , E -> . E - 1 , E -> . 1
action(s1, '1') = shift2, goto(s1, E) = s3
s2: E -> 1 .
action(s2, {-, EOT}) = reduce by rule 2
s3: S -> E . , E -> E . - 1
action(s3, EOT) = accept, action(s3, '-') = shift4
s4: E -> E - . 1
action(s4, '1') = shift5
s5: E -> E - 1 .
action(s5, {-, EOT}) = reduce by rule 1
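
To see how such a table drives a parser, here is a minimal sketch that encodes the SLR(1) table above as Python dictionaries. State numbers and rule numbers follow the table (rule 1 is E -> E - 1, rule 2 is E -> 1); EOT is written `$`. The names `ACTION`, `GOTO`, and `parse` are choices made for this sketch only.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1)}

ACTION = {
    (1, "1"): ("shift", 2),
    (2, "-"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "-"): ("shift", 4), (3, "$"): ("accept", None),
    (4, "1"): ("shift", 5),
    (5, "-"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(1, "E"): 3}


def parse(tokens):
    """Return the rule numbers used, in order; raise SyntaxError on bad input.

    `tokens` must end with '$'.
    """
    states, used, pos = [1], [], 0          # the stack holds state numbers only
    while True:
        kind, arg = ACTION.get((states[-1], tokens[pos]), ("error", None))
        if kind == "shift":
            states.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del states[-length:]            # pop one state per RHS symbol
            states.append(GOTO[(states[-1], lhs)])
            used.append(arg)
        elif kind == "accept":
            return used
        else:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state s{states[-1]}")


print(parse(["1", "-", "1", "$"]))   # rules applied, in order
```

Dropping the lookahead from the reduce entries (reduce on every token in s2 and s5) would turn this into the LR(0) table; errors would then be noticed one step later, after the reduction.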

**6. **LR(1) Parsing
Although SLR(1) uses one lookahead symbol, it still does not use all of the information available in a parsing state: it ignores the path that led to each item.
Not every token in Follow(X) is possible in every context where X appears.
In LR(1) parsing tables, we keep the lookahead in the parsing state and separate those states, so that they can have more detailed successor states:
A -> B C . D E F , a/b/c
A will eventually be reduced if the lookahead token after F is one of {a, b, c}.
If any other token is seen, some other action may be taken.
If there is no action, it's an error.
This leads to many more states (thousands instead of hundreds) for programming language parsers.

**7. **LALR(1) Parsing
LALR(1) compromises between the simplicity of SLR and the power of LR(1) by merging similar LR(1) states.
Identify the core of each configurating set and merge states that differ only in lookahead.
This is not just SLR: LALR lookahead sets can be smaller than the Follow sets, so LALR will have fewer reduce actions. However, merging may introduce reduce/reduce conflicts that LR(1) did not have.
LALR(1) parsing tables are not usually constructed by brute force (building every LR(1) set and then merging).
Instead, as configurating sets are generated, each new set is examined to see if it can be merged with an existing one.
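
The merging step, and the way it can create a reduce/reduce conflict, can be sketched in Python. The two LR(1) states below were written out by hand for the grammar S -> a X c | b Y c | a Y d | b X d, X -> e, Y -> e (the grammar of Example 4 at the end of these slides); they are the states reached after reading "a e" and "b e". The helper names are made up for this sketch.

```python
from collections import defaultdict

# Each item is (core, lookahead); both cores have the dot at the end of the rule.
after_a_e = {("X -> e .", "c"), ("Y -> e .", "d")}
after_b_e = {("Y -> e .", "c"), ("X -> e .", "d")}


def core(state):
    """The items of a state with the lookaheads stripped off."""
    return frozenset(item for item, _ in state)


def merge_by_core(states):
    """Merge states with equal cores; each result maps item -> set of lookaheads."""
    merged = {}
    for state in states:
        target = merged.setdefault(core(state), defaultdict(set))
        for item, lookahead in state:
            target[item].add(lookahead)
    return list(merged.values())


def reduce_reduce_conflicts(state_map):
    """Tokens on which two different complete items both want to reduce."""
    owner, clashes = {}, set()
    for item, lookaheads in state_map.items():
        if not item.endswith("."):
            continue                      # only complete items reduce
        for token in lookaheads:
            if token in owner and owner[token] != item:
                clashes.add(token)
            owner[token] = item
    return clashes


separate = merge_by_core([after_a_e]) + merge_by_core([after_b_e])
together = merge_by_core([after_a_e, after_b_e])
print([reduce_reduce_conflicts(s) for s in separate])   # LR(1): no conflicts
print([reduce_reduce_conflicts(s) for s in together])   # LALR: conflicts on c and d
```

Kept apart, each state knows which of X and Y to reduce to on c or d. Once merged, both items carry both lookaheads, so the choice is lost.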

**8. **Recap: LR(0), SLR, LR(1), LALR
LR(0): Don't look ahead when reducing according to a rule. When we reach the end of the RHS, we reduce.
SLR = SLR(1): Use the Follow set of the nonterminal on the left. If the lookahead is in that Follow set, we reduce.
LR(1): Attach to each item the lookahead for which we will eventually reduce. Produces very large tables.
LALR = LALR(1): Use LR(1), but combine states that differ only in lookahead.
Note: LR(1) is not SLR:
S -> V = V | V = V + V , V -> id
After the first id, SLR would reduce it to V even if the next token is '+', because '+' is in Follow(V). LR(1) would not, because its only lookahead there is '='.
In this grammar LALR merges every V -> id . state and ends up with the same lookaheads as SLR; Example 1.1 is a grammar that LALR handles and SLR does not.

**19. **Example 1.0
Let's start with a simple grammar:
1 S -> B b
2 S -> a a
3 B -> a
What strings are allowed in this grammar?
a b (from B b)
a a
Consider seeing a string that starts with a:
a ...
Should we shift the next a, or reduce a to B according to rule 3?
What would LR(0) parsing do? Shift/reduce conflict: can't parse!
SLR(1)? Follow(B) = {b}. Look ahead: shift on a, reduce on b.
LR(1)? LALR? They will do the right thing whenever SLR does!

**22. **Example 1.1
The original simple grammar allows only "a a" and "a b":
1 S -> B b
2 S -> a a
3 B -> a
SLR: Follow(B) = {b}. Look ahead: shift on a, reduce on b.
Can we make the grammar harder for SLR?
Of course! Just add 'a' to Follow(B) somehow:
4 S -> b B a
The grammar now also allows "b a a".
This should be irrelevant for "a a",
but SLR can't decide: Follow(B) = {a, b}, so after the first a of "a a" there is a shift/reduce conflict on lookahead a!
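
The change in Follow(B) can be checked mechanically. The sketch below computes Follow sets with the usual fixed-point iteration, under the simplifying assumption that no rule has an empty right-hand side (true for both grammars here). The function names and the dictionary encoding of the grammars are choices made for this illustration.

```python
def first_of(symbol, grammar, seen=frozenset()):
    """Tokens that can start `symbol`. Assumes no empty right-hand sides."""
    if symbol not in grammar:              # terminal
        return {symbol}
    if symbol in seen:                     # left recursion adds nothing new
        return set()
    result = set()
    for rhs in grammar[symbol]:
        result |= first_of(rhs[0], grammar, seen | {symbol})
    return result


def follow_sets(grammar, start):
    """Iterate until no Follow set grows any more."""
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, symbol in enumerate(rhs):
                    if symbol not in grammar:
                        continue
                    # What can come right after this occurrence of `symbol`?
                    new = first_of(rhs[i + 1], grammar) if i + 1 < len(rhs) else follow[lhs]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


example_1_0 = {"S": [["B", "b"], ["a", "a"]], "B": [["a"]]}
example_1_1 = {"S": [["B", "b"], ["a", "a"], ["b", "B", "a"]], "B": [["a"]]}

for name, grammar in [("1.0", example_1_0), ("1.1", example_1_1)]:
    follow_b = follow_sets(grammar, "S")["B"]
    # After the first a, SLR shifts on a (from S -> a . a) and reduces on Follow(B).
    verdict = "shift/reduce conflict on a" if "a" in follow_b else "no conflict"
    print(f"Example {name}: Follow(B) = {sorted(follow_b)}, {verdict}")
```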

**24. **Example 1.1
Modified grammar, reorganized:
1 S -> B b
2 S -> a a
3 S -> b B a
4 B -> a
Input: "a ..."
SLR: Follow(B) = {a, b}; shift/reduce conflict on a, reduce on b
LR(1): State 0:
S' -> . S , $
S -> . B b , $
S -> . a a , $
S -> . b B a , $
B -> . a , b
(The lookahead for B -> . a is b because this item comes from S -> . B b, where b follows B.)
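
State 0 is the LR(1) closure of the start item S' -> . S , $. A minimal sketch of that closure computation follows, again assuming no rule has an empty right-hand side. Items are encoded as tuples (left side, right side, dot position, lookahead); that encoding and the function names are assumptions of this sketch, not notation from the slides.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("B", "b"), ("a", "a"), ("b", "B", "a")],
    "B": [("a",)],
}


def first_of(symbol, seen=frozenset()):
    """Tokens that can start `symbol`. Assumes no empty right-hand sides."""
    if symbol not in GRAMMAR:
        return {symbol}
    if symbol in seen:
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first_of(rhs[0], seen | {symbol})
    return result


def closure(items):
    """Add every item implied by a dot standing in front of a nonterminal."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue
        rest = rhs[dot + 1:]
        # New lookaheads: whatever can follow the nonterminal in this item.
        lookaheads = first_of(rest[0]) if rest else {lookahead}
        for production in GRAMMAR[rhs[dot]]:
            for token in lookaheads:
                new_item = (rhs[dot], production, 0, token)
                if new_item not in items:
                    items.add(new_item)
                    work.append(new_item)
    return items


def show(item):
    lhs, rhs, dot, lookahead = item
    body = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return f"{lhs} -> {' '.join(body)} , {lookahead}"


state0 = closure({("S'", ("S",), 0, "$")})
for item in sorted(state0):
    print(show(item))
```

The only item for B carries lookahead b, so in the state reached on a, the parser reduces a to B only when b follows and shifts when a follows.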

**28. **LR(1) vs LALR
We went through the previous example with LALR in class.
The states and transitions were mostly the same as those of LR(1).

**29. **Example 2
Assignment statement with variables:
1 S -> V = V
2 S -> V = V + V
3 V -> id
Use LR(0), SLR, LR(1), LALR. Shown in class.
SLR doesn't know that the initial V can't be followed by '+' or $ (EOF).
LR(1) knows it; the '=' that must follow is attached to the V rule:
s0: S -> . V = V , $
    S -> . V = V + V , $
    V -> . id , = (due to the previous two lines)
    go to s1 on token "id"
s1: V -> id . , =
    reduce by rule 3 only if followed by '='

**30. **Example 3
Another grammar (for the regular expression a*ba*b):
1 S -> X X
2 X -> a X
3 X -> b
Create the LR(0), SLR and LR(1) tables for table-driven parsing.
Draw the states and state transitions for one of these tables.
Compare it to the minimal a*ba*b Finite State Machine below.
In LR(0), not looking ahead before reducing adds extra states.
In LR(1), considering all types of lookahead adds many extra states.
In all forms of table-driven parsing, we keep a stack, so we can also share some states that an FSM can't (and so we might have fewer states).
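
The state counts for this grammar can be checked with a short script. The sketch below builds the canonical LR(1) collection and then counts distinct cores (items without lookaheads). The cores of the LR(1) states are exactly the LR(0) item sets, so the second number is also the state count for LR(0), SLR and LALR. It assumes no empty right-hand sides, and the item encoding and function names are again choices made for this sketch.

```python
GRAMMAR = {"S'": [("S",)], "S": [("X", "X")], "X": [("a", "X"), ("b",)]}


def first_of(symbol, seen=frozenset()):
    """Tokens that can start `symbol`. Assumes no empty right-hand sides."""
    if symbol not in GRAMMAR:
        return {symbol}
    if symbol in seen:
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first_of(rhs[0], seen | {symbol})
    return result


def closure(items):
    """LR(1) closure; items are (lhs, rhs, dot, lookahead) tuples."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue
        rest = rhs[dot + 1:]
        lookaheads = first_of(rest[0]) if rest else {lookahead}
        for production in GRAMMAR[rhs[dot]]:
            for token in lookaheads:
                new_item = (rhs[dot], production, 0, token)
                if new_item not in items:
                    items.add(new_item)
                    work.append(new_item)
    return frozenset(items)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1, la) for (l, r, d, la) in state if d < len(r) and r[d] == symbol}
    return closure(moved)


def canonical_collection():
    start = closure({("S'", ("S",), 0, "$")})
    states, work = {start}, [start]
    while work:
        state = work.pop()
        for symbol in {r[d] for (_, r, d, _) in state if d < len(r)}:
            successor = goto(state, symbol)
            if successor not in states:
                states.add(successor)
                work.append(successor)
    return states


def core(state):
    return frozenset((l, r, d) for (l, r, d, _) in state)


lr1_states = canonical_collection()
cores = {core(state) for state in lr1_states}
print(len(lr1_states), "LR(1) states,", len(cores), "distinct cores")
```

For this grammar the script reports 10 LR(1) states and 7 distinct cores, so lookahead splitting adds three states.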

**31. **Example 4
A harder grammar:
1 S -> a X c
2 S -> b Y c
3 S -> a Y d
4 S -> b X d
5 X -> e
6 Y -> e
Use LR(0), SLR, LR(1), LALR.
We have not done this example in class yet. We will do it later, as a refresher on table-driven parsing.