CSC 3130: Automata theory and formal languages




  1. Fall 2008. The Chinese University of Hong Kong. CSC 3130: Automata theory and formal languages. LR(k) grammars. Andrej Bogdanov, http://www.cse.cuhk.edu.hk/~andrejb/csc3130

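  2. LR(0) example from last time
     Grammar: A → aAb | ab
     DFA states (sets of valid items):
     • State 1: A → •aAb, A → •ab
     • State 2: A → a•Ab, A → a•b, A → •aAb, A → •ab
     • State 3: A → ab•
     • State 4: A → aA•b
     • State 5: A → aAb•
     Transitions: 1 -a→ 2, 2 -a→ 2, 2 -b→ 3, 2 -A→ 4, 4 -b→ 5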

  3. LR(0) parsing example revisited
     Grammar: A → aAb | ab. Input: aabb. The DFA is the one from the previous slide.

     Stack      Input   Action
     1          aabb    S
     1a2        abb     S
     1a2a2      bb      S
     1a2a2b3    b       R
     1a2A4      b       S
     1a2A4b5    ε       R
     1A         ε

     Derivation recovered: A ⇒ aAb ⇒ aabb
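
The stack trace above can be reproduced with a short table-driven sketch. The names `GOTO`, `REDUCE`, and `lr0_parse` are illustrative choices, not from the slides, and the tables hard-code the five-state DFA for A → aAb | ab rather than building it from the grammar.

```python
# Hypothetical encoding of the five-state DFA for A -> aAb | ab.
GOTO = {
    (1, "a"): 2,
    (2, "a"): 2,
    (2, "b"): 3,
    (2, "A"): 4,
    (4, "b"): 5,
}
# States whose only valid item is complete: state -> (left side, right side).
REDUCE = {3: ("A", "ab"), 5: ("A", "aAb")}


def lr0_parse(word):
    """Return the list of (stack, remaining input, action) steps, or None on error."""
    stack = [1]  # alternating states and symbols, e.g. 1 a 2 a 2 ...
    pos = 0
    steps = []
    while True:
        state = stack[-1]
        shown = "".join(str(s) for s in stack)
        rest = word[pos:]
        if state in REDUCE:
            left, right = REDUCE[state]
            steps.append((shown, rest, "R"))
            # Pop one symbol and one state per right-hand-side symbol.
            del stack[-2 * len(right):]
            stack.append(left)
            target = GOTO.get((stack[-2], left))
            if target is None:
                # No move on A from this state: accept only if the stack is
                # 1 A and the whole input has been consumed.
                steps.append(("".join(str(s) for s in stack), rest, ""))
                return steps if (stack == [1, "A"] and not rest) else None
            stack.append(target)
        else:
            if pos == len(word) or (state, word[pos]) not in GOTO:
                return None
            steps.append((shown, rest, "S"))
            stack.extend([word[pos], GOTO[(state, word[pos])]])
            pos += 1


for stack, rest, act in lr0_parse("aabb"):
    print(f"{stack:10} {rest or 'ε':6} {act}")
```

Because every state here is either a pure shift state or a pure reduce state, the parser never needs to look at the input to decide between shifting and reducing; it only reads a symbol when it shifts.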

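  4. Meaning of LR(0) items
     In an item A → α•Xβ, the dot marks the current focus in the parse tree; the symbols after the dot are the undiscovered part.
     • A → α•Xβ to A → αX•β: move past the subtree rooted at X
     • εNFA transition to X → •γ: shift focus to the subtree rooted at X (if X is a nonterminal)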

  5. Outline of LR(0) parsing algorithm
     • Algorithm can perform two actions:
       - shift (S): no complete item is valid
       - reduce (R): there is one valid item, and it is complete
     • What if:
       - some valid items are complete, some not: S/R conflict
       - more than one valid complete item: R/R conflict
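
The four cases can be written as a small classifier over a set of valid items. This is only a sketch: the item encoding `(left, right, dot)` and the name `classify` are assumptions made for illustration.

```python
def classify(items):
    """Classify a collection of valid LR(0) items given as (left, right, dot).

    An item is complete when the dot sits at the end of the right-hand side.
    """
    complete = [it for it in items if it[2] == len(it[1])]
    incomplete = [it for it in items if it[2] < len(it[1])]
    if not complete:
        return ["shift"]
    if len(complete) == 1 and not incomplete:
        return ["reduce"]
    problems = []
    if incomplete:
        problems.append("S/R conflict")
    if len(complete) > 1:
        problems.append("R/R conflict")
    return problems


# A -> a.Ab, A -> a.b, A -> .aAb, A -> .ab : no complete item
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
# A -> ab. : a single complete item
print(classify([("A", "ab", 2)]))
```

A state can show both kinds of conflict at once, which is why the function returns a list rather than a single label.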

  6. Definition of LR(0) grammar
     • A grammar is LR(0) if S/R, R/R conflicts never occur
     • LR means parsing happens left to right and produces a rightmost derivation
     • LR(0) grammars are unambiguous and have a fast parsing algorithm
     • Unfortunately, they are not “expressive” enough to describe programming languages

  7. Hierarchy of context-free grammars (nested classes, outermost first)
     • context-free grammars: parse using CYK algorithm (slow)
     • LR(∞) grammars
     • …
     • LR(1) grammars (java, perl, python, …)
     • LR(0) grammars: parse using LR(0) algorithm

  8. A grammar that is not LR(0)
     S → A (1) | Bc (2)
     A → aA (3) | a (4)
     B → a (5) | ab (6)
     input: a

  9. A grammar that is not LR(0)
     S → A (1) | Bc (2)
     A → aA (3) | a (4)
     B → a (5) | ab (6)
     input: a
     possibilities: shift (3), reduce (4), reduce (5), shift (6)
     valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
     S/R, R/R conflicts!
     [Figure: the candidate parse trees for each possibility]
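
The list of valid items after reading a can be checked mechanically with the usual closure and goto operations on LR(0) items. The sketch below assumes single-character grammar symbols and uses illustrative names (`GRAMMAR`, `closure`, `goto`) that do not come from the slides.

```python
# Grammar from the slide; each symbol is a single character.
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}


def closure(items):
    """Add C -> .delta for every item whose dot is in front of a nonterminal C."""
    items = set(items)
    work = list(items)
    while work:
        left, right, dot = work.pop()
        if dot < len(right) and right[dot] in GRAMMAR:
            for body in GRAMMAR[right[dot]]:
                new = (right[dot], body, 0)
                if new not in items:
                    items.add(new)
                    work.append(new)
    return items


def goto(items, symbol):
    """Move the dot past `symbol` wherever possible, then take the closure."""
    moved = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == symbol}
    return closure(moved)


start = closure({("S", body, 0) for body in GRAMMAR["S"]})
after_a = goto(start, "a")
for left, right, dot in sorted(after_a):
    print(f"{left} -> {right[:dot]}•{right[dot:]}")
```

The resulting set contains two complete items (A → a• and B → a•) next to incomplete ones, which is exactly the S/R and R/R situation named on the slide.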

  10. Lookahead
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      input: a
      peek inside!
      valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
      [Figure: the candidate parse trees]

  11. Lookahead
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      input: a a
      peek inside!
      valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
      action: shift
      [Figure: parse tree must look like this]

  12. Lookahead
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      input: a a a
      peek inside!
      valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
      action: shift
      [Figure: parse tree must look like this]

  13. Lookahead
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      input: a a a
      valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
      action: reduce
      [Figure: parse tree must look like this]

  14. LR(0) items vs. LR(1) items
      Grammar: A → aAb | ab
      • LR(0) item: A → a•Ab
      • LR(1) item: [A → a•Ab, b]
      [Figure: partial parse trees illustrating each item]

  15. LR(1) items
      • LR(1) items are of the form [A → α•β, x] or [A → α•β, ε], where x is the lookahead terminal and ε stands for the end of the input
      [Figure: the state of the parsing that such an item represents]

  16. Outline of LR(1) parsing algorithm
      • Step 1: Build εNFA that describes valid item updates
      • Step 2: Convert εNFA to DFA
        - As in LR(0), DFA will have shift and reduce states
      • Step 3: Run DFA on input, using stack to remember sequence of states
        - Use lookahead to eliminate wrong reduce items

  17. Recall εNFA transitions for LR(0)
      • States of εNFA will be items (plus a start state q0)
      • For every item S → •α we have a transition q0 -ε→ S → •α
      • For every item A → α•Xβ we have a transition A → α•Xβ -X→ A → αX•β
      • For every item A → α•Cβ and production C → δ we have a transition A → α•Cβ -ε→ C → •δ
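
As a rough illustration, the three rules can be turned into a function that lists every εNFA transition for a grammar. The function name, the item encoding `(left, right, dot)`, and the use of `""` as the ε label are all assumptions of this sketch; the grammar is A → aAb | ab from the earlier slides.

```python
GRAMMAR = {"A": ["aAb", "ab"]}
START = "A"


def enfa_transitions(grammar, start):
    """Return a list of (source, label, target); the label "" stands for ε."""
    trans = []
    for left, bodies in grammar.items():
        for body in bodies:
            if left == start:
                # Rule 1: q0 -ε-> S -> .alpha
                trans.append(("q0", "", (left, body, 0)))
            for dot, sym in enumerate(body):
                item = (left, body, dot)
                # Rule 2: move the dot past the symbol X on reading X.
                trans.append((item, sym, (left, body, dot + 1)))
                if sym in grammar:
                    # Rule 3: ε-move into every production of the nonterminal.
                    for delta in grammar[sym]:
                        trans.append((item, "", (sym, delta, 0)))
    return trans


for src, label, dst in enfa_transitions(GRAMMAR, START):
    print(src, label or "ε", dst)
```

Running the subset construction on these transitions would give a DFA over sets of items, which is the object the parser actually uses.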

  18. εNFA transitions for LR(1)
      • For every item [S → •α, ε] we have a transition q0 -ε→ [S → •α, ε]
      • For every item [A → α•Xβ, x] we have a transition [A → α•Xβ, x] -X→ [A → αX•β, x]
      • For every item [A → α•Cβ, x] and production C → δ, for every y in FIRST(βx), we have a transition [A → α•Cβ, x] -ε→ [C → •δ, y]
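
The third rule is the only place where lookaheads change, so it may help to see it as an item-set closure. The sketch below makes several assumptions that are not in the slides: items are tuples `(left, right, dot, lookahead)`, the empty string `END` stands for the lookahead ε, the grammar has no ε-productions, and when β is empty the lookahead x is simply passed on.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = ""  # stands for the lookahead ε (end of input)


def first_of(string):
    """FIRST of a nonempty string, assuming the grammar has no ε-productions."""
    seen, todo, result = set(), [string[0]], set()
    while todo:
        sym = todo.pop()
        if sym not in GRAMMAR:
            result.add(sym)          # a terminal starts the derivation
        elif sym not in seen:
            seen.add(sym)
            todo.extend(body[0] for body in GRAMMAR[sym])
    return result


def closure(items):
    """Follow the ε-moves [A -> α.Cβ, x] to [C -> .δ, y] for y in FIRST(βx)."""
    items = set(items)
    work = list(items)
    while work:
        left, right, dot, x = work.pop()
        if dot == len(right) or right[dot] not in GRAMMAR:
            continue
        beta = right[dot + 1:]
        lookaheads = first_of(beta) if beta else {x}
        for body in GRAMMAR[right[dot]]:
            for y in lookaheads:
                new = (right[dot], body, 0, y)
                if new not in items:
                    items.add(new)
                    work.append(new)
    return items


start = closure({("S", body, 0, END) for body in GRAMMAR["S"]})
for left, right, dot, x in sorted(start):
    print(f"[{left} -> {right[:dot]}•{right[dot:]}, {x or 'ε'}]")
```

For this grammar the A-items inherit the lookahead ε from [S → •A, ε], while the B-items get the lookahead c because c follows B in S → Bc.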

  19. FIRST sets
      • FIRST(α) is the set of terminals that occur on the left in some derivation starting from α
      • Example:
        S → A (1) | cB (2)
        A → aA (3) | a (4)
        B → a (5) | ab (6)

        FIRST(a) = {a}
        FIRST(A) = {a}
        FIRST(S) = {a, c}
        FIRST(bAc) = {b}
        FIRST(BA) = {a}
        FIRST(ε) = ∅
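
The example values can be recomputed with a small fixed-point iteration. This is a sketch under the assumption that the grammar has no ε-productions (true for the example), so FIRST of a string is determined by its first symbol; the function names are illustrative.

```python
# Example grammar from the slide; note that S -> cB here, not Bc.
GRAMMAR = {"S": ["A", "cB"], "A": ["aA", "a"], "B": ["a", "ab"]}


def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                head = body[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def first_of(string, first):
    """FIRST of a string of symbols; the empty string gives the empty set."""
    if not string:
        return set()
    head = string[0]
    return set(first[head]) if head in first else {head}


FIRST = first_sets(GRAMMAR)
for s in ["a", "A", "S", "bAc", "BA", ""]:
    print(f"FIRST({s or 'ε'}) =", sorted(first_of(s, FIRST)))
```

A grammar with ε-productions would need an extra nullable check before skipping to the next symbol of a string; that case is left out here on purpose.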

  20. Explaining the transitions
      • [A → α•Xβ, x] -X→ [A → αX•β, x]
      • [A → α•Cβ, x] -ε→ [C → •δ, y], where y ∈ FIRST(βx)
      [Figure: parse trees showing the dot moving past X, and the focus shifting into the subtree C → δ with lookahead y]

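  21. Example
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      Part of the εNFA:
      • q0 -ε→ [S → •A, ε]
      • q0 -ε→ [S → •Bc, ε]
      • [S → •A, ε] -A→ [S → A•, ε]
      • [S → •A, ε] -ε→ [A → •aA, ε]
      • [S → •A, ε] -ε→ [A → •a, ε]
      • [S → •Bc, ε] -B→ [S → B•c, ε]
      • [S → •Bc, ε] -ε→ [B → •a, c]
      • [S → •Bc, ε] -ε→ [B → •ab, c]
      • . . .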

  22. Convert NFA to DFA
      • Each DFA state is a subset of LR(1) items, e.g.
        [A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε]
      • States can contain S/R, R/R conflicts
      • But for an LR(1) grammar, lookahead can always resolve such conflicts

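  23. Example
      Grammar: S → A (1) | Bc (2), A → aA (3) | a (4), B → a (5) | ab (6)
      Input: abc. Look ahead!

      • stack ε, input abc, action S
        valid items: [S → •A, ε] [S → •Bc, ε] [A → •aA, ε] [A → •a, ε] [B → •a, c] [B → •ab, c]
      • stack a, input bc, action S
        valid items: [A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε]
      • stack ab, input c, action R
        valid items: [B → ab•, c]
      • stack B, input c, action S
        valid items: [S → B•c, ε]
      • stack Bc, input ε, action R
        valid items: [S → Bc•, ε]
      • stack S, input ε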
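
How the lookahead picks an action in a state with conflicts can be sketched as follows. The encoding is an assumption of this example: items are `(left, right, dot, lookahead)` tuples, `""` stands for ε (end of input), and `action` and `STATE` are illustrative names.

```python
def action(items, nxt):
    """Choose an action from the valid LR(1) items and the next input symbol.

    `nxt` is the next input symbol, or "" at the end of the input.
    """
    # A complete item may be reduced only if its lookahead matches.
    reduces = [(l, r) for (l, r, d, x) in items if d == len(r) and x == nxt]
    # A shift is possible if some item has the dot in front of the next symbol.
    can_shift = any(d < len(r) and r[d] == nxt for (l, r, d, x) in items)
    if can_shift and not reduces:
        return ("shift", None)
    if len(reduces) == 1 and not can_shift:
        return ("reduce", reduces[0])
    if not reduces and not can_shift:
        return ("error", None)
    return ("conflict", None)


# The state reached after reading a in the example.
STATE = [
    ("A", "aA", 1, ""), ("A", "a", 1, ""),
    ("B", "a", 1, "c"), ("B", "ab", 1, "c"),
    ("A", "aA", 0, ""), ("A", "a", 0, ""),
]

for nxt in ["a", "b", "c", ""]:
    print(repr(nxt), action(STATE, nxt))
```

With next symbol b the only option is to shift, as in the second step of the example; the reductions A → a and B → a are selected only at the end of the input and before c respectively.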

  24. LR(k) grammars
      • A context-free grammar is LR(1) if all S/R, R/R conflicts can be resolved with one lookahead
      • More generally, LR(k) grammars can resolve all conflicts with k lookahead symbols
      • Items have the form [A → α•β, x1...xk]
      • LR(1) grammars describe the syntax of most programming languages
