CSCI 3130: Formal Languages and Automata Theory Tutorial 5


  1. CSCI 3130: Formal Languages and Automata Theory, Tutorial 5. Hung Chun Ho, Office: SHB 1026, Department of Computer Science & Engineering

  2. Agenda
     • Cocke-Younger-Kasami (CYK) algorithm: parsing a CFG in normal form
     • Pushdown automata (PDA): design

  3. CYK Algorithm: bottom-up parsing for grammars in normal form

  4. Cocke-Younger-Kasami Algorithm
     • Used to parse a context-free grammar in Chomsky normal form (or simply normal form)
     • Normal form: every production is of type X → YZ, X → a, or S → ε
     • Example grammar in normal form: S → AB, A → CC | a | c, B → BC | b, C → CB | BA | c

  5. CYK Algorithm: Idea
     • This is Algorithm 2 in the lecture notes (10L8.pdf)
     • Idea: bottom-up parsing
     • Algorithm: given a string s of length N,
         For k = 1 to N:
           For every substring of length k:
             Determine which variable(s) can derive it

  6. CYK Algorithm: Example
     • CFG: S → AB, A → CC | a | c, B → BC | b, C → CB | BA | c
     • Parse abbc

  7. CYK Algorithm: Idea (1)
     • We parse the substrings of abbc in this order:
     • Length-1 substrings: a, b, b, c

  8. CYK Algorithm: Idea (1)
     • Length-2 substrings: ab, bb, bc

  9. CYK Algorithm: Idea (1)
     • Length-3 substrings: abb, bbc
     • Length-4 substring: abbc
     • Done!
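The processing order on slides 7 to 9 can be sketched in a few lines of Python. This is only an illustration; the helper name `substrings_by_length` is made up here, and indices are 1-based and inclusive to match the tutorial's convention.

```python
def substrings_by_length(s):
    """Yield (i, j, substring) with 1-based inclusive indices, shortest first."""
    n = len(s)
    for length in range(1, n + 1):
        # A substring of this length can start at index 1 .. n - length + 1.
        for i in range(1, n - length + 2):
            j = i + length - 1
            yield i, j, s[i - 1:j]


order = [sub for _, _, sub in substrings_by_length("abbc")]
print(order)
# ['a', 'b', 'b', 'c', 'ab', 'bb', 'bc', 'abb', 'bbc', 'abbc']
```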

  10. CYK Algorithm: Idea (2)
     • Parsing of longer substrings depends on parsing of shorter substrings
     • Example: abb may be decomposed as ab + b or as a + bb
     • If we know how to parse ab and b (or a and bb), then we know how to parse abb
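As a small illustration of the decomposition step, the following sketch lists every way to cut a string into two non-empty parts. The function name `splits` is hypothetical and not taken from the lecture notes.

```python
def splits(s):
    """Return every way to cut s into a non-empty prefix and a non-empty suffix."""
    return [(s[:k], s[k:]) for k in range(1, len(s))]


print(splits("abb"))
# [('a', 'bb'), ('ab', 'b')]
```

A string of length n has n - 1 such cuts, which is why the detailed algorithm loops over one split position per cell.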

  11. CYK Algorithm: Substring
     • Denote sub(i, j) := the substring with start index i and end index j
     • Example: for abbc, sub(2,4) = bbc
     • This notation is not meant to complicate things; it is just for convenience in the following discussion

  12. CYK Algorithm: Table
     • Each cell corresponds to a substring
     • Each cell stores the variables deriving that substring
     • Rows are indexed by the length of the substring, columns by its start index (over a b b c)
     • Example: the cell for length 3 and start index 2 is sub(2,4) = bbc

  13. CYK Algorithm: Simulation
     • Grammar: S → AB, A → CC | a | c, B → BC | b, C → CB | BA | c
     • Base case: length = 1
     • The possible variable(s) can be found by scanning through each production
     • Table row for length 1: a: A; b: B; b: B; c: A, C
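The base case can be sketched as follows. The dictionary encoding of the grammar and the name `base_row` are assumptions made for this example; each variable is a single upper-case letter and each terminal a single lower-case letter.

```python
# Example grammar from the slides: variable -> list of production bodies.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}


def base_row(word, grammar):
    """For each character, collect the variables with a rule X -> character."""
    return [{v for v, bodies in grammar.items() if ch in bodies} for ch in word]


print(base_row("abbc", GRAMMAR))
```

For abbc this yields the sets {A}, {B}, {B} and {A, C}, matching the bottom row of the table.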

  14. CYK Algorithm: Simulation
     • Loop: length = 2
     • For each substring of length 2, decompose it into shorter substrings and check the cells below it
     • Let's parse the substring ab first

  15. CYK Algorithm: Simulation
     • For sub(1,2) = ab, it can be decomposed: ab = a + b = sub(1,1) + sub(2,2)
     • Possible choices: AB
     • Scan rules: S → AB matches, so the cell contains S

  16. CYK Algorithm: Simulation
     • For sub(2,3) = bb, it can be decomposed: bb = b + b = sub(2,2) + sub(3,3)
     • Possible choices: BB
     • Scan rules: no suitable rules are found, so the CFG cannot parse this substring and the cell is ∅

  17. CYK Algorithm: Simulation
     • For sub(3,4) = bc, it can be decomposed: bc = b + c = sub(3,3) + sub(4,4)
     • Possible choices: BA, BC
     • Scan rules: C → BA and B → BC match, so the cell contains B, C
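Filling one cell from two smaller cells, as done on slides 15 to 17, might look like the sketch below. The grammar encoding and the function name `combine` are illustrative assumptions, with single-letter variables.

```python
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}


def combine(left, right, grammar):
    """Variables V with a rule V -> XY where X is in `left` and Y is in `right`."""
    return {
        v
        for v, bodies in grammar.items()
        for x in left
        for y in right
        if x + y in bodies
    }


print(combine({"A"}, {"B"}, GRAMMAR))       # sub(1,2) = ab
print(combine({"B"}, {"B"}, GRAMMAR))       # sub(2,3) = bb
print(combine({"B"}, {"A", "C"}, GRAMMAR))  # sub(3,4) = bc
```

The three calls reproduce the cells for ab, bb and bc: {S}, the empty set, and {B, C}.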

  18. CYK Algorithm: Simulation
     • For sub(1,3) = abb: abb = ab + b = sub(1,2) + sub(3,3)
     • Possible choices: SB
     • Scan rules: no suitable variables found yet, but there is another way to decompose the string

  19. CYK Algorithm: Simulation
     • For sub(1,3) = abb: abb = a + bb = sub(1,1) + sub(2,3)
     • Possible choices: ∅
     • The smaller substring bb cannot be parsed, so this decomposition cannot parse the string and there is no need to scan the rules

  20. CYK Algorithm: Simulation
     • For sub(1,3) = abb:
     • abb = sub(1,1) + sub(2,3) gives no valid parsing
     • abb = sub(1,2) + sub(3,3) gives no valid parsing
     • Cannot parse, so the cell is ∅

  21. CYK Algorithm: Simulation
     • For sub(2,4) = bbc:
     • bbc = sub(2,2) + sub(3,4): possible choices BB, BC
     • bbc = sub(2,3) + sub(4,4): possible choices ∅
     • Variable: B (from B → BC)

  22. CYK Algorithm: Simulation
     • Finally, for sub(1,4) = abbc:
     • Possible choices: AB, SB, SC
     • Variables: S (from S → AB)
     • This cell represents the original string, and it contains S, so abbc is in the language

  23. CYK Algorithm: Parse Tree
     • abbc is in the language!
     • How do we obtain the parse tree?
     • Trace back the derivations:
     • sub(1,4) is derived using S → AB from sub(1,1) and sub(2,4)
     • sub(1,1) is derived using A → a
     • sub(2,4) is derived using B → BC from sub(2,2) and sub(3,4)
     • …
     • So, also record the derivations used!

  24. CYK Algorithm: Parse Tree
     • Obtained from the table
     • [Figure: parse tree for abbc]
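One possible way to "record the derivations used" is to store a back-pointer next to every variable in the table and follow the pointers from the top cell. The sketch below is an assumption-laden illustration, not the tutorial's own code: the names `cyk_with_backpointers` and `build_tree` are invented, variables are single upper-case letters, and only the first derivation found for each variable is kept.

```python
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}


def cyk_with_backpointers(word, grammar):
    """back[(i, j)][V] is a terminal, or (k, Y, Z) for V -> YZ split after index k."""
    n = len(word)
    back = {}
    for i in range(1, n + 1):
        ch = word[i - 1]
        back[(i, i)] = {v: ch for v, bodies in grammar.items() if ch in bodies}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = {}
            for k in range(i, j):
                for v, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in back[(i, k)]
                                and body[1] in back[(k + 1, j)]):
                            # Keep only the first derivation found for v.
                            cell.setdefault(v, (k, body[0], body[1]))
            back[(i, j)] = cell
    return back


def build_tree(back, var, i, j):
    """Follow back-pointers to rebuild a parse tree as nested tuples."""
    entry = back[(i, j)][var]
    if isinstance(entry, str):  # leaf: var -> terminal
        return (var, entry)
    k, left, right = entry
    return (var, build_tree(back, left, i, k), build_tree(back, right, k + 1, j))


back = cyk_with_backpointers("abbc", GRAMMAR)
tree = build_tree(back, "S", 1, 4)
print(tree)
# ('S', ('A', 'a'), ('B', ('B', 'b'), ('C', ('B', 'b'), ('A', 'c'))))
```

The top of the resulting tree agrees with the trace on slide 23: S → AB over sub(1,1) and sub(2,4), then B → BC over sub(2,2) and sub(3,4).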

  25. CYK Algorithm: Conclusion
     • A bottom-up parsing algorithm
     • Dynamic programming: the solution of a subproblem (parsing of a substring) depends on that of smaller subproblems
     • Before employing the CYK algorithm, convert the grammar into normal form:
     • Remove ε-productions
     • Remove unit productions

  26. CYK Algorithm: Detailed
      D = "On input w = w1w2…wn:
        If w = ε and S → ε is a rule, accept.
        For i = 1 to n:
          For each variable A:
            Test whether A → b is a rule, where b = wi. If so, place A in table(i, i).
        For l = 2 to n:
          For i = 1 to n – l + 1:
            Let j = i + l – 1.
            For k = i to j – 1:
              For each rule A → BC:
                If table(i, k) contains B and table(k+1, j) contains C, put A in table(i, j).
        If S is in table(1, n), accept. Otherwise, reject."
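The pseudocode above translates fairly directly into Python. The following is a minimal sketch under a few assumptions that are not in the slides: the grammar is a dictionary from variable to production bodies, variables are single upper-case letters, and a rule S → ε would be written as an empty string body. The function name `cyk_accepts` is made up for the example.

```python
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}


def cyk_accepts(word, grammar, start="S"):
    """Return True if `word` is derivable from `start` (grammar in normal form)."""
    n = len(word)
    if n == 0:
        # Accept the empty string only if S -> epsilon is a rule.
        return "" in grammar.get(start, [])
    table = {}
    # Length-1 substrings: rules of the form A -> b.
    for i in range(1, n + 1):
        table[(i, i)] = {v for v, bodies in grammar.items() if word[i - 1] in bodies}
    # Longer substrings: rules of the form A -> BC, trying every split point k.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            cell = set()
            for k in range(i, j):
                for v, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in table[(i, k)]
                                and body[1] in table[(k + 1, j)]):
                            cell.add(v)
            table[(i, j)] = cell
    return start in table[(1, n)]


print(cyk_accepts("abbc", GRAMMAR))  # True
print(cyk_accepts("abb", GRAMMAR))   # False
```

The three nested loops over l, i and k give a running time cubic in the length of the input, times the number of rules.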

  27. Pushdown Automata: NFA with infinite memory/states

  28. Pushdown Automata
     • PDA ≈ NFA with a stack as memory
     • Transition:
     • NFA: depends on the input
     • PDA: depends on the input and the top of the stack
     • Each PDA transition can push a symbol onto the stack (possibly ε), pop a symbol from the stack (possibly ε), and read a terminal from the string (possibly ε)
     • Transitions are non-deterministic

  29. Pushdown Automata and NFA
     • Accept:
     • NFA: go to an accept state
     • PDA: go to an accept state

  30. PDA: Example 1
     • Given the following language: $L = \{0^i 1^j : i \le j \le 2i,\ i = 0, 1, \dots\}$, $\Sigma = \{0, 1\}$
     • Design a PDA for it

  31. PDA: Example 1: Idea
     • The input has two sections
     • First half: all ‘0’s
     • Second half: all ‘1’s
     • #‘1’ depends on #‘0’
     • #‘0’ ≤ #‘1’ ≤ #‘0’ × 2

  32. PDA: Example 1: Solution
     • Language: $L = \{0^i 1^j : i \le j \le 2i,\ i = 0, 1, \dots\}$, $\Sigma = \{0, 1\}$
     • [Diagram: PDA with states q0, q1, q2, q3. Transition labels: ε,ε/$; 0,ε/X; ε,ε/ε; 1,X/ε (twice); 1,X/X; ε,$/ε]

  33. PDA: Example 1: Explain
     • Let's try some string: w = 00111
     • See the whiteboard for the simulation

  34. PDA: Example 1: Explain
     • The first part (ε,ε/$) indicates the start of parsing

  35. PDA: Example 1: Explain
     • The next part (0,ε/X) saves information about #‘0’
     • #‘X’ in the stack = #‘0’

  36. PDA: Example 1: Explain
     • The part that reads ‘1’s accounts for #‘1’
     • #‘0’ ≤ #‘1’ ≤ #‘0’ × 2

  37. PDA: Example 1: Explain
     • One path (1,X/ε) consumes one ‘X’ and eats one ‘1’

  38. PDA: Example 1: Explain
     • The other path (1,X/X followed by 1,X/ε) consumes one ‘X’ and eats two ‘1’s

  39. PDA: Example 1: Explain
     • So each ‘X’ is consumed while eating either one ‘1’ or two ‘1’s

  40. PDA: Example 1: Explain
     • The last part (ε,$/ε) indicates the end of parsing
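To experiment with strings such as w = 00111, a small non-deterministic PDA simulator can help. The transition table below is an assumed reconstruction from the labels on the slides, since the diagram itself did not survive: in particular, the helper state `q2b` (reached after the first of two ‘1’s) is hypothetical, and the way transitions are attached to states is a guess. The search explores all configurations breadth-first.

```python
from collections import deque

# (state, input symbol or "", popped symbol or "") -> [(next state, pushed string)]
# Assumed wiring of the slide's labels; "q2b" is a hypothetical helper state.
DELTA = {
    ("q0", "", ""): [("q1", "$")],                  # mark the bottom of the stack
    ("q1", "0", ""): [("q1", "X")],                 # one X per 0
    ("q1", "", ""): [("q2", "")],                   # guess that the 0s are over
    ("q2", "1", "X"): [("q2", ""), ("q2b", "X")],   # one 1 per X, or start a pair
    ("q2b", "1", "X"): [("q2", "")],                # second 1 of the pair pops X
    ("q2", "", "$"): [("q3", "")],                  # stack empty: finish
}
ACCEPT = {"q3"}


def accepts(word):
    """Breadth-first search over configurations (state, position, stack)."""
    start = ("q0", 0, "")  # stack is a string with its top at the end
    seen = {start}
    queue = deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if state in ACCEPT and pos == len(word):
            return True
        for (src, sym, pop), targets in DELTA.items():
            if src != state:
                continue
            if sym and (pos >= len(word) or word[pos] != sym):
                continue
            if pop and not stack.endswith(pop):
                continue
            rest = stack[:len(stack) - len(pop)]
            for nxt, push in targets:
                config = (nxt, pos + len(sym), rest + push)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False


print(accepts("00111"))  # True: 2 <= 3 <= 4
print(accepts("0111"))   # False: 3 > 2 * 1
```

Because every X is popped by either one or two ‘1’s, the accepted strings under this assumed wiring are exactly those with #‘0’ ≤ #‘1’ ≤ 2 × #‘0’.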

  41. PDA: Example 2
     • Given the following language: $L = \{a^i b^j c^k d^l : i, j, k, l = 0, 1, \dots;\ i + k = j + l\}$, where the alphabet is $\Sigma = \{a, b, c, d\}$
     • Design a PDA for it

  42. PDA: Example 2: Idea
     • Sequentially read (multiple) ‘a’, ‘b’, ‘c’ and ‘d’
     • Maintain #‘a’ + #‘c’ and #‘b’ + #‘d’
     • If these numbers are equal, accept

  43. PDA: Example 2: Solution
     • Language: $L = \{a^i b^j c^k d^l : i, j, k, l = 0, 1, \dots;\ i + k = j + l\}$, where the alphabet is $\Sigma = \{a, b, c, d\}$
     • [Diagram: PDA with states q1 to q5. Transition labels: ε,ε/$; a,ε/X; b,X/ε; b,$/$Y; b,Y/YY; c,Y/ε; c,$/$X; c,X/XX; d,X/ε; d,$/$Y; d,Y/YY; ε,ε/ε (three times); ε,$/ε]

  44. PDA: Example 2: Explain
     • The diagram has a start part, one part each for reading ‘a’s, ‘b’s, ‘c’s and ‘d’s, and an end part

  45. PDA: Example 2: Explain
     • Each X in the stack = an extra a or c

  46. PDA: Example 2: Explain
     • Each Y in the stack = an extra b or d

  47. PDA: Example 2: Explain
     • X and Y ‘cancel’ each other
     • The stack contains only X's or only Y's

  48. PDA: Example 2: Explain
     • No X's and no Y's means #a + #c = #b + #d, so accept
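The cancellation idea can be checked with a short stack-based sketch. This is not a transcription of the PDA in the diagram: it checks the a*b*c*d* shape with a regular expression (a job the PDA does with its sequence of states) and then mimics only the stack discipline. The function name `accepts` is chosen for the example.

```python
import re


def accepts(word):
    """Check membership in { a^i b^j c^k d^l : i + k = j + l } using one stack."""
    # The PDA enforces this order with its states; here a regex does the same job.
    if not re.fullmatch(r"a*b*c*d*", word):
        return False
    stack = []  # holds only "X" markers or only "Y" markers at any time
    for ch in word:
        mark, opposite = ("X", "Y") if ch in "ac" else ("Y", "X")
        if stack and stack[-1] == opposite:
            stack.pop()         # cancel against the opposite marker
        else:
            stack.append(mark)  # record one extra symbol of this kind
    return not stack            # empty stack: #a + #c = #b + #d


print(accepts("aabd"))  # True: 2 + 0 = 1 + 1
print(accepts("acd"))   # False: 1 + 1 != 0 + 1
```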
