1 / 69

Bottom-up Parsing

Bottom-up Parsing. Reading Sections 4.5 and 4.7 from ASU. Predictive Parsing Summary. First and Follow sets are used to construct predictive tables For non-terminal A and input t, use a production A  a where t  First( a )

arissa
Download Presentation

Bottom-up Parsing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU CPSC4600

  2. Predictive Parsing Summary • First and Follow sets are used to construct predictive tables • For non-terminal A and input t, use a production A  a where t  First(a) • For non-terminal A and input t, if e  First(A) and t  Follow(a), then use a production A  a where e First(a) • Recursive-descent without backtracking do not need the parse table explicitly CPSC4600

  3. Bottom-Up Parsing(1) • Bottom-up parsing is more general than top-down parsing • And just as efficient • Builds on ideas in top-down parsing • Bottom-up is the preferred method in practice CPSC4600

  4. Bottom-Up Parsing(2) • Table-driven using an explicit stack (non-recursive) • Stack can be viewed as containing both terminals and nonterminals • Basic operations: • Shift: Move terminals from input stream to the stack until the right-hand side of an appropriate production rule has been identified in the stack • Reduce: Replace the sentential form appearing on the stack (considered from top that matched the right-hand side of an appropriate production rule) with the nonterminal appearing on the left-hand side of the production. CPSC4600

  5. An Introductory Example • Bottom-up parsers don’t need left-factored grammars • Hence we can revert to the “natural” grammar for our example: E  T + E | T T  num * T | num | (E) • Consider the string: num * num + num CPSC4600

  6. num * num + num T  num num * T + num T  num * T T + num T  num T + T E  T T + E E  T + E E The Idea Bottom-up parsing reduces a string to the start symbol by inverting productions: Input Productions Used CPSC4600

  7. Right-most Derivation • In a right-most derivation, the rightmost nonterminal of a sentential form is replaced at each derivation step. • Question: find the rightmost derivation of the string num* num + num CPSC4600

  8. num * num + num T  num num * T + num T  num * T T + num T  num T + T E  T T + E E  T + E E Observation • Read the productions found by bottom-up parse in reverse (i.e., from bottom to top) • This is a rightmost derivation! CPSC4600

  9. Important Facts A bottom-up parser traces a rightmost derivation in reverse CPSC4600

  10. num * num + num num * T + num T + num T + T T + E E A Bottom-up Parse E T E T T + num num num * CPSC4600

  11. num * num + num A Bottom-up Parse in Detail (1) + num num num * CPSC4600

  12. num * num + num num * T + num A Bottom-up Parse in Detail (2) T + num num num * CPSC4600

  13. num * num + num num * T + num T + num A Bottom-up Parse in Detail (3) T T + num num num * CPSC4600

  14. num * num + num num * T + num T + num T + T A Bottom-up Parse in Detail (4) T T T + num num num * CPSC4600

  15. num * num + num num * T + num T + num T + T T + E A Bottom-up Parse in Detail (5) T E T T + num num num * CPSC4600

  16. num * num + num num * T + num T + num T + T T + E E A Bottom-up Parse in Detail (6) E T E T T + num num num * CPSC4600

  17. Bottom-up Parsing A trivial bottom-up parsing algorithm Let I = input string repeat pick a non-empty substring  of I where X  is a production if no such , backtrack replace one  by X in I until I = “S” (the start symbol) or all possibilities are exhausted CPSC4600

  18. Observations • The termination of the algorithm (when/if) • Running time of the algorithm • If there are more than one choices for the sub-string to be replaced (reduce) which one to choose? CPSC4600

  19. Where Do Reductions Happen Recall A bottom-up parser traces a rightmost derivation in reverse • Let  be a rightmost sentential form • Assume the next reduction is by X  • Then  is a string of terminals Why? Because X  is a step in a right-most derivation CPSC4600

  20. Shift-Reduce Parsing Bottom-up parsing uses only two kinds of actions: • Shift • Reduce CPSC4600

  21. Shift • Shift: Move #(marking the part of the input that has been processed) one place to the right • Shifts a terminal to the left string ABC#xyz  ABCx#yz CPSC4600

  22. Reduce • Apply an inverse production at the right end of the left string • If A  xy is a production, then Cbxy#ijk  CbA#ijk CPSC4600

  23. #num * num + num shift num # * num + num shift num * # num + num shift num * num # + num reduce T  num num * T # + num reduce T  num * T T # + num shift T + # num shift T + num # reduce T  num T + T # reduce E  T T + E # reduce E  T + E E # The Example with Shift-Reduce Parsing CPSC4600

  24. #num * num + num A Shift-Reduce Parse in Detail (1) + num num num *  CPSC4600

  25. #num * num + num num # * num + num A Shift-Reduce Parse in Detail (2) + num num num *  CPSC4600

  26. #num * num + num num # * num + num num * # num + num A Shift-Reduce Parse in Detail (3) num + num num *  CPSC4600

  27. #num * num + num num # * num + num num * # num + num num * num # + num A Shift-Reduce Parse in Detail (4) + num num num *  CPSC4600

  28. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num A Shift-Reduce Parse in Detail (5) T + num num * num  CPSC4600

  29. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num A Shift-Reduce Parse in Detail (6) T T + num num num *  CPSC4600

  30. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num T + # num A Shift-Reduce Parse in Detail (7) T T num + num num *  CPSC4600

  31. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num T + # num T + num # A Shift-Reduce Parse in Detail (8) T T num + num num *  CPSC4600

  32. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num T + # num T + num # T + T # A Shift-Reduce Parse in Detail (9) T T T + num num num *  CPSC4600

  33. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num T + # num T + num # T + T # T + E # A Shift-Reduce Parse in Detail (10) T E T T num + num num *  CPSC4600

  34. #num * num + num num # * num + num num * # num + num num * num # + num num * T # + num T # + num T + # num T + num # T + T # T + E # E # A Shift-Reduce Parse in Detail (11) E T E T T + num num num *  CPSC4600

  35. The Stack • Left string can be implemented by a stack • Top of the stack is the # • Shift pushes a terminal on the stack • Reduce pops 0 or more symbols off of the stack (production rhs) and pushes a non-terminal on the stack (production lhs) CPSC4600

  36. Key Issue (will be resolved by algorithms) • How do we decide when to shift or reduce? • Consider step: num # * num + num • We could reduce by T  num giving T # * num + num • A fatal mistake: No way to reduce to the start symbol E CPSC4600

  37. Conflicts • Generic shift-reduce strategy: • If there is a handle on top of the stack, reduce • Otherwise, shift • But what if there is a choice? • If it is legal to shift or reduce, there is a shift-reduce conflict • If it is legal to reduce by two different productions, there is a reduce-reduce conflict CPSC4600

  38. E  E + E | E * E | (E) | num Conflict Example Consider the ambiguous grammar: CPSC4600

  39. #num * num + num shift . . . . . . E * E # + num reduce E  E * E E # + num shift E + # num shift E + num# reduce E  num E + E # reduce E  E + E E # One Shift-Reduce Parse Input Action CPSC4600

  40. #num * num + num shift . . . . . . E * E # + num shift E * E + # num shift E * E + num # reduce E  num E * E + E# reduce E  E + E E * E # reduce E  E * E E # Another Shift-Reduce Parse Action Input CPSC4600

  41. Observations • In the second step E * E # + numwe can either shift or reduce by E  E * E • Choice determines associativity of + and * • As noted previously, grammar can be rewritten to enforce precedence • Precedence declarations are an alternative CPSC4600

  42. Overview • LR(k) parsing • L: scan input Left to right • R: produce rightmost derivation • k tokens of lookahead • LR(0) • zero tokens of look-ahead • SLR • Simple LR: like LR(0) but uses FOLLOW sets to build more “precise” parsing tables CPSC4600

  43. Basic Terminologies • Handle • A substring that matches the right side of a production whose reductionwith that production’s left side constitutes one step of the rightmost derivation of the string from the start nonterminal of the grammar CPSC4600

  44. Model of Shift-Reduce Parsing • Stack + input = current right-sentential form. • Locate the handle during parsing: • shift zero or more terminals (tokens) onto the stack until a handle b is on top of the stack. • Replace the handle with a proper non-terminal (Handle Pruning): • reduceb to A where A b CPSC4600

  45. Model of an LR Parser CPSC4600

  46. Problem: when to shift, when to reduce? • Recall grammar: E  T + E | T T  num * T | num | (E) • how to know when to reduce and when to shift? CPSC4600

  47. Model of Shift-Reduce Parsing • Stack + input = current right-sentential form. • Locate the handle during the parsing: • shift zero or terminals onto the stack until a handle is b on top of the stack. • Replace the handle with a proper non-terminal (Handle Pruning) CPSC4600

  48. What we need to know to do LR parsing • LR(0) states • describe states in which the parser can be • Note: LR(0) states are used by both LR(0) and SLR parsers • Parsing tables • transitions between LR(0) states, • actions to take at transition: • shift, reduce, accept, error • How to construct LR(0) states • How to construct parsing tables • How to drive the parser CPSC4600

  49. An LR(0) state = a set of LR(0) items • An LR(0) item [X --> a.b] says that • the parser is looking for an X • it has an aon top of the stack • expects to find in the input a string derived from b. • Notes: • [X --> a.ab] means that if a is on the input, it can be shifted. That is: • a is a correct token to see on the input, and • shifting a would not “over-shift” (still a viable prefix). • [X -->a.] means that we could reduce X CPSC4600

  50. E --> T + . E E --> .T E --> .T + E T --> .(E) T --> .num * T T --> .num LR(0) states E --> T + E. E T + E --> T. E --> T. + E S’ --> E . ( T num E T --> (. E) E --> .T E --> .T + E T --> .(E) T --> .num * T T --> .num T --> num. * T T --> num. ( S’ --> . E E --> . T E --> .T + E T --> .(E) T --> .num * T T --> .num num num num * T --> num * T. E T --> num * .T T --> .(E) T --> .num * T T --> .num T T --> (E.) ) T --> (E). ( CPSC4600 (

More Related