Parsing

Parsing. CSCI 432 Computer Science Theory. Much of this material is adapted or stolen from Compilers - Principles, Techniques and Tools aka "The Dragon Book" by Aho, Sethi, and Ullman. Parse Trees. Given the first example of a CFG from the previous lecture: S → bA, A → aA, A → ε

kimc

Presentation Transcript


  1. Parsing
     CSCI 432 Computer Science Theory
     Much of this material is adapted or stolen from Compilers - Principles, Techniques and Tools aka "The Dragon Book" by Aho, Sethi, and Ullman

  2. Parse Trees
     Given the first example of a CFG from the previous lecture:
     S → bA
     A → aA
     A → ε
     The parse tree for the input "baa", in bracket form (each node is followed by its children in brackets), is
     S[ b A[ a A[ a A[ ε ] ] ] ]

  3. Parse Trees
     • Parse trees let us visualize which rules were used, and in which order, to match the input.
     • The leaves, read left to right, spell out the matched input.
     • The root (top) of the tree is the main (start) non-terminal.
     • Interior nodes are non-terminals.
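
The properties listed on slide 3 can be checked on the "baa" tree from slide 2. The following is a minimal sketch, not part of the original slides; the tuple representation and the helper names (`node`, `leaves`, `yield_of`, `interior_symbols`) are assumptions made for illustration.

```python
# Parse tree for "baa" under S → bA, A → aA | ε.
# A node is (symbol, children); a leaf has an empty child list.
EPSILON = "ε"

def node(symbol, *children):
    return (symbol, list(children))

tree = node("S",
            node("b"),
            node("A",
                 node("a"),
                 node("A",
                      node("a"),
                      node("A", node(EPSILON)))))

def leaves(t):
    """Return the leaf symbols from left to right."""
    symbol, children = t
    if not children:
        return [symbol]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

def yield_of(t):
    """Concatenate the leaves, dropping ε, to recover the matched input."""
    return "".join(s for s in leaves(t) if s != EPSILON)

def interior_symbols(t):
    """Return the symbols of all nodes that have children, in preorder."""
    symbol, children = t
    if not children:
        return []
    result = [symbol]
    for child in children:
        result.extend(interior_symbols(child))
    return result

print(yield_of(tree))          # the leaves spell the input
print(interior_symbols(tree))  # interior nodes are all non-terminals
```

Here the leaves spell "baa", the root is S, and every interior node is S or A.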

  4. Example Parse Tree
     stmt → var = E ;
     E → E OP E
     E → ( E )
     E → var | num
     OP → + | - | * | /
     Is this a valid statement? count = 0;
     Parse tree (bracket form): stmt[ var[ count ] = E[ num[ 0 ] ] ; ]

  5. Example Parse Tree
     stmt → var = E ;
     E → E OP E
     E → ( E )
     E → var | num
     OP → + | - | * | /
     Is this a valid statement? count = total / 3 + 1 ;
     Parse tree (bracket form): stmt[ var[ count ] = E[ E[ E[ var[ total ] ] OP[ / ] E[ num[ 3 ] ] ] OP[ + ] E[ num[ 1 ] ] ] ; ]

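  6. Ambiguous Grammar
     stmt → var = E ;
     E → E OP E
     E → ( E )
     E → var | num
     OP → + | - | * | /
     This statement, count = total / 3 + 1 ; can be interpreted two ways.
     First parse tree, grouping (total / 3) + 1: stmt[ var[ count ] = E[ E[ E[ var[ total ] ] OP[ / ] E[ num[ 3 ] ] ] OP[ + ] E[ num[ 1 ] ] ] ; ]
     Second parse tree, grouping total / (3 + 1): stmt[ var[ count ] = E[ E[ var[ total ] ] OP[ / ] E[ E[ num[ 3 ] ] OP[ + ] E[ num[ 1 ] ] ] ] ; ]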
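
To see why the ambiguity matters, the two parse trees can be evaluated. This is a small illustrative sketch, not from the slides; the nested-tuple encoding, the `evaluate` helper, and the value 12 for `total` are assumptions chosen for the example.

```python
# Two parse trees for: count = total / 3 + 1 ;
# An expression tree is either a leaf (a variable name or a number)
# or a tuple (left, operator, right) standing for E → E OP E.
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree, env):
    """Evaluate an expression tree using variable values from env."""
    if isinstance(tree, tuple):
        left, op, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree

tree_a = (("total", "/", 3), "+", 1)   # (total / 3) + 1
tree_b = ("total", "/", (3, "+", 1))   # total / (3 + 1)

env = {"total": 12}                    # hypothetical value
result_a = evaluate(tree_a, env)
result_b = evaluate(tree_b, env)
print(result_a, result_b)              # 5.0 3.0
```

The same token sequence yields two different values, so a compiler using this grammar would need some additional rule to pick one tree.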

  7. Practice Parse Tree
     CFG for binary strings with an even number of 0s (and any number of 1s):
     S → 1S
     S → 0A0S
     S → ε
     A → 1A
     A → ε
     Draw the parse tree for the string 110101
     Parse tree (bracket form): S[ 1 S[ 1 S[ 0 A[ 1 A[ ε ] ] 0 S[ 1 S[ ε ] ] ] ] ]

  8. Top Down Parsing (recursive descent)
     Start at the root and do a preorder search:
     • if the node is a nonterminal, replace it with a production
     • if the node is a terminal that matches the next input, then move on
     • if the node is a terminal that does not match the next input, then back up and try a different production
     • if no productions match, then the input is invalid

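  9. Top Down Parsing Example
     CFG:
     S → cAd
     A → ab | a
     Input: cad
     Parse tree on the first attempt, using A → ab (fails because b does not match d): S[ c A[ a b ] d ]
     Parse tree after backing up and using A → a: S[ c A[ a ] d ]
     stolen from page 182 of Compilers.., by Aho, Sethi, and Ullman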
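
The backtracking search from slide 8 can be sketched for the grammar on slide 9. This is one possible formulation under stated assumptions, not the textbook's code: the `GRAMMAR` dictionary layout and the function names are made up for illustration, and Python generators stand in for "back up and try a different production".

```python
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],   # A → ab is tried before A → a
}

def parse(symbol, text, pos):
    """Yield (tree, next_pos) for each way symbol can match text at pos."""
    if symbol not in GRAMMAR:                 # terminal
        if text.startswith(symbol, pos):
            yield symbol, pos + len(symbol)
        return
    for production in GRAMMAR[symbol]:        # nonterminal: try each production
        yield from parse_sequence(symbol, production, 0, text, pos, [])

def parse_sequence(head, production, index, text, pos, children):
    """Match production[index:] in order, collecting child trees."""
    if index == len(production):
        yield (head, children), pos
        return
    for child, next_pos in parse(production[index], text, pos):
        yield from parse_sequence(head, production, index + 1,
                                  text, next_pos, children + [child])

def recognize(text):
    """Return a parse tree if the whole input matches S, else None."""
    for tree, end in parse("S", text, 0):
        if end == len(text):
            return tree
    return None

print(recognize("cad"))
```

For the input cad, the attempt with A → ab produces nothing because b does not match d, so the search falls through to A → a. Note that this sketch would recurse forever on a left-recursive grammar.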

  10. Practice
      CFG for binary strings with an even number of 0s (and any number of 1s):
      S → 1S
      S → 0A0S
      S → ε
      A → 1A
      A → ε
      Draw the parse tree for the string 110101
      Final parse tree (bracket form): S[ 1 S[ 1 S[ 0 A[ 1 A[ ε ] ] 0 S[ 1 S[ ε ] ] ] ] ]

  11. Top Down Parsing (predictive)
      Start at the root and do a preorder search:
      • if the node is a nonterminal, use the next input symbol to determine which production to use
      • if the node is a terminal that matches the next input, then move on
      • if the node is a terminal that does not match the next input, then back up and try a different production
      • if no productions match, then the input is invalid

  12. Predictive Example
      CFG:
      stmt → if expr then stmt else stmt
           | while expr do stmt
           | var = operation ;
      Input: while E1 do if E2 then cnt = cnt + 1;
      adapted from page 183 of Compilers.., by Aho, Sethi, and Ullman
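
A predictive parser for this grammar can pick the production by looking only at the next token. The sketch below is an illustration under simplifying assumptions that are not in the slides: the input is already split into tokens, an expr is a single opaque token such as E1, and an operation is everything up to the next ";". The function names are hypothetical.

```python
def expect(tokens, pos, terminal):
    """Consume one expected terminal and return the new position."""
    if pos < len(tokens) and tokens[pos] == terminal:
        return pos + 1
    found = tokens[pos] if pos < len(tokens) else "end of input"
    raise SyntaxError(f"expected {terminal!r}, found {found!r}")

def parse_expr(tokens, pos):
    # Assumption: an expr is a single opaque token such as "E1".
    if pos >= len(tokens):
        raise SyntaxError("expected expr, found end of input")
    return ("expr", tokens[pos]), pos + 1

def parse_stmt(tokens, pos=0):
    """Parse one stmt at tokens[pos]; return (tree, next_pos)."""
    lookahead = tokens[pos] if pos < len(tokens) else None
    if lookahead == "if":          # stmt → if expr then stmt else stmt
        cond, pos = parse_expr(tokens, pos + 1)
        pos = expect(tokens, pos, "then")
        then_branch, pos = parse_stmt(tokens, pos)
        pos = expect(tokens, pos, "else")
        else_branch, pos = parse_stmt(tokens, pos)
        return ("if", cond, then_branch, else_branch), pos
    if lookahead == "while":       # stmt → while expr do stmt
        cond, pos = parse_expr(tokens, pos + 1)
        pos = expect(tokens, pos, "do")
        body, pos = parse_stmt(tokens, pos)
        return ("while", cond, body), pos
    if lookahead is None:
        raise SyntaxError("expected stmt, found end of input")
    # Otherwise: stmt → var = operation ;
    var = lookahead
    pos = expect(tokens, pos + 1, "=")
    start = pos
    while pos < len(tokens) and tokens[pos] != ";":
        pos += 1
    operation = tokens[start:pos]
    pos = expect(tokens, pos, ";")
    return ("assign", var, operation), pos

tree, end = parse_stmt("while E1 do cnt = cnt + 1 ;".split())
print(tree)
```

With the three productions exactly as listed, an if without an else branch (as in the slide's input) is rejected by this sketch at the point where else is expected; handling the shorter if-then form is the subject of the next slide.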

  13. Grammars for Predictive Parsing
      CFG:
      stmt → if expr then stmt else stmt
           | if expr then stmt
      If the next input symbol is "if", then we cannot tell which production to use, because both begin with the same symbols. This might lead to expensive backtracking.
      Solution: "left factor" the grammar
      stmt → if expr then stmt S'
      S' → else stmt | ε
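
Left factoring can be sketched as a small function. This is a simplified illustration, not a general algorithm: it only factors a prefix shared by all alternatives of one non-terminal, and it names the new non-terminal by appending a prime to the head (so stmt' here, where the slide writes S'). The function names are assumptions.

```python
EPSILON = "ε"

def common_prefix(a, b):
    """Return the longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor(head, alternatives):
    """Factor out the prefix shared by ALL alternatives of head."""
    prefix = alternatives[0]
    for alt in alternatives[1:]:
        prefix = common_prefix(prefix, alt)
    if not prefix or len(alternatives) < 2:
        return {head: alternatives}          # nothing to factor
    new_head = head + "'"
    # What remains of each alternative after the prefix; empty means ε.
    tails = [alt[len(prefix):] or [EPSILON] for alt in alternatives]
    return {head: [prefix + [new_head]], new_head: tails}

factored = left_factor("stmt", [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
])
for head, alternatives in factored.items():
    print(head, "→", " | ".join(" ".join(alt) for alt in alternatives))
```

After factoring, the parser commits to the shared prefix first and only has to choose between else and ε once that prefix has been consumed.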

  14. Left Recursive Grammars
      E → E + T | T
      T → T * F | F
      F → ( E ) | id
      So the input "X + Y * Z" creates the parse tree (bracket form)
      E[ E[ T[ F[ id[ X ] ] ] ] + T[ T[ F[ id[ Y ] ] ] * F[ id[ Z ] ] ] ]
      Try to create that parse tree using our Top Down parsing algorithm.

  15. Removing Left Recursion
      Algorithm to Remove Left Recursion
      A rule that is directly left recursive has the general form:
      A → A B | C
      where C does not begin with A. We can rewrite that rule into two equivalent rules:
      A → C A'
      A' → B A' | ε

  16. Removing Left Recursion
      Example of rewriting a grammar
      E → E + T | T matches A → A B | C with A = E, B = + T, C = T
      A → C A' becomes E → T E'
      A' → B A' | ε becomes E' → + T E' | ε

  17. Left Recursive Grammars
      So, this grammar
      E → E + T | T
      T → T * F | F
      F → ( E ) | id
      is rewritten to
      E → T E'
      E' → + T E' | ε
      T → F T'
      T' → * F T' | ε
      F → ( E ) | id
      stolen from page 176 of Compilers.., by Aho, Sethi, and Ullman
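
The rewrite A → A B | C into A → C A', A' → B A' | ε can be applied mechanically. The sketch below handles only immediate left recursion (a rule whose alternative starts with its own head) and assumes no alternative is ε to begin with; the function name and grammar encoding are illustrative choices, not from the slides.

```python
EPSILON = "ε"

def remove_left_recursion(head, alternatives):
    """Rewrite A → A B | C as A → C A', A' → B A' | ε."""
    # The B parts: what follows the leading head in recursive alternatives.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # The C parts: alternatives that do not start with the head.
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}
    new_head = head + "'"
    return {
        head: [alt + [new_head] for alt in others],
        new_head: [alt + [new_head] for alt in recursive] + [[EPSILON]],
    }

grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

rewritten = {}
for head, alternatives in grammar.items():
    rewritten.update(remove_left_recursion(head, alternatives))

for head, alternatives in rewritten.items():
    print(head, "→", " | ".join(" ".join(alt) for alt in alternatives))
```

Run on the expression grammar, this reproduces the rewritten rules shown on slide 17.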

  18. Top Down Parsing with Left Recursion removed
      E → T E'
      E' → + T E' | ε
      T → F T'
      T' → * F T' | ε
      F → ( E ) | id
      Use our Predictive Parsing Algorithm to draw the Parse Tree for "X + Y * Z"
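
One way to check a hand-drawn tree for this exercise is a recursive predictive parser with one function per non-terminal. This is a sketch under assumptions not stated in the slides: the input is pre-tokenized, any alphanumeric token counts as an id, and "$" marks the end of input. The tree encoding is made up for illustration.

```python
def parse_expression(tokens):
    """Build a parse tree for the grammar with left recursion removed."""
    tokens = list(tokens) + ["$"]   # "$" is an assumed end-of-input marker
    pos = 0

    def peek():
        return tokens[pos]

    def match(terminal):
        nonlocal pos
        if tokens[pos] != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {tokens[pos]!r}")
        pos += 1

    def E():                        # E → T E'
        t = T()
        rest = E_prime()
        return ("E", t, rest)

    def E_prime():                  # E' → + T E' | ε
        if peek() == "+":
            match("+")
            t = T()
            rest = E_prime()
            return ("E'", "+", t, rest)
        return ("E'", "ε")

    def T():                        # T → F T'
        f = F()
        rest = T_prime()
        return ("T", f, rest)

    def T_prime():                  # T' → * F T' | ε
        if peek() == "*":
            match("*")
            f = F()
            rest = T_prime()
            return ("T'", "*", f, rest)
        return ("T'", "ε")

    def F():                        # F → ( E ) | id
        nonlocal pos
        token = peek()
        if token == "(":
            match("(")
            e = E()
            match(")")
            return ("F", "(", e, ")")
        if token.isalnum():         # assumption: alphanumeric tokens are ids
            pos += 1
            return ("F", ("id", token))
        raise SyntaxError(f"unexpected token {token!r}")

    tree = E()
    if peek() != "$":
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression(["X", "+", "Y", "*", "Z"]))
```

Each function decides which production to use by looking only at the next token, so no backing up is needed.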

  19. Implementing a Predictive Parser
      stmt → while expr do stmt
      expr → ( bool )
      Transition diagrams, one per non-terminal, with edges labeled in order:
      stmt: while, expr, do, stmt
      expr: (, bool, )
      • Terminals: follow transition
      • NonTerminals: push onto stack
      • Accepting States: pop
      Example: while ( A < 100 ) do A = A + 5;

  20. Implementation
      This grammar
      E → T E'
      E' → + T E' | ε
      T → F T'
      T' → * F T' | ε
      F → ( E ) | id
      can be implemented with this table:
      [parsing table not reproduced in the transcript]
      stolen from page 188 of Compilers.., by Aho, Sethi, and Ullman

  21. Example
      A + B
      adapted from page 188 of Compilers.., by Aho, Sethi, and Ullman
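
A table-driven predictive parser for this grammar can be sketched as follows. The table from the slide did not survive in the transcript, so the entries below are the LL(1) entries that follow from the grammar's FIRST and FOLLOW sets, written out here as an assumption rather than copied from the slide. The sketch also assumes a lexer has already turned A + B into the tokens id + id, and uses "$" as an end marker.

```python
EPSILON = "ε"
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# TABLE[(nonterminal, lookahead)] = body of the production to apply.
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [EPSILON],
    ("E'", "$"): [EPSILON],
    ("T", "id"): ["F", "T'"],
    ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPSILON],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPSILON],
    ("T'", "$"): [EPSILON],
    ("F", "id"): ["id"],
    ("F", "("): ["(", "E", ")"],
}

def ll1_parse(tokens):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]              # start symbol on top of the end marker
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                raise SyntaxError(f"no table entry for ({top}, {lookahead})")
            applied.append(f"{top} → {' '.join(body)}")
            # Push the body reversed so its first symbol ends up on top.
            stack.extend(s for s in reversed(body) if s != EPSILON)
        elif top == lookahead:
            pos += 1                # terminal (or "$") matches: consume it
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return applied

for step in ll1_parse(["id", "+", "id"]):
    print(step)
```

The parser never backs up: each step either consumes one token or replaces the non-terminal on top of the stack using a single table lookup.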

  22. Error Recovery
      • Panic Mode Recovery
        • discard input tokens until a synchronizing token (e.g. ";") is found
      • Phrase-Level Recovery
        • make a local correction so that parsing can continue, e.g. insert a missing ";"
      • Error Productions
        • add production rules to the grammar for common errors
      stolen from page 166 of Compilers.., by Aho, Sethi, and Ullman
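
Panic mode recovery can be illustrated with a toy statement parser. Everything about the statement form here is a made-up assumption for the example (statements look like name = value ;), as are the function name and error message; only the recovery idea, skipping ahead to the next ";", comes from the slide.

```python
def parse_statements(tokens):
    """Parse statements of the form  name = value ;  with panic-mode recovery."""
    statements, errors = [], []
    pos = 0
    while pos < len(tokens):
        chunk = tokens[pos:pos + 4]
        well_formed = (len(chunk) == 4
                       and chunk[0].isidentifier()
                       and chunk[1] == "="
                       and chunk[2].isalnum()
                       and chunk[3] == ";")
        if well_formed:
            statements.append((chunk[0], chunk[2]))
            pos += 4
            continue
        errors.append(f"syntax error in statement starting at token {pos}: {tokens[pos]!r}")
        # Panic mode: discard tokens up to and including the next ";".
        while pos < len(tokens) and tokens[pos] != ";":
            pos += 1
        pos += 1
    return statements, errors

statements, errors = parse_statements("a = 1 ; b = = 2 ; c = 3 ;".split())
print(statements)
print(errors)
```

The malformed middle statement is reported once, and parsing resumes cleanly at the statement after the synchronizing ";", so later statements are still checked.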

  23. Bottom Up Parsing
      Shift-Reduce Parsing Algorithm
      • Scan the input left-to-right for patterns that match the right sides of any rules.
      • Replace the left-most pattern ("handle") with the left side of the corresponding rule, thus reducing the input string.
      • Picking the correct handle is the tricky part.

  24. Example
      Given the CFG:
      S → aABe
      A → Abc | b
      B → d
      The following sentence is reduced to S.
      abbcde   there are 3 choices to reduce: b, b, and d; pick left-most
      aAbcde   now the left-most handle is Abc
      aAde     only choice is d
      aABe     only one possible handle
      S        success
      stolen from page 195 of Compilers.., by Aho, Sethi, and Ullman
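
The reduction sequence above can be reproduced with a deliberately naive rule: always reduce the match that starts furthest to the left, preferring the longer right side on a tie. This is only a sketch that happens to work for this example; it is not a real shift-reduce parser, and the tie-breaking rule and function names are assumptions made for illustration.

```python
RULES = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]

def leftmost_reduction(sentence):
    """Return (start, head, body) for the left-most match, or None."""
    best = None
    for head, body in RULES:
        start = sentence.find(body)
        if start == -1:
            continue
        key = (start, -len(body))       # left-most first, then longest body
        if best is None or key < best[0]:
            best = (key, start, head, body)
    return None if best is None else best[1:]

def reduce_to_start(sentence, start_symbol="S"):
    """Reduce repeatedly; return (list of sentential forms, success flag)."""
    steps = [sentence]
    while sentence != start_symbol:
        choice = leftmost_reduction(sentence)
        if choice is None:
            return steps, False
        start, head, body = choice
        sentence = sentence[:start] + head + sentence[start + len(body):]
        steps.append(sentence)
    return steps, True

steps, ok = reduce_to_start("abbcde")
print(" ⇒ ".join(steps), ok)
```

In general the left-most match is not always the right handle, which is why practical parsers use tables to decide when to reduce.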

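  25. Picking the correct handle
      E → E + T | T
      T → T * F | F
      F → ( E ) | id
      Input: A + B * C
      Both reduction sequences start the same way:
      F + id * id
      T + id * id
      E + id * id
      E + F * id
      E + T * id
      Leaving E + T alone and reducing id first succeeds:
      E + T * id, E + T * F, E + T, E
      Reducing E + T to E first gets stuck:
      E + T * id, E * id, E * F, E * T, E * E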

  26. LR Parsers
      L = "left to right scanning"
      R = "right most derivation in reverse"
      • efficient, non-backtracking, shift-reduce
      • works with a large set of grammars
      • works for virtually all programming language constructs that can be described with a CFG
      • finds errors close to their source
      • extremely difficult to create by hand
      • but YACC can automatically create the decision tables for you
      • works off of two tables to determine state, stack, and when to reduce
