
Review: How do we define a grammar (what are the components in a grammar)?


archie

Presentation Transcript


  1. Review:
     • How do we define a grammar (what are the components of a grammar)?
     • What is a context-free grammar?
     • What is the language defined by a grammar?
     • What is an ambiguous grammar?
     • Why do we care about leftmost or rightmost derivations?

  2. Example:
       <PROGRAM>   -> 'program' id 'begin' <STMT_LIST> 'end'
       <STMT_LIST> -> <STMT> ';' <STMT_LIST> | <STMT>
       <STMT>      -> id '=' <EXPR>
       <EXPR>      -> <EXPR> <OP> <EXPR> | id
       <OP>        -> '+' | '-' | '*' | '/'
     Sample program:
       program test begin t0 = t1 + t2; t3 = t0 * t4 end
     The program is valid if the start symbol derives it in zero or more steps:
       <PROGRAM> ==>* program test begin t0 = t1 + t2; t3 = t0 * t4 end
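To make the derivation concrete, here is one possible leftmost derivation of the sample program, written as a LaTeX sketch. It assumes the amsmath package; the macros \nt, \pre and \sone are helpers introduced only for this illustration. Each id stands for one identifier token (test, t0, t1, t2, t3, t0, t4, in that order). Because the <EXPR> rule is ambiguous in general, other parse trees can exist for longer expressions, but each statement here has only one operator.

```latex
% Requires amsmath. The three macros below are helpers for this sketch only.
\newcommand{\nt}[1]{\langle\textit{#1}\rangle}      % a nonterminal
\newcommand{\pre}{\texttt{program id begin }}       % tokens already derived
\newcommand{\sone}{\texttt{id = id + id ; }}        % the finished first statement
\begin{align*}
\nt{PROGRAM}
 &\Rightarrow \pre \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \nt{STMT} \texttt{ ; } \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \texttt{id = } \nt{EXPR} \texttt{ ; } \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \texttt{id = } \nt{EXPR}\,\nt{OP}\,\nt{EXPR} \texttt{ ; } \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \texttt{id = id } \nt{OP}\,\nt{EXPR} \texttt{ ; } \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \texttt{id = id + } \nt{EXPR} \texttt{ ; } \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \sone \nt{STMT\_LIST} \texttt{ end} \\
 &\Rightarrow \pre \sone \nt{STMT} \texttt{ end} \\
 &\Rightarrow \pre \sone \texttt{id = } \nt{EXPR} \texttt{ end} \\
 &\Rightarrow \pre \sone \texttt{id = } \nt{EXPR}\,\nt{OP}\,\nt{EXPR} \texttt{ end} \\
 &\Rightarrow \pre \sone \texttt{id = id } \nt{OP}\,\nt{EXPR} \texttt{ end} \\
 &\Rightarrow \pre \sone \texttt{id = id * } \nt{EXPR} \texttt{ end} \\
 &\Rightarrow \pre \sone \texttt{id = id * id end}
\end{align*}
```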

  3. Parsing:
     • The process of determining whether the start symbol can derive the program.
       • If it succeeds, the program is syntactically valid.
       • If it fails, the program is invalid.
     • Two general approaches:
       • Expand from the start symbol to the whole program (top-down).
       • Reduce the whole program to the start symbol (bottom-up).

  4. Parsing methods:
     • Universal:
       • There exist algorithms that can parse any context-free grammar, but they are too inefficient for practical compilers.
       • What is considered efficient? Scanning the program once, from left to right.
     • Top-down parsing:
       • Builds the parse tree from the root to the leaves (using a leftmost derivation; why?).
       • Recursive descent and LL parsers.
     • Bottom-up parsing:
       • Builds the parse tree from the leaves to the root.
       • Operator-precedence parsing and LR parsing (SLR, canonical LR, LALR).

  5. Recursive descent parsing associates a procedure with each nonterminal in the grammar. It may require backtracking over the input string.
     • Example:
         <type>   -> <simple> | ^ id | array [ <simple> ] of <type>
         <simple> -> integer | char | num dotdot num

         void type() {
             /* integer, char and num are exactly the tokens that can start <simple> */
             if (lookahead == INTEGER || lookahead == CHAR || lookahead == NUM)
                 simple();
             else if (lookahead == '^') {
                 match('^'); match(ID);
             } else if (lookahead == ARRAY) {
                 match(ARRAY); match('['); simple(); match(']'); match(OF); type();
             } else
                 error();
         }

  6. Example (continued):
         <type>   -> <simple> | ^ id | array [ <simple> ] of <type>
         <simple> -> integer | char | num dotdot num

         void simple() {
             if (lookahead == INTEGER)
                 match(INTEGER);
             else if (lookahead == CHAR)
                 match(CHAR);
             else if (lookahead == NUM) {
                 match(NUM); match(DOTDOT); match(NUM);
             } else
                 error();
         }

         /* Consume the expected token, or report a syntax error. */
         void match(token t) {
             if (lookahead == t)
                 lookahead = nexttoken();
             else
                 error();
         }
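The procedures on slides 5 and 6 leave the token type, nexttoken() and error() undefined. The following is a minimal sketch of how they might be completed so that the parser can be exercised. Everything beyond type(), simple() and match() is an assumption made for illustration: the token enum (with CARET, LBRACKET and RBRACKET replacing the character literals, and END as an end-of-input marker), the pre-tokenized input array, the ok flag, and the parse() entry point.

```c
/* Token kinds. CARET, LBRACKET and RBRACKET stand for '^', '[' and ']';
   END marks the end of input. These names are assumptions of this sketch. */
typedef enum { INTEGER, CHAR, NUM, DOTDOT, ID, ARRAY, OF,
               CARET, LBRACKET, RBRACKET, END } token;

static const token *input;  /* hypothetical pre-tokenized input, END-terminated */
static int pos;             /* index of the current token in input */
static token lookahead;
static int ok;              /* cleared on the first syntax error */

static void error(void) { ok = 0; }

/* Advance to the next token, but never move past END. */
static token nexttoken(void) {
    if (input[pos] != END) pos++;
    return input[pos];
}

static void match(token t) {
    if (lookahead == t) lookahead = nexttoken();
    else error();
}

/* <simple> -> integer | char | num dotdot num */
static void simple(void) {
    if (lookahead == INTEGER) match(INTEGER);
    else if (lookahead == CHAR) match(CHAR);
    else if (lookahead == NUM) { match(NUM); match(DOTDOT); match(NUM); }
    else error();
}

/* <type> -> <simple> | ^ id | array [ <simple> ] of <type> */
static void type(void) {
    if (lookahead == INTEGER || lookahead == CHAR || lookahead == NUM) {
        simple();
    } else if (lookahead == CARET) {
        match(CARET); match(ID);
    } else if (lookahead == ARRAY) {
        match(ARRAY); match(LBRACKET); simple(); match(RBRACKET);
        match(OF); type();
    } else {
        error();
    }
}

/* Returns 1 if the whole token sequence is a <type>, 0 otherwise. */
static int parse(const token *tokens) {
    input = tokens;
    pos = 0;
    ok = 1;
    lookahead = input[0];
    type();
    return ok && lookahead == END;
}
```

For example, calling parse() on the tokens for `array [ num dotdot num ] of integer` returns 1, while a sequence with a missing `]` returns 0. A real front end would obtain tokens from a lexer and report the error position instead of setting a flag.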

  7. Recursive descent parsing may require backtracking over the input string.
     • Try out all productions, backtracking if necessary.
       • E.g., S -> cAd, A -> ab | a
       • Input string: cad
     • A special case of recursive descent parser that needs no backtracking is called a predictive parser.
       • By looking at the input, it must predict the right production every time to avoid backtracking.
       • It needs to know which first symbols can be generated by the right side of a production (using only one token of lookahead).
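The backtracking case for S -> cAd, A -> ab | a can be sketched in a few lines of C. This is an illustrative hand-written parser for this one grammar only; the function names parse_A and parse_S and the convention of returning the new input position (or -1 on failure) are assumptions of the sketch.

```c
static const char *in;   /* the input string being parsed */

/* Try one alternative of A -> ab | a starting at position i.
   Returns the position after the match, or -1 if the alternative fails. */
static int parse_A(int i, int alt) {
    if (alt == 0)                                   /* A -> ab */
        return (in[i] == 'a' && in[i + 1] == 'b') ? i + 2 : -1;
    return (in[i] == 'a') ? i + 1 : -1;             /* A -> a  */
}

/* S -> cAd. Returns 1 if s is derivable from S, 0 otherwise. */
static int parse_S(const char *s) {
    in = s;
    if (in[0] != 'c') return 0;
    for (int alt = 0; alt < 2; alt++) {
        int j = parse_A(1, alt);
        if (j < 0) continue;       /* alternative failed: back up to position 1 */
        if (in[j] == 'd' && in[j + 1] == '\0') return 1;
        /* A matched but the rest of S did not: back up and try the next one */
    }
    return 0;
}
```

On the input cad, the first alternative A -> ab matches a and then fails on d, so the parser resets the input position and succeeds with A -> a. A predictive parser could not choose between the two alternatives with one token of lookahead, because both begin with a.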

  8. First(α): the set of tokens that can appear as the first symbol of one or more strings generated from α. If α is the empty string or can generate the empty string, then the empty string is also in First(α).
     • Given productions A -> α | β, predictive parsing (looking one token ahead) requires First(α) and First(β) to be disjoint.
     • Predictive parsing won't work on some types of grammars:
       • Left recursion: A -> Aw (expanding A results in an infinite loop).
       • Common left factors: A -> aB | aC (First(aB) and First(aC) are not disjoint).
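The First sets can be computed by iterating over the productions until nothing changes. The sketch below is one way to do that in C under a simplified encoding that is assumed here for illustration: every grammar symbol is a single ASCII character, uppercase letters are nonterminals, anything else is a terminal, an empty right-hand side is an empty-string production, and '#' (EPS) marks the empty string inside a First set. The names prod, add_first_of, compute_first and disjoint are hypothetical.

```c
#include <ctype.h>
#include <string.h>

#define EPS '#'   /* stands for the empty string (assumed encoding) */

typedef struct { char lhs; const char *rhs; } prod;   /* rhs "" is an empty-string production */

/* first[X - 'A'][c] != 0 means symbol c (a terminal or EPS) is in First(X). */
static unsigned char first[26][128];

/* Add First(rhs) into out[]; returns 1 if anything new was added. */
static int add_first_of(const char *rhs, unsigned char out[128]) {
    int changed = 0;
    for (; *rhs; rhs++) {
        unsigned char c = (unsigned char)*rhs;
        if (!isupper(c)) {                     /* terminal: it is the first symbol */
            if (!out[c]) { out[c] = 1; changed = 1; }
            return changed;
        }
        for (int t = 0; t < 128; t++)          /* nonterminal: copy its First set minus EPS */
            if (t != EPS && first[c - 'A'][t] && !out[t]) { out[t] = 1; changed = 1; }
        if (!first[c - 'A'][EPS]) return changed;   /* it cannot vanish: stop here */
    }
    if (!out[EPS]) { out[EPS] = 1; changed = 1; }   /* every symbol can vanish */
    return changed;
}

/* Iterate over all n productions of g until a fixed point is reached. */
static void compute_first(const prod *g, int n) {
    memset(first, 0, sizeof first);
    for (int changed = 1; changed; ) {
        changed = 0;
        for (int i = 0; i < n; i++)
            changed |= add_first_of(g[i].rhs, first[g[i].lhs - 'A']);
    }
}

/* 1 if First(a) and First(b) share no symbol, as predictive parsing requires. */
static int disjoint(const char *a, const char *b) {
    unsigned char fa[128] = {0}, fb[128] = {0};
    add_first_of(a, fa);
    add_first_of(b, fb);
    for (int t = 0; t < 128; t++)
        if (fa[t] && fb[t]) return 0;
    return 1;
}
```

After compute_first() has run on a grammar, disjoint("aB", "aC") returns 0 for the left-factor case on the slide, since both right sides start with the terminal a.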

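  9. Eliminating Left Recursion
     • Immediate left recursion:
       • Replace A -> Aα | β with A -> βA' and A' -> αA' | ε
       • Example:
           E -> E + T | T
           T -> T * F | F
           F -> ( E ) | id
       • In general,
           A -> Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn   (no βi begins with A)
         can be replaced by
           A  -> β1A' | β2A' | … | βnA'
           A' -> α1A' | α2A' | … | αmA' | ε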
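The replacement rule for immediate left recursion is mechanical enough to sketch in C. The version below assumes single-character grammar symbols, writes the new nonterminal as the old one followed by an apostrophe, and prints the empty string as eps. The function name eliminate_immediate and the APPEND helper macro are hypothetical, and the function is meant to be called only for a nonterminal that has at least one left-recursive alternative and at least one that is not.

```c
#include <stdio.h>
#include <string.h>

/* Append formatted text to buf (capacity cap), truncating on overflow. */
#define APPEND(buf, cap, ...) \
    snprintf((buf) + strlen(buf), (cap) - strlen(buf), __VA_ARGS__)

/* Rewrite the alternatives of A so that no right side starts with A.
   Alternatives that begin with A feed the new nonterminal A';
   all other alternatives stay with A and get A' appended. */
static void eliminate_immediate(char A, const char *alts[], int n,
                                char *out, size_t cap) {
    const char *sep = "";
    out[0] = '\0';

    /* A -> (non-recursive alternative) A' | ... */
    APPEND(out, cap, "%c -> ", A);
    for (int i = 0; i < n; i++)
        if (alts[i][0] != A) {
            APPEND(out, cap, "%s%s%c'", sep, alts[i], A);
            sep = " | ";
        }

    /* A' -> (rest of each recursive alternative) A' | ... | eps */
    APPEND(out, cap, "\n%c' -> ", A);
    for (int i = 0; i < n; i++)
        if (alts[i][0] == A)
            APPEND(out, cap, "%s%c' | ", alts[i] + 1, A);
    APPEND(out, cap, "eps\n");
}
```

Applied to E with the alternatives "E+T" and "T", this produces E -> TE' and E' -> +TE' | eps. Applying it to T with "T*F" and "F" gives T -> FT' and T' -> *FT' | eps, while F -> (E) | id needs no change.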

  10. Algorithm 4.1. Eliminating left recursion:
        Arrange the nonterminals in some order A1, A2, …, An
        for i = 1 to n do begin
            for j = 1 to i-1 do begin
                replace each production of the form Ai -> Aj w by
                Ai -> d1 w | d2 w | … | dk w, where Aj -> d1 | d2 | … | dk
                are all the current Aj-productions
            end
            eliminate the immediate left recursion among the Ai-productions
        end
      (The algorithm can fail if the grammar has a cycle (A ==>+ A) or an ε-production (A -> ε).)

  11. Example 1:
        S -> Aa | b
        A -> Ac | Sd | ε
      Example 2:
        X -> YZ | a
        Y -> ZX | Xb
        Z -> XY | ZZ | a
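As a worked illustration, Example 1 can be traced through the left-recursion elimination algorithm by hand. The LaTeX sketch below assumes the amsmath package and the nonterminal order S, A. The grammar contains an empty-string production for A, so the algorithm is not guaranteed to work in general, but in this case it does.

```latex
% Requires amsmath. Nonterminal order assumed: A_1 = S, A_2 = A.
\begin{align*}
&\text{Start:}
  && S \to Aa \mid b, \qquad A \to Ac \mid Sd \mid \varepsilon \\
&\text{$i = 1$ ($S$):}
  && \text{no immediate left recursion among the $S$-productions} \\
&\text{$i = 2$, $j = 1$:}
  && A \to Sd \ \text{ is replaced by } \ A \to Aad \mid bd \\
&
  && A \to Ac \mid Aad \mid bd \mid \varepsilon \\
&\text{Eliminate immediate left recursion:}
  && A \to bdA' \mid A', \qquad A' \to cA' \mid adA' \mid \varepsilon \\
&\text{Result:}
  && S \to Aa \mid b, \qquad A \to bdA' \mid A', \qquad A' \to cA' \mid adA' \mid \varepsilon
\end{align*}
```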

  12. Left factoring (to produce a grammar suitable for predictive parsing)
      • Replace productions
          A -> αβ1 | αβ2 | … | αβn | γ
        by
          A  -> αA' | γ
          A' -> β1 | β2 | … | βn
      • Example:
          S -> iEtS | iEtSeS | a
          E -> b
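One left-factoring step can be sketched in C as follows. The sketch assumes single-character grammar symbols, factors only on the longest prefix that the first alternative shares with some other alternative, writes the new nonterminal with an apostrophe, and prints the empty string as eps. The names common_prefix, left_factor and APPEND are hypothetical. A full implementation would repeat the step until no two alternatives of any nonterminal share a prefix.

```c
#include <stdio.h>
#include <string.h>

/* Append formatted text to buf (capacity cap), truncating on overflow. */
#define APPEND(buf, cap, ...) \
    snprintf((buf) + strlen(buf), (cap) - strlen(buf), __VA_ARGS__)

/* Length of the longest common prefix of a and b. */
static size_t common_prefix(const char *a, const char *b) {
    size_t k = 0;
    while (a[k] && a[k] == b[k]) k++;
    return k;
}

/* One factoring step on the alternatives of A.
   Returns 1 and writes the new productions to out, or returns 0
   (leaving out empty) if alts[0] shares no prefix with another alternative. */
static int left_factor(char A, const char *alts[], int n,
                       char *out, size_t cap) {
    size_t k = 0;                       /* length of the chosen prefix */
    const char *sep = "";
    out[0] = '\0';
    for (int j = 1; j < n; j++) {
        size_t c = common_prefix(alts[0], alts[j]);
        if (c > k) k = c;
    }
    if (k == 0) return 0;

    /* A -> prefix A' | (alternatives that do not start with the prefix) */
    APPEND(out, cap, "%c -> %.*s%c'", A, (int)k, alts[0], A);
    for (int j = 1; j < n; j++)
        if (strncmp(alts[0], alts[j], k) != 0)
            APPEND(out, cap, " | %s", alts[j]);

    /* A' -> (what follows the prefix in each factored alternative) */
    APPEND(out, cap, "\n%c' -> ", A);
    for (int j = 0; j < n; j++)
        if (strncmp(alts[0], alts[j], k) == 0) {
            const char *rest = alts[j] + k;
            APPEND(out, cap, "%s%s", sep, *rest ? rest : "eps");
            sep = " | ";
        }
    APPEND(out, cap, "\n");
    return 1;
}
```

For the slide's example, the alternatives "iEtS", "iEtSeS" and "a" of S yield S -> iEtSS' | a and S' -> eps | eS, while E -> b has nothing to factor.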
