Bottom-Up Syntax Analysis
Mooly Sagiv
html://www.math.tau.ac.il/~msagiv/courses/wcc01.html
Textbook: Modern Compiler Implementation in C, Chapter 3
Efficient Parsers
• Pushdown automata
• Deterministic
• Report an error as soon as the input is not a prefix of a valid program
• Not usable for all context-free grammars
[Figure: bison takes a context-free grammar and produces a parser or reports "ambiguity errors"; the parser reads tokens and produces a parse tree]
Kinds of Parsers
• Top-Down (Predictive Parsing), LL
  • Construct the parse tree in a top-down manner
  • Find the leftmost derivation
  • For every non-terminal and token, predict the next production
• Bottom-Up, LR
  • Construct the parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token, decide when a production is found
Bottom-Up Syntax Analysis
• Input
  • A context-free grammar
  • A stream of tokens
• Output
  • A syntax tree or an error
• Method
  • Construct the parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token, decide when a production is found
  • Report an error as soon as the input is not a prefix of a valid program
Plan
• Pushdown automata
• Bottom-up parsing (given a parser table)
• Constructing the parser table
• Interesting non-LR grammars
Pushdown Automaton
[Figure: a control unit with a parser-table, an input holding u t w $, and a stack holding $ and V]
Bottom-Up Parser Actions
• reduce A → β
  • Pop |β| symbols from the stack
  • Apply the associated action
  • Push a symbol goto[top, A] on the stack
• shift X
  • Push X onto the stack
  • Advance the input
• accept
  • Parsing is complete
• error
  • Report an error
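The four actions can be sketched as a small table-driven loop. This is a minimal sketch, not the course's implementation: the table below is an SLR table for S → a S b | ε worked out by hand, the state numbers are an assumption, and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `parse` are hypothetical. The stack holds states only.

```python
# Hand-built SLR table for S' -> S$, S -> a S b | epsilon.
# State numbering is an assumption made for this sketch.
PRODUCTIONS = {
    "S->aSb": ("S", 3),  # left-hand side, length of right-hand side
    "S->eps": ("S", 0),
}

ACTION = {
    (0, "a"): ("shift", 1), (0, "b"): ("reduce", "S->eps"), (0, "$"): ("reduce", "S->eps"),
    (1, "a"): ("shift", 1), (1, "b"): ("reduce", "S->eps"), (1, "$"): ("reduce", "S->eps"),
    (2, "b"): ("shift", 3),
    (3, "b"): ("reduce", "S->aSb"), (3, "$"): ("reduce", "S->aSb"),
    (4, "$"): ("accept", None),
}
GOTO = {(0, "S"): 4, (1, "S"): 2}


def parse(tokens):
    """Return the reductions performed (a rightmost derivation in reverse).

    Raises SyntaxError when the table has no entry for (state, token).
    """
    stack = [0]                      # stack of states, start state at the bottom
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        state, tok = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, tok), ("error", None))
        if kind == "shift":
            stack.append(arg)        # push the target state
            pos += 1                 # advance the input
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            if length:
                del stack[-length:]  # pop |rhs| entries
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        elif kind == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {tok!r} at position {pos}")
```

For example, `parse("aabb")` returns the reductions in the order they are found, and `parse("aab")` raises `SyntaxError` when the end marker arrives while a `b` is still expected.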
A Parser Table for S → a S b | ε
• Manual Construction?
The Challenge
• How to construct a parser-table from a given grammar
• LR(1) grammars
  • Left to right scanning
  • Rightmost derivations (reverse)
  • 1 token
• Different solutions
  • Operator precedence
  • SLR(1): Simple LR(1)
  • CLR(1): Canonical LR(1)
  • LALR(1): Look Ahead LR(1)
    • Yacc, Bison, JCUP
Grammar Hierarchy
[Figure: nested grammar classes, with SLR(1) inside LALR(1) inside CLR(1) inside non-ambiguous CFG; LL(1) also lies inside CLR(1)]
Constructing an SLR parsing table
• Add a production S' → S$
• Construct a finite automaton accepting "valid stack symbols"
• The states of the automaton become the states of the parsing-table
• Determine shift operations
• Determine goto operations
• Construct reduce entries by analyzing the grammar
A Finite Automaton for S' → S$, S → a S b | ε
[Figure: automaton with states 0, 1, 2, 3, 4 and transitions labeled a, a, S, b, S]
Constructing a Finite Automaton
• NFA
• For X → X1 X2 … Xn
  • [X → X1 X2 … Xi . Xi+1 … Xn]
  • "prefixes of rhs (handles)"
  • X1 X2 … Xi is at the top of the stack and we expect Xi+1 … Xn
• The initial state [S' → .S$]
• δ([X → X1 … Xi . Xi+1 … Xn], Xi+1) = [X → X1 … Xi Xi+1 . … Xn]
• For every production Xi+1 → α: δ([X → X1 X2 … Xi . Xi+1 … Xn], ε) = [Xi+1 → .α]
• Convert into DFA
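The construction can be sketched in code. Note the swap: instead of building the NFA and then determinizing it, this sketch computes the DFA states directly with the usual closure and goto operations, which follow the ε-moves and the symbol moves described above. The item encoding `(production index, dot position)`, the function names and the state numbering are assumptions of the sketch.

```python
# Grammar S' -> S$, S -> a S b | epsilon, as (lhs, rhs) pairs.
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("a", "S", "b")),
    ("S", ()),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Follow epsilon moves: add [B -> . gamma] whenever the dot precedes B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_dfa():
    """Return (states, edges); states[0] is the closure of [S' -> . S$]."""
    start = closure({(0, 0)})
    states, index, edges = [start], {start: 0}, {}
    i = 0
    while i < len(states):           # states grows while we iterate
        state = states[i]
        symbols = sorted({GRAMMAR[p][1][d] for p, d in state
                          if d < len(GRAMMAR[p][1])})
        for sym in symbols:
            if sym == "$":           # the end marker is not shifted; it means accept
                continue
            target = goto(state, sym)
            if target not in index:
                index[target] = len(states)
                states.append(target)
            edges[(i, sym)] = index[target]
        i += 1
    return states, edges
```

Calling `build_dfa()` on this grammar yields five states, one per item set, with the transitions stored in `edges` keyed by `(state number, symbol)`.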
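NFA for S' → S$, S → a S b | ε
• Items (NFA states): [S' → .S$], [S' → S.$], [S → .aSb], [S → a.Sb], [S → aS.b], [S → aSb.], [S → .]
• [Figure: transitions between the items, labeled a, S, b and ε]

DFA
• State {[S' → .S$], [S → .aSb], [S → .]}: on a go to {[S → a.Sb], [S → .aSb], [S → .]}, on S go to {[S' → S.$]}
• State {[S → a.Sb], [S → .aSb], [S → .]}: on a stay in the same state, on S go to {[S → aS.b]}
• State {[S → aS.b]}: on b go to {[S → aSb.]}
• State {[S → aSb.]}
• State {[S' → S.$]}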
Filling reduce entries
• For an item [A → α.] we need to know the tokens that can follow A in a derivation from S'
• Follow(A) = {t | S' ⇒* αAtβ}
• See the textbook for an algorithm for constructing Follow from a given grammar
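One common way to compute Follow is a fixed-point iteration over nullable, First and Follow together. The sketch below is written under that assumption and is not necessarily the textbook's presentation; the grammar encoding and the name `compute_follow` are hypothetical.

```python
# Grammar S' -> S$, S -> a S b | epsilon, as (lhs, rhs) pairs.
GRAMMAR = [
    ("S'", ["S", "$"]),
    ("S", ["a", "S", "b"]),
    ("S", []),
]


def compute_follow(grammar):
    """Return {nonterminal: set of tokens that can follow it}."""
    nonterminals = {lhs for lhs, _ in grammar}
    nullable = set()
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}

    def first_of(symbol):
        # A terminal's First set is the terminal itself.
        return first[symbol] if symbol in nonterminals else {symbol}

    changed = True
    while changed:                   # iterate until nothing grows
        changed = False
        for lhs, rhs in grammar:
            # lhs is nullable when every symbol of rhs is nullable.
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
            # First(lhs) includes First of each symbol up to the first non-nullable one.
            for s in rhs:
                new = first_of(s) - first[lhs]
                if new:
                    first[lhs] |= new
                    changed = True
                if s not in nullable:
                    break
            # Follow(s) includes First of what comes after s in rhs,
            # plus Follow(lhs) when everything after s is nullable.
            for i, s in enumerate(rhs):
                if s not in nonterminals:
                    continue
                new = set()
                for t in rhs[i + 1:]:
                    new |= first_of(t)
                    if t not in nullable:
                        break
                else:
                    new |= follow[lhs]
                new -= follow[s]
                if new:
                    follow[s] |= new
                    changed = True
    return follow
```

For the grammar above, `compute_follow(GRAMMAR)["S"]` contains `b` (from S → a S b) and `$` (from S' → S$).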
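[Figure: the DFA for S' → S$, S → a S b | ε, annotated with reduce entries]
• Follow(S) = {b, $}
• r S → ε: in the two states that contain [S → .], on the tokens in Follow(S)
• r S → a S b: in the state that contains [S → aSb.], on the tokens in Follow(S)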
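Filling the table from the DFA and Follow(S) can be sketched as follows. The DFA and the Follow set are hard-coded from the slides; the state numbering, the single-character symbol encoding and the name `fill_table` are assumptions of this sketch.

```python
# Items are (lhs, rhs, dot); each character of rhs is one grammar symbol.
# State numbering is an assumption made for this sketch.
STATES = {
    0: [("S'", "S$", 0), ("S", "aSb", 0), ("S", "", 0)],
    1: [("S", "aSb", 1), ("S", "aSb", 0), ("S", "", 0)],
    2: [("S", "aSb", 2)],
    3: [("S", "aSb", 3)],
    4: [("S'", "S$", 1)],
}
EDGES = {(0, "a"): 1, (1, "a"): 1, (1, "S"): 2, (2, "b"): 3, (0, "S"): 4}
FOLLOW = {"S": {"b", "$"}}
NONTERMINALS = {"S", "S'"}


def fill_table():
    """Return (action, goto, conflicts) for the hard-coded DFA."""
    action, goto, conflicts = {}, {}, []

    def put(key, entry):
        # Record a conflict when two different entries land in one cell.
        if key in action and action[key] != entry:
            conflicts.append((key, action[key], entry))
        action[key] = entry

    # Shift entries come from terminal edges, goto entries from nonterminal edges.
    for (state, sym), target in EDGES.items():
        if sym in NONTERMINALS:
            goto[(state, sym)] = target
        else:
            put((state, sym), ("shift", target))

    # Reduce entries: a completed item [A -> alpha .] reduces on Follow(A).
    for state, items in STATES.items():
        for lhs, rhs, dot in items:
            if lhs == "S'":
                if rhs[dot:] == "$":          # [S' -> S . $] accepts on $
                    put((state, "$"), ("accept",))
            elif dot == len(rhs):
                for token in FOLLOW[lhs]:
                    put((state, token), ("reduce", lhs, rhs))
    return action, goto, conflicts
```

For this grammar the returned `conflicts` list is empty, which is what makes the grammar SLR(1); a non-empty list would point at the table cells where shift/reduce or reduce/reduce conflicts occur.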
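Interesting Non SLR(1) Grammar
• S' → S$
• S → L = R | R
• L → *R | id
• R → L
Partial DFA
• {[S' → .S$], [S → .L=R], [S → .R], [L → .*R], [L → .id], [R → .L]}: on L go to {[S → L.=R], [R → L.]}
• {[S → L.=R], [R → L.]}: on = go to {[S → L=.R], [R → .L], [L → .*R], [L → .id]}
• Follow(R) = {$, =}
• In the state {[S → L.=R], [R → L.]} the token = allows a shift and, because = is in Follow(R), also the reduction R → L, which is a shift/reduce conflict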
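The conflict can be checked mechanically for the state reached on L. The sketch below applies the SLR rule (shift on the terminal after the dot, reduce a completed item on its Follow set) to that single state; Follow(R) is taken from the slide, and the names `slr_actions` and `find_conflicts` are hypothetical.

```python
# The state reached from the start state on L, as (lhs, rhs, dot) items.
STATE = [
    ("S", ("L", "=", "R"), 1),   # [S -> L . = R]
    ("R", ("L",), 1),            # [R -> L .]
]
FOLLOW = {"R": {"$", "="}}       # taken from the slide
TERMINALS = {"=", "*", "id", "$"}


def slr_actions(items, follow, terminals):
    """Map each token to the set of SLR actions this state proposes for it."""
    actions = {}
    for lhs, rhs, dot in items:
        if dot < len(rhs):
            # Dot before a terminal: shift on that terminal.
            if rhs[dot] in terminals:
                actions.setdefault(rhs[dot], set()).add("shift")
        else:
            # Completed item: reduce on every token in Follow(lhs).
            for token in follow[lhs]:
                actions.setdefault(token, set()).add(
                    f"reduce {lhs} -> {' '.join(rhs)}")
    return actions


def find_conflicts(actions):
    """Keep only the tokens that have more than one proposed action."""
    return {tok: acts for tok, acts in actions.items() if len(acts) > 1}
```

Here `find_conflicts(slr_actions(STATE, FOLLOW, TERMINALS))` reports a single conflicting token, `=`, with both a shift and the reduction by R → L.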
LR(1) Parser
• Item [A → α.β, t]
  • α is at the top of the stack and we are expecting βt
• LR(1) State
  • Sets of items
• LALR(1) State
  • Merge states whose items differ only in their look-ahead
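LALR(1) construction is usually described as merging LR(1) states that have the same core, meaning the same items once look-aheads are ignored. A minimal sketch of that merge step follows; the encoding of an LR(1) item as `(item, lookahead)` and the name `merge_by_core` are assumptions, and the usage note relies on states worked out by hand for the grammar S → L = R | R.

```python
from collections import defaultdict


def merge_by_core(states):
    """Merge LR(1) states whose items differ only in their look-ahead.

    Each state is a set of (item, lookahead) pairs, where item is a
    hashable LR(0) item such as (lhs, rhs, dot).
    """
    merged = defaultdict(set)
    for state in states:
        core = frozenset(item for item, _ in state)   # drop the look-aheads
        merged[core] |= set(state)                    # union the LR(1) items
    return [frozenset(items) for items in merged.values()]
```

For instance, in the canonical LR(1) automaton of S → L = R | R, L → *R | id, R → L, the state {[L → id., =], [L → id., $]} and the state {[L → id., $]} share the core {[L → id.]}, so the merge leaves one state carrying both look-aheads.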
Interesting Non LR(1) Grammars
• Ambiguous
  • Arithmetic expressions
  • Dangling-else
• Common derived prefix
  • A → B1 a b | B2 a c
  • B1 → ε
  • B2 → ε
• Optional non-terminals
  • St → OptLab Ass
  • OptLab → id : | ε
  • Ass → id := Exp
Summary
• LR is a powerful technique
• Generates efficient parsers
• Generation tools exist
  • Bison, yacc, CUP
• But some grammars need to be tuned
  • Shift/Reduce conflicts
  • Reduce/Reduce conflicts
  • Efficiency of the generated parser