
Chap. 5, Top-Down Parsing

Chap. 5, Top-Down Parsing. J. H. Wang Mar. 29, 2011. Outline. Overview LL(k) Grammars Recursive-Descent LL(1) Parsers Table-Driven LL(1) Parsers Obtaining LL(1) Grammars A Non-LL(1) Language Properties of LL(1) Parsers Parse Table Representation Syntactic Error Recovery and Repair.


Presentation Transcript


  1. Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011

  2. Outline • Overview • LL(k) Grammars • Recursive-Descent LL(1) Parsers • Table-Driven LL(1) Parsers • Obtaining LL(1) Grammars • A Non-LL(1) Language • Properties of LL(1) Parsers • Parse Table Representation • Syntactic Error Recovery and Repair

  3. Overview • Two forms of top-down parsers • Recursive-descent parsers • Table-driven LL parsers: LL(k) – to be explained later • Compiler compilers (or parser generators) • With a CFG as the language’s definition, parsers can be constructed automatically • Language revisions, updates, or extensions can easily be carried over to a new parser • The grammar is proved unambiguous if parser construction is successful

  4. Top-Down Parsing • Top-down • Grows a parse tree from the root to the leaves • Predictive • Must predict which production rule is to be applied • LL(k) • Scans the input left to right, produces a leftmost derivation, uses k symbols of lookahead • Recursive descent • Can be implemented by a set of mutually recursive procedures

  5. LL(k) Grammars • Recall from Chap. 2 • A parsing procedure for each nonterminal A • The procedure is responsible for accomplishing one step of derivation for the corresponding production • The production is chosen by inspecting the next k tokens • The Predict set of a production for A is the set of tokens that trigger that production • The Predict set is determined by the right-hand side (RHS)

  6. We need a strategy for choosing productions • Predictk(p): the set of length-k token strings that predict the application of rule p • Input string: a…, where a is the next token • Derivation so far: S ⇒*lm A y1…yn, with A the leftmost nonterminal • P = { p ∈ ProductionsFor(A) | a ∈ Predict(p) } • P is the empty set -> syntax error • P contains more than one production -> nondeterminism • P contains exactly one production -> apply it
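
The three cases for the set P can be sketched in a few lines of Python. This is only an illustration: the function name `choose_production`, the nested-dictionary layout of `PREDICT`, and the small expression grammar behind it are assumptions made for the example, not taken from the slides.

```python
# Hypothetical predict sets for the nonterminal E' of a toy grammar:
#   E' -> + T E' | λ      (λ is written as the empty tuple)
PREDICT = {
    "E'": {
        ("+", "T", "E'"): {"+"},
        (): {")", "$"},
    }
}

def choose_production(predict_sets, nonterminal, token):
    """Build P = {p in ProductionsFor(A) | token in Predict(p)} and act on its size."""
    candidates = [rhs for rhs, predicted in predict_sets[nonterminal].items()
                  if token in predicted]
    if not candidates:
        # P is empty: no production can start with this token.
        raise SyntaxError(f"unexpected token {token!r} while expanding {nonterminal}")
    if len(candidates) > 1:
        # P has several members: the grammar is not LL(1) at this point.
        raise ValueError(f"nondeterminism for ({nonterminal}, {token!r})")
    return candidates[0]

print(choose_production(PREDICT, "E'", "+"))
print(choose_production(PREDICT, "E'", ")"))
```

With one token of lookahead, a parser built on this selection never backtracks: each call either commits to a single production or reports an error.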

  7. How to Compute Predict(p) • To predict production p: A → X1…Xm, m ≥ 0 • The set of terminal symbols that are first produced in some derivation from X1…Xm • Those terminal symbols that can follow A, if X1…Xm can derive λ • (Fig. 5.1)
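
The two ingredients of Predict(p), the First set of the right-hand side and the Follow set of A, can be computed by fixed-point iteration. The sketch below is not the algorithm of Fig. 5.1; it is a common textbook formulation, and the grammar, the function names, and the use of `"λ"` and `"$"` as markers are all assumptions chosen for illustration.

```python
# Hypothetical grammar: E -> T E'   E' -> + T E' | λ   T -> id | ( E )
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["id"], ["(", "E", ")"]],
}
EPS, END = "λ", "$"

def first_of_string(symbols, first, grammar):
    """First set of a symbol string; contains EPS if the whole string can derive λ."""
    result = set()
    for x in symbols:
        if x not in grammar:              # terminal: it is the first symbol
            result.add(x)
            return result
        result |= first[x] - {EPS}
        if EPS not in first[x]:           # x cannot vanish, so stop here
            return result
    result.add(EPS)
    return result

def compute_first(grammar):
    first = {a: set() for a in grammar}
    changed = True
    while changed:                        # iterate until no set grows
        changed = False
        for a, productions in grammar.items():
            for rhs in productions:
                new = first_of_string(rhs, first, grammar)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    follow = {a: set() for a in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for rhs in productions:
                for i, x in enumerate(rhs):
                    if x not in grammar:
                        continue
                    tail = first_of_string(rhs[i + 1:], first, grammar)
                    new = tail - {EPS}
                    if EPS in tail:       # everything after x can vanish
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

def predict(a, rhs, grammar, first, follow):
    """Predict(A -> rhs) = First(rhs) - {λ}, plus Follow(A) if rhs can derive λ."""
    f = first_of_string(rhs, first, grammar)
    return (f - {EPS}) | (follow[a] if EPS in f else set())

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
for a, productions in GRAMMAR.items():
    for rhs in productions:
        print(a, "->", " ".join(rhs) or EPS, sorted(predict(a, rhs, GRAMMAR, FIRST, FOLLOW)))
```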

  8. For an LL(1) grammar, the productions for each nonterminal A must have disjoint predict sets • Not all CFGs are LL(1) • More lookahead may be needed: LL(k), k>1 • A more powerful parsing method may be required (Chap. 6) • The grammar may be ambiguous
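
The disjointness requirement can be checked mechanically once the predict sets are known. The following is a minimal sketch; the function name `ll1_conflicts` and the sample predict sets (a conditional statement with and without an else part) are hypothetical and serve only to show one overlap.

```python
from itertools import combinations

def ll1_conflicts(predict_sets):
    """Return (nonterminal, production, production, shared tokens) for each overlap."""
    conflicts = []
    for nonterminal, productions in predict_sets.items():
        # Compare every pair of productions of the same nonterminal.
        for (p, pred_p), (q, pred_q) in combinations(productions.items(), 2):
            shared = pred_p & pred_q
            if shared:
                conflicts.append((nonterminal, p, q, shared))
    return conflicts

# Hypothetical predict sets: both conditional productions are predicted by "if".
PREDICT = {
    "Stmt": {
        ("if", "Expr", "then", "Stmt"): {"if"},
        ("if", "Expr", "then", "Stmt", "else", "Stmt"): {"if"},
        ("other",): {"other"},
    }
}

for nonterminal, p, q, shared in ll1_conflicts(PREDICT):
    print(f"{nonterminal}: {' '.join(p)}  vs  {' '.join(q)}  share {sorted(shared)}")
```

An empty result means the grammar is LL(1) with respect to the given sets; a non-empty result does not say which of the remedies on the slide applies, only where the overlap occurs.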

  9. [Figure: pseudocode; only the names S, MATCH, PEEK, ADVANCE, and ERROR survive in the transcript]

  10. Recursive-Descent LL(1) Parsers • Input: token stream ts • PEEK(): examines the next input token without advancing the input • ADVANCE(): advances the input by one token • To construct a recursive-descent parser • We write a separate procedure for each nonterminal A • For each production pi of A, we check each symbol in the RHS X1…Xm • Terminal symbol: MATCH(ts, Xi) • Nonterminal symbol: call Xi(ts)

  11. [Figure: pseudocode; only calls to PEEK survive in the transcript]

  12. [Figure: pseudocode; only calls to PEEK and MATCH survive in the transcript]
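
Since the pseudocode figures did not survive, here is a minimal Python sketch of the recipe from slide 10: one procedure per nonterminal, `match` for terminals, a procedure call for nonterminals. The grammar (E → T E', E' → + T E' | λ, T → id | ( E )), the `TokenStream` class, and all function names are assumptions for illustration and are not the figures' code.

```python
class TokenStream:
    """Hypothetical token stream; "$" marks the end of input."""
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]      # look at the next token, do not consume it

    def advance(self):
        self.pos += 1                     # consume one token

def match(ts, expected):
    if ts.peek() != expected:
        raise SyntaxError(f"expected {expected!r}, found {ts.peek()!r}")
    ts.advance()

def parse_E(ts):                          # E -> T E'      Predict = {id, (}
    if ts.peek() in ("id", "("):
        parse_T(ts)
        parse_E_tail(ts)
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in E")

def parse_E_tail(ts):
    if ts.peek() == "+":                  # E' -> + T E'   Predict = {+}
        match(ts, "+")
        parse_T(ts)
        parse_E_tail(ts)
    elif ts.peek() in (")", "$"):         # E' -> λ        Predict = {), $}
        pass
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in E'")

def parse_T(ts):
    if ts.peek() == "id":                 # T -> id        Predict = {id}
        match(ts, "id")
    elif ts.peek() == "(":                # T -> ( E )     Predict = {(}
        match(ts, "(")
        parse_E(ts)
        match(ts, ")")
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in T")

def parse(tokens):
    ts = TokenStream(tokens)
    parse_E(ts)
    match(ts, "$")                        # all input must be consumed
    return True

print(parse(["id", "+", "(", "id", "+", "id", ")"]))
```

Each `if` branch tests membership in the predict set of one production, which is why disjoint predict sets are exactly what makes this style of parser deterministic.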

  13. Table-Driven LL(1) Parsers • Creating recursive-descent parsers can be automated, but there are drawbacks • Size of the parser code • Inefficiency: overhead of method calls and returns • To create table-driven parsers, we use a stack to simulate the actions of MATCH() and the calls to nonterminals’ procedures • Terminal symbol: MATCH • Nonterminal symbol: table lookup • (Fig. 5.8)

  14. [Figure: pseudocode for the table-driven parser; only the names PARSER, PUSH, MATCH, POP, PEEK, ERROR, and APPLY survive in the transcript]
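
The stack discipline described on slide 13 can be sketched as follows. This is an illustrative driver, not the one in the figure: the hand-written `TABLE`, the toy expression grammar behind it, and the names `ll1_parse` and `NONTERMINALS` are assumptions.

```python
# Hypothetical LL(1) table for: E -> T E'   E' -> + T E' | λ   T -> id | ( E )
# Key: (top-of-stack nonterminal, next token). Value: RHS to push ([] means λ).
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],
    ("E'", "$"): [],
    ("T", "id"): ["id"],
    ("T", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T"}

def ll1_parse(tokens, table=TABLE, start="E"):
    """Return the productions applied, in leftmost-derivation order."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = ["$", start]                  # end marker sits below the start symbol
    applied = []
    while stack:
        top = stack.pop()
        token = tokens[pos]
        if top in NONTERMINALS:
            rhs = table.get((top, token))
            if rhs is None:               # empty table entry: syntax error
                raise SyntaxError(f"no rule for ({top}, {token!r})")
            applied.append((top, tuple(rhs)))
            stack.extend(reversed(rhs))   # push RHS so its first symbol is on top
        elif top == token:
            pos += 1                      # terminal on top: match and advance
        else:
            raise SyntaxError(f"expected {top!r}, found {token!r}")
    return applied

for lhs, rhs in ll1_parse(["id", "+", "id"]):
    print(lhs, "->", " ".join(rhs) or "λ")
```

Pushing the right-hand side in reverse is what replaces the sequence of procedure calls in a recursive-descent parser: the explicit stack holds the symbols that the call stack would otherwise remember.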

  15. How to Build LL(1) Parse Table • The table is indexed by the top-of-stack (TOS) symbol and the next input token • Row: nonterminal symbol • Column: next input token • (Fig. 5.9)

  16. [Figure: FILLTABLE procedure; body not preserved in the transcript]
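
Filling the table from predict sets is short enough to sketch. The version below is an assumption about how such a procedure might look in Python, not a transcription of the figure; `fill_table` and the `PREDICT` data for a toy expression grammar are hypothetical.

```python
# Hypothetical predict sets for: E -> T E'   E' -> + T E' | λ   T -> id | ( E )
PREDICT = {
    "E": {("T", "E'"): {"id", "("}},
    "E'": {("+", "T", "E'"): {"+"}, (): {")", "$"}},
    "T": {("id",): {"id"}, ("(", "E", ")"): {"("}},
}

def fill_table(predict_sets):
    """Map (nonterminal, token) to the production predicted by that token."""
    table = {}
    for nonterminal, productions in predict_sets.items():
        for rhs, predicted in productions.items():
            for token in predicted:
                key = (nonterminal, token)
                if key in table:
                    # Two productions claim the same cell: predict sets overlap.
                    raise ValueError(f"not LL(1): conflict at {key}")
                table[key] = rhs
    return table

table = fill_table(PREDICT)
for (nonterminal, token), rhs in sorted(table.items()):
    print(f"T[{nonterminal}, {token}] = {' '.join(rhs) or 'λ'}")
```

Cells that never receive an entry are the error entries: looking one up during parsing means the next token cannot start any production of the nonterminal on top of the stack.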

  17. Obtaining LL(1) Grammars • It’s easy to violate the requirement of a unique prediction for each combination of nonterminal and lookahead symbols • Common prefixes • Left recursion

  18. Common Prefixes • Two productions for the same nonterminal begin with the same string of grammar symbols • Ex. (Fig. 5.12) Not LL(k) • Factoring transformation • Fig. 5.13 • Ex. (Fig. 5.14)

  19. [Figure: FACTOR procedure; body not preserved in the transcript]
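
The factoring transformation can be illustrated with a short Python sketch. It performs a single round of factoring and is not the procedure from the figure; the function names, the naming scheme for the fresh nonterminal, and the sample conditional-statement productions are assumptions.

```python
def common_prefix(sequences):
    """Longest list of symbols with which every sequence starts."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor(nonterminal, productions):
    """One round of factoring: productions sharing a first symbol get a common head."""
    groups = {}
    for rhs in productions:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nonterminal: []}
    counter = 0
    for first_symbol, group in groups.items():
        if first_symbol is None or len(group) == 1:
            result[nonterminal].extend(group)         # nothing to factor
            continue
        prefix = common_prefix(group)
        counter += 1
        fresh = f"{nonterminal}_tail{counter}"        # hypothetical fresh nonterminal
        result[nonterminal].append(prefix + [fresh])
        result[fresh] = [rhs[len(prefix):] for rhs in group]
    return result

# Hypothetical productions with a common prefix "if Expr then StmtList".
rules = left_factor("Stmt", [
    ["if", "Expr", "then", "StmtList", "endif"],
    ["if", "Expr", "then", "StmtList", "else", "StmtList", "endif"],
])
for lhs, productions in rules.items():
    for rhs in productions:
        print(lhs, "->", " ".join(rhs) or "λ")
```

After factoring, the decision between the two alternatives is postponed until the token that actually distinguishes them (`endif` or `else` in this example) is the lookahead. The tails of the fresh nonterminal may themselves share prefixes, in which case the round would need to be repeated.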

  20. [Figure: ELIMINATELEFTRECURSION procedure; body not preserved in the transcript]

  21. Left Recursion • A production is left recursive if its LHS symbol is also the first symbol of its RHS • E.g. StmtList → StmtList ; Stmt • A → A α | β • (Fig. 5.15 & Fig. 5.16)
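
A common textbook way to remove immediate left recursion rewrites A → A α | β as A → β A', A' → α A' | λ. The sketch below implements that rewrite; it may differ from the procedure in Fig. 5.15, and the function name, the `_rest` suffix, and the non-recursive alternative `StmtList → Stmt` added to the slide's example are assumptions.

```python
def eliminate_immediate_left_recursion(a, productions):
    """Rewrite A -> A α | β as A -> β A_rest, A_rest -> α A_rest | λ."""
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == a]
    betas = [rhs for rhs in productions if not rhs or rhs[0] != a]
    if not alphas:
        return {a: productions}           # no left recursion, nothing to do
    rest = a + "_rest"                    # hypothetical fresh nonterminal
    return {
        a: [beta + [rest] for beta in betas],
        rest: [alpha + [rest] for alpha in alphas] + [[]],   # [] stands for λ
    }

# The slide's left-recursive production plus an assumed base case StmtList -> Stmt.
rules = eliminate_immediate_left_recursion("StmtList", [
    ["StmtList", ";", "Stmt"],
    ["Stmt"],
])
for lhs, productions in rules.items():
    for rhs in productions:
        print(lhs, "->", " ".join(rhs) or "λ")
```

The rewritten grammar generates the same strings, but a top-down parser can now consume a `Stmt` before deciding whether another `; Stmt` follows, instead of calling the procedure for `StmtList` from itself without consuming any input.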

  22. A Non-LL(1) Language • Almost all common programming language constructs can be specified by LL(1) grammars • One exception: if-then-else (the dangling else problem) • Can be resolved by mandating that each else is matched to its closest unmatched then • (Fig. 5.17)

  23. Ambiguous (Chap. 6) • E.g. if expr then if expr then other else other • if expr then { if expr then other else other } • if expr then { if expr then other } else other • -> at least two distinct parses • Dangling bracket language (DBL) • DBL = { [^i ]^j | i ≥ j ≥ 0 } • if expr then Stmt -> [ (opening bracket) • else Stmt -> ] (optional closing bracket)

  24. Fig. 5.18(a) • S → [ S CL | λ • CL → ] | λ • E.g. [[] • Fig. 5.18(b) • S → [ S | T • T → [ T ] | λ
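
The closest-match rule from slide 22 can be applied to grammar (a) by always preferring CL → ] when a closing bracket is the next symbol. The recognizer below is a sketch of that idea; the function `in_dbl` and its internal helpers are hypothetical, and `"$"` is an assumed end-of-input marker.

```python
def in_dbl(text):
    """Recognize DBL = { [^i ]^j | i >= j >= 0 } using S -> [ S CL | λ, CL -> ] | λ."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else "$"

    def parse_S():
        nonlocal pos
        if peek() == "[":                 # S -> [ S CL
            pos += 1
            parse_S()
            parse_CL()
        # otherwise S -> λ

    def parse_CL():
        nonlocal pos
        if peek() == "]":                 # prefer CL -> ] : bind to the nearest "["
            pos += 1
        # otherwise CL -> λ

    parse_S()
    return pos == len(text)               # accept only if all input was consumed

for sample in ["", "[[]", "[[]]", "[]]", "]["]:
    print(repr(sample), in_dbl(sample))
```

The preference in `parse_CL` is a policy layered on top of the grammar, not something the grammar's predict sets provide: `]` predicts both CL → ] and CL → λ, so the grammar itself is not LL(1).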

  25. It’s not LL(k) • [ ∈ Predict(S → [ S), [ ∈ Predict(S → T) • [[ ∈ Predict2(S → [ S), [[ ∈ Predict2(S → T) • … • [^k ∈ Predictk(S → [ S), [^k ∈ Predictk(S → T)

  26. Properties of LL(1) Parsers • A correct, leftmost parse is constructed • All grammars in LL(1) are unambiguous • All table-driven LL(1) parsers operate in linear time and space with respect to the length of the parsed input

  27. Thanks for Your Attention!
