
Languages and Compilers (SProg og Oversættere)

Languages and Compilers (SProg og Oversættere): Parsing. Describe the purpose of the parser. Discuss top-down vs. bottom-up parsing. Explain the necessary conditions for constructing recursive descent parsers. Discuss the construction of an RD parser from a grammar.

tawny

Presentation Transcript


  1. Languages and Compilers (SProg og Oversættere): Parsing

  2. Parsing • Describe the purpose of the parser • Discuss top-down vs. bottom-up parsing • Explain the necessary conditions for constructing recursive descent parsers • Discuss the construction of an RD parser from a grammar

  3. Top-Down vs Bottom-Up parsing • LL analysis (top-down): derivation, with look-ahead • LR analysis (bottom-up): reduction, with look-ahead

  4. Development of Recursive Descent Parser
     (1) Express the grammar in EBNF
     (2) Grammar transformations: left factorization and left recursion elimination
     (3) Create a parser class with
         • a private variable currentToken
         • methods to call the scanner: accept and acceptIt
     (4) Implement private parsing methods:
         • add a private parseN method for each non-terminal N
         • add a public parse method that gets the first token from the scanner and calls parseS (S is the start symbol of the grammar)
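Step (2) can be illustrated with a small sketch. The grammar here is a hypothetical example, not one from the slides: the left-recursive rule Expr ::= Expr + Num | Num would make parseExpr call itself without consuming any input, so it is rewritten in EBNF as Expr ::= Num (+ Num)* and parsed with a loop. The class name SumParser, the use of plain strings as tokens, and the eval helper are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical grammar (not from the slides):
//   before: Expr ::= Expr '+' Num | Num     (left-recursive)
//   after:  Expr ::= Num ('+' Num)*         (EBNF, suitable for recursive descent)
final class SumParser {
    private static final String EOF = "<eof>";
    private final List<String> tokens;
    private int pos;
    private String currentToken;

    SumParser(List<String> tokens) {
        this.tokens = tokens;
    }

    private String tokenAt(int i) {
        return i < tokens.size() ? tokens.get(i) : EOF;
    }

    // acceptIt: advance to the next token unconditionally.
    private void acceptIt() {
        pos++;
        currentToken = tokenAt(pos);
    }

    // accept: advance only if the current token is the expected one.
    private void accept(String expected) {
        if (!currentToken.equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + currentToken);
        }
        acceptIt();
    }

    private int parseNum() {
        int value;
        try {
            value = Integer.parseInt(currentToken);
        } catch (NumberFormatException e) {
            throw new IllegalStateException("expected a number but found " + currentToken);
        }
        acceptIt();
        return value;
    }

    // Expr ::= Num ('+' Num)*  -- the repetition becomes a while loop.
    private int parseExpr() {
        int sum = parseNum();
        while (currentToken.equals("+")) {
            acceptIt();
            sum += parseNum();
        }
        return sum;
    }

    // Public entry point: fetch the first token, parse the start symbol, require end of input.
    int parse() {
        pos = 0;
        currentToken = tokenAt(0);
        int result = parseExpr();
        accept(EOF);
        return result;
    }

    // Convenience helper (an assumption): tokens are separated by whitespace.
    static int eval(String input) {
        return new SumParser(Arrays.asList(input.trim().split("\\s+"))).parse();
    }
}
```

For instance, SumParser.eval("1 + 2 + 3") walks the loop twice, while "1 + + 2" is rejected with a syntax error inside parseNum.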

  5. Recursive Descent Parsing
     Sentence ::= Subject Verb Object .
     Subject  ::= I | a Noun | the Noun
     Object   ::= me | a Noun | the Noun
     Noun     ::= cat | mat | rat
     Verb     ::= like | is | see | sees
     Define a procedure parseN for each non-terminal N:
     private void parseSentence();
     private void parseSubject();
     private void parseObject();
     private void parseNoun();
     private void parseVerb();

  6. Recursive Descent Parsing
     public class MicroEnglishParser {
         private TerminalSymbol currentTerminal;
         // Auxiliary methods will go here ...
         // Parsing methods will go here ...
     }

  7. Recursive Descent Parsing: Auxiliary Methods
     public class MicroEnglishParser {
         private TerminalSymbol currentTerminal;
         private void accept(TerminalSymbol expected) {
             if (currentTerminal matches expected)
                 currentTerminal = next input terminal;
             else
                 report a syntax error
         }
         ...
     }

  8. Recursive Descent Parsing: Parsing Methods
     Sentence ::= Subject Verb Object .
     private void parseSentence() {
         parseSubject();
         parseVerb();
         parseObject();
         accept('.');
     }

  9. Recursive Descent Parsing: Parsing Methods
     Subject ::= I | a Noun | the Noun
     private void parseSubject() {
         if (currentTerminal matches 'I')
             accept('I');
         else if (currentTerminal matches 'a') {
             accept('a');
             parseNoun();
         } else if (currentTerminal matches 'the') {
             accept('the');
             parseNoun();
         } else
             report a syntax error
     }

  10. Recursive Descent Parsing: Parsing Methods
      Noun ::= cat | mat | rat
      private void parseNoun() {
          if (currentTerminal matches 'cat')
              accept('cat');
          else if (currentTerminal matches 'mat')
              accept('mat');
          else if (currentTerminal matches 'rat')
              accept('rat');
          else
              report a syntax error
      }
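The slides give the parsing methods in pseudocode and do not show parseVerb or parseObject. A runnable sketch of the whole Micro-English parser might look as follows. It assumes tokens are plain whitespace-separated strings (the slides use a TerminalSymbol type that is not defined there); the class name MicroEnglishSketch, the SyntaxError exception, the end marker, and the parses helper are all hypothetical, and parseVerb and parseObject are filled in by analogy with parseSubject and parseNoun.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: terminals are modelled as Strings, which is an assumption.
final class MicroEnglishSketch {
    static final class SyntaxError extends RuntimeException {
        SyntaxError(String message) { super(message); }
    }

    private static final String END = "<end>";   // hypothetical end-of-input marker
    private final List<String> input;
    private int pos;
    private String currentTerminal;

    private MicroEnglishSketch(List<String> input) { this.input = input; }

    private String terminalAt(int i) { return i < input.size() ? input.get(i) : END; }

    private void acceptIt() { pos++; currentTerminal = terminalAt(pos); }

    private void accept(String expected) {
        if (!currentTerminal.equals(expected)) {
            throw new SyntaxError("expected '" + expected + "' but found '" + currentTerminal + "'");
        }
        acceptIt();
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept(".");
    }

    // Subject ::= I | a Noun | the Noun
    private void parseSubject() {
        switch (currentTerminal) {
            case "I": acceptIt(); break;
            case "a": case "the": acceptIt(); parseNoun(); break;
            default: throw new SyntaxError("unexpected subject: " + currentTerminal);
        }
    }

    // Object ::= me | a Noun | the Noun   (written by analogy with parseSubject)
    private void parseObject() {
        switch (currentTerminal) {
            case "me": acceptIt(); break;
            case "a": case "the": acceptIt(); parseNoun(); break;
            default: throw new SyntaxError("unexpected object: " + currentTerminal);
        }
    }

    // Noun ::= cat | mat | rat
    private void parseNoun() {
        switch (currentTerminal) {
            case "cat": case "mat": case "rat": acceptIt(); break;
            default: throw new SyntaxError("unexpected noun: " + currentTerminal);
        }
    }

    // Verb ::= like | is | see | sees   (written by analogy with parseNoun)
    private void parseVerb() {
        switch (currentTerminal) {
            case "like": case "is": case "see": case "sees": acceptIt(); break;
            default: throw new SyntaxError("unexpected verb: " + currentTerminal);
        }
    }

    // Returns true if the whole input is one sentence of the grammar.
    static boolean parses(String sentence) {
        MicroEnglishSketch p = new MicroEnglishSketch(Arrays.asList(sentence.trim().split("\\s+")));
        p.pos = 0;
        p.currentTerminal = p.terminalAt(0);
        try {
            p.parseSentence();
            p.accept(END);
            return true;
        } catch (SyntaxError e) {
            return false;
        }
    }
}
```

With this sketch, "I like the cat ." and "the cat sees me ." are accepted, while "I like cat ." is rejected because parseObject finds neither me, a, nor the. The grammar is purely syntactic, so a sentence such as "I sees a mat ." is also accepted.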

  11. LL(1) Grammars • The presented algorithm for converting EBNF into a parser does not work for all possible grammars. • It only works for so-called “LL(1)” grammars. • Basically, an LL(1) grammar is a grammar that can be parsed by a top-down parser with a lookahead (in the input stream of tokens) of one token. • Which grammars are LL(1)? How can we recognize that a grammar is (or is not) LL(1)? • We can deduce the necessary conditions from the parser generation algorithm. • We can use a formal definition.

  12. LL(1) Grammars
      parse X* is implemented as:
      while (currentToken.kind is in starters[X]) {
          parse X
      }
      Condition: starters[X] must be disjoint from the set of tokens that can immediately follow X*.
      parse X|Y is implemented as:
      switch (currentToken.kind) {
          cases in starters[X]: parse X; break;
          cases in starters[Y]: parse Y; break;
          default: report a syntax error
      }
      Condition: starters[X] and starters[Y] must be disjoint sets.
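Both conditions come down to checking that sets of token kinds do not overlap. The following is a minimal sketch of such a check, assuming the starter and follower sets have already been worked out by hand and that token kinds are plain strings. The class StarterCheck and its example methods are hypothetical; the Subject example uses the Micro-English grammar from the earlier slides, and the other two grammars are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StarterCheck {
    private static Set<String> setOf(String... kinds) {
        return new HashSet<>(Arrays.asList(kinds));
    }

    // Condition for X | Y | ...: no token kind may start more than one alternative.
    static boolean pairwiseDisjoint(List<Set<String>> starterSets) {
        Set<String> seen = new HashSet<>();
        for (Set<String> starters : starterSets) {
            for (String kind : starters) {
                if (!seen.add(kind)) {
                    return false;   // this kind already starts another alternative
                }
            }
        }
        return true;
    }

    // Condition for X*: starters[X] must not overlap the tokens that can follow X*.
    static boolean loopIsDecidable(Set<String> startersOfX, Set<String> followers) {
        return Collections.disjoint(startersOfX, followers);
    }

    // Subject ::= I | a Noun | the Noun  -> starter sets {I}, {a}, {the}
    static boolean microEnglishSubject() {
        return pairwiseDisjoint(Arrays.asList(setOf("I"), setOf("a"), setOf("the")));
    }

    // Hypothetical Stmt ::= id := Expr | id ( Args )  -> both alternatives start with id
    static boolean commonPrefixExample() {
        return pairwiseDisjoint(Arrays.asList(setOf("id"), setOf("id")));
    }

    // Hypothetical List ::= (a Noun)* a  -> starters[X] = {a}, followers = {a}
    static boolean ambiguousLoopExample() {
        return loopIsDecidable(setOf("a"), setOf("a"));
    }
}
```

The Subject rule passes the check, whereas the two hypothetical rules fail: with one token of lookahead the parser cannot tell which alternative to take, or whether to stay in the loop. The first failure is the kind of problem that left factorization is meant to remove.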

  13. Formal definition of LL(1)
      A grammar G is LL(1) iff, for each set of productions M ::= X1 | X2 | … | Xn:
      1. starters[X1], starters[X2], …, starters[Xn] are all pairwise disjoint.
      2. If Xi =>* ε then starters[Xj] ∩ follow[M] = Ø, for 1 ≤ j ≤ n, j ≠ i.
      If G is ε-free then condition 1 is sufficient.
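The definition can be checked mechanically once the starters and follow sets are known. The sketch below computes them with a simple fixed-point iteration over a plain BNF grammar and then tests the two conditions. Everything about the representation is an assumption: the class LL1Check, rules given as space-separated strings, "" standing for ε, "$" standing for end of input, and the two small example grammars, which are not from the slides.

```java
import java.util.*;

final class LL1Check {
    private final Map<String, List<List<String>>> rules = new LinkedHashMap<>();
    private final Set<String> nullable = new HashSet<>();            // non-terminals deriving ε
    private final Map<String, Set<String>> starters = new HashMap<>();
    private final Map<String, Set<String>> follow = new HashMap<>();

    // Adds M ::= alt1 | alt2 | ...; symbols are space-separated, "" is the empty alternative.
    LL1Check rule(String lhs, String... alternatives) {
        List<List<String>> alts = new ArrayList<>();
        for (String alt : alternatives) {
            alts.add(alt.isEmpty() ? Collections.<String>emptyList() : Arrays.asList(alt.split(" ")));
        }
        rules.put(lhs, alts);
        starters.put(lhs, new HashSet<>());
        follow.put(lhs, new HashSet<>());
        return this;
    }

    private boolean derivesEmpty(List<String> symbols) {
        for (String s : symbols) if (!nullable.contains(s)) return false;
        return true;
    }

    private Set<String> startersOf(List<String> symbols) {
        Set<String> result = new HashSet<>();
        for (String s : symbols) {
            if (!rules.containsKey(s)) { result.add(s); break; }     // terminal symbol
            result.addAll(starters.get(s));
            if (!nullable.contains(s)) break;
        }
        return result;
    }

    // Grows nullable, starters and follow until nothing changes any more.
    LL1Check computeSets(String start) {
        follow.get(start).add("$");
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : rules.entrySet()) {
                String lhs = e.getKey();
                for (List<String> alt : e.getValue()) {
                    if (derivesEmpty(alt)) changed |= nullable.add(lhs);
                    changed |= starters.get(lhs).addAll(startersOf(alt));
                    for (int i = 0; i < alt.size(); i++) {
                        if (!rules.containsKey(alt.get(i))) continue;
                        List<String> rest = alt.subList(i + 1, alt.size());
                        Set<String> f = follow.get(alt.get(i));
                        changed |= f.addAll(startersOf(rest));
                        if (derivesEmpty(rest)) changed |= f.addAll(new HashSet<>(follow.get(lhs)));
                    }
                }
            }
        }
        return this;
    }

    // Tests conditions 1 and 2 of the definition for every non-terminal.
    boolean isLL1() {
        for (Map.Entry<String, List<List<String>>> e : rules.entrySet()) {
            List<List<String>> alts = e.getValue();
            for (int i = 0; i < alts.size(); i++) {
                for (int j = 0; j < alts.size(); j++) {
                    if (i == j) continue;
                    Set<String> sj = startersOf(alts.get(j));
                    if (!Collections.disjoint(startersOf(alts.get(i)), sj)) return false;   // condition 1
                    if (derivesEmpty(alts.get(i))
                            && !Collections.disjoint(sj, follow.get(e.getKey()))) return false; // condition 2
                }
            }
        }
        return true;
    }

    // Hypothetical grammar: S ::= a B c, B ::= b B | ε   (follow[B] = {c})
    static boolean exampleOk() {
        return new LL1Check().rule("S", "a B c").rule("B", "b B", "").computeSets("S").isLL1();
    }

    // Hypothetical grammar: S ::= a B b, B ::= b | ε     (follow[B] = {b} clashes with starters {b})
    static boolean exampleConflict() {
        return new LL1Check().rule("S", "a B b").rule("B", "b", "").computeSets("S").isLL1();
    }
}
```

In the second grammar, condition 1 holds (the starter sets {b} and {} are disjoint) but condition 2 fails, which shows why condition 1 alone is only sufficient for ε-free grammars.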

  14. Converting EBNF into RD parsers • The conversion of an EBNF specification into a Java implementation for a recursive descent parser is so “mechanical” that it can easily be automated! • => JavaCC “Java Compiler Compiler”

  15. JavaCC and JJTree

  16. LR parsing
      • The algorithm makes use of a stack.
      • The first item on the stack is the initial state of a DFA.
      • A state of the automaton is a set of LR(0)/LR(1) items.
      • The initial state is constructed from productions of the form S ::= •a [, $] (where S is the start symbol of the CFG).
      • The stack contains, in alternating order:
        • a DFA state
        • a terminal symbol or a part (subtree) of the parse tree being constructed
      • The items on the stack are related by transitions of the DFA.
      • There are two basic actions in the algorithm:
        • shift: get the next input token
        • reduce: build a new node (remove its children from the stack)
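A minimal shift-reduce sketch shows how the stack and the two actions interact. It assumes a tiny hypothetical grammar, S ::= ( S ) | x, whose LR(0) automaton was built by hand and hard-coded below. To stay short, the stack holds only DFA states and no parse tree is built, so a reduce here only pops states and follows the goto transition. The class name TinyLrSketch and the state numbering are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hand-built LR(0) automaton for the hypothetical grammar
//   S' ::= S      S ::= ( S )      S ::= x
// State 0: S' ::= •S,  S ::= •(S),  S ::= •x
// State 1: S' ::= S•                 (accept at end of input)
// State 2: S ::= (•S), S ::= •(S),  S ::= •x
// State 3: S ::= x•                  (reduce, 1 symbol)
// State 4: S ::= (S•)
// State 5: S ::= (S)•                (reduce, 3 symbols)
final class TinyLrSketch {
    static boolean parses(String input) {
        String tokens = input.replace(" ", "") + "$";   // "$" marks the end of input
        Deque<Integer> states = new ArrayDeque<>();
        states.push(0);                                  // initial DFA state
        int pos = 0;
        while (true) {
            int state = states.peek();
            char token = tokens.charAt(pos);
            switch (state) {
                case 1:
                    return pos == tokens.length() - 1;   // accept only at the end marker
                case 3:
                    reduce(states, 1);                   // S ::= x
                    break;
                case 5:
                    reduce(states, 3);                   // S ::= ( S )
                    break;
                default:                                 // states 0, 2, 4: shift
                    int next = shiftTarget(state, token);
                    if (next < 0) return false;          // no transition: syntax error
                    states.push(next);
                    pos++;
            }
        }
    }

    // reduce: pop one state per right-hand-side symbol, then take the goto transition on S.
    private static void reduce(Deque<Integer> states, int rhsLength) {
        for (int i = 0; i < rhsLength; i++) states.pop();
        states.push(states.peek() == 0 ? 1 : 4);         // goto(0, S) = 1, goto(2, S) = 4
    }

    // shift: DFA transition on a terminal, or -1 if there is none.
    private static int shiftTarget(int state, char token) {
        if (state == 0 || state == 2) {
            if (token == '(') return 2;
            if (token == 'x') return 3;
        } else if (state == 4 && token == ')') {
            return 5;
        }
        return -1;
    }
}
```

For the input "((x))" the parser shifts three tokens, reduces x to S, and then alternates shifting a closing parenthesis with reducing ( S ) until it reaches state 1 at the end of the input.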

  17. JavaCUP: A LALR generator for Java • Definition of tokens (regular expressions) → JFlex → Java file: scanner class (recognizes tokens) • Grammar (BNF-like specification) → JavaCUP → Java file: parser class (uses the scanner to get tokens, parses the stream of tokens) • Together the scanner and parser classes form the syntactic analyzer

  18. Steps to build a compiler with SableCC • Create a SableCC specification file • Call SableCC • Create one or more working classes, possibly inheriting from classes generated by SableCC • Create a Main class that activates the lexer, parser and working classes • Compile with javac

  19. Hierarchy
