1 / 34

Syntax Analysis

Syntax Analysis. Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc01.html Textbook:Modern Compiler Implementation in C Chapter 3. A motivating example. Create a desk calculator Challenges Non trivial syntax Recursive expressions (semantics) Operator precedence.

grady-fox
Download Presentation

Syntax Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc01.html Textbook:Modern Compiler Implementation in C Chapter 3

  2. A motivating example • Create a desk calculator • Challenges • Non trivial syntax • Recursive expressions (semantics) • Operator precedence

  3. Solution (lexical analysis) /* desk.l */ %% [0-9]+ { yylval = atoi(yytext); return NUM ;} “+” { return PLUS ;} “-” { return MINUS ;} “/” { return DIV ;} “*” { return MUL ;} “(“ { return LPAR ;} “)” { return RPAR ;} “//”.*[\n] ; /* comment */ [\t\n ] ; /* whitespace */ . { error (“illegal symbol”, yytext[0]); }

  4. Solution (syntax analysis) /* desk.y */ %token NUM %left PLUS, MINUS %left MUL, DIV % start P %% P : E {printf(“%d\n”, $1) ;} E : NUM { $$ = $1 ;} | LPAR e RPAR { $$ = $2; } | e PLUS e { $$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e {$$ = $1 / $3; } ; %% #include “lex.yy.c” flex desk.l bison desk.y cc y.tab.c –ll -ly

  5. Solution (syntax analysis) a.out <input // input 7 + 5 * 3 22

  6. Subjects • The task of syntax analysis • Automatic generation • Error handling • Context free grammars and derivations • Ambiguous Grammars • Predictive Parsing (sketch only) • Bottom-up syntax analysis • Bison A parser generator Next week

  7. Basic Compiler Phases Source program (string) Front-End lexical analysis Tokens syntax analysis Abstract syntax tree semantic analysis Back-End Fin. Assembly

  8. Syntax Analysis (Parsing) • input • Sequence of tokens • output • Abstract Syntax Tree • Report syntax errors • unbalanced parenthesizes • [Create “symbol-table” ] • [Create pretty-printed version of the program] • In some cases the tree need not be generated (one-pass compilers)

  9. Handling Syntax Errors • Report and locate the error • Diagnose the error • Correct the error • Recover from the error in order to discover more errors • without reporting too many “strange” errors

  10. Example a := a * ( b + c * d ;

  11. The Valid Prefix Property • For every prefix tokens • t1, t2, …,ti that the parser identifies as legal: • there exists tokensti+1, ti+2, …, tnsuch thatt1, t2, …, tnis a syntactically valid program • If every token is considered as a single character: • For every prefix word u that the parser identifies as legal: • there exists a word w such that • u.w is a valid program

  12. Error Diagnosis • Line number • may be far from the actual error • The current token • The expected tokens • Parser configuration

  13. Error Recovery • Becomes less important in interactive environments • Example heuristics: • Search for a semi-column and ignore the statement • Try to “replace” tokens for common errors • Refrain from reporting 3 subsequent errors • Globally optimal solutions • For every input w, find a valid program w’ with a “minimal-distance” from w

  14. Designing a parser language design context-free grammar design Bison parser (c program)

  15. What is a grammar Derivations and Parsing Trees Ambiguous grammars Resolving ambiguity Context Free Grammar (Review)

  16. Non-terminals Start non-terminal Terminals (tokens) Context Free Rules<Non-Terminal>  Symbol Symbol … Symbol Context Free Grammars

  17. Example Context Free Grammar 1 S  S ; S 2 S  id := E 3 S  print (L) 4 E  id 5 E  num 6 E  E + E 7 E  (S, E) 8 L  E 9 L  L, E

  18. Show that a sentence is in the grammar (valid program) Start with the start symbol Repeatedly replace one of the non-terminals by a right-hand side of a production Stop when the sentence contains terminals only Rightmost derivation Leftmost derivation Derivations

  19. Example Derivations S S ; S 1 S  S ; S 2 S  id := E 3 S  print (L) 4 E  id 5 E  num 6 E  E + E 7 E  (S, E) 8 L  E 9 L  L, E S ; id := E id := E ; id := E id := num ; id := E id := num ; id := E + E id := num ; id := E + num id := num ; id :=num + num a 56 b 77 16

  20. The trace of a derivation Every internal node is labeled by a non-terminal Each symbol is connected to the deriving non-terminal Parse Trees

  21. Example Parse Tree S s S ; S s s ; S ; id := E id := E ; id := E id := E id := E id := num ; id := E id := E id := E id := num ; id := E + E id := num ; id := E + num num num id := num ; id :=num + num

  22. Two leftmost derivations Two rightmost derivations Two parse trees Ambiguous Grammars

  23. A Grammar for Arithmetic Expressions 1 E  E + E 2 E  E * E 3 E  id 4 E  (E)

  24. Non Ambiguous Grammarfor Arithmetic Expressions Ambiguous grammar • E  E + T • E  T • T  T * F • T  F • 5 F  id • 6 F  (E) 1 E  E + E 2 E  E * E 3 E  id 4 E  (E)

  25. Non Ambiguous Grammarsfor Arithmetic Expressions Ambiguous grammar • E  E + T • E  T • T  T * F • T  F • 5 F  id • 6 F  (E) • E  E * T • E  T • T  F + T • T  F • 5 F  id • 6 F  (E) 1 E  E + E 2 E  E * E 3 E  id 4 E  (E)

  26. Pushdown automata Deterministic Report an error as soon as the input is not a prefix of a valid program Not usable for all context free grammars context free grammar parser tokens Efficient Parsers bison “Ambiguity errors” parse tree

  27. Top-Down (Predictive Parsing) LL Construct parse tree in a top-down matter Find the leftmost derivation For every non-terminal and token predict the next production Bottom-Up LR Construct parse tree in a bottom-up manner Find the rightmost derivation in a reverse order For every potential right hand side and token decide when a production is found Kinds of Parsers

  28. Example Grammar for Predictive Parsing 1 S  if E then S else S 2 S  begin S L 3 S  print (E) 4 L  end 5 L  ; S L 6 E  num = num

  29. Predictive Parser(utility functions) enum token IF, THEN, ELSE, BEGIN, END, PRINT, SEMI, NUM, EQ, LP, RP; extern enum token getToken(void); void advance(void) { tok= getToken(); } void eat(enum token t) { if (tok==t) advance(); else error(); }

  30. Predictive Parser (S) void S(void) {switch(tok) { case IF: eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break; case BEGIN: eat(BEGIN); S(); L(); break; case PRINT: eat(PRINT); eat(LP); E(); eat(RP); break; default: error(tok, “Expecting if, begin, or print”'); } } 1 S  if E then S else S 2 S  begin S L 3 S  print (E)

  31. Predictive Parser (L) void L(void) {switch(tok) { case END: eat(END); break; case SEMI: eat(SEMI); S(); L(); break; default: error(tok, “Expecting end or semicolumn''); } } 4 L  end 5 L  ; S L

  32. Predictive Parser (E) void E(void) {switch(tok) { case NUM: eat(NUM); eat(EQ) eat(NUM); break; default: error(tok, “Expecting a number”); } } 6 E  num = num

  33. Predictive Parser for Arithmetic Expressions • Grammar • C-code? • E  E + T • E  T • T  T * F • T  F • 5 F  id • 6 F  (E)

  34. Summary • Context free grammars provide a natural way to define the syntax of programming languages • Ambiguity may be resolved • Predictive parsing is natural • Good error messages • Natural error recovery • But not expressive enough • Next lesson LR bottom-up parsing

More Related