
Chapter 2

Chapter 2. Chang Chi-Chung, 2008.03 rev.1. A Simple Syntax-Directed Translator. This chapter contains introductory material for Chapters 3 to 8 and builds a syntax-directed translator that maps infix arithmetic expressions into postfix expressions.


Presentation Transcript


  1. Chapter 2 Chang Chi-Chung 2008.03 rev.1

  2. A Simple Syntax-Directed Translator
     • This chapter contains introductory material for Chapters 3 to 8.
     • Goal: create a syntax-directed translator that maps infix arithmetic expressions into postfix expressions.
     • Building a simple compiler involves:
        • Defining the syntax of a programming language
        • Developing a source-code parser: for our compiler we will use predictive parsing
        • Implementing syntax-directed translation to generate intermediate code

  3. A Code Fragment To Be Translated
     • The syntax-directed translator is later extended to map code fragments into three-address code (see Appendix A).
     • Source fragment:
          {
             int i; int j; float[100] a; float v; float x;
             while ( true ) {
                do i = i + 1; while ( a[i] < v );
                do j = j - 1; while ( a[j] > v );
                if ( i >= j ) break;
                x = a[i]; a[i] = a[j]; a[j] = x;
             }
          }
     • Three-address code:
           1: i = i + 1
           2: t1 = a [ i ]
           3: if t1 < v goto 1
           4: j = j - 1
           5: t2 = a [ j ]
           6: if t2 > v goto 4
           7: ifFalse i >= j goto 9
           8: goto 14
           9: x = a [ i ]
          10: t3 = a [ j ]
          11: a [ i ] = t3
          12: a [ j ] = x
          13: goto 1
          14:

  4. A Model of a Compiler Front End
     • Source program (character stream) → Lexical analyzer → token stream → Parser → syntax tree → Intermediate code generator → three-address code
     • The symbol table is shared by the phases.

  5. Two Forms of Intermediate Code
     • Abstract syntax trees
          do-while
             body
                assign
                   i
                   +
                      i
                      1
             >
                [ ]
                   a
                   i
                v
     • Three-address instructions
          1: i = i + 1
          2: t1 = a [ i ]
          3: if t1 < v goto 1

  6. Syntax Definition
     • Syntax is defined with a context-free grammar (CFG), written in BNF (Backus-Naur Form).
     • A context-free grammar has four components:
        • A set of tokens (terminal symbols)
        • A set of nonterminals
        • A set of productions
        • A designated start symbol

  7. Example of CFG
     • G = <T, N, P, S>
     • T = { +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
     • N = { list, digit }
     • P =
        • list → list + digit
        • list → list - digit
        • list → digit
        • digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
     • S = list

  8. Derivations
     • The language of a CFG is the set of all strings (sequences of tokens) that can be derived from the start symbol.
     • A derivation proceeds as follows:
        • Begin with the start symbol.
        • Repeatedly replace a nonterminal in the current sentential form with one of the right-hand sides of a production for that nonterminal.

  9. Example of the Derivations
     • Productions
        • list → list + digit
        • list → list - digit
        • list → digit
        • digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
     • Derivation of 9 - 5 + 2
          list ⇒ list + digit
               ⇒ list - digit + digit
               ⇒ digit - digit + digit
               ⇒ 9 - digit + digit
               ⇒ 9 - 5 + digit
               ⇒ 9 - 5 + 2
     • Leftmost derivation: replaces the leftmost nonterminal in each step (the derivation above is leftmost).
     • Rightmost derivation: replaces the rightmost nonterminal in each step.
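The leftmost derivation of 9 - 5 + 2 can be replayed mechanically. The following is a minimal sketch, not part of the slides: the class name `LeftmostDerivation` and the methods `step` and `derive` are hypothetical, and the sequence of productions is hard-coded rather than discovered by a parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LeftmostDerivation {
    // Nonterminals of grammar G; every other symbol is treated as a token.
    static final List<String> NONTERMINALS = Arrays.asList("list", "digit");

    // One leftmost step: the leftmost nonterminal must be `head`,
    // and it is replaced by the production body `body`.
    static List<String> step(List<String> form, String head, String... body) {
        List<String> next = new ArrayList<>();
        boolean replaced = false;
        for (String symbol : form) {
            if (!replaced && NONTERMINALS.contains(symbol)) {
                if (!symbol.equals(head)) {
                    throw new IllegalArgumentException("leftmost nonterminal is " + symbol);
                }
                next.addAll(Arrays.asList(body));
                replaced = true;
            } else {
                next.add(symbol);
            }
        }
        if (!replaced) {
            throw new IllegalArgumentException("no nonterminal left to replace");
        }
        return next;
    }

    // Returns every sentential form in the leftmost derivation of 9 - 5 + 2.
    static List<String> derive() {
        // Each row is a production: head first, then the body symbols.
        String[][] productions = {
            {"list", "list", "+", "digit"},
            {"list", "list", "-", "digit"},
            {"list", "digit"},
            {"digit", "9"},
            {"digit", "5"},
            {"digit", "2"},
        };
        List<String> trace = new ArrayList<>();
        List<String> form = Arrays.asList("list");
        trace.add(String.join(" ", form));
        for (String[] p : productions) {
            form = step(form, p[0], Arrays.copyOfRange(p, 1, p.length));
            trace.add(String.join(" ", form));
        }
        return trace;
    }

    public static void main(String[] args) {
        for (String form : derive()) {
            System.out.println(form);
        }
    }
}
```

Running `main` prints one sentential form per line, starting with `list` and ending with `9 - 5 + 2`.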

  10. Parse Trees
      • Given a CFG, a parse tree according to the grammar is a tree with the following properties:
         • The root of the tree is labeled by the start symbol.
         • Each leaf of the tree is labeled by a terminal (= token) or by ε.
         • Each interior node is labeled by a nonterminal.
         • If A → X1 X2 … Xn is a production, then a node labeled A has immediate children X1, X2, …, Xn, where each Xi is a terminal, a nonterminal, or ε (ε denotes the empty string).
      • Example: for the production A → X Y Z, the node A has the children X, Y, and Z.

  11. Example of the Parse Tree
      • Parse tree of the string 9-5+2 using grammar G:
           list
              list
                 list
                    digit
                       9
                 -
                 digit
                    5
              +
              digit
                 2
      • The sequence of leaves, read from left to right, is called the yield of the parse tree: 9 - 5 + 2

  12. Ambiguity
      • Consider the following context-free grammar:
           G = <{ +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }, { string }, P, string>
           P = string → string + string | string - string | 0 | 1 | … | 9
      • This grammar is ambiguous, because more than one parse tree represents the string 9-5+2.

  13. Ambiguity (Cont'd)
      • Figure: two parse trees for 9 - 5 + 2 under the ambiguous grammar. One groups the string as (9 - 5) + 2, the other as 9 - (5 + 2).
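To see why the ambiguity matters, the two groupings can be evaluated. The sketch below is illustrative only: the class `AmbiguousGrammar` and its `values` methods are hypothetical names, and the input is assumed to alternate single digits with + or - and contain no spaces. It tries every operator as the root of a parse tree and collects the resulting values.

```java
import java.util.Set;
import java.util.TreeSet;

class AmbiguousGrammar {
    // Values of all parse trees of s[lo..hi] under the ambiguous grammar
    //   string -> string + string | string - string | 0 | ... | 9
    static Set<Integer> values(String s, int lo, int hi) {
        Set<Integer> result = new TreeSet<>();
        if (lo == hi) {                          // string -> digit
            result.add(s.charAt(lo) - '0');
            return result;
        }
        // Operators sit at odd offsets from lo; each one may be the root.
        for (int op = lo + 1; op < hi; op += 2) {
            for (int left : values(s, lo, op - 1)) {
                for (int right : values(s, op + 1, hi)) {
                    result.add(s.charAt(op) == '+' ? left + right : left - right);
                }
            }
        }
        return result;
    }

    static Set<Integer> values(String s) {
        return values(s, 0, s.length() - 1);
    }

    public static void main(String[] args) {
        // (9-5)+2 = 6 and 9-(5+2) = 2
        System.out.println(values("9-5+2"));
    }
}
```

The set holds distinct values, not trees, so two different parse trees that happen to evaluate to the same number are counted once.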

  14. Associativity of Operators
      • Left-associative
         • If an operand has an operator on both sides of it, it belongs to the operator on its left.
         • The string a+b+c has the same meaning as (a+b)+c.
         • Left-associative operators have left-recursive productions:
              left → left + term | term
      • Right-associative
         • If an operand has an operator on both sides of it, it belongs to the operator on its right.
         • The string a=b=c has the same meaning as a=(b=c).
         • Right-associative operators have right-recursive productions:
              right → term = right | term

  15. Associativity of Operators (cont'd)
      • Figure: parse trees for a + b + c (left-associative) and a = b = c (right-associative). The left-associative tree grows down toward the left; the right-associative tree grows down toward the right.
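The effect of associativity is easiest to see with an operator for which grouping changes the result. The sketch below is an illustration, not taken from the slides: it uses subtraction instead of + and =, the names `Associativity`, `evalLeft`, and `evalRight` are hypothetical, and the input is assumed to be single digits separated by - with no spaces.

```java
class Associativity {
    // left -> left - term | term
    // Folding from the left gives 9-5-2 = (9-5)-2.
    static int evalLeft(String s) {
        int value = s.charAt(0) - '0';
        for (int i = 1; i < s.length(); i += 2) {
            value -= s.charAt(i + 1) - '0';
        }
        return value;
    }

    // right -> term - right | term
    // Recursing on the right gives 9-5-2 = 9-(5-2).
    static int evalRight(String s) {
        int first = s.charAt(0) - '0';
        if (s.length() == 1) {
            return first;
        }
        return first - evalRight(s.substring(2));
    }

    public static void main(String[] args) {
        System.out.println(evalLeft("9-5-2"));   // (9-5)-2
        System.out.println(evalRight("9-5-2"));  // 9-(5-2)
    }
}
```

The loop mirrors the left-recursive production and the recursive call mirrors the right-recursive one, which is why the two methods group the same string differently.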

  16. Precedence of Operators
      • The string 9+5*2 has the same meaning as 9+(5*2): * has higher precedence than +.
      • Construct a grammar for arithmetic expressions that encodes operator precedence, with one nonterminal per precedence level:
         • left-associative: + - (expr)
         • left-associative: * / (term)
      • Step 1: factor → digit | ( expr )
      • Step 2: term → term * factor | term / factor | factor
      • Step 3: expr → expr + term | expr - term | term
      • Step 4: the complete grammar
           expr → expr + term | expr - term | term
           term → term * factor | term / factor | factor
           factor → digit | ( expr )
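One way to check that the grammar encodes precedence is to evaluate expressions with one procedure per nonterminal. This is a sketch under assumptions: the class `PrecedenceEvaluator` is a hypothetical name, the left-recursive productions are implemented as loops (a standard workaround, since a recursive-descent procedure cannot call itself first), operands are single digits, and no whitespace is allowed.

```java
class PrecedenceEvaluator {
    private final String input;
    private int pos = 0;

    PrecedenceEvaluator(String input) {
        this.input = input;
    }

    // Evaluates a complete expression; rejects trailing input.
    static int evaluate(String s) {
        PrecedenceEvaluator p = new PrecedenceEvaluator(s);
        int value = p.expr();
        if (p.pos != s.length()) {
            throw new IllegalArgumentException("syntax error at position " + p.pos);
        }
        return value;
    }

    // Current character, or '\0' at end of input.
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    // expr -> expr + term | expr - term | term
    private int expr() {
        int value = term();
        while (peek() == '+' || peek() == '-') {
            char op = input.charAt(pos++);
            int right = term();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // term -> term * factor | term / factor | factor
    private int term() {
        int value = factor();
        while (peek() == '*' || peek() == '/') {
            char op = input.charAt(pos++);
            int right = factor();
            value = (op == '*') ? value * right : value / right;  // integer division
        }
        return value;
    }

    // factor -> digit | ( expr )
    private int factor() {
        char c = peek();
        if (Character.isDigit(c)) {
            pos++;
            return c - '0';
        }
        if (c == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        throw new IllegalArgumentException("syntax error at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("9+5*2"));    // 9+(5*2)
        System.out.println(evaluate("(9+5)*2"));
    }
}
```

Because `term` is called from inside `expr`, every * or / is grouped before the surrounding + or - is applied.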

  17. An Example: Syntax of Statements
      • The grammar is a subset of Java statements.
      • Semicolons appear only at the end of productions that do not end in stmt. This approach prevents the build-up of semicolons after statements such as if and while, which end with nested substatements.
           stmt → id = expression ;
                | if ( expression ) stmt
                | if ( expression ) stmt else stmt
                | while ( expression ) stmt
                | do stmt while ( expression ) ;
                | { stmts }
           stmts → stmts stmt
                 | ε

  18. Syntax-Directed Translation
      • Syntax-directed translation is done by attaching rules or program fragments to productions in a grammar.
      • In this chapter: translate infix expressions into postfix notation.
         • Infix: 9 - 5 + 2
         • Postfix: 9 5 - 2 +
      • An example
         • Production: expr → expr1 + term
         • Pseudocode of the translation:
              translate expr1 ;
              translate term ;
              handle + ;

  19. Syntax-Directed Translation (Cont'd)
      • Two concepts (approaches) related to syntax-directed translation:
         • Synthesized attributes (syntax-directed definition)
            • Build up a translation by attaching strings (semantic rules) as attributes to the nodes in the parse tree.
         • Translation schemes (syntax-directed translation)
            • Build up a translation by program fragments, called semantic actions, that are embedded within production bodies.

  20. Syntax-directed definition
      • A syntax-directed definition associates
         • with each grammar symbol (terminal or nonterminal), a set of attributes;
         • with each production, a set of semantic rules for computing the values of the attributes associated with the symbols appearing in the production.
      • An attribute is said to be
         • Synthesized, if its value at a parse-tree node is determined from attribute values at its children and at the node itself.
         • Inherited, if its value at a parse-tree node is determined from attribute values at the node itself, its parent, and its siblings in the parse tree.

  21. An Example: Synthesized Attributes
      • An annotated parse tree
         • Suppose a node N in a parse tree is labeled by the grammar symbol X.
         • X.a denotes the value of attribute a of X at node N.
      • Annotated parse tree for 9 - 5 + 2:
           expr.t = "95-2+"
              expr.t = "95-"
                 expr.t = "9"
                    term.t = "9"
                       9
                 -
                 term.t = "5"
                    5
              +
              term.t = "2"
                 2

  22. Semantic Rules

  23. Depth-First Traversals
      • Tree traversals
         • Breadth-first
         • Depth-first
            • Preorder: N L R
            • Inorder: L N R
            • Postorder: L R N
      • Depth-first traversal used here: postorder, visiting children from left to right
           procedure visit(node N) {
              for ( each child C of N, from left to right ) {
                 visit(C);
              }
              evaluate semantic rules at node N;
           }

  24. Example: Depth-First Traversals
      • Annotated parse tree for 9 - 5 + 2, evaluated in postorder:
           expr.t = 95-2+
              expr.t = 95-
                 expr.t = 9
                    term.t = 9
                       9
                 -
                 term.t = 5
                    5
              +
              term.t = 2
                 2
      • Note: all attributes are synthesized.
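The postorder traversal and the attribute t can be combined in a few lines. The sketch below is illustrative: `AnnotatedTree`, `Node`, `visit`, and `evaluate` are hypothetical names, and the semantic rules are assumed to be the usual postfix ones (concatenate the t values of the two operands, then append the operator), which agrees with the annotated tree shown on the slides.

```java
import java.util.Arrays;
import java.util.List;

class AnnotatedTree {
    static class Node {
        final String label;         // grammar symbol or token
        final List<Node> children;  // empty for leaves
        String t = "";              // synthesized attribute

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    // Postorder depth-first traversal: children left to right, then N itself.
    static void visit(Node n) {
        for (Node child : n.children) {
            visit(child);
        }
        evaluate(n);
    }

    // Semantic rules, assumed to be the usual postfix rules.
    static void evaluate(Node n) {
        if (n.children.isEmpty()) {
            n.t = n.label;                           // token: its own lexeme
        } else if (n.children.size() == 3) {         // expr -> expr1 op term
            n.t = n.children.get(0).t + n.children.get(2).t + n.children.get(1).label;
        } else {                                     // expr -> term, term -> digit
            n.t = n.children.get(0).t;
        }
    }

    // Builds the parse tree of 9 - 5 + 2 and returns expr.t at the root.
    static String postfixOfExample() {
        Node tree =
            new Node("expr",
                new Node("expr",
                    new Node("expr", new Node("term", new Node("9"))),
                    new Node("-"),
                    new Node("term", new Node("5"))),
                new Node("+"),
                new Node("term", new Node("2")));
        visit(tree);
        return tree.t;
    }

    public static void main(String[] args) {
        System.out.println(postfixOfExample());
    }
}
```

Postorder is sufficient here because every attribute is synthesized: by the time a node is evaluated, the values at all of its children are already available.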

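  25. Translation Schemes
      • A translation scheme is a CFG with embedded semantic actions.
      • Example
           rest → + term { print("+") } rest
      • In the parse tree, the embedded semantic action becomes an extra child of rest, placed between term and the nested rest:
           rest
              +
              term
              { print("+") }
              rest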

  26. An Example: Translation Scheme
      • Translation scheme
           expr → expr + term { print('+') }
           expr → expr - term { print('-') }
           expr → term
           term → 0 { print('0') }
           term → 1 { print('1') }
           …
           term → 9 { print('9') }
      • Parse tree with actions for 9 - 5 + 2:
           expr
              expr
                 expr
                    term
                       9 { print('9') }
                 -
                 term
                    5 { print('5') }
                 { print('-') }
              +
              term
                 2 { print('2') }
              { print('+') }

  27. Parsing
      • Parsing is the process of determining whether a string of terminals (tokens) can be generated by a grammar.
      • Time complexity:
         • For any CFG there is a parser that takes at most O(n³) time to parse a string of n terminals.
         • Linear algorithms suffice to parse essentially all languages that arise in practice.
      • Two kinds of methods
         • Top-down: constructs a parse tree from root to leaves
         • Bottom-up: constructs a parse tree from leaves to root

  28. Top-Down Parsing
      • Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is used to process the input.
         • One procedure is associated with each nonterminal of a grammar.
         • If a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input lookahead information.
      • Predictive parsing
         • A special form of recursive-descent parsing
         • The lookahead symbol unambiguously determines the flow of control through the procedure body for each nonterminal.

  29. An Example: Top-Down Parsing
      • Grammar
           stmt → expr ;
                | if ( expr ) stmt
                | for ( optexpr ; optexpr ; optexpr ) stmt
                | other
           optexpr → ε | expr
      • Parse tree for for ( ; expr ; expr ) other:
           stmt
              for
              (
              optexpr
                 ε
              ;
              optexpr
                 expr
              ;
              optexpr
                 expr
              )
              stmt
                 other

  30. Pseudocode For a Predictive Parser
      • Grammar (optexpr uses an ε-production)
           stmt → expr ;
                | if ( expr ) stmt
                | for ( optexpr ; optexpr ; optexpr ) stmt
                | other
           optexpr → ε | expr
      • Procedures
           void stmt() {
              switch ( lookahead ) {
              case expr:
                 match(expr); match(';'); break;
              case if:
                 match(if); match('('); match(expr); match(')'); stmt();
                 break;
              case for:
                 match(for); match('(');
                 optexpr(); match(';'); optexpr(); match(';'); optexpr();
                 match(')'); stmt(); break;
              case other:
                 match(other); break;
              default:
                 report("syntax error");
              }
           }
           void optexpr() {
              if ( lookahead == expr ) match(expr);
           }
           void match(terminal t) {
              if ( lookahead == t ) lookahead = nextTerminal;
              else report("syntax error");
           }

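  31. Example: Predictive Parsing
      • Input: for ( ; expr ; expr ) other, parsed with one symbol of lookahead (LL(1)).
      • Parse tree, grown top-down from stmt: stmt → for ( optexpr ; optexpr ; optexpr ) stmt
      • Calls made by stmt() as the lookahead advances over the input:
           match(for); match('('); optexpr(); match(';'); optexpr(); match(';'); optexpr(); match(')'); stmt();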

  32. FIRST
      • FIRST(α) is the set of terminals that appear as the first symbols of one or more strings generated from α.
      • α is a sentential form (a string of grammar symbols).
      • Example, for the grammar
           stmt → expr ;
                | if ( expr ) stmt
                | for ( optexpr ; optexpr ; optexpr ) stmt
                | other
         • FIRST(stmt) = { expr, if, for, other }
         • FIRST(expr ;) = { expr }

  33. Examples: First
      • Grammar
           type → simple
                | ^ id
                | array [ simple ] of type
           simple → integer
                  | char
                  | num dotdot num
      • FIRST(simple) = { integer, char, num }
      • FIRST(^ id) = { ^ }
      • FIRST(type) = { integer, char, num, ^, array }
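The FIRST sets above can be computed by iterating until nothing changes. The following is a sketch rather than the slides' own method: the class `FirstSets`, the map-based grammar encoding, and the string `"epsilon"` standing for ε are all assumptions made for the example. A symbol counts as a terminal when it has no productions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FirstSets {
    static final String EPSILON = "epsilon";  // stands for the empty string

    // Grammar: nonterminal -> production bodies; an empty body is an epsilon-production.
    static Map<String, Set<String>> first(Map<String, List<List<String>>> grammar) {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        for (String nonterminal : grammar.keySet()) {
            first.put(nonterminal, new TreeSet<>());
        }
        boolean changed = true;
        while (changed) {                      // repeat until a fixed point is reached
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : grammar.entrySet()) {
                Set<String> target = first.get(e.getKey());
                for (List<String> body : e.getValue()) {
                    changed |= target.addAll(firstOfString(body, grammar, first));
                }
            }
        }
        return first;
    }

    // FIRST of a string of grammar symbols, using the sets computed so far.
    static Set<String> firstOfString(List<String> symbols,
                                     Map<String, List<List<String>>> grammar,
                                     Map<String, Set<String>> first) {
        Set<String> result = new TreeSet<>();
        for (String x : symbols) {
            if (!grammar.containsKey(x)) {     // terminal: it starts the string
                result.add(x);
                return result;
            }
            Set<String> fx = first.get(x);
            for (String s : fx) {
                if (!s.equals(EPSILON)) result.add(s);
            }
            if (!fx.contains(EPSILON)) return result;  // x cannot vanish
        }
        result.add(EPSILON);                   // every symbol can derive the empty string
        return result;
    }

    // The type/simple grammar from the slide.
    static Map<String, List<List<String>>> typeGrammar() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("type", Arrays.asList(
            Arrays.asList("simple"),
            Arrays.asList("^", "id"),
            Arrays.asList("array", "[", "simple", "]", "of", "type")));
        g.put("simple", Arrays.asList(
            Arrays.asList("integer"),
            Arrays.asList("char"),
            Arrays.asList("num", "dotdot", "num")));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(first(typeGrammar()));
    }
}
```

A predictive parser can choose between two productions of the same nonterminal only if the FIRST sets of their bodies are disjoint, which is the case for every pair of productions in this grammar.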

  34. Designing a Predictive Parser
      • A predictive parser is a program consisting of a procedure for every nonterminal.
      • The procedure for nonterminal A
         • decides which A-production to use by examining the lookahead symbol; issues to handle at this point:
            • left factoring
            • left recursion
            • ε-productions
         • mimics the body of the chosen production.
      • Applying a translation scheme
         • Construct a predictive parser, ignoring the actions.
         • Copy the actions from the translation scheme into the parser.

  35. Left Factor
      • The problem: two or more productions for a nonterminal A start with the same symbols, so the lookahead cannot choose between them.
      • Example:
           stmt → if ( expr ) stmt
                | if ( expr ) stmt else stmt
      • Use left factoring to fix it:
           stmt → if ( expr ) stmt rest
           rest → else stmt | ε

  36. Left Recursion
      • A grammar is left-recursive if a production for nonterminal A starts with a reference to A itself:
           A → Aα | β
      • An example:
           expr → expr + term | term
      • Rewrite the left-recursive productions as right-recursive ones by using the following rules:
           A → βR
           R → αR | ε

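  37. Example: Left and Right Recursive
      • Figure: the left-recursive tree (A → Aα | β) grows down toward the left through repeated A nodes; the equivalent right-recursive tree (A → βR, R → αR | ε) grows down toward the right through repeated R nodes and ends in ε.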
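The rewriting rule for left recursion is mechanical enough to code directly. This is a minimal sketch under stated assumptions: the class `LeftRecursion` and the method `eliminate` are hypothetical, production bodies are space-separated strings, only immediate left recursion is handled, and ε is written as `epsilon` in the output.

```java
import java.util.ArrayList;
import java.util.List;

class LeftRecursion {
    // Rewrites  A -> A alpha1 | ... | beta1 | ...
    // into      A -> beta R,  R -> alpha R | epsilon
    // where rName is the name chosen for the new nonterminal R.
    static List<String> eliminate(String a, List<String> bodies, String rName) {
        List<String> alphas = new ArrayList<>();  // tails of left-recursive bodies
        List<String> betas = new ArrayList<>();   // bodies that do not start with A
        for (String body : bodies) {
            if (body.startsWith(a + " ")) {
                alphas.add(body.substring(a.length() + 1));
            } else {
                betas.add(body);
            }
        }
        List<String> result = new ArrayList<>();
        if (alphas.isEmpty()) {                   // nothing to do: keep the productions
            for (String body : bodies) {
                result.add(a + " -> " + body);
            }
            return result;
        }
        for (String beta : betas) {
            result.add(a + " -> " + beta + " " + rName);
        }
        for (String alpha : alphas) {
            result.add(rName + " -> " + alpha + " " + rName);
        }
        result.add(rName + " -> epsilon");
        return result;
    }

    public static void main(String[] args) {
        List<String> bodies = new ArrayList<>();
        bodies.add("expr + term");
        bodies.add("expr - term");
        bodies.add("term");
        for (String production : eliminate("expr", bodies, "rest")) {
            System.out.println(production);
        }
    }
}
```

For the expression grammar this yields `expr -> term rest`, `rest -> + term rest`, `rest -> - term rest`, and `rest -> epsilon`. Semantic actions are not handled by this sketch; in a translation scheme they travel with the α part of each body.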

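  38. Abstract and Concrete Syntax
      • Figure: two trees for 9 - 5 + 2.
         • Abstract syntax tree: + at the root, with children - and 2; the - node has children 9 and 5.
         • Concrete syntax (parse) tree: the same structure, with additional interior nodes for the helper nonterminals expr and term.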

  39. Conclusion: Parsing and Translation Scheme
      • Given the CFG G below, with semantic actions for translating into postfix notation:
           expr → expr + term { print('+') }
           expr → expr - term { print('-') }
           expr → term
           term → 0 { print('0') }
           term → 1 { print('1') }
           …
           term → 9 { print('9') }

  40. Conclusion: Parsing and Translation Scheme
      • Step 1: eliminate left recursion.
      • Technique: rewrite
           A → Aα | Aβ | γ
        into
           A → γR
           R → αR | βR | ε
      • Use this rule to transform G.

  41. Conclusion: Parsing and Translation Scheme
      • After left-recursion elimination:
           expr → term rest
           rest → + term { print('+') } rest
                | - term { print('-') } rest
                | ε
           term → 0 { print('0') }
           term → 1 { print('1') }
           …
           term → 9 { print('9') }

  42. An Example: Left-Recursion-elimination
      • Translation scheme
           expr → term rest
           rest → + term { print('+') } rest
                | - term { print('-') } rest
                | ε
           term → 0 { print('0') } | 1 { print('1') } | … | 9 { print('9') }
      • Parse tree with actions for 9 - 5 + 2:
           expr
              term
                 9 { print('9') }
              rest
                 -
                 term
                    5 { print('5') }
                 { print('-') }
                 rest
                    +
                    term
                       2 { print('2') }
                    { print('+') }
                    rest
                       ε

  43. Conclusion: Parsing and Translation Scheme
      • Step 2: procedures for nonterminals
           void expr() {
              term(); rest();
           }
           void rest() {
              if ( lookahead == '+' ) {
                 match('+'); term(); print('+'); rest();
              }
              else if ( lookahead == '-' ) {
                 match('-'); term(); print('-'); rest();
              }
              else { }   // do nothing with the input
           }
           void term() {
              if ( lookahead is a digit ) {
                 t = lookahead; match(lookahead); print(t);
              }
              else report("syntax error");
           }

  44. Conclusion: Parsing and Translation Scheme
      • Step 3: simplifying the translator. The recursive calls to rest() are the last thing each branch does, so they can be replaced by a loop.
      • Before (tail-recursive):
           void rest() {
              if ( lookahead == '+' ) {
                 match('+'); term(); print('+'); rest();
              }
              else if ( lookahead == '-' ) {
                 match('-'); term(); print('-'); rest();
              }
              else { }
           }
      • After (iterative):
           void rest() {
              while ( true ) {
                 if ( lookahead == '+' ) {
                    match('+'); term(); print('+'); continue;
                 }
                 else if ( lookahead == '-' ) {
                    match('-'); term(); print('-'); continue;
                 }
                 break;
              }
           }

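  45. Conclusion: Parsing and Translation Scheme
      • Complete Java program
           import java.io.*;

           class Parser {
              static int lookahead;

              // Read the first input character as the initial lookahead.
              public Parser() throws IOException {
                 lookahead = System.in.read();
              }

              // expr -> term rest, with rest() inlined as a loop.
              void expr() throws IOException {
                 term();
                 while ( true ) {
                    if ( lookahead == '+' ) {
                       match('+'); term(); System.out.write('+'); continue;
                    }
                    else if ( lookahead == '-' ) {
                       match('-'); term(); System.out.write('-'); continue;
                    }
                    else return;
                 }
              }

              // term -> digit: echo the digit, then consume it.
              void term() throws IOException {
                 if ( Character.isDigit((char) lookahead) ) {
                    System.out.write((char) lookahead); match(lookahead);
                 }
                 else throw new Error("syntax error");
              }

              // Consume the lookahead if it equals t.
              void match(int t) throws IOException {
                 if ( lookahead == t ) lookahead = System.in.read();
                 else throw new Error("syntax error");
              }

              // Driver, not shown on the slide: translate one expression from standard input.
              public static void main(String[] args) throws IOException {
                 new Parser().expr();
                 System.out.write('\n');
                 System.out.flush();
              }
           }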
