
Compiler Structures

Compiler Structures. 241-437 , Semester 1 , 2011-2012. Objective describe semantic analysis with attribute grammars, as applied in yacc and recursive descent parsers. 8. Attribute Grammars. Overview. 1. What is an Attribute Grammar? 2. Parse Tree Evaluation 3. Attributes


Presentation Transcript


  1. Compiler Structures 241-437, Semester 1, 2011-2012 • Objective • describe semantic analysis with attribute grammars, as applied in yacc and recursive descent parsers 8. Attribute Grammars

  2. Overview 1. What is an Attribute Grammar? 2. Parse Tree Evaluation 3. Attributes 4. Attribute Grammars and yacc 5. A Grid Grammar 6. Recursive Descent and Attributes

  3. In this lecture • concentrating on attribute grammars • Compiler pipeline:
     Front End: Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Int. Code Generator → Intermediate Code
     Back End:  Intermediate Code → Code Optimizer → Target Code Generator → Target Lang. Prog.

  4. 1. What is an Attribute Grammar? • An attribute grammar is a context free grammar with semantic actions attached to some of the productions • semantic = meaning • An action specifies the meaning of a production in terms of its body terminals and nonterminals.

  5. Example Attribute Grammar
     Production      Semantic Action
     L → E           printf(Ebody.val)
     E → E + T       E.val := Ebody.val + Tbody.val
     E → T           E.val := Tbody.val
     T → T * F       T.val := Tbody.val * Fbody.val
     T → F           T.val := Fbody.val
     F → ( E )       F.val := Ebody.val
     F → num         F.val := value(num)

  6. 2. Parse Tree Evaluation • One way of understanding semantic actions is as extra information (attributes) attached to the nodes of the parse tree for the input. • The semantic action specifies the parent node attribute in terms of the attributes of its children.

  7. Basic Parse Tree • Input: 9 * 5 + 2 • Grammar: L → E; E → E + T | T; T → T * F | F; F → ( E ) | num
     L
     └─ E
        ├─ E
        │  └─ T
        │     ├─ T
        │     │  └─ F
        │     │     └─ 9
        │     ├─ *
        │     └─ F
        │        └─ 5
        ├─ +
        └─ T
           └─ F
              └─ 2

  8. Adding Meaning to the Tree • What is the meaning of "9 * 5 + 2"? • the answer is to evaluate it, to get 47 • Add attributes to the tree, starting from the leaves and working up to the root • use the semantic actions to get the attribute values

  9. Parse Tree with Actions • evaluate bottom-up, using the semantic actions of the example attribute grammar • the attribute value of each node is shown in parentheses
     L                     printf 47
     └─ E (47)
        ├─ E (45)
        │  └─ T (45)
        │     ├─ T (9)
        │     │  └─ F (9)
        │     │     └─ 9
        │     ├─ *
        │     └─ F (5)
        │        └─ 5
        ├─ +
        └─ T (2)
           └─ F (2)
              └─ 2

  10. 3. Attributes • Attribute values can be • numbers, strings, any data structures, code, assembly language instructions • It's not always necessary to build a parse tree in order to evaluate the grammar's actions.

  11. Kinds of Attribute • There are two main kinds of attribute: • synthesized and inherited attributes • The value of a synthesized attribute of a production's head is calculated from the attribute values of the symbols in its body • as in the previous example

  12. Synthesized Attributes in a Tree • Example:
      Production      Semantic Action
      T → T * F       T.val := Tbody.val * Fbody.val
      T (45)
      ├─ T (9)
      ├─ *
      └─ F (5)
      evaluate bottom-up

  13. Inherited Attributes • An inherited attribute for a body symbol (i.e. terminal, non-terminal) gets its value from the other body symbols and the parent value • often used for evaluating more complex programming language features

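  14. Inherited Attributes in a Tree • Two examples, for a parent A (attribute A.a) with body symbols X (attribute X.x) and Y (attribute Y.y):
      X.x := function(A.a, Y.y)     direction of evaluation: from the parent A and the sibling Y into X
      Y.y := function(A.a, X.x)     direction of evaluation: from the parent A and the sibling X into Y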

  15. 4. Attribute Grammars and yacc • yacc supports (synthesized) attribute grammars • yacc actions are semantic actions • no parse tree is needed, since yacc evaluates the actions using the parser's built-in stack

  16. expr.y Again • $$, $1, $2, $3 are the attributes; the C code in { } is the actions
      %token NUMBER                 /* declarations */
      %%
      exprs: expr '\n'              { printf("Value = %d\n", $1); }
           | exprs expr '\n'        { printf("Value = %d\n", $2); }
           ;
      expr: expr '+' term           { $$ = $1 + $3; }
          | expr '-' term           { $$ = $1 - $3; }
          | term                    { $$ = $1; }
          ;

  17. expr.y (more actions)
      term: term '*' factor         { $$ = $1 * $3; }
          | term '/' factor         { $$ = $1 / $3; }    /* integer division */
          | factor
          ;
      factor: '(' expr ')'          { $$ = $2; }
            | NUMBER
            ;

  18. expr.y (C code)
      %%
      #include "lex.yy.c"
      int yyerror(char *s)
      {
        fprintf(stderr, "%s\n", s);
        return 0;
      }
      int main(void)
      {
        yyparse();                  // the syntax analyzer
        return 0;
      }

  19. Evaluation in yacc • Input: 3 * 5 + 4\n • Es, E, T, F and num abbreviate exprs, expr, term, factor and NUMBER • _ marks a stack slot that holds no value
      Stack       val       Input        Stack Action          Semantic Action
      $                     3*5+4\n$     shift
      $ 3         3         *5+4\n$      reduce F → num        $$ = $1 (implicit)
      $ F         3         *5+4\n$      reduce T → F          $$ = $1 (implicit)
      $ T         3         *5+4\n$      shift
      $ T *       3 _       5+4\n$       shift
      $ T * 5     3 _ 5     +4\n$        reduce F → num        $$ = $1 (implicit)
      $ T * F     3 _ 5     +4\n$        reduce T → T * F      $$ = $1 * $3
      $ T         15        +4\n$        reduce E → T          $$ = $1
      $ E         15        +4\n$        shift
      $ E +       15 _      4\n$         shift
      $ E + 4     15 _ 4    \n$          reduce F → num        $$ = $1 (implicit)
      $ E + F     15 _ 4    \n$          reduce T → F          $$ = $1 (implicit)
      $ E + T     15 _ 4    \n$          reduce E → E + T      $$ = $1 + $3
      $ E         19        \n$          shift
      $ E \n      19 _      $            reduce Es → E \n      printf $1
      $ Es        19        $            accept

  20. 5. A Grid Grammar • A robot starts at (0,0) on a grid, and is given compass directions: • n = north, s = south, e = east, w = west • Evaluate the sequence of directions to work out the final position of the robot.

  21. Example • The robot receives the directions: • n e e n n w • what is the 'meaning' (semantics) of the directions? • the 'meaning' is the final robot position, (1,3) [Figure: grid showing the compass directions n, e, s, w and the robot's path from its start to its final position]

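  22. 5.1. Grid Grammar
      robot → path
      path  → path step | ε
      step  → e | w | s | n
      Parse tree for the input: n w s s
      robot
      └─ path
         ├─ path
         │  ├─ path
         │  │  ├─ path
         │  │  │  ├─ path
         │  │  │  │  └─ ε
         │  │  │  └─ step
         │  │  │     └─ n
         │  │  └─ step
         │  │     └─ w
         │  └─ step
         │     └─ s
         └─ step
            └─ s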

  23. Grid Attribute Grammar
      Production           Semantic Actions
      robot → path         printf( pathbody.(x,y) )
      path → path step     path.x := pathbody.x + stepbody.dx
                           path.y := pathbody.y + stepbody.dy
      path → ε             path.(x,y) := (0,0)
      step → e             step.(dx,dy) := (1,0)
      step → w             step.(dx,dy) := (-1,0)
      step → s             step.(dx,dy) := (0,-1)
      step → n             step.(dx,dy) := (0,1)

  24. Data Types • The path rules use (x,y), the position of the robot. • The step rules use (dx,dy), the step taken by the robot. • Implementing these data types requires new features of yacc.

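  25. Parse Tree with Actions • Input: n w s s • evaluate bottom-up • path nodes show (x,y), step nodes show (dx,dy)
      robot                          printf (-1,-1)
      └─ path (-1,-1)
         ├─ path (-1,0)
         │  ├─ path (-1,1)
         │  │  ├─ path (0,1)
         │  │  │  ├─ path (0,0)
         │  │  │  │  └─ ε
         │  │  │  └─ step (0,1)
         │  │  │     └─ n
         │  │  └─ step (-1,0)
         │  │     └─ w
         │  └─ step (0,-1)
         │     └─ s
         └─ step (0,-1)
            └─ s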

  26. 5.2. Non-integer Yacc Attributes • The default yacc attributes (e.g. $$, $1, etc) are integers. • We want data structures for (x,y) and (dx,dy), coded as two struct types.

  27. Defining New Types • The new types are collected together inside a %union in the yacc definitions section:
      %union {
        type1 name1;
        type2 name2;
        . . .
      }
      • For the grid:
      %union {
        struct { int x; int y; } pos;
        struct { int dx; int dy; } offset;
      }

  28. Using the Types • The non-terminals that return the new types must be listed. • Any tokens that use the types must be listed. • For the grid (these non-terminals return values of the specified type):
      %type <offset> step
      %type <pos> path

  29. Using Typed Variables • If an attribute (variable) is a record, then dotted-name notation is used to refer to its fields • e.g. $$.dx, $1.y • The default action ($$ = $1) will cause an error if $$ and $1 are not the same type.

  30. 5.3. Grid Compiler • flex turns grid.l (a flex file) into lex.yy.c • bison turns grid.y (a bison file) into grid.tab.c, which #includes lex.yy.c • gcc compiles grid.tab.c into gridEval, a C executable
      $ flex grid.l
      $ bison grid.y
      $ gcc grid.tab.c -o gridEval

  31. Usage • the direction lines are typed by the user, followed by ctrl-D
      $ ./gridEval
      nwss
      Robot is at (-1,-1)
      $ ./gridEval
      n n n w w w s e
      Robot is at (-2,2)
      $

  32. grid.l
      %%
      [nN]      { return NORTH; }
      [sS]      { return SOUTH; }
      [eE]      { return EAST; }
      [wW]      { return WEST; }
      [ \n\t]   ;
      %%
      int yywrap(void) { return 1; }

  33. grid.y
      %union {                                  /* type definitions */
        struct { int x; int y; } pos;
        struct { int dx; int dy; } offset;
      }
      %token EAST WEST NORTH SOUTH
      %type <offset> step                       /* types used by the non-terminals */
      %type <pos> path
      %%

  34. grid.y (rules)
      robot: path        { printf("Robot is at (%d,%d)\n", $1.x, $1.y); }
           ;
      path: path step    { $$.x = $1.x + $2.dx;
                           $$.y = $1.y + $2.dy; }
          |              { $$.x = 0; $$.y = 0; }
          ;
      step: EAST         { $$.dx = 1;  $$.dy = 0; }
          | WEST         { $$.dx = -1; $$.dy = 0; }
          | SOUTH        { $$.dx = 0;  $$.dy = -1; }
          | NORTH        { $$.dx = 0;  $$.dy = 1; }
          ;
      %%

  35. grid.y (C code)
      #include "lex.yy.c"
      int yyerror(char *s)
      {
        fprintf(stderr, "%s\n", s);
        return 0;
      }
      int main(void)
      {
        yyparse();
        return 0;
      }

  36. 6. Recursive Descent and Attributes • It is easy to add semantic actions to a recursive descent parser • in many cases, there's no need for the parser to build a parse tree in order to evaluate the attributes • The basic translation strategy: • each production becomes a function continued

  37. The function (e.g. f()) calls other functions representing its body non-terminals • those functions return values (attributes) to f() • f() combines the values, and returns a value (attribute)

  38. 6.1. The Expressions Parser Again • The basic LL(1) grammar:
      Stats => ( [ Stat ] \n )*
      Stat  => let ID = Expr | Expr
      Expr  => Term ( (+ | - ) Term )*
      Term  => Fact ( (* | / ) Fact )*
      Fact  => '(' Expr ')' | Int | Id

  39. An Expressions Program (test3.txt)
      5 + 6                  <- give answer
      let x = 2              <- declare variable
      3 + ( (x*y)/2)         // comments
      // y
      let x = 5
      let y = x /0           <- error
      // comments

  40. 6.2. Parsing with Actions • exprParse1.c is a recursive descent parser using the expressions language. • It differs from exprParse0.c by having semantic actions attached to its productions • these actions evaluate the expressions, and assign values to expression variables

  41. Grammar with Actions
      Productions                      Actions
      Stats => ( [ Stat ] \n )*        ---
      Stat  => let ID = Expr           add id to symbol table; id.val = expr.val; print( id.val );
      Stat  => Expr                    print( expr.val );

  42. Grammar with Actions (continued)
      Productions                          Actions
      Expr => Term ( (+ | - ) Term )*      return term1.val (+ | -) term2.val (+ | -) ... termn.val;
      Term => Fact ( (* | / ) Fact )*      return fact1.val (* | /) fact2.val (* | /) ... factn.val;

  43. Grammar with Actions (continued)
      Productions                Actions
      Fact => '(' Expr ')'       return expr.val;
      Fact => Int                return int.val;
      Fact => Id                 lookup id; if not found then add (id, 0) to table; return id.val;

  44. The Symbol Table • The symbol table is a data structure used to store expression variables and their values. • In exprParse1.c, it's an array of structs, with each struct holding the name of the variable and its current integer value. [Figure: the syms[] array, each entry holding an id and a value]

  45. 6.3. Usage
      $ gcc -Wall -o exprParse1 exprParse1.c
      $ ./exprParse1 < test3.txt
      == 11
      x being declared
      x = 2
      y being declared
      == 3
      x = 5
      Error: Division by zero; using 1 instead
      y = 5
      $

  46. 6.4. exprParse1.c Callgraph [Figure: callgraph of exprParse1.c, with three groups of functions labelled: same as in exprParse0.c; generated from grammar (now with actions); symbol table (new)]

  47. 6.5. Symbol Table Data Structures
      #define MAX_SYMS 15          // max no of variables
      typedef struct SymInfo {
        char *id;                  // name of variable
        int value;                 // value (an integer)
      } SymbolInfo;
      int symNum = 0;              // number of symbols stored
      SymbolInfo syms[MAX_SYMS];
      [Figure: the syms[] array with entries 0, 1, 2, ... 14, each holding an id and a value]

  48. Symbol Table Functions
      SymbolInfo *getIDEntry(void)
      /* find _OR_ create symbol table entry for current
         tokString; return a pointer to it */
      {
        SymbolInfo *si = NULL;
        if ((si = lookupID(tokString)) != NULL)   // already declared
          return si;
        // add id to table
        printf("%s being declared\n", tokString);
        return addID(tokString, 0);               // 0 is default value
      } // end of getIDEntry()

  49. Symbol Table Functions (lookupID)
      SymbolInfo *lookupID(char *nm)
      /* is nm in the symbol table? return pointer to struct or NULL */
      {
        int i;
        for (i = 0; i < symNum; i++)
          if (!strcmp(syms[i].id, nm))
            return &syms[i];
        return NULL;
      } // end of lookupID()

  50. Symbol Table Functions (addID)
      SymbolInfo *addID(char *nm, int value)
      /* add nm and value to the symbol table; return pointer to struct */
      {
        if (symNum == MAX_SYMS) {
          printf("Symbol table full; cannot add %s\n", nm);
          exit(1);
        }
        syms[symNum].id = (char *) malloc(strlen(nm)+1);
        if (syms[symNum].id == NULL) {            // malloc() can fail
          printf("Out of memory; cannot add %s\n", nm);
          exit(1);
        }
        strcpy(syms[symNum].id, nm);
        syms[symNum].value = value;
        SymbolInfo *si = &syms[symNum];
        symNum++;
        return si;
      } // end of addID()
