Yacc

chico

Presentation Transcript


  1. Yacc: BNF grammar (example.y) -> YACC -> example.tab.c -> C compiler + linker (together with other modules) -> Executable

  2. Yacc: what is it? Yacc is a tool that automatically generates a parser from a grammar written in a yacc specification (.y file). The grammars accepted are LALR(1) grammars with disambiguating rules. A grammar specifies a set of production rules, which define a language. A production rule specifies a sequence of symbols that forms a legal construct (a sentence) of the language.

  3. Structure of Yacc • Usually Lex and Yacc work together • yylex(): called to get the next token • To run the parser, the function yyparse() is invoked

  4. How the parser works • The parser produced by Yacc consists of a finite state machine with a stack • A move of the parser is done as follows: • It calls yylex() to obtain the next token when needed • Using the current state and the lookahead token, the parser decides on its next action (shift, reduce, accept or error) and carries it out
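The move loop described on slide 4 can be sketched by hand for a tiny grammar. The following is only an illustration under stated assumptions: the grammar `E : E '+' NUMBER | NUMBER`, the five states, the token codes, and the names `parse_sum` and `struct token` are all made up for this sketch, and the token array stands in for repeated calls to yylex(). A parser generated by yacc drives the same kind of loop from generated tables instead of a hand-written switch.

```c
#include <assert.h>

enum { END = 0, NUMBER = 258 };          /* hypothetical token codes */
struct token { int code; int value; };   /* value plays the role of yylval */

#define STACK_MAX 8

/* Shift-reduce loop for  E : E '+' NUMBER | NUMBER.
   Returns 0 and stores the sum on accept, 1 on a syntax error. */
int parse_sum(const struct token *toks, int *result)
{
    int states[STACK_MAX];               /* state stack */
    int values[STACK_MAX];               /* semantic values, parallel to states */
    int sp = 0;
    struct token la = *toks++;           /* lookahead token */

    states[0] = 0;
    values[0] = 0;
    for (;;) {
        assert(sp + 1 < STACK_MAX);
        switch (states[sp]) {
        case 0:                          /* start of input */
        case 3:                          /* just after "E +" */
            if (la.code != NUMBER)
                return 1;                /* error */
            states[sp + 1] = (states[sp] == 0) ? 2 : 4;
            values[sp + 1] = la.value;   /* shift NUMBER */
            sp++;
            la = *toks++;
            break;
        case 1:                          /* an E is on top of the stack */
            if (la.code == END) {        /* accept */
                *result = values[sp];
                return 0;
            }
            if (la.code != '+')
                return 1;                /* error */
            sp++;                        /* shift '+' */
            states[sp] = 3;
            values[sp] = 0;
            la = *toks++;
            break;
        case 2:                          /* reduce E : NUMBER  ($$ = $1) */
            states[sp] = 1;
            break;
        case 4:                          /* reduce E : E '+' NUMBER */
            values[sp - 2] += values[sp];    /* $$ = $1 + $3 */
            sp -= 2;                     /* pop three symbols, push E */
            states[sp] = 1;
            break;
        default:
            return 1;
        }
    }
}
```

Feeding it the tokens NUMBER(5) '+' NUMBER(4) '+' NUMBER(2) END yields 11, while NUMBER(5) '+' END is rejected in state 3 because the lookahead is not a NUMBER.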

  5. Skeleton of a yacc specification (.y file)
         {declarations}
         %%
         {rules}
         %%
         {user code}
     Rules: <production> action. Productions are those of a type 2 (context-free) grammar. Action: C code that specifies what to do when a production is reduced.

  6. Skeleton of a yacc specification (.y file)
         %{
         < C global variables, prototypes, comments >
         %}
         [DEFINITION SECTION]
         %%
         [PRODUCTION RULES SECTION]
         %%
         < C auxiliary subroutines >
     • %{ ... %}: this part will be embedded into the generated *.c file • Definition section: contains token declarations; tokens are recognized in the lexer • Production rules section: defines how to "understand" the input language, and what actions to take for each "sentence" • Auxiliary subroutines: any user code, for example a main function that calls the parser function yyparse()

  7. Structure of yacc file • Definition section • declarations of tokens • type of values used on parser stack • Rules section • list of grammar rules with semantic routines • User code

  8. The declaration section • Terminals and non-terminals: %token symbol, %type symbol • Operator precedence and associativity: %nonassoc symbol, %left symbol, %right symbol • Axiom (start symbol): %start symbol

  9. The declaration section: terminals • They are returned by the yylex() function, which is called by yyparse() • They become #define constants in the generated file • They are numbered starting from 257, but a concrete number can be associated with a token: %token T_Key 345 • Terminals that consist of a single character can be used directly (they are implicit). The corresponding tokens have values < 257
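To make the numbering on slide 9 concrete, here is a minimal hand-written scanner in the style of yylex(). It is a sketch, not what lex generates: the function name `scan_token`, the choice of 258 for NUMBER, and scanning from a string are assumptions made for the illustration. Named tokens use codes above the character range, a single character is returned as its own character code, and 0 signals end of input.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define NUMBER 258    /* hypothetical code, in the style of a generated header */

int yylval;           /* semantic value of the most recent token */

/* Scans one token starting at *p and advances *p past it. */
int scan_token(const char **p)
{
    const char *s;

    assert(p != NULL && *p != NULL);
    s = *p;
    while (*s == ' ' || *s == '\t')      /* skip blanks */
        s++;
    if (*s == '\0') {
        *p = s;
        return 0;                        /* 0 means end of input */
    }
    if (isdigit((unsigned char)*s)) {
        char *end;
        yylval = (int)strtol(s, &end, 10);
        *p = end;
        return NUMBER;                   /* named token, above the char range */
    }
    *p = s + 1;
    return (unsigned char)*s;            /* e.g. '+' is its own token code */
}
```

Scanning the text "12 +" returns NUMBER with yylval set to 12, then the character code of '+', then 0.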

  10. The declaration section: examples (expressions.y)
         %{
         #include <stdio.h>
         %}
         %token NUMBER PLUS MINUS MUL DIV L_PAR R_PAR
         %start expr
         …

  11. The declaration section: examples (patterns.l)
         %{
         #include "expressions.tab.h"
         %}
         digit [0-9]
         %%
         [ \t]+     ;
         {digit}+   { yylval = atoi(yytext); return NUMBER; }
         "+"        return PLUS;
         "-"        return MINUS;
         "*"        return MUL;
         "/"        return DIV;
         "("        return L_PAR;
         ")"        return R_PAR;
         .          { printf("token erroneous\n"); }

  12. The declaration section: examples (single-character tokens)
     YACC: only NUMBER is declared; '+', '-', '*', '/', '(' and ')' are used directly as character literals in the rules.
         . . .
         %token NUMBER
         . . .
     Lex:
         . . .
         digit [0-9]
         %%
         [ \t]+     ;
         {digit}+   { yylval = atoi(yytext); return NUMBER; }
         "+"        return '+';
         "-"        return '-';
         "*"        return '*';
         "/"        return '/';
         "("        return '(';
         ")"        return ')';
         . . .

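  13. Flex/Yacc communication • lex file.l: file.l -> lex.yy.c • yacc -d file.y: file.y -> file.tab.c and the header file.tab.h (included by file.l) • cc lex.yy.c -c: lex.yy.c -> lex.yy.o • cc file.tab.c -c: file.tab.c -> file.tab.o • gcc lex.yy.o file.tab.o -o calc: lex.yy.o + file.tab.o -> calc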

  14. Lex/Yacc: lex file
         %{
         #include "expressions.tab.h"
         %}
         digit [0-9]
         %option noyywrap
         %%
         [ \t]+     ;
         {digit}+   { yylval = atoi(yytext);
                      /* printf("lex: %s, %d\n ", yytext, yylval); */
                      return NUMBER; }
         "+"        return PLUS;
         "-"        return MINUS;
         .          { printf("token erroneous\n"); }
         %%
     • expressions.tab.h is generated by Yacc • The user code section is empty: the lex file has no main()

  15. Flex/Yacc communication: expressions.tab.h
         #ifndef YYSTYPE
         #define YYSTYPE int
         #endif
         #define NUMBER 258
         #define PLUS 259
         #define MINUS 260
         #define MUL 261
         #define DIV 262
         #define L_PAR 263
         #define R_PAR 264

  16. The Production Rules Section
         %%
         production : symbol1 symbol2 … { action }
                    | symbol3 symbol4 … { action }
                    | …
                    ;
         production : symbol1 symbol2 { action }
                    ;
         %%

  17. Semantic values
         %%
         statement  : expression { printf(" = %g\n", $1); }
                    ;
         expression : expression '+' expression { $$ = $1 + $3; }
                    | expression '-' expression { $$ = $1 - $3; }
                    | NUMBER { $$ = $1; }
                    ;
         %%
     According to these two productions, 5 + 4 - 3 + 2 is parsed into: [Figure: parse tree with root statement, inner expression nodes, and number leaves 5, 4, 3, 2 joined by +, -, +]

  18. Defining Values
         expr   : expr '+' term { $$ = $1 + $3; }
                | term { $$ = $1; }
                ;
         term   : term '*' factor { $$ = $1 * $3; }
                | factor { $$ = $1; }
                ;
         factor : '(' expr ')' { $$ = $2; }
                | ID
                | NUM
                ;

  19. Defining Values (same grammar) • $1 is the value of the first symbol on the right-hand side, for example expr in expr '+' term

  20. Defining Values (same grammar) • $2 is the value of the second symbol, for example expr in '(' expr ')'

  21. Defining Values (same grammar) • $3 is the value of the third symbol, for example term in expr '+' term • Default action: $$ = $1;
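The $n notation on slides 18 to 21 maps onto a value stack kept by the parser. A minimal sketch of that mapping for the rule `expr : expr '+' term { $$ = $1 + $3; }` follows. The names `value_stack`, `vsp`, `push_value` and `reduce_add` are invented for this illustration; generated parsers typically use the same negative-offset indexing from the top of their own value stack.

```c
#include <assert.h>

#define VALUE_STACK_MAX 16

static int value_stack[VALUE_STACK_MAX];
static int *vsp = value_stack;           /* points at the topmost value */

/* Pushes the semantic value of a shifted symbol. */
void push_value(int v)
{
    assert(vsp + 1 < value_stack + VALUE_STACK_MAX);
    *++vsp = v;
}

/* Reduction for  expr : expr '+' term { $$ = $1 + $3; }
   With three right-hand-side symbols on the stack,
   $1 is vsp[-2], $2 is vsp[-1] and $3 is vsp[0]. */
int reduce_add(void)
{
    int result = vsp[-2] + vsp[0];       /* $$ = $1 + $3 */
    vsp -= 3;                            /* pop the three right-hand-side values */
    push_value(result);                  /* $$ becomes the value of expr */
    return result;
}
```

Pushing 5, a placeholder value for '+', and 4, then calling reduce_add(), leaves 9 on top of the stack as the value of the new expr.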

  22. The declaration section • Support for arbitrary value types
         %union {
             int intval;
             char *str;
         }

  23. The declaration section • Use of union • Terminal declaration: %token <intval> NATURAL • Non-terminal declaration: %type <type> NON_TERMINAL • In productions: expr : NATURAL '+' NATURAL { $$ = $<intval>1 + $<intval>3; } ; • In the lex file: [-+]?{digit}+ { yylval.intval = atoi(yytext); return NATURAL; }
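In C terms, %union declares the type of yylval and of every entry on the value stack, and a tag such as <intval> selects one member. The sketch below mirrors that by hand. The helper names `on_natural`, `on_ident` and `add_naturals`, the IDENT token, and the token codes are assumptions for the illustration; in a real build the YYSTYPE definition comes from the generated header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What  %union { int intval; char *str; }  amounts to. */
typedef union {
    int   intval;
    char *str;
} YYSTYPE;

enum { NATURAL = 258, IDENT = 259 };     /* hypothetical token codes */

YYSTYPE yylval;

/* Body of a lex action for a number: fill the int member. */
int on_natural(const char *yytext)
{
    yylval.intval = atoi(yytext);
    return NATURAL;
}

/* Body of a lex action for an identifier: fill the string member.
   The copy is owned by the caller. */
int on_ident(const char *yytext)
{
    char *copy = malloc(strlen(yytext) + 1);
    assert(copy != NULL);
    strcpy(copy, yytext);
    yylval.str = copy;
    return IDENT;
}

/* What  { $$ = $<intval>1 + $<intval>3; }  does with two stack entries. */
YYSTYPE add_naturals(YYSTYPE first, YYSTYPE third)
{
    YYSTYPE result;
    result.intval = first.intval + third.intval;
    return result;
}
```

Only the member that was last written is meaningful, which is why each token and non-terminal is tied to one member with %token <...> or %type <...>.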

  24. Ambiguity • By default yacc does the following: • shift/reduce conflict: chooses shift over reduce • reduce/reduce conflict: reduces by the production that appears first • It is better to resolve the conflicts by declaring precedence and associativity
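Why the choice matters can be seen on 5 - 4 - 3 with an ambiguous rule such as `expr : expr '-' expr`. Preferring shift groups the operands to the right, while a %left '-' declaration makes the parser reduce first and group them to the left. The two C functions below only compute the two groupings; their names are made up for this sketch and they are not part of yacc.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce-first grouping, as selected by %left '-':  ((a - b) - c) */
int subtract_left(const int *v, size_t n)
{
    int acc;
    size_t i;

    assert(n > 0);
    acc = v[0];
    for (i = 1; i < n; i++)
        acc -= v[i];
    return acc;
}

/* Shift-first grouping, the default conflict resolution:  (a - (b - c)) */
int subtract_right(const int *v, size_t n)
{
    int acc;
    size_t i;

    assert(n > 0);
    acc = v[n - 1];
    for (i = n - 1; i > 0; i--)
        acc = v[i - 1] - acc;
    return acc;
}
```

For the operands 5, 4, 3 the left grouping gives -2, which is the usual arithmetic meaning, and the right grouping gives 4.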

  25. Error recovery • Yacc detects errors • To report errors, a function needs to be implemented:
         int yyerror(char *s) { fprintf(stderr, "%s\n", s); return 0; }
     • Panic mode recovery:
         E : IF '(' cond ')'
           | IF '(' error ')' { yyerror("condition missing"); }
           ;

  26. Error recovery • After detecting an error, the parser stays in error-recovery mode until three tokens have been read and shifted successfully; further errors are not reported during that time • yyerrok resets the parser to its normal mode • yyclearin allows the token that caused the error to be discarded
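The three-token rule on slide 26 can be modelled with a single counter. The sketch below is an assumption-laden illustration: `struct recovery` and the three functions are invented names, and the `reported` field stands in for calls to yyerror(). It shows only the bookkeeping, not the popping of states or the discarding of tokens.

```c
#include <assert.h>
#include <stddef.h>

struct recovery {
    int errstatus;   /* shifts still needed before errors are reported again */
    int reported;    /* number of error messages issued so far */
};

/* Called when the parser finds no legal action for the lookahead token. */
void on_syntax_error(struct recovery *r)
{
    assert(r != NULL);
    if (r->errstatus == 0)
        r->reported++;       /* a real parser would call yyerror() here */
    r->errstatus = 3;        /* (re)enter recovery mode */
}

/* Called after every successful shift. */
void on_shift(struct recovery *r)
{
    assert(r != NULL);
    if (r->errstatus > 0)
        r->errstatus--;
}

/* The effect of yyerrok in an action: leave recovery mode immediately. */
void errok(struct recovery *r)
{
    assert(r != NULL);
    r->errstatus = 0;
}
```

With this model, a second error that arrives after only one shifted token is silent, an error after three shifted tokens is reported again, and calling errok() makes the very next error reportable.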
