
Using Lex



  1. Using Lex

  2. Introduction • When you write a lex specification, you create a set of patterns which lex matches against the input. • Each time one of the patterns matches, the lex program invokes C code that you provide which does something with the matched text.

  3. Introduction (Cont’d) • Lex itself doesn’t produce an executable program; instead it translates the lex specification into a file containing a C routine called yylex(). • Your program calls yylex() to run the lexer.

  4. The format of regular expressions in lex • The notation is slightly different from that used in our textbook.

  5. Regular Expressions • Operators used in Lex regular expressions (see pages 28 and 29): . * [] ^ $ {} \ + ? | "..." / ()

  6. Examples of Regular Expressions • [0-9] • [0-9]+ • [0-9]* • -?[0-9]+ • [0-9]*\.[0-9]+ • ([0-9]+)|([0-9]*\.[0-9]+) • -?(([0-9]+)|([0-9]*\.[0-9]+)) • [eE][-+]?[0-9]+ • -?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)

  7. The Structure of a Lex Program

      (Definition section)
      %%
      (Rules section)
      %%
      (User subroutines section)

  8. Example 1-1: Word recognizer ch1-02.l

      %{
      /*
       * this sample demonstrates (very) simple recognition:
       * a verb/not a verb.
       */
      %}
      %%
      [\t ]+    /* ignore white space */ ;

      is |
      am |
      are |
      were |
      was |
      be |
      being |
      been |
      do |
      does |
      did |
      will |
      would |
      should |
      can |
      could |
      has |
      have |
      had |
      go        { printf("%s: is a verb\n", yytext); }

      [a-zA-Z]+ { printf("%s: is not a verb\n", yytext); }

      .|\n      { ECHO; /* normal default anyway */ }
      %%
      int main(void)
      {
          yylex();
          return 0;
      }

  9. The definition section • Lex copies the material between "%{" and "%}" directly to the generated C file, so you may write any valid C code here.

  10. Rules section • Each rule is made up of two parts • A pattern • An action • E.g. [\t ]+ /* ignore white space */ ;

  11. Rules section (Cont’d) • E.g.

      is |
      am |
      are |
      were |
      was |
      be |
      being |
      been |
      do |
      does |
      did |
      will |
      would |
      should |
      can |
      could |
      has |
      have |
      had |
      go        { printf("%s: is a verb\n", yytext); }

  12. Rules section (Cont’d) • E.g.

      [a-zA-Z]+ { printf("%s: is not a verb\n", yytext); }

      .|\n      { ECHO; /* normal default anyway */ }

      • Lex has a set of simple disambiguating rules: • Lex patterns only match a given input character or string once • Lex executes the action for the longest possible match for the current input • If two patterns match the same length, the pattern that appears first in the specification wins

  13. User subroutines section • It can consist of any legal C code • Lex copies it to the C file after the end of the Lex-generated code

      %%
      int main(void)
      {
          yylex();
          return 0;
      }

  14. Example 2-1

      %%
      [\n\t ]   ;

      -?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)   { printf("number\n"); }

      .         ECHO;
      %%
      int main(void)
      {
          yylex();
          return 0;
      }

  15. A Word Counting Program • The definition section

      %{
      #include <stdlib.h>   /* for exit() */

      unsigned charCount = 0, wordCount = 0, lineCount = 0;
      %}

      word  [^ \t\n]+
      eol   \n

  16. A Word Counting Program (Cont’d) • The rules section

      {word}  { wordCount++; charCount += yyleng; }
      {eol}   { charCount++; lineCount++; }
      .       charCount++;

  17. A Word Counting Program (Cont’d) • The user subroutines section

      int main(int argc, char **argv)
      {
          if (argc > 1) {
              FILE *file;

              file = fopen(argv[1], "r");
              if (!file) {
                  fprintf(stderr, "could not open %s\n", argv[1]);
                  exit(1);
              }
              yyin = file;   /* read from the named file instead of stdin */
          }
          yylex();
          /* the counters are unsigned, so print them with %u */
          printf("%u %u %u\n", charCount, wordCount, lineCount);
          return 0;
      }

  18. How to implement a scanner()? • We have to stop yylex() when it recognizes a defined token. • Insert a "return" statement at the end of the rule's action, so that yylex() returns a token code to its caller • [a-zA-Z]+ { return 2; } • See scanner_example.l

  19.

      %{
      %}
      %%
      [\t ]+    /* ignore white space */ ;

      is |
      am |
      are |
      were |
      was |
      be |
      being |
      been |
      do |
      does |
      did |
      will |
      would |
      should |
      can |
      could |
      has |
      have |
      had |
      go        { return 1; }

      [a-zA-Z]+ { return 2; }

      .|\n      { /* normal default anyway */ }
      %%
      int main(void)
      {
          int i;

          /* yylex() returns 0 at end of input */
          while ((i = yylex()) != 0) {
              printf("return value is %d, token is %s\n", i, yytext);
          }
          printf("End of file\n");
          return 0;
      }

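  20. How to implement multiple-character lookahead in lex? • Check lex_lookahead.l • DO10I=1,100 (a DO loop: label 10, I runs from 1 to 100) • DO10I=1.100 (an assignment to the variable DO10I) • The two statements cannot be told apart until the scanner has read as far as the comma or the period.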
