Lex/Flex

Lex/Flex. COP 3402 – System Software. Lex: a tool for automatically generating a lexer or scanner given a lex specification (.l file). Flex: a free software alternative to lex.

nusa

Presentation Transcript


  1. Lex/Flex COP 3402 – System Software

  2. Lex/Flex: what is it?
     • Lex: a tool for automatically generating a lexer or scanner given a lex specification (.l file).
     • Flex: a free software alternative to lex.
     • A lexer or scanner performs lexical analysis: breaking an input stream up into meaningful units, or tokens. For example, consider breaking a text file up into individual words.

  3. Skeleton of a lex/flex specification (.l file)
     FILENAME.l (a *.c file is generated after running lex/flex on it):
       %{
       < C global variables, includes, prototypes, comments >
       %}
       [DEFINITION SECTION]
       %%
       [RULES SECTION]
       %%
       < C auxiliary subroutines >
     • %{ ... %} block: this part will be embedded into *.c.
     • Definition section: substitutions, code and start states; will be copied into *.c.
     • Rules section: defines how to scan and what action to take for each token.
     • Auxiliary subroutines: any user code. For example, a main function to call the scanning function yylex().
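To make the skeleton concrete, here is a minimal sketch of a complete specification that fills in all four parts. The file name (count.l), the counters, and the LETTER definition are hypothetical choices for illustration, not taken from the slides.

```lex
%{
/* Embedded verbatim into the generated C file. */
#include <stdio.h>

static int word_count = 0;   /* hypothetical counters for this sketch */
static int line_count = 0;
%}

LETTER  [A-Za-z]

%%

{LETTER}+   { word_count++; }
\n          { line_count++; }
.           { /* ignore any other character */ }

%%

/* Auxiliary subroutines: user code copied to the end of the generated file. */
int yywrap(void)
{
    return 1;   /* 1 = no further input after EOF */
}

int main(void)
{
    yylex();    /* run the generated scanner on standard input */
    printf("words: %d, lines: %d\n", word_count, line_count);
    return 0;
}
```

With flex, running `flex count.l` should produce lex.yy.c, which can then be compiled with an ordinary C compiler (for example `cc lex.yy.c -o count`). Because the sketch defines its own yywrap() and main(), it should not need the lex/flex support library.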

  4. The rules section
       %%
       [RULES SECTION]
       <pattern>   { <action to take when matched> }
       <pattern>   { <action to take when matched> }
       …
       %%
     Patterns are specified by regular expressions. For example:
       %%
       [A-Za-z]+   { printf("this is a word"); }
       %%

  5. Regular Expression Basics
     • .      matches any single character except \n
     • *      matches 0 or more instances of the preceding regular expression
     • +      matches 1 or more instances of the preceding regular expression
     • ?      matches 0 or 1 of the preceding regular expression
     • |      matches the preceding or following regular expression
     • []     defines a character class
     • ()     groups the enclosed regular expression into a new regular expression
     • "…"    matches everything within the quotes, literally

  6. Lex/Flex RegExp (cont)
     • x|y      x or y
     • {i}      definition of i
     • x/y      x, only if followed by y (y not removed from input)
     • x{m,n}   m to n occurrences of x
     • ^x       x, but only at beginning of line
     • x$       x, but only at end of line
     • "s"      exactly what is in the quotes (except for "\" and following character)
     A regular expression finishes with a space, tab or newline.
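The operators on this slide can be tried out in a small specification. The following is a sketch only; the DIGIT definition, the "px" suffix, and the printed messages are hypothetical and chosen purely to exercise {i}, {m,n}, x/y, ^x and x$.

```lex
%{
#include <stdio.h>
%}

DIGIT   [0-9]

%%

^"#".*              { printf("comment line: %s\n", yytext); }   /* ^x: only at start of line */
{DIGIT}{2,4}/"px"   { printf("pixel size: %s\n", yytext); }     /* x/y: "px" stays in the input */
[a-z]+$             { printf("last word on line: %s\n", yytext); } /* x$: only at end of line */
.|\n                { /* skip everything else */ }

%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

Note that in the second rule yytext holds only the digits: the trailing context after the slash is checked but left in the input to be scanned again.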

  7. Meta-characters
     • meta-characters (do not match themselves, because they are used in the preceding reg exps): ( ) [ ] { } < > + / , ^ * | . \ " $ ? - %
     • to match a meta-character, prefix it with "\"
     • to match a backslash, tab or newline, use \\, \t, or \n

  8. Regular Expression Examples
     • an integer: 12345
         [1-9][0-9]*
     • a word: cat
         [a-zA-Z]+
     • a (possibly) signed integer: 12345 or -12345
         [-+]?[1-9][0-9]*
     • a floating point number: 1.2345
         [0-9]*"."[0-9]+
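As a rough illustration, the patterns above can be dropped into a rules section as they stand. The labels printed by each action, and the two extra rules for white space and for any other character, are assumptions added so that the sketch runs on arbitrary input.

```lex
%{
#include <stdio.h>
%}

%%

[-+]?[1-9][0-9]*    { printf("integer: %s\n", yytext); }
[0-9]*"."[0-9]+     { printf("float:   %s\n", yytext); }
[a-zA-Z]+           { printf("word:    %s\n", yytext); }
[ \t\n]+            { /* skip white space */ }
.                   { printf("other:   %s\n", yytext); }

%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

On the input 1.2345 the float rule is chosen even though the integer rule also matches the leading 1, because the float match is longer. A lone 0 ends up in the "other" rule, since [1-9] excludes a leading zero and the float pattern requires a decimal point.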

  9. Lex/Flex Regular Expressions
     Lex/Flex uses an extended form of regular expressions:
     • c        any character except meta-characters (see the Meta-characters slide)
     • [...]    the list of enclosed chars (may be a range)
     • [^...]   the list of chars not enclosed
     • .        any ASCII char except newline
     • xy       concatenation of x and y
     • x*       zero or more occurrences of x
     • x+       x* but not ε (one or more occurrences of x)
     • x?       an optional x (same as x|ε)

  10. Two Rules
     1. lex/flex will always match the longest (number of characters) token possible.
     2. If two or more possible tokens are of the same length, then the token with the regular expression that is defined first in the lex/flex specification is favored.
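Both rules show up in the classic keyword-versus-identifier case. This is a minimal sketch; the keyword "if" and the printed messages are hypothetical.

```lex
%{
#include <stdio.h>
%}

%%

"if"        { printf("keyword IF\n"); }
[a-z]+      { printf("identifier: %s\n", yytext); }
[ \t\n]+    { /* skip white space */ }
.           { /* ignore anything else */ }

%%

int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

For the input `if`, both of the first two patterns match two characters, so the rule listed first (the keyword) is chosen. For the input `iffy`, the second pattern matches four characters against two for the keyword, so the longest match wins and the text is reported as an identifier. Swapping the order of the two rules would make the keyword rule unreachable.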

  11. Regular Expression Examples
     • a delimiter for an English sentence
         "."|"?"|!
       OR
         [.?!]
     • C++ comment: // call foo() here
         "//".*
     • white space
         [ \t]+
     • English sentence:
         ([ \t]+|[a-zA-Z]+)+("."|"?"|!)

  12. Special Functions
     • yytext: where the text matched most recently is stored
     • yyleng: number of characters in the text most recently matched
     • yylval: associated value of the current token
     • yymore(): append the next string matched to the current contents of yytext
     • yyless(n): remove from yytext all but the first n characters
     • unput(c): return character c to the input stream
     • yywrap(): may be replaced by the user. The yywrap method is called by the lexical analyzer whenever it inputs an EOF as the first character when trying to match a regular expression.
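A short sketch of how yytext, yyleng, yyless() and yywrap() fit together in an action follows. The three-character cutoff and the messages are arbitrary choices for illustration, and yyleng is cast to int because its exact type can differ between lex/flex versions.

```lex
%{
#include <stdio.h>
%}

%%

[a-z]+      {
                printf("matched \"%s\" (%d chars)\n", yytext, (int)yyleng);
                if (yyleng > 3) {
                    yyless(3);   /* keep the first 3 chars, return the rest to the input */
                    printf("  kept \"%s\"; the rest will be rescanned\n", yytext);
                }
            }
.|\n        { /* skip everything else */ }

%%

/* User-supplied yywrap: returning 1 tells the scanner there is no more input. */
int yywrap(void) { return 1; }

int main(void)
{
    yylex();
    return 0;
}
```

Each call to yyless(3) shortens the returned text by at least three characters, so a long word is reported in successively shorter pieces until three or fewer characters remain.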

  13. Example Flex Code flex.l
