Lexical Analysis - Scanner
66.648 Compiler Design, Lecture 2 (01/14/98)
Computer Science, Rensselaer Polytechnic
Lecture Outline: Scanners/Lexical Analyzers, Regular Expressions, NFA/DFA, Administration
Introduction
Example: System.out.println("Hello Class");
has the tokens System, dot, out, dot, println, left paren, the string
"Hello Class", right paren, and semicolon.
Qn: How are tokens defined and recognized?
Ans: By using regular expressions to define a token
as a formal regular language.
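As a rough illustration of the answer above, the sketch below describes each token class of the println example with a regular expression and scans the input left to right. It uses Java's java.util.regex library; the class name TokenSketch, the method tokenize, and the token-class names (ID, STRING, DOT, and so on) are assumptions made for this example and do not come from the lecture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSketch {
    // One alternative per token class; the group names are illustrative only.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<ID>[A-Za-z][A-Za-z0-9]*)"
        + "|(?<STRING>\"[^\"]*\")"
        + "|(?<DOT>\\.)"
        + "|(?<LPAREN>\\()"
        + "|(?<RPAREN>\\))"
        + "|(?<SEMI>;)"
        + "|(?<WS>\\s+)");

    private static final String[] KINDS =
        {"ID", "STRING", "DOT", "LPAREN", "RPAREN", "SEMI", "WS"};

    // Returns tokens as "KIND:text" strings; whitespace is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                    "Unexpected character at position " + pos);
            }
            // Find which alternative matched and record the token.
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    if (!kind.equals("WS")) {
                        tokens.add(kind + ":" + m.group(kind));
                    }
                    break;
                }
            }
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling TokenSketch.tokenize on the example statement yields nine tokens, starting with ID:System and DOT:. and ending with RPAREN:) and SEMI:;, which matches the token list given for the example.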
Formal Languages --
Alphabet - a finite set of symbols; ASCII is one example.
String - finite sequence of symbols from the alphabet.
Empty string = special string of length 0
Language = set of strings over a given alphabet
(e.g., set of all programs)
A reg. expression E denotes a language L(E)
An alphabet symbol, a, is a regular expression.
The empty string (epsilon) is also a regular expression.
Example: write a regular expression for numbers such as 1997 and 19.97.
Solution: Note use of regular definitions as intermediate
names that define regular subexpressions.
digit -> 0 | 1 | 2 | 3 | … | 9
digits -> digit digit* (often written as digit+)
The * is the Kleene star (zero or more repetitions), so digit digit*
means one or more digits.
. digits | epsilon (the optional fractional part)
Note that we have used all the definitions of a regular expression.
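The regular definitions for numbers can be mirrored with named string constants, each built from the previous one. This is a minimal sketch using java.util.regex; the class name NumberSketch and the constant name OPTIONAL_FRACTION are assumptions introduced here for illustration.

```java
import java.util.regex.Pattern;

class NumberSketch {
    // Each constant plays the role of one regular definition.
    private static final String DIGIT = "[0-9]";                 // 0 | 1 | ... | 9
    private static final String DIGITS = DIGIT + "+";            // digit digit*
    // Hypothetical name for ". digits | epsilon".
    private static final String OPTIONAL_FRACTION = "(\\." + DIGITS + ")?";

    private static final Pattern NUMBER =
        Pattern.compile(DIGITS + OPTIONAL_FRACTION);

    // True when the whole string is a number under the definitions above.
    static boolean isNumber(String s) {
        return NUMBER.matcher(s).matches();
    }
}
```

Under these definitions both 1997 and 19.97 are accepted, while 19. and .97 are rejected because digits requires at least one digit on each side of the dot.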
One can define similar regular expressions for identifiers,
comments, strings, operators, and delimiters.
Qn: How to write a regular expression for identifiers?
(An identifier is a letter followed by zero or more letters or digits.)
letter -> a | A | b | B | … | z | Z
digit -> 0 | 1 | 2 | … | 9
letter_or_digit -> letter | digit
identifier -> letter | letter letter_or_digit*
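The identifier definitions translate almost line by line into a hand-written check. The sketch below is one possible rendering; the class and method names are assumptions, and it deliberately follows the lecture's simplified definition (ASCII letters and digits only) rather than the full Java identifier rules, which also allow characters such as _ and $.

```java
class IdentifierSketch {
    // letter: a | A | b | B | ... | z | Z
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // digit: 0 | 1 | ... | 9
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // letter_or_digit: letter | digit
    static boolean isLetterOrDigit(char c) {
        return isLetter(c) || isDigit(c);
    }

    // identifier: a letter followed by zero or more letters or digits
    static boolean isIdentifier(String s) {
        if (s.isEmpty() || !isLetter(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isLetterOrDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

For instance, println and x1 are accepted, while 1x and the empty string are rejected.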
A General Approach
A transition graph represents an NFA.
From a state (node), there may be more than one edge labeled with the same alphabet symbol, and there may be no edge at all for a given input symbol.
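One common way to run such a graph is to track the set of all states the NFA could currently be in. The sketch below does this for a small example NFA that is not from the lecture: it accepts strings over {a, b} that end in ab, state 0 has two edges labeled a, and state 1 has no edge labeled a. The class name NfaSketch and the state numbering are assumptions, and epsilon edges are left out to keep the example short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NfaSketch {
    // DELTA.get(state).get(symbol) is the set of possible next states.
    private static final Map<Integer, Map<Character, Set<Integer>>> DELTA =
        new HashMap<>();

    static {
        addEdge(0, 'a', 0);
        addEdge(0, 'a', 1);   // second edge labeled 'a' out of state 0
        addEdge(0, 'b', 0);
        addEdge(1, 'b', 2);   // state 1 has no edge labeled 'a'
    }

    private static void addEdge(int from, char symbol, int to) {
        DELTA.computeIfAbsent(from, k -> new HashMap<>())
             .computeIfAbsent(symbol, k -> new HashSet<>())
             .add(to);
    }

    // Start in state 0; state 2 is the only accepting state.
    static boolean accepts(String input) {
        Set<Integer> current = new HashSet<>();
        current.add(0);
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int state : current) {
                Map<Character, Set<Integer>> edges = DELTA.get(state);
                if (edges != null && edges.containsKey(c)) {
                    next.addAll(edges.get(c));
                }
            }
            current = next;   // may become empty if no edge applies
        }
        return current.contains(2);
    }
}
```

With this graph, ab and aab are accepted and aba is rejected.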
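A finite automaton is deterministic if each state has at most one outgoing edge for each input symbol and no edge is labeled with the empty string.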
Such a transition graph is called a state graph.
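Because a deterministic automaton has at most one next state for each state and input symbol, it can be driven by a simple transition table. The sketch below encodes a DFA for the number definitions from earlier in these notes (digits with an optional fraction); the state numbering, the class name DfaSketch, and the table layout are assumptions made for this example.

```java
class DfaSketch {
    private static final int ERROR = -1;

    // Rows are states 0..3; columns are symbol classes: 0 = digit, 1 = '.'.
    private static final int[][] NEXT = {
        {1, ERROR},   // 0: start
        {1, 2},       // 1: integer part (accepting)
        {3, ERROR},   // 2: just saw '.'
        {3, ERROR},   // 3: fraction digits (accepting)
    };

    private static final boolean[] ACCEPTING = {false, true, false, true};

    // Maps a character to its column in NEXT, or ERROR if it has none.
    private static int symbolClass(char c) {
        if (c >= '0' && c <= '9') return 0;
        if (c == '.') return 1;
        return ERROR;
    }

    // Follows exactly one path through the table; no sets of states needed.
    static boolean accepts(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            int cls = symbolClass(input.charAt(i));
            if (cls == ERROR) return false;
            state = NEXT[state][cls];
            if (state == ERROR) return false;
        }
        return ACCEPTING[state];
    }
}
```

The table accepts 1997 and 19.97 and rejects 19. and .97, since a dot must be both preceded and followed by at least one digit.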