
正則表示法 Regular Expression


chandler

Presentation Transcript


  1. 正則表示法 Regular Expression

  2. Regular Expression
     • A language is regular if its syntax can be expressed by a single EBNF expression.
     • The requirement that a single equation suffices also implies that only terminal symbols occur in the expression.
     • Such an expression is called a regular expression.

  3. [Slide with no transcribed text]

  4. Tokens
     • Identifiers: x y11 elsex
     • Keywords: if else while for break
     • Integers: 2 1000 -20
     • Floating-point: 2.0 -0.0010 .02 1e5
     • Symbols: + * { } ++ << < <= [ ]
     • Strings: "x" "He said, \"I love EECS 483\""
     • Comments: /* bla bla bla */

  5. How to Describe Tokens
     • Use regular expressions to describe programming language tokens!
     • A regular expression (RE) is defined inductively:
         • a     an ordinary character stands for itself
         • ε     the empty string
         • R|S   either R or S (alternation), where R and S are REs
         • RS    R followed by S (concatenation)
         • R*    concatenation of R zero or more times (Kleene closure)
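The inductive definition maps directly onto a small recursive matcher. The following is a minimal sketch, not an efficient implementation: the class names (`Char`, `Empty`, `Alt`, `Cat`, `Star`) and the helpers `ends` and `matches` are hypothetical names chosen for illustration, one per case of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Char:          # a: an ordinary character
    c: str


@dataclass(frozen=True)
class Empty:         # ε: the empty string
    pass


@dataclass(frozen=True)
class Alt:           # R|S: alternation
    r: object
    s: object


@dataclass(frozen=True)
class Cat:           # RS: concatenation
    r: object
    s: object


@dataclass(frozen=True)
class Star:          # R*: Kleene closure
    r: object


def ends(regex, text, start):
    """Return every index at which a match of regex that begins at start can end."""
    if isinstance(regex, Char):
        return {start + 1} if text[start:start + 1] == regex.c else set()
    if isinstance(regex, Empty):
        return {start}
    if isinstance(regex, Alt):
        return ends(regex.r, text, start) | ends(regex.s, text, start)
    if isinstance(regex, Cat):
        return {e for m in ends(regex.r, text, start) for e in ends(regex.s, text, m)}
    if isinstance(regex, Star):
        reached = {start}      # zero repetitions always match
        frontier = {start}
        while frontier:        # apply R once more from each newly reached index
            frontier = {e for m in frontier for e in ends(regex.r, text, m)} - reached
            reached |= frontier
        return reached
    raise TypeError(f"not a regular expression: {regex!r}")


def matches(regex, text):
    """True when the whole of text is matched by regex."""
    return len(text) in ends(regex, text, 0)


ab_star = Star(Cat(Char("a"), Char("b")))            # (ab)*
optional_a_then_b = Cat(Alt(Char("a"), Empty()), Char("b"))  # (a|ε)b

print(matches(ab_star, "abab"))          # True
print(matches(ab_star, "aba"))           # False
print(matches(optional_a_then_b, "b"))   # True
```

Each branch of `ends` corresponds to exactly one rule of the definition, which is the reason for choosing this shape over a faster automaton-based approach.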

  6. Language
     • A regular expression R describes a set of strings of characters denoted L(R)
     • L(R) = the language defined by R
         • L(abc) = { abc }
         • L(hello|goodbye) = { hello, goodbye }
         • L(1(0|1)*) = all binary numbers that start with a 1
     • Each token can be defined using a regular expression
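Membership in L(R) can be checked mechanically. This is a minimal sketch using Python's standard `re` module, on the assumption that `re.fullmatch` (the whole string must match) is the right test for membership; the helper name `in_language` is hypothetical. Python's `re` accepts more than the regular expressions defined on these slides, so only the operators introduced here are used.

```python
import re


def in_language(pattern, text):
    """True when text is a member of L(pattern): the entire string must match."""
    return re.fullmatch(pattern, text) is not None


print(in_language("abc", "abc"))                      # True
print(in_language("hello|goodbye", "goodbye"))        # True
print(in_language("hello|goodbye", "hellogoodbye"))   # False
print(in_language("1(0|1)*", "1010"))                 # True
print(in_language("1(0|1)*", "0101"))                 # False: does not start with 1
```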

  7. RE Notational Shorthand
     • R+      one or more strings of R: R(R*)
     • R?      optional R: (R|ε)
     • [abcd]  one of the listed characters: (a|b|c|d)
     • [a-z]   one character from this range: (a|b|c|d|...|z)
     • [^ab]   any one character except the listed ones
     • [^a-z]  one character not from this range
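The shorthand forms add no expressive power; each one abbreviates a longer expression built from the basic operators. The sketch below spot-checks that claim with Python's `re` module on a handful of sample strings. It is a sanity check, not a proof, and the helper name `agree` and the sample list are assumptions made for illustration. In Python an empty alternative, as in `(a|)`, plays the role of ε.

```python
import re


def agree(p, q, samples):
    """True when patterns p and q accept exactly the same strings among samples."""
    return all(
        (re.fullmatch(p, s) is None) == (re.fullmatch(q, s) is None)
        for s in samples
    )


SAMPLES = ["", "a", "b", "c", "d", "e", "z", "aa", "ab", "aaa"]

print(agree("a+", "a(a*)", SAMPLES))             # True: R+ is R(R*)
print(agree("a?", "(a|)", SAMPLES))              # True: R? is (R|ε)
print(agree("[abcd]", "(a|b|c|d)", SAMPLES))     # True

# Negated classes match exactly one character outside the listed set.
print(re.fullmatch("[^ab]", "c") is not None)    # True
print(re.fullmatch("[^ab]", "a") is not None)    # False
print(re.fullmatch("[^a-z]", "7") is not None)   # True
```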

  8. Example Regular Expressions

     Regular Expression, R                Strings in L(R)
     a                                    "a"
     ab                                   "ab"
     a|b                                  "a", "b"
     (ab)*                                "", "ab", "abab", ...
     (a|ε)b                               "ab", "b"
     digit = [0-9]                        "0", "1", "2", ...
     posint = digit+                      "8", "412", ...
     int = -? posint                      "-23", "34", ...
     real = int (ε | (. posint))          "-1.56", "12", "1.056", ...
          = -?[0-9]+ (ε|(.[0-9]+))

     Note: ".45" is not allowed in this definition of real.
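The named definitions (digit, posint, int, real) can be built up by plain string substitution, since none of them refers to itself. A minimal sketch in Python's `re` syntax follows; the constant names and the helper `is_real` are assumptions. Two details differ from the slide notation: the literal dot must be escaped as `\.` because an unescaped `.` matches any character in Python, and ε is written as an empty alternative.

```python
import re

DIGIT = r"[0-9]"
POSINT = rf"{DIGIT}+"                 # posint = digit+
INT = rf"-?{POSINT}"                  # int = -? posint
REAL = rf"{INT}(|(\.{POSINT}))"       # real = int (ε | (. posint))


def is_real(text):
    """True when text is in L(real) as defined on this slide."""
    return re.fullmatch(REAL, text) is not None


print(REAL)                # -?[0-9]+(|(\.[0-9]+))
print(is_real("-1.56"))    # True
print(is_real("12"))       # True
print(is_real("1.056"))    # True
print(is_real(".45"))      # False: an int must precede the dot
```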

  9. Class Problem
     • A. What's the difference between [abc] and abc?
     • Extend the description of real on the previous slide to include numbers in scientific notation, such as -2.3E+17, -2.3e-17, -2.3E17
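One possible way to approach both parts, offered as a sketch and not as the intended answer: `[abc]` matches a single character drawn from the set, while `abc` matches only the three-character string. For scientific notation, the sketch assumes the exponent is an optional suffix consisting of `e` or `E`, an optional sign, and one or more digits. The names `REAL`, `SCI`, and `is_sci` are hypothetical.

```python
import re

# Part A: a character class versus a concatenation.
print(re.fullmatch("[abc]", "b") is not None)     # True: one character from the set
print(re.fullmatch("[abc]", "abc") is not None)   # False: three characters is too many
print(re.fullmatch("abc", "abc") is not None)     # True: exactly the string abc

# Part B: real from the previous slide, followed by an optional exponent.
REAL = r"-?[0-9]+(\.[0-9]+)?"
SCI = REAL + r"([eE][+-]?[0-9]+)?"


def is_sci(text):
    """True when text is a real, optionally followed by an exponent part."""
    return re.fullmatch(SCI, text) is not None


for sample in ["-2.3E+17", "-2.3e-17", "-2.3E17", "12", "2.3E", ".45"]:
    print(sample, is_sci(sample))
```

The first four samples are accepted; `2.3E` is rejected because the exponent needs at least one digit, and `.45` is still rejected because the definition of real is unchanged.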
