Algebraic properties of regular expressions
Download
1 / 9

Algebraic Properties of Regular Expressions - PowerPoint PPT Presentation


  • 76 Views
  • Uploaded on

Algebraic Properties of Regular Expressions. AXIOM. DESCRIPTION. r | s = s | r. | is commutative. r | (s | t) = (r | s) | t. | is associative. (r s) t = r (s t). concatenation is associative. r ( s | t ) = r s | r t ( s | t ) r = s r | t r. concatenation distributes over |.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Algebraic Properties of Regular Expressions' - claus


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Algebraic properties of regular expressions
Algebraic Properties of Regular Expressions

AXIOM

DESCRIPTION

r | s = s | r

| is commutative

r | (s | t) = (r | s) | t

| is associative

(r s) t = r (s t)

concatenation is associative

r ( s | t ) = r s | r t

( s | t ) r = s r | t r

concatenation distributes over |

r = r

r = r

 Is the identity element for concatenation

r* = ( r |  )*

relation between * and 

r** = r*

* is idempotent


Regular expression examples
Regular Expression Examples

  • All Strings that start with “tab” or end with “bat”:tab{A,…,Z,a,...,z}*|{A,…,Z,a,....,z}*bat

  • All Strings in Which Digits 1,2,3 exist in ascending numerical order:{A,…,Z}*1 {A,…,Z}*2 {A,…,Z}*3 {A,…,Z}*


Towards token definition
Towards Token Definition

Regular Definitions: Associate names with Regular Expressions

For Example : PASCAL IDs

letter  A | B | C | … | Z | a | b | … | z

digit  0 | 1 | 2 | … | 9

id  letter ( letter | digit )*

Shorthand Notation:

“+” : one or more r* = r+ |  & r+ = r r*

“?” : zero or one r?=r | 

[range] : set range of characters (replaces “|” )

[A-Z] = A | B | C | … | Z

Example Using Shorthand : PASCAL IDs

id  [A-Za-z][A-Za-z0-9]*


Token recognition
Token Recognition

How can we use concepts developed so far to assist in recognizing tokens of a source language ?

Assume Following Tokens:

if, then, else, relop, id, num

What language construct are they used for ?

Given Tokens, What are Patterns ?

Grammar:stmt  |if expr then stmt |if expr then stmt else stmt |expr  term relop term | termterm  id | num

if  if

then  then

else  else

relop  < | <= | > | >= | = | <>

id  letter ( letter | digit )*

num digit+ (. digit+ ) ? ( E(+ | -) ? digit+ ) ?

What does this represent ? What is  ?


What else does lexical analyzer do
What Else Does Lexical Analyzer Do?

Scan away b, nl, tabs

Can we Define Tokens For These?

blank  b

tab  ^T

newline  ^M

delim  blank | tab | newline

ws  delim+


Overall
Overall

Regular Expression

Token

Attribute-Value

ws

if

then

else

id

num

<

<=

=

< >

>

>=

-

if

then

else

id

num

relop

relop

relop

relop

relop

relop

-

-

-

-

pointer to table entry

pointer to table entry

LT

LE

EQ

NE

GT

GE

Note: Each token has a unique token identifier to define category of lexemes


Constructing transition diagrams for tokens
Constructing Transition Diagrams for Tokens

  • Transition Diagrams (TD) are used to represent the tokens

  • As characters are read, the relevant TDs are used to attempt to match lexeme to a pattern

  • Each TD has:

    • States : Represented by Circles

    • Actions : Represented by Arrows between states

    • Start State : Beginning of a pattern (Arrowhead)

    • Final State(s) : End of pattern (Concentric Circles)

  • Each TD is Deterministic - No need to choose between 2 different actions !


Example tds
Example TDs

> = :

start

>

=

RTN(GE)

0

6

7

other

*

RTN(G)

8

We’ve accepted “>” and have read other char that must be unread.


Example all relops
Example : All RELOPs

start

<

=

0

6

1

2

7

5

4

return(relop, LE)

>

3

return(relop, NE)

other

*

=

return(relop, LT)

return(relop, EQ)

>

=

return(relop, GE)

other

*

8

return(relop, GT)


ad