# Deterministic Finite Automata Machine - PowerPoint PPT Presentation

Deterministic Finite Automata Machine. More Examples of DFAs Simple examples 0*1* vs 0 n 1 n Mod Counting Examples Contains Substring (NFA vs DFA) Any Finite Language Integer mod 7 Adding to Integers Calculator Syntax Processor. Read-Once and Bounded Memory Loop Invariants

### Deterministic Finite Automata Machine

More Examples of DFAs

Simple examples

0*1* vs 0^n 1^n

Mod Counting Examples

Contains Substring (NFA vs DFA)

Any Finite Language

Integer mod 7

Calculator

Syntax Processor

Loop Invariants

Constant Memory

Code with Simple Loop

Deterministic Finite Automata

Path Through Graph

DFA "knowing" and distinguishable strings

Nondeterministic Finite Automata

DFA vs TM

(Extended) Regular Expressions

Parsing with Regular Expressions

Jeff Edmonds

York University

ECS 2001

Lecture 2

• The input arrives as a stream, one character at a time. It cannot be reread.

• Eg:

• simple iterative algorithms

• simple mechanical or electronic devices like elevators and calculators

• simple processes like the job queue of an operating system

• simple patterns within strings of characters.

• simple languages of strings

{01,100,01110,… }

l=40,

x=3,

y=2

• The input arrives as a stream, one character at a time. It cannot be reread.

• What do you have to remember so that in the end you can answer the question?

• The memory is bounded!

Loop Invariant:

Many different equivalent models:

• Loop Invariant

• What is remembered at top of loop

• Code with simple loop

• Deterministic Finite Automata (DFA)

• Focuses on transitions between states

• Path through DFA

• Each path represents a string

• Nondeterministic Finite Automata (NFA)

• Get help from a fairy godmother.

• Regular Expressions

• Representing string patterns

• Extended Regular Expressions

• Also allow set intersection and set complement

Compilably Equivalent

(a*b  a*bb) 0*1*

(a*b  a*bb) 0*1*

Given a string, find the longest block of 1’s

00101111001100011111000011001

The algorithm reads the digits one at a time and remembers enough about what has been read that it never needs to reread anything.

Given a string, find the longest block of 1’s

00101111001100011

Read the next character & re-determine the largest block so far & the size of the current block.

When it has read this much, what does it remember?

• Largest block so far.

• Size of current block.

A loop invariant is an assertion that must be true about the state of the data structure every time the algorithm is at the top of the loop.
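The two remembered quantities above are enough to process the stream in one pass. A minimal sketch, assuming the input is a Python string of 0/1 characters; the function name `longest_block_of_ones` is illustrative and not from the slides:

```python
def longest_block_of_ones(stream):
    """Read characters once, left to right; return the longest run of '1'."""
    largest = 0   # largest block seen so far
    current = 0   # size of the block ending at the last character read
    for ch in stream:
        # Loop invariant: `largest` and `current` are correct for the prefix read.
        current = current + 1 if ch == "1" else 0
        largest = max(largest, current)
    return largest


print(longest_block_of_ones("00101111001100011111000011001"))  # prints 5
```

Both counters can grow with the input length, which is why this particular problem needs about log n bits rather than constant memory.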

The input consists of an array of objects.

We have read in the first i objects.

We will pretend that this prefix is the entire input.

We have a solution for this prefix.

• Do not worry about the entire computation.

We read in the (i+1)st object.

We will pretend that this larger prefix is the entire input.

We extend the solution we have to one for this larger prefix.

• By induction, the computation will always keep the loop invariant true!

Exit

In the end, we have read in the entire input.

The loop invariant (LI) gives us that we have a solution for this entire input.

[Figure: a road to school with distance markers (0 km, 75 km, 79 km) and Exit signs]

Loop Invariant: Insertion Sort

The input consists of an array of integers: 52,23,88,31,25,30,98,62,14,79

We have read in the first i objects.

We will pretend that this prefix is the entire input.

We have a solution for this prefix: 23,31,52,88

We read in the (i+1)st object.

We will pretend that this larger prefix is the entire input.

We extend the solution we have to one for this larger prefix.
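The insertion sort invariant can be checked mechanically at the top of every iteration. A minimal sketch, assuming a Python list as input; the `assert` uses the built-in `sorted` only to state the invariant and is not part of the algorithm:

```python
def insertion_sort(values):
    solution = []  # sorted version of the prefix read so far
    for i, x in enumerate(values):
        # Loop invariant: `solution` is the first i inputs in sorted order.
        assert solution == sorted(values[:i])
        # Extend the solution for the prefix to one for the larger prefix.
        pos = len(solution)
        while pos > 0 and solution[pos - 1] > x:
            pos -= 1
        solution.insert(pos, x)
    return solution


print(insertion_sort([52, 23, 88, 31]))  # prints [23, 31, 52, 88]
```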

Do you think about an algorithm as a sequence of actions?

Do you explain it by saying: “Do this. Do that. Do this”?

Do you get lost about where the computation is and where it is going?

What if there are many paths through many ifs and loops?

How do you know it works for every path on every input?

A Sequence of Actions

Max( a,b,c )

m = a

if( b>m ) m = b endif

if( c>m ) m = c endif

return(m)

“postCond: return max in {a,b,c}”

At least tell me what the algorithm is supposed to do.

• Preconditions: Any assumptions that must be true about the input instance.

• Postconditions: The statement of what must be true when the algorithm/program returns.

A Sequence of Actions

How can you possibly understand this algorithm without knowing what is true when the computation is here? Tell me!

Max( a,b,c )

m = a

“assert: m is max in {a}”

if( b>m ) m = b endif

“assert: m is max in {a,b}”

if( c>m ) m = c endif

“assert: m is max in {a,b,c}”

return(m)

A Sequence of Actions vs A Sequence of Assertions

Max( a,b,c )

“preCond: Input has 3 numbers.”

m = a

“assert: m is max in {a}”

if( b>m ) m = b endif

“assert: m is max in {a,b}”

if( c>m ) m = c endif

“assert: m is max in {a,b,c}”

return(m)

“postCond: return max in {a,b,c}”

It is helpful to have different ways of looking at it.
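The assertions in the Max example can be written as executable checks. A minimal sketch in Python; the name `max3` is illustrative, and the built-in `max` appears only inside the assertions:

```python
def max3(a, b, c):
    # preCond: the input has 3 numbers.
    m = a
    assert m == max([a])          # m is max in {a}
    if b > m:
        m = b
    assert m == max([a, b])       # m is max in {a,b}
    if c > m:
        m = c
    assert m == max([a, b, c])    # m is max in {a,b,c}
    # postCond: return max in {a,b,c}
    return m


print(max3(3, 9, 4))  # prints 9
```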

Fixed/Constant vs Arbitrary/Finite

l=40,

x=3,

y=2

• The input arrives as a stream, one character at a time. It cannot be reread.

• What do you have to remember so that in the end you can answer the question?

• The memory is bounded!

Loop Invariant:

Fixed/Constant vs Arbitrary/Finite

Given a string, find the longest block of 1’s

l=40,

x=3,

y=2

00101111001100011

When it has read this much, what does it remember?

• Largest block so far.

• Size of current block.

• How much memory is needed on an input of length n?

• A count up to n takes log n bits of memory.

• Too much. Memory can’t grow with input!

Fixed/Constant vs Arbitrary/Finite

Max( a,b,c )

“preCond: Input has 3 numbers.”

m = a

“assert: m is max in {a}”

if( b>m ) m = b endif

“assert: m is max in {a,b}”

if( c>m ) m = c endif

“assert: m is max in {a,b,c}”

return(m)

“postCond: return max in {a,b,c}”

• How much memory is needed on an input of length n?

• An index up to n takes log n bits of memory.

• Too much. Memory can’t grow with input!

Fixed/Constant vs Arbitrary/Finite

• Given the needs of the problem at hand, a programmer can give her constant memory program

• as many variables and

• as big a (finite) range for each variable

• as she wants.

• But these numbers are fixed/constant.

• If K(J,I) denotes these numbers for program J on input I,

• then this function can depend on J,

• but can’t depend on I

• (or on |I|).

• ∀ regular language L,

• ∃ constant memory program J, ∃ an integer k,

• ∀ inputs I,

• K(J,I) ≤ k

Fixed/Constant vs Arbitrary/Finite

I come up with a regular language L.

I write a constant memory program J and give it k=1,000,000,000,000,000 variables, each with range 0..1,000,000,000,000,000!

Wow. That’s not fair. With more memory, it can memorize more of the input and then can have any power to determine the answer.

I will let him use any amount of memory, but this fixed J must work for all inputs.

If he uses more memory, I will give him a bigger input I.

Hee Hee Hee

J must still solve the problem.

• ∀ regular language L,

• ∃ constant memory program J, ∃ an integer k,

• ∀ inputs I,

• K(J,I) ≤ k

Read-Once & Bounded Memory Loop Invariant

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

α = 001011

Length = |α| = 6 > 3

Therefore not in language L.

α = 011

# of 1’s = 2 is even

Therefore not in language L.

α = 001

Length = |α| = 3 ≤ 3

# of 1’s = 1 is odd

Therefore in language L.

Read-Once & Bounded Memory Loop Invariant

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

l = length = |α| ∈ {0,1,2,3,more}

# of 1’s is r ∈ {even,odd}

α = 001

When it has read this much, what does it remember?

Read-Once & Bounded Memory Loop Invariant

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

l = length = |α| ∈ {0,1,2,3,more}

# of 1’s is r ∈ {even,odd}

With the empty string read, establish the loop invariant.

α = ε: l = 0, r = even

Read the next character & re-determine what is remembered (l and r).

α = 0: l_t = l_{t-1}+1 = 1, r_t = r_{t-1} = even

α = 01: l_t = l_{t-1}+1 = 2, r_t = r_{t-1}+1 = odd

α = 010: l_t = l_{t-1}+1 = 3, r_t = r_{t-1} = odd

α = 0101: l_t = l_{t-1}+1 = more, r_t = r_{t-1}+1 = even

Exit

Once the string is all read in, determine if it is in L or not.

α ∉ L

Code with Simple Loop

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

Loop Invariant that must always be true when at top of loop.

Gives a picture of what will be remembered about the prefix read so far.

Temporary variables need not be mentioned.

Establish the Loop Invariant for the empty string.

Read the next character & maintain the Loop Invariant.

Exit

Once the input string is all read in, determine if it is in L or not.
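The three steps (establish, maintain, exit) can be sketched as a single loop for the language L used in these slides. This is a minimal Python sketch, not the code shown on the original slide; the variable names `length` and `parity` stand in for l and r:

```python
def in_L(alpha):
    # Establish the loop invariant for the empty string.
    length = 0        # one of 0, 1, 2, 3, "more"
    parity = "even"   # parity of the number of 1's read so far
    for ch in alpha:
        # Read the next character and maintain the loop invariant.
        length = length + 1 if length != "more" and length < 3 else "more"
        if ch == "1":
            parity = "odd" if parity == "even" else "even"
    # Exit: the whole string is read in; decide membership.
    return length != "more" and parity == "odd"


for alpha in ["001011", "011", "001", "0101"]:
    print(alpha, in_L(alpha))
```

The memory is bounded: `length` takes one of five values and `parity` one of two, whatever the input length.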

Deterministic Finite Automaton (DFA)

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

Temp variables need not be considered.

Build a state node for each “state” that the loop invariant says the computation might be in, i.e. for every setting of the variables.

Give the states meaningful names.

Establish the Loop Invariant for the empty string by specifying the start state.

For each state node & for each character, have an edge to the state node to transition to.

δ(q_{current}, c_{next read}) = q_{next}

Exit

Accept states are those at which the DFA accepts the input if it is in this state when the string ends.

Path Through DFA

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

α = 010

[Figure: path from the start state along edges labeled 0, 1, 0]

α ∈ L

α = 01010

[Figure: path from the start state along the edges labeled by α]

α ∉ L

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

When in this state, we say that the DFA “knows” that for the prefix read, length=2 and r=even.

If all prefixes ending up at this state have some common property and the DFA is in this state, then we say that the DFA “knows” this property.

The language L does not distinguish between prefixes α = 00 and β = 11, because for all future ζ, αζ and βζ have the same answer.

Hence the DFA need not distinguish which of α and β was read.

Hence these α and β can go to the same state.

The language L does need to distinguish between prefixes α = 00 and β = 01, because for future ζ = 0, αζ and βζ have different answers.

Hence the DFA must distinguish which of α and β was read.

Hence these α and β must go to different states.

Partition all strings into sets based on whether they are distinguished by the language L.

These sets correspond directly to the needed states of the DFA.

[Figure: states labeled with their sets of strings: {0}, {1}, {00,11}, {01,10}, {100,010,001,111}, {all other strings}]
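The partition can be computed by brute force for this particular L. The sketch below is an illustration under stated assumptions: it only examines prefixes of length at most 4 and futures ζ of length at most 3 (enough here, because L contains no string longer than three characters), and the helper names are hypothetical. It groups prefixes by which futures lead to acceptance.

```python
from itertools import product


def in_L(s):
    return len(s) <= 3 and s.count("1") % 2 == 1


def strings_up_to(n):
    return ["".join(p) for k in range(n + 1) for p in product("01", repeat=k)]


def partition(max_len=4):
    futures = strings_up_to(3)   # any longer future is always rejected by L
    classes = {}
    for alpha in strings_up_to(max_len):
        # Two prefixes are indistinguishable iff every future gives the same answer.
        signature = tuple(in_L(alpha + zeta) for zeta in futures)
        classes.setdefault(signature, []).append(alpha)
    return list(classes.values())


for group in partition():
    print(group)
```

Under these assumptions the sketch finds seven classes: the six sets listed above plus a class containing only the empty string, which serves as the start state.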

NonDeterminism

“Nondeterministic” means that what the machine does next is not set in stone, i.e. not determined, but there is a choice.

NFA are a natural abstraction of life.

δ(q_{current}, c_{next read}) = { q_{next} }

The transition function specifies a set of states, one of which will be the next state.

Jeff talks of a Fairy Godmother helping to know which way to go.

We say God is giving you a “fair” life iff there exists a reasonable path that you could follow from your start state to a final state that you would accept.

Path Through NFA

α = 0001010

[Figure: NFA with a path from the start state labeled 0, 0, 0, 1, 0, 1, 0]

We say a string α is accepted iff there is a path through the NFA labeled by the string ending at an accept state.

But there are also many rejecting paths labeled α.

That’s ok.

The language L of strings accepted by this NFA is:

It is my job to tell you just before you start reading the 0101.

TM vs DFA

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

A DFA can be built into hardware using and/or gates.

Model of Computation: Turing Machine vs DFA

• The current configuration is specified by

• The contents of the tape.

• Unlike the TM, the “tape” in a DFA can never be changed and is read only once.

• The current state q

• One of some fixed number

• One of which is the start state

• And some of which are accept states.

• The current location of the head.

• In a DFA the head moves right each step.

• But it does not “know” its current location.

• It only knows the value of the current cell.

• The legal operations are:

• Given the current state q

• And the value c of the cell with the head

• The DFA looks up in a preset table

• The next state q’

• i.e. the transition function δ(q,c) = <q’,c’,direction>; a DFA uses only q’, since it never writes and its head always moves right.

• The finite state control can be thought of as being a JAVA object with:

• a finite length of code (& a current line)

• a finite number of variables

• each taking on a current value from a finite range

• Its periscope can look at one cell of the tape

• Each time step it decides

• How to change the values of its internal variables and its current line of code

• The finite state control can be thought of as having

• A fixed sized black board on which to write some bounded amount of information q.

• Eg, if it remembers a total of r bits, then the number of different states q is 2^r.

• Eg, if it remembers x ∈ {1,2,3,4}, y ∈ {1,2,3,4,5}, & p ∈ {even,odd}, then the number of different states q is 4×5×2 = 40.

• Give the states meaningful names like q<line=39,x=3,y=2,p=even>.

• We will always assume we are at the top of the loop (line=1).

• And learning the value c of the cell under the new head position.

• Can instantly do any computation on what it knows.

• (even solving incomputable problems)

δ(q_{knows}, c) = q_{computed}

q_{knows} is the state indicating what the TM currently knows.

q_{computed} is the state indicating what the TM computes.

δ(q_{l=1,r=even}, 1) = q_{l=2,r=odd}

• A deterministic finite automaton (DFA) M is defined by a 5-tuple M = (Q, Σ, δ, q0, F)

• Q: finite set of states

• Σ: finite alphabet

• δ: transition function δ: Q×Σ → Q

• q0 ∈ Q: start state

• F ⊆ Q: set of accepting states

When in state q and reading the character c, transition to state q’ = δ(q,c)
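The 5-tuple maps directly onto a lookup table. A minimal sketch, assuming states are encoded as (length, parity) pairs plus a dead state "more" for the language L of the earlier slides; this encoding and the name `run_dfa` are illustrative choices, not taken from the slides:

```python
def run_dfa(delta, start, accepting, string):
    q = start
    for c in string:
        q = delta[(q, c)]        # q' = delta(q, c), a pure table lookup
    return q in accepting


# Q: (length, parity) pairs plus the dead state "more".
states = [(l, r) for l in range(4) for r in ("even", "odd")] + ["more"]
delta = {}
for q in states:
    for c in "01":
        if q == "more" or q[0] == 3:
            delta[(q, c)] = "more"           # a fourth character is too many
        else:
            l, r = q
            flipped = "odd" if r == "even" else "even"
            delta[(q, c)] = (l + 1, flipped if c == "1" else r)
start = (0, "even")
accepting = {(l, "odd") for l in range(1, 4)}

print(run_dfa(delta, start, accepting, "010"))    # prints True
print(run_dfa(delta, start, accepting, "01010"))  # prints False
```

The table includes the unreachable state (0, "odd") and keeps (3, "even") separate from "more"; a minimized DFA would drop the first and merge the second with the dead state.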

• A nondeterministic finite automaton (NFA) M is defined by a 5-tuple M = (Q, Σ, δ, q0, F)

• Q: finite set of states

• Σ: finite alphabet

• δ: transition function δ: Q×Σ → P(Q)

• q0 ∈ Q: start state

• F ⊆ Q: set of accepting states

When in state q and reading the character c, transition to one of the states q’ in the set δ(q,c).

I can help choose.
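One way to do without the fairy godmother is to track every state the NFA could be in. The sketch below assumes a hypothetical example NFA (states 0 to 4, accepting strings that contain the substring 0101); the slides' own NFA diagram is not recoverable from the transcript, so this automaton is an assumption made for illustration.

```python
def nfa_accepts(delta, start, accepting, string):
    current = {start}                      # all states reachable so far
    for c in string:
        current = {q2 for q in current for q2 in delta.get((q, c), set())}
    # Accept iff at least one path ends at an accept state.
    return bool(current & accepting)


# Hypothetical NFA: guess where 0101 starts, then check it.
delta = {
    (0, "0"): {0, 1}, (0, "1"): {0},
    (1, "1"): {2},
    (2, "0"): {3},
    (3, "1"): {4},
    (4, "0"): {4}, (4, "1"): {4},
}

print(nfa_accepts(delta, 0, {4}, "0001010"))  # prints True
print(nfa_accepts(delta, 0, {4}, "0011"))     # prints False
```

Tracking the set of reachable states is the idea behind the standard subset construction that converts an NFA into a DFA.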

• A nondeterministic finite automaton (NFA) M is defined by a 5-tuple M = (Q, Σ, δ, q0, F)

• Q: finite set of states

• Σ: finite alphabet

• δ: transition function δ: Q×Σε → P(Q)

• q0 ∈ Q: start state

• F ⊆ Q: set of accepting states

Σε = Σ ∪ {ε}. Edges labeled ε can be followed anytime.

I can help choose.

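[Figure: the class of languages accepted by DFA = NFA = Regular = Extended Regular, containing a*b*, drawn inside the class “Linear” of problems which have TM/Java programs that solve them in linear time, which also contains 0^n 1^n]

These languages are called Regular Languages.

a*b* vs 0^n 1^n: The first is easy. For the second, the algorithm must count.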

Regular Expressions are a quick notation for defining a language of strings.

• Any finite set of finite strings is a regular expression

• Eg R = {0,01,11}, then L(R) = {0,01,11}.

• R1∪R2, R1∙R2, and R*

representing L1∪L2, L1∙L2, and L*

Extended Regular Expressions

• Also R1∩R2 and the complement of R

Union: L1∪L2 = { α | α∈L1 or α∈L2 }

Intersection: L1∩L2 = { α | α∈L1 and α∈L2 }

Complement: complement of L = { α | α∉L }

Concatenation:

• L1∙L2 = { αβ | α∈L1 and β∈L2 }

• |L1∙L2| is likely |L1|×|L2|

• {ab,cd}∙{wx,yz} = {abwx, abyz, cdwx, cdyz}

• {0,00}∙{0,00} = {00,000,0000}

Kleene Star

• L* = { α1α2α3…αr | r ≥ 0 and each αi ∈ L }

• {ab,cd}* = { ε, ab, cd, abab, abcd, cdab, cdcd, ababab, … }
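Concatenation and star can be tried out on small finite sets. A minimal sketch using Python sets of strings; because L* is infinite, the hypothetical helper `star` takes a bound `max_repeats` on the number of repeats r:

```python
def concat(L1, L2):
    return {a + b for a in L1 for b in L2}


def star(L, max_repeats):
    result = {""}            # r = 0 contributes the empty string
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, L)
        result |= layer
    return result


print(sorted(concat({"ab", "cd"}, {"wx", "yz"})))
print(sorted(concat({"0", "00"}, {"0", "00"})))   # only 3 strings, not 4
print(sorted(star({"ab", "cd"}, 2)))
```

The second call shows why |L1∙L2| is only "likely" |L1|×|L2|: the string 000 arises in two different ways.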

• Unix ‘grep’ command: Global Regular Expression and Print

• Lexical Analyzer Generators (part of compilers)

• Both use regular expression to DFA conversion

• {0,1} = character 0 or character 1

• {0,1}* = a string consisting of zero or more characters, each is character 0 or character 1

• {0,1}^{≤3} = a string consisting of at most 3 characters, each is character 0 or character 1

• 0* = a string consisting of zero or more 0’s

• 10*10* = a string consisting of a 1 followed by zero or more 0’s, followed by a 1, followed by zero or more 0’s = a 0/1 string with two ones starting with a one.

• 0*10* (10*10*)* = a 0/1 string with an odd number of 1’s

• {0,1}^{≤3} ∩ 0*10* (10*10*)* = L

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

How would you express this language without ∩?

L = {1,10,01,100,010,001,111}
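The intersection can be checked by enumeration. A minimal sketch using Python's standard `re` module: it lists every 0/1 string of length at most three and keeps those that fully match 0*10*(10*10*)*. Python's regex syntax has no intersection operator, so the length restriction is applied by the enumeration rather than inside the pattern.

```python
import re
from itertools import product

odd_ones = re.compile(r"0*10*(10*10*)*")

# All strings in {0,1}^{<=3}.
short = ["".join(p) for k in range(4) for p in product("01", repeat=k)]

L = {s for s in short if odd_ones.fullmatch(s)}
print(sorted(L, key=lambda s: (len(s), s)))
```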

L = {α ∈ {0,1}* | α has length at most three and the number of 1's is odd }.

Ok, but in general how would you build a regular expression from the intersection of two others?

No idea. But an earlier page says that there is a way to compile from one to the other.

The advantage of having many models for the same thing is that some things are easier in one, and then the theory automatically converts between them.

0*10* (10*10*)*

L = {α ∈ {0,1}* | the number of 1's is odd }.

00010011001000101010001010000

Yes, this regular expression represents every string with the property and no string without it, but more…

Regular Expressions can also be used to “parse” a string in order to understand the string’s structure better.

Partition the string just before the 2nd, 4th, 6th, … one.

The first block has the form 0*10* with the first * putting out 3 zeroes and the second putting out 2.

The remaining 4 blocks are produced by having ()* be 4.

Each has the form 10*10*

The first such block has the form 10*10* with the first * putting out 0 zeroes and the second putting out 2.

0* (10*)* = {0,1}*

00010011001000101010001010000

What language does this regular expression represent and how does it break up a string?

All strings are represented, hence these two regular expressions produce the same strings, but they parse them differently.
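The block structure that 0*10*(10*10*)* imposes can be recovered with two standard `re` calls. A minimal sketch; the function name `parse_odd_ones` is illustrative, and it returns `None` when the string does not have an odd number of 1's:

```python
import re


def parse_odd_ones(s):
    first = re.match(r"0*10*", s)              # the leading 0*10* block
    if first is None:
        return None
    blocks = re.findall(r"10*10*", s[first.end():])   # the (10*10*)* repeats
    if first.group() + "".join(blocks) != s:
        return None                            # leftover characters: no parse
    return [first.group()] + blocks


print(parse_odd_ones("00010011001000101010001010000"))
# prints ['000100', '1100', '100010', '101000', '1010000']
```

The first block has 3 and 2 zeroes around its 1, and four 10*10* blocks follow, matching the partition described above.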

Build a Parse Tree

0*10* (10*10*)*  {a,b}*

Choose one object from the set for union.

00010011001000101010001010000

[Figure: parse tree whose leaves read 000 1 00 1 1 00 1 000 1 0 1 0 1 000 1 0 1 0000]

Choose one object from each side for concatenation.

Choose the number of repeats r ≥ 0 for star.

Fixed/Constant vs Arbitrary/Finite

I make a DFA M with k=1,000,000,000,000,000 states.

Wow. That’s not fair. With more states, it can count higher.

I will let him use any number of states, but this fixed M must work for all inputs.

If he uses more states, I will give him a bigger input I.

M must still solve the problem.

• ∃ DFA M, ∃ an integer k,

• ∀ inputs I,

• K(M,I) ≤ k

• DFA can’t count arbitrarily high. Hence a typical thing to do is to count mod some integer.

• Three ways of thinking about mod.

• mod(85,3) is a function that divides 85 by 3 giving 28 with a remainder of 1 and hence outputs 1. Note the answer is in {0,1,2}.

• We count 0,1,2,0,1,2,0,1,2,0,1,2, …

• In the mod 3 world, -5 = -2 = 1 = 4 = 7

• Hence we could equivalently say

• (8×10)+5 = 85 = 1 (mod 3)

• (8×10)+5 = (2×1)+2 = 4 = 1 (mod 3)

• (8×10)+5 = (-1×1)+2 = 1 (mod 3)

• The final answer is typically in the range {0,1,2}, but it does not have to be.
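These identities are easy to check numerically. A minimal sketch; note that Python's `%` operator returns a result in {0,1,2} for a positive modulus even when the left operand is negative, which is a property of Python rather than something the slides state:

```python
n = 3

direct = (8 * 10 + 5) % n                            # reduce at the end
reduced_first = ((8 % n) * (10 % n) + (5 % n)) % n   # (2 x 1) + 2
with_negative = ((-1) * 1 + 2) % n                   # 8 = -1 (mod 3)
same_class = [x % n for x in (-5, -2, 1, 4, 7)]

print(direct, reduced_first, with_negative)  # prints 1 1 1
print(same_class)                            # prints [1, 1, 1, 1, 1]
```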

L = {ω ∈ {0,1}∗ | [2 × (# of 0’s in ω) - (# of 1’s in ω)] mod 5 = 0}

Eg, ω = 10011010011001001

2 × (# of 0’s in ω) - (# of 1’s in ω) = 2 × 9 − 8 = 10, and 10 mod 5 = 0. Hence ω ∈ L.

When it has read this much, what does it remember?

• r = [2 × (# of 0’s in ω) - (# of 1’s in ω)] mod 5

If the next character is a 0, increase r by 2.

If the next character is a 1, decrease r by 1.

ω = 100111

ω ∈ L

Regular expression?

No idea?

But I know it is possible.
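The five-valued counter r is all the machine needs, whatever the regular expression would look like. A minimal sketch of the update rule described above; the name `in_L_mod5` is illustrative:

```python
def in_L_mod5(omega):
    r = 0   # [2 x (# of 0's) - (# of 1's)] mod 5 for the prefix read so far
    for ch in omega:
        if ch == "0":
            r = (r + 2) % 5   # a 0 increases r by 2
        else:
            r = (r - 1) % 5   # a 1 decreases r by 1
    return r == 0


print(in_L_mod5("10011010011001001"))  # prints True
print(in_L_mod5("100111"))             # prints True
```

Since r only ever takes the values 0 to 4, the corresponding DFA has five states, one per value of r.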

α = 0001010

[Figure: NFA path from the start state labeled 0, 0, 0, 1, 0, 1, 0]

α ∈ L

We say a string α is accepted iff there is a path through the NFA labeled by the string ending at an accept state.

But we want a Deterministic FA.

α = 001001010

[Figure: DFA path from the start state labeled by α; one state loops to itself on 0,1]

α ∈ L

Hint: Pooh can compute anything from what’s on his black board.

Input: I = b100101b

Output: f(I) if |I| ≤ 6

• Leap into the middle of the algorithm.

• If you have read the prefix α = 100, what do you want to remember on the black board?

• The entire prefix read so far is on the board.

• What state should you be in?

• q_{100}.

• δ(q_{100}, 1) = ?

• You are in state q_{100} and the next character is a 1.

• Hence the input prefix you have read is 100.

• Hence after reading the 1, the prefix read will be 1001.

• Hence the next state should be q_{1001}.

• δ(q_{100}, 1) = q_{1001}

For each string α from alphabet ∑ of length at most 6 there is a state q_{α}.

For each character c in this alphabet, what should the next state (and actions) be when reading this character?

• ∀α ∈ ∑^{<6}, ∀c ∈ ∑, δ(q_{α}, c) = q_{αc}

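Output: f(I) if |I| ≤ 6

Input: I = b100101b

• ∀α with |α| < 6, ∀c, δ(q_{α}, c) = q_{αc}

• ∀α with |α| = 6, ∀c, δ(q_{α}, c) = q_{too long}

[Figure: tree of states q_{ε}, q_{0}, q_{1}, q_{00}, q_{01}, q_{10}, q_{11}, q_{000}, …, q_{111}, …, q_{100101}, with edges labeled 0 and 1; q_{too long} loops to itself on 0,1]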

• ∀α with |α| ≤ 6, δ(q_{α}, b) = ?

• Unlike with a TM, the input doesn’t end in a blank.

• Instead a DFA makes a state an accept state if, when halting there, it should accept.

• Again this is done with table lookup.

• ∀α with |α| ≤ 6, f(α) ∈ L ⇔ q_{α} ∈ accept
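The "remember the whole prefix" construction works for any finite language. A minimal sketch, assuming the bound is the length of the longest string in the language rather than the fixed 6 used on the slides; `finite_language_dfa` and `accepts` are hypothetical helper names:

```python
from itertools import product


def finite_language_dfa(language, alphabet="01"):
    max_len = max((len(w) for w in language), default=0)
    too_long = "too long"
    delta = {}
    # One state per prefix alpha of length at most max_len, named by alpha.
    for k in range(max_len + 1):
        for p in product(alphabet, repeat=k):
            alpha = "".join(p)
            for c in alphabet:
                delta[(alpha, c)] = alpha + c if k < max_len else too_long
    for c in alphabet:
        delta[(too_long, c)] = too_long
    # Start state is the empty prefix; accept states are the strings in L.
    return delta, "", set(language)


def accepts(dfa, string):
    delta, q, accepting = dfa
    for c in string:
        q = delta[(q, c)]
    return q in accepting


dfa = finite_language_dfa({"0", "01", "11"})
print(accepts(dfa, "01"), accepts(dfa, "10"), accepts(dfa, "0111"))
```

The number of states is exponential in the longest string's length, but it is a constant that does not depend on the input.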

20569 mod 7 = 3

205694 mod 7

= (20569×10 + 4) mod 7

= ((20569 mod 7)×10 + 4) mod 7

= (3×10 + 4) mod 7

= 34 mod 7 = 6
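The derivation above gives a seven-state machine: remember only the remainder of the prefix read so far. A minimal sketch, assuming the integer arrives as a string of decimal digits; the name `mod7` is illustrative:

```python
def mod7(digits):
    r = 0   # (value of the prefix read so far) mod 7
    for d in digits:
        # Appending digit d turns the value v into 10*v + d.
        r = (r * 10 + int(d)) % 7
    return r


print(mod7("20569"))   # prints 3
print(mod7("205694"))  # prints 6
```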

[Figure: successive animation frames of a column-by-column addition (+) of integers, with digits shown as *]

a:b means if you read an a then output a b and follow this edge to the next state.

When in state “not in a comment”

• If you read a, b, or *, then copy it to the output.

• If you read /*, then give no output (ε = empty string) and go to state “in a comment”.

• If you read /a, /b, /*, or /“ then copy it (delayed) to the output.

When in state “in a comment”

• If you read a, b, *, “, or / then ignore it.

• If you read */, then give no output (ε = empty string) and go to state “not in a comment”.

• If you read *a, *b, /“, then ignore it.

When in state “not in a comment”

• If you read “, then copy it to the output and go to state “in quote”.

• If you then read another ”, then leave this state.

When in state “in quote”

• Copy everything to the output.
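The rules above describe a finite-state transducer. The sketch below is one possible reading, under stated assumptions: straight double quotes stand in for the slides' quote marks, and the helper states "slash" (a / has been read and its output is delayed) and "star" (a * has been read inside a comment) are names introduced here, not taken from the slides.

```python
def strip_comments(text):
    out = []
    state = "code"                       # "not in a comment"
    for ch in text:
        if state == "code":
            if ch == "/":
                state = "slash"          # delay output: this may start /*
            else:
                out.append(ch)
                if ch == '"':
                    state = "quote"
        elif state == "slash":
            if ch == "*":
                state = "comment"        # /* read: give no output
            elif ch == "/":
                out.append("/")          # emit the delayed /, delay the new one
            else:
                out.append("/" + ch)     # emit the delayed / and this character
                state = "quote" if ch == '"' else "code"
        elif state == "comment":
            if ch == "*":
                state = "star"           # this may start */
        elif state == "star":
            if ch == "/":
                state = "code"           # */ read: back to normal
            elif ch != "*":
                state = "comment"
        else:                            # state == "quote": copy everything
            out.append(ch)
            if ch == '"':
                state = "code"
    if state == "slash":
        out.append("/")                  # flush a trailing delayed /
    return "".join(out)


print(strip_comments('ab/*ba*/a"/*b*/"a'))  # prints aba"/*b*/"a
```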

Impossible

• If you do not print it as you go along, then remember what to print when the line ends.

• If you do print it as you go along, then you can’t unprint it when you read the */.

The End