Discrete Maths

242-213, Semester 2, 2013-2014

  • Recognizing input using:

    • automata: a graph-based technique

    • regular expressions: an algebraic technique

      • equivalent to automata

13. Automata and Regular Expressions


Overview

  • Introduction to Automata

  • Representing Automata

  • The ‘aeiou’ Automaton

  • Generating Output

  • Deterministic and Nondeterministic Automata

  • Regular Expressions

  • UNIX Regular Expressions

  • From REs to Automata

  • More Information


1. Introduction to Automata

  • A finite state automaton represents a problem as a series of states and transitions between the states

    • the automaton starts in an initial state

    • input causes a transition from the current state to another

    • a state may be accepting

      • the automaton can (but does not have to) terminate successfully when it enters an accepting state


1.1. An Example

The ‘even-odd’ Automaton

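[Diagram: start -> evenA. evenA loops to itself on b and moves to oddA on a; oddA loops to itself on b and moves back to evenA on a.]

  • The states are the ovals.

  • The transitions are the arrows

    • labelled with the input that ‘triggers’ them

  • The ‘oddA’ state is accepting.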

continued


Execution Sequence

  • Input: b a b a a

    Input read    Move to State
    (start)       evenA   (initial state)
    b             evenA
    a             oddA    (the automaton could choose to terminate here)
    b             oddA
    a             evenA
    a             oddA    (stops since no more input)


1.2. Why are Automata Useful?

  • Automata are a very good way of modeling finite-state systems which change state due to input. Examples:

    • text editors, compilers, UNIX tools like grep

    • communications protocols

    • digital hardware components

      • e.g. adders, RAM

  • These are very different applications.


2. Representing Automata

  • Automata have a mathematical basis which allows them to be analysed, e.g.:

    • prove that they accept correct input

    • prove that they do not accept incorrect input

  • Automata can be manipulated to simplify them, and they can be automatically converted into code.


2.1. A Mathematical Coding

  • We can represent an automaton in terms of sets and mathematical functions.

  • The ‘even-odd’ automaton is:

    startSet = { evenA }

    acceptSet = { oddA }

    nextState(evenA, b) => evenA
    nextState(evenA, a) => oddA
    nextState(oddA, b) => oddA
    nextState(oddA, a) => evenA
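
One way to carry this set-and-function description straight into C is to store nextState as a lookup table and acceptSet as a membership array. The following is only a sketch: the names next_table, accept_table, run_even_odd and the SYM_A / SYM_B symbols are invented for illustration and do not come from the slides.

```c
#include <stddef.h>

// States and input symbols of the 'even-odd' automaton.
enum state  { evenA, oddA, NUM_STATES };
enum symbol { SYM_A, SYM_B, NUM_SYMBOLS };

// nextState as a table: row = current state, column = input symbol.
static const enum state next_table[NUM_STATES][NUM_SYMBOLS] = {
    /* evenA */ { oddA,  evenA },
    /* oddA  */ { evenA, oddA  },
};

// acceptSet as a membership array: 1 if the state is accepting.
static const int accept_table[NUM_STATES] = { 0, 1 };

// Returns 1 if the string is accepted, 0 if it is rejected,
// and -1 if it contains a character other than 'a' or 'b'.
int run_even_odd(const char *input)
{
    enum state s = evenA;                    // startSet = { evenA }
    for (size_t i = 0; input[i] != '\0'; i++) {
        enum symbol sym;
        if (input[i] == 'a')      sym = SYM_A;
        else if (input[i] == 'b') sym = SYM_B;
        else return -1;                      // input outside the alphabet
        s = next_table[s][sym];
    }
    return accept_table[s];
}
```

With this layout, changing the automaton means editing the two tables rather than the control flow.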

continued

  • The ‘even-odd’ automaton only accepts strings which contain an odd number of ‘a’s.



2.2. Automaton in Code

  • It is easy to (automatically) translate an automaton into code, but ...

    • an automaton graph does not contain all the details needed for a program

  • The main extra coding issues:

    • what to do when we enter an accepting state?

    • what to do when the input cannot be processed?

      • e.g. abzz is entered


Encoding the ‘even-odd’ Automaton

    #include <stdio.h>
    #include <stdlib.h>

    enum state {evenA, oddA};          // possible states

    enum state nextState(enum state s, int ch);
    int acceptable(enum state s);

    int main(void)
    {
      enum state currState = evenA;    // start state
      int isAccepting = 0;             // false
      int ch;

      while ((ch = getchar()) != EOF) {
        currState = nextState(currState, ch);
        isAccepting = acceptable(currState);
      }

      // the accepting state is only used at the end of input
      if (isAccepting)
        printf("accepted\n");
      else
        printf("not accepted\n");
      return 0;
    }

continued

    enum state nextState(enum state s, int ch)
    {
      if ((s == evenA) && (ch == 'b')) return evenA;
      if ((s == evenA) && (ch == 'a')) return oddA;
      if ((s == oddA) && (ch == 'b')) return oddA;
      if ((s == oddA) && (ch == 'a')) return evenA;

      // simple handling of incorrect input:
      // any other character (including a newline) ends the program
      printf("Illegal Input\n");
      exit(1);
    }

continued

    int acceptable(enum state s)
    {
      if (s == oddA) return 1;   // oddA is an accepting state
      return 0;
    }


3. The ‘aeiou’ Automaton

  • What English words contain the five vowels (a, e, i, o, u) in order?

  • Some words that match:

    • abstemious

    • facetious

    • sacrilegious


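3.1. Automaton Graph

[Diagram: states 0 to 5 in a row, with start -> 0. Transitions: 0 -a-> 1, 1 -e-> 2, 2 -i-> 3, 3 -o-> 4, 4 -u-> 5. States 0 to 4 loop to themselves on L - a, L - e, L - i, L - o and L - u respectively, where L = all letters. State 5 is accepting.]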


3.2. Execution Sequence (1)

  • Input: f a c e t i o u s

    Input read    Move to State
    (start)       0
    f             0
    a             1
    c             1
    e             2
    t             2
    i             3
    o             4
    u             5   (the automaton can terminate here; no need to process more input)


Execution Sequence (2)

  • Input: a n d r e w

    Input read    Move to State
    (start)       0
    a             1
    n             1
    d             1
    r             1
    e             2
    w             2, and end of input means failure


3.3. Translation to Code

    #include <stdio.h>
    #include <stdlib.h>

    // C enum constants cannot be bare numbers, so state k is named Sk
    enum state {S0, S1, S2, S3, S4, S5};   // poss. states

    enum state nextState(enum state s, int ch);
    int acceptable(enum state s);

    int main(void)
    {
      enum state currState = S0;   // start state
      int isAccepting = 0;         // false
      int ch;

      // stop processing when the accepting state is entered
      while (((ch = getchar()) != EOF) && !isAccepting) {
        currState = nextState(currState, ch);
        isAccepting = acceptable(currState);
      }

      if (isAccepting)
        printf("accepted\n");
      else
        printf("not accepted\n");
      return 0;
    }

continued

    enum state nextState(enum state s, int ch)
    {
      if (s == S0) {
        if (ch == 'a') return S1;
        else return S0;   // input is L-a
      }
      if (s == S1) {
        if (ch == 'e') return S2;
        else return S1;   // input is L-e
      }
      if (s == S2) {
        if (ch == 'i') return S3;
        else return S2;   // input is L-i
      }
      if (s == S3) {
        if (ch == 'o') return S4;
        else return S3;   // input is L-o
      }
      if (s == S4) {
        if (ch == 'u') return S5;
        else return S4;   // input is L-u
      }

      // simple handling of incorrect input
      printf("Illegal Input\n");
      exit(1);
    }  // end of nextState()

continued

    int acceptable(enum state s)
    {
      if (s == S5) return 1;   // 5 is an accepting state
      return 0;
    }


4. Generating Output

  • One possible extension to the basic automaton idea is to allow output:

    • when a transition is ‘triggered’ there can be optional output as well

  • Automata which generate output are sometimes called Finite State Machines (FSMs).


4.1. ‘even-odd’ with Output

[Diagram: the ‘even-odd’ automaton as before (start -> evenA, b loops on both states, a transitions between evenA and oddA), except that the a transition from evenA to oddA is now labelled a/1.]

  • When the ‘a’ transition is triggered out of the evenA state, then a ‘1’ is output.


4.2. Mathematical Coding

  • Add an ‘output’ mathematical function to the automaton representation:

    output( evenA, a ) => 1


4.3. Extending the C Coding

  • The while loop for ‘even-odd’ will become:

    :
    while ((ch = getchar()) != EOF) {
      output(currState, ch);     // output uses the state before the transition
      currState = nextState(currState, ch);
      isAccepting = acceptable(currState);
    }
    :

continued

  • The output() C function:

    void output(enum state s, int ch)
    {
      if ((s == evenA) && (ch == 'a'))
        putchar('1');
    }


5. Deterministic and Nondeterministic Automata

  • We have been writing deterministic automata so far:

    • for an input read by a state there is at most one transition that can be fired

      • e.g. state ‘S’ can process input ‘a’ and ‘w’, and fails for anything else

[Diagram: state S with two outgoing transitions, one labelled a and one labelled w.]


Nondeterministic Automata

  • A nondeterministic (ND) automaton can have 2 or more transitions with the same label leaving a state.

  • Problem: if state S sees input ‘x’, then which transition should it use?

[Diagram: state S with an a transition to state V and two x transitions, one to state T and one to state U.]


5.1. The ‘man’ Automaton

  • Accept all strings that contain “man”

    • this is hard to write as a deterministic automaton. The following has bugs:

[Diagram, marked WRONG: start -> 0, 0 -m-> 1, 1 -a-> 2, 2 -n-> 3. State 0 loops to itself on L - m; the L - a transition from state 1 and the L - n transition from state 2 lead back to state 0.]

continued


  • The input string command will get stuck at state 0:

    Input read    Move to State
    (start)       0
    c             0
    o             0
    m             1
    m             0   (the problem starts here)
    a             0
    n             0
    d             0
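
For comparison, a deterministic automaton that does handle command has to remember that an ‘m’ read in state 1 or 2 may itself be the start of “man”. The sketch below is one possible correct version, not the one from the slides; the function names man_next and contains_man are invented here, and unlike the slide's L labels it treats every non-matching character (not only letters) as "start again".

```c
// Deterministic automaton for "contains man".
// States: 0 = nothing useful seen, 1 = seen "m",
//         2 = seen "ma", 3 = seen "man" (accepting).
int man_next(int s, int ch)
{
    if (s == 3) return 3;                 // accepting state: stay there
    if (ch == 'm') return 1;              // an 'm' always (re)starts a match
    if (s == 1 && ch == 'a') return 2;
    if (s == 2 && ch == 'n') return 3;
    return 0;                             // anything else: start again
}

// Returns 1 if input contains "man", otherwise 0.
int contains_man(const char *input)
{
    int s = 0;                            // start state
    for (int i = 0; input[i] != '\0'; i++)
        s = man_next(s, input[i]);
    return s == 3;
}
```

On command the states visited are 0, 0, 0, 1, 1, 2, 3, 3: the second ‘m’ keeps the automaton in state 1 instead of sending it back to state 0.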


5.2. A ND Automaton Solution

[Diagram: start -> 0, 0 -m-> 1, 1 -a-> 2, 2 -n-> 3. State 0 loops to itself on L (any letter).]

  • It is nondeterministic because an ‘m’ input in state 0 can be dealt with by two transitions:

    • a transition back to state 0, or

    • a transition to state 1

continued


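  • Processing command input:

[Diagram: the possible runs of the ND automaton on c o m m a n d

    always looping in state 0:         0 -c-> 0 -o-> 0 -m-> 0 -m-> 0 -a-> 0 -n-> 0 -d-> 0
    moving to state 1 on the 1st m:    state 1 cannot process the 2nd m (fail: reject the input)
    moving to state 1 on the 2nd m:    1 -a-> 2 -n-> 3 (accepting state)]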


5.3. Executing a ND Automaton

  • It is difficult to code ND automata in conventional languages, such as C.

  • Two different coding approaches:

    • 1. When an input arrives, execute all transitions in parallel. See which succeeds.

    • 2. When an input arrives, try one transition. If it leads to failure then backtrack and try another transition.
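
The first approach can be sketched in C by tracking the set of all states the ND automaton could currently be in, here for the ND ‘man’ automaton. This is an illustrative sketch only: the names nd_step and nd_accepts are invented, the set is stored as a bit mask, state 0 is assumed to loop on every character (the slide labels the loop L, all letters), and state 3 is assumed to stay active once reached.

```c
// ND 'man' automaton: state 0 loops on any input,
// 0 -m-> 1, 1 -a-> 2, 2 -n-> 3 (accepting).
// A set of states is a bit mask: bit k is set when state k is active.
unsigned nd_step(unsigned active, int ch)
{
    unsigned next = 0;
    if (active & (1u << 0)) {
        next |= 1u << 0;                     // transition back to state 0
        if (ch == 'm') next |= 1u << 1;      // and, in parallel, to state 1
    }
    if ((active & (1u << 1)) && ch == 'a') next |= 1u << 2;
    if ((active & (1u << 2)) && ch == 'n') next |= 1u << 3;
    if (active & (1u << 3)) next |= 1u << 3; // assumption: 3 stays accepting
    return next;
}

// Returns 1 if some sequence of choices reaches the accepting state.
int nd_accepts(const char *input)
{
    unsigned active = 1u << 0;               // start in state 0 only
    for (int i = 0; input[i] != '\0'; i++)
        active = nd_step(active, input[i]);
    return (active & (1u << 3)) != 0;
}
```

A state that has no transition for the current input simply drops out of the set, which corresponds to one of the parallel attempts failing. The second approach (backtracking) would instead follow one choice at a time and return to the last choice point on failure.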


5.4. Why use ND Automata?

  • With nondeterminism, some problems are easier to solve/model.

  • Nondeterminism is common in some application areas, such as AI, graph search, and compilers.

continued

  • Any ND automaton can be converted into an equivalent (usually more complex) deterministic one.



6. Regular Expressions (REs)

  • REs are an algebraic way of specifying how to recognise input

    • ‘algebraic’ means that the recognition pattern is defined using RE operands and operators

  • REs are equivalent to automata

    • REs and automata can be used on all the same problems


6.1. REs in grep

  • grep searches input lines, a line at a time.

  • If the line contains a string that matches grep's RE (pattern), then the line is output.

[Diagram: input lines (e.g. from a file) -> grep "RE" -> output matching lines (e.g. to a file)]

  • Example input lines:

    hello andy
    my name is andy
    my bye byhe
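
The per-line test that grep performs can be approximated in C with the POSIX <regex.h> functions regcomp() and regexec(). This is a rough sketch that assumes a POSIX system; the helper name line_matches and its extended parameter are invented for illustration, and real grep does considerably more.

```c
#include <regex.h>
#include <stddef.h>

// Returns 1 if some part of line matches pattern, 0 if not,
// and -1 if the pattern cannot be compiled.
// extended != 0 selects extended REs (as grep -E does).
int line_matches(const char *pattern, const char *line, int extended)
{
    regex_t re;
    int flags = REG_NOSUB | (extended ? REG_EXTENDED : 0);

    if (regcomp(&re, pattern, flags) != 0)
        return -1;
    int found = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}
```

For example, line_matches("and", "hello andy", 0) returns 1, while line_matches("and", "my bye byhe", 0) returns 0. A grep-like loop would read each input line and print it when the function returns 1.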

continued


Examples

  • In each example the input lines are:

    hello andy
    my name is andy
    my bye byhe

  • grep "and" outputs:

    hello andy
    my name is andy

  • grep -E "an|my" outputs ("|" means "or"):

    hello andy
    my name is andy
    my bye byhe

continued

  • grep "hel*" outputs ("*" means "0 or more"):

    hello andy
    my bye byhe


6.2. Why use REs?

  • They are very useful for expressing patterns that recognise textual input.

  • For example, REs are used in:

    • editors

    • compilers

    • web-based search engines

    • communication protocols


6.3. The RE Language

  • A RE defines a pattern which recognises (matches) a set of strings

    • e.g. a RE can be defined that recognises the strings { aa, aba, abba, abbba, abbbba, …}

  • These recognisable strings are sometimes called the RE’s language.


RE Operands

  • There are 4 basic kinds of operands:

    • characters (e.g. ‘a’, ‘1’, ‘(’)

    • the symbol ε (means an empty string ‘’)

    • the symbol {} (means the empty set)

    • variables, which can be assigned a RE

      • variable = RE


RE Operators

  • There are three basic operators:

    • union ‘|’

    • concatenation

    • closure *


Concatenation

  • S T

    • this RE will use the S RE followed by the T RE to match against strings

  • What is matched by the RE "abc"?

    • it is equivalent to: 'a' followed by 'b' followed by 'c'


6.4. REs for C Identifiers

  • We define two RE variables, letter and digit:

    letter = A | B | C | D | ... | Z | a | b | c | d | ... | z

    digit = 0 | 1 | 2 | ... | 9

  • ident is defined using letter and digit:

    ident = letter ( letter | digit )*

continued


  • Strings matched by ident include:

    ab345 w h5g

  • Strings not matched:

    2 $abc ****
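
The ident RE corresponds to a two-state automaton: one state that requires a letter, and one that loops on letters and digits. A hand-written C sketch of that check is given below; the function name is_ident is invented here, and, following the slide's definition of letter, the underscore is deliberately not treated as a letter (real C identifiers do allow it). It assumes the default "C" locale, where isalpha() covers exactly A-Z and a-z.

```c
#include <ctype.h>

// ident = letter ( letter | digit )*
// Returns 1 if the whole string s matches ident, otherwise 0.
int is_ident(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    if (!isalpha(*p))            // first character must be a letter
        return 0;                // (this also rejects the empty string)
    for (p++; *p != '\0'; p++)
        if (!isalnum(*p))        // then any number of letters or digits
            return 0;
    return 1;
}
```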


7. UNIX Regular Expressions

  • Different UNIX tools use slightly different extensions of the basic RE notation

    • vi, awk, sed, grep, egrep, etc.

  • Extra features include:

    • character classes

    • line start ‘^’ and end ‘$’ symbols

    • the wild card symbol ‘.’

    • additional operators, R? and R+


7.1. Character Classes

  • The character class [a1 a2 ... an] stands for a1 | a2 | ... | an

  • Inside a class, a1-an stands for the set of characters between a1 and an

    • e.g. [A-Z] [a-z0-9]


7.2. Line Start and End

  • The ‘^’ matches the beginning of the line, ‘$’ matches the end

    • e.g.

      grep '^andr' /usr/share/dict/words
      grep '^[washingto]*$' /usr/share/dict/words


Example as a Diagram

[Diagram: /usr/share/dict/words (AA's, AOL, AOL's, ...) -> grep "^andr" -> androgen, androgen's, androgynous, android, android's, androids]


7.3. Wild Card Symbol

  • The ‘.’ stands for any character except the newline

    • e.g.

      grep '^a..b.$' chapter1.txt
      grep 't.*t.*t' manual


[Diagram: /usr/share/dict/words (AA's, AOL, AOL's, ...) -> grep "^a..b.$" -> adobe, alibi, ameba]


7.4. R? and R+

  • R? stands for ε | R (0 or 1 R)

  • R+ stands for R | RR | RRR | ... which can also be written as R R*

    • one or more occurrences of R
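
These two shorthands can be checked against their expansions with the POSIX <regex.h> functions, using extended REs. The sketch below assumes a POSIX system; the helper name matches is invented, and because an empty alternative is not portable in POSIX extended REs, the expansion of a b? c is written out as (ac|abc) rather than with an explicit empty string.

```c
#include <regex.h>
#include <stddef.h>

// Returns 1 if text matches the extended RE pattern, otherwise 0
// (a pattern that fails to compile also gives 0).
// Patterns below use ^ and $ so that the whole string must match.
int matches(const char *pattern, const char *text)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    int ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}

// Returns 1 if both patterns give the same answer for text.
int same_answer(const char *p1, const char *p2, const char *text)
{
    return matches(p1, text) == matches(p2, text);
}
```

For instance, same_answer("^ab+c$", "^abb*c$", s) and same_answer("^ab?c$", "^(ac|abc)$", s) both return 1 for s equal to "ac", "abc" and "abbc".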


8. From REs to Automata

  • The translation uses a special kind of ND automaton which uses ε-transitions. Automata of this type are sometimes called ε-NFAs.

  • The translation steps are:

    • RE -> ε-NFA

    • ε-NFA -> ND automaton

    • ND automaton -> deterministic automaton

    • deterministic automaton -> code


ε-NFAs

  • An ε-NFA allows a transition to use an ε label.

  • A transition using an ε label can be triggered without having to match any input.


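ε-NFA Example

  • a*b | b*a is accepted by the following ε-NFA:

[Diagram: ε-NFA with states 1 to 6 and start state 1; one branch passes through states 2 and 3, the other through states 4 and 5; transitions are labelled a, b and ε (four ε-transitions). Annotation: "nondeterminism occurs here".]

  • Example input: "bbba"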
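
An e-NFA (epsilon-NFA) such as this one can be run by tracking a set of states and, after every step, adding all states reachable through e-transitions alone (the e-closure). The sketch below is illustrative only. The exact wiring of the diagram is not recoverable from the slide text, so the layout used here is an assumption: 1 -e-> 2, 1 -e-> 4, state 2 loops on a and moves to 3 on b, state 4 loops on b and moves to 5 on a, 3 -e-> 6, 5 -e-> 6, and 6 is accepting. The names BIT, e_closure, enfa_move and enfa_accepts are invented.

```c
// Sets of states are bit masks: bit k is set when state k is in the set.
#define BIT(k) (1u << (k))

// Add every state reachable using e-transitions only (the e-closure).
unsigned e_closure(unsigned set)
{
    unsigned old;
    do {
        old = set;
        if (set & BIT(1)) set |= BIT(2) | BIT(4);   // 1 -e-> 2, 1 -e-> 4
        if (set & BIT(3)) set |= BIT(6);            // 3 -e-> 6
        if (set & BIT(5)) set |= BIT(6);            // 5 -e-> 6
    } while (set != old);                           // repeat until stable
    return set;
}

// Follow the transitions labelled ch from every state in the set.
unsigned enfa_move(unsigned set, int ch)
{
    unsigned next = 0;
    if ((set & BIT(2)) && ch == 'a') next |= BIT(2);   // a* loop
    if ((set & BIT(2)) && ch == 'b') next |= BIT(3);   // then b
    if ((set & BIT(4)) && ch == 'b') next |= BIT(4);   // b* loop
    if ((set & BIT(4)) && ch == 'a') next |= BIT(5);   // then a
    return next;
}

// Returns 1 if the assumed e-NFA for a*b | b*a accepts input.
int enfa_accepts(const char *input)
{
    unsigned set = e_closure(BIT(1));                  // start state 1
    for (int i = 0; input[i] != '\0'; i++)
        set = e_closure(enfa_move(set, input[i]));
    return (set & BIT(6)) != 0;                        // 6 is accepting
}
```

For the example input "bbba" the sets are {1,2,4}, {3,4,6}, {4}, {4} and finally {5,6}, so the input is accepted via the b*a branch.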


9. More Information

  • Johnsonbaugh, R. 1997. Discrete Mathematics, Prentice Hall, chapter 10.

  • Rosen, K. H. 2007. Discrete Mathematics and its Applications, 7th edition, McGraw Hill.

    • chapter 13, sections 13.2 – 13.3

