1 / 12

Regular Expressions

Regular Expressions. Hopcroft, Motawi, Ullman, Chap 3. Machines versus expressions. Finite automata are machine-like descriptions of languages Alternative: declarative description Specifying a language using expressions and operations

arleen
Download Presentation

Regular Expressions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Regular Expressions Hopcroft, Motawi, Ullman, Chap 3

  2. Machines versus expressions • Finite automata are machine-like descriptions of languages • Alternative: declarative description • Specifying a language using expressions and operations • Example: 01* + 10* defines the language containing strings such as 01111, 100, 0, 1000000; * and + are operators in this “algebra”

  3. Regular expressions defined • The simplest regular expressions are • The empty string  • A symbol a from an alphabet • Given that R1 and R2 are regular expressions, regular expressions are built from the following operations • Union: R1+R2 • Concatenation: R1R2 • Closure: R1* • Parentheses (to enforce precedence): (R1) • Nothing else is a regular expression unless it is built from the above rules

  4. Examples • 1 +  • (ab)* + (ba)* • (0+1+2+3+4+5+6+7+8+9)*(0+5) • (x+y)*x(x+y)* • (01)* • 0(1*) • 01* equivalent to 0(1*)

  5. Equivalence between regular expressions and finite automata Strategy: • Convert regular expression to an -NFA • Convert a DFA to a regular expression

  6. Regular Expression to -NFA • Recursive construction • Base cases follow base case definitions of regular expressions • : -NFA that accepts the empty string –a single state that is the start and end state • a: -NFA that accepts {a} – two-state machine (start and final state) with an a-transition • Note: technically, we also need an -NFA for the empty language {} – easy

  7. Regular Expression to -NFA • Recursive step: build -NFA from smaller -NFAs that correspond to the operand regular expressions • To simplify construction, we may ensure the following characteristics for the automata we build • Only one final state, with no outgoing transitions • No transitions into the start state • Note: the base cases satisfy these characteristics

  8. Regular Expression to -NFA • Suppose -NFA1 and -NFA2 are the automata for R1 and R2 • Three operations to worry about: unionR1 + R2, concatenation R1R2), closure R1* • With -transitions, construction is straightforward • Union: create a new start state, with -transitions into the start states of -NFA1 and -NFA2; create a new final state, with -transitions from the two final states of -NFA1 and -NFA2 • Concatenation: -transition from final state of -NFA1 to the start state of -NFA2 • Closure: closure can be supported by an -transition from final to start state; need a few more -transitions (why?)

  9. DFA to Regular Expression • More difficult construction • Build the regular expression “bottom up” starting with simpler strings that are acceptable using a subset of states in the DFA • Define Rki,j as the expression for strings that have an admissible state sequence from state i to state j with no intermediate states greater than k • Assume no states are numbered 0, but k can be 0

  10. R0i,j • Observe that R0i,j describes strings of length 1 or 0, particularly: • {a1, a2, a3, … }, where, for each ax, (i,ax) = j • Add  to the set if i = j • The 0 in R0i,j means no intermediate states are allowed, so either no transition is made (just stay in state i to accept  if i = j) or make a single transition from state i to state j • These are the base cases in our construction

  11. Rki,j • Recursive step: for each k, we can build Rki,j as follows: Rki,j = Rk-1i,j + Rk-1i,k (Rk-1k,k)* Rk-1k,j • Intuition: since the accepting sequence contains one or more visits to state k, break the path into pieces that • first goes from i to its first k-visit (Rk-1i,k) • followed by zero or more revisits to k (Rk-1k,k) • followed by a path from k to j (Rk-1k,j )

  12. And finally… • We get the regular expression(s) that represent all strings with admissible sequences that start with the initial state (state 1) and end with a final state • Resulting regular expression built from the DFA: the union of all Rn1,f where f is a final state • Note: n is the number of states in the DFA meaning there are no more restrictions for intermediate states in the accepting sequence

More Related