1 / 24

Theory of Computing

This lecture discusses the minimum number of states required for a regular language and explores methods for finding a minimal DFA. It also examines the uniqueness of minimal DFA and the size of NFA compared to DFA.

moorhousem
Download Presentation

Theory of Computing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Theory of Computing Lecture 24 MAS 714 Hartmut Klauck

  2. Size of Automata? • We know L is regular, if there is a DFA with a finite number of states for L • Minimum number of states is the Myhill-Nerode Index • Questions: • How can we find a minimal DFA for L? • Is it unique (up to renaming states)? • Are NFA smaller than DFA? • Are two-way DFA smaller than DFA?

  3. Question 4 • Consider the language L={xi : x2{0,1}n and i2{0,1} log n and xi=1} • n is fix, this is a finite language • It is easy to see that there are 2n rows in the communication matrix • Rows labeled x contain the string on columns i=1…n • Hence any DFA for L has 2n states. • Can give a 2-way DFA with O(n2) states: • go to the right end of the input, read iinto the state, move n-i steps left, accept if there is a 1

  4. Question 3 • Consider L={xy: x,y2 {0,1}n, x  y} • This is a FINITE language, n is fixed • Exercise: there is an NFA with O(n2) states for L • DFA-size: the matrix has more than 2n distinct rows • DFA size is exponential • Hence NFA can be exponentially smaller than DFAfor some languages • Example where they are not: complement of L

  5. 2) Uniqueness • Uniqueness of the minimal DFA (up to vertex names)follows from the Myhill-Nerode characterization • The optimal DFA has exactly one state for each set of equal rows of the comm. matrix • Edges are also determined uniquely

  6. 1) Optimal DFA • Theorem: For every regular L there is a unique DFA A that has minimum size. Given a DFA M for L we can find A in polynomial time. • Corollary: Given DFA A,B we can decide in polynomial time whether they compute the same language • Minimize DFA, compare graphs (note that this is not an instance of the (hard) graph isomorphism problem, since we know the starting state and edge labels) • Corollary: Given DFA A, we can decide in polynomial time whether LA is empty

  7. Note • The corresponding questions for Turing machines are undecidable • These properties (easy to find unique representation, easy to find small representation, easy to check for emptiness and easy to compare) make DFA a good data-structure for languages/Boolean functions

  8. DFA Minimization • All algorithms try to recover equivalence classes in the Myhill-Nerode characterization • I.e., try to find the smallest partition of the rows of the communication matrix into equal rows • Input: DFA with n states and alphabet size k • Hopcroft: O(nk log n) • Moore: O(n2k) • Idea in both: • Start from partition of states into accepting/rejecting • Refine partition until no longer possible

  9. Proof • We are given a DFA M with n states and alphabet size k • Want to find an optimal DFA A • First: remove unreachable states • unreachable means in the graph, cannot be reached from q0on any input • Identify them with DFS from the starting state • Equivalent states: q,q’ are equivalent, if all strings y, starting from q or q’ lead to the same result (acc/rej) • can assume |y|< n, otherwise y contains a loop in M • equivalent states have the same row in the communication matrix

  10. A simple algorithm • Algorithm: • Remove unreachable states • Build a n£n matrix D that containing blank entries • For all rows q, and columns q’, where q2 F and q’ is not, set D[q,q’]=D[q’,q]=² • Iterate [until nothing to do, at most n times] • Loop over all letters a and all pairs q,q’ with D[q,q’] blank • If D[±(q,a),±(q’,a)] is not blank, set D[q,q’]=a • Join states q,q’ that have D[q,q’] still blank

  11. A simple algorithm • D allows us to find a witness that q,q’ are not equivalent: D[q,q’]=a, then find recursively the witness y for ±(q,a) and ±(q’,a), witness for q,q’ is ay • Follow ay from q and from q’, one leads to accept, the other to reject • Witnesses need not be longer than n-1 • If no witness exists, then q,q’ are equivalent • by definition of equivalence

  12. Time • Each iteration takes time O(n2k) • If q and q’ are not equivalent, a witness of length no more than n-1 can be found, i.e., number of iterations is at most n-1 • typically much less: longest simple path in DFA

  13. Correctness • All q,q’ with D[q,q’] not blank are not equivalent • Claim: algo will find a witness for non-equivalent q,q’ Induction over length of witness • length 0: in first iteration • Assume length k witnesses are found in k iterations • If q,q’ are not equivalent, and there is a string ay of length k such that ±(q,ay) accepts, ±(q’,ay) rejects, then some witness y’ for ±(q,a) and ±(q’,a) of length k-1 is found earlier • ay’ is a witness and found in iteration k • Claim: all equivalent states will never be declared not equivalent • Clear from the witness computed

  14. Minimizing NFA? • NFA are frequently smaller than DFA • Can we minimize their size? • Theorem: • NFA-minimization is PSPACE-hard • This remains true if the input is a DFA but we search the smallest NFA • Even approximation is hard

  15. Conclusion • DFA can be transformed into unique minimal DFA in polynomial time • Can then compare, decide emptiness etc. • NFA and 2-way DFA/NFA can be exponentially smaller, but cannot be minimized efficiently • Limits of DFA: • Communication method

  16. Regular Languages (Definition 2) • Simple recursive description of languages • Definition: Regular expression over alphabet § • a2§ is a r.e. • The empty word ² is a r.e. • ; is a r.e. [empty set] • R1[R2 is regular [union] Notation: r1 | r2 • R1R2 is regular [concatenation] • R1* is regular [Kleene]

  17. Examples • 0*10*: strings with one 1 • §* 1§*: strings with at least one 1 • (§§)*: even length • 1*(011*)*: every 0 followed by a 1 • (0*10*10*)*: even number of 1’s • 0*|1*: contains only 0’s or only 1’s

  18. Regular vs. DFA • Definition: a language describable by a regular expression is called regular • Theorem: The set of regular languages is the same set as the set of languages decidable by DFA • Proof: • We show • NFA can simulate regular expression • Regular expression can be constructed from DFA

  19. Regular to NFA • Consider an NFA with ² transitions • Edges can be labeled with ² (empty word) • Exercise: NFA with ² transitions can be transformed into NFA, same number of states • R a regular expression • Inductively find an NFA for R • Basis cases: R=², R=a2§, R=; • NFA is trivial (at most 3 states) • Induction: Find NFA for Union, Concatenation, Kleene Hull

  20. Regular to NFA • Union R1[R2: • NFA M1 and M2, starting state ²-move to start in M1 and to start in M2 • Concatenation: • Kleene:

  21. DFA to regular • Generalized DFA: edges can carry regular expressions instead of letters • Idea 1: start with a normal DFA, shrink to a generalized DFA with 1 edge • Idea 2: dynamic programming type argument • Ri,j,k regular expression for the set of strings that lead from qi to qj using q0,…, qk only

  22. DFA to regular • R0,0,0=(a|b|c)* if there are edges labeled a,b,cfrom q0 to q0 • Ri,i,0 = ²otherwise • Ri,j,0 = a|b etc., if there are edges labeled a and b from qi to qj • Ri,j,k= Ri,j,k-1 | Ri,k,k-1 (Rk,k,k-1)* Rk,j,k-1 • Finally take union over all R0,i,n, where qi is accepting • Result: a reg. expr. for the DFA • Works even for NFA

  23. Conclusion • Theoretical Computer Science • Algorithms • Computability • Complexity • Machine Models • Formal Languages • More: Cryptography, Information/Communication Theory …

  24. Topics: • Algorithms: • Basic Algorithms [Sorting/Searching] • Graph Algorithms • Paradigms [Greedy, Dynamic Programming] • Linear Programming • Algorithms for hard problems [Approximation] • Other theory: • Hardness and reductions [eg. NP-completeness] • Computability [eg. Halting problem, Universality] • Time-Hierarchy [etc.] • Finite Automata

More Related