
Properties of Regular Languages






Presentation Transcript


  1. CS 3240 – Chapter 4 Properties of Regular Languages

  2. Topics • Closure Properties • Algorithms for Elementary Questions: • Is a given word, w, in L? • Is L empty, finite or infinite? • Are L1 and L2 the same set? • Detecting non-regular languages CS 3240 - Properties of Regular Languages

  3. Closure Properties • Closure of operations • If x and y are in the same set, is x op y also? • Example: The integers are closed under addition • They are not closed under division • Regular languages are closed under all the typical set operations!

  4. Regular Operations • Regular languages are closed under: • Kleene Star (*) • Union (+) • Concatenation (xy) • (By definition!) • They are also closed under: • Complement (swap the accepting and non-accepting states of a DFA ✓) • Intersection • Set difference • Reversal (already proved in homework #12, 2.3 ✓)

  5. Closure Under Intersection • Proof from set theory (De Morgan): • L1 ∩ L2 = (L1’ ∪ L2’)’ • Since regular languages are closed under complement and union, they must be closed under intersection also! QED

  6. Union of Complements, and Complement the Result • Note how the intersection is never shaded • L1’ ∪ L2’ shades everything but where they overlap • Therefore, (L1’ ∪ L2’)’ is the overlap (intersection)

  7. Set Difference: A Simple Proof • A – B: • Everything that is in A but not in B • A – B = A ∩ B’ • We have already shown that regular languages are closed under intersection and complement. QED

  8. Computing the Union by Machine: By “combining” the machines • Start with a composite start state: • Consisting of the two start states • Follow all out-edges simultaneously • As we did for NFA-to-DFA conversion • For union, any composite state containing at least one original final state is a final state • Because one of the machines accepts there • For intersection, any composite state containing an original final state from each original machine is a final state • Because both of the machines accept there • How would you construct the difference machine?

  9. [Figure: two example machines, Double-a (states -x1, x2, +x3) and EVEN-EVEN; the transition diagrams are not recoverable from the transcript]

  10. For union: assign accepting states where any original xi or yi accepts. For intersection: assign accepting states only where both the original xi and yi accept simultaneously. No need to compute (L1’ ∪ L2’)’! For the difference L1 – L2: assign accepting states where the L1 state accepts and the L2 state does not.
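The combined-machine recipe can be sketched in Python. This is a minimal illustration, not the course's code: the dictionary encoding of a DFA, the names `product_dfa` and `accepts`, and the state names `y1` to `y4` for EVEN-EVEN are all assumptions made for the example. Both machines are assumed to be complete DFAs over the same alphabet.

```python
def product_dfa(d1, d2, accept):
    """Run two complete DFAs in lockstep; accept(f1, f2) decides which pairs are final."""
    start = (d1["start"], d2["start"])
    delta, seen, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        s1, s2 = state
        for c in d1["alphabet"]:
            # Follow the out-edges of both machines simultaneously.
            nxt = (d1["delta"][s1, c], d2["delta"][s2, c])
            delta[state, c] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    finals = {(s1, s2) for (s1, s2) in seen
              if accept(s1 in d1["finals"], s2 in d2["finals"])}
    return {"alphabet": d1["alphabet"], "start": start,
            "delta": delta, "finals": finals}

def accepts(dfa, word):
    state = dfa["start"]
    for c in word:
        state = dfa["delta"][state, c]
    return state in dfa["finals"]

# Double-a: words containing "aa" (hypothetical encoding of the slide's machine).
DOUBLE_A = {"alphabet": "ab", "start": "x1", "finals": {"x3"},
            "delta": {("x1", "a"): "x2", ("x1", "b"): "x1",
                      ("x2", "a"): "x3", ("x2", "b"): "x1",
                      ("x3", "a"): "x3", ("x3", "b"): "x3"}}

# EVEN-EVEN: even number of a's and even number of b's (state names assumed).
EVEN_EVEN = {"alphabet": "ab", "start": "y1", "finals": {"y1"},
             "delta": {("y1", "a"): "y3", ("y1", "b"): "y2",
                       ("y2", "a"): "y4", ("y2", "b"): "y1",
                       ("y3", "a"): "y1", ("y3", "b"): "y4",
                       ("y4", "a"): "y2", ("y4", "b"): "y3"}}

union = product_dfa(DOUBLE_A, EVEN_EVEN, lambda f1, f2: f1 or f2)
intersection = product_dfa(DOUBLE_A, EVEN_EVEN, lambda f1, f2: f1 and f2)
difference = product_dfa(DOUBLE_A, EVEN_EVEN, lambda f1, f2: f1 and not f2)
```

Only the choice of accepting pairs differs between the three machines; the transitions are identical.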

  11. The resulting machine… [Figure: the combined machine; the transition diagram is not recoverable from the transcript]

  12. The Membership Problem (Section 4.2) • Given a word w, and a regular language, L, can we answer the question: • Is w ∊ L? • You tell me…
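One way to answer the membership question is simply to run a DFA for L on w and look at the state it ends in. The sketch below assumes a dictionary encoding of a complete DFA; the encoding, the function name `accepts`, and the state names are illustrative choices, not part of the slides.

```python
def accepts(dfa, word):
    """Follow one transition per symbol; w is in L iff we stop in a final state."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][state, symbol]
    return state in dfa["finals"]

# EVEN-EVEN (even number of a's and of b's), with assumed state names:
# y1 = (even a, even b), y2 = (even a, odd b), y3 = (odd a, even b), y4 = (odd a, odd b).
EVEN_EVEN = {"alphabet": "ab", "start": "y1", "finals": {"y1"},
             "delta": {("y1", "a"): "y3", ("y1", "b"): "y2",
                       ("y2", "a"): "y4", ("y2", "b"): "y1",
                       ("y3", "a"): "y1", ("y3", "b"): "y4",
                       ("y4", "a"): "y2", ("y4", "b"): "y3"}}
```

The run takes exactly |w| steps, so membership is always decidable for regular languages.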

  13. Is L Empty? • A graph theory problem: • Find a path from the start to a final state in the associated FA • Algorithm: “mark” the start state; repeat: mark any state with an incoming edge from a previously marked state; until an accepting state is marked or no new states were marked at all • L is empty iff no accepting state ever gets marked
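The marking algorithm can be sketched as follows. The DFA encoding and the names `is_empty`, `DOUBLE_A`, and `UNREACHABLE_FINAL` are assumptions made for illustration.

```python
def is_empty(dfa):
    """Mark states reachable from the start; L is empty iff no final state is marked."""
    marked = {dfa["start"]}
    changed = True
    while changed:
        if marked & dfa["finals"]:
            return False          # an accepting state was marked
        changed = False
        for (src, _symbol), dst in dfa["delta"].items():
            if src in marked and dst not in marked:
                marked.add(dst)   # incoming edge from a previously marked state
                changed = True
    return True                   # nothing new was marked and no final state reached

# Words containing "aa" (hypothetical encoding).
DOUBLE_A = {"alphabet": "ab", "start": "x1", "finals": {"x3"},
            "delta": {("x1", "a"): "x2", ("x1", "b"): "x1",
                      ("x2", "a"): "x3", ("x2", "b"): "x1",
                      ("x3", "a"): "x3", ("x3", "b"): "x3"}}

# A machine whose only final state cannot be reached from the start.
UNREACHABLE_FINAL = {"alphabet": "ab", "start": "q0", "finals": {"q1"},
                     "delta": {("q0", "a"): "q0", ("q0", "b"): "q0",
                               ("q1", "a"): "q1", ("q1", "b"): "q1"}}
```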

  14. Another Solution: To See if L is Non-empty • Attempt to convert the associated FA to a regular expression • By the state bypass and elimination algorithm • If you get a regular expression (other than ∅), then a string is accepted

  15. Yet Another Approach: To See if L is non-empty (by computer) • Suppose a minimal machine, M, for the language L has p states • If M accepts any non-empty words at all, it must accept one of length ≤ p • Why? • So… • Systematically try all possible strings in Σ* of length 1 through p. If none are accepted, then no non-empty strings at all are in L.
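A brute-force version of this test might look like the sketch below. It uses the number of states of whatever DFA it is given as p (the bound does not require the machine to be minimal); the encoding and the names `states_of`, `accepts_nonempty_word`, and `ONLY_LAMBDA` are illustrative assumptions.

```python
from itertools import product

def accepts(dfa, word):
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][state, symbol]
    return state in dfa["finals"]

def states_of(dfa):
    """Every state mentioned as the start, a source, or a target."""
    return ({dfa["start"]}
            | {src for src, _symbol in dfa["delta"]}
            | set(dfa["delta"].values()))

def accepts_nonempty_word(dfa):
    """Try every string of length 1 through p, where p is the number of states."""
    p = len(states_of(dfa))
    for n in range(1, p + 1):
        for letters in product(dfa["alphabet"], repeat=n):
            if accepts(dfa, letters):
                return True
    return False

# Words containing "aa" (hypothetical encoding).
DOUBLE_A = {"alphabet": "ab", "start": "x1", "finals": {"x3"},
            "delta": {("x1", "a"): "x2", ("x1", "b"): "x1",
                      ("x2", "a"): "x3", ("x2", "b"): "x1",
                      ("x3", "a"): "x3", ("x3", "b"): "x3"}}

# Accepts the empty word and nothing else.
ONLY_LAMBDA = {"alphabet": "ab", "start": "s", "finals": {"s"},
               "delta": {("s", "a"): "d", ("s", "b"): "d",
                         ("d", "a"): "d", ("d", "b"): "d"}}
```

The search is exponential in p, so the marking algorithm is the better choice in practice; this version only illustrates the length bound.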

  16. Is L Finite or Infinite? • Convert its machine to a regular expression • It is infinite iff it has a star (applied to something other than ∅ or λ) • Another way: • A language is infinite iff there is a cycle in an accepting path • A (tedious) graph theory problem

  17. An Observation: About Infinite Regular Languages • Suppose L’s minimal machine, M, has p states • Any path of length p has (or is) a cycle • And any cycle must have or be a cycle of length p or less • Because a state is revisited after at most p characters • So, infinite languages have a machine with at least one cycle of length p or less in an accepting path • And all non-empty languages have a string of length p or less (already showed that)…

  18. Finishing the Reasoning: About Detecting Infinite Languages – A third way • Let m denote the length of a cycle in an accepting path • We know 1 ≤ m ≤ p • Let k ≤ p be the length of a string in L whose accepting path passes through that cycle • There has to be one if the language is infinite! • Then strings of length k + im are accepted, i ≥ 0 • By traversing the cycle i more times • These lengths grow in steps of m ≤ p, so the first one to reach p is still below p + m ≤ 2p • So, there must be some i such that p ≤ k + im < 2p • Procedure: Test all strings of length p through 2p−1; L is infinite iff one of them is accepted
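The resulting procedure can be sketched directly. As before, the DFA encoding and the names `is_infinite`, `DOUBLE_A`, and `ONLY_A` are assumptions for illustration, and p is taken to be the number of states of the given machine.

```python
from itertools import product

def accepts(dfa, word):
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][state, symbol]
    return state in dfa["finals"]

def is_infinite(dfa):
    """L is infinite iff some string of length p through 2p-1 is accepted."""
    states = ({dfa["start"]}
              | {src for src, _symbol in dfa["delta"]}
              | set(dfa["delta"].values()))
    p = len(states)
    return any(accepts(dfa, letters)
               for n in range(p, 2 * p)
               for letters in product(dfa["alphabet"], repeat=n))

# Words containing "aa": an infinite language (hypothetical encoding).
DOUBLE_A = {"alphabet": "ab", "start": "x1", "finals": {"x3"},
            "delta": {("x1", "a"): "x2", ("x1", "b"): "x1",
                      ("x2", "a"): "x3", ("x2", "b"): "x1",
                      ("x3", "a"): "x3", ("x3", "b"): "x3"}}

# Accepts only the word "a": a finite language with a dead state d.
ONLY_A = {"alphabet": "ab", "start": "s0", "finals": {"s1"},
          "delta": {("s0", "a"): "s1", ("s0", "b"): "d",
                    ("s1", "a"): "d", ("s1", "b"): "d",
                    ("d", "a"): "d", ("d", "b"): "d"}}
```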

  19. Is L1 = L2? • That is, are they the same set of strings? • Set-theoretic argument: • Two sets are equal iff their symmetric difference is empty (denoted by A ∆ B or A ⊖ B) • A ∆ B = (A ∪ B) – (A ∩ B) = (A – B) ∪ (B – A) • But A – B = A ∩ B’, and B – A = B ∩ A’ • So L1 = L2 iff (L1 ∩ L2’) ∪ (L1’ ∩ L2) = ∅
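Combining the product construction with the emptiness test gives an equality check: explore the reachable pairs of states and look for one where exactly one machine accepts. The sketch below is an illustration under assumed names (`equivalent`, `EVEN_A`, `mod4`) and assumes two complete DFAs over the same alphabet.

```python
def equivalent(d1, d2):
    """L1 = L2 iff no reachable pair of states accepts in exactly one machine."""
    start = (d1["start"], d2["start"])
    seen, todo = {start}, [start]
    while todo:
        s1, s2 = todo.pop()
        if (s1 in d1["finals"]) != (s2 in d2["finals"]):
            return False  # some word lies in the symmetric difference
        for c in d1["alphabet"]:
            nxt = (d1["delta"][s1, c], d2["delta"][s2, c])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Even number of a's, with two states.
EVEN_A = {"alphabet": "ab", "start": "e", "finals": {"e"},
          "delta": {("e", "a"): "o", ("e", "b"): "e",
                    ("o", "a"): "e", ("o", "b"): "o"}}

def mod4(finals):
    """Hypothetical helper: count a's modulo 4 and accept on the given residues."""
    delta = {(i, "a"): (i + 1) % 4 for i in range(4)}
    delta.update({(i, "b"): i for i in range(4)})
    return {"alphabet": "ab", "start": 0, "finals": set(finals), "delta": delta}
```

Here `mod4({0, 2})` is a redundant four-state machine for the same language as `EVEN_A`, while `mod4({0})` accepts a strictly smaller language.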

  20. Is L1 = L2?

  21. Is L1 = L2?

  22. Non-Regular Languages (Section 4.3) • Not all languages are regular • We need to recognize whether languages are regular or not • We don’t want to waste time using regular language processing techniques where they don’t apply

  23. ab

  24. ab + aabb

  25. ab + aabb + aaabbb

  26. Recognizing Non-Regular Languages • Consider a^n b^n • ab is regular • ab + aabb = a^n b^n, 1 ≤ n ≤ 2, is regular • Any finite language is regular (why?) • But a^n b^n, n ≥ 0, is not regular (why not?) • How do we prove it’s not regular!?!

  27. An Observation • Finite Automata don’t have unlimited counting capability • They only have a fixed number of states • Intuitively, we see that an automaton can’t keep track of counts for a^n b^n where n is arbitrarily large • But intuition is often faulty. We need a proof!

  28. About Infinite Regular Languages, Redux • Any accepted string of length p (the number of states) or greater forces a cycle in an accepting path. • In other words, at least one state is visited a second time • And that “revisit” must happen within the first p characters of the string • Because that’s when the (p+1)th state is entered • This could be any state (start, final, other)

  29. a^n b^n is Not Regular: Proof by Contradiction • Consider a^k b^k, where k is greater than the number of states in a supposed DFA accepting all a^n b^n, n ≥ 0 • Before the first b is encountered, a state has been visited at least twice (because there are more a’s than states) • Suppose the length of the associated cycle is m ≥ 1 • Then the string a^(k+im) b^k is also accepted for every i ≥ 0, although for i ≥ 1 it has more a’s than b’s • This contradicts the existence of a DFA that accepts exactly a^n b^n

  30. “Revisiting” a State: The first “revisit”

  31. The Pumping Lemma: For Regular Languages • For every infinite regular language, L, there is a number, p, such that for all strings, s, in L, where |s| ≥ p, you can partition s into three concatenated substrings, xyz, such that: • |y| > 0 • |xy| ≤ p • xy^i z ∈ L for every i ≥ 0 (that is, xy*z ⊆ L)

  32. Using the Pumping Lemma: Regular ⇒ Pumpable ≡ ¬Pumpable ⇒ ¬Regular • You can only use the pumping lemma to show that a language is not regular • By showing it fails the “pumping” conditions of infinite regular languages • Note: Some non-regular languages pump! • The trick is to find a convenient string • Usually the condition |xy| ≤ p is also key • Sometimes pumping down (i = 0) is easiest

  33. Using the Pumping Lemma on a^n b^n • Consider the string a^p b^p • It is in this language • It is long enough (≥ p in length) • Now let a^p b^p = xyz • Remember |xy| ≤ p • What can you conclude about y?
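For small values of p, the question can be explored by enumerating every legal partition of a^p b^p. The sketch below (function names are hypothetical) checks that y always consists only of a's, so pumping down (i = 0) removes a's but no b's and leaves the language. It is a sanity check for specific p, not a replacement for the general proof.

```python
def in_anbn(s):
    """True iff s has the form a^n b^n for some n >= 0."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def splits(s, p):
    """Yield every partition s = xyz with |xy| <= p and |y| > 0."""
    for xy_len in range(1, min(p, len(s)) + 1):
        for y_len in range(1, xy_len + 1):
            cut = xy_len - y_len
            yield s[:cut], s[cut:xy_len], s[xy_len:]

def pumping_down_always_fails(p):
    """For s = a^p b^p, check that xz (the i = 0 case) is never in the language."""
    s = "a" * p + "b" * p
    return all(not in_anbn(x + z) for x, _y, z in splits(s, p))
```

For p = 3 there are six partitions of `aaabbb`, and in each of them y is one of `a`, `aa`, or `aaa`.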

  34. Playing Games • You can treat proving a language non-regular as a “game”: • You pick a string, s, in L, where |s| ≥ p • You may pick any such string; choose wisely! • Opponent picks x, y, and z • But must obey |xy| ≤ p and |y| > 0 • You show it can’t be “pumped” • Because a pumped string falls “outside” the language • Must anticipate all possible partitions xyz

  35. Some Non-regular languages: All require arbitrary counting capability • a^i b^j, i > j • PALINDROME • w = w^R (same backwards and forwards) • ww • Equal halves • PRIME (a^m where m is prime) • SQUARE (a^m where m is a perfect square)

  36. Using Closure Properties • Strings with equal number of a’s and b’s (intersecting with a*b* gives a^n b^n, which is not regular) • NOTPRIME (its complement is PRIME, which is not regular)

  37. A Pumpable Non-regular Language • NOTPRIME is pumpable (if we ignore the |xy| ≤ p condition)! • Let y = the whole string (a^(km), with k, m ≥ 2) • The number of a’s will always be a multiple of km, hence not prime • Note: zero is not a prime number • This does not violate the pumping lemma • The pumping lemma draws no conclusion about non-regular languages
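The claim about NOTPRIME is easy to spot-check numerically. In the sketch below (names are hypothetical, and only finitely many values of i are tried, so it illustrates the claim rather than proving it), x and z are empty and y is the whole string a^n, so the pumped string has length i·n.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def pumps_with_whole_string(n, max_i=25):
    """With y = a^n, check that a^(i*n) stays in NOTPRIME for i = 0..max_i."""
    return all(not is_prime(i * n) for i in range(max_i + 1))
```

For composite n every multiple i·n with i ≥ 1 is composite and i = 0 gives length zero, so the check succeeds. It fails for n = 1, since a^2 has prime length; this is harmless because strings that short can be excluded by choosing p ≥ 2.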
