## Properties of Regular Languages

**CS 3240 – Chapter 4**
Properties of Regular Languages

**Topics**
• Closure Properties
• Algorithms for Elementary Questions:
  • Is a given word, w, in L?
  • Is L empty, finite or infinite?
  • Are L1 and L2 the same set?
• Detecting non-regular languages

CS 3240 - Properties of Regular Languages

**Closure Properties**
• Closure of operations
  • If x and y are in the same set, is x op y also?
  • Example: The integers are closed under addition
  • They are not closed under division
• Regular languages are closed under all the typical set operations

**Regular Operations**
• Regular languages are closed under:
  • Kleene Star (*)
  • Union (+)
  • Concatenation (xy)
  • (By definition!)
• They are also closed under:
  • Complement (reverse state acceptability in a DFA ✓)
  • Intersection
  • Set difference
  • Reversal (already proved in homework #12, 2.3 ✓)

**Closure Under Intersection**
• Proof from set theory:
  • L1 ∩ L2 = (L1’ ∪ L2’)’
  • Since regular languages are closed under complement and union, they must be closed under intersection too! QED

**Union of Complements, and Complement the Result**
• Note how the intersection is never shaded
• L1’ ∪ L2’ shades everything but where they overlap
• Therefore, (L1’ ∪ L2’)’ is the overlap (intersection)

**Set Difference: A Simple Proof**
• A – B:
  • Everything that is in A but not in B
  • A – B = A ∩ B’
• We have already shown that regular languages are closed under intersection and complement.
  QED

**Computing the Union by Machine: By “combining” the machines**
• Start with a composite start state:
  • Consisting of the two start states
• Follow all out-edges simultaneously
  • As we did for NFA-to-DFA conversion
• A state containing any original final state is a final state in the result for union
  • Because one of the machines accepts there
• A state containing an original final state from each original machine is a final state in the result for intersection
  • Because both of the machines accept there
• How would you construct the difference machine?

[Figure: the two example machines, Double-a (states -x1, x2, +x3) and EVEN-EVEN; diagrams not preserved]

For union: assign accepting states where any original xi or yi accepts. For intersection: assign accepting states only where both the original xi and yi accept simultaneously. No need to compute (L1’ ∪ L2’)’! For difference, assign accepting states where the first machine accepts and the second does not.

**The resulting machine…**
[Figure: the combined machine; diagram not preserved]

**The Membership Problem (Section 4.2)**
• Given a word w and a regular language L, can we answer the question:
  • Is w ∊ L?
• You tell me…

**Is L Empty?**
• A graph theory problem:
  • Find a path from the start to a final state in the associated FA
• Algorithm:
  • “mark” the start state
  • repeat: mark any state with an incoming edge from a previously marked state
  • until an accepting state is marked or no new states were marked at all

**Another Solution: To See if L is Non-empty**
• Attempt to convert the associated FA to a regular expression
  • By the state bypass and elimination algorithm
• If you get a regular expression, then some string is accepted

**Yet Another Approach: To See if L is Non-empty (by computer)**
• Suppose a minimal machine, M, for the language L has p states
• If M accepts any non-empty words at all, it must accept one of length ≤ p
  • Why?
• So…
  • Systematically try all possible strings in Σ* of length 1 through p. If none are accepted, then no non-empty strings at all are in L.

**Is L Finite or Infinite?**
• Convert its machine to a regular expression
  • It is infinite iff it has a star (applied to something that matches a non-empty string)
• Another way:
  • A language is infinite if there is a cycle in an accepting path
  • A (tedious) graph theory problem

**An Observation: About Infinite Regular Languages**
• Suppose L’s minimal machine, M, has p states
• Any path of length p has (or is) a cycle
• And any cycle must have or be a cycle of length p or less
  • Because a state is revisited after at most p characters
• So, infinite languages have a machine with at least one cycle of length p or less in an accepting path
• And all non-empty languages have a string of length p or less (already showed that)…

**Finishing the Reasoning: About Detecting Infinite Languages – A third way**
• Let m denote the length of a cycle in an accepting path
  • We know m ≤ p
• Let k be the length of a string in L such that k ≤ p
  • There has to be one if the language is infinite!
• Then strings of length k + im are accepted, i ≥ 0
  • By traversing the cycle i times
• But k + im ≤ p + ip = (i+1)p
• So, there must be some i such that p ≤ k + im < 2p
  • Take the smallest i with k + im ≥ p: either i = 0 and k = p, or k + (i-1)m < p, so k + im < p + m ≤ 2p
• Procedure: Test all strings of length p through 2p-1

**Is L1 = L2?**
• That is, are they the same set of strings?
• Set-theoretic argument:
  • Two sets are equal iff their symmetric difference is empty (denoted by A ∆ B or A ⊖ B)
  • A ∆ B = (A ∪ B) – (A ∩ B) = (A – B) ∪ (B – A)
  • But A – B = A ∩ B’, and B – A = B ∩ A’
  • So L1 = L2 iff (L1 ∩ L2’) ∪ (L1’ ∩ L2) = ∅

[Figure slides for “Is L1 = L2?”; diagrams not preserved]

**Non-Regular Languages (Section 4.3)**
• Not all languages are regular
• We need to recognize whether languages are regular or not
  • We don’t want to waste time using regular language processing techniques where they don’t apply

[Figure slides titled “ab”, “ab + aabb”, and “ab + aabb + aaabbb”; diagrams not preserved]

**Recognizing Non-Regular Languages**
• Consider aⁿbⁿ
  • ab is regular
  • ab + aabb = aⁿbⁿ, 1 ≤ n ≤ 2, is regular
  • Any finite language is regular (why?)
  • But aⁿbⁿ, n ≥ 0 is not regular (why not?)
• How do we prove it’s not regular!?!

**An Observation**
• Finite Automata don’t have unlimited counting capability
  • They only have a fixed number of states
• Intuitively, we see that an automaton can’t keep track of counts for aⁿbⁿ where n is arbitrarily large
• But intuition is often faulty. We need a proof!

**About Infinite Regular Languages: Redux**
• Any accepted string of length p (the number of states) or greater forces a cycle in an accepting path.
• In other words, at least one state is visited a second time
• And that “revisit” must happen within the first p characters of the string
  • Because that’s when the (p+1)th state is entered
• This could be any state (start, final, other)

**aⁿbⁿ is Not Regular: Proof by Contradiction**
• Consider aᵏbᵏ, where k is greater than the number of states in a supposed DFA accepting all aⁿbⁿ, n ≥ 0
• Before the first b is encountered, a state has been visited at least twice (because there are more a’s than states)
• Suppose the length of the associated cycle is m
• Then the string aᵏ⁺ⁱᵐbᵏ is also accepted!
  • For i ≥ 1 it has more a’s than b’s (since m ≥ 1), so it is not of the form aⁿbⁿ
• This contradicts the existence of a DFA that accepts aⁿbⁿ

**“Revisiting” a State**
[Figure: The first “revisit”]

**The Pumping Lemma: For Regular Languages**
• For every infinite regular language, L, there is a number, p, such that for all strings, s, in L, where |s| ≥ p, you can partition s into three concatenated substrings, xyz, such that:
  • |y| > 0
  • |xy| ≤ p
  • xyⁱz ∈ L for every i ≥ 0 (that is, xy*z ⊆ L)

**Using the Pumping Lemma: Regular ⇒ Pumpable ≡ ¬Pumpable ⇒ ¬Regular**
• You can only use the pumping lemma to show that a language is not regular
  • By showing it fails the “pumping” conditions of infinite regular languages
  • Note: Some non-regular languages pump!
• The trick is to find a convenient string
• Usually the condition |xy| ≤ p is also key
• Sometimes pumping down (i = 0) is easiest

**Using the Pumping Lemma on aⁿbⁿ**
• Consider the string aᵖbᵖ
  • It is in this language
  • It is long enough (≥ p in length)
• Now let aᵖbᵖ = xyz
• Remember |xy| ≤ p
• What can you conclude about y?

**Playing Games**
• You can treat proving a language non-regular as a “game”:
• You pick a string, s, in L, where |s| ≥ p
  • You may pick any such string; choose wisely!
• Opponent picks x, y, and z
  • But must obey |xy| ≤ p and |y| > 0
• You show it can’t be “pumped”
  • Because a pumped string falls “outside” the language
  • Must anticipate all possible partitions xyz

**Some Non-regular Languages: All require arbitrary counting capability**
• aⁱbʲ, i > j
• PALINDROME
  • w = wᴿ (same backwards and forwards)
• ww
  • Equal halves
• PRIME (aᵐ where m is prime)
• SQUARE (aᵐ where m is a perfect square)

**Using Closure Properties**
• Strings with equal number of a’s and b’s
• NOTPRIME

**A Pumpable Non-regular Language**
• NOTPRIME is pumpable!
  • Let y = the whole string (aᵏᵐ, with k, m ≥ 2); note that this choice sets aside the |xy| ≤ p condition
  • The number of a’s will always be a multiple of km, hence not prime
  • Note: zero is not a prime number
• This does not violate the pumping lemma
  • The pumping lemma draws no conclusion about non-regular languages
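
**Sketch: Combining Machines in Code**

The “Computing the Union by Machine”, “Is L Empty?” and “Is L1 = L2?” slides can be tied together in a few lines of Python. What follows is only a minimal sketch under stated assumptions: a DFA is represented as a tuple `(states, alphabet, delta, start, accepting)`, which is a convention chosen here, not one from the slides. The names Double-a and EVEN-EVEN come from the slides, but their diagrams did not survive, so the transition tables below are a guess at the usual machines for “contains aa” and “even number of a’s and of b’s”.

```python
# A DFA is (states, alphabet, delta, start, accepting);
# delta maps (state, symbol) -> state. This layout is an assumption of the sketch.

def accepts(dfa, w):
    """Membership: follow one transition per symbol, then check the final state."""
    _, _, delta, state, accepting = dfa
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting

def product_dfa(d1, d2, accept_rule):
    """Run d1 and d2 in lockstep from the composite start state.
    accept_rule(in_f1, in_f2) decides which composite states accept.
    Both machines are assumed to share d1's alphabet."""
    _, sigma, delta1, s1, f1 = d1
    _, _, delta2, s2, f2 = d2
    start = (s1, s2)
    states, frontier, delta = {start}, [start], {}
    while frontier:
        p, q = frontier.pop()
        for a in sigma:
            nxt = (delta1[(p, a)], delta2[(q, a)])
            delta[((p, q), a)] = nxt
            if nxt not in states:
                states.add(nxt)
                frontier.append(nxt)
    accepting = {(p, q) for (p, q) in states if accept_rule(p in f1, q in f2)}
    return (states, sigma, delta, start, accepting)

def is_empty(dfa):
    """Marking algorithm: mark the start state, then every state reachable
    from a marked state; stop early if an accepting state gets marked."""
    _, sigma, delta, start, accepting = dfa
    marked, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        if q in accepting:
            return False
        for a in sigma:
            r = delta[(q, a)]
            if r not in marked:
                marked.add(r)
                frontier.append(r)
    return True

def equivalent(d1, d2):
    """L1 = L2 iff the symmetric-difference machine accepts nothing."""
    return is_empty(product_dfa(d1, d2, lambda x, y: x != y))

# Assumed transition tables (the slide diagrams were not preserved).
DOUBLE_A = ({"x1", "x2", "x3"}, "ab",
            {("x1", "a"): "x2", ("x1", "b"): "x1",
             ("x2", "a"): "x3", ("x2", "b"): "x1",
             ("x3", "a"): "x3", ("x3", "b"): "x3"},
            "x1", {"x3"})
EVEN_EVEN = ({"y1", "y2", "y3", "y4"}, "ab",
             {("y1", "a"): "y3", ("y1", "b"): "y2",
              ("y2", "a"): "y4", ("y2", "b"): "y1",
              ("y3", "a"): "y1", ("y3", "b"): "y4",
              ("y4", "a"): "y2", ("y4", "b"): "y3"},
             "y1", {"y1"})

union = product_dfa(DOUBLE_A, EVEN_EVEN, lambda x, y: x or y)
both = product_dfa(DOUBLE_A, EVEN_EVEN, lambda x, y: x and y)
diff = product_dfa(DOUBLE_A, EVEN_EVEN, lambda x, y: x and not y)
```

Only the acceptance rule changes between union, intersection, and difference; the composite states and transitions are identical, which is why no complement machine ever has to be built.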
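
**Sketch: Testing Strings of Length p Through 2p-1**

The procedure from the “Finishing the Reasoning” slide can be written down directly. This is a brute-force sketch, not an efficient algorithm: it tries every string of each length, so its cost grows exponentially with p. The DFA tuple layout and both example machines (`CONTAINS_AA`, `ONLY_AB`) are assumptions made for illustration.

```python
from itertools import product

# A DFA is (states, alphabet, delta, start, accepting); an assumed layout.

def accepts(dfa, w):
    """Follow one transition per symbol, then check the final state."""
    _, _, delta, state, accepting = dfa
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting

def is_infinite(dfa):
    """The language is infinite iff some string of length p..2p-1 is accepted,
    where p is the number of states."""
    states, sigma, _, _, _ = dfa
    p = len(states)
    for n in range(p, 2 * p):
        for letters in product(sigma, repeat=n):
            if accepts(dfa, letters):
                return True
    return False

# Hypothetical example 1: strings containing "aa" (infinitely many).
CONTAINS_AA = ({"x1", "x2", "x3"}, "ab",
               {("x1", "a"): "x2", ("x1", "b"): "x1",
                ("x2", "a"): "x3", ("x2", "b"): "x1",
                ("x3", "a"): "x3", ("x3", "b"): "x3"},
               "x1", {"x3"})

# Hypothetical example 2: the single string "ab" (finite), with a dead state.
ONLY_AB = ({"q0", "q1", "q2", "dead"}, "ab",
           {("q0", "a"): "q1", ("q0", "b"): "dead",
            ("q1", "a"): "dead", ("q1", "b"): "q2",
            ("q2", "a"): "dead", ("q2", "b"): "dead",
            ("dead", "a"): "dead", ("dead", "b"): "dead"},
           "q0", {"q2"})
```

For `ONLY_AB` the loop examines every string of length 4 through 7 and finds none accepted, so the language is reported as finite; for `CONTAINS_AA` it stops at the first accepted string of length 3.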
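
**Sketch: Playing the Pumping Game by Brute Force**

The “game” from the “Playing Games” slide can be explored mechanically for one fixed p. The sketch below is an illustration, not a proof: it picks a hypothetical pumping length `p = 5`, enumerates every partition the opponent may choose for aᵖbᵖ, and looks for an exponent i that pushes the pumped string outside the language. All function names are invented for this example. The last lines repeat the NOTPRIME observation, where y is the whole string; that check only tries finitely many i and, like the slide, sets aside the |xy| ≤ p condition.

```python
def in_anbn(s):
    """Membership in { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def in_notprime(s):
    """Membership in NOTPRIME: strings of a's whose length is not prime."""
    return set(s) <= {"a"} and not is_prime(len(s))

def partitions(s, p):
    """Every split s = xyz the opponent may pick: |xy| <= p and |y| > 0."""
    for xy_len in range(1, min(p, len(s)) + 1):
        for x_len in range(xy_len):
            yield s[:x_len], s[x_len:xy_len], s[xy_len:]

def breaking_i(x, y, z, member, max_i=3):
    """Smallest i in 0..max_i with x y^i z outside the language, else None."""
    for i in range(max_i + 1):
        if not member(x + y * i + z):
            return i
    return None

p = 5                      # hypothetical pumping length
s = "a" * p + "b" * p      # the "wise" choice of string
# y always lies inside the a's, so pumping down (i = 0) already breaks it.
results = [breaking_i(x, y, z, in_anbn) for x, y, z in partitions(s, p)]

# NOTPRIME with y = the whole string a^(km), here k = 2, m = 3:
notprime_break = breaking_i("", "a" * 6, "", in_notprime, max_i=20)
```

Every entry of `results` is 0, matching the slide’s hint that pumping down is often easiest, while `notprime_break` is `None` because every multiple of 6 (including 0) is not prime.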