
Regular Languages and Grammars


dallon

Presentation Transcript


  1. CS 3240 – Chapter 3 Regular Languages and Grammars

  2. Directory Operations • How would you delete all C++ files from a directory from the command line? • How about all PowerPoint files that start with the letter a? • PowerPoint file names that contain the string 3240? CS 3240 - Regular Languages and Grammars

  3. Patterns for Strings • *.cpp • a*.ppt • *3240*.ppt • These are wildcard expressions • Not bona fide regular expressions
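
As a rough illustration of the difference, Python's standard `fnmatch` module matches shell-style wildcards and can translate a wildcard into a regular expression. The file names below are invented for the example; only the three patterns come from the slide.

```python
import fnmatch
import re

# Hypothetical file names, made up for this sketch.
files = ["main.cpp", "a_intro.ppt", "cs3240_ch3.ppt", "notes.txt", "alpha3240.ppt"]

def select(pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in files if fnmatch.fnmatchcase(name, pattern)]

cpp_files = select("*.cpp")            # all C++ files
a_ppt_files = select("a*.ppt")         # PowerPoint files starting with a
ppt_3240_files = select("*3240*.ppt")  # PowerPoint files containing 3240

# fnmatch.translate rewrites a wildcard as a regular expression string,
# which shows that wildcards and regular expressions use different syntax.
as_regex = re.compile(fnmatch.translate("a*.ppt"))
```

Here `*` in a wildcard stands for "any run of characters", whereas in a regular expression `*` is the Kleene star applied to whatever precedes it.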

  4. Where Are We? CS 3240 - Introduction

  5. Regular Expressions • Text patterns that represent regular languages • We’ll show shortly that for every regular expression there is a finite automaton that accepts the language it represents • And vice versa • The operators are: • ( ) (Grouping) • * (Kleene Star) • + (Union) • xy (Concatenation)

  6. Recursive Definitions of Sets • 1) Specify base case(s) • 2) Show how to generate other elements • Rules that use what’s in the set already • Example: the non-negative multiples of 5, F • 1) 0 and 5 are in F • 2) If x and y are in F, then x + y is in F • Alternate definition: • 1) 0 is in F • 2) If x is in F, so is x + 5
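
The two definitions of F can be compared with a small sketch. Because F is infinite, both functions only build the members up to a `limit`; that cut-off and the function names are assumptions made for this example.

```python
def multiples_by_sum(limit):
    """Definition 1: 0 and 5 are in F; x + y is in F whenever x and y are."""
    members = {0, 5}
    while True:
        # Apply the recursive rule to everything found so far.
        new = {x + y for x in members for y in members if x + y <= limit} - members
        if not new:
            return members
        members |= new

def multiples_by_step(limit):
    """Definition 2: 0 is in F; x + 5 is in F whenever x is."""
    members = {0}
    x = 0
    while x + 5 <= limit:
        x += 5
        members.add(x)
    return members

same_set = multiples_by_sum(50) == multiples_by_step(50)
```

Both definitions produce the same members up to the limit, which is what "alternate definition" is meant to convey.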

  7. Regular Expressions: Recursive Definition • Base cases: • The empty set: ∅ or ( ) • The empty string: λ • Any letter in Σ • Recursive rules: Given regular expressions r, r1, r2: • (r) (Grouping) • r* (Kleene Star) • r1 + r2 (Union) • r1r2 (Concatenation)

  8. Regular Expressions: Examples • All strings beginning with a: • a(a + b)* • All strings containing aba: • (a + b)*aba(a + b)* • All strings of even length: • ((a + b)(a + b))* = (aa + ba + ab + bb)* = ((a + b)²)* • All strings of odd length: • (a + b)((a + b)²)* • Valid decimal integers in C: • (1+2+3+4+5+6+7+8+9)(0+1+2+3+4+5+6+7+8+9)*
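
These examples can be tried out with Python's `re` module. This is only a sketch: Python writes union as `|` rather than `+`, and the character classes `[1-9]` and `[0-9]` are used here as shorthand for the long unions of digits. The dictionary keys are labels chosen for this example.

```python
import re

# Slide notation translated to Python syntax: union "+" becomes "|".
patterns = {
    "begins with a": r"a(a|b)*",
    "contains aba": r"(a|b)*aba(a|b)*",
    "even length": r"((a|b)(a|b))*",
    "odd length": r"(a|b)((a|b)(a|b))*",
    "decimal integer": r"[1-9][0-9]*",  # shorthand for (1+...+9)(0+...+9)*
}

def matches(name, text):
    """True if the whole string belongs to the named language."""
    return re.fullmatch(patterns[name], text) is not None
```

`re.fullmatch` is used instead of `re.search` because a regular expression in the formal sense describes entire strings, not substrings.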

  9. Taking Liberties with Transition Graphs • Put anything you want on an edge • Use an “else” branch as well • [0-9] (if-branch) • ~[0-9] or [^(0-9)] or else (else-branch) • [Figure: transition graph for decimal integers]

  10. What Language? • (b*ab*ab*ab* + b)* • = b*(ab*ab*ab*)* • = b* + (b*ab*ab*ab*)* • (a(a+bb)*)* • ((a + b)a)*
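
One way to gain confidence that the first three expressions denote the same language is to compare them on every short string. The sketch below does this by brute force with Python's `re` module (union written as `|`); the length bound of 8 is an arbitrary choice, so agreement here is evidence, not a proof.

```python
import re
from itertools import product

# The three expressions claimed equal on the slide, in Python syntax.
candidates = [
    r"(b*ab*ab*ab*|b)*",
    r"b*(ab*ab*ab*)*",
    r"b*|(b*ab*ab*ab*)*",
]

def language(pattern, max_len):
    """All strings over {a, b} of length <= max_len that the pattern matches."""
    compiled = re.compile(pattern)
    words = ("".join(t) for n in range(max_len + 1) for t in product("ab", repeat=n))
    return {w for w in words if compiled.fullmatch(w)}

languages = [language(p, 8) for p in candidates]
all_equal = all(lang == languages[0] for lang in languages)
```

Printing `sorted(languages[0], key=len)[:10]` is a quick way to look at the shortest members and guess what the language is.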

  11. Language Associated with a Regular Expression (Stating the Obvious) • L(∅) = ∅ • L(λ) = {λ} • L(c) = {c}, for c ∊ Σ • L((r)) = L(r) • L(r*) = L(r)* • L(r1 + r2) = L(r1) ∪ L(r2) • L(r1r2) = L(r1)L(r2)
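
These rules translate almost line for line into a recursive function. The sketch below assumes a regular expression is given as a nested tuple such as `("union", r1, r2)`; that representation and the tag names are inventions for this example. Since L(r) may be infinite, only strings of length at most `n` are returned. Grouping needs no case of its own because the tuple nesting already plays that role.

```python
def lang(r, n):
    """Return the strings of L(r) that have length at most n."""
    kind = r[0]
    if kind == "empty":                      # L(∅) = ∅
        return set()
    if kind == "lambda":                     # L(λ) = {λ}
        return {""}
    if kind == "sym":                        # L(c) = {c}
        return {r[1]} if n >= 1 else set()
    if kind == "union":                      # L(r1 + r2) = L(r1) ∪ L(r2)
        return lang(r[1], n) | lang(r[2], n)
    if kind == "concat":                     # L(r1r2) = L(r1)L(r2)
        return {x + y for x in lang(r[1], n) for y in lang(r[2], n)
                if len(x + y) <= n}
    if kind == "star":                       # L(r*) = L(r)*
        base = lang(r[1], n)
        result = frontier = {""}
        while frontier:
            # Extend the newest strings by one more factor from L(r).
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result = result | frontier
        return result
    raise ValueError(f"unknown node: {kind}")

a, b = ("sym", "a"), ("sym", "b")
begins_with_a = ("concat", a, ("star", ("union", a, b)))   # a(a + b)*
short_strings = lang(begins_with_a, 2)
```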

  12. “Algebra” of Regular Expressions • r+s = s+r • (r+s)+t = r+(s+t) • r+r = r • r+∅ = r • (rs)t = r(st) • rλ = λr = r • r∅ = ∅r = ∅ • r(s+t) = rs+rt • (r+s)t = rt+st
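
Each identity is really a statement about languages, so it can be spot-checked with ordinary sets of strings. In the sketch below, union is set union and concatenation is the helper `concat`; the three sample languages `r`, `s`, `t` are arbitrary finite choices for illustration, so this checks instances of the laws and does not prove them.

```python
def concat(A, B):
    """Concatenation of two languages given as sets of strings."""
    return {x + y for x in A for y in B}

EMPTY = set()    # the language ∅
LAMBDA = {""}    # the language {λ}

# Arbitrary finite sample languages.
r, s, t = {"a", "ab"}, {"b", ""}, {"ba"}

identities = {
    "r+s = s+r": (r | s) == (s | r),
    "(r+s)+t = r+(s+t)": ((r | s) | t) == (r | (s | t)),
    "r+r = r": (r | r) == r,
    "r+∅ = r": (r | EMPTY) == r,
    "(rs)t = r(st)": concat(concat(r, s), t) == concat(r, concat(s, t)),
    "rλ = λr = r": concat(r, LAMBDA) == concat(LAMBDA, r) == r,
    "r∅ = ∅r = ∅": concat(r, EMPTY) == concat(EMPTY, r) == EMPTY,
    "r(s+t) = rs+rt": concat(r, s | t) == (concat(r, s) | concat(r, t)),
    "(r+s)t = rt+st": concat(r | s, t) == (concat(r, t) | concat(s, t)),
}
```

Note the contrast between λ and ∅: concatenating with {λ} changes nothing, while concatenating with ∅ wipes out the language.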

  13. Regular Expressions and Finite Automata (Section 3.2) • For every regular expression there is an associated NFA that accepts the same language • And therefore a DFA, by conversion • For every FA (either NFA or DFA) there is a regular expression that represents the same language

  14. Regular Expression => NFA • We will show how to convert each element of the definition of regular expressions to an NFA • This is sufficient, because if we can give a machine for every case in the definition of REs, we are done! • It also shows the convenience of recursive definitions (review slide 7 now)

  15. Mapping Primitive REs • Empty Language • Empty String • Single Character

  16. Mapping Union of REs

  17. Mapping Union of REs: A Simplification • Just draw the lambdas from a new start state to the start states of each machine • Remove the start notation from the original start states • (No need to have a new final state)

  18. Mapping Concatenation of REs

  19. Mapping Concatenation of REs: A Simplification • 1) Just draw a lambda from each final state of the first machine to the start state of the second machine • 2) Remove the acceptability of those final states of the first machine

  20. Mapping Kleene Star of a RE

  21. Mapping Kleene Star of a RE: A Simplification • We need to do two things: • 1) Add the empty string, if needed • 2) Loop from each final state back to the start state • Procedure: • 1) If the empty string is not accepted, create a new start state which accepts, and connect to the original start state with λ • 2) Add a λ-edge from each final state to the original (or the new) start state
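
The mappings for primitives, union, concatenation and Kleene star can be sketched in code. What follows is the common textbook construction in which every machine has one start state and one final state and composite machines get fresh states joined by λ-edges; it is not the simplified variants described above and may differ in detail from the figures on the slides. The class and function names are assumptions for this example, and a label of `None` stands for λ.

```python
import itertools

_ids = itertools.count()

class NFA:
    """An NFA with one start state, one final state and a list of edges."""
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

def _pair():
    return next(_ids), next(_ids)

def empty_language():                 # accepts nothing
    s, f = _pair()
    return NFA(s, f, [])

def empty_string():                   # accepts only λ
    s, f = _pair()
    return NFA(s, f, [(s, None, f)])

def symbol(c):                        # accepts only the one-letter string c
    s, f = _pair()
    return NFA(s, f, [(s, c, f)])

def union(m1, m2):
    s, f = _pair()
    extra = [(s, None, m1.start), (s, None, m2.start),
             (m1.final, None, f), (m2.final, None, f)]
    return NFA(s, f, m1.edges + m2.edges + extra)

def concat(m1, m2):
    # λ from the final state of the first machine to the start of the second.
    return NFA(m1.start, m2.final, m1.edges + m2.edges + [(m1.final, None, m2.start)])

def star(m):
    s, f = _pair()
    extra = [(s, None, m.start), (m.final, None, f),
             (s, None, f),                 # adds the empty string
             (m.final, None, m.start)]     # loops back for repetition
    return NFA(s, f, m.edges + extra)

def closure(nfa, states):
    """All states reachable from `states` using only λ-edges."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    current = closure(nfa, {nfa.start})
    for c in w:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == c}
        current = closure(nfa, moved)
    return nfa.final in current

# a(a + b)* from slide 8, built bottom-up from the recursive definition.
machine = concat(symbol("a"), star(union(symbol("a"), symbol("b"))))
```

Because each builder returns a complete machine, any regular expression can be converted by following its recursive structure, which is the point made on slide 14.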

  22. Practice • Draw NFAs for the REs on slides 8 and 9

  23. FA => Regular Expression • First remove all jails • Then, if needed, convert the DFA to an equivalent NFA with • A start state with no incoming edges • A single final state with no outgoing edges • Will need lambda transitions for this • Then “eliminate” all but the start and final states • Without changing the language accepted • Using GTGs…

  24. Generalized Transition Graphs (GTGs) • Allow regular expressions on the edges • [Figure: a GTG that accepts a* + a*(a+b)c*] • Note: (c*)* = c*

  25. FA => RE: Step 1 • If the start state has an incoming edge (even if it’s a loop), create a new start state with a lambda transition to the old start state

  26. FA => RE: Step 2 • If there is more than one final state, or if the single final state has an outgoing edge (even if it’s a loop), create a new final state and link to it with a lambda transition from each final state

  27. FA => RE: Step 3 • “Remove” each intermediate state, one at a time: • Combine each incoming path with each outgoing path (only “through” paths; not loops) • Determine the regular expression equivalent to the combined path through the current state • Add an edge with that RE between the incoming state and the outgoing state • Repeat until all intermediate states vanish

  28. FA => RE: Example

  29. FA => RE: Example, Steps 1 and 2 • To eliminate 2: • 1-2-1: af*b • 1-2-3: af*c • 3-2-1: df*b • 3-2-3: df*c

  30. FA => RE: Example, Step 3a (State 2 removed) • To eliminate 1: • 0-1-3: (e+af*b)*(h+af*c) • 3-1-3: (i+df*b)(e+af*b)*(h+af*c)

  31. FA => RE: Example, Step 3b (State 1 removed) • Eliminate 3 (Final Result): (e+af*b)*(h+af*c)(g+df*c+(i+df*b)(e+af*b)*(h+af*c))*
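
The elimination procedure of Step 3 can be sketched as a function on a GTG stored as a dictionary from `(source, target)` pairs to regular expression strings. The edge labels below are inferred from the path expressions on slides 29 to 31, since the figure itself is not part of this transcript; states 0 and 4 are assumed to be the new start and final states added in Steps 1 and 2, joined by λ (written as the empty string). The result carries extra parentheses compared with the slide.

```python
import re

def eliminate(edges, state):
    """Remove `state` from a GTG given as {(src, dst): regex string}."""
    loop = edges.get((state, state))
    star = f"({loop})*" if loop is not None else ""
    incoming = {p: r for (p, q), r in edges.items() if q == state and p != state}
    outgoing = {q: r for (p, q), r in edges.items() if p == state and q != state}
    # Keep every edge that does not touch the removed state.
    result = {k: r for k, r in edges.items() if state not in k}
    for p, r_in in incoming.items():
        for q, r_out in outgoing.items():
            bypass = f"({r_in}){star}({r_out})"
            # Union the bypass with any edge already going from p to q.
            result[(p, q)] = f"{result[(p, q)]}+{bypass}" if (p, q) in result else bypass
    return result

# Assumed reconstruction of the example GTG ("" is a λ-edge).
gtg = {
    (0, 1): "", (3, 4): "",
    (1, 1): "e", (1, 2): "a", (1, 3): "h",
    (2, 2): "f", (2, 1): "b", (2, 3): "c",
    (3, 3): "g", (3, 2): "d", (3, 1): "i",
}

for state in (2, 1, 3):          # same elimination order as the slides
    gtg = eliminate(gtg, state)

final_expression = gtg[(0, 4)]

def accepts(w):
    """Test a string against the result, using Python's | for union."""
    return re.fullmatch(final_expression.replace("+", "|"), w) is not None
```

Eliminating the states in a different order generally gives a different-looking but equivalent expression.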

  32. FA => RE: EVEN-EVEN

  33. Exercise • Find a regular expression for the language containing all strings that do not contain the substring aa

  34. FA => RE: Online Document • See bypass.doc • Shows different possibilities by eliminating states in different orders • But the REs obtained are equivalent • Meaning they represent the same language

  35. Where Are We?

  36. Regular Grammars (Section 3.3) • There is a natural correspondence between FAs and grammars • Right-linear Grammars • “Linear” means there is at most one variable on the right-hand side of the rule • “Right-linear” means the variable occurs as the last entry in the rule: • A → abC

  37. Equivalence of FAs and Grammars • The variables represent states • The right-hand side contains the character(s) on the edge, optionally followed by the target state • The accepting states have a lambda rule • A → aB | bC | λ • B → aA | bD • C → aD | bA • D → aC | bB
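
Reading a right-linear grammar as a machine can be made concrete with a small recognizer. In this sketch each variable is treated as a state: a rule consumes its terminals and moves to the trailing variable, and a rule without a variable must match the rest of the input exactly. It assumes variables are single uppercase letters, λ is written as the empty string, and there are no cycles of rules of the form A → B; the dictionary layout and function names are choices made for this example.

```python
from functools import lru_cache

# The grammar from the slide; "" stands for λ.
GRAMMAR = {
    "A": ["aB", "bC", ""],
    "B": ["aA", "bD"],
    "C": ["aD", "bA"],
    "D": ["aC", "bB"],
}

@lru_cache(maxsize=None)
def derives(var, w):
    """True if the string w can be derived starting from the variable var."""
    for rhs in GRAMMAR[var]:
        if rhs and rhs[-1].isupper():
            # Rule of the form var -> terminals Variable: follow the edge.
            prefix, target = rhs[:-1], rhs[-1]
            if w.startswith(prefix) and derives(target, w[len(prefix):]):
                return True
        elif w == rhs:
            # Rule with no variable (including λ): the input must end here.
            return True
    return False

def accepts(w):
    return derives("A", w)
```

For instance, `accepts("abab")` follows the states A, B, D, C, A and then uses the rule A → λ.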

  38. Rules Without a Variable • Go to an accepting state with no out-edges • A → b

  39. Another Grammar for EVEN-EVEN • S → aaS | bbS | abA | baA | λ • A → aaA | bbA | abS | baS • [Figure: a GTG]

  40. Exercise • Construct a regular grammar for the language denoted by aab*a • First build a GTG • Then map to a right-linear grammar

  41. A Left-Linear Grammar: aab*a • S → Xa • X → Xb | aa • How did I come up with this?

  42. Left-linear = Right-linear • If the single variable appears only at the left end of each right-hand side, you have a left-linear grammar • This is also a regular grammar • We will show how to convert between right-linear and left-linear grammars • We will use two facts to establish the process: • If L is regular, so is Lᴿ (Section 2.3, exercise 12) • L(Gᴿ) = L(G)ᴿ (obvious, but on next slide…)

  43. L(Gᴿ) = L(G)ᴿ • Gᴿ means you reverse the right-hand side of each rule in a grammar, G • The language generated is L(G)ᴿ (the reverse of L(G)) • G: S → abS | X; X → bX | λ, which generates (ab)*b* • Gᴿ: S → Sba | X; X → Xb | λ, which generates b*(ba)*
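
The claim can be checked on short strings by generating both languages. The sketch below reverses every right-hand side of the slide's grammar and then enumerates all terminal strings up to a length bound for each grammar. It assumes single uppercase letters for variables, the empty string for λ, and at most one variable per right-hand side; the bound of 6 and the function names are arbitrary choices for this example.

```python
def reverse_grammar(grammar):
    """Reverse the right-hand side of every rule."""
    return {var: [rhs[::-1] for rhs in rules] for var, rules in grammar.items()}

def generate(grammar, start, max_len):
    """All terminal strings with at most max_len letters derivable from start."""
    results, seen, frontier = set(), {start}, [start]
    while frontier:
        form = frontier.pop()
        # A linear grammar keeps at most one variable in a sentential form.
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            results.add(form)
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for ch in new if not ch.isupper())
            if terminals <= max_len and new not in seen:
                seen.add(new)
                frontier.append(new)
    return results

G = {"S": ["abS", "X"], "X": ["bX", ""]}     # generates (ab)*b*
GR = reverse_grammar(G)                       # S -> Sba | X, X -> Xb | λ

forward = generate(G, "S", 6)
backward = generate(GR, "S", 6)
reversal_holds = backward == {w[::-1] for w in forward}
```

Reversing a right-linear grammar this way always yields a left-linear one, which is why the two reversals on the following slides end up back at the original language.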

  44. Convert Right-linear to Left-linear: Using 2 Reversals • Convert the right-linear grammar to a GTG • “Reverse” the GTG (a la Section 2.3, #12) • Ensure a single final state (use λ if needed) • Interchange the role of the start and final states • Reverse all arrows • Convert the reversed GTG to a right-linear grammar • Reverse the right-hand sides of each rule to obtain the left-linear grammar

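  45. Example: Converting Right-linear to Left-linear, (aab)*ab • Original right-linear grammar, for (aab)*ab: A → aB; B → abA | b • Right-linear grammar from the reversed GTG, for ba(baa)*: C → bB; B → aA; A → baB | λ • Left-linear grammar after reversing the right-hand sides, for (aab)*ab: C → Bb; B → Aa; A → Bab | λ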
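
The four steps of the conversion can be sketched as one function and run on this example. The sketch assumes single uppercase letters for variables, the empty string for λ, and that the name passed as `final` (here "C", matching the slide) is not already used as a variable. Edge labels are reversed along with the arrows, which is what the example's step from B → abA to A → baB implies.

```python
def right_to_left_linear(grammar, start, final="C"):
    """Convert a right-linear grammar via GTG reversal; returns both grammars."""
    # Step 1: grammar -> GTG edges (source, label, target).
    edges = []
    for var, rules in grammar.items():
        for rhs in rules:
            if rhs and rhs[-1].isupper():
                edges.append((var, rhs[:-1], rhs[-1]))
            else:
                edges.append((var, rhs, final))   # rule without a variable
    # Step 2: reverse the GTG: flip every arrow and reverse its label.
    # The old final state becomes the start; the old start becomes final.
    reversed_edges = [(dst, label[::-1], src) for src, label, dst in edges]
    # Step 3: reversed GTG -> right-linear grammar for the reversed language.
    reversed_right = {}
    for src, label, dst in reversed_edges:
        reversed_right.setdefault(src, []).append(label + dst)
    reversed_right.setdefault(start, []).append("")   # accepting state gets λ
    # Step 4: reverse each right-hand side to get the left-linear grammar.
    left = {var: [rhs[::-1] for rhs in rules] for var, rules in reversed_right.items()}
    return reversed_right, left

original = {"A": ["aB"], "B": ["abA", "b"]}          # (aab)*ab
reversed_right, left_linear = right_to_left_linear(original, start="A")
```

The start variable of both resulting grammars is the added state C, and the λ rule lands on A because A was the original start state.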

  46. Convert Left-linear to Right-linear: Reverse the Steps on Previous Slide • Reverse the grammar, G, obtaining right-linear grammar, Gᴿ, for L(G)ᴿ • Convert to GTG • Reverse the GTG • Convert to Right-linear

  47. Summary
