CS322 Week 13 Wednesday Review: Coconuts, Formal Languages, and Regular Expressions
In this session, we recap previous topics, including exam preparation, graphing functions, and asymptotic bounds. We present a logical problem involving ten marooned castaways and their coconuts, leading into a discussion of formal languages in computer science. We outline what constitutes a formal language, including rules for strings and alphabets. Additionally, we explore regular expressions, their components, and provide practical examples. Participants are encouraged to engage with questions and solve problems collaboratively.
Week 13 - Wednesday CS322
Last time • What did we talk about last time? • Exam 3 • Before review: • Graphing functions • Rules for manipulating asymptotic bounds • Computing bounds for running time functions
Logical warmup • Ten people are marooned on a deserted island • During their first day they gather many coconuts and put them all in a community pile • They are so tired that they decide to divide them into ten equal piles the next morning • That night one castaway wakes up hungry and decides to take his share early • After dividing up the coconuts he finds he is one coconut short of ten equal piles • He notices a monkey holding one coconut • He tries to take the monkey's coconut so that the total is evenly divisible by 10 • However, when he tries to take it, the monkey hits him on the head with it, killing him • Later, another castaway wakes up hungry and decides to take his share early • On the way to the coconuts he finds the body of the first castaway and realizes that he is now entitled to 1/9 of the total pile • After dividing them up into nine piles he is again one coconut short of an even division and tries to take the monkey's (slightly) bloody coconut • Again, the monkey hits the second man on the head and kills him • Each of the remaining castaways goes through the same process, until the 10th person to wake up realizes that he has the entire pile for himself • What is the smallest number of coconuts in the original pile (ignoring the monkey's)?
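One way to read the puzzle, assuming nobody ever removes a coconut (each castaway is killed before taking a share), is that the pile must be exactly one short of a multiple of 10, of 9, and so on down to 2. The brute-force sketch below checks that condition directly; the function name `smallest_pile` is made up for illustration.

```python
def smallest_pile(castaways: int = 10) -> int:
    """Smallest N such that N + 1 is divisible by every k in 2..castaways."""
    n = 1
    while True:
        # With k castaways alive, the pile is one coconut short of k equal piles.
        if all((n + 1) % k == 0 for k in range(2, castaways + 1)):
            return n
        n += 1


print(smallest_pile())
```

Under this reading the answer is one less than the least common multiple of 2 through 10, so the search could also be replaced by a single lcm computation.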
Formal languages • Computer science grew out of a lot of different pieces • Mathematics • Engineering • Linguistics • Describing an algorithm precisely requires that it be framed in terms of some formal language with exact rules
Rules • We say that a language is a set of strings • A string is an ordered n-tuple of elements of an alphabet Σ or the empty string ε (which has no characters) • An alphabet Σ is a finite set of characters
Examples • Let alphabet Σ = {a, b} • Define a language L1 over Σ to be the set of all strings that begin with the character a and have length at most three characters • Write out L1 • A palindrome is a string which stays the same if the order of its characters is reversed • Define a language L2 over Σ to be the set of all palindromes made up of characters from Σ • Write 10 strings in L2
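A small sketch of how these two languages could be enumerated mechanically. It assumes strings are generated shortest first, and because L2 is infinite it only samples palindromes up to an arbitrarily chosen length bound of 4. The helper name `strings_up_to` is hypothetical.

```python
from itertools import product

SIGMA = ("a", "b")


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


# L1: strings that begin with 'a' and have length at most three.
L1 = [s for s in strings_up_to(SIGMA, 3) if s.startswith("a")]

# L2 is infinite, so only list palindromes up to a chosen length bound.
L2_sample = [s for s in strings_up_to(SIGMA, 4) if s == s[::-1]]

print(L1)
print(L2_sample[:10])
```

Note that the empty string reads the same in both directions, so it shows up in the palindrome sample.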
Notation • Let Σ be some alphabet • For any nonnegative integer n, • Σ^n is the set of all strings over Σ that have length n • Σ^+ is the set of all strings over Σ that have length at least 1 • Σ* is the set of all strings over Σ • Σ* is called the Kleene closure of Σ and the * operator is often called the Kleene star
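These three sets can be sketched in code as follows. The alphabet {x, y, z} is just an illustrative choice, and because the set of strings of length at least 1 and the Kleene closure are infinite, they are written as lazy generators that list strings by increasing length. The function names are assumptions, not standard library calls.

```python
from itertools import count, islice, product


def sigma_n(alphabet, n):
    """All strings over `alphabet` of length exactly n."""
    return ["".join(chars) for chars in product(alphabet, repeat=n)]


def sigma_star(alphabet):
    """Lazily enumerate the Kleene closure by increasing length (infinite)."""
    for n in count(0):
        yield from sigma_n(alphabet, n)


def sigma_plus(alphabet):
    """Same enumeration, but starting at length 1 so the empty string is skipped."""
    for n in count(1):
        yield from sigma_n(alphabet, n)


sigma = ["x", "y", "z"]
print(sigma_n(sigma, 0))  # the only string of length 0 is the empty string
print(sigma_n(sigma, 2))
print(list(islice(sigma_star(sigma), 5)))
```

Listing by length first is what makes the enumeration systematic: every string appears after finitely many steps.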
Examples • Let alphabet Σ = {x, y, z} • Find Σ^0, Σ^1, and Σ^2 • What is A = Σ^0 ∪ Σ^1? What is B = Σ^1 ∪ Σ^2? How would you describe these sets and the set A ∪ B in words? • Describe a systematic way of writing out Σ^+ • How would you have to change your system to write out Σ*?
More notation • Let Σ be a finite alphabet • Given strings x and y over Σ, the concatenation of x and y is the string made by writing x with y appended afterwards • With languages L and L' over Σ, we can define the following new languages: • Concatenation of L and L', written LL' • LL' = { xy | x ∈ L and y ∈ L' } • Union of L and L', written L ∪ L' • L ∪ L' = { x | x ∈ L or x ∈ L' } • Kleene closure of L, written L* • L* = { x | x is a concatenation of any finite number of strings in L }
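For finite languages, these three operations can be sketched with Python sets. The Kleene closure is usually infinite, so the version here is truncated at a length bound, which is an assumption made only so the result can be printed. The sample languages and function names are hypothetical.

```python
def concat(L, M):
    """Concatenation: every string xy with x in L and y in M."""
    return {x + y for x in L for y in M}


def union(L, M):
    """Union: every string that is in L or in M."""
    return L | M


def kleene_up_to(L, max_len):
    """Strings of the Kleene closure of L with length <= max_len."""
    result = {""}      # concatenating zero strings gives the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string from L, keeping new ones.
        frontier = {w + x for w in frontier for x in L
                    if len(w + x) <= max_len} - result
        result |= frontier
    return result


L = {"a", "ab"}
M = {"b", "bb"}
print(sorted(concat(L, M)))
print(sorted(union(L, M)))
print(sorted(kleene_up_to(M, 3), key=lambda s: (len(s), s)))
```

Because sets discard duplicates, the two ways of building "abb" in the concatenation (a + bb and ab + b) count once.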
Examples • Let alphabet Σ = {a, b} • Let L1 be the set of all strings consisting of an even number of a's (including the empty string) • Let L2 = {b, bb, bbb} • Find • L1L2 • L1 ∪ L2 • (L1 ∪ L2)*
Regular expressions • It's getting annoying trying to describe infinite languages using ellipses • Notation called a regular expression can allow us to express languages precisely and compactly • Given a finite alphabet Σ, we can define regular expressions recursively: • Base: The empty set ∅, the empty string ε, and any individual character in Σ are regular expressions • Recursion: If r and s are regular expressions over Σ, then the following are too: • Concatenation: (rs) • Alternation: (r | s) • Kleene star: (r*) • Restriction: Nothing else is a regular expression
Languages defined by a regular expression • For a finite alphabet Σ, the language L(r) defined by a regular expression r is as follows • Base: L(∅) = ∅, L(ε) = {ε}, L(a) = {a} for every a ∈ Σ • Recursion: If L(r) and L(r') are the languages defined by the regular expressions r and r' over Σ, then • L(r r') = L(r)L(r') • L(r | r') = L(r) ∪ L(r') • L(r*) = (L(r))*
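The recursive definition translates almost line for line into code. The sketch below represents a regular expression as a small tree (the class names are invented for illustration) and computes only the strings of L(r) up to a length bound, since L(r) can be infinite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:   # the empty set
    pass


@dataclass(frozen=True)
class Eps:     # the empty string
    pass


@dataclass(frozen=True)
class Char:    # a single character of the alphabet
    c: str


@dataclass(frozen=True)
class Cat:     # concatenation (rs)
    r: object
    s: object


@dataclass(frozen=True)
class Alt:     # alternation (r | s)
    r: object
    s: object


@dataclass(frozen=True)
class Star:    # Kleene star (r*)
    r: object


def lang(r, max_len):
    """Strings of L(r) with length <= max_len, following the recursive definition."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Char):
        return {r.c} if max_len >= 1 else set()
    if isinstance(r, Alt):
        return lang(r.r, max_len) | lang(r.s, max_len)
    if isinstance(r, Cat):
        return {x + y for x in lang(r.r, max_len) for y in lang(r.s, max_len)
                if len(x + y) <= max_len}
    if isinstance(r, Star):
        base = lang(r.r, max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {w + x for w in frontier for x in base
                        if len(w + x) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression node: {r!r}")


# The expression ab*(c | ε), built from the three recursive rules.
M = Cat(Cat(Char("a"), Star(Char("b"))), Alt(Char("c"), Eps()))
print(sorted(lang(M, 3), key=lambda s: (len(s), s)))
```

Truncating at a length bound is safe here because every string of length at most n in a concatenation or star is built from pieces that are themselves of length at most n.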
Examples • Let Σ = {a, b, c} • Let language L = a | (b | c)* | (ab)* • Write 5 strings in L • Let language M = ab*(c | ε) • Write 5 strings in M
Order of precedence • For the sake of consistency, regular expressions obey a particular order of precedence • * is the highest precedence • Concatenation is the next highest • Alternation is the lowest • Parentheses can be omitted if there is no ambiguity • Write (a((bc)*)) with as few parentheses as possible • Write a | b* c, using parentheses to mark the precedence of each operation
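Python's `re` module follows the same precedence (star, then concatenation, then alternation), so it can be used to check a parenthesization. The sketch below compares the sets of short strings that two patterns match in full; the helper `matches` and the length bound are assumptions made for illustration.

```python
import re
from itertools import product


def matches(pattern, max_len=4, alphabet="abc"):
    """Set of strings up to max_len over `alphabet` that fully match `pattern`."""
    return {"".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)
            if re.fullmatch(pattern, "".join(t))}


# Star binds tighter than concatenation, which binds tighter than alternation.
assert matches(r"a|b*c") == matches(r"(a|((b*)c))")
assert matches(r"a|b*c") != matches(r"(a|b)*c")
assert matches(r"a(bc)*") == matches(r"(a((bc)*))")
```

Agreement on all strings up to a bound is evidence, not proof, that two expressions define the same language.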
Equivalences • As before, let Σ = {a,b} • Can you describe (a | b)* in another way? • What about ( ε | a* | b* )*? • Given that L = a*b(a | b)*, write 5 strings that belong to L • Let M = a* | (ab)* • Which of the following belong to M? • a • b • aaaa • abba • ababab
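Membership questions like the last one can be checked with Python's `re` module, whose `|` and `*` agree with the notation used here for this pattern. A minimal sketch, where `in_M` is a hypothetical helper name:

```python
import re

M = re.compile(r"a*|(ab)*")  # M = a* | (ab)*


def in_M(w: str) -> bool:
    # fullmatch requires the whole string, not just a prefix, to match.
    return M.fullmatch(w) is not None


for w in ["a", "b", "aaaa", "abba", "ababab"]:
    print(w, in_M(w))
```

Using `fullmatch` rather than `match` or `search` matters: a string belongs to the language only if the entire string is described by the expression.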
Examples • Let Σ = {0, 1} • Find regular expressions for the following languages: • The language of all strings of 0's and 1's that have even length and in which the 0's and 1's alternate • The language consisting of all strings of 0's and 1's with an even number of 1's • The language consisting of all strings of 0's and 1's that do not contain two consecutive 1's • The language that gives all binary numbers written in normal form (that is, without leading zeroes, and the empty string is not allowed)
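One way to gain confidence in an answer is to compare it against a plain description of the language on every short string. The patterns below are candidate answers, not the only possible ones, and the treatment of "0" as a valid binary number in normal form is an assumption. `1?` is shorthand for (1 | ε).

```python
import re
from itertools import product


def all_strings(max_len, alphabet="01"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)


def alternates(w):
    """True when no two adjacent characters of w are equal."""
    return all(a != b for a, b in zip(w, w[1:]))


# name -> (candidate pattern, plain-Python description of the intended language)
CANDIDATES = {
    "even length, alternating": (
        r"(01)*|(10)*", lambda w: len(w) % 2 == 0 and alternates(w)),
    "even number of 1's": (
        r"(0|10*1)*", lambda w: w.count("1") % 2 == 0),
    "no two consecutive 1's": (
        r"(0|10)*1?", lambda w: "11" not in w),
    "binary, no leading zeroes": (
        r"0|1(0|1)*", lambda w: w != "" and (w == "0" or w[0] == "1")),
}


def disagreements(pattern, predicate, max_len=8):
    """Strings on which the pattern and the description disagree."""
    return [w for w in all_strings(max_len)
            if bool(re.fullmatch(pattern, w)) != predicate(w)]


for name, (pattern, predicate) in CANDIDATES.items():
    print(name, pattern, disagreements(pattern, predicate))
```

An empty list of disagreements only covers strings up to the chosen bound, so it supports a candidate without proving it.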
Practical notation • Regular expressions are used in some programming languages (notably Perl) and in grep and other find-and-replace tools • The notation is generally extended to make it a little easier, as in the following: • [A-C] means any character in that range, so [A-C] means (A | B | C) • [0-9] means (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9) • [ABC] means (A | B | C) • ABC means the concatenation of A, B, and C • A dot stands for any single character: A.C could match AxC, A&C, ABC • ^ at the start of a bracket means NOT, thus [^D-Z] means any character other than D through Z • Repetitions: • R? means 0 or 1 repetitions of R • R* means 0 or more repetitions of R • R+ means 1 or more repetitions of R • Notations vary and have considerable complexity • Use this notation to describe the regular expression for legal C++ identifiers
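A possible answer to the identifier exercise, written with Python's `re` module. This is a simplified sketch: it assumes ASCII-only identifiers and does not exclude keywords or names the language reserves.

```python
import re

# Assumption: ASCII-only identifiers; keywords and reserved names are not excluded.
# First character: a letter or underscore. Remaining characters: letters, digits,
# or underscores, repeated zero or more times.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(s: str) -> bool:
    return IDENTIFIER.fullmatch(s) is not None


for s in ["count", "_tmp1", "2fast", "my-var", ""]:
    print(repr(s), is_identifier(s))
```

The pattern combines three pieces of the extended notation: ranges in brackets, concatenation, and the star.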
Finite-state automaton • A finite-state automaton is an idealized machine composed of five objects: • A finite set I, called the input alphabet, of input symbols • A finite set S of states the automaton can be in • A designated state s0 called the initial state • A designated set of states called the set of accepting states • A next-state function N: S × I → S that maps a current state with current input to the next state
Transition diagram • FSA's are often described with a state transition diagram • The starting state has an arrow • The accepting states are marked with circles • Each rule is represented by a labeled transition arrow • The following FSA represents a vending machine [Figure: transition diagram with states 0¢, 25¢, 50¢, 75¢, $1, and $1.25, and transitions labeled quarter and half-dollar]
FSA example • Consider this FSA: [Figure: transition diagram with states s0, s1, and s2, and transitions labeled 0 and 1] • What are its states? • What are its input symbols? • What is the initial state of A? • What are the accepting states of A? • What is N(s1, 1)? • What's a verbal description for the strings accepted?
Annotated next-state tables • Consider the same FSA: [Figure: transition diagram with states s0, s1, and s2, and transitions labeled 0 and 1] • We can also describe an FSA using an annotated next-state table • A next-state table shows what the transition is for each state for all possible input • An annotated next-state table also marks the initial state and accepting states • Find the annotated next-state table for this FSA
Table to transition diagram • Consider the following annotated next-state table (with marks for the initial state and the accepting states): [Table: annotated next-state table, not recoverable from the transcript] • Draw the corresponding state transition diagram
FSA example • Consider this FSA again: [Figure: transition diagram with states s0, s1, and s2, and transitions labeled 0 and 1] • Which state will be reached on the following inputs: • 01 • 0011 • 0101100 • 10101 • What's a verbal description for the strings accepted?
Eventual-state function • Let A be an FSA with a set of states S, set of input symbols I, and next-state function N: S × I → S • Let I* be the set of all strings over I • The eventual-state function N*: S × I* → S is the following • N*(s, w) = the state that A goes to if the symbols of w are input to A in sequence, starting with A in state s • All of this is just a notational convenience so that we have a way of talking about the state that a string will transition an FSA to • We say that w is accepted by A iff N*(s0, w) is an accepting state of A • The language of A, L(A) = { w ∈ I* | w is accepted by A }
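The five parts of an automaton and the eventual-state function can be sketched as a small class. The class layout and the sample automaton (strings of 0's and 1's that end in 1) are hypothetical; the sample is not the automaton drawn in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FSA:
    inputs: frozenset      # I, the input alphabet
    states: frozenset      # S, the set of states
    initial: str           # s0, the initial state
    accepting: frozenset   # the set of accepting states
    next_state: dict       # N, stored as a dict keyed by (state, symbol)

    def eventual_state(self, s, w):
        """N*(s, w): feed the symbols of w in sequence, starting in state s."""
        for symbol in w:
            s = self.next_state[(s, symbol)]
        return s

    def accepts(self, w):
        """w is accepted iff N*(s0, w) is an accepting state."""
        return self.eventual_state(self.initial, w) in self.accepting


# Hypothetical example: accepts strings of 0's and 1's that end in 1.
A = FSA(
    inputs=frozenset("01"),
    states=frozenset({"s0", "s1"}),
    initial="s0",
    accepting=frozenset({"s1"}),
    next_state={("s0", "0"): "s0", ("s0", "1"): "s1",
                ("s1", "0"): "s0", ("s1", "1"): "s1"},
)
print(A.eventual_state("s0", "0110"))
print(A.accepts("0101"), A.accepts(""))
```

On the empty string the loop body never runs, so N*(s, ε) = s, which matches the definition.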
Designing automata • Design a finite-state automaton that accepts the set of all strings of 0's and 1's such that the number of 1's in the string is divisible by 3 • Make a regular expression for this language • Design a finite-state automaton that accepts the set of all strings of 0's and 1's that contain exactly one 1 • Make a regular expression for this language
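One possible design for each automaton, written as a next-state table and checked against a candidate regular expression on every string up to length 8. The state numbering, the helper `run`, and both patterns are assumptions offered as a sketch; other correct designs exist.

```python
import re
from itertools import product


def run(table, start, accepting, w):
    """Simulate a deterministic automaton given as a dict (state, symbol) -> state."""
    state = start
    for symbol in w:
        state = table[(state, symbol)]
    return state in accepting


# State k means "the number of 1's seen so far is k mod 3".
div3 = {(k, "0"): k for k in range(3)}
div3.update({(k, "1"): (k + 1) % 3 for k in range(3)})

# States: 0 = no 1 yet, 1 = exactly one 1, 2 = more than one 1 (a trap state).
one1 = {(0, "0"): 0, (0, "1"): 1,
        (1, "0"): 1, (1, "1"): 2,
        (2, "0"): 2, (2, "1"): 2}

# (next-state table, initial state, accepting states, candidate regular expression)
DESIGNS = [
    (div3, 0, {0}, r"(0|10*10*1)*"),
    (one1, 0, {1}, r"0*10*"),
]

for table, start, accepting, pattern in DESIGNS:
    for n in range(9):
        for t in product("01", repeat=n):
            w = "".join(t)
            assert run(table, start, accepting, w) == bool(re.fullmatch(pattern, w))
print("automata and regular expressions agree on all strings up to length 8")
```

The trap state in the second design is what lets the automaton stay deterministic: once a second 1 arrives, no later input can lead back to acceptance.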
Next time… • More on finite state automata • Simplifying FSA's
Reminders • Read Chapter 12