Download Presentation
## Strings

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Strings**• A string over a set A is a finite sequence of elements from A. • The set of elements from which the strings are built is called • an alphabet. Definition. An alphabet is a nonempty, finite set of indivisible symbols. We are going to denote it by . • Any program is a string of keywords, variable names, and • permissible symbols. • A programming language should satisfy general rules (grammar) • to be understood by computer (compiler). These rules are studied • by the formal theory of programming languages.**Definition. A string w over an alphabet is a sequence of**symbols, w = a1a2… an, where each ai , 1 i n. • The number of symbols is called thelengthof the string, |w|=n. There is one special string that has zero length (contains no symbols). It is called the empty string and has special notation , . |w|=0 w = . is not an element of any alphabet, . Example. Let = {a, bb, c}. Find all possible strings with length less or equal 3 built from . Length 0: Length 1: a bb c Length 2: aaabb ac bba bbbb bbc ca cbb cc Length 3: aaa aabb aac abba abbbb abbc aca acbb …**Two strings u and v can be concatenated to form a single**• string uv, that consists of the symbols of string u, followed by • symbols of of string v. The length | uv | = | u|+| v |. • u = u = u • Concatenation is associative: (uv)w = u(vw), • but not commutative: uv vu (the order is important!).**?**Q : L1L2 = L2L1 Definition. Any set of strings over some alphabet is called a language. Examples: Set of all executable computer programs is a language. Alphabet itself is a language as well (the language of all one-symbol words). Since languages are sets, we can apply all set operations to languages: union, intersection and set difference. There is one operation specific for languages: concatenation of two languages : L1L2 ={uv | u L1 and v L2 } A: L1L2L2L1**Example. Take the alphabet = {a, b, c}. Consider two**languages over alphabet : L1 ={a, ab} and L2 ={b, bc, c}. Find L1L2 and L2L1. We need to take every string from L1 and concatenate with every string from L2 . In this way we get |L1||L2 | strings: ab, abc, ac, abb, abbc, abc. Note, that not all strings are distinct, like abc. L1L2 = {ab, abc, ac, abb, abbc} . In the same way: L2L1 = {ba, bab, bca, bcab, ca, cab}. The cardinality | L1L2 | is the number of distinct strings, resulting from concatenation . In general, | L1L2 | | L1| |L2 | and | L1L2 | | L2L1 | In the example | L1L2 | = 5< | L1| |L2 |=6.**In particular, we can consider the concatenation of an**alphabet with itself: is the language of all two-symbol words. Notation: = 2 Example: ={a, b}, = 2 = {aa, ab, ba, bb} Similarly, 3 = 2, the language that consists of all 3-symbol words: 3 ={aaa, aba, baa, bba, aab, abb, bab, bbb}. So, we can define recursively for any n>1: n = n-1 To make this recursive definition agree with the basis case n =1, = 0, zero power 0 is defined as 0 = {}, (no matter what is ). Then {} ={ x | x } = { x | x }= What is 2? What is 2 3 … n?**‘Kleene star’ notation: * = 0 1 2** … So, * is the (infinite) set of all possible words over alphabet , including empty string . Example. = {0, 1}. * is an infinite set of all possible bit strings. (or all binary numbers including numbers with leading 0’s and empty string). Any language L over alphabet is a subset of *, L *. Note that {} , because ={} {} | {}|=1, ||=|{}|=0. A language L may contain , or may not.**Example. Consider two languages over alphabet = {a}:**L1={aa}, L2={, aa, aaaa}. What is L1*? By definition of Kleene star L1* = L10 L11 L12 … ={}{aa} {aaaa} {aaaaaa} … = {, aa, aaaa, aaaaaa, …} infinite set of strings of even length build from symbol a. What is L2*? L2* = L20 L21 L22 … ={}{, aa, aaaa} {, aa, aaaa, aaaaaa, aaaaaaaa}… ={, aa, aaaa, aaaaaa, …} = L1***Definition. A string u is called a substring of v if there**exist two strings x and y, such that v = xuy, and x, y * Definition. A string u is called a prefix of v if there exists a string x *, such that v = ux. Similarly, a string u is called a suffix of v if there exists a string y *, such that v = yu.**Theorem 1. Let A, B and C be sets of strings. Then**(AB)C = ACBC Proof. a) We need to prove the equality of two sets of strings. We can do it by double-inclusion, i. e. to show that i) (AB)C ACBC and ii) ACBC (AB)C**i) To prove (AB)C ACBC, it’s suffices**to show that for any string w, w(AB)C wAC BC Take any w (AB)C x, y, such that w = xy and x(AB) and yC (dfn of concat) … (xA or xB) and y C (dfn of ) … (xA and yC)or (xB and yC) (distributive property) w AC or wBC (dfn of concat) w AC BC (dfn of )**ii) To prove that ACBC (AB)C, we need to**show that for any string w, w ACBC w (AB)C Take any w AC BC w AC or w BC (dfnof ) x, y, such that w = xy and (x A and y C) or (x B and y C) (dfnof concat) So we can have two cases. In the first case, (x A and y C) implies that (x AB and y C) because A (AB). In the second case, (x B and y C) implies that (x AB and y C) because B (AB). So, in either case we have w (AB)C(dfn of concat) So, we proved ACBC (AB)C and (AB)CACBC, that means (AB)C = ACBC**Theorem 2. Let A, B and C be sets of strings. Then**(AB)C ACBC Proof. To prove subset relation we need to show that for any string w, w(AB)C wACBC. Why not to prove ACBC (AB)C as well? Let’s try. Take arbitrary wACBC wAC and wBC . (x, y, w=xy, xA and yC)and (u,v, w=uv, uB and vC) Can we imply xy=uv x = u ? No, because the same string abc may come from abc and abc Example.A ={a}, B ={ab}, C ={c, bc}. Then AB={}, (AB)C={}. AC={ac, abc} BC={abc, abbc} abc ACBC, but we can not imply that abc (AB)C={}