String and Language Operations

String and Language Operations • Concatenation • Exponentiation • Kleene star • Pages 27-30 of the text • Regular expressions • Pages 71-75 of the text

String Concatenation • If x and y are strings over alphabet S, the concatenation of x and y is the string xy formed by writing the symbols of x and the symbols of y consecutively. • Suppose x = abb and y = ba • xy = abbba • yx = baabb

Properties of String Concatenation • Suppose x, y, and z are strings. • Concatenation is not commutative. • xy is not guaranteed to be equal to yx • Concatenation is associative • (xy)z = x(yz) = xyz • The empty string is the identity for concatenation • x/\ = /\x = x

Language Concatenation • Suppose L1 and L2 are languages (sets of strings). • The concatenation of L1 and L2, denoted L1L2, is defined as • L1L2 = { xy | x ∈ L1 and y ∈ L2 } • Example • Let L1 = { ab, bba } and L2 = { aa, b, ba } • What is L1L2? • Solution • Let x1 = ab, x2 = bba, y1 = aa, y2 = b, y3 = ba • L1L2 = { x1y1, x1y2, x1y3, x2y1, x2y2, x2y3 } = { abaa, abb, abba, bbaaa, bbab, bbaba }
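
The definition of L1L2 can be sketched in C++ as follows. This is a minimal illustration, not part of the original slides: it assumes languages are finite and models them as std::set<std::string>; the names Language and concat are hypothetical.

```cpp
#include <set>
#include <string>

// Assumption: a language is modeled as a finite set of strings.
using Language = std::set<std::string>;

// Returns L1L2 = { xy | x in L1 and y in L2 }.
Language concat(const Language& l1, const Language& l2) {
    Language result;
    for (const std::string& x : l1)
        for (const std::string& y : l2)
            result.insert(x + y);  // std::set silently drops duplicate strings
    return result;
}
```

Calling concat with { ab, bba } and { aa, b, ba } yields the six strings listed in the solution above. Because the result is a set, a string that arises from two different pairs (x, y) is stored only once.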

Language Concatenation is not commutative • Let L1 = { aa, bb, ba } and L2 = { /\, aba } • Let x1= aa, x2= bb, x3=ba, y1= /\, y2= aba • L1L2 = { x1y1, x1y2, x2y1, x2y2, x3y1, x3y2 } = { aa, aaaba, bb, bbaba, ba, baaba } • L2L1 = { y1x1, y1x2, y1x3, y2x1, y2x2, y2x3 } = { aa, bb, ba, abaaa, ababb, ababa } • L2L2 = { y1y1, y1y2, y2y1, y2y2 } = { /\, aba, aba, abaaba } = { /\, aba, abaaba } (dropped extra aba)

Associativity of Language Concatenation • (L1L2)L3 = L1(L2L3) = L1L2L3 • Example • Let L1 = {a,b}, L2 = {c,d}, and L3 = {e,f} • L1L2L3 = ({a,b}{c,d}){e,f} = {ac, ad, bc, bd}{e,f} = { ace, acf, ade, adf, bce, bcf, bde, bdf } • L1L2L3 = {a,b}({c,d}{e,f}) = {a,b}{ce, cf, de, df} = { ace, acf, ade, adf, bce, bcf, bde, bdf }

Special Cases • What language is the identity for language concatenation? • The set containing only the empty string /\: {/\} • Example • {aab,ba,abc}{/\} = {/\}{aab,ba,abc} = {aab,ba,abc} • What about {}? • For any language L, L {} = {} L = {} • Thus {} for concatenation is like 0 for multiplication • Example • {aab,ba,abc}{} = {}{aab,ba,abc} = {} • The intuitive reason is that we must choose a string from both sets that are being concatenated, but there is nothing to choose from {}.

Exponentiation • We use exponentiation to indicate the number of items being concatenated • Symbols • Strings • Set of symbols (S for example) • Set of strings (languages) • a^3 = aaa • x^3 = xxx • S^3 = SSS = { x ∈ S* | |x| = 3 } • L^3 = LLL

Examples of Exponentiation • Let x = abb, S = {a,b}, L = {ab,b} • a^4 = aaaa • x^3 = (abb)(abb)(abb) = abbabbabb • S^3 = SSS = {a,b}{a,b}{a,b} = { aaa, aab, aba, abb, baa, bab, bba, bbb } • L^3 = LLL = {ab,b}{ab,b}{ab,b} = { ababab, ababb, abbab, abbb, babab, babb, bbab, bbb }

Results of Exponentiation • Exponentiation of a symbol or a string results in a string. • Exponentiation of a set of symbols or a set of strings results in a set of strings • a symbol → a string • a string → a string • a set of symbols → a set of strings • a set of strings → a set of strings

Special Cases of Exponentiation • a^0 = /\ • x^0 = /\ • S^0 = { /\ } • L^0 = { /\ } for any language L • {aa,bb}^0 = { /\ } • { a, aa, aaa, aaaa, … }^0 = { /\ } • { /\ }^0 = { /\ } • ∅^0 = { }^0 = { /\ }

Kleene Star • Kleene * is a unary operation on languages. • Kleene * is not an operation on strings • However, see the pages on regular expressions. • L* is the set of strings obtained by concatenating any finite number (zero or more) of strings from L. • L* = ∪_{k≥0} L^k = L^0 ∪ L^1 ∪ L^2 ∪ … • For any L, /\ is always an element of L* • because L^0 = { /\ } • Thus, for any L, L* != ∅

Example of Kleene Star • Let L = {aa} • L^0 = { /\ } • L^1 = L = { aa } • L^2 = { aaaa } • L^3 = … • L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ … • = { /\, aa, aaaa, aaaaaa, … } • = set of all strings that can be obtained by concatenating 0 or more copies of aa

Example of Kleene Star • Let L = {aa, b} • L^0 = { /\ } • L^1 = L = { aa, b } • L^2 = LL = { aaaa, aab, baa, bb } • L^3 = … • L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ … • = set of all strings that can be obtained by concatenating 0 or more copies of aa and b
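
Since L* is infinite whenever L contains a nonempty string, a program can only list part of it. The sketch below enumerates the strings of L* up to a chosen length bound. It is an illustration under stated assumptions: L is finite, "" stands for /\, and the names Language and kleeneStarUpTo (and the maxLen parameter) are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Assumption: a language is modeled as a finite set of strings.
using Language = std::set<std::string>;

// Returns every string of L* whose length is at most maxLen.
Language kleeneStarUpTo(const Language& l, std::size_t maxLen) {
    Language result = {""};    // L^0 = { /\ } is always contained in L*
    Language frontier = {""};  // strings discovered in the previous round
    while (!frontier.empty()) {
        Language next;
        for (const std::string& w : frontier) {
            for (const std::string& x : l) {
                std::string wx = w + x;
                // Keep wx only if it is within the bound and not seen before.
                if (wx.size() <= maxLen && result.insert(wx).second)
                    next.insert(wx);
            }
        }
        frontier = next;
    }
    return result;
}
```

For L = {aa} and a bound of 6, the result is { /\, aa, aaaa, aaaaaa }, which matches the first example above. The loop stops once a round produces no new string within the bound, so it terminates even when L contains /\.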

Regular Languages • Regular languages are languages that can be obtained from the very simple languages over S, using only • Union • Concatenation • Kleene Star • See lecture 14 and pages 71-75 of the text

Examples of Regular Languages • {aab} (i.e. {a}{a}{b} ) • {aa,b} (i.e. {a}{a} ∪ {b} ) • {a,b}*: language of strings that can be obtained by concatenating any number of a’s and b’s • {bb}{a,b}*: language of strings that begin with bb (followed by any number of a’s and b’s) • {a}*{bb,/\}: language of strings that begin with any number of a’s and end with an optional bb. • {a}* ∪ {b}*: language of strings that consist of only a’s or only b’s, including /\.

Regular Expressions • We can simplify the formula for regular languages slightly by • leaving out the set brackets { } and • replacing ∪ with + • The results are called regular expressions.

Examples of Regular Expressions

String or Language? • Consider the regular expression a*(bb+/\) • a*(bb+/\) is a string over alphabet {a, b, *, +, /\, (, ), ∅ } • a*(bb+/\) represents a language over alphabet {a, b} • It represents the language of strings over {a,b} that begin with any number of a’s and end with an optional bb. • Some regular expressions look just like strings over alphabet {a,b} • Regular expression aaba represents the language {aaba} • Regular expression /\ represents the language {/\} • It should be clear from the context whether a sequence of symbols is a regular expression or just a string.
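
The difference between the expression (a string of symbols) and the language it represents can be illustrated with the C++ standard library. This sketch is not from the slides. It assumes the ECMAScript syntax of std::regex, in which the optional bb of a*(bb+/\) is written (bb)? and the course's + (union) would be written |; the function name inLanguage is hypothetical.

```cpp
#include <regex>
#include <string>

// Tests whether w belongs to the language represented by a*(bb+/\).
bool inLanguage(const std::string& w) {
    // The pattern is itself just a string; std::regex turns it into a matcher.
    static const std::regex pattern("a*(bb)?");
    // regex_match succeeds only if the whole of w matches the pattern.
    return std::regex_match(w, pattern);
}
```

Here "a*(bb)?" is a string over a larger alphabet, while inLanguage decides membership in a language over {a, b}: it accepts the empty string, aaa, bb, and aabb, and rejects b and abba.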

Module 1: Course Overview • Course: CSE 460 • Instructor: Dr. Eric Torng • TA: To be determined

What is this course? • Philosophy of computing course • We take a step back to think about computing in broader terms • Science of computing course • We study fundamental ideas/results that shape the field of computer science • “Applied” computing course • We study a broad range of material with relevance to computing today

Philosophy • Philosophy of life • What is the purpose of life? • What are we capable of accomplishing in life? • Are there limits to what we can do in life? • Why do we drive on parkways and park on driveways? • Philosophy of computing • What is the purpose of programming? • What can we achieve through programming? • Are there limits to what we can do with programs? • Why don’t debuggers actually debug programs?

Science • Physics: study of fundamental physical laws and phenomena like gravity and electricity • Engineering: governed by physical laws • Our material: study of fundamental computational laws and phenomena like undecidability and universal computers • Programming: governed by computational laws

Applied computing • Applications are not immediately obvious • In some cases, seeing the applicability of this material requires advanced abstraction skills • Every year, there are people who leave this course unable to see the applicability of the material • Others require more material in order to completely understand their application • for example, to understand how regular expressions and context-free grammars are applied to the design of compilers, you need to take a compilers course

Some applications • Important programming languages • regular expressions (perl) • finite state automata (used in hardware design) • context-free grammars • Proofs of program correctness • Subroutines • Using them to prove problems are unsolvable • String searching/Pattern matching • Algorithm design concepts such as recursion

Fundamental Theme * • What are the capabilities and limitations of computers and computer programs? • What can we do with computers/programs? • Are there things we cannot do with computers/programs?

Module 2: Fundamental Concepts • Problems • Programs • Programming languages

Problems We view solving problems as the main application for computer programs

Definition • A problem is a mapping or function between a set of inputs and a set of outputs • Example Problem: Sorting • Inputs (4,2,3,1) and (3,1,2,4) map to output (1,2,3,4) • Input (7,5,1) maps to output (1,5,7) • Input (1,2,3) maps to output (1,2,3)

How to specify a problem • Input • Describe what an input instance looks like • Output • Describe what task should be performed on the input • In particular, describe what output should be produced

Example Problem Specifications* • Sorting problem • Input • Integers n1, n2, ..., nk • Output • n1, n2, ..., nk in nondecreasing order • Find element problem • Input • Integers n1, n2, …, nk • Search key S • Output • yes if S is in n1, n2, …, nk, no otherwise
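
To make the two specifications concrete, here is one possible pair of programs that solve them, written in standard C++ rather than the course's modified C++. This is a sketch under assumptions: inputs are passed as std::vector<int>, and the function names solveSorting and solveFindElement are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorting problem: returns n1, n2, ..., nk in nondecreasing order.
std::vector<int> solveSorting(std::vector<int> input) {
    std::sort(input.begin(), input.end());  // sorts the local copy
    return input;
}

// Find element problem: "yes" if key is in the list, "no" otherwise.
std::string solveFindElement(const std::vector<int>& input, int key) {
    bool found = std::find(input.begin(), input.end(), key) != input.end();
    return found ? "yes" : "no";
}
```

Each function maps every legal input to exactly one output, which is the sense in which a program solves a problem. The find element problem has only two possible outputs, yes and no.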

Programs Programs solve problems

Purpose • Why do we write programs? • One answer • To solve problems • What does it mean to solve a problem? • Informal answer: For every legal input, a correct output is produced. • Formal answer: To be given later

Programming Language • Definition • A programming language defines what constitutes a legal program • Example: a pseudocode program may not be a legal C++ program which may not be a legal C program • A programming language is typically referred to as a “computational model” in a course like this.

C++ • Our programming language will be C++ with minor modifications • Main procedure will use input parameters in a fashion similar to other procedures • no argc/argv • Output will be returned • type specified by main function type

Maximum Element Problem • Input • integer n >= 1 • List of n integers • Output • The largest of the n integers

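C++ Program which solves the Maximum Element Problem*
    // Modified C++ as described earlier: main receives the input as
    // parameters and returns the output.
    int main(int A[], int n) {
        int i, max;
        if (n < 1)
            return ("Illegal Input");  // input violates n >= 1
        max = A[0];
        for (i = 1; i < n; i++)
            if (A[i] > max)
                max = A[i];            // remember the largest value seen so far
        return (max);
    }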
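
The program above is written in the course's modified C++, so it will not compile as standard C++. A possible standard C++ counterpart is sketched below. It is an adaptation, not the course's program: it assumes the list is passed as a std::vector<int>, reports illegal input by throwing std::invalid_argument, and uses the hypothetical function name maxElement.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the largest of the integers in a; requires at least one element.
int maxElement(const std::vector<int>& a) {
    if (a.empty())
        throw std::invalid_argument("Illegal Input");  // corresponds to n < 1
    int max = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[i] > max)
            max = a[i];  // remember the largest value seen so far
    return max;
}
```

The vector carries its own length, so the separate parameter n is no longer needed. An exception replaces the string return value because a standard C++ function cannot return both an int and a message.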

Fundamental Theme Exploring capabilities and limitations of C++ programs

Restating the Fundamental Theme * • We will study the capabilities and limits of C++ programs • Specifically, we will try and identify • What problems can be solved by C++ programs • What problems cannot be solved by C++ programs

Question • Is C++ general enough? • Or is it possible that there exists some problem P such that • P can be solved by some program in some other reasonable programming language • but P cannot be solved by any C++ program?

Church’s Thesis (modified) • We have no proof of an answer, but it is commonly accepted that no such problem exists, that is, C++ is general enough. • Church’s Thesis (three equivalent statements) • C++ is a general model of computation • Any algorithm can be expressed as a C++ program • If some algorithm cannot be expressed by a C++ program, it cannot be expressed in any reasonable programming language

Summary * • Problems • When we talk about what programs can or cannot “DO”, we mean what PROBLEMS can or cannot be solved

Module 3: Classifying Problems • One of the main themes of this course will be to classify problems in various ways • By solvability • Solvable, “half-solvable”, unsolvable • We will focus our study on decision problems • function (one correct answer for every input) • finite range (yes or no is the correct output)

Classification Process • Take some set of problems and partition it into two or more subsets of problems where membership in a subset is based on some shared problem characteristic • [Figure: a set of problems partitioned into Subset 1, Subset 2, and Subset 3]

Classify by Solvability • The criterion used is whether or not the problem is solvable • that is, does there exist a C++ program which solves the problem? • [Figure: the set of all problems partitioned into solvable problems and unsolvable problems]

Function Problems • We will focus on problems where the mapping from input to output is a function • [Figure: the set of all problems partitioned into function problems and non-function problems]

General (Relation) Problem • the mapping is a relation • that is, more than one output is possible for a given input • [Figure: mapping from inputs to outputs]

Criteria for Function Problems • mapping is a function • unique output for each input • [Figure: mapping from inputs to outputs]

Example Non-Function Problem • Divisor Problem • Input: Positive integer n • Output: A positive integral divisor of n • [Figure: input 9 maps to outputs 1, 3, and 9]
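
One way to see that the Divisor Problem is not a function problem is to list every correct output for a given input. The sketch below does this by trial division. It is an illustration only: it assumes n fits in an int and n >= 1, and the function name divisors is hypothetical.

```cpp
#include <vector>

// Returns all positive divisors of n in increasing order (assumes n >= 1).
// Every element of the result is a correct output of the Divisor Problem.
std::vector<int> divisors(int n) {
    std::vector<int> result;
    for (int d = 1; d <= n; ++d)
        if (n % d == 0)
            result.push_back(d);
    return result;
}
```

For n = 9 the list is 1, 3, 9, so three different outputs are all correct for the same input. A program that solves the problem may return any one of them.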
