1 / 27

Lecture Two: Formal Languages

Lecture Two: Formal Languages. Amjad Ali. Formal Language. It is an abstraction of the general characteristics of programming languages It consists of a set of symbols and some rules of formation of sentences Sentences are formed by grouping the symbols. Formal Language.

huslu
Download Presentation

Lecture Two: Formal Languages

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture Two: Formal Languages Amjad Ali

  2. Formal Language • It is an abstraction of the general characteristics of programming languages • It consists of a set of symbols and some rules of formation of sentences • Sentences are formed by grouping the symbols

  3. Formal Language • A formal language is the set of all strings permitted by the rules of formation

  4. What is a language? • A system for the expression of certain ideas, facts, or concepts, including a set of symbols and rules for their manipulation

  5. Mathematical definition of a language • This shall require us to understand the following concepts first • Alphabets • Strings • Concatenation of strings etc.

  6. Alphabet • An ALPHABET is a nonempty set of symbols • It is denoted by S • Example: S = {a,b} where a and b are symbols

  7. Alphabets • An alphabet is any finite set of symbols • {0,1}: binary alphabet • {0,1,2,3,4,5,6,7,8,9}: decimal alphabet • ASCII, Unicode: machine-text alphabets • Or just {a,b}: enough for many examples • {}: a legal but not usually interesting alphabet • We will usually use  as the name of the alphabet we’re considering, as in  = {a,b}

  8. Strings • Strings are constructed from the individual symbols • Strings are finite sequences of symbols from the alphabet • Example : aabba, ababaaa, abbbaaa, etc are the strings formed by t he symbols of the alphabet

  9. Symbols And Variables • Sometimes we will use variables that stand for strings: x = abbb • In programming languages, syntax helps distinguish symbols from variables • String x = "abbb"; • In formal language, we rely on context and naming conventions to tell them apart • We'll use the first letters, like a, b, and c, as symbols • The last few, like x, y, and z, will be string variables

  10. Assumptions • Lower case letters a,b,c,… are used for elements of the alphabet • Lower case letters u,v,w,… for string names eg w=aabbaba • This indicates that w is a string having specific value aabbaba

  11. Empty String • The empty string is written as  • Like "" in some programming languages • || = 0 • Don't confuse empty set and empty string: • {}   • {}  {}

  12. Concatenation • The concatenation of two strings x and y is the string containing all the symbols of x in order, followed by all the symbols of y in order • We show concatenation just by writing the strings next to each other • If x = abc and y = def, then xy = abcdef • For any x, x = x = x

  13. Concatenation of the strings • Two strings are concatenated by appending the symbols of one string to the end of the other string • Example u=aaabbb v=abbabba Concatenated string uv=aaabbbabbabba

  14. Length of the string • The length of the string is the number of symbols in the string • |w| = 5 if w = aabaa • Empty String has no symbols and is denoted by l • |l| = 0

  15. Kleene Star • The Kleene closure of an alphabet , written as *, is the language of all strings over  • {a}* is the set of all strings of zero or more as: {, a, aa, aaa, …} • {a,b}* is the set of all strings of zero or more symbols, each of which is either a or b= {, a, b, aa, bb, ab, ba, aaa, …} • x  * means x is a string over  • Unless  = {}, * is infinite

  16. Kleene Star • Iterating a language L • L ={ε} • L =L • L =L·L • L =L ·L • Kleene star: L*=Un≥ 0 L • Example: {a,b}* = {ε,a,b,aa,bb,ab,ba, aab, …} • all finite sequences over {a,b}.

  17. S+andS* • S is an alphabet • S* is the set of allstrings obtained by concatenating zero or more symbols fromS • S* always contains l then S+ = S* - {l}

  18. Finiteness • S is always finite • S* and S+ are always infinite

  19. Numbers • We use N to denote the set of natural numbers: N = {0, 1, …}

  20. Exponents • We use N to denote the set of natural numbers: N = {0, 1, …} • Exponent n concatenates a string with itself n times • If x = ab, then • x0 =  • x1 = x = ab • x2 = xx = abab, etc. • We use parentheses for grouping exponentiations (assuming that  does not contain the parentheses) • (ab)7 = ababababababab

  21. Languages • A language is a set of strings over some fixed alphabet • Not restricted to finite sets: in fact, finite sets are not usually interesting languages • All our alphabets are finite, and all our strings are finite, but most of the languages we're interested in are infinite

  22. Language • A language L is defined very generally as a subset of S • Astring in a language L will be called a sentence of L

  23. Set Formers • A set written with extra constraints or conditions limiting the elements of the set • Not the rigorous definitions we're looking for, but a useful notation anyway: • {x {a, b}* | |x| ≤ 2} = {,a, b, aa, bb, ab, ba} • {xy | x {a, aa} and y {b, bb}} = {ab, abb, aab, aabb} • {x {a, b}* | x contains one a and two bs} = {abb, bab, bba} • {anbn | n ≥ 1} = {ab, aabb, aaabbb, aaaabbbb, ...}

  24. Free Variables in Set Formers • Unless otherwise constrained, exponents in a set former are assumed to range over all N • Examples • {(ab)n} = {, ab, abab, ababab, abababab, ...} • {anbn} = {, ab, aabb, aaabbb, aaaabbbb, ...}

  25. The Quest • Set formers are relatively informal • They can be vague, ambiguous, or self-contradictory • A big part of our quest in the study of formal language is to develop better tools for defining languages

  26. Problem • S = {a,b} • S* = {l,a,b,aa,ab,ba,bb,aaa,aab,aba, abb, baa,bab,bba,bbb,aaaa,… ….} • L = {a,aa,aab} is a language on S as L is a subset of S* and is finite • L = {anbn:n>0} is also a subset of S* but it is infinite

  27. Concatenation of two Languages • L1L2 = {xy :x ε L1 and y ε L2 }

More Related