Grammars and languages

Models of Language



What is it about?

  • Language:

    (1) The words, their pronunciation, and the methods of combining them used and understood by a community.

    (2) A system of signs and symbols and rules for using them that is used to carry information.

    - from a Webster dictionary -

Formal Languages and Grammars: definition



(1) A phrase structured (also called type 0) grammar is a 4-tuple

G = < VT , VN , P , S > , where

  • VT : terminal alphabet (called morphemes by linguists),

  • VN : nonterminals (also called variables, or syntactic categories),

  • V = VT VN: total alphabet,

  • S VN: the start symbol, and

  • P : a finite set of production (also called rewriting) rules of the form 

    which means  generates (or produces) , where  V*VNV* and V*.

Formal Languages and Grammars: definition (cont’d)

  • Notice that V*VNV* is the set of strings over the total alphabet that contain at least one nonterminal symbol.

  • For two strings w1 and w2, we write w1 ⇒ w2 to denote that w2 can be derived from w1 by applying one production rule of a grammar G.

  • We write w1 ⇒* w2 to denote that w2 can be derived from w1 by applying some finite number of production rules, including zero.

  • The language of a grammar G, denoted by L(G), is the set of strings over VT that can be generated by G starting with the start symbol S, i.e.,

    L(G) = { x | x ∈ VT* and S ⇒* x }.

  • Following the convention we will use uppercase letters for nonterminal symbols

    and lowercase letters for terminal symbols.
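The relations ⇒ and ⇒* defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every symbol is a single character, uppercase letters are nonterminals, the empty string stands for ε, and the names `one_step` and `language_up_to` as well as the sample grammar are hypothetical, not part of the lecture. Because L(G) is usually infinite, the search is cut off by length, so it returns a bounded approximation of L(G).

```python
from collections import deque

def one_step(w, rules):
    """Return every string w' with w => w' (one rule applied at one position)."""
    results = set()
    for lhs, rhs in rules:
        start = w.find(lhs)
        while start != -1:
            results.add(w[:start] + rhs + w[start + len(lhs):])
            start = w.find(lhs, start + 1)
    return results

def language_up_to(rules, start="S", max_len=6, max_form_len=10):
    """Collect terminal strings x with S =>* x and len(x) <= max_len.

    Sentential forms longer than max_form_len are discarded, so the
    result is only a bounded approximation of L(G).
    """
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        w = queue.popleft()
        if not any(ch.isupper() for ch in w):   # no nonterminal left
            if len(w) <= max_len:
                found.add(w)
            continue
        for nxt in one_step(w, rules):
            if len(nxt) <= max_form_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found

# Hypothetical sample grammar: S -> aSb | ε
sample_rules = [("S", "aSb"), ("S", "")]
print(sorted(language_up_to(sample_rules), key=len))
```

The breadth-first order is a design choice: it visits shorter derivations first, which keeps the cut-off easy to reason about.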



Formal Languages and Grammars: definition (cont’d)

(2) Context-sensitive (type 1) grammars are type 0 grammars with the following restriction:

|α| ≤ |β| (i.e., noncontracting), except for S → ε.

(3) Context-free (type 2) grammars are type 0 grammars with the restriction

|α| = 1, i.e., the left side of every production rule has only one symbol, which is a nonterminal.

(4) Regular (type 3) grammars are type 2 grammars with the restriction β = xB or β = x, for some x ∈ VT* and B ∈ VN.


type 0 : G = < {a}, {S,A,B,C,D,E}, P, S >, where

P = { S → ACaB | a,   AD → AC,
      Ca → aaC,       aE → Ea,
      CB → DB | E,    AE → ε,
      aD → Da }

L(G) = { a^(2^n) | n ≥ 0 }


type 1 : G = < {a,b,c}, {S,B,C}, P, S >

P = { S → aSBC | aBC,
      CB → BC,   bB → bb,
      aB → ab,   bC → bc,
      cC → cc }

L(G) = { a^i b^i c^i | i ≥ 1 }


type 2 : G = < {0,1}, {S,A,B}, P, S >

P = { S → ASB | ε,   A → 0,   B → 1 }

L(G) = { 0^i 1^i | i ≥ 0 }

type 3 : G = < {0,1}, {S,A}, P, S >

P = { S → 0S | A,   A → 1A | ε }

L(G) = { 0^i 1^j | i, j ≥ 0 }
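The restrictions that separate the four types can be checked mechanically. The sketch below is an illustration, not the lecture's own procedure: `grammar_type` is a hypothetical helper, symbols are assumed to be single characters, ε is written as the empty string, and the four rule sets are the example grammars above transcribed as best they can be read from the slides. The function tries type 3 first, then type 2, then type 1, because a type 2 grammar with a rule like A → ε satisfies the type 2 restriction without satisfying the type 1 restriction.

```python
def grammar_type(rules, nonterminals, start="S"):
    """Return 3, 2, 1, or 0: the highest-numbered type whose restriction
    is met by every rule (lhs, rhs)."""
    def is_type0(lhs, rhs):
        # The left side must contain at least one nonterminal.
        return any(ch in nonterminals for ch in lhs)

    def is_type1(lhs, rhs):
        # Noncontracting, except for S -> ε.
        return len(lhs) <= len(rhs) or (lhs == start and rhs == "")

    def is_type2(lhs, rhs):
        # The left side is a single nonterminal.
        return len(lhs) == 1 and lhs in nonterminals

    def is_type3(lhs, rhs):
        # Type 2, and the right side is xB or x with x a terminal string.
        body = rhs[:-1] if rhs and rhs[-1] in nonterminals else rhs
        return is_type2(lhs, rhs) and not any(ch in nonterminals for ch in body)

    if not all(is_type0(l, r) for l, r in rules):
        raise ValueError("every left side needs at least one nonterminal")
    for level, test in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(test(l, r) for l, r in rules):
            return level
    return 0

# The four example grammars, as read from the slides (ε written as "").
type0_rules = [("S", "ACaB"), ("S", "a"), ("Ca", "aaC"), ("CB", "DB"),
               ("CB", "E"), ("aD", "Da"), ("AD", "AC"), ("aE", "Ea"), ("AE", "")]
type1_rules = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"), ("bB", "bb"),
               ("aB", "ab"), ("bC", "bc"), ("cC", "cc")]
type2_rules = [("S", "ASB"), ("S", ""), ("A", "0"), ("B", "1")]
type3_rules = [("S", "0S"), ("S", "A"), ("A", "1A"), ("A", "")]

print(grammar_type(type0_rules, set("SABCDE")))  # 0
print(grammar_type(type1_rules, set("SBC")))     # 1
print(grammar_type(type2_rules, set("SAB")))     # 2
print(grammar_type(type3_rules, set("SA")))      # 3
```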

Remarks on Grammars

The following remarks summarize subtle conceptual aspects concerning formal grammars and their languages that we have defined in the class. Let G = < VT , VN , P , S > be a grammar.

  • The set of rules P does not have any order explicitly defined that must be observed when a string is derived. Recall that the language L(G) is the set of terminal strings that can be generated by applying a finite sequence of production rules.

    However, it is not true that every sequence of production rules produces a terminal string. We may end up with a string which has a nonterminal symbol that can never derive a terminal (or null) string.

    For example, consider the grammar below, which is type 1. (For convenience, we will only show the set of production rules, written according to the convention, because VT, VN, and the start symbol S can be identified from them.)

    (1) S → ABC   (2) AB → ab   (3) BC → bc   (4) bC → bc

    Only rules (1), (2), (4), applied in this order, derive the terminal string abc, which is the only member of the language of the grammar. If you apply (1) followed by (3), you will be stuck with Abc, which cannot be a member of the language because the string has a nonterminal symbol A and no rule applies to it.
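The dead end described above can be reproduced by exhaustively exploring every derivation from S. The sketch below is an illustration only: `one_step` and `explore` are hypothetical helper names, symbols are single characters with uppercase meaning nonterminal, and the search terminates only because this particular grammar has finitely many sentential forms.

```python
def one_step(w, rules):
    """Return every string obtained from w by applying one rule once."""
    results = set()
    for lhs, rhs in rules:
        start = w.find(lhs)
        while start != -1:
            results.add(w[:start] + rhs + w[start + len(lhs):])
            start = w.find(lhs, start + 1)
    return results

def explore(rules, start="S"):
    """Return (terminal strings, stuck forms) reachable from the start symbol.

    A stuck form still contains a nonterminal but no rule applies to it.
    Assumes the set of reachable sentential forms is finite.
    """
    seen, stack = {start}, [start]
    terminal, stuck = set(), set()
    while stack:
        w = stack.pop()
        successors = one_step(w, rules)
        if not any(ch.isupper() for ch in w):
            terminal.add(w)
        elif not successors:
            stuck.add(w)
        for nxt in successors - seen:
            seen.add(nxt)
            stack.append(nxt)
    return terminal, stuck

# Rules (1)-(4) from the remark above.
rules = [("S", "ABC"), ("AB", "ab"), ("BC", "bc"), ("bC", "bc")]
terminal, stuck = explore(rules)
print(terminal)  # {'abc'}
print(stuck)     # {'Abc'}
```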

Remarks on Grammars (cont’d)

  • Rule (3) of the grammar above is useless in the sense that it does not contribute to the generation of the language. We can delete the rule from the grammar without affecting the language of the grammar. In general, the decision problem of whether an arbitrary grammar has a useless rule or not is unsolvable. However, if we restrict the problem to the class of context-free grammars (type 2), we can effectively clean up such useless rules, if any. We will learn how to do this.

  • The grammars that we have defined in the class are sequential in the sense that only one rule is allowed to apply at a time. Notice that in the above grammar, if we apply rules AB → ab and BC → bc simultaneously on string ABC, which is derived from S, we will get terminal string abbc, which is not a member of the language according to our definition. There is a class of grammars where more than one rule can be applied simultaneously. We call such rules parallel rewriting rules. In general it is very difficult to study parallel rewriting grammars. However, the language of a context-free grammar does not depend on how you apply the rules. We get the same language independent of the mode of rule application, sequential or parallel. Why? The answer is left for the reader.
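For the context-free case mentioned in the first remark above, one common textbook way to clean up useless rules is to drop rules that mention a nonterminal deriving no terminal string, and then drop rules whose left side cannot be reached from S. The sketch below follows that approach; it is not necessarily the procedure taught later in the class, and the function name `useful_rules` and the sample grammar are hypothetical. Symbols are assumed to be single characters, with uppercase letters as nonterminals.

```python
def useful_rules(rules, start="S"):
    """Keep only the context-free rules whose symbols are generating and reachable."""
    def all_generating(rhs, generating):
        return all((not ch.isupper()) or ch in generating for ch in rhs)

    # Step 1: find nonterminals that derive some terminal (or null) string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in generating and all_generating(rhs, generating):
                generating.add(lhs)
                changed = True
    kept = [(l, r) for l, r in rules
            if l in generating and all_generating(r, generating)]

    # Step 2: find nonterminals reachable from the start symbol via kept rules.
    reachable, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for lhs, rhs in kept:
            if lhs == current:
                for ch in rhs:
                    if ch.isupper() and ch not in reachable:
                        reachable.add(ch)
                        frontier.append(ch)
    return [(l, r) for l, r in kept if l in reachable]

# Hypothetical sample grammar: S -> AB | a, A -> b (B has no rules at all).
sample = [("S", "AB"), ("S", "a"), ("A", "b")]
print(useful_rules(sample))  # [('S', 'a')]
```

The order of the two steps matters in this sketch: removing S → AB first is what makes A unreachable, so running the reachability step first would leave A → b behind.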

Remarks on Grammars (cont’d)

  • Context-free grammars are defined as type 0 (not type 1) grammars with the restriction |α| = 1. It follows that a context-free grammar can have a contracting rule, like A → ε, while type 1 grammars cannot have contracting rules except for S → ε. Later we will see that every context-free grammar that has ε-production rules can be converted to a grammar whose only ε-production is S → ε, which is present only if the grammar produces the null string.

  • By definition, a regular grammar cannot have rules of any of the following forms, where A, B, C are arbitrary nonterminals and a, b are terminals.

    A → bBC     A → abBa     A → Ba

  • We can define the same class of regular languages using production rules restricted to the forms A → Bx or A → x. Notice that the nonterminal symbol on the right side of such a production rule, if any, must be at the left end of the string. We call these rules left linear and the rules defined in the class right linear. However, the definition does not allow a type 3 grammar to mix left linear and right linear forms (e.g., S → aB, B → Sb | b).
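The distinction between right linear, left linear, and mixed rule sets can be checked rule by rule. The following is a minimal sketch under the same single-character convention (uppercase letters are nonterminals, the empty string is ε); the names `rule_shape` and `linearity` are hypothetical.

```python
def rule_shape(lhs, rhs):
    """Return the set of linear forms ('right', 'left') that A -> rhs fits."""
    if len(lhs) != 1 or not lhs.isupper():
        return set()
    positions = [i for i, ch in enumerate(rhs) if ch.isupper()]
    if not positions:
        return {"right", "left"}      # A -> x fits both forms
    if len(positions) > 1:
        return set()                  # more than one nonterminal on the right
    shapes = set()
    if positions[0] == len(rhs) - 1:
        shapes.add("right")           # A -> xB
    if positions[0] == 0:
        shapes.add("left")            # A -> Bx
    return shapes

def linearity(rules):
    """Classify a rule set as 'right linear', 'left linear', 'mixed', or 'not linear'."""
    shapes = [rule_shape(lhs, rhs) for lhs, rhs in rules]
    if all("right" in s for s in shapes):
        return "right linear"
    if all("left" in s for s in shapes):
        return "left linear"
    if all(s for s in shapes):
        return "mixed"                # each rule is linear, but directions differ
    return "not linear"

print(linearity([("S", "0S"), ("S", "A"), ("A", "1A"), ("A", "")]))  # right linear
print(linearity([("S", "aB"), ("B", "Sb"), ("B", "b")]))             # mixed
print(linearity([("A", "bBC")]))                                     # not linear
```

Under the right linear definition used in the class, only the first result corresponds to a type 3 grammar; the "mixed" case is the one excluded by the last remark.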
