Grammars and languages

PowerPoint Slideshow about ' Grammars and languages' - jaimie



Hybrid and uncertain systems

České vysoké učení technické v Praze (Czech Technical University in Prague)

Fakulta dopravní (Faculty of Transportation Sciences)


Language

A language L is a set of strings over the alphabet T

  • alphabet = finite set of symbols (letters)

Example:

T = {a,b,c}

L = {abc, abbc, ab}

Grammars

A grammar is a quadruple (4-tuple):

G = (N, T, P, S)

where:

N - a set of non-terminal symbols

T - a set of terminal symbols

P - a set of rules

S - the start symbol of the grammar, S ∈ N

Rules

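  • the set of rules:

    P ⊆ (N ∪ T)* N (N ∪ T)* × (N ∪ T)*

    • the first component, (N ∪ T)* N (N ∪ T)*, is the left side of the rule

    • the second component, (N ∪ T)*, is the right side of the rule: an arbitrary string consisting of terminal and non-terminal symbols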

Rules

  • the rule (α, β) ∈ P is written in the form

      α → β

    • meaning: "α is rewritten to β"

the left side always contains at least one non-terminal symbol (a non-terminal symbol can be rewritten)

An example of simple grammar

  • the grammar generating strings of n zeros followed by n ones (n ≥ 1): 00…011…1

    G = (N, T, P, S)

    N = { S, A }

    T = { 0, 1 }

    P = { S → 0A1, A → 0A1, A → ε }

    (the symbol ε denotes the empty string)

An example of simple grammar

  • generated string (sentence):

    S ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000111

    Terminology:

  • λ = γ1αγ2 generates μ = γ1βγ2 directly, if the rule α → β exists

    • it is denoted λ ⇒ μ

    • example: 00A11 ⇒ 000A111
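
The direct derivation step can be sketched in Python. This is a minimal illustration, not part of the original slides: the names `RULES`, `direct_derivations` and `derives` are hypothetical, the empty string stands for the empty symbol, and the search depth `max_steps` is an arbitrary assumption.

```python
# Rules of the example grammar: S -> 0A1, A -> 0A1, A -> empty string.
RULES = [("S", "0A1"), ("A", "0A1"), ("A", "")]


def direct_derivations(form: str) -> set[str]:
    """Return every string that `form` generates directly (one rule application)."""
    results = set()
    for left, right in RULES:
        start = form.find(left)
        while start != -1:
            # Replace this single occurrence of the left side by the right side.
            results.add(form[:start] + right + form[start + len(left):])
            start = form.find(left, start + 1)
    return results


def derives(source: str, target: str, max_steps: int = 10) -> bool:
    """Check whether `source` generates `target` in at most `max_steps` steps."""
    frontier = {source}
    for _ in range(max_steps + 1):
        if target in frontier:
            return True
        frontier = {nxt for form in frontier for nxt in direct_derivations(form)}
    return False
```

For example, `direct_derivations("00A11")` contains `"000A111"`, which is the direct derivation shown in the example above.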

Terminology

  • λ generates μ, if a sequence α1, α2, …, αn exists such that λ = α1, μ = αn and αi ⇒ αi+1,

    i = 1 … n−1

    • it is denoted λ ⇒* μ

    • the sequence of strings is a derivation

    • example: 0A1 ⇒* 000111

  • derivation description

    • the sequence of rules (as on the previous slide)

    • derivation tree

Derivation tree

S
  0
  A
    0
    A
      0
      A
        ε
      1
    1
  1

Languages and grammars

  • the language L(G) generated by the grammar G is the set of all sentences derivable from the start symbol: L(G) = { w ∈ T* | S ⇒* w }

  • the grammar G and the language L(G) generated by the grammar are equivalent

    Note:

  • sentences of the language consist only of terminal symbols

Grammar classification by Chomsky

  • grammars are classified by the form of their rules

    • general (unlimited; usually called unrestricted)

    • context (usually called context-sensitive)

    • context-free

    • regular

Grammar classification

  • unlimited - L(0)

    • rules are general

  • context - L(1)

    • γ1Aγ2 → γ1βγ2, A ∈ N, γ1, γ2 is the context,

      γ1, γ2 ∈ (N ∪ T)*, β ∈ (N ∪ T)+

  • context-free

    • A → β, A ∈ N, β ∈ (N ∪ T)+

  • regular

Grammar classification

  • unlimited grammars generate unlimited languages - L(0)

  • context grammars generate context languages - L(1)

  • context-free grammars generate context-free languages

  • regular grammars generate regular languages

Example of unlimited grammar

  • G = (N, T, P, S)

  • N = { S, B }

  • T = { a, b, c }

  • P = { S → abc, S → aSBc, cB → Bc, bB → bb }

  • the grammar generates the language L = { aⁿbⁿcⁿ | n ≥ 1 }
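
The short sentences of this grammar can be listed with a bounded breadth-first search over derived strings. The sketch below is an illustration only: `RULES`, `NONTERMINALS` and `sentences` are hypothetical names, and the length bound `max_len` is an assumption that keeps the search finite (no rule of this grammar shortens a string).

```python
from collections import deque

# Rules of the example: S -> abc, S -> aSBc, cB -> Bc, bB -> bb.
RULES = [("S", "abc"), ("S", "aSBc"), ("cB", "Bc"), ("bB", "bb")]
NONTERMINALS = {"S", "B"}


def sentences(max_len: int) -> list[str]:
    """Breadth-first search over derived strings no longer than max_len."""
    seen = {"S"}
    queue = deque(["S"])
    found = set()
    while queue:
        form = queue.popleft()
        if not NONTERMINALS & set(form):
            found.add(form)  # only terminal symbols left: a sentence
            continue
        for left, right in RULES:
            start = form.find(left)
            while start != -1:
                nxt = form[:start] + right + form[start + len(left):]
                # No rule shortens a string, so longer strings can be pruned.
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(left, start + 1)
    return sorted(found, key=len)
```

With `max_len = 9` the search returns the three shortest sentences: `abc`, `aabbcc` and `aaabbbccc`.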

Example of context grammar

  • the third rule of the previous example, cB → Bc, is not a rule of a context grammar; the others are valid

  • we transform the previous grammar into a context one

  • a rule AB → BA is replaced with a set of rules of a context grammar:

    • in each rule only one symbol is rewritten; the unchanged symbol is the context

  • AB → XB (context B)

  • XB → XA (context X)

  • XA → BA (context A)

Example of context grammar

  • but the swapping of symbols cannot be applied directly to the third rule, cB → Bc

    Why?

    Because a terminal symbol (here c) cannot be rewritten.

  • we add a new non-terminal symbol C and the rule cC → cc, and we modify the other rules

Example of context language

G = (N, T, P, S)

  • N = { S, B, C, X }

  • T = { a, b, c }

  • P = { S → abC, S → aSBC, CB → XB,

    XB → XC, XC → BC, bB → bb, bC → bc,

    cC → cc }

  • the grammar generates the same language

Using grammars in programming

  • lexical elements of programming languages (keywords, constants) are defined by regular grammars

  • the syntax of programming languages is defined by context-free grammars

Regular grammars

  • the form of the rules:

    A → aB or A → a,

    where A, B ∈ N, a ∈ T

    Note:

    • rules of the form A → aB belong to a right regular grammar

    • rules of the form A → Ba belong to a left regular grammar

Example

  • a grammar that generates positive integer constants in the C programming language

    • decimal constants start with 1-9

    • octal constants start with 0

    • hexadecimal constants start with 0x

      G = (N, T, P, S)

      N = { S, X, D, H, O }

      T = { 0,...,9,x,A,..., F }

Finite State Machines

  • a regular language generated by a regular grammar can be accepted by a finite state machine (FSM)

    • the FSM is a model of a lexical analyzer that recognizes whether the input string belongs to the language

    • the FSM is equivalent to the regular grammar

FSMs

  • an FSM is a five-tuple consisting of T, Q, δ, K and q0,

    where

    T is a finite set of input symbols

    Q is a finite set of internal states

    δ is a transition:

    • function δ: Q × T → Q for a deterministic FSM

    • relation δ ⊆ Q × T × Q for a nondeterministic FSM

FSMs

  • K is a set of final states, K ⊆ Q

  • q0 is the initial state, q0 ∈ Q

    Note:

    • the FSM has no output function

    • if the FSM accepts a string from the language, the state reached after reading it is s ∈ K

    • an FSM can be nondeterministic

      • it can be transformed into a deterministic one
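
One common way to transform a nondeterministic FSM into a deterministic one is the subset construction; the slides do not name the method, so treat this as one possible sketch. The machine in `DELTA` is a hypothetical example (it accepts strings over {0, 1} that end in 1), and all names are illustrative.

```python
from itertools import chain

# Hypothetical nondeterministic FSM over T = {0, 1}: accepts strings ending in 1.
DELTA = {("S", "0"): {"S"}, ("S", "1"): {"S", "U"}}
ALPHABET = ["0", "1"]
START, FINAL = "S", {"U"}


def determinize():
    """Subset construction: each deterministic state is a set of original states."""
    start = frozenset({START})
    seen, todo, table = {start}, [start], {}
    while todo:
        state = todo.pop()
        for symbol in ALPHABET:
            # Union of the moves of all member states on this symbol.
            target = frozenset(chain.from_iterable(
                DELTA.get((q, symbol), ()) for q in state))
            table[(state, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is final if it contains at least one original final state.
    finals = {s for s in seen if s & FINAL}
    return start, table, finals


def accepts(word: str) -> bool:
    """Run the deterministic machine on a word over ALPHABET."""
    start, table, finals = determinize()
    state = start
    for symbol in word:
        state = table[(state, symbol)]
    return state in finals
```

The deterministic machine built here has two states, {S} and {S, U}; only the second one is final.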

Algorithm of constructing a FSM from the regular grammar

  • the set of input symbols is given by the terminal symbols

    X = T

  • the set of internal states is given by the non-terminal symbols and one new state U

    Q = N ∪ {U}, U ∉ N

  • the initial state is the start symbol S

  • each rule A → aB implies the transition δ(A,a) = B, each rule A → a implies the transition δ(A,a) = U

  • the set of final states is K = {U}, or K = {U, S} if the rule S → ε exists
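
The construction steps above can be sketched as follows. The grammar in `RULES` (S → 0S, S → 1S, S → 1) is a hypothetical example, not the one from the slides, and the function names are illustrative; the sketch also assumes that no non-terminal is named "U".

```python
# Hypothetical right regular grammar: S -> 0S | 1S | 1.
# Each rule is (A, a, B); B is None for a rule of the form A -> a.
RULES = [("S", "0", "S"), ("S", "1", "S"), ("S", "1", None)]
START = "S"
HAS_EMPTY_RULE = False  # True if the grammar contains S -> empty string


def build_fsm(rules, start, has_empty_rule):
    """Build the transition relation and the set of final states."""
    delta = {}
    for left, symbol, right in rules:
        # A -> aB gives delta(A, a) = B; A -> a gives delta(A, a) = U.
        target = "U" if right is None else right
        delta.setdefault((left, symbol), set()).add(target)
    finals = {"U", start} if has_empty_rule else {"U"}
    return delta, finals


def accepts(word, rules=RULES, start=START, has_empty_rule=HAS_EMPTY_RULE):
    """Simulate the (possibly nondeterministic) FSM on the word."""
    delta, finals = build_fsm(rules, start, has_empty_rule)
    current = {start}  # follow all nondeterministic choices at once
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & finals)
```

Two rules with the same left side and the same terminal, as here for (S, 1), make the resulting FSM nondeterministic, so the simulation tracks a set of states.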

Equivalent FSM to the regular grammar

  • the FSM is nondeterministic

    (state diagram: S is the initial state, U is the final state)

Equivalent FSM to the regular grammar

  • corresponding deterministic FSM

    (state diagram with the initial state S and several final states)

Regular expressions

  • a finite alphabet T is given

  • regular expressions generate regular languages; they are defined recursively, using the operations "*" (iteration), "·" (concatenation) and "+" (union)

    Definition:

    1) Each letter x ∈ T is a regular expression

    2) If E1, E2 are regular expressions, then E1 · E2,

    E1 + E2, E1*, (E1) are regular expressions too.

Regular expression generating constants in C language

(1+...+9)(0+...+9)* + 0(0+...+7)* + 0x(0+...+9+A+...+F)(0+...+9+A+...+F)*

Equivalence

  • regular grammars, regular expressions and FSMs are equivalent: each of them can be converted into the other two

Example of context-free grammar

  • a grammar generating a simple programming language

    G = (N, T, P, S)

    N = {S, Seq, Block, Comm, Cond}

    T = {main,{,},;,read_x,write_x,++,--, if,(,),else,==,!=,0,x,>,<}

Rules

  • S → main{Seq}

  • Seq → Comm, Seq → Comm Seq

  • Block → Comm, Block → {Seq}

  • Comm → read_x;, Comm → write_x;

  • Comm → x++;, Comm → x--;

  • Comm → if(Cond) Block

  • Comm → if(Cond) Block else Block

  • Cond → x==0, Cond → x>0, Cond → x<0,

  • Cond → x!=0

Generated sequence

S ⇒ main{Seq} ⇒ main{Comm} ⇒

main{if(Cond) Block } ⇒

main{if(x!=0) Block } ⇒

main{if(x!=0) Comm } ⇒

main{if(x!=0)if(Cond)Block else Block } ⇒

main{if(x!=0)if(x<0)Comm else Comm} ⇒

main{if(x!=0)if(x<0)x++; else Comm} ⇒

main{if(x!=0)if(x<0)x++; else x--;}

Other sequence

S ⇒ main{Seq} ⇒ main{Comm} ⇒

main{if(Cond) Block else Block } ⇒

main{if(Cond) Comm else Block } ⇒

main{if(x!=0) Comm else Comm } ⇒

main{if(x!=0)if(Cond)Block else Comm } ⇒

main{if(x!=0)if(x<0)Comm else Comm} ⇒

main{if(x!=0)if(x<0)x++; else Comm} ⇒

main{if(x!=0)if(x<0)x++; else x--;}

  • the leftmost non-terminal symbol is always replaced (leftmost derivation)

Ambiguity

Remark:

  • the same sentence is generated by two different derivations with different derivation trees (the same syntax, but different semantics)

    • such a grammar is ambiguous

  • solutions:

    • to define additional rules in programming languages, for example an else is assigned to the nearest if

The analysis of context free languages

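  • a context-free language is analyzed by an FSM with a stack (LIFO), called a pushdown automaton

  • Note:

    • the analysis of context languages is computationally hard in general (the membership problem is PSPACE-complete), and for unlimited languages it is undecidable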
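
As an illustration of an FSM with a stack, the sketch below recognizes the language of n zeros followed by n ones (n ≥ 1), which is context-free but not regular. The two state names and the function `accepts` are assumptions made for this example only.

```python
def accepts(word: str) -> bool:
    """Deterministic pushdown automaton for n zeros followed by n ones, n >= 1."""
    stack = []
    state = "push"  # "push": still reading zeros, "pop": reading ones
    for symbol in word:
        if state == "push" and symbol == "0":
            stack.append("0")  # remember one zero
        elif symbol == "1" and stack:
            stack.pop()  # match this one against a remembered zero
            state = "pop"
        else:
            return False  # a zero after a one, an unmatched one, or a foreign symbol
    # Accept only if at least one "1" was read and every zero was matched.
    return state == "pop" and not stack
```

The stack is what a plain FSM lacks: it counts an unbounded number of zeros, which a finite set of states cannot do.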

Translation regular grammars

  • translation regular grammar

    G = (N,T,D,P,S)

    where:

    N - a set of non-terminal symbols

    T - a set of terminal (input) symbols

    D - a set of output symbols

    P - a set of rules

    S - the start symbol, S ∈ N

Rules

  • rules are of the form:

    A → aωB or A → aω,

    where A, B ∈ N, a ∈ T, ω ∈ D*

    (D* is the set of all strings over the alphabet D)

Example of a translation grammar

G = (N, T, D, P, S)

N = {S,A,K,X}

T = {a,+,*}

D = {,,}

P = { S → aA, A → +K, A → *X, K → a,

X → a }

  • example:

    S ⇒ aA ⇒ a+K ⇒ a+a

    • the grammar translates the expression a+a in infix form to an output expression in postfix form
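
A translation of this kind can be sketched as a table-driven machine. The outputs attached to each rule below are an assumption: they are chosen so that the input a+a yields the postfix string aa+, and plain characters are used as output symbols. The names `RULES` and `translate` are hypothetical.

```python
# (state, input symbol) -> (next state, output string); None marks the end.
# The output strings are assumptions chosen so that the result is postfix.
RULES = {
    ("S", "a"): ("A", "a"),
    ("A", "+"): ("K", ""),
    ("A", "*"): ("X", ""),
    ("K", "a"): (None, "a+"),
    ("X", "a"): (None, "a*"),
}


def translate(word: str) -> str:
    """Read the input symbol by symbol and collect the emitted output."""
    state, output = "S", []
    for symbol in word:
        if state is None or (state, symbol) not in RULES:
            raise ValueError(f"input not generated by the grammar: {word!r}")
        state, emitted = RULES[(state, symbol)]
        output.append(emitted)
    if state is not None:
        raise ValueError(f"incomplete input: {word!r}")
    return "".join(output)
```

Under these assumptions `translate("a+a")` returns `"aa+"` and `translate("a*a")` returns `"aa*"`.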

Translation FSM

  • a translation FSM is a six-tuple consisting of T, D, Q, K, q0 and a mapping δ,

    where

    T is a set of input symbols

    D is a set of output symbols

    Q is a set of internal states

    K is a set of terminal states

    q0 is the initial state

Translation FSM

δ is a mapping:

  • δ: Q × T → { Mi : Mi ⊆ Q × D* }

  • if a grammar contains only rules A → ayB or A → ay, where y ∈ D (each rule contains only one output symbol), and there are no two rules with the same non-terminal A and the same input symbol a (such as A → ayB and A → ayC), then the translation FSM is deterministic and has the properties of a sequential mapping

Translation FSM

  • the mapping can be divided into:

    • a transition function: Q × T → Q

    • an output function: Q × T → D

  • then the FSM is a Mealy machine

    Note:

    • FSMs in the hardware domain usually have no set of terminal states K

Equivalency

  • regular translation grammars and translation FSMs are equivalent

Examples

  • Construct a regular grammar which generates decimal numbers with a sign (+/-)

  • Construct a context-free grammar which generates boolean expressions in disjunctive form using and (*), or (+), negation (-) and input variables a, b, c; the output variable is y. The expression is terminated by a semicolon ";"

Notes

  • grammars are used not only with languages

  • other generative systems can be defined by grammars

    • grammars of the "nature"

    • L-systems (Lindenmayer systems)

      • a group of fractals defined by grammars

Sierpinski triangle

G = (V, P, S)

V = { S, G, F, +, - }

  • a finite set of symbols

P = { S → FGF++FF++FF, F → FF,

G → ++FGF--FGF--FGF++ }

  • interpretation using "turtle graphics"

    • "F": move the turtle forward (drawing a line)

    • "G": ignore

    • "+": rotate to the left by a given angle

    • "-": rotate to the right by a given angle
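
The rewriting itself can be sketched in a few lines. In an L-system all symbols of the current string are rewritten in parallel in each step, and symbols without a rule are copied unchanged. The names `RULES` and `expand` are hypothetical, and "-" is written as a plain ASCII minus.

```python
# Rules of the Sierpinski example; "+" and "-" have no rule and stay unchanged.
RULES = {"S": "FGF++FF++FF", "F": "FF", "G": "++FGF--FGF--FGF++"}


def expand(word: str, steps: int) -> str:
    """Rewrite all symbols of the word in parallel, `steps` times."""
    for _ in range(steps):
        word = "".join(RULES.get(symbol, symbol) for symbol in word)
    return word
```

For example, `expand("S", 1)` gives `FGF++FF++FF`, and each further step doubles every F and replaces every G by three smaller copies.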

Helge von Koch curve

G = (V, P, S)

V = { S, F, +, - }

  • a finite set of symbols

P = { S → F+F--F+F, F → F+F--F+F }

  • "turtle graphics"

    • "F": move the turtle forward (drawing a line)

    • "+": rotate to the left by a given angle

    • "-": rotate to the right by a given angle
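
The turtle interpretation can be sketched without any graphics library by computing the points the turtle visits. The angle of 60 degrees and the unit step length are assumptions (the slide only says "given angle"), and `expand` and `turtle_path` are hypothetical names.

```python
import math

# Rules of the Koch example, with "-" written as a plain ASCII minus.
RULES = {"S": "F+F--F+F", "F": "F+F--F+F"}


def expand(word: str, steps: int) -> str:
    """Rewrite all symbols of the word in parallel, `steps` times."""
    for _ in range(steps):
        word = "".join(RULES.get(symbol, symbol) for symbol in word)
    return word


def turtle_path(word: str, angle_deg: float = 60.0):
    """Interpret the word with a turtle and return the list of visited points."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for symbol in word:
        if symbol == "F":  # move forward by one unit, drawing a line
            x += math.cos(math.radians(heading))
            y += math.sin(math.radians(heading))
            points.append((x, y))
        elif symbol == "+":  # rotate to the left
            heading += angle_deg
        elif symbol == "-":  # rotate to the right
            heading -= angle_deg
    return points
```

The returned points can be passed to any plotting tool; with a 60 degree angle each rewriting step replaces every segment by the familiar four-segment Koch shape.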
