## CS 208: Computing Theory

Assoc. Prof. Dr. Brahim Hnich

Faculty of Computer Sciences

Izmir University of Economics

## Context-Free Languages

### So far …

• Methods for describing regular languages

• Finite Automata

• Deterministic

• Non-deterministic

• Regular Expressions

• They are all equivalent, and limited

• They cannot describe some simple languages, such as {0ⁿ1ⁿ | n is positive}

• Now, we introduce a more powerful method for describing languages

• Context-free Grammars (CFG)

### Are CFGs useful?

• Extremely useful!

• Artificial intelligence

• Natural language processing

• Programming languages: specification and compilation

### Example

• This is a CFG which we call G1

• A → 0A1

• A → B

• B → #

### Example: production rules

• Each line of G1 is a substitution rule, also called a production rule

### Example: variables

• In G1, A and B are called variables or non-terminals

### Example: terminals

• In G1, the symbols 0, 1, and # are called terminals

### Example: start variable

• A is the start variable of G1

### Rules

• We use a CFG to describe a language by generating each string of that language

• Write down the start variable

• Pick a variable that is written down and a production rule whose left-hand side is that variable

• Replace that variable with the right-hand side of the production rule

• Repeat until no variables remain
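The procedure above can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of G1, the function name `derive`, and the convention of always rewriting the leftmost variable are assumptions made for the sketch, not part of the slides.

```python
# G1 from the earlier example: each variable maps to its alternative right-hand sides.
G1 = {"A": ["0A1", "B"], "B": ["#"]}

def derive(start, choices, rules=G1):
    """Rewrite the leftmost variable once per entry in `choices`.

    Each entry is the index of the rule alternative to apply.
    Returns every string written down along the way.
    """
    current = start
    steps = [current]
    for choice in choices:
        # Locate the leftmost symbol that is a variable.
        pos = next(i for i, sym in enumerate(current) if sym in rules)
        var = current[pos]
        # Replace the variable with the chosen right-hand side.
        current = current[:pos] + rules[var][choice] + current[pos + 1:]
        steps.append(current)
    # The procedure only ends when no variable remains.
    if any(sym in rules for sym in current):
        raise ValueError("derivation incomplete: variables remain")
    return steps

print(" => ".join(derive("A", [0, 1, 0])))
```

Here the choices 0, 1, 0 select A → 0A1, then A → B, then B → #.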

### Derivations

• Derivations with G1

• A ⇒ 0A1 ⇒ 0B1 ⇒ 0#1

• A ⇒ 0A1 ⇒ 00A11 ⇒ 00B11 ⇒ 00#11

• A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111

### Parse tree

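• Parse tree for 0#1 in G1

• A ⇒ 0A1 ⇒ 0B1 ⇒ 0#1

[Figure: parse tree with root A, whose children are 0, A, and 1; the inner A has the single child B, and B has the single child #]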

### Parse tree

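• Parse tree for 00#11 in G1

• A ⇒ 0A1 ⇒ 00A11 ⇒ 00B11 ⇒ 00#11

[Figure: parse tree with root A, whose children are 0, A, and 1; the second A again has children 0, A, and 1; the innermost A has the single child B, and B has the single child #]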

### Context-free languages

• All strings generated by a CFG constitute the language of the grammar

• Example: L(G1) = {0ⁿ#1ⁿ | n ≥ 0}; the case n = 0 is the string #, derived by A ⇒ B ⇒ #

• Any language generated by a context-free grammar is a context-free language
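To see the language of a small grammar concretely, one can enumerate all terminal strings up to a length bound. The sketch below does this for G1 with a breadth-first search over the strings that can be written down. The encoding and the name `language_up_to` are assumptions. The length-based pruning is only valid because no rule of G1 has ε on its right-hand side, so strings never shrink.

```python
from collections import deque

G1 = {"A": ["0A1", "B"], "B": ["#"]}

def language_up_to(rules, start, max_len):
    """Collect every terminal string of length <= max_len derivable from `start`.

    Assumes no rule has an empty right-hand side, so longer forms can be pruned.
    """
    found, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals.
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            found.add(form)
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))

print(language_up_to(G1, "A", 5))
```

Note that the shortest string found is # itself, derived by A ⇒ B ⇒ #.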

### A useful abbreviation

• Production rules

• A → 0A1

• A → B

• B → #

• Can be written as

• A → 0A1 | B

• B → #

### Another example

• CFG G2 describing a fragment of English

<SENTENCE> → <NOUN-PHRASE><VERB-PHRASE>

<NOUN-PHRASE> → <CMPLX-NOUN> | <CMPLX-NOUN><PREP-PHRASE>

<VERB-PHRASE> → <CMPLX-VERB> | <CMPLX-VERB><PREP-PHRASE>

<PREP-PHRASE> → <PREP><CMPLX-NOUN>

<CMPLX-NOUN> → <ARTICLE><NOUN>

<CMPLX-VERB> → <VERB> | <VERB><NOUN-PHRASE>

<ARTICLE> → a | the

<NOUN> → boy | girl | flower

<VERB> → touches | likes | sees

<PREP> → with

### Another example

• Examples of strings belonging to L(G2)

a boy sees

the boy sees a flower

a girl with a flower likes the boy with a flower
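Membership of such sentences in L(G2) can be checked mechanically. The sketch below searches over leftmost derivations. It assumes that a <NOUN-PHRASE> is a <CMPLX-NOUN> optionally followed by a <PREP-PHRASE>, which is what the third example string requires. The token-list encoding and the name `generates` are likewise illustrative assumptions.

```python
# G2 with each right-hand side as a list of symbols; keys are the variables.
# Assumption: NOUN-PHRASE -> CMPLX-NOUN | CMPLX-NOUN PREP-PHRASE.
G2 = {
    "SENTENCE": [["NOUN-PHRASE", "VERB-PHRASE"]],
    "NOUN-PHRASE": [["CMPLX-NOUN"], ["CMPLX-NOUN", "PREP-PHRASE"]],
    "VERB-PHRASE": [["CMPLX-VERB"], ["CMPLX-VERB", "PREP-PHRASE"]],
    "PREP-PHRASE": [["PREP", "CMPLX-NOUN"]],
    "CMPLX-NOUN": [["ARTICLE", "NOUN"]],
    "CMPLX-VERB": [["VERB"], ["VERB", "NOUN-PHRASE"]],
    "ARTICLE": [["a"], ["the"]],
    "NOUN": [["boy"], ["girl"], ["flower"]],
    "VERB": [["touches"], ["likes"], ["sees"]],
    "PREP": [["with"]],
}

def generates(rules, start, sentence):
    """Return True if `sentence` has a leftmost derivation from `start`."""
    target = tuple(sentence.split())
    todo, seen = [(start,)], {(start,)}
    while todo:
        form = todo.pop()
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            if form == target:
                return True
            continue
        # Everything left of the leftmost variable is final, so it must match.
        if form[:pos] != target[:pos]:
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + tuple(rhs) + form[pos + 1:]
            # No rule of G2 produces the empty string, so forms never shrink.
            if len(new) <= len(target) and new not in seen:
                seen.add(new)
                todo.append(new)
    return False

print(generates(G2, "SENTENCE", "a boy sees"))
```

The search terminates because every form is bounded by the length of the target sentence and each form is visited at most once.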

### Another example

• Derivation of "a boy sees"

<SENTENCE>

⇒ <NOUN-PHRASE><VERB-PHRASE>

⇒ <CMPLX-NOUN><VERB-PHRASE>

⇒ <ARTICLE><NOUN><VERB-PHRASE>

⇒ a <NOUN><VERB-PHRASE>

⇒ a boy <VERB-PHRASE>

⇒ a boy <CMPLX-VERB>

⇒ a boy <VERB>

⇒ a boy sees

### Formal definitions

• A context-free grammar is a 4-tuple <V, ∑, R, S> where

• V is a finite set of variables

• ∑ is a finite set of terminals

• R is a finite set of rules: each rule pairs a variable with a finite string of variables and terminals

• S ∈ V is the start symbol
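The 4-tuple translates directly into a small data structure. The following is a minimal sketch, with the class name `CFG`, the field names, and the validation checks chosen for illustration. The requirement that V and ∑ be disjoint is the usual textbook convention and is an assumption here, since the definition above does not state it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Sigma
    rules: frozenset       # R: pairs (variable, tuple of symbols)
    start: str             # S

    def __post_init__(self):
        # Assumed convention: variables and terminals do not overlap.
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start symbol must be a variable")
        symbols = self.variables | self.terminals
        for head, body in self.rules:
            if head not in self.variables:
                raise ValueError(f"left-hand side {head!r} is not a variable")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"unknown symbol in rule for {head!r}")

# G1 written as a 4-tuple.
G1 = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=frozenset({("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))}),
    start="A",
)
```

An empty tuple as a right-hand side would represent an ε-rule.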

### Formal definitions

• If

• u, v, and w are strings of variables and terminals, and

• A → w is a rule of the grammar,

• Then uAv yields uwv, written uAv ⇒ uwv

• We write u ⇒* v if

• u = v or

• there is a sequence u ⇒ u1 ⇒ … ⇒ uk ⇒ v for some k ≥ 0
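The two relations can be sketched as follows. `one_step` computes every string that a given string yields, applying any rule at any position, and `derives` is a bounded check of the multi-step relation. The function names, the string encoding, and the step bound `max_steps` are assumptions of the sketch; the bound is needed because the unrestricted relation can be explored forever.

```python
G1 = {"A": ["0A1", "B"], "B": ["#"]}

def one_step(form, rules):
    """Return every string v such that `form` yields v in exactly one step."""
    results = set()
    for i, sym in enumerate(form):
        # Symbols without rules are terminals and are left alone.
        for rhs in rules.get(sym, []):
            results.add(form[:i] + rhs + form[i + 1:])
    return results

def derives(u, v, rules, max_steps):
    """Bounded check: u == v, or v is reachable from u in at most max_steps steps."""
    frontier = {u}
    for _ in range(max_steps + 1):
        if v in frontier:
            return True
        frontier = {w for f in frontier for w in one_step(f, rules)}
    return False

print(one_step("0A1", G1))
print(derives("A", "00#11", G1, 4))
```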

### Formal definitions

• The language of grammar G is

• L(G) = {w ∈ ∑* | S ⇒* w}

### Example

• Consider G4 = <{S}, {(, )}, R, S> where R is

• S → (S) | SS | ε

• What is the language of G4?

• Examples: (), (()((()))), …

### Example

• L(G4) is the set of all strings of properly nested parentheses, including the empty string
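A quick way to test whether a string is properly nested is to track the nesting depth. Note that this is a direct counter-based check rather than a parser built from the grammar, and the function name `is_balanced` is an assumption.

```python
def is_balanced(s):
    """Return True if s consists only of properly nested parentheses."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # A closing parenthesis with nothing open can never be repaired.
            if depth < 0:
                return False
        else:
            return False  # only ( and ) are terminals of the grammar
    # Every opened parenthesis must have been closed.
    return depth == 0

print(is_balanced("(()((())))"))
```

The depth counter plays the role that a stack plays in a pushdown automaton; a single counter suffices here because there is only one kind of bracket.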

### Example

• Consider G4 = <{E, T, F}, {a, +, x, (, )}, R, E> where R is

• E → E + T | T

• T → T x F | F

• F → (E) | a

• What is the language of G4?

• Examples: a+a+a, (a+a) x a

### Example

• E stands for Expression, T for Term, and F for Factor, so this grammar describes arithmetic expressions built from a, +, x, and parentheses
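A small parser makes the roles of E, T, and F visible. The sketch below is a recursive-descent parser. Because the rules E → E + T and T → T x F are left-recursive, which this parsing style cannot use directly, it reads them as the equivalent loops "T followed by any number of + T" and "F followed by any number of x F". That rewriting, the nested-tuple output, and all function names are assumptions of the sketch.

```python
def parse(text):
    """Parse an expression over a, +, x, ( and ); return a nested tuple."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():
        # E -> E + T | T, read as: T (+ T)*, grouping to the left.
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    def term():
        # T -> T x F | F, read as: F (x F)*, grouping to the left.
        node = factor()
        while peek() == "x":
            expect("x")
            node = ("x", node, factor())
        return node

    def factor():
        # F -> (E) | a
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        expect("a")
        return "a"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree

print(parse("a+axa"))
```

Because x is handled one level below +, the string a+axa groups as a+(axa) without any parentheses.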

### Ambiguity

• Sometimes a grammar can generate the same string in several different ways!

• This string will have several parse trees

• This is a serious problem whenever the meaning of a string depends on its parse tree

• Imagine a C program that could have multiple interpretations!

• If a grammar has this problem, we say that it is ambiguous

### Example

• Consider G5:

<EXPR> → <EXPR>+<EXPR> | <EXPR>x<EXPR> | (<EXPR>) | a

G5 is ambiguous because a+axa has two parse trees!

### Example

[Figure: the two parse trees for a+axa in G5. In one, the root applies <EXPR> → <EXPR>+<EXPR>, which groups the string as a+(axa); in the other, the root applies <EXPR> → <EXPR>x<EXPR>, which groups it as (a+a)xa]

### Formal definition: ambiguity

• A string w is generated ambiguously in CFG G if it has two or more different leftmost derivations!

• A derivation is leftmost if at every step the variable being replaced is the leftmost one

• Grammar G is ambiguous if it generates some string ambiguously
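For the grammar G5, the number of parse trees of a string can be counted directly, and since each parse tree corresponds to exactly one leftmost derivation, a count above 1 means the string is generated ambiguously. The sketch below hard-codes the four rules of G5; the function name `count_trees` and the memoised recursion over substrings are assumptions.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(s):
    """Number of G5 parse trees with root <EXPR> whose leaves spell s."""
    if s == "a":
        return 1                      # <EXPR> -> a
    total = 0
    if len(s) >= 3 and s[0] == "(" and s[-1] == ")":
        total += count_trees(s[1:-1])  # <EXPR> -> (<EXPR>)
    for i, ch in enumerate(s):
        if ch in "+x":
            # <EXPR> -> <EXPR>+<EXPR> or <EXPR>x<EXPR>, split at position i.
            total += count_trees(s[:i]) * count_trees(s[i + 1:])
    return total

print(count_trees("a+axa"))
```

Strings that are not in the language get a count of 0, and fully parenthesised strings such as (a+a)xa get a count of 1.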

### Chomsky Normal Form (CNF)

• Every rule has the form

• A → BC

• A → a

• S → ε

• Where a is any terminal, S is the start symbol, and A, B, and C are any variables, except that B and C may not be the start symbol

### Theorem

• Theorem: Any context-free language is generated by a context-free grammar in Chomsky normal form

• How?

• Add new start symbol S0

• Eliminate all rules of the form A → ε

• Eliminate all "unit" rules of the form A → B

• Patch up rules so that grammar denotes the same language

• Convert remaining rules to proper form

### Steps to convert any grammar into CNF

• Step1

• Add a new start symbol S0 and the rule S0 → S, where S is the original start symbol

### Steps to convert any grammar into CNF

• Step2: Repeat

• Remove some rule of the form A → ε where A is not the start symbol

• Then, for each occurrence of A on the right-hand side of a rule, add a new rule with that occurrence deleted

• E.g., if R → uAvAw, where u, v, and w are strings of variables and terminals

• We add rules: R → uvAw, R → uAvw, and R → uvw

• For R → A add R → ε, except if R → ε has already been removed

• Until all ε-rules not involving the start symbol have been removed
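The core of Step 2 is producing every variant of a right-hand side with some occurrences of the removed variable deleted. A minimal sketch, assuming right-hand sides are tuples of symbols and using the hypothetical name `variants_without`, is shown below. It does not model the bookkeeping of which ε-rules were removed earlier.

```python
from itertools import product

def variants_without(rhs, removed):
    """All right-hand sides obtained from rhs by deleting one or more
    occurrences of the variable `removed`, in a deterministic order."""
    # Each occurrence of `removed` can be kept or dropped (None).
    options = [(sym, None) if sym == removed else (sym,) for sym in rhs]
    results = []
    for pick in product(*options):
        new = tuple(sym for sym in pick if sym is not None)
        if new != tuple(rhs) and new not in results:
            results.append(new)
    return results

# Removing A -> ε affects a rule such as S -> ASA:
print(variants_without(("A", "S", "A"), "A"))
```

An empty tuple in the result stands for ε, which is how a rule R → A gives rise to R → ε.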

### Steps to convert any grammar into CNF

• Step3: eliminate unit rules

• Repeat

• Remove some rule of the form A → B

• For each rule B → u, add A → u, except if A → u is a unit rule that has already been removed

• Until all unit rules have been removed

### Steps to convert any grammar into CNF

• Step4: convert remaining rules

• Replace each rule A → u1u2…uk, where k > 2 and each ui is a terminal or a variable, with the rules

• A → u1A1

• A1 → u2A2

• A2 → u3A3

• …

• Ak-2 → uk-1uk

• where A1, …, Ak-2 are new variables

• Then, in every rule whose right-hand side has two symbols, replace each terminal ui with a new variable Ui and add the rule Ui → ui
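Step 4 can be sketched for a single rule as follows. The names of the fresh variables (`S1`, `U_a`, and so on) are a hypothetical naming scheme that assumes no existing variable already uses those names, and the function name `to_binary` is likewise an assumption.

```python
def to_binary(head, rhs, terminals):
    """Convert one rule head -> rhs (with at least two symbols) into rules
    whose right-hand sides are two variables or a single terminal."""
    chain = []
    current, symbols, i = head, list(rhs), 1
    # Break a long right-hand side into a chain of two-symbol rules.
    while len(symbols) > 2:
        fresh = f"{head}{i}"            # hypothetical name for a new variable
        chain.append((current, (symbols[0], fresh)))
        current, symbols, i = fresh, symbols[1:], i + 1
    chain.append((current, tuple(symbols)))

    # Replace each terminal in a two-symbol rule with a new variable.
    result = []
    for h, body in chain:
        new_body = []
        for sym in body:
            if sym in terminals:
                proxy = f"U_{sym}"      # hypothetical name for the proxy variable
                if (proxy, (sym,)) not in result:
                    result.append((proxy, (sym,)))
                new_body.append(proxy)
            else:
                new_body.append(sym)
        result.append((h, tuple(new_body)))
    return result

print(to_binary("S", ("A", "S", "A"), {"a", "b"}))
print(to_binary("S", ("a", "B"), {"a", "b"}))
```

Applied to every remaining rule with two or more symbols on the right, this leaves only rules of the forms allowed by Chomsky normal form.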

### Example

• Convert the following grammar to Chomsky normal form

• S → ASA | aB

• A → B | S

• B → b | ε

### Example

• Step 1: add new start symbol and new rule

• S0 → S

• S → ASA | aB

• A → B | S

• B → b | ε

### Example

• Step 2: remove ε-rule B → ε

• S0 → S

• S → ASA | aB | a

• A → B | S | ε

• B → b

### Example

• Step 2: remove ε-rule A → ε

• S0 → S

• S → ASA | aB | a | SA | AS | S

• A → B | S

• B → b

### Example

• Step 3: remove unit rule S → S

• S0 → S

• S → ASA | aB | a | SA | AS

• A → B | S

• B → b

### Example

• Step 3: remove unit rule S0 → S

• S0 → ASA | aB | a | SA | AS

• S → ASA | aB | a | SA | AS

• A → B | S

• B → b

### Example

• Step 3: remove unit rule A → B

• S0 → ASA | aB | a | SA | AS

• S → ASA | aB | a | SA | AS

• A → S | b

• B → b

### Example

• Step 3: remove unit rule A → S

• S0 → ASA | aB | a | SA | AS

• S → ASA | aB | a | SA | AS

• A → b | ASA | aB | a | SA | AS

• B → b

### Example

• Step 4: convert remaining rules

• S0 → AA1 | UB | a | SA | AS

• S → AA1 | UB | a | SA | AS

• A → b | AA1 | UB | a | SA | AS

• B → b

• U → a

• A1 → SA
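The result can be checked against the definition of Chomsky normal form. The sketch below encodes the original and the converted grammar of this example and tests every rule. The helper names `rules_from` and `is_cnf` and the space-separated encoding of right-hand sides are assumptions.

```python
def rules_from(spec):
    """spec maps a variable to space-separated right-hand sides ('' means ε)."""
    return [(head, tuple(rhs.split())) for head, alts in spec.items() for rhs in alts]

def is_cnf(rules, variables, start):
    """True if every rule is A -> BC, A -> a, or start -> ε,
    where B and C are variables other than the start symbol."""
    for head, body in rules:
        if len(body) == 2 and all(s in variables and s != start for s in body):
            continue
        if len(body) == 1 and body[0] not in variables:
            continue
        if len(body) == 0 and head == start:
            continue
        return False
    return True

ORIGINAL = {"S": ["A S A", "a B"], "A": ["B", "S"], "B": ["b", ""]}
BODY = ["A A1", "U B", "a", "S A", "A S"]
CONVERTED = {
    "S0": BODY,
    "S": BODY,
    "A": ["b"] + BODY,
    "B": ["b"],
    "U": ["a"],
    "A1": ["S A"],
}

print(is_cnf(rules_from(ORIGINAL), set(ORIGINAL), "S"))
print(is_cnf(rules_from(CONVERTED), set(CONVERTED), "S0"))
```

The original grammar fails the check because of its unit rules, its ε-rule on B, and the three-symbol rule S → ASA.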

## Pushdown Automata

### Pushdown automata

• Pushdown automata (PDA) are like nondeterministic finite automata but have an extra component called a stack

• Can push symbols onto the stack

• Can pop them (read them back) later

• Stack is potentially unbounded

[Figure: schematic of a PDA. A state control reads the input tape (a a b a) and has access to a stack (x y z)]

### Formal Definition

• A pushdown automaton is a 6-tuple (Q, ∑, S, ξ, q0, F), where

• Q is a finite set of states

• ∑ is a finite set of symbols called the input alphabet

• S is a finite set of symbols called the stack alphabet

• ξ : Q × ∑ε × Sε → P(Q × Sε) is the transition function, where ∑ε = ∑ ∪ {ε}, Sε = S ∪ {ε}, and P denotes the power set

• q0 ∈ Q is the start state

• F ⊆ Q is the set of accept states or final states

### Conventions

• Question: how does the PDA know when the stack is empty?

• Start by pushing a \$ onto the stack

• If you see it again, the stack is empty

• Question: how does the PDA know when the input is exhausted?

• It doesn't need to

• An accept state counts only if the whole input has been read

### Notation

• Transition a, b → c means

• Read a from the input

• Pop b from the stack

• Push c onto the stack

• Meaning of ε in a transition

• If a = ε, don't read any input

• If b = ε, don't pop any symbol

• If c = ε, don't push any symbol

### Example

• Recall {0ⁿ1ⁿ | n is positive}, which is not regular

• Consider the following PDA

• For each 0, push it on the stack

• As soon as a 1 is seen, pop a 0 for each 1 read

• Accept if the stack is empty when the last symbol has been read

• Reject if the stack is non-empty at the end of the input, if input symbols remain once the stack is empty, or if a 0 is read after a 1

### Example

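{0ⁿ1ⁿ | n is positive}

[Figure: state diagram of a PDA for this language. Transition labels, in order from the start state: ε, ε → \$; 0, ε → 0 (self-loop); 1, 0 → ε; 1, 0 → ε (self-loop); ε, \$ → ε into the accept state]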
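A PDA of this shape can be simulated by exploring all reachable configurations (state, input position, stack). The sketch below encodes the five transitions listed for this example. The state names q1 to q4 are assumptions, since the diagram itself does not survive, and the simulator is only guaranteed to terminate for PDAs like this one that cannot push forever on ε-moves.

```python
# (state, input symbol or "", popped symbol or "") -> set of (next state, pushed symbol or "")
# The empty string plays the role of ε. State names are hypothetical.
DELTA = {
    ("q1", "", ""): {("q2", "$")},    # mark the bottom of the stack
    ("q2", "0", ""): {("q2", "0")},   # push each 0
    ("q2", "1", "0"): {("q3", "")},   # first 1: pop a 0
    ("q3", "1", "0"): {("q3", "")},   # further 1s: pop a 0 each
    ("q3", "", "$"): {("q4", "")},    # bottom marker seen: stack is empty
}
ACCEPT = {"q4"}

def accepts(word, delta=DELTA, start="q1", accept=ACCEPT):
    """Nondeterministic search over configurations; stack top is the tuple's end."""
    first = (start, 0, ())
    seen, todo = {first}, [first]
    while todo:
        state, pos, stack = todo.pop()
        # Accept states count only once the whole input has been read.
        if state in accept and pos == len(word):
            return True
        for (q, a, b), targets in delta.items():
            if q != state:
                continue
            if a and (pos >= len(word) or word[pos] != a):
                continue
            if b and (not stack or stack[-1] != b):
                continue
            rest = stack[:-1] if b else stack
            for nxt, c in targets:
                config = (nxt, pos + len(a), rest + ((c,) if c else ()))
                if config not in seen:
                    seen.add(config)
                    todo.append(config)
    return False

print(accepts("000111"))
```

With q4 as the only accept state, the empty string is rejected, which matches the requirement that n be positive.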

### Example

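{aⁱbʲcᵏ | i = j or i = k}

[Figure: state diagram of a PDA for this language. After pushing \$ (ε, ε → \$) and pushing one a per input a (a, ε → a), the PDA branches nondeterministically on ε, ε → ε. One branch matches the a's against the b's (b, a → ε), pops the \$ (ε, \$ → ε), and then skips the c's (c, ε → ε). The other branch skips the b's (b, ε → ε), moves on with ε, ε → ε, matches the a's against the c's (c, a → ε), and then pops the \$ (ε, \$ → ε)]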

### Theorem

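• Theorem: A language is context-free if and only if some pushdown automaton accepts it

• Proof: we will skip it! (Those interested may read the book)

• Corollary: Every regular language is a context-free language, because a finite automaton is a pushdown automaton that never uses its stack

[Figure: Venn diagram showing the regular languages as a subset of the context-free languages]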

### Conclusions

• Context-free grammars: definition, ambiguity, Chomsky normal form

• Pushdown automata: definition

• Next: Part C, Computability Theory