- 282 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Grammars' - KeelyKia

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

### Grammars

CPSC 5135

Formal Definitions

- A symbol is a character. It represents an abstract entity that has no inherent meaning
- Examples:

a, A, 3, *, - ,=

Formal Definitions

- A string (or word) is a finite sequence of symbols from a given alphabet.
- Examples:

S = { 0, 1 } is a alphabet

0, 1, 11010, 101, 111 are strings from S

A = { a, b, c ,d } is an alphabet

bad, cab, dab, d, aaaaa are strings from A

Formal Definitions

- A language is a set of strings from an alphabet.
- The set can be finite or infinite.
- Examples:

A = { 0, 1}

L1 = { 00, 01, 10, 11 }

L2 = { 010, 0110, 01110,011110, …}

Formal Definitions

- A grammar is a quadruple G = (V, Σ, R, S) where

1) V is a finite set of variables (non-terminals),

2) Σ is a finite set of terminals, disjoint from V,

3) R is a finite set of rules. The left side of each rule is a string of one or more elements from V U Σ and whose right side is a string of 0 or more elements from V U Σ

4) S is an element of V and is called the start symbol

Formal Definitions

- Example grammar:
- G = (V, Σ, R, S)

V = { S, A }

Σ = { a, b }

R = { S → aA

A → bA

A → a }

Derivations

R = S → aA

A → bA

A → a

- A derivation is a sequence of replacements , beginning with the start symbol, and replacing a substring matching the left side of a rule with the string on the right side of a rule

S → aA

→abA

→ abbA

→ abba

Formal Definitions

- The language generated by a grammar is the set of all strings of terminal symbols which are derivable from S in 0 or more steps.
- What is the language generated by this grammar?
- S → a

S → aB

B → aB

B → a

Kleene Closure

- Let Σ be a set of strings. Σ* is called the Kleene closure of Σ and represents the set of all concatenations of 0 or more strings in Σ.
- Examples

Σ* = { 1 }* = { ø, 1, 11, 111, 1111, …}

Σ* = { 01 }* = { ø, 01, 0101, 010101, …}

Σ* = { 0 + 1 }* = set of all possible strings of 0’s and 1’s. (+ means union)

Formal Definitions

- A grammar G = (V,Σ, R, S) is right-linear if all rules are of the form:

A → xB

A → x

where A, B ε V and x εΣ*

Right-linear Grammar

- G = { V, Σ, R, S }

V = { S, B }

Σ = { a, b }

R = { S → aS ,

S → B ,

B → bB ,

B → ε }

What language is generated?

Formal Definitions

- A grammar G = (V,Σ, R, S) is left-linear if all rules are of the form:

A → Bx

A → x

where A, B ε V and x εΣ*

Formal Definitions

- A regular grammar is one that is either right or left linear.
- Let Q be a finite set and let Σ be a finite set of symbols. Also let δ be a function from Q x Σ to Q, let q0 be a state in Q and let A be a subset of Q. We call each element of Q a state, δ the transition function, q0 the initial state and A the set of accepting states. Then a deterministic finite automaton (DFA) is a 5-tuple < Q , Σ , q0 , δ , A >
- Every regular grammar is equivalent to a DFA

Language Definition

- Recognition – a machine is constructed that reads a string and pronounces whether the string is in the language or not. (Compiler)
- Generation – a device is created to generate strings that belong to the language. (Grammar)

Chomsky Hierarchy

- Noam Chomsky (1950’s) described 4 classes of grammars

1) Type 0 – unrestricted grammars

2) Type 1 – Context sensitive grammars

3) Type 2 – Context free grammars

4) Type 3 – Regular grammars

Grammars

- Context-free and regular grammars have application in computing
- Context-free grammar – each rule or production has a left side consisting of a single non-terminal

Backus-Naur form (BNF)

- BNF was used to describe programming language syntax and is similar to Chomsky’s context free grammars
- A meta-language is a language used to describe another language
- BNF is a meta-language for computer languages

BNF

- Consists of nonterminal symbols, terminal symbols (lexemes and tokens), and rules or productions
- <if-stmt> → if <logical-expr> then <stmt>
- <if-stmt> → if <logical-expr> then <stmt> else <stmt>
- <if-stmt> → if <logical-expr> then <stmt>

| if <logical-expr> then <stmt>

else <stmt>

A Small Grammar

<program> begin <stmt_list> end

<stmt_list> <stmt>

| <stmt> ; <stmt_list>

<stmt> <var> = <expression>

<var> A | B | C

<expression> <var> + <var>

| <var> - <var>

| <var>

A Derivation

<program> begin <stmt_list> end

begin <stmt> end

begin <var> = <expression> end

begin A = <expression> end

begin A = <var> + <var> end

begin A = B + <var> end

begin A = B + C end

Terms

- Each of the strings in a derivation is called a sentential form.
- If the leftmost non-terminal is always the one selected for replacement, the derivation is a leftmost derivation.
- Derivations can be leftmost, rightmost, or neither
- Derivation order has no effect on the language generated by the grammar

Derivations Yield Parse Trees

<program> begin <stmt_list> end

begin <stmt> end

begin <var> = <expression> end

begin A = <expression> end

begin A = <var> + <var> end

begin A = B + <var> end

begin A = B + C end

<Program>

begin <stmt_list> end

<stmt>

<var> = <expression>

A <var> + <var>

B C

Parse Trees

- Parse trees describe the hierarchical structure of the sentences of the language they define.
- A grammar that generates a sentence for which there are two or more distinct parse trees is ambiguous.

An Ambiguous Grammar

<assign> <id> = <expr>

<id> A | B | C

<expr> <expr> + <expr>

| <expr> * <expr>

| ( <expr> )

| <id>

Two Parse Trees – Same Sentence

<assign>

<id> = <expr>

A <expr> + <expr>

<id> <expr> * <expr>

B <id> <id>

C A

<assign>

<id> = <expr>

A <expr> * <expr>

<expr> + <expr> <id>

<id> <id> A

B C

Derivation 1

<assign> <id> = <expr>

A = <expr>

A = <expr> + <expr>

A = <id> + <expr>

A = B + <expr>

A = B + <expr> * <expr>

A = B + <id> * <expr>

A = B + C * <expr>

A = B + C * <id>

A = B + C * A

Derivation 2

<assign> <id> = <expr>

A = <expr>

A = <expr> * <expr>

A = <expr> + <expr> * <expr>

A = <id> + <expr> * <expr>

A = B + <expr> * <expr>

A = B + <id> * <expr>

A = B + C * <expr>

A = B + C * <id>

A = B + C * A

Ambiguity

- Parse trees are used to determine the semantics of a sentence
- Ambiguous grammars lead to semantic ambiguity - this is intolerable in a computer language
- Often, ambiguity in a grammar can be removed

Unambiguous Grammar

<assign> <id> = <expr>

<id> A | B | C

<expr> <expr> + <term> | <term>

<term> <term> * <factor> | <factor>

<factor> ( <expr> ) | <id>

- This grammar makes multiplication take precedence over addition

Associativity of Operators

<assign>

<id> = <expr>

A <expr> + <term>

<expr> + <term> <factor>

<term> <factor> <id>

<factor> <id> A

<id> C

B

<assign> <id> = <expr>

<id> A | B | C

<expr> <expr> + <term> | <term>

<term> <term> * <factor> | <factor>

<factor> ( <expr> ) | <id>

Addition operators associate from left to right

BNF

- A BNF rule that has its left hand side appearing at the beginning of its right hand side is left recursive .
- Left recursion specifies left associativity
- Right recursion is usually used for associating exponetiation operators

<factor> <exp> ** <factor> | <exp>

<exp> ( <expr> ) | <id>

Ambiguous If Grammar

<stmt> <if_stmt>

<if_stmt> if <logic_expr> then <stmt> |

if <logic_expr> then <stmt> else <stmt>

- Consider the sentential form:

if <logic_expr> then if <logic_expr> then <stmt> else <stmt>

Parse Trees for an If Statement

<if_stmt>

If <logic_expr> then <stmt> else <stmt>

<if_stmt>

if <logic_expr> then <stmt>

<if_stmt>

If <logic_expr> then <stmt>

<if_stmt>

if <logic_expr> then <stmt> else <stmt>

Unambiguous Grammar for If Statements

<stmt> <matched> | <unmatched>

<matched> if <logic_expr> then <matched> else <matched>

| any non-if statement

<unmatched> if <logic_expr> then <stmt>

| if <logic_expr> then <matched> else <unmatched>

Extended BNF (EBNF)

- Optional part denoted by […]

<selection> if ( <expr> ) <stmt> [ else <stmt> ]

- Braces used to indicate the enclosed part can be repeated indefinitely or left out

<ident_list> <identifier> { , <identifier> }

- Multiple choice options are put in parentheses and separated by the or operator |

<for_stmt> for <var> := <expr> (to | downto) <expr> do <stmt>

BNF vs EBNF for Expressions

BNF:

<expr> <expr> + <term>

| <expr> - <term>

| <term>

<term> <term> * <factor>

| <term> / <factor>

| <factor>

EBNF:

<expr> <term> { (+ | - ) <term> }

<term> <factor> { ( * | / ) <factor>

Download Presentation

Connecting to Server..