- 90 Views
- Uploaded on
- Presentation posted in: General

LING/C SC/PSYC 438/538 Computational Linguistics

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

LING/C SC/PSYC 438/538Computational Linguistics

Sandiway Fong

Lecture 7: 9/11

- Reminder:
- this Thursday
- change of venue
- instead of classhere, please attend the Master’s Seminar at 5pm
- presented by the Cognitive Science Program
- Speech, Language and Hearing Sciences Building, Rm. 205
- I’ll be giving a research talk (accessible to non-specialists) on treebanks and statistical natural language parsers

- Homework 1 review
- DCG rules and left vs. right recursion recap
- adding parse tree output to DCG rules

Question 1:

Using Perl andwsj2000.txt

What is the maximum number of consonants occurring in a row within a word?

How many words are there with that maximum number?

List those words

Give your Perl program

Question 2:

Using your Perl program for Question 1

modify your Perl program to report the sentence number as well as the word encountered in Question 1

submit your modified program

Example:

676 Pennsylvania

means on line number 676 the word Pennsylvania occurs

Question 3:

(optional 438/mandatory 538)

using Perl andwsj2000.txt

find the words with the longest palindrome sequence of letters as a substring

give your Perl code

Example:

common has a

palindrome sequence

of length 2:

ommo

right recursive regular grammar:

s --> [b], ays.

ays --> [a], a.

a --> [a], exclam.

a --> [a], a.

exclam --> [!].

- language: Sheeptalk
- baa!
- baaa!
- ba...a!

- left recursive regular grammar:
s --> a, [!].

a --> firsta, [a].

a --> a, [a].

firsta --> b, [a].

b --> [b].

?- s([b,a,a,!],[]).Yes

?- s([b,a,!],[]).Infinite loop

?- s([b,a,a,!],[]).Yes

?- s([b,a,!],[]).No

left recursive regular grammar:

s --> a, [!].

a --> firsta, [a].

a --> a, [a].

firsta --> b, [a].

b --> [b].

given Prolog left-to-right, depth-first evaluation strategy:

?- s([b,a,!],[]).

enters an infinite loop

derivation tree:

s

s

s

a

a

a

a

a

firsta

a

a

firsta

a

b

firsta

a

b

b

a

b

b

mismatch with input

b

left recursive regular grammar:

s --> a, [!].

a --> firsta, [a].

a --> a, [a].

firsta --> b, [a].

b --> [b].

no re-ordering of the grammar rules will prevent Prolog from looping

re-ordered:

s --> a, [!].

a --> a, [a].

a --> firsta, [a].

firsta --> b, [a].

b --> [b].

makes things worse!

this DCG will make Prolog loop even on grammatical input

Recovering a parse tree

when want Prolog to return more than just Yes/Noanswers

in case of Yes, we can compute a syntax tree representation of the parse

by adding an extra argument to nonterminals

applies to all grammar rules (not just regular grammars)

Example

sheeptalk again

DCG:

s --> [b], [a], a, [!].

a --> [a]. (base case)

a --> [a], a. (recursive case)

s

b

a

a

!

a

a

a

Tree:

Prolog data structure:

term

hierarchical

allows sequencing of arguments

functor(arg1,..,argn)

each argi could be another term or simple atom

s

b

a

a

!

a

a

a

s(b,a,a(a,a(a)),!)

s

b

a

a

!

a

a

a

- DCG
- s --> [b],[a], a, [!].
- a --> [a].(base case)
- a --> [a], a.(right recursive case)

- base case
- a --> [a].
- a(subtree) --> [a].
- a(a(a)) --> [a].

s(b,a,a(a,a(a)),!)

- recursive case
- a --> [a], a.
- a(subtree) --> [a], a(subtree).
- a(a(a,A)) --> [a], a(A).

Idea: for each nonterminal, add an argument to store its subtree

s

b

a

a

!

a

a

a

- Prolog grammar
- s --> [b], [a], a, [!].
- a --> [a].(base case)
- a --> [a], a.(right recursive case)

- base and recursive cases
- a(a(a)) --> [a].
- a(a(a,A)) --> [a], a(A).

s(b,a,a(a,a(a)),!)

- start symbol case
- s --> [b], [a], a, [!].
- s(tree) --> [b], a(subtree), [!].
- s(s(b,a,A,!)) --> [b], [a], a(A), [!].

- Prolog grammar
- s --> [b], [a], a, [!].
- a --> [a].(base case)
- a --> [a], a.(right recursive case)

- Prolog grammar computing a parse
- s(s(b,a,A,!)) --> [b], [a], a(A), [!].
- a(a(a)) --> [a].
- a(a(a,A)) --> [a], a(A).

Augment the right

recursive regular

grammar for Sheeptalk

to return a parse tree

?- s(Tree,[b,a,a,!],[]).

DCG:

s --> [b], ays.

ays --> [a], a.

a --> [a], exclam.

a --> [a], a.

exclam --> [!].