
PowerPoint Slideshow about 'English Syntax' - dorjan



English Syntax

Read J & M Chapter 9.

Two Kinds of Issues
  • Linguistic – what are the facts about language?
    • The rules of syntax (grammar)
  • Algorithmic – what are effective computational procedures for dealing with those facts?
    • Building parsers
What is Syntax?

Try 1: the rules for stringing words together to form sentences.

The boys hit the ball. vs. Ball boys hit the the.

I gave Sue a ride to the store vs. I gave Sue ride to store.

I saw the book that Mary had written. vs.

I saw the book what Mary had written.

But if that were all syntax did, it would contribute little to understanding, assuming the input is legal.

What is Syntax?

Try 2: The rules for forming constituents that correspond to meaningful entities.

Example: The cat with the furry tail purred.

Why Do We Care about Syntax?

Morphology → POS Tagging → Syntax → Semantics → Discourse Integration

Generation runs through these stages in reverse. For this reason, we generally want declarative representations of the facts.

Sometimes We Need it Even if We Don’t Go All the Way
  • Question answering:
    • Lawyers whose clients committed fraud
  • vs
    • Lawyers who committed fraud
  • vs
    • Clients whose lawyers committed fraud
Finding Constituents in Sentences
  • A constituent is a word or group of words that functions as a unit.
  • How can we discern constituents?
  • Semantically:
  • The cat with the furry tail purred.
  • What can be chopped out and replaced by a single word?
  • Agnes purred.
  • * Agnes tail purred.
Finding Constituents in Sentences, cont’d
  • Preposed and postposed constructions:
  • Early next year I’d like to go to Paris.
  • I’d like to go to Paris early next year.
  • I’d like early next year to go to Paris.
  • * Early I’d like to go to Paris next year.
  • * I’d like early to go to Paris next year.
  • * The early next year old man would like to go to Paris.
How Many Kinds of Constituents are There?

Although there may be an infinite number of possible constituent tokens, there’s quite a small number of constituent types, e.g., NP, PP, VP.

On what basis can we group tokens into types? Occurrence in similar contexts.

How Many Kinds of Constituents are There, cont’d
    • The cat with the furry tail purred.
    • Every dog wore a collar.
    • Most of the children in the room brought a dog with a furry tail and a collar.
    • The furry tail brought a room.
    • Every room purred.
    • A dog with a furry tail and a collar purred.
    • Mary saw most of the children in the room.
  • NPs occur as subjects, objects of verbs, and objects of prepositions.
Single Word Constituents

Single word constituents are exactly the parts of speech that we have already considered.

How many of these single word constituent types are there? Look at sizes of tagsets.

Lots of design decisions:

Sue bought the big white house.

* Sue bought the white big house.

Are big and white the same POS?

Simple Constituent Types Don’t Capture Everything

* The cat with a furry tail purred a collar.

Mary imagined a cat with a furry tail.

Mary decided to go.

* Mary decided a cat with a furry tail.

Mary decided a cat with a furry tail would be her next pet.

Mary gave Lucy the food.

* Mary decided Lucy the food.

Subcategorization

Frame          Verb                    Example
Ø              eat, sleep, …           I want to eat
NP             prefer, find, leave, …  Find [NP the flight from Pittsburgh to Boston]
NP NP          show, give, …           Show [NP me] [NP airlines with flights from Pittsburgh]
PPfrom PPto    fly, travel, …          I would like to fly [PP from Boston] [PP to Philadelphia]
NP PPwith      help, load, …           Can you help [NP me] [PP with a flight]
VPto           prefer, want, need, …   I would prefer [VPto to go by United Airlines]
VPbrst         can, would, might, …    I can [VPbrst go from Boston]
S              mean                    Does this mean [S AA has a hub in Boston]?
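
The frames listed above could be stored in a lexicon and checked against the complements found after a verb. The following is a minimal sketch, not a full treatment: the names `SUBCAT` and `licenses` are hypothetical, and the entries cover only the frames shown in the table (real verbs accept more frames than these).

```python
# Hypothetical subcategorization lexicon built from the frames above.
# Each verb maps to the set of complement frames it accepts; a frame is
# a tuple of constituent labels, and () stands for the empty frame (Ø).
SUBCAT = {
    "eat": {()},
    "sleep": {()},
    "find": {("NP",)},
    "leave": {("NP",)},
    "show": {("NP", "NP")},
    "give": {("NP", "NP")},
    "fly": {("PPfrom", "PPto")},
    "travel": {("PPfrom", "PPto")},
    "help": {("NP", "PPwith")},
    "load": {("NP", "PPwith")},
    "prefer": {("NP",), ("VPto",)},   # listed under two frames
    "want": {("VPto",)},
    "need": {("VPto",)},
    "can": {("VPbrst",)},
    "would": {("VPbrst",)},
    "might": {("VPbrst",)},
    "mean": {("S",)},
}


def licenses(verb, complements):
    """Return True if the lexicon lists this complement sequence for verb."""
    return tuple(complements) in SUBCAT.get(verb, set())


print(licenses("give", ["NP", "NP"]))   # Show/give [NP me] [NP ...]
print(licenses("prefer", ["VPto"]))     # I would prefer [VPto to go ...]
print(licenses("mean", ["NP", "NP"]))   # not a frame listed for "mean"
```

A parser could call a check like `licenses` before attaching complements to a verb, which is how a starred example such as "Mary decided Lucy the food" would be rejected once "decide" had an entry.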

The Role of the Lexicon in Parsing
  • Serves as the starting point for POS tagging.
  • Provides additional information such as subcategorization:
    • For verbs
    • For adjectives:
    • I’m angry with Mary. I’m angry at Mary.
    • I’m mad at Mary. * I’m mad with Mary.
    • For nouns:
      • Jane has a passion for old movies.
      • Jane has an interest in old movies.
One Other Barrier to a Small Number of Kinds of Constituents - Agreement

Number agreement:

The boys want to go to the game(s).

* The boy want to go to the game(s).

Case agreement:

I want to give it to him.

* Me want to give it to he.

In English it’s just pronouns, but not so in many other languages.

The Solution – Augmenting the Constituent Types

To solve these and other problems, one strategy is to augment constituent types with other sorts of information:

V +pl +[NP NP] → VP/NP/NP +pl    Show

VP/NP +pl    Show me

VP +pl    Show me the book.
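
One way to read the three lines above is that a verb carries a number feature and a list of complements it still needs, and each complement it absorbs removes one slash. The sketch below illustrates that reading under assumptions: the class `Cat` and the functions `project_verb`, `take_complement`, and `agrees` are hypothetical names, and only the number feature is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cat:
    """A constituent type augmented with a number feature and the
    complements still missing (the part after the slashes in VP/NP/NP)."""
    base: str
    number: str            # "sg" or "pl"
    missing: tuple = ()

    def __str__(self):
        slashes = "".join("/" + m for m in self.missing)
        return f"{self.base}{slashes} +{self.number}"


def project_verb(number, frame):
    """V +pl +[NP NP] becomes VP/NP/NP +pl."""
    return Cat("VP", number, tuple(frame))


def take_complement(vp, label):
    """Consume one complement, e.g. VP/NP/NP plus an NP gives VP/NP."""
    if not vp.missing or vp.missing[0] != label:
        raise ValueError(f"{vp} cannot take {label}")
    return Cat(vp.base, vp.number, vp.missing[1:])


def agrees(subject, vp):
    """Number agreement: a complete VP must match its subject NP."""
    return not vp.missing and subject.number == vp.number


show = project_verb("pl", ["NP", "NP"])            # Show
show_me = take_complement(show, "NP")              # Show me
show_me_the_book = take_complement(show_me, "NP")  # Show me the book
for cat in (show, show_me, show_me_the_book):
    print(cat)
# prints:
# VP/NP/NP +pl
# VP/NP +pl
# VP +pl
```

Keeping the features on the category, rather than multiplying category names (VP-plural-ditransitive and so on), is what keeps the number of constituent types small.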

Specifying a Language
  • The set of sentences in English is large (maybe even infinite).
  • We want a concise (i.e., much shorter than a list of sentences) definition of it.
  • We have a finite (in fact quite small) set of constituent types (NP, VP, etc.) from which to build our description.
  • So we appeal to recursion and write grammar rules such as:
    • S → NP VP
    • VP → V NP
    • NP → NP PP
    • NP → NP S (The boy who went to the store won the game.)
    • PP → prep NP
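
To see how a handful of recursive rules yields an unbounded set of strings, the sketch below enumerates the POS strings derivable from S up to a length limit. It rests on assumptions: a base case NP → Det N is added so the recursion can bottom out (the rules above leave it implicit), NP → NP S is left out to keep the output short, and `RULES` and `pos_strings` are hypothetical names.

```python
from collections import deque

# Rules from the list above, plus an assumed base case NP -> Det N.
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("prep", "NP")],
}


def pos_strings(start, max_len):
    """All terminal strings derivable from start with at most max_len symbols.

    Breadth-first search over sentential forms, always rewriting the
    leftmost non-terminal. No rule shrinks a form, so any form longer
    than max_len can be discarded, which guarantees termination.
    """
    seen = {(start,)}
    queue = deque([(start,)])
    results = set()
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in RULES), None)
        if idx is None:                       # only terminals remain
            results.add(" ".join(form))
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=len)


for limit in (5, 8, 11):
    print(limit, len(pos_strings("S", limit)))
# prints:
# 5 1
# 8 3
# 11 6
```

Raising the length limit keeps producing new strings, because NP → NP PP and PP → prep NP can feed each other indefinitely.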
A Context-Free Grammar for English
  • If we ignore:
    • subcategorization
    • agreement
    • gapping
  • Then we can build a context-free grammar for English that does a pretty good job of:
    • generating all and only the acceptable sentences, and of
    • building reasonable parse trees for those sentences.
  • We’ll look at whether English is formally context free later.
Context-Free Grammars
  • A context-free grammar (CFG) is a 4-tuple:
    • A set of non-terminal symbols N
    • A set of terminals Σ (disjoint from N)
    • A set of productions P, each of the form A → α, where A is a non-terminal and α is a string of symbols from the infinite set of strings (Σ ∪ N)*
    • A designated start symbol S
  • In our grammar of English:
  • Σ is the set of POS, and
  • N is the set of remaining constituent types, e.g., NP, VP, PP
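
The 4-tuple can be written down directly as a data structure that checks the conditions in the definition. This is a sketch under assumptions: the class name `CFG`, its field names, and the toy grammar are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ, here the POS tags
    productions: tuple        # P, pairs (A, α) with α a tuple of symbols
    start: str                # S

    def __post_init__(self):
        if self.nonterminals & self.terminals:
            raise ValueError("Σ must be disjoint from N")
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be in N")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in N")
            if not all(sym in symbols for sym in rhs):
                raise ValueError(f"{rhs!r} is not in (Σ ∪ N)*")


# Toy grammar: the terminals are POS tags, not words.
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "PP"}),
    terminals=frozenset({"Det", "N", "V", "Name", "prep"}),
    productions=(
        ("S", ("NP", "VP")),
        ("NP", ("Name",)),
        ("NP", ("Det", "N")),
        ("VP", ("V", "NP")),
        ("PP", ("prep", "NP")),
    ),
    start="S",
)
print(len(toy.productions), "productions, start symbol", toy.start)
```

Note that the POS tag "N" (noun) is just a string in the terminal set and is unrelated to the name N for the set of non-terminals.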
Derivations Using CFGs
  • The standard formal definition:
  • L_G, the language generated by grammar G, is the set of strings composed of terminal symbols which can be derived from the designated start symbol S.
      • L_G = {w | w is in Σ* and S ⇒* w}
  • But we won’t generally want our grammar to have to go all the way to words. We want to let the lexicon do that. That’s why we let Σ be the set of POS. So the grammar may generate strings such as:
  • N V Det N
Derivations Using CFGs
  • So we will use the following definition:
      • L_G = {s | w is in Σ* and S ⇒* w and s can be derived from w by substituting words for POS as licensed by the lexicon}
  • Note that this doesn’t change the formal picture. We could instead augment our grammar with tens of thousands of rules of the form: N → phlogiston
  • This is a system design decision.
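
The two-stage definition can be sketched as a membership test: the grammar produces POS strings, and the lexicon says which words may stand in for each POS. Everything named here (`RULES`, `LEXICON`, `expand`, `in_language`) is a hypothetical illustration, and the grammar is a small non-recursive toy so that its POS strings can simply be enumerated.

```python
from itertools import product

# Toy grammar whose terminals are POS tags.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Name",), ("Det", "N")],
    "VP": [("V", "NP")],
}

# Hypothetical lexicon: each word maps to the POS tags it may carry.
LEXICON = {
    "John": {"Name"},
    "the": {"Det"},
    "pizza": {"N"},
    "ate": {"V"},
    "saw": {"V", "N"},
}


def expand(symbols):
    """Yield every POS string derivable from a sequence of symbols.
    This terminates only because the toy grammar has no recursive rules."""
    if not symbols:
        yield ()
        return
    head, rest = symbols[0], symbols[1:]
    if head in RULES:
        heads = [s for rhs in RULES[head] for s in expand(rhs)]
    else:
        heads = [(head,)]
    for h in heads:
        for r in expand(rest):
            yield h + r


POS_STRINGS = set(expand(("S",)))   # the strings w with S =>* w


def in_language(sentence):
    """True if some lexicon-licensed tagging of the words is in POS_STRINGS."""
    words = sentence.split()
    if any(w not in LEXICON for w in words):
        return False
    return any(tags in POS_STRINGS
               for tags in product(*(LEXICON[w] for w in words)))


print(in_language("John ate the pizza"))   # True
print(in_language("John saw the saw"))     # True: Name V Det N
print(in_language("ate John the pizza"))   # False
```

Keeping words out of the grammar means adding a new noun touches only `LEXICON`, which is the system design decision the slide refers to.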
Context-Free Grammars and Parse Trees

S → NP VP

NP → Name

NP → Det N

VP → V NP

Parse tree for “John ate the pizza”:

(S (NP (Name John))
   (VP (V ate)
       (NP (Det the)
           (N pizza))))
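
A small top-down parser shows how the four rules produce this bracketed tree. This is a sketch, not the parsing method of the course: `RULES`, `LEXICON`, `parse`, `parse_seq`, and `bracket` are hypothetical names, the labels Name and Det follow the grammar rules, and the approach would loop forever on left-recursive rules such as NP → NP PP.

```python
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Name",), ("Det", "N")],
    "VP": [("V", "NP")],
}
# Hypothetical one-tag-per-word lexicon for the example sentence.
LEXICON = {"John": "Name", "ate": "V", "the": "Det", "pizza": "N"}


def parse(symbol, words, i=0):
    """Yield (tree, next_index) for each way symbol can start at words[i]."""
    if symbol not in RULES:                  # a POS tag: consult the lexicon
        if i < len(words) and LEXICON.get(words[i]) == symbol:
            yield (symbol, words[i]), i + 1
        return
    for rhs in RULES[symbol]:
        yield from parse_seq(symbol, rhs, words, i, ())


def parse_seq(symbol, rhs, words, i, children):
    """Match the symbols of rhs left to right, collecting child trees."""
    if not rhs:
        yield (symbol,) + children, i
        return
    for child, j in parse(rhs[0], words, i):
        yield from parse_seq(symbol, rhs[1:], words, j, children + (child,))


def bracket(tree):
    """Render a tree as (LABEL child ...)."""
    label, *kids = tree
    parts = [k if isinstance(k, str) else bracket(k) for k in kids]
    return "(" + label + " " + " ".join(parts) + ")"


words = "John ate the pizza".split()
trees = [t for t, end in parse("S", words) if end == len(words)]
print(bracket(trees[0]))
# prints: (S (NP (Name John)) (VP (V ate) (NP (Det the) (N pizza))))
```

The list `trees` keeps only parses that consume every word; for this sentence and grammar there is exactly one.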

Long Distance Dependencies

Who did she say she saw ____ coming down the hill?

She did say she saw who coming down the hill.

The boy she saw coming down the road was crying.

The boy she saw _____ coming down the road was crying.

Long Distance Dependencies – A Linguistic Solution
  • Transformational Grammar (Chomsky, 1965):
  • A context free grammar generates base forms
  • A transformational component moves constituents around and may delete them from the surface form.
  • But how can we run these rules backwards?
  • This approach went out of fashion at least 20 years ago.
Long Distance Dependencies – Computational Solutions
  • Augmented Transition Networks: Allow arbitrary actions on the arcs. These permit insertions and movements of constituents.
  • But any procedural solution won’t be reversible for generation.
  • Unification systems: Declarative patterns for assigning constituents to fill subcategorization slots.
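
The core operation of a unification system can be sketched as merging two feature structures and failing when their values clash. The version below is a simplified illustration under assumptions: feature structures are plain nested dicts, there are no shared (reentrant) values or variables, and the feature names `cat` and `case` and the function `unify` are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.
    Returns the merged structure, or None if any feature values clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            if key in merged:
                sub = unify(merged[key], value)
                if sub is None:
                    return None
                merged[key] = sub
            else:
                merged[key] = value
        return merged
    return a if a == b else None


# A slot that some head wants filled: an NP in the accusative case.
slot = {"cat": "NP", "case": "acc"}

him = {"cat": "NP", "case": "acc", "number": "sg"}
he = {"cat": "NP", "case": "nom", "number": "sg"}

print(unify(slot, him))   # {'cat': 'NP', 'case': 'acc', 'number': 'sg'}
print(unify(slot, he))    # None
```

Because `unify` only states which structures are compatible and gives the same result with its arguments swapped, the same patterns can in principle serve both parsing and generation, unlike procedural actions on ATN arcs.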
Spoken Language Syntax

Speech is collected in utterances rather than in text.

Spoken language is looser than written language, with more pauses, ‘nonverbal events’, and disfluencies such as er, uh, um.

Sample spoken language utterances from users interacting with ATIS

Spoken Language Syntax

The repair often has the same structure as the constituent immediately before the interruption point.