lin6932 topics in computational linguistics
Download
Skip this Video
Download Presentation
LIN6932 Topics in Computational Linguistics

Loading in 2 Seconds...

play fullscreen
1 / 44

LIN6932 Topics in Computational Linguistics - PowerPoint PPT Presentation


  • 161 Views
  • Uploaded on

LIN6932 Topics in Computational Linguistics. Lecture 6: Grammar and Parsing (I) February 15, 2007 Hana Filip. Outline for Grammar/Parsing. Context-Free Grammars (CFG) and Constituency Some common CFG phenomena for English Sentence-level constructions NP, PP, VP Coordination

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'LIN6932 Topics in Computational Linguistics' - richelle


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
lin6932 topics in computational linguistics

LIN6932 Topics in Computational Linguistics

Lecture 6: Grammar and Parsing (I)

February 15, 2007

Hana Filip

outline for grammar parsing
Outline for Grammar/Parsing
  • Context-Free Grammars (CFG) and Constituency
  • Some common CFG phenomena for English
    • Sentence-level constructions
    • NP, PP, VP
    • Coordination
    • Subcategorization
  • Levels of complexity in grammars and automata:

The Chomsky hierarchy

  • Top-down and Bottom-up Parsing
  • Earley Parsing
  • Quick sketch of probabilistic parsing
review
Review
  • Parts of Speech
    • Basic syntactic/morphological categories that words belong to
  • Part of Speech tagging
    • Assigning parts of speech to all the words in a sentence
syntax
Syntax
  • Syntax: from Greek syntaxis, “setting out together, arrangement’
  • Refers to the way words are arranged together, and their relationships
  • Distinction:
    • Prescriptive grammar: how people ought to talk
    • Descriptive grammar: how they do talk
  • Goal of syntax is to model the knowledge that people unconsciously have about the grammar of their native language
syntax5
Syntax
  • Why should you care?
    • Grammar checkers
    • Question answering
    • Information extraction
    • Machine translation
4 key ideas of syntax
4 key ideas of syntax
  • Constituency
  • Grammatical relations
  • Subcategorization
  • Lexical dependencies

Plus one part we won’t have time for:

  • Movement/long-distance dependency
context free grammars
Context-Free Grammars
  • Capture constituency and ordering
    • Ordering:

What are the rules that govern the ordering of words and bigger units in the language?

    • Constituency:

How words group into units and how the various kinds of units behave

constituency
Constituency
  • Noun phrases (NPs)
      • Three parties from Brooklyn
      • A high-class spot such as Mindy’s
      • The Broadway coppers
      • They
      • Harry the Horse
      • The reason he comes into the Hot Box
  • How do we know these form a constituent?
    • They can all appear before a verb:
      • Three parties from Brooklyn arrive…
      • A high-class spot such as Mindy’s attracts…
      • The Broadway coppers love…
      • They sit…
constituency ii
Constituency (II)
  • They can all appear before a verb:
    • Three parties from Brooklyn arrive…
    • A high-class spot such as Mindy’s attracts…
    • The Broadway coppers love…
    • They sit
  • But individual words can’t always appear before verbs:
    • *from arrive…
    • *as attracts…
    • *the is
    • *spot is…
  • Must be able to state generalizations like:
    • Noun phrases occur before verbs
constituency iii
Constituency (III)
  • Preposing and postposing:
    • On September 17th, I’d like to fly from Atlanta to Denver
    • I’d like to fly on September 17th from Atlanta to Denver
    • I’d like to fly from Atlanta to Denver on September 17th.
  • But not:
    • *On September, I’d like to fly 17th from Atlanta to Denver
    • *On I’d like to fly September 17th from Atlanta to Denver
cfg examples
CFG Examples
  • S -> NP VP
  • NP -> Det NOMINAL
  • NOMINAL -> Noun
  • VP -> Verb
  • Det -> a
  • Noun -> flight
  • Verb -> left
slide12
CFGs
  • S -> NP VP
    • This says that there are units called S, NP, and VP in this language
    • That an S consists of an NP followed immediately by a VP
    • Doesn’t say that that’s the only kind of S
    • Nor does it say that this is the only place that NPs and VPs occur
generativity
Generativity
  • As with FSAs you can view these rules as either analysis or synthesis machines
    • Generate strings in the language
    • Reject strings not in the language
    • Impose structures (trees) on strings in the language
derivations
Derivations
  • A derivation is a sequence of rules applied to a string that accounts for that string
    • Covers all the elements in the string
    • Covers only the elements in the string
cfgs more formally
CFGs more formally
  • A context-free grammar has 4 parameters (“is a 4-tuple”)
    • A set of non-terminal symbols (“variables”) N
    • A set of terminal symbols  (disjoint from N)
    • A set of productions P, each of the form
      • A -> 
      • Where A is a non-terminal and  is a string of symbols from the infinite set of strings (  N)*
    • A designated start symbol S
defining a cf language via derivation
Defining a CF language via derivation
  • A string A derives a string B if
    • A can be rewritten as B via some series of rule applications
  • More formally:
    • If A -> is a production of P
    •  and  are any strings in the set (  N)*
    • Then we say that
      • A directly derives 
      • Or A
    • Derivation is a generalization of direct derivation
    • Let 1, 2, … m be strings in (  N)*, m>= 1, s.t.
      • 12, 23… m-1m
      • We say that 1derives m or 1*m
    • We then formally define language LG generated by grammar G
      • As set of strings composed of terminal symbols derived from S
      • LG = {w | w is in * and S * w}
parsing
Parsing
  • Parsing is the process of taking a string and a grammar and assigning correct trees to input strings
  • Correct tree: a tree that covers all and only the elements of the input and has an S node at the top
context
Context?
  • The notion of context in CFGs has nothing to do with the ordinary meaning of the word context in language.
  • All it means is that the non-terminal on the left-hand side of a rule is out there all by itself (free of context)

A -> B C

Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found

key constituents english
Key Constituents (English)
  • Sentences
  • Noun phrases
  • Verb phrases
  • Prepositional phrases
sentence types
Sentence-Types
  • Declaratives: A plane left

S -> NP VP

  • Imperatives: Leave!

S -> VP

  • Yes-No Questions: Did the plane leave?

S -> Aux NP VP

  • WH Questions: When did the plane leave?

S -> WH Aux NP VP

slide22
NPs
  • NP -> Pronoun
    • I came, you saw it, they conquered
  • NP -> Proper-Noun
    • Los Angeles is west of Texas
  • NP -> Det Noun
    • The president
  • NP -> Nominal
  • Nominal -> Noun Noun
    • A morning flight to Denver
slide23
PPs
  • PP -> Preposition NP
    • From LA
    • To Boston
    • On Tuesday
    • With lunch
recursion
Recursion
  • We’ll have to deal with rules such as the following where the non-terminal on the left also appears somewhere on the right (directly).

NP -> NP PP [[The flight] [to Boston]]

VP -> VP PP [[departed Miami] [at noon]]

recursion25
Recursion
  • Of course, this is what makes syntax interesting

flights from Denver

Flights from Denver to Miami

Flights from Denver to Miami in February

Flights from Denver to Miami in February on a Friday

Flights from Denver to Miami in February on a Friday under $300

Flights from Denver to Miami in February on a Friday under $300 with lunch

Etc.

recursion26
Recursion
  • Of course, this is what makes syntax interesting

[[flights] [from Denver]]

[[[Flights] [from Denver]] [to Miami]]

[[[[Flights] [from Denver]] [to Miami]] [in February]]

[[[[[Flights] [from Denver]] [to Miami]] [in February]] [on a Friday]]

Etc.

implications of recursion and context freeness
Implications of recursion and context-freeness
  • If you have a rule like
    • VP -> V NP
    • It only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP
the point
The Point
  • VP -> V NP
  • I hate

flights from Denver

Flights from Denver to Miami

Flights from Denver to Miami in February

Flights from Denver to Miami in February on a Friday

Flights from Denver to Miami in February on a Friday under $300

Flights from Denver to Miami in February on a Friday under $300 with lunch

bracketed notation
Bracketed Notation
  • [S [NP [PRO I] [VP [V prefer [NP [NP [Det a] [Nom [N morning] [N flight]]]]
coordination constructions
Coordination Constructions
  • S -> S and S
    • John went to NY and Mary followed him
  • NP -> NP and NP
  • VP -> VP and VP
  • In fact the right rule for English is

X -> X and X

other syntactic stuff
Other Syntactic stuff
  • Grammatical Relations
    • Subject
      • I booked a flight to New York
      • The flight was booked by my agent.
    • Object
      • I booked a flight to New York
    • Complement
      • I said that I wanted to leave
problems
Problems
  • Agreement
  • Subcategorization
  • Movement (for want of a better term)
agreement
This dog

Those dogs

This dog eats

Those dogs eat

*This dogs

*Those dog

*This dog eat

*Those dogs eats

Agreement
possible cfg solution
S -> NP VP

NP -> Det Nominal

VP -> V NP

SgS -> SgNP SgVP

PlS -> PlNp PlVP

SgNP -> SgDet SgNom

PlNP -> PlDet PlNom

PlVP -> PlV NP

SgVP ->SgV Np

Possible CFG Solution
cfg solution for agreement
CFG Solution for Agreement
  • It works and stays within the power of CFGs
  • But its ugly (loss of generalization)
  • And it doesn’t scale all that well
subcategorization
Subcategorization

Verbs have preferences for the number and syntactic kinds of constituents they co-occur with

  • Sneeze: John sneezed intransitive verb
  • Find: Please find [a flight to NY]NP transitive verb
  • Give: Give [me]NP[a cheaper fare]NP ditransitive verb
  • Help: Can you help [me]NP[with a flight]PP DO IO/Oblique
  • Prefer: I prefer [to leave earlier]TO-VP VP complement
  • Said: You said [United has a flight]S sentential complement
subcategorization37
Subcategorization

But not:

  • *John sneezed the book
  • *I prefer United has a flight
  • *Give with a flight
  • Subcategorization expresses the constraints that a lexical predicate places on the number and syntactic kinds of arguments it wants to take (occur with)
  • So far we have only considerate one type of lexical predicate: verb
  • Nouns, adjectives and prepositions also take arguments
slide38
So?
  • So the various CFG rules for VPs overgenerate.
    • They permit the presence of strings containing verbs and arguments that don’t go together
    • For example

VP -> V NP

predicts that

Sneezed the book is a VP since “sneeze” is a verb and

“the book” is a valid NP

forward pointer
Forward Pointer
  • It turns out that verb subcategorization facts will provide a key element for semantic analysis (determining who did what to who in an event).
possible cfg solution40
Possible CFG Solution
  • VP -> V
  • VP -> V NP
  • VP -> V NP PP
  • VP -> IntransV
  • VP -> TransV NP
  • VP -> TransVwPP NP PP
movement
Movement
  • Core example
    • My travel agent booked the flight
movement42
Movement
  • Core example
    • [[My travel agent]NP [booked [the flight]NP]VP]S
  • I.e. “book” is a straightforward transitive verb. It expects a single NP arg as the subject and a single NP arg within the VP
movement long distance dependencies
Movement - Long Distance Dependencies
  • What about?
    • Which flight do you want me to have the travel agent book __?
  • The direct object argument to “book” isn’t appearing in the right place. It is a long way from where it originally appeared
  • And note that it is separated from its verb by 2 other verbs.
cfgs a summary
CFGs: a summary
  • CFGs appear to be just about what we need to account for a lot of basic syntactic structure in English
  • But there are problems
    • That can be dealt with adequately, although not elegantly, by staying within the CFG framework.
  • There are simpler, more elegant, solutions that take us out of the CFG framework (beyond its formal power)
  • Syntactic theories: HPSG, LFG, CCG, Minimalism, etc
ad