The Practical Value of Statistics for Sentence Generation: The Perspective of the Nitrogen System

Irene Langkilde-Geary


How well do statistical n-grams make linguistic decisions?

Subject-Verb Agreement

    I am 2797
    I are 47
    I is 14

Article-Noun Agreement

    a trust 394      an trust 0      the trust 1355
    a trusts 2       an trusts 0     the trusts 115

Singular vs Plural

    their trust 28
    their trusts 8

Word Choice

    reliance 567     trust 6100
    reliances 0      trusts 1083


More Examples

Relative pronoun

    visitor who 9        visitors who 20
    visitor which 0      visitors which 0
    visitor that 9       visitors that 14

Singular vs Plural

    visitor 575          visitors 1083

Verb Tense

    admire 212           admired 211
    admires 107

Preposition

    in Japan 5413        to Japan 1196

    came to 2443         arrived in 544
    came in 1498         arrived to 35
    came into 244        arrived into 0

    came to Japan 7      arrived to Japan 0
    came into Japan 1    arrived into Japan 0
    came in Japan 0      arrived in Japan 4



Nitrogen takes a two-step approach

  • Enumerate all possible expressions

  • Rank them in order of probabilistic likelihood

    Why two steps? They are independent.
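
The two steps can be sketched as follows. This is a minimal illustration, not Nitrogen's implementation: the `SLOTS` alternatives and the `toy_score` function are hypothetical stand-ins for the symbolic enumeration and the statistical ranker.

```python
from itertools import product

# Hypothetical alternatives for each position; in a real system these
# would come from the symbolic enumeration step, not a hand-written list.
SLOTS = [
    ["Visitors", "Visitor"],
    ["who", "which"],
    ["came", "arrived"],
    ["to", "in"],
    ["Japan"],
]

def enumerate_expressions(slots):
    """Step 1: enumerate every combination of alternatives."""
    return [" ".join(words) for words in product(*slots)]

def rank(candidates, score):
    """Step 2: order candidates from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)

def toy_score(sentence):
    """Placeholder scorer (an assumption): rewards two specific word pairs."""
    return ("Visitors who" in sentence) + ("arrived in" in sentence)

candidates = enumerate_expressions(SLOTS)
best = rank(candidates, toy_score)[0]
```

Because `rank` only sees a list of strings and a scoring function, either step could be replaced without touching the other, which is one way to read the independence claim above.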


Assigning probabilities

  • Ngram model

    Formula for bigrams:

    P(S) = P(w_1|START) * P(w_2|w_1) * … * P(w_n|w_{n-1})

  • Probabilistic syntax (current work)

    • A variant of probabilistic parsing models
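
As a rough sketch of how a bigram score of this kind can be computed, the snippet below estimates each conditional probability by relative frequency and sums the logs. The toy corpus and the function names are made up for illustration, and the estimate is unsmoothed (an assumption; the slides do not say how Nitrogen handles unseen pairs).

```python
import math
from collections import Counter

def train_bigrams(corpus):
    """Count contexts and bigrams over sentences padded with START."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["START"] + sentence.split()
        contexts.update(tokens[:-1])            # every word that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams

def log_prob(sentence, contexts, bigrams):
    """log P(S) = sum of log P(w_i | w_{i-1}); -inf if any bigram is unseen."""
    tokens = ["START"] + sentence.split()
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        count = bigrams[(prev, word)]
        if count == 0:
            return float("-inf")
        total += math.log(count / contexts[prev])
    return total

# Toy corpus, invented for this example only.
CORPUS = ["I am here", "I am there", "you are here"]
CONTEXTS, BIGRAMS = train_bigrams(CORPUS)
```

With this corpus, `log_prob("I am here", CONTEXTS, BIGRAMS)` is finite while `log_prob("I are here", CONTEXTS, BIGRAMS)` is negative infinity, because the pair "I are" never occurs.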


Sample Results of Bigram model

Random path: (out of a set of 11,664,000 semantically-related sentences)

Visitant which came into the place where it will be Japanese has admired that there was Mount Fuji.

Top three:

Visitors who came in Japan admire Mount Fuji .

Visitors who came in Japan admires Mount Fuji .

Visitors who arrived in Japan admire Mount Fuji .

Strengths

  • Reflects the reality that 55% of dependencies (Stolcke et al. 1997) are binary and hold between adjacent words

  • Embeds linear ordering constraints


Limitations of Bigram model

Example                                               Reason

Visitors come in Japan.                               A three-way dependency

He planned increase in sales.                         Part-of-speech ambiguity

A tourist who admire Mt. Fuji...                      Long-distance dependency

A dog eat/eats bone.                                  Previously unseen ngrams

I cannot sell their trust.                            Nonsensical head-arg relationship

The methods must be modified to the circumstances.    Improper subcat structure
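
The three-way dependency can be checked against the counts quoted on the "More Examples" slide. In the sketch below, the product of the two adjacent-pair counts is a crude stand-in for a bigram score. That is an assumption: a real model would normalize each count by the frequency of the first word in the pair, which the slides do not give.

```python
# Counts copied from the "More Examples" slide.
PAIR = {
    ("came", "to"): 2443, ("came", "in"): 1498, ("came", "into"): 244,
    ("arrived", "in"): 544, ("arrived", "to"): 35, ("arrived", "into"): 0,
    ("in", "Japan"): 5413, ("to", "Japan"): 1196,
}
TRIPLE = {
    ("came", "to", "Japan"): 7, ("came", "in", "Japan"): 0,
    ("arrived", "to", "Japan"): 0, ("arrived", "in", "Japan"): 4,
}

def pair_proxy(phrase):
    """Crude bigram stand-in (assumption): product of adjacent-pair counts."""
    verb, prep, noun = phrase
    return PAIR[(verb, prep)] * PAIR[(prep, noun)]

# Best phrase when only adjacent pairs are visible.
best_by_pairs = max(TRIPLE, key=pair_proxy)
# Best phrase when the whole three-word sequence is counted.
best_by_triples = max(TRIPLE, key=TRIPLE.get)
```

Under these assumptions the pair-based proxy prefers "came in Japan", which has a three-word count of 0, while the three-word counts prefer "came to Japan". This matches the top-ranked sample output of the bigram model, "Visitors who came in Japan admire Mount Fuji".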


Representation of enumerated possibilities (easily on the order of 10^15 to 10^32 or more)

  • List

  • Lattice

  • Forest

  • Issues

    • space/time constraints

    • redundancy

    • localization of dependencies

    • non-uniform weights of dependencies
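
The space and redundancy issues can be made concrete with a small sketch. The lattice below is hypothetical and deliberately simplified: every alternative in one position connects to every alternative in the next, whereas real lattices and forests are more general.

```python
from math import prod

# Hypothetical lattice: each position holds its alternative words.
LATTICE = [
    ["Visitors", "Visitor", "Visitant"],
    ["who", "which", "that"],
    ["came", "arrived"],
    ["to", "in", "into"],
    ["Japan"],
]

def sentences_represented(lattice):
    """Number of distinct paths, i.e. the length of the equivalent flat list."""
    return prod(len(slot) for slot in lattice)

def words_stored(lattice):
    """Number of word entries the lattice itself has to hold."""
    return sum(len(slot) for slot in lattice)
```

The number of sentences grows with the product of the alternatives per position, while the lattice only stores their sum. A flat list repeats shared words in every sentence, which is why it becomes impractical at the scales mentioned above.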



Number of phrases versus time (in seconds) for 15 sample inputs


Generating from Templates and Meaning-based Inputs

INPUT → ( <label> <feature> VALUE )

VALUE → INPUT -OR- <label>

Labels are defined in:

  • input

  • user-defined lexicon

  • WordNet-based lexicon

    (~ 100,000 concepts)

Example Input:

(a1 :template (a2 / "eat"
                  :agent YOU
                  :patient a3)
    :filler (a3 / |poulet| ))


Mapping Rules

  • Recast one input to another

    • (implicitly providing varying levels of abstraction)

  • Assign linear order to constituents

  • Add missing info to under-specified inputs

    Matching Algorithm

  • Rule order determines priority. Generally:

    • Recasting < linear ordering < under-specification

    • High (more semantic) level of abstraction < low (more syntactic)

    • Distant position (adjuncts) from head < near (complements)

    • Basic properties < specialized


Recasting

(a1 :venue <venue>
    :cuisine <cuisine> )

(a2 / |serve|
    :agent <venue>
    :patient <cuisine> )

(a2 / |have the quality of being|
    :domain (a3 / "food type"
                :possessed-by <venue>)
    :range (b1 / |cuisine|))
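
A recasting rule of this kind might be sketched as a function from one input structure to a list of alternatives. The reading of the slide as "one input rewritten into two alternatives" is an interpretation, and the dictionary encoding, the function name, and the sample venue and cuisine values are all hypothetical.

```python
def recast_venue_cuisine(inp):
    """Hypothetical recasting rule: rewrite a (:venue, :cuisine) input
    into two alternative, more syntactic structures."""
    if set(inp) != {":venue", ":cuisine"}:
        return []  # the rule does not match this input
    venue, cuisine = inp[":venue"], inp[":cuisine"]
    return [
        # Alternative 1: the venue serves the cuisine.
        {"/": "serve", ":agent": venue, ":patient": cuisine},
        # Alternative 2: the venue's food type has the quality of being
        # the concept |cuisine|, mirroring the second structure on the slide.
        {
            "/": "have the quality of being",
            ":domain": {"/": "food type", ":possessed-by": venue},
            ":range": {"/": "cuisine"},
        },
    ]

# Sample values invented for this example.
alternatives = recast_venue_cuisine({":venue": "Chez Nous", ":cuisine": "French"})
```

Returning every alternative rather than choosing one keeps the rule purely symbolic and leaves the choice to the later ranking step.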


Recasting

(a1 :venue <venue>
    :region <region> )

(a2 / |serve|
    :agent <venue>
    :patient <cuisine> )

(a3 / |serve|
    :voice active
    :subject <venue>
    :object <cuisine> )

(a3 / |serve|
    :voice passive
    :subject <cuisine>
    :adjunct (b1 / <venue>
                 :anchor |BY| ))


Linear ordering

(a3 / |serve|
    :voice active
    :subject <venue>
    :object <cuisine> )

<venue>

(a4 / |serve|
    :voice active
    :object <cuisine> )


Under-specification

(a4 / |serve|)

(a6 / |serve|
    :cat noun)

(a5 / |serve|
    :cat verb)


Under-specification

(a4 / |serve|)

(a5 / |serve|
    :cat verb)

(a5 / |serve|
    :cat verb
    :tense past)

(a5 / |serve|
    :cat verb
    :tense present)
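
Filling in under-specified inputs can be sketched as enumerating the possible values of each missing feature. The dictionary encoding and the `expand` function are hypothetical. The value sets are restricted to those shown on the two under-specification slides, and the rule that :tense applies only to verbs is an assumption drawn from those examples.

```python
# Candidate values to try when a feature is missing (taken from the slides).
ALTERNATIVES = {
    ":cat": ["noun", "verb"],
    ":tense": ["past", "present"],
}

def expand(inp):
    """Return every more fully specified variant of an input.
    :tense is filled in only for verbs (an assumption)."""
    cats = [inp[":cat"]] if ":cat" in inp else ALTERNATIVES[":cat"]
    results = []
    for cat in cats:
        base = {**inp, ":cat": cat}
        if cat == "verb" and ":tense" not in base:
            # One variant per candidate tense.
            for tense in ALTERNATIVES[":tense"]:
                results.append({**base, ":tense": tense})
        else:
            results.append(base)
    return results

variants = expand({"/": "serve"})
```

Starting from a bare `serve`, this yields one noun reading and two verb readings. Features that are already specified are left untouched, so a fully specified input expands to itself.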


Core features currently recognized by Nitrogen

Syntactic relations

:subject :object :dative :compl :pred :adjunct :anchor :pronoun :op :modal :taxis :aspect :voice :article

Functional relations

:logical-sbj :logical-obj :logical-dat :obliq1 :obliq2 :obliq3 :obliq2-of :obliq3-of :obliq1-of :attr :generalized-possesion :generalized-possesion-inverse

Semantic/Systemic Relations

:agent :patient :domain :domain-of :condition :consequence :reason :compared-to :quant :purpose :exemplifier :spatial-locating :temporal-locating :temporal-locating-of :during :destination :means :manner :role :role-of-agent :source :role-of-patient :inclusive :accompanier :sans :time :name :ord

Dependency relations

:arg1 :arg2 :arg3 :arg1-of :arg2-of :arg3-of


Properties used by Nitrogen

:cat [nn, vv, jj, rb, etc.]

:polarity [+, -]

:number [sing, plural]

:tense [past, present]

:person [1s 2s 3s 1p 2p 3p s p all]

:mood [indicative, pres-part, past-part, infinitive, to-inf, imper]


How many grammar rules are needed for English?

Sentence → Constituent+

Constituent → Constituent+ OR Leaf

Leaf → Punc* FunctionWord* ContentWord FunctionWord* Punc*

FunctionWord → "and" OR "or" OR "to" OR "on" OR "is" OR "been" OR "the" OR ...

ContentWord → Inflection(RootWord, Morph)

RootWord → "dog" OR "eat" OR "red" OR ...

Morph → none OR plural OR third-person-singular ...


Computational Complexity

(x^2/A^2) + (y^2/B^2) = 1

???

[Figure: plot with axes X and Y]


Advantages of a statistical approach for symbolic generation module

  • Shifts focus from “grammatical” to “possible”

  • Significantly simplifies knowledge bases

  • Broadens coverage

  • Potentially improves quality of output

  • Dramatically reduces information demands on client

  • Greatly increases robustness

