Grammatical processing with LFG and XLE

Advanced QUestion Answering for INTelligence

Ron Kaplan

ARDA Symposium, August 2004


Layered Architecture for Question Answering

[Diagram: Text → XLE/LFG Parsing → F-structure → Conceptual semantics → KR Mapping → Target KRR. Sources feed Assertions into the KR; a Question becomes a Query against it, returning Answers, Explanations, and Subqueries; Composed F-Structure Templates drive XLE/LFG Generation of Text back to the user.]




Layered Architecture for Question Answering

  • Infrastructure: XLE, MaxEnt models, Linear deduction, Term rewriting

  • Theories: Lexical Functional Grammar, Ambiguity management, Glue Semantics

  • Resources: English Grammar, Glue lexicon, KR mapping

Deep analysis matters…if you care about the answer

Example: A delegation led by Vice President Philips, head of the chemical division, flew to Chicago a week after the incident.

Question: Who flew to Chicago?

Candidate answers:

  • division: closest noun, shallow but wrong

  • head: next closest

  • V.P. Philips: next

  • delegation: furthest away, but the Subject of flew (its "grammatical function"): deep and right


F-structure: localizes arguments ("lexical dependency")

"John was easy to please":

  PRED easy<SUBJ, COMP>
  SUBJ John
  COMP [PRED please<SUBJ, OBJ>, SUBJ someone, OBJ John]

"John was eager to please":

  PRED eager<SUBJ, COMP>
  SUBJ John
  COMP [PRED please<SUBJ, OBJ>, SUBJ John, OBJ someone]

Was John pleased?

  "John was easy to please": Yes
  "John was eager to please": Unknown


Topics

  • Basic LFG architecture

  • Ambiguity management in XLE

  • Pargram project: Large scale grammars

  • Robustness

  • Stochastic disambiguation

  • [Shallow markup]

  • [Semantic interpretation]

Focus on the language end, not knowledge


The Language Mapping: LFG & XLE

[Diagram: Sentence ("Tony decided to go.") → Tokens/Morphology → Parse → Functional structures → Knowledge, and back via Generate. Resources: LFG Grammar (English, German, etc.), Named Entities, Stochastic Model, all running in XLE.]

XLE: Efficient ambiguity management


Why deep analysis is difficult

  • Languages are hard to describe

    • Meaning depends on complex properties of words and sequences

    • Different languages rely on different properties

    • Errors and disfluencies

  • Languages are hard to compute

    • Expensive to recognize complex patterns

    • Sentences are ambiguous

    • Ambiguities multiply: explosion in time and space


Different patterns code same meaning

The small children are chasing the dog.

[Tree diagrams:]

  • English (group, order): S → NP (Det the, Adj small, N children) V' (Aux are, V chasing, NP (Det the, N dog))

  • Japanese (group, mark): S → NP (inu 'dog', o Obj) NP (tiisai 'small', kodomotati 'children', ga Sbj) V (oikaketeiru 'are chasing')


Different patterns code same meaning

The small children are chasing the dog.    chase(small(children), dog)

English (group, order), Japanese (group, mark), and Warlpiri (mark only: kurdujarrarlu 'children-Sbj', kapala 'Present', maliki 'dog-Obj', wajilipinyi 'chase', witajarrarlu 'small-Sbj') all yield the same f-structure:

  PRED 'chase<Subj, Obj>'
  TENSE Present
  SUBJ [PRED children, MOD small]
  OBJ [PRED dog]

LFG theory: minor adjustments on universal theme


LFG architecture

C(onstituent)-structures and F(unctional)-structures, related by a piecewise correspondence φ.

C-structure (formal encoding of order and grouping):

  [S [NP John] [VP [V likes] [NP Mary]]]

F-structure (formal encoding of grammatical relations):

  PRED 'like<SUBJ,OBJ>'
  TENSE PRESENT
  SUBJ [PRED 'John', NUM SG]
  OBJ [PRED 'Mary', NUM SG]

Modularity: nearly-decomposable.


LFG grammar

  • Context-free rules define valid c-structures (trees).

  • Annotations on rules give constraints that corresponding f-structures must satisfy.

  • Satisfiability of constraints determines grammaticality.

  • F-structure is solution for constraints (if satisfied).

Rules:

  S → NP          VP
      (↑ SUBJ)=↓   ↑=↓

  VP → V     (NP)
       ↑=↓   (↑ OBJ)=↓

  NP → (Det)   N
       ↑=↓     ↑=↓

Lexical entries:

  N → John
      (↑ PRED)='John'  (↑ NUM)=SG

  V → likes
      (↑ PRED)='like<SUBJ, OBJ>'  (↑ SUBJ NUM)=SG  (↑ SUBJ PERS)=3


Rules as well-formedness conditions

  S → NP          VP
      (↑ SUBJ)=↓   ↑=↓

If * denotes a particular daughter:

  ↑ : f-structure of the mother (M(*))
  ↓ : f-structure of the daughter (*)

A tree containing S over NP VP is OK if:

  • the f-unit corresponding to the NP node is the SUBJ of the f-unit corresponding to the S node, and

  • the same f-unit corresponds to both the S and VP nodes.


Inconsistent equations = Ungrammatical

What's wrong with "They walks"?

  S → NP          VP
      (↑ SUBJ)=↓   ↑=↓

  they:  (↑ NUM)=PL
  walks: (↑ SUBJ NUM)=SG

Let f be the (unknown) f-structure of the S, s the f-structure of the NP, and v the f-structure of the VP. Then (substituting equals for equals):

  (f SUBJ)=s and (s NUM)=PL  =>  (f SUBJ NUM)=PL
  f=v and (v SUBJ NUM)=SG    =>  (f SUBJ NUM)=SG
  (f SUBJ NUM)=PL and (f SUBJ NUM)=SG  =>  SG=PL  =>  FALSE

If a valid inference chain yields FALSE, the premises are unsatisfiable: no f-structure.
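The SG=PL clash can be replayed with a minimal sketch. The `consistent` helper and list encoding are hypothetical; real XLE does full unification rather than a flat table check:

```python
# Toy consistency check: a set of (path, value) constraints is FALSE
# iff some path is assigned two different atomic values.
def consistent(constraints):
    seen = {}
    for path, value in constraints:
        if seen.setdefault(path, value) != value:
            return False          # e.g. SG = PL  =>  FALSE
    return True

they_walks = [(("SUBJ", "NUM"), "PL"),    # from 'they'
              (("SUBJ", "NUM"), "SG")]    # from 'walks'
they_walk  = [(("SUBJ", "NUM"), "PL"),    # from 'they'
              (("SUBJ", "NUM"), "PL")]    # from 'walk'
print(consistent(they_walks))   # False: unsatisfiable, ungrammatical
print(consistent(they_walk))    # True
```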


English and Japanese

English: One NP before the verb, one after: Subject and Object

  S → NP          V     NP
      (↑ SUBJ)=↓   ↑=↓   (↑ OBJ)=↓

Japanese: Any number of NPs before the Verb; the particle on each defines its grammatical function

  S → NP*             V
      (↑ (↓ GF))=↓     ↑=↓

  ga: (↑ GF)=SUBJ
  o:  (↑ GF)=OBJ


Warlpiri: Discontinuous constituents

Like Japanese: Any number of NPs; the particle on each defines its grammatical function

  S → … NP* …
        (↑ (↓ GF))=↓

  rlu: (↑ GF)=SUBJ
  ki:  (↑ GF)=OBJ

Unlike Japanese, the head Noun is optional in NP:

  NP → A*           (N)
       ↓∈(↑ MOD)     ↑=↓

kurdujarrarlu 'children-Sbj' kapala 'Present' maliki 'dog-Obj' wajilipinyi 'chase' witajarrarlu 'small-Sbj':

  PRED 'chase<Subj, Obj>'
  TENSE Present
  SUBJ [PRED children, MOD small]
  OBJ [PRED dog]


English: Discontinuity in questions

  Who did Mary see?               OBJ
  Who did Bill think Mary saw?    COMP OBJ
  Who did Bill think saw Mary?    COMP SUBJ

Who is understood as subject/object of a distant verb. Uncertainty: which function of which verb?

"Who did Bill think Mary saw?":

  Q Who
  TENSE past
  PRED think<SUBJ, COMP>
  SUBJ Bill
  COMP [PRED see<SUBJ,OBJ>, TENSE past, SUBJ Mary, OBJ = Q]

S’ → NP S

(↑ Q)=↓ ↑=↓

(↑ COMP* SUBJ|OBJ)=↓
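One way to picture resolving the functional uncertainty (↑ COMP* SUBJ|OBJ)=↓ is a search through nested COMPs for an unfilled SUBJ or OBJ slot. A toy sketch, where the dict encoding and the `open_slots` helper are invented for illustration:

```python
def open_slots(fs, prefix=()):
    """Yield unfilled COMP* SUBJ / COMP* OBJ paths: candidate landing
    sites for the fronted question word, given each PRED's argument list."""
    pred = fs.get("PRED", "")
    args = [a.strip() for a in pred[pred.find("<")+1:pred.find(">")].split(",")] \
           if "<" in pred else []
    for gf in ("SUBJ", "OBJ"):
        if gf in args and gf not in fs:
            yield prefix + (gf,)
    if "COMP" in fs:
        yield from open_slots(fs["COMP"], prefix + ("COMP",))

# "Who did Bill think Mary saw?" with the OBJ of 'see' still open
fs = {"PRED": "think<SUBJ,COMP>",
      "SUBJ": {"PRED": "Bill"},
      "COMP": {"PRED": "see<SUBJ,OBJ>", "SUBJ": {"PRED": "Mary"}}}
print(list(open_slots(fs)))   # [('COMP', 'OBJ')]: Who = object of 'saw'
```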


Summary: Lexical Functional Grammar

Kaplan and Bresnan, 1982

  • Modular: c-structure/f-structure in correspondence

  • Mathematically simple, computationally transparent

    • Combination of Context-free grammar, Quantifier-free equality theory

    • Closed under composition with regular relations: finite-state morphology

  • Grammatical functions are universal primitives

    • Subject and Object expressed differently in different languages

      English: Subject is first NP

      Japanese: Subject has ga

    • But: Subject and Object behave similarly in all languages

      Active to Passive: Object becomes Subject

      English: move words. Japanese: move ga.

  • Adopted by world-wide community of linguists

    • Large literature: papers, (text)books, conferences; reference theory

    • (Relatively) easy to describe all languages

    • Linguists contribute to practical computation

  • Stable: Only minor changes in 25 years


Efficient computation with LFG grammars: Ambiguity Management in XLE


Computation challenge: Pervasive ambiguity

Ambiguity arises at every level: Tokenization, Morphology, Syntax, Semantics, Knowledge.

  • I like Jan. |Jan|.| or |Jan.|.| (sentence end or abbreviation)

  • walks: Noun or Verb?

  • untieable knot: (untie)able or un(tieable)?

  • The duck is ready to eat. Cooked or hungry?

  • Every proposer wants an award. The same award or each their own?

  • bank: river or financial?

  • The sheet broke the beam. Atoms or photons?


Coverage vs. Ambiguity

I fell in the park. (PP modifies the verb)

+

I know the girl in the park. (PP modifies the noun)

⇒

I see the girl in the park. (both rules apply: attachment ambiguity)


Ambiguity can be explosive

If alternatives multiply within or across components…

Tokenize

Morphology

Syntax

Semantics

Knowledge


Computational consequences of ambiguity

  • Serious problem for computational systems

    • Broad-coverage, hand-written grammars frequently produce thousands of analyses, sometimes millions

    • Machine learned grammars easily produce hundreds of thousands of analyses if allowed to parse to completion

  • Three approaches to ambiguity management:

    • Prune: block unlikely analysis paths early

    • Procrastinate: do not expand alternative analysis paths until something else requires them

      • Also known as underspecification

    • Manage: compact representation and computation of all possible analyses


Pruning ⇒ Premature Disambiguation

  • Conventional approach: use heuristics (statistics) to kill alternatives as soon as possible

[Diagram: Tokenize → Morphology → Syntax → Semantics → Knowledge, with alternatives pruned (X) at each stage, including some that would have survived.]

Fast computation, wrong result.


Procrastination: Passing the Buck

  • Chunk parsing as an example:

    • Collect noun groups, verb groups, PP groups

    • Leave it to later processing to put these together

    • Some combinations are nonsense

  • Later processing must either:

    • Call (another) parser to check constraints

    • Have its own model of constraints (= grammar)

    • Solve constraints that chunker includes with output


Computational Complexity of LFG

  • LFG is simple combination of two simple theories

    • Context-free grammars for trees

    • Quantifier free theory of equality for f-structures

  • Both theories are easy to compute

    • Cubic CFG Parsing

    • Linear equation solving

  • Combination is difficult: Parsing problem is NP Complete

    • Exponential/intractable in the worst case (but computable, unlike some other linguistic theories)

    • Can we avoid the worst case?


Some syntactic dependencies

  • Local dependencies: These dogs / *This dogs (agreement)

  • Nested dependencies: The dogs [in the park] bark (agreement)

  • Cross-serial dependencies: Jan Piet Marie zag helpen zwemmen (predicate/argument map)

See(Jan, help(Piet, swim(Marie)))

  • Long distance dependencies:

    The girl who John says that Bob believes … likes Henry left.

Left(girl) Says(John, believes(Bob, (…likes(girl, Henry))))


Expressiveness vs. complexity

The Chomsky Hierarchy (n is the length of the sentence):

  • Regular: linear

  • Context-free: cubic

  • Context-sensitive and beyond: exponential, intractable!

But languages have mostly local and nested dependencies... so (mostly) cubic performance should be possible.


NP Complete Problems

  • Problems that can be solved by a Nondeterministic Turing Machine in Polynomial time

  • General characterization: Generate and test

    • Lots of candidate solutions that need to be verified for correctness

    • Every candidate is easy to confirm or disconfirm

  n elements ⇒ 2^n candidates

  • A Nondeterministic TM has an oracle that provides only the right candidates to test; it doesn't search.

  • A Deterministic TM has no oracle and must test all (exponentially many) candidates.


Polynomial search problems

  • Subparts of a candidate are independent of other parts: outcome is not influenced by other parts (context free)

  • The same independent subparts appear in many candidates

  • We can (easily) determine that this is the case

  • Consequence: test subparts independent of context, share results


Why is LFG parsing NP Complete?

Classic generate-and-test search problem

  • Exponentially many tree-candidates

    • CFG chart parser quickly produces packed representation of all trees

    • CFG can be exponentially ambiguous

    • Each tree must be tested for f-structure satisfiability

  • Boolean combinations of per-tree constraints

    English base verbs: Not 3rd singular

    (↑ SUBJ NUM)≠SG ∨ (↑ SUBJ PERS)≠3

    Disjunction!

Exponentially many exponential problems


XLE Ambiguity Management: The intuition

The sheep saw the fish. (How many sheep? How many fish?)

Options multiplied out:

  The sheep-sg saw the fish-sg.
  The sheep-pl saw the fish-sg.
  The sheep-sg saw the fish-pl.
  The sheep-pl saw the fish-pl.

In principle, a verb might require agreement of Subject and Object: have to check it out. But English doesn't do that: the subparts are independent.

Options packed:

  The sheep{sg|pl} saw the fish{sg|pl}

Packed representation is a “free choice” system

  • Encodes all dependencies without loss of information

  • Common items represented, computed once

  • Key to practical efficiency
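The free-choice arithmetic is easy to demonstrate: two independent two-way choices are stored as 2+2 facts but enumerate to 2×2 readings on demand. A toy sketch (the `packed` encoding is invented here):

```python
from itertools import product

# Packed form: each choice set stored once, independently.
packed = {"SUBJ_NUM": ["sg", "pl"],   # how many sheep?
          "OBJ_NUM":  ["sg", "pl"]}   # how many fish?

# Free choice: any combination of alternatives is a valid reading.
readings = [dict(zip(packed, combo)) for combo in product(*packed.values())]
print(len(readings))   # 4 readings enumerated from 2 + 2 stored facts
```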


Dependent choices

Das Mädchen sah die Katze. (Both NPs are nom|acc ambiguous.)

Packed: Das Mädchen{nom|acc} sah die Katze{nom|acc} … but it's wrong: it doesn't encode all dependencies, choices are not free.

  Das Mädchen-nom sah die Katze-nom    bad
  Das Mädchen-nom sah die Katze-acc    The girl saw the cat
  Das Mädchen-acc sah die Katze-nom    The cat saw the girl
  Das Mädchen-acc sah die Katze-acc    bad

Again, packing avoids duplication:

  Who do you want to succeed?
  I want to succeed John.   (want intrans, succeed trans)
  I want John to succeed.   (want trans, succeed intrans)


Solution: Label dependent choices

  Das Mädchen{p:nom | ¬p:acc} sah die Katze{q:nom | ¬q:acc}

  φ = (p∧¬q) ∨ (¬p∧q)

  Das Mädchen-nom sah die Katze-nom    bad
  Das Mädchen-nom sah die Katze-acc    The girl saw the cat
  Das Mädchen-acc sah die Katze-nom    The cat saw the girl
  Das Mädchen-acc sah die Katze-acc    bad

  • Label each choice with distinct Boolean variables p, q, etc.

  • Record the acceptable combinations as a Boolean expression φ

  • Each analysis corresponds to a satisfying truth-value assignment

  • (free choice from the true lines of φ's truth table)
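A minimal sketch of the labeling idea, with p and q standing for the case choices and phi the recorded constraint (hypothetical encoding, for illustration only):

```python
from itertools import product

# p True: 'Mädchen' is nom; q True: 'Katze' is nom.
# The verb needs exactly one nom and one acc, so phi = p XOR q.
phi = lambda p, q: p != q

# Satisfying assignments = the surviving analyses.
solutions = [(p, q) for p, q in product([True, False], repeat=2) if phi(p, q)]
print(solutions)   # [(True, False), (False, True)]
```

The two satisfying assignments are exactly the girl-nom/cat-acc and cat-nom/girl-acc readings; the two "bad" rows of the table are the falsifying ones.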


Boolean Satisfiability

Can solve Boolean formulas by multiplying out: Disjunctive Normal Form

  (a ∨ b) ∧ x ∧ (c ∨ d)

  ⇒ (a∧x∧c) ∨ (a∧x∧d) ∨ (b∧x∧c) ∨ (b∧x∧d)

  • Produces simple conjunctions of literal propositions (“facts”--equations)

  • Easy checks for satisfiability

    If ad FALSE, replace any conjunction with a and d by FALSE.

  • Blow-up of disjunctive structure before fact processing

  • Individual facts are replicated (and re-processed): Exponential


Alternative: “Contexted” normal form

  (a ∨ b) ∧ x ∧ (c ∨ d)

  ⇒ (p→a) ∧ (¬p→b) ∧ x ∧ (q→c) ∧ (¬q→d)

Produce a flat conjunction of contexted facts: each conjunct is context → fact.


Alternative: “Contexted” normal form (continued)

  (a ∨ b) ∧ x ∧ (c ∨ d)

  ⇒ (p→a) ∧ (¬p→b) ∧ x ∧ (q→c) ∧ (¬q→d)

Produce a flat conjunction of contexted facts:

  • Each fact is labeled with its position in the disjunctive structure

  • Boolean hierarchy discarded

No blow-up, no duplicates:

  • Each fact appears and can be processed once

  • Claims:

    • Checks for satisfiability still easy

    • Facts can be processed first, disjunctions deferred
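The flattening step can be sketched directly: walk the conjunction, give each binary disjunction a fresh Boolean, and emit (context, fact) pairs. A toy encoding (real contexts can nest and combine):

```python
import itertools

fresh = map("p{}".format, itertools.count())   # p0, p1, ...

def contexted(conjunction):
    """Flatten a conjunction of facts and binary disjunctions into
    (context, fact) pairs: a disjunction (f1, f2) gets a fresh
    Boolean p, yielding p -> f1 and not-p -> f2."""
    out = []
    for item in conjunction:
        if isinstance(item, tuple):            # binary disjunction
            p = next(fresh)
            out.append((p, item[0]))
            out.append(("not " + p, item[1]))
        else:                                  # plain fact: context True
            out.append(("True", item))
    return out

print(contexted([("a", "b"), "x", ("c", "d")]))
```

For (a ∨ b) ∧ x ∧ (c ∨ d) this yields five contexted facts, each stored once, with the Boolean hierarchy reduced to labels.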


A sound and complete method (Maxwell & Kaplan, 1987, 1991)

Ambiguity-enabled inference (by trivial logic): if φ ∧ ψ ⊢ χ is a rule of inference, then so is [C1→φ] ∧ [C2→ψ] ⊢ [(C1∧C2)→χ]. Valid for any theory.

E.g. substitution of equals for equals:

  x=y ∧ φ ⊢ φ[x/y] is a rule of inference.
  Therefore: (C1→x=y) ∧ (C2→φ) ⊢ (C1∧C2)→φ[x/y]

Conversion to logically equivalent contexted form:

  Lemma: φ ∨ ψ iff (p→φ) ∧ (¬p→ψ)   (p a new Boolean variable)

  Proof: (If) If φ is true, let p be true, in which case (p→φ) ∧ (¬p→ψ) is true; symmetrically for ψ with p false.
  (Only if) If p is true, then φ is true, in which case φ ∨ ψ is true; symmetrically for ¬p.


Test for satisfiability

Suppose R → FALSE is deduced from a contexted formula φ. Then φ is satisfiable only if ¬R.

E.g. R → SG=PL ⇒ R → FALSE.

R is called a “nogood” context.

  • Perform all fact-inferences, conjoining contexts

  • If infer FALSE, add context to nogoods

  • Solve conjunction of nogoods

    • Boolean satisfiability: exponential in nogood context-Booleans

    • Independent facts: no FALSE, no nogoods

  • Implicitly notices independence/context-freeness
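The procedure (perform fact-inferences, conjoin contexts, collect nogoods, solve) fits in a short sketch for the one-variable “The sheep walks” case. The data layout is hypothetical; XLE's context machinery is far more general:

```python
from itertools import product

# Contexted facts for "The sheep walks": a context is a partial
# assignment of Boolean variables under which the fact holds.
facts = [({"p": True},  ("SUBJ NUM", "SG")),   # sheep, singular reading
         ({"p": False}, ("SUBJ NUM", "PL")),   # sheep, plural reading
         ({},           ("SUBJ NUM", "SG"))]   # from 'walks', context True

def compatible(c1, c2):
    """c1 can hold together with (or under) assignment c2."""
    return all(c2.get(k, v) == v for k, v in c1.items())

# Pairwise fact-inference: a value clash in compatible contexts
# deduces FALSE there, so the conjoined context becomes a nogood.
nogoods = []
for (c1, (a1, v1)), (c2, (a2, v2)) in product(facts, repeat=2):
    if a1 == a2 and v1 != v2 and compatible(c1, c2):
        nogoods.append({**c1, **c2})

# A total assignment survives iff it extends no nogood.
ok = [p for p in (True, False)
      if not any(compatible(ng, {"p": p}) for ng in nogoods)]
print(ok)   # [True]: grammatical only in context p, i.e. one sheep
```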


Example 1

“They walk”

  • No disjunction, all facts are in the default “True” context

  • No change to inference

    T(f SUBJ NUM)=SG  T(f SUBJ NUM)=SG  T SG=SG

    reduces to: (f SUBJ NUM)=SG  (f SUBJ NUM)=SG  SG=SG

“They walks”

  • No disjunction, all facts still in the default “True” context

  • No change to inference:

    T(f SUBJ NUM)=PL  T(f SUBJ NUM)=SG TPL=SG  T→FALSE

Satisfiable iff ¬T, so unsatisfiable


Example 2

“The sheep walks”

  • Disjunction of NUM feature from sheep

    (f SUBJ NUM)=SG  (f SUBJ NUM)=PL

  • Contexted facts:

    p(f SUBJ NUM)=SG 

    p(f SUBJ NUM)=PL 

    (f SUBJ NUM)=SG (from walks)

  • Inferences:

    p(f SUBJ NUM)=SG  (f SUBJ NUM)=SG  p SG=SG

    p(f SUBJ NUM)=PL  (f SUBJ NUM)=SG  p PL=SG  p FALSE

p FALSE is true iff p is false iff p is True.

Conclusion: Sentence is grammatical in context p: Only 1 sheep


Contexts and packing: Index by facts

The sheep saw the fish.

  SUBJ [NUM  p→SG, ¬p→PL]
  OBJ  [NUM  q→SG, ¬q→PL]

Contexted unification ≈ concatenation, when choices don't interact.


Compare: DNF unification

The sheep saw the fish.

  ([SUBJ [NUM SG]] ∨ [SUBJ [NUM PL]])  ∧  ([OBJ [NUM SG]] ∨ [OBJ [NUM PL]])

  ⇒ [SUBJ [NUM SG], OBJ [NUM SG]]
    [SUBJ [NUM SG], OBJ [NUM PL]]
    [SUBJ [NUM PL], OBJ [NUM SG]]
    [SUBJ [NUM PL], OBJ [NUM PL]]

DNF cross-product of alternatives: Exponential


The XLE wager (for real sentences of real languages)

  • Alternatives from distant choice-sets can be freely chosen without affecting satisfiability

    • FALSE is unlikely to appear

  • Contexted method optimizes for independence

    • No FALSE ⇒ no nogoods ⇒ nothing to solve.

Bet: Worst case 2^n reduces to k·2^m where m << n


Ambiguity-enabled inference: Choice-logic common to all modules

If φ ∧ ψ ⊢ χ is a rule of inference, then so is C1→φ ∧ C2→ψ ⊢ (C1∧C2)→χ

1. Substitution of equals for equals (e.g. for LFG syntax)

   x=y ∧ φ ⊢ φ[x/y]. Therefore: C1→x=y ∧ C2→φ ⊢ (C1∧C2)→φ[x/y]

2. Reasoning

   Cause(x,y) ∧ Prevent(y,z) ⊢ Prevent(x,z)

   Therefore: C1→Cause(x,y) ∧ C2→Prevent(y,z) ⊢ (C1∧C2)→Prevent(x,z)

3. Log-linear disambiguation

   Prop1(x) ∧ Prop2(x) ⊢ Count(Feature_n). Therefore: C1→Prop1(x) ∧ C2→Prop2(x) ⊢ (C1∧C2)→Count(Feature_n)

Ambiguity-enabled components propagate choices, can defer choosing, enumerating


Summary: Contexted constraint satisfaction

  • Packed

    • facts not duplicated

    • facts not hidden in Boolean structure

  • Efficient

    • deductions not duplicated

    • fast fact processing (e.g. equality) can prune slow disjunctive processing

    • optimized for independence

  • General and simple

    • applies to any deductive system, uniform across modules

    • not limited to special-case disjunctions

    • mathematically trivial

  • Compositional free-choice system

    • enumeration of (exponentially many?) valid solutions deferred across module boundaries

    • enables backtrack-free, linear-time, on-demand enumeration

    • enables packed refinement by cross-module constraints: new nogoods


The remaining exponential

  • Contexted constraint satisfaction (typically) avoids the Boolean explosion in solving f-structure constraints for single trees

  • How can we suppress tree enumeration?

    (and still determine satisfiability)


Ordering strategy: Easy things first

  • Do all c-structure before any f-structure processing

    • Chart is a free choice representation, guarantees valid trees

  • Only produce/solve f-structure constraints for constituents in complete, well-formed trees

    [NB: Interleaved, bottom-up pruning is a bad idea]

    Bets on inconsistency, not independence


Asking the right question

  • How can we make it faster?

    • More efficient unifier: undoable operations, better indexing, clever data structures, compiling.

    • Reordering for more effective pruning.

  • Why not cubic?

    • Intuitively, the problem isn’t that hard.

    • GPSG: Natural language is nearly context free.

    • Surely for context-free equivalent grammars!


No f-structure filtering, no nogoods... but still explosive

LFG grammar for a context-free language:

  S → S         S                S → a
      (↑ L)=↓    (↑ R)=↓             (↑ A)=+

[Diagram: the chart packs the exponentially many trees over 'a a a a …', but the f-structures enumerate them: each tree induces its own nesting of L [A +] / R [A +] features.]


Disjunctive lazy copy

  • Pack functional information from alternative local subtrees.

  • Unpack/copy to higher consumers only on demand.

[Diagram: alternative subtrees 1-6 contribute packed f-structures p→f1 ∨ q→f2 ∨ r→f3 and p→f6 ∨ q→f5 ∨ r→f4; the annotation (↑ L)=↓ on S doesn't access their internal features.]

Automatically takes advantage of context-freeness, without grammar analysis or compilation.


The XLE wager

  • Most feature dependencies are restricted to local subtrees

    • mother/daughter/sister interactions

    • maybe a grandmother now and then

    • very rarely span an unbounded distance

  • Optimize for local case

    • bounded computation per subtree gives cubic curve

    • graceful degradation with non-local interactions … but still correct


Packing Equalities in F-structure

"Visiting relatives is boring": the subject NP has two analyses, A1 (gerund: (↑ NUM)=sg) and A2 (plural noun, from relatives: (↑ NUM)=pl); 'is' contributes (↑ SUBJ NUM)=sg in the default context T.

  A1 → (SUBJ NUM)=sg  ∧  T → (SUBJ NUM)=sg  ⊢  T∧A1 → sg=sg

  A2 → (SUBJ NUM)=pl  ∧  T → (SUBJ NUM)=sg  ⊢  T∧A2 → sg=pl  ⇒  nogood(A2)




XLE Performance: HomeCentre Corpus

About 1100 English sentences


Time is ~linear in subtrees: Nearly cubic

R2=.79

2.1 ms/subtree


French HomeCentre

R2=.80

3.3 ms/subtree


German HomeCentre

R2=.44

3.8 ms/subtree


Generation with LFG/XLE

  • Parse: string → c-structure → f-structure

  • Generate: f-structure → c-structure → string

  • Same grammar: shared development, maintenance

  • Formal criterion: s ∈ Gen(Parse(s))

  • Practical criterion: don’t generate everything

    • Parsing robustness → undesired strings, needless ambiguity

    • Use optimality marks to restrict generation grammar

    • Restricted (un)tokenizing transducer: don’t allow arbitrary white space, etc.


Mathematics and Computation

Formal properties

  • Gen(f) is a (possibly infinite) set

    • Equality is idempotent: x=y ∧ x=y ⇔ x=y

    • Longer strings with redundant equations map to same f-structure

  • What kind of set?

    Context-free language (Kaplan & Wedekind, 2000)


Computation

XLE/LFG generation:

  • Convert LFG grammar to CFG only for strings that map to f

    • NP complete, ambiguity managed (as usual)

    • All strings in CFL are grammatical w.r.t. LFG grammar

    • Composition with regular relations is crucial

  • CFG is a packed, free-choice representation of all strings

    • Can use ordinary CF generation algorithms to enumerate strings

    • Can defer enumeration, give CFG for client to enumerate

    • Can apply other context-free technology

      • Choose shortest string

      • Reduce to finite set of unpumped strings (Context free Pumping Lemma)

      • Choose most probable (for fluency, not grammaticality)


Generating from incomplete f-structures

  • Grammatical features can’t be read from:

    • Back-end question-answering logic

    • F-structure translated from other language

  • Generating from a bounded underspecification of a complete f-structure is still context-free

    • Example: a skeleton of predicates

    • Proof: CFL’s are closed under union, bounded extensions produce finite alternatives

  • Generation from arbitrary underspecification is undecidable

    • Reduces to undecidable emptiness problem (= Hilbert’s 10th)(Dymetman, van Noord, Wedekind, Roach)


A (light-weight?) approach to QA

[Pipeline: Ask → Parse → Generate → Search]

Analyze the question, anticipate and search for possible answer phrases

  • Question: What is the graph partitioning problem?

    • Generated Queries: “The graph partitioning problem is *”

    • Answer (Google): The graph partitioning problem is defined as dividing a graph into disjoint subsets of nodes …

  • Question: When were the Rolling Stones formed?

    • Generated Queries: “The Rolling Stones were formed *”“*formed the Rolling Stones *”

    • Answer (Google): Mick Jagger, Keith Richards, Brian Jones, Bill Wyman, and Charlie Watts formed the Rolling Stones in 1962.

Question → F-structure → Queries
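The query-generation step can be caricatured with string templates. This is a toy stand-in (the `anticipate` helper and its patterns are invented here); the actual system goes through f-structures and the XLE generator:

```python
# Toy answer anticipation: rewrite a wh-question into declarative search
# patterns with a wildcard where the answer phrase should appear.
def anticipate(question):
    q = question.rstrip("?")
    if q.startswith("What is "):
        topic = q[len("What is "):]
        return [f'"{topic} is *"']
    if q.startswith("When were ") and q.endswith(" formed"):
        subject = q[len("When were "):-len(" formed")]
        return [f'"{subject} were formed *"', f'"* formed {subject} *"']
    return []

print(anticipate("What is the graph partitioning problem?"))
# ['"the graph partitioning problem is *"']
```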


Pipeline for Answer Anticipation

[Diagram: Question → Parser (English grammar) → Question f-structures → Convert → Answer f-structures → Generator (English grammar) → Answer Phrases → Search (Google...)]


Grammar engineering: The Parallel Grammar Project


Pargram project

  • Large-scale LFG grammars for several languages

    • English, German, Japanese, French, Norwegian

    • Coming along: Korean, Urdu, Chinese, Arabic, Welsh, Malagasy, Danish

    • Intuition + Corpus: Cover real uses of language--newspapers, documents, etc.

  • Parallelism: test LFG universality claims

    • Common c- to f-structure mapping conventions

      (unless typologically motivated variation)

    • Similar underlying f-structures

      Permits shared disambiguation properties, Glue interpretation premises

    • Practical: all grammars run on XLE software

  • International consortium of world-class linguists

    • PARC, Stuttgart, Fuji Xerox, Konstanz, Bergen, Copenhagen, Oxford, Dublin City University, PIEAS…

    • Full week meetings, twice a year

    • Contributions to linguistics and comp-ling: books and papers

    • Each group is self-funded, self-managed


Pargram goals

  • Practical

    • Create grammatical resources for NL applications

      • translation, question answering, information retrieval, ...

    • Develop discipline of grammar engineering

      • what tools, techniques, conventions make it easy to develop and maintain broad-coverage grammars?

      • how long does it take?

      • how much does it cost?

  • Theoretical

    • Refine and guide LFG theory through broad coverage of multiple languages

    • Refine and guide XLE algorithms and implementation




Pargram grammars

               German    English*    French    Japanese (Korean)
  #Rules          251         388       180          56
  #States       3,239      13,655     3,422         368
  #Disjuncts   13,294      55,725    16,938       2,012

* English allows for shallow markup: labeled bracketing, named-entities


Why Norwegian and Japanese?

Engineering assessment: given mature system, parallel grammar specs.

How hard is it?

  • Norwegian: best case

    • Well-trained LFG linguists

    • Users of previous Parc software

    • Closely related to existing Pargram languages

  • Japanese: worst case

    • One computer scientist, one traditional Japanese linguist--no LFG experience

    • Typologically different language

    • Character sets, typographical conventions

      Conclusion: not that hard

      For both languages: good coverage, accuracy in ~2 person years


Engineering results

  • Grammars and Lexicons

  • Grammar writer’s cookbook (Butt et al., 1999)

  • New practical formal devices

    • Complex categories for efficiency: NP[nom] vs. NP with (↑ CASE)=NOM

    • Optimality marks for robustness

      enlarge grammar without being overrun by peculiar analyses

    • Lexical priority: merging different lexicons

  • Integration of off-the-shelf morphology

    From Inxight, based on earlier PARC research, and Kyoto


Accuracy and coverage

Riezler et al., 2002

  • WSJ F scores for English Pargram grammar

    • Produces dependencies, not labeled trees

    • Stochastic model trained on sections 2-22

    • Tested on dependencies for 700 sentences in section 23

    • Robustness: some output for every input

(Named Entities seem to bump these by ~3%)


“Meridian will pay a premium of $30.5 million to assume $2 billion in deposits.”

subj(assume~7, pro~8),

number($~9, billion~17),

adjunct($~9, in~11),

num($~9, pl),

pers($~9, 3),

adjunct_type(in~11, nominal),

obj(in~11, deposit~12),

num(deposit~12, pl),

pers(deposit~12, 3),

adjunct(billion~17, 2~19),

number_type(billion~17, cardinal),

number_type(2~19, cardinal),

obj(of~23, $~24),

number($~24, million~4),

num($~24, pl),

pers($~24, 3),

number_type(30.5~28, cardinal)

mood(pay~0, indicative),

tense(pay~0, fut),

adjunct(pay~0, assume~7),

obj(pay~0, premium~3),

stmt_type(pay~0, declarative),

subj(pay~0, Meridian~5),

det_type(premium~3, indef),

adjunct(premium~3, of~23),

num(premium~3, sg),

pers(premium~3, 3),

adjunct(million~4, 30.5~28),

number_type(million~4, cardinal),

num(Meridian~5, sg),

pers(Meridian~5, 3),

obj(assume~7, $~9),

stmt_type(assume~7, purpose),


Accuracy and coverage

  • Japanese Pargram grammar

    • ~97% coverage on large corpora

      • 10,000 newspaper sentences (EDR)

      • 460 copier manual sentences

      • 9,637 customer-relations sentences

    • F-scores against 200 hand-annotated sentences from newspaper corpus:

      • Best: 87%

      • Average: 80%

        Recall: Grammar constructed with ~2 person-years of effort

        (compare: Effort to create an annotated training corpus)


Robustness: Some output for every input


Sources of brittleness

  • Vocabulary problems

    • Gaps in coverage, neologisms, terminology

    • Incorrect entries, missing frames…

  • Missing constructions

    • No theoretical guidance (or interest)

      (e.g. dates, company names)

    • Core constructions overlooked

      • Intuition and corpus both limited

  • Ungrammatical input

    • Real world text is not perfect

    • Sometimes it’s horrendous

  • Strict performance limits (XLE parameters)


Real-world input

  • Other weak blue-chip issues included Chevron, which went down 2 to 64 7/8 in Big Board composite trading of 1.3 million shares; Goodyear Tire & Rubber, off 1 1/2 to 46 3/4, and American Express, down 3/4 to 37 1/4.

    (WSJ, section 13)

  • ``The croaker's done gone from the hook”

    (WSJ, section 13)

  • (SOLUTION 27000 20) Without tag P-248 the W7F3 fuse is located in the rear of the machine by the charge power supply (PL3 C14 item 15.

    (Copier repair tip)


LFG entries from finite-state morphologies

  • Broad-coverage inflectional transducers

    falls → fall +Noun +Pl

    fall +Verb +Pres +3sg

    Mary → Mary +Prop +Giv +Fem +Sg

    vienne → venir +SubjP +SG {+P1|+P3} +Verb

  • For listed words, transducer provides

    • canonical stem form

    • inflectional information


On-the-fly LFG entries

  • “-unknown” head-word matches unrecognized stems

  • Grammar writer defines -unknown and affixes

    -unknown N (↑ PRED)=‘%stem’ (↑ NTYPE)=common;

    V (↑ PRED)=‘%stem<SUBJ,OBJ>’.

    +Noun N-AFX (↑ PERS)=3.

    +Pl N-AFX (↑ NUM)=pl.

    +Pres V-AFX (↑ TENSE)=present

    +3sg V-AFX (↑ SUBJ PERS)=3 (↑ SUBJ NUM)=sg

  • Pieces assembled by sublexical rules:

    NOUN → N N-AFX*.

    VERB → V V-AFX*.

(transitive)
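The assembly of on-the-fly entries above can be sketched in code. This is an illustrative Python sketch, not XLE itself: the affix table and `entry_for` function are hypothetical names that mirror the `-unknown`, `N-AFX`, and `V-AFX` entries on this slide, mapping morphological tags onto f-structure features.

```python
# Illustrative sketch of on-the-fly lexical entries (not actual XLE code).
# Hypothetical affix table, mirroring the N-AFX / V-AFX entries above:
AFFIX_FEATURES = {
    "+Noun": {"PERS": 3},
    "+Pl":   {"NUM": "pl"},
    "+Pres": {"TENSE": "present"},
    "+3sg":  {"SUBJ PERS": 3, "SUBJ NUM": "sg"},
}

def entry_for(stem, tags):
    """Build an on-the-fly entry from an unrecognized stem plus morph tags."""
    if "+Noun" in tags:
        fs = {"PRED": stem, "NTYPE": "common"}
    else:
        fs = {"PRED": f"{stem}<SUBJ,OBJ>"}   # default transitive verb frame
    for tag in tags:                          # sublexical rule: stem + affixes
        fs.update(AFFIX_FEATURES.get(tag, {}))
    return fs

print(entry_for("fall", ["+Noun", "+Pl"]))
# {'PRED': 'fall', 'NTYPE': 'common', 'PERS': 3, 'NUM': 'pl'}
```

The same function handles guessed stems like `fump`: `entry_for("fump", ["+Pres", "+3sg"])` yields a transitive verb frame with present tense and third-singular subject agreement.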


Guessing for unlisted words

  • Use FST guesser for general patterns

    • Capitalized words can be proper nouns

      • Saakashvili → Saakashvili +Noun +Proper +Guessed

    • ed words can be past tense verbs or adjectives

      • fumped → fump +Verb +Past +Guessed

        fumped +Adj +Deverbal +Guessed

  • Languages with richer morphology allow better guessers


Subcategorization and argument mapping

  • Transitive, intransitive, inchoative…

    • Not related to inflection

    • Can’t be inferred from shallow data

  • Fill in gaps from external sources

    • Machine readable dictionaries

    • Other resources: VerbNet, WordNet, FrameNet, Cyc

    • Not always easy, not always reliable

      • Current research


Grammatical failures

Fall-back approach

  • First try to get a complete analysis

    • Prefer standard rules, but

    • Allow for anticipated errors

      E.g. subject/verb disagree, but interpretation is obvious

    • Optimality-theory marks to prefer standard analyses

  • If fail, enlarge grammar, try again

    • Build up fragments that get complete sub-parses (c-structure and f-structure)

    • Allow tokens that can’t be chunked

    • Link chunks and tokens in a single f-structure


Fall-back grammar for fragments

  • Grammar writer specifies REPARSECAT

    • Alternative c-structure root if no complete parse

    • Allows for fragments and linking

  • Grammar writer specifies possible chunks

    • Categories (e.g. S, NP, VP but not N, V)

    • Looser expansions

  • Optimality theory

    Grammar writer specifies marks to

    • Prefer standard rules over anticipated errors

    • Prefer parse with fewest chunks

    • Disprefer using tokens over chunks


Example

“The the dog appears.”

Analyzed as

  • “token” the

  • sentence “the dog appears”


C-structure


F-structure

  • Many chunks have useful analyses

  • XLE/LFG degrades to shallow parsing in worst case


Robustness summary

  • External resources for incomplete lexical entries

    • Morphologies, guessers, taggers

    • Current work: Verbnet, Wordnet, Framenet, Cyc

    • Order by reliability

  • Fall back techniques for missing constructions

    • Dispreferred rules

    • Fragment grammar

  • Current WSJ evaluation:

    • 100% coverage, ~85% full parses

    • F-score (esp. recall) declines for fragment parses


Brief demo


Stochastic disambiguation: When you have to choose


Finding the most probable parse

  • XLE produces many candidates

    • All valid (with respect to grammar and OT marks)

    • Not all equally likely

    • Some applications are ambiguity enabled (defer selection)

    • … But some require a single best guess

  • Grammar writers have only coarse preference intuitions

    • Many implicit properties of words and structures with unclear significance

  • Appeal to probability model to choose best parse

    • Assume: previous experience is a good guide for future decisions

    • Collect corpus of training sentences

    • Build probability model that optimizes for previous good results

    • Apply model to choose best analysis of new sentences


Issues

  • What kind of probability model?

  • What kind of training data?

  • Efficiency of training, disambiguation?

  • Benefit vs. random choice of parse?

    • Random is awful for treebank grammars

    • Hard LFG constraints restrict to plausible candidates


Probability model

  • Conventional models: stochastic branching process

    • Hidden Markov models

    • Probabilistic Context-Free grammars

  • Sequence of decisions, each independent of previous decisions, each choice having a certain probability

    • HMM: Choose from outgoing arcs at a given state

    • PCFG: Choose from alternative expansions of a given category

  • Probability of an analysis = product of choice probabilities

  • Efficient algorithms

    • Training: forward/backward, inside/outside

    • Disambiguation: Viterbi

  • Abney 1997 and others: Not appropriate for LFG, HPSG…

    • Choices are not independent: Information from different CFG branches interacts through f-structure

    • Relative-frequency estimator is inconsistent


Exponential models are appropriate (aka log-linear models)

  • Assign probabilities to representations, not to choices in a derivation

  • No independence assumption

  • Arithmetic combined with human insight

    • Human:

      • Define properties of representations that may be relevant

      • Based on any computable configuration of f-structure features, trees

    • Arithmetic:

      • Train to figure out the weight of each property
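The division of labor above can be made concrete with a minimal sketch of the exponential model: humans define properties, arithmetic assigns each a weight, and a parse's probability is its exponentiated weighted property count, normalized over all candidate parses. The weights and properties below are illustrative, not trained values.

```python
import math

# Minimal sketch of a conditional log-linear (exponential) model:
#   P(x | s) = exp(lambda . f(x)) / Z(s)
# where Z(s) sums exp(lambda . f(x')) over all candidate parses x' of s.
# Feature vectors are represented as property-count dicts.

def score(weights, feats):
    """Dot product lambda . f(x) over the parse's active properties."""
    return sum(weights.get(p, 0.0) * c for p, c in feats.items())

def parse_probs(weights, parses):
    """parses: one property-count dict per candidate parse of a string."""
    scores = [score(weights, f) for f in parses]
    z = sum(math.exp(s) for s in scores)          # normalizer Z(s)
    return [math.exp(s) / z for s in scores]

# Two candidate parses distinguished by (made-up) attachment properties:
weights = {"cs_right_branch": 1.0, "fs_attrs ADJUNCT": -0.5}
parses = [{"cs_right_branch": 1}, {"fs_attrs ADJUNCT": 1}]
probs = parse_probs(weights, parses)
# The parse bearing the positively weighted property gets higher probability.
```

Note that no independence assumption is involved: any property of the whole c-/f-structure pair can contribute to the score.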


Stochastic disambiguation in XLE: All parses → Most probable

Discriminative ranking

  • Conditional log-linear model on c-/f-structure pairs

    P(x | s) = exp(λ · f(x)) / Z(s)

    the probability of parse x for string s, where

    f is a vector of feature values for x,

    λ is a vector of feature weights,

    Z(s) is the normalizer over all parses of s

  • Discriminative estimation of λ from partially labeled data (Riezler et al., ACL ’02)

  • Combined l1-regularization and feature selection

    • Avoid over-fitting, choose best features (Riezler & Vasserman, EMNLP ’04)


Coarse training data for XLE
(S (S-ADV (NP-SBJ (-NONE- *-1)) (VP (VBG Considering) (NP (NP (DT the) (NNS naggings)) (PP (IN of)

(NP (DT a) (NN culture)

(NN imperative))))))

(, ,)

(NP-SBJ-1 (PRP I))

(VP (ADVP-MNR (RB promptly))

(VBD signed)

(PRT (RB up)))

(. .))

“Correct” parses are consistent with weak annotation

Considering/VBG (NP the naggings of a culture imperative), (NP-SBJ I) promptly signed/VBD up.

  • Sufficient for disambiguation, not for grammar induction


Classes of properties

  • C-structure nodes and subtrees

    • indicating certain attachment preferences

  • Recursively embedded phrases

    • indicating high vs. low attachment

  • F-structure attributes

    • presence of grammatical functions

  • Atomic attribute-value pairs in f-structure

    • particular feature values

  • Left/right branching behavior of c-structures

  • (Non)parallelism of coordinations in c- and f-structures

  • Lexical elements

    • tuples of head words, argument words, grammatical relations

~60,000 candidate properties, ~1000 selected


Some properties and weights

0.937481 cs_embedded VPv[pass] 1

-0.126697 cs_embedded VPv[perf] 3

-0.0204844 cs_embedded VPv[perf] 2

-0.0265543 cs_right_branch

-0.986274 cs_conj_nonpar 5

-0.536944 cs_conj_nonpar 4

-0.0561876 cs_conj_nonpar 3

0.373382 cs_label ADVPint

-1.20711 cs_label ADVPvp

-0.57614 cs_label AP[attr]

-0.139274 cs_adjacent_label DATEP PP

-1.25583 cs_adjacent_label MEASUREP PPnp

-0.35766 cs_adjacent_label NPadj PP

-0.00651106 fs_attrs 1 OBL-COMPAR

0.454177 fs_attrs 1 OBL-PART

-0.180969 fs_attrs 1 ADJUNCT

0.285577 fs_attr_val DET-FORM the

0.508962 fs_attr_val DET-FORM this

0.285577 fs_attr_val DET-TYPE def

0.217335 fs_attr_val DET-TYPE demon

0.278342 lex_subcat achieve OBJ,SUBJ,VTYPE SUBJ,OBL-AG,PASSIVE=+

0.00735123 lex_subcat acknowledge COMP-EX,SUBJ,VTYPE


Efficiency

  • Property counts

    • Associated with AND/OR tree of XLE contexts (a1, b2)

      • Detectors may add new nodes to tree: conjoined contexts

    • Shared among many parses

  • Training

    • Dynamic programming algorithm applied to AND/OR tree

      • Avoids unpacking of individual parses (Miyao and Tsujii HLT’02)

      • Similar to inside-outside algorithm of PCFG

    • Fast algorithm for choosing best properties

    • Can train only on sentences with relatively low ambiguity

      • Shorter, perhaps easier to annotate

    • 5 hours to train over WSJ (given file of parses)

  • Disambiguation

    • Viterbi algorithm applied to Boolean tree

    • 5% of parse time to disambiguate

    • 30% gain in F-score from random-parse baseline


Integrating shallow mark-up: Part-of-speech tags, named entities, syntactic brackets


Shallow mark-up of input strings

  • Part-of-speech tags (tagger?)

    I/PRP saw/VBD her/PRP duck/VB.

    I/PRP saw/VBD her/PRP$ duck/NN.

  • Named entities (named-entity recognizer)

    <person>General Mills</person> bought it.

    <company>General Mills</company> bought it

  • Syntactic brackets (chunk parser?)

    [NP-S I] saw [NP-O the girl with the telescope].

    [NP-S I] saw [NP-O the girl] with the telescope.
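The effect of POS mark-up can be sketched as a simple filter over morphological analyses: keep only the readings whose category is compatible with the tag on the token. This is an illustrative sketch, not XLE's POS filter; the tag-to-category table and function names are hypothetical, and the fallback on the last line preserves the robustness concern raised below.

```python
# Sketch of POS-tag filtering of morphological ambiguity (illustrative,
# not XLE's actual filter). Hypothetical Penn-tag-to-category mapping:
TAG_CATS = {"VB": "Verb", "VBD": "Verb", "NN": "Noun",
            "PRP": "Pron", "PRP$": "PronPoss"}

def filter_analyses(token_tag, analyses):
    """analyses: list of (stem, category) pairs from the morphology."""
    token, tag = token_tag.rsplit("/", 1)
    wanted = TAG_CATS.get(tag)
    kept = [a for a in analyses if a[1] == wanted]
    return kept or analyses   # robustness: never filter down to nothing

# "duck/NN" keeps only the noun reading of ambiguous "duck":
print(filter_analyses("duck/NN", [("duck", "Verb"), ("duck", "Noun")]))
# [('duck', 'Noun')]
```

Pruning readings this way is what reduces ambiguity and increases speed; the fallback branch is what keeps an erroneous tag from eliminating every analysis.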


Hypothesis

  • Shallow mark-up

    • Reduces ambiguity

    • Increases speed

    • Without decreasing accuracy

    • (Helps development)

  • Issues

    • Markup errors may eliminate correct analyses

    • Markup process may be slow

    • Markup may interfere with existing robustness mechanisms (optimality, fragments, guessers)

    • Backoff may restore robustness but decrease speed in 2-pass system


Implementation in XLE

Integration with minimal changes to existing system/grammar.

Diagram: two parallel pipelines.

  • Plain input: Input string → Tokenizer (FST) → Morphology (FST) → LFG grammar → c-structure/f-structure

  • Marked-up input: Marked-up string → Tokenizer (FST) (plus POS, NE converter) → Morphology (FST) (plus POS filter) → LFG grammar (plus bracket metarule, NE sublexical rule) → c-structure/f-structure



Comparison: Shallow vs. deep parsing

HLT, 2004

  • Popular myth

    • Shallow statistical parsers are fast, robust… and useful

    • Deep grammar-based parsers are slow and brittle

  • Is this true? Comparison on predicate-argument relations, not phrase-trees

    • Needed for meaning-sensitive applications (= usefulness)

      (translation, question answering…but maybe not IR)

    • Collins (1999) parser: state-of-the-art, marks arguments

      (for fair test, wrote special code to make relations explicit--not so easy)

    • LFG/XLE with morphology, named-entities, disambiguation

    • Measured time, accuracy against PARC 700 Gold Standard

    • Results:

      • Collins is somewhat faster than LFG/XLE

      • LFG/XLE makes somewhat fewer errors, provides more useful detail


XLE system

  • Parser/generator for LFG grammars: multilingual

  • Composition with finite-state transductions

  • Careful ambiguity-management implementation

    • Preserves context-free locality in equational disjunctions

    • Exports ambiguity-enabling interfaces

      Efficient implementation of clause conjunction (C1 ∧ C2)

  • Log-linear disambiguation

    • Appropriate for LFG representations

    • Ambiguity-enabled theory and implementation

  • Robustness: shallow in the worst-case

  • Scales to broad-coverage grammars, long sentences

  • Semantic interface: Glue


LFG/XLE: Current issues

  • Induction of LFG grammars from treebanks

    • Basic work in ParGram: Dublin City University

    • Principles of generalization, for human extension, combination with manual grammar

      DCU + PARC

  • Large grammars for more language typologies

    • E.g. verb initial: Welsh, Malagasy, Arabic

  • Reduce performance variance; why not linear?

    • Competence vs. performance: limit center embedding?

    • Investigate speed/accuracy trade-off

  • Embedding in applications: XLE as a black box

    • Question answering(!), Translation, Sentence condensation …

    • Develop, combine with other ambiguity-enabled modules

      Reasoning, transfer-rewriting…


Matching for Question Answering

Diagram: the Question and the Sources are each run through a Parser (English grammar) to produce f-structures and then semantics; an Overlap detector compares the two semantic representations to yield the Answer.


Glue Semantics


Logical & collocational semantics

  • Logical Semantics

    • Map sentences to logical representations of meaning

    • Enables inference & reasoning

  • Collocational semantics

    • Represent word meanings as feature vectors

    • Typically obtained by statistical corpus analysis

    • Good for indexing, classification, language modeling, word sense disambiguation

    • Currently does not enable inference

  • Complementary, not conflicting, approaches


Example semantic representation

“The wire broke”

Syntax (f-structure):

  PRED  'break<SUBJ>'
  SUBJ  [ PRED wire, SPEC def, NUM sg ]
  TENSE past

Semantics (logical form):

  ∃w. wire(w) & w=part25 &
  ∃t. interval(t) & t<now &
  ∃e. break_event(e) & occurs_during(e,t) &
      object_of_change(e,w) &
      ∃c. cause_of_change(e,c)

  • F-structure gives basic predicate-argument structure,

    but lacks:

    • Standard logical machinery (variables, connectives, etc)

    • Implicit arguments (events, causes)

    • Contextual dependencies (the wire = part25)

  • Mapping from f-structure to logical form is systematic,

    but non-trivial


Glue Semantics (Dalrymple, Lamping & Saraswat 1993 and subsequently)

  • Syntax-semantics mapping as linear logic inference

  • Two logics in semantics:

    • Meaning Logic (target semantic representation)

      any suitable semantic representation

    • Glue Logic (deductively assembles target meaning)

      fragment of linear logic

  • Syntactic analysis produces lexical glue premises

  • Semantic interpretation uses deduction to assemble final meaning from these premises


Linear logic

  • Influential development in theoretical computer science (Girard 87)

  • Premises are resources consumed in inference (Traditional logic: premises are non-resourced)

    Traditional                      Linear

    A, A → B ⊨ B                     A, A -o B ⊨ B

    A, A → B ⊨ A ∧ B                 A, A -o B ⊭ A ⊗ B

    (A re-used)                      (A consumed)

    A, B ⊨ B                         A, B ⊭ B

    (A discarded)                    (Cannot discard A)

  • Linguistic processing typically resource sensitive

    • Words/meanings used exactly once


Glue interpretation (outline)

  • Parsing sentence instantiates lexical entries to produce lexical glue premises

  • Example lexical premise (verb “saw” in “John saw Fred”):

see : g -o (h -o f)

Meaning Term Glue Formula

2-place predicate g, h, f: constituents in parse

“consume meanings of g and h

to produce meaning of f”

  • Glue derivation |= M : f

    • Consume all lexical premises ,

    • to produce meaning, M, for entire sentence, f


Glue interpretation: Getting the premises

Syntactic analysis of “John saw Fred”:

  f: [ PRED see
       SUBJ g: [ PRED John ]
       OBJ  h: [ PRED Fred ] ]

Lexicon:

  John  NP  john : ↑
  Fred  NP  fred : ↑
  saw   V   see : (↑ SUBJ) -o ((↑ OBJ) -o ↑)

Instantiated premises:

  john : g
  fred : h
  see : g -o (h -o f)


Glue interpretation: Deduction with premises

Premises:

  john : g
  fred : h
  see : g -o (h -o f)

Linear logic derivation, using linear modus ponens:

  g -o (h -o f)    g
       h -o f          h
            f

Derivation with meaning terms:

  see : g -o (h -o f)    john : g
    see(john) : h -o f        fred : h
      see(john)(fred) : f

Linear modus ponens = function application
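The derivation above can be simulated with a toy deduction engine. This is an illustrative sketch, not a real glue prover: premises are (meaning, formula) pairs, an implication `g -o f` is encoded as the tuple `("g", "f")`, and each linear modus ponens step consumes both premises while (via Curry-Howard) applying the function meaning to the argument meaning.

```python
# Toy glue-style deduction (illustrative, not a real glue prover).
# Atoms are strings like 'g'; the tuple ('g', 'f') stands for g -o f.

def derive(premises):
    """Exhaustively apply linear modus ponens; return remaining premises."""
    prems = list(premises)
    changed = True
    while changed:
        changed = False
        for i, (fun, ftyp) in enumerate(prems):
            if not isinstance(ftyp, tuple):
                continue                      # not an implication
            ante, cons = ftyp
            for j, (arg, atyp) in enumerate(prems):
                if i != j and atyp == ante:
                    # Consume both premises, produce the application.
                    new = (f"{fun}({arg})", cons)
                    prems = [p for k, p in enumerate(prems) if k not in (i, j)]
                    prems.append(new)
                    changed = True
                    break
            if changed:
                break
    return prems

# "John saw Fred":  see : g -o (h -o f),  john : g,  fred : h
premises = [("see", ("g", ("h", "f"))), ("john", "g"), ("fred", "h")]
print(derive(premises))   # [('see(john)(fred)', 'f')]
```

A successful derivation consumes every premise exactly once, leaving a single meaning for the whole sentence, which is exactly the resource sensitivity that motivates the linear logic.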


Modus ponens = Function application: The Curry-Howard isomorphism

  Fun : g -o f    Arg : g
  -----------------------
       Fun(Arg) : f

Curry-Howard isomorphism:

  • Pairs LL inference rules with operations on meaning terms

  • Propositional linear logic inference constructs meanings

  • LL inference completely independent of meaning language

    (Modularity of meaning representation)


Semantic ambiguity: Multiple derivations from a single set of premises

“Alleged criminal from London”

  f: [ PRED criminal
       MODS { alleged, from London } ]

Premises:

  criminal : f
  alleged : f -o f
  from-London : f -o f

Two distinct derivations:

1. from-London(alleged(criminal))

2. alleged(from-London(criminal))


Semantic ambiguity & modifiers

  • Multiple derivations from a single premise set

    • Arises through different ways of permuting modifiers around a core skeleton

  • Modifiers are given a formal representation in glue as X -o X identities

    • E.g. an adjective is a noun -o noun modifier

  • Modifiers are prevalent in natural language, and lead to combinatorial explosion

    • Given N f -o f modifiers, there are N! ways of permuting them around the f skeleton


Ambiguity management in semantics

  • Efficient theorem provers that manage combinatorial explosion of modifiers

    • Packing of N! analyses

      • Represent all N! analyses in polynomial space

      • Compute representation in polynomial time

      • Free choice: Read off any given analysis in linear time

    • Packing through structure re-use

      • N! analyses through combinations of N sub-analyses

      • Compute each sub-analysis once, and re-use


PARC Linguistic Environment

Diagram: a multidimensional architecture.

  • Layers: FS Morphology, LFG Syntax, Glue Semantics

  • Operations: Parse, Generate, Select, Transfer, Interpret

  • Dimensions: Theory (mathematics, algorithms) / Software (programs, data structures, models & parameters) / Tableware; languages: English, French, German, Japanese, Urdu, Norwegian

  • System properties: Scale, Modularity, Robustness, Ambiguity Management

  • Applications: Translation, Question Answering, Dialog, Condensation, Knowledge tracking, Email Routing, Email Response
