AQUAINT: Advanced QUestion Answering for INTelligence

Grammatical processing with LFG and XLE

Ron Kaplan

ARDA Symposium, August 2004

Layered Architecture for Question Answering

Diagram: Text from Sources is parsed by XLE/LFG Parsing into F-structure, mapped to Conceptual semantics, and then by KR Mapping into the Target KRR, yielding Assertions in the KR. A Question follows the same path to become a Query; a Match against the KR produces Answers, Explanations, and Subqueries. Composed F-Structure Templates feed XLE/LFG Generation to produce Text to user.


Layered Architecture for Question Answering (continued)

Infrastructure: XLE, MaxEnt models, Linear deduction, Term rewriting

Theories: Lexical Functional Grammar, Ambiguity management, Glue Semantics

Resources: English Grammar, Glue lexicon, KR mapping
Deep analysis matters… if you care about the answer

Example: “A delegation led by Vice President Philips, head of the chemical division, flew to Chicago a week after the incident.”

Question: Who flew to Chicago?

Candidate answers:

division (closest noun): shallow but wrong
head (next closest)
V.P. Philips (next)
delegation (furthest away, but the subject of flew): deep and right

“grammatical function”

F-structure: localizes arguments (“lexical dependency”)

“John was easy to please”:

PRED easy<SUBJ, COMP>
SUBJ John
COMP [PRED please<SUBJ, OBJ>, SUBJ someone, OBJ John]

“John was eager to please”:

PRED eager<SUBJ, COMP>
SUBJ John
COMP [PRED please<SUBJ, OBJ>, SUBJ John, OBJ someone]

Was John pleased?

“John was easy to please”: Yes
“John was eager to please”: Unknown

Topics
  • Basic LFG architecture
  • Ambiguity management in XLE
  • Pargram project: Large scale grammars
  • Robustness
  • Stochastic disambiguation
  • [Shallow markup]
  • [Semantic interpretation]

Focus on the language end, not knowledge

The Language Mapping: LFG & XLE

Diagram: a Sentence (“Tony decided to go.”) passes through Tokens/Morphology into XLE, which parses and generates under an LFG Grammar (English, German, etc.), Named Entities, and a Stochastic Model, producing Functional structures that connect to Knowledge.

XLE: Efficient ambiguity management

Why deep analysis is difficult
  • Languages are hard to describe
    • Meaning depends on complex properties of words and sequences
    • Different languages rely on different properties
    • Errors and disfluencies
  • Languages are hard to compute
    • Expensive to recognize complex patterns
    • Sentences are ambiguous
    • Ambiguities multiply: explosion in time and space
Different patterns code same meaning

“The small children are chasing the dog.”

English (group, order): S → NP [Det the, Adj small, N children] Aux [are] V’ [V chasing, NP [Det the, N dog]]

Japanese (group, mark): tiisai ‘small’ kodomotati ‘children’ ga ‘Sbj’ inu ‘dog’ o ‘Obj’ oikaketeiru ‘are chasing’

Different patterns code same meaning

“The small children are chasing the dog.” → chase(small(children), dog)

English (group, order) and Japanese (group, mark) as above; Warlpiri (mark only) allows discontinuous constituents:

kurdujarrarlu ‘children-Sbj’ kapala ‘Present’ maliki ‘dog-Obj’ wajilipinyi ‘chase’ witajarrarlu ‘small-Sbj’

All three map to the same f-structure:

PRED ‘chase<Subj, Obj>’
TENSE Present
SUBJ [PRED children, MOD small]
OBJ [PRED dog]

LFG theory: minor adjustments on a universal theme

LFG architecture

Modularity: nearly-decomposable. C(onstituent)-structures and F(unctional)-structures are related by a piecewise correspondence φ.

C-structure (formal encoding of order and grouping): S → NP [John] VP [V likes, NP Mary]

F-structure (formal encoding of grammatical relations):

PRED ‘like<SUBJ,OBJ>’
TENSE PRESENT
SUBJ [PRED ‘John’, NUM SG]
OBJ [PRED ‘Mary’, NUM SG]

LFG grammar

  • Context-free rules define valid c-structures (trees).
  • Annotations on rules give constraints that corresponding f-structures must satisfy.
  • Satisfiability of constraints determines grammaticality.
  • F-structure is the solution for the constraints (if satisfiable).

Rules:

S  →  NP            VP
      (↑ SUBJ)=↓    ↑=↓
VP →  V     (NP)
      ↑=↓   (↑ OBJ)=↓
NP →  (Det)  N
      ↑=↓    ↑=↓

Lexical entries:

N → John   (↑ PRED)=‘John’ (↑ NUM)=SG
V → likes  (↑ PRED)=‘like<SUBJ, OBJ>’ (↑ SUBJ NUM)=SG (↑ SUBJ PERS)=3

Rules as well-formedness conditions

S  →  NP            VP
      (↑ SUBJ)=↓    ↑=↓

If * denotes a particular daughter node:

↑ : f-structure of the mother, M(*)
↓ : f-structure of the daughter, *

A tree containing S over NP-VP is OK if:

  • the f-unit corresponding to the NP node is the SUBJ of the f-unit corresponding to the S node, and
  • the same f-unit corresponds to both the S and VP nodes.

Inconsistent equations = Ungrammatical

What’s wrong with “They walks”?

S  →  NP            VP
      (↑ SUBJ)=↓    ↑=↓

they:  (↑ NUM)=PL
walks: (↑ SUBJ NUM)=SG

Let f be the (unknown) f-structure of the S, s the f-structure of the NP, and v the f-structure of the VP. Then (substituting equals for equals):

(f SUBJ)=s and (s NUM)=PL ⇒ (f SUBJ NUM)=PL
f=v and (v SUBJ NUM)=SG ⇒ (f SUBJ NUM)=SG
(f SUBJ NUM)=PL and (f SUBJ NUM)=SG ⇒ SG=PL ⇒ FALSE

If a valid inference chain yields FALSE, the premises are unsatisfiable: no f-structure.
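To make “substituting equals for equals” concrete, here is a minimal Python sketch (toy code, not XLE; all names are illustrative) that asserts path equations into an attribute-value structure and detects the SG=PL clash:

class FStruct:
    """A toy attribute-value structure; values are atoms or nested FStructs."""
    def __init__(self):
        self.attrs = {}

    def assert_eq(self, path, value):
        """Assert an equation like (f SUBJ NUM)=PL; return False on a clash."""
        node = self
        for attr in path[:-1]:
            node = node.attrs.setdefault(attr, FStruct())
        old = node.attrs.get(path[-1])
        if old is not None and old != value:
            return False                  # e.g. SG=PL: FALSE, no f-structure
        node.attrs[path[-1]] = value
        return True

f = FStruct()
ok = f.assert_eq(("SUBJ", "NUM"), "PL")           # from "they"
ok = ok and f.assert_eq(("SUBJ", "NUM"), "SG")    # from "walks"
print("grammatical" if ok else "ungrammatical: no f-structure")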

English and Japanese

English: one NP before the verb, one after: Subject and Object.

S  →  NP            V     NP
      (↑ SUBJ)=↓    ↑=↓   (↑ OBJ)=↓

Japanese: any number of NPs before the verb; the particle on each defines its grammatical function.

S  →  NP*              V
      (↑ (↓ GF))=↓     ↑=↓

ga: (↑ GF)=SUBJ
o:  (↑ GF)=OBJ
English and Japanese
Warlpiri: Discontinuous constituents

Like Japanese: any number of NPs; the particle on each defines its grammatical function.

S → … NP* …
      (↑ (↓ GF))=↓

rlu: (↑ GF)=SUBJ
ki:  (↑ GF)=OBJ

Unlike Japanese, the head Noun is optional in NP:

NP →  A*           (N)
      (↑ MOD)=↓    ↑=↓

Example: kurdujarrarlu ‘children-Sbj’ kapala ‘Present’ maliki ‘dog-Obj’ wajilipinyi ‘chase’ witajarrarlu ‘small-Sbj’

F-structure:

PRED ‘chase<Subj, Obj>’
TENSE Present
SUBJ [PRED children, MOD small]
OBJ [PRED dog]

English: Discontinuity in questions

Who did Mary see? (OBJ)
Who did Bill think Mary saw? (COMP OBJ)
Who did Bill think saw Mary? (COMP SUBJ)

Who is understood as subject/object of a distant verb. Uncertainty: which function of which verb?

S’ → NP          S
     (↑ Q)=↓     ↑=↓
     (↑ COMP* {SUBJ|OBJ})=↓

F-structure for “Who did Bill think Mary saw?”:

Q Who
TENSE past
PRED think<SUBJ, COMP>
SUBJ Bill
COMP [PRED see<SUBJ,OBJ>, TENSE past, SUBJ Mary, OBJ ↦ Who]

Summary: Lexical Functional Grammar

Kaplan and Bresnan, 1982

  • Modular: c-structure/f-structure in correspondence
  • Mathematically simple, computationally transparent
    • Combination of Context-free grammar, Quantifier-free equality theory
    • Closed under composition with regular relations: finite-state morphology
  • Grammatical functions are universal primitives
    • Subject and Object expressed differently in different languages

English: Subject is first NP

Japanese: Subject has ga

    • But: Subject and Object behave similarly in all languages

Active to Passive: Object becomes Subject

English: move words Japanese: move ga

  • Adopted by world-wide community of linguists
    • Large literature: papers, (text)books, conferences; reference theory
    • (Relatively) easy to describe all languages
    • Linguists contribute to practical computation
  • Stable: Only minor changes in 25 years
Computation challenge: Pervasive ambiguity

At every level: Tokenization, Morphology, Syntax, Semantics, Knowledge.

  • I like Jan. → |Jan|.| or |Jan.|.| (sentence end or abbreviation?)
  • walks → Noun or Verb?
  • untieable knot → (untie)able or un(tieable)?
  • bank → river or financial?
  • The duck is ready to eat. → Cooked or hungry?
  • Every proposer wants an award. → The same award or each their own?
  • The sheet broke the beam. → Atoms or photons?
Coverage vs. Ambiguity

“I fell in the park.” (PP modifies the verb)

+

“I know the girl in the park.” (PP modifies the noun)

⇒ “I see the girl in the park.” is ambiguous: covering both attachment patterns gives this sentence two parses.

Ambiguity can be explosive

If alternatives multiply within or across components (Tokenize → Morphology → Syntax → Semantics → Knowledge)…

Computational consequences of ambiguity
  • Serious problem for computational systems
    • Broad-coverage, hand-written grammars frequently produce thousands of analyses, sometimes millions
    • Machine learned grammars easily produce hundreds of thousands of analyses if allowed to parse to completion
  • Three approaches to ambiguity management:
    • Prune: block unlikely analysis paths early
    • Procrastinate: do not expand alternative analysis paths until something else requires them
      • Also known as underspecification
    • Manage: compact representation and computation of all possible analyses
Pruning ⇒ Premature Disambiguation

  • Conventional approach: use heuristics (statistics, strong constraints) to kill alternatives as early as possible, at every stage: Tokenize, Morphology, Syntax, Semantics, Knowledge
  • Oops: strong constraints may reject the so-far-best (= only) option

Fast computation, wrong result

Procrastination: Passing the Buck
  • Chunk parsing as an example:
    • Collect noun groups, verb groups, PP groups
    • Leave it to later processing to put these together
    • Some combinations are nonsense
  • Later processing must either:
    • Call (another) parser to check constraints
    • Have its own model of constraints (= grammar)
    • Solve constraints that chunker includes with output
Computational Complexity of LFG
  • LFG is simple combination of two simple theories
    • Context-free grammars for trees
    • Quantifier free theory of equality for f-structures
  • Both theories are easy to compute
    • Cubic CFG Parsing
    • Linear equation solving
  • Combination is difficult: parsing problem is NP-complete
    • Exponential/intractable in the worst case (but computable, unlike some other linguistic theories)
    • Can we avoid the worst case?
Some syntactic dependencies

  • Local dependencies: These dogs / *This dogs (agreement)
  • Nested dependencies: The dogs [in the park] bark (agreement)
  • Cross-serial dependencies: Jan Piet Marie zag helpen zwemmen (predicate/argument map)

See(Jan, help(Piet, swim(Marie)))

  • Long distance dependencies:

The girl who John says that Bob believes … likes Henry left.

Left(girl) ∧ Says(John, believes(Bob, (… likes(girl, Henry))))

Expressiveness vs. complexity

The Chomsky Hierarchy (n is the length of the sentence):

  • Regular: linear
  • Context-free: cubic
  • Context-sensitive and beyond: exponential. Intractable!

But languages have mostly local and nested dependencies... so (mostly) cubic performance should be possible.

NP Complete Problems

  • Problems that can be solved by a Nondeterministic Turing Machine in polynomial time
  • General characterization: generate and test
    • Lots of candidate solutions that need to be verified for correctness
    • Every candidate is easy to confirm or disconfirm

A nondeterministic TM has an oracle that provides only the right candidates to test; it doesn’t search. A deterministic TM has no oracle and must test all the candidates: n elements give 2^n candidates, exponentially many.

Polynomial search problems
  • Subparts of a candidate are independent of other parts: outcome is not influenced by other parts (context free)
  • The same independent subparts appear in many candidates
  • We can (easily) determine that this is the case
  • Consequence: test subparts independent of context, share results
Why is LFG parsing NP Complete?

Classic generate-and-test search problem:

  • Exponentially many tree-candidates
    • A CFG chart parser quickly produces a packed representation of all trees
    • A CFG can be exponentially ambiguous
    • Each tree must be tested for f-structure satisfiability
  • Boolean combinations of per-tree constraints

English base verbs: not 3rd singular

(↑ SUBJ NUM)≠SG ∨ (↑ SUBJ PERS)≠3

Disjunction!

Exponentially many exponential problems

XLE Ambiguity Management: The intuition

“The sheep saw the fish.” How many sheep? How many fish?

Options multiplied out:

The sheep-sg saw the fish-sg.
The sheep-pl saw the fish-sg.
The sheep-sg saw the fish-pl.
The sheep-pl saw the fish-pl.

In principle, a verb might require agreement of Subject and Object: have to check it out. But English doesn’t do that: the subparts are independent.

Options packed:

The sheep {sg|pl} saw the fish {sg|pl}

The packed representation is a “free choice” system:

  • Encodes all dependencies without loss of information
  • Common items represented, computed once
  • Key to practical efficiency
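A minimal Python sketch of the free-choice idea (an assumed dictionary representation, not XLE’s actual data structures): each ambiguous feature is stored once, and readings are enumerated only on demand:

from itertools import product

packed = {
    ("SUBJ", "NUM"): [("p", "sg"), ("~p", "pl")],   # how many sheep?
    ("OBJ", "NUM"):  [("q", "sg"), ("~q", "pl")],   # how many fish?
}

def readings(packed):
    """Free choice: any combination of independent alternatives is a reading."""
    paths = list(packed)
    for combo in product(*(packed[p] for p in paths)):
        yield {path: value for path, (label, value) in zip(paths, combo)}

print(len(list(readings(packed))))   # 4 readings from two 2-way choices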
Dependent choices

Das Mädchen {nom|acc} sah die Katze {nom|acc} (“The girl saw the cat”)

Free choice over the packed options would allow:

Das Mädchen-nom sah die Katze-nom: bad
Das Mädchen-nom sah die Katze-acc: The girl saw the cat
Das Mädchen-acc sah die Katze-nom: The cat saw the girl
Das Mädchen-acc sah die Katze-acc: bad

Again, packing avoids duplication… but it’s wrong: it doesn’t encode all dependencies; the choices are not free.

Another example: Who do you want to succeed?

I want to succeed John (want intransitive, succeed transitive)
I want John to succeed (want transitive, succeed intransitive)

Solution: Label dependent choices

Das Mädchen sah die Katze, with labeled choices p:nom / ¬p:acc on Mädchen and q:nom / ¬q:acc on Katze:

p∧q:   Das Mädchen-nom sah die Katze-nom: bad
p∧¬q:  Das Mädchen-nom sah die Katze-acc: The girl saw the cat
¬p∧q:  Das Mädchen-acc sah die Katze-nom: The cat saw the girl
¬p∧¬q: Das Mädchen-acc sah die Katze-acc: bad

φ = (p∧¬q) ∨ (¬p∧q)

  • Label each choice with distinct Boolean variables p, q, etc.
  • Record the acceptable combinations as a Boolean expression φ
  • Each analysis corresponds to a satisfying truth-value assignment
  • (free choice from the true lines of φ’s truth table)
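A minimal Python sketch of reading analyses off the true lines of φ’s truth table (toy code; the formula is the German example above):

from itertools import product

def satisfying(phi, variables):
    """Yield the truth-value assignments that satisfy phi."""
    for values in product([True, False], repeat=len(variables)):
        env = dict(zip(variables, values))
        if phi(env):
            yield env

phi = lambda e: (e["p"] and not e["q"]) or (not e["p"] and e["q"])
for env in satisfying(phi, ["p", "q"]):
    subj = "nom" if env["p"] else "acc"
    obj = "nom" if env["q"] else "acc"
    print(f"Das Mädchen-{subj} sah die Katze-{obj}")
# Das Mädchen-nom sah die Katze-acc   (the girl saw the cat)
# Das Mädchen-acc sah die Katze-nom   (the cat saw the girl)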
Boolean Satisfiability

Can solve Boolean formulas by multiplying out: Disjunctive Normal Form

(a ∨ b) ∧ x ∧ (c ∨ d)
⇒ (a ∧ x ∧ c) ∨ (a ∧ x ∧ d) ∨ (b ∧ x ∧ c) ∨ (b ∧ x ∧ d)

  • Produces simple conjunctions of literal propositions (“facts”: equations)
  • Easy checks for satisfiability: if a ∧ d ⇒ FALSE, replace any conjunction containing both a and d by FALSE
  • But: blow-up of disjunctive structure before fact processing
  • Individual facts are replicated (and re-processed): exponential
Alternative: “Contexted” normal form

(a ∨ b) ∧ x ∧ (c ∨ d)
⇒ (p→a) ∧ (¬p→b) ∧ x ∧ (q→c) ∧ (¬q→d)

Produce a flat conjunction of contexted facts: each conjunct pairs a context (p, ¬p, q, ¬q) with a fact (a, b, c, d).
Alternative: “Contexted” normal form (continued)

(a ∨ b) ∧ x ∧ (c ∨ d)  ⇒  (p→a) ∧ (¬p→b) ∧ x ∧ (q→c) ∧ (¬q→d)

  • Each fact is labeled with its position in the disjunctive structure
  • The Boolean hierarchy is discarded

No blow-up, no duplicates:

  • Each fact appears, and can be processed, once
  • Claims:
    • Checks for satisfiability are still easy
    • Facts can be processed first, disjunctions deferred
A sound and complete method

Maxwell & Kaplan, 1987, 1991

Ambiguity-enabled inference (by trivial logic): if φ ∧ ψ ⊢ χ is a rule of inference, then so is [C1→φ] ∧ [C2→ψ] ⊢ [(C1∧C2)→χ]. Valid for any theory.

E.g. substitution of equals for equals:

x=y ∧ φ ⊢ φ[x/y] is a rule of inference.
Therefore: (C1→x=y) ∧ (C2→φ) ⊢ (C1∧C2)→φ[x/y]

Conversion to logically equivalent contexted form:

Lemma: φ ∨ ψ iff (p→φ) ∧ (¬p→ψ), where p is a new Boolean variable.

Proof: (If) If φ is true, let p be true, in which case (p→φ) ∧ (¬p→ψ) is true; symmetrically, if ψ is true, let p be false. (Only if) If p is true, then φ is true; if p is false, then ψ is true; either way φ ∨ ψ holds.

Test for satisfiability

Suppose R → FALSE is deduced from a contexted formula φ. Then φ is satisfiable only if ¬R.

E.g. R → SG=PL ⇒ R → FALSE.

R is called a “nogood” context.

  • Perform all fact-inferences, conjoining contexts
  • If FALSE is inferred, add its context to the nogoods
  • Solve the conjunction of nogoods
    • Boolean satisfiability: exponential only in the nogood context-Booleans
    • Independent facts: no FALSE, no nogoods
  • Implicitly notices independence/context-freeness
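A minimal Python sketch of the nogood computation (assumed representation, not XLE’s: facts are path=value equations, contexts are frozensets of Boolean literals such as "p" and "~p"):

from itertools import combinations

def conjoin(c1, c2):
    """Conjoin two contexts; None means the combination is contradictory."""
    merged = c1 | c2
    for lit in merged:
        complement = lit[1:] if lit.startswith("~") else "~" + lit
        if complement in merged:
            return None                        # contains both p and ~p
    return merged

def nogoods(contexted_facts):
    """Contexts in which two facts clash (same path, different value)."""
    bad = []
    for (cx1, path1, v1), (cx2, path2, v2) in combinations(contexted_facts, 2):
        if path1 == path2 and v1 != v2:
            ctx = conjoin(cx1, cx2)
            if ctx is not None:
                bad.append(ctx)                # FALSE is derivable here
    return bad

# "The sheep walks": sheep is sg (context p) or pl (context ~p); walks wants sg.
facts = [
    (frozenset({"p"}),  ("SUBJ", "NUM"), "SG"),
    (frozenset({"~p"}), ("SUBJ", "NUM"), "PL"),
    (frozenset(),       ("SUBJ", "NUM"), "SG"),   # from "walks": context True
]
print(nogoods(facts))    # [frozenset({'~p'})]: grammatical only where p holds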
Example 1

“They walk”

  • No disjunction; all facts are in the default “True” context
  • No change to inference:

T→(f SUBJ NUM)=PL ∧ T→(f SUBJ NUM)=PL ⊢ T→PL=PL

reduces to: (f SUBJ NUM)=PL ∧ (f SUBJ NUM)=PL ⊢ PL=PL

“They walks”

  • No disjunction; all facts still in the default “True” context
  • No change to inference:

T→(f SUBJ NUM)=PL ∧ T→(f SUBJ NUM)=SG ⊢ T→PL=SG ⊢ T→FALSE

Satisfiable only if ¬T, so unsatisfiable.
Examples 2

“The sheep walks”

  • Disjunction of the NUM feature from sheep:

(f SUBJ NUM)=SG ∨ (f SUBJ NUM)=PL

  • Contexted facts:

p→(f SUBJ NUM)=SG
¬p→(f SUBJ NUM)=PL
(f SUBJ NUM)=SG (from walks)

  • Inferences:

p→(f SUBJ NUM)=SG ∧ (f SUBJ NUM)=SG ⊢ p→SG=SG
¬p→(f SUBJ NUM)=PL ∧ (f SUBJ NUM)=SG ⊢ ¬p→PL=SG ⊢ ¬p→FALSE

¬p→FALSE is true iff ¬p is false, iff p is true.

Conclusion: the sentence is grammatical in context p: only 1 sheep.
Contexts and packing: Index by facts

The sheep saw the fish.

SUBJ NUM: p→SG, ¬p→PL
OBJ NUM:  q→SG, ¬q→PL

Contexted unification ≈ concatenation, when choices don’t interact.

Compare: DNF unification

The sheep saw the fish.

[SUBJ [NUM SG]] ∨ [SUBJ [NUM PL]] unified with [OBJ [NUM SG]] ∨ [OBJ [NUM PL]]:

SUBJ [NUM SG], OBJ [NUM SG]
SUBJ [NUM SG], OBJ [NUM PL]
SUBJ [NUM PL], OBJ [NUM SG]
SUBJ [NUM PL], OBJ [NUM PL]

DNF cross-product of alternatives: exponential

The XLE wager (for real sentences of real languages)

  • Alternatives from distant choice-sets can be freely chosen without affecting satisfiability
    • FALSE is unlikely to appear
  • The contexted method optimizes for independence
    • No FALSE ⇒ no nogoods ⇒ nothing to solve

Bet: the worst case 2^n reduces to k·2^m, where m << n
Ambiguity-enabled inference: Choice-logic common to all modules

If φ ∧ ψ ⊢ χ is a rule of inference, then so is C1→φ ∧ C2→ψ ⊢ (C1∧C2)→χ

1. Substitution of equals for equals (e.g. for LFG syntax)

x=y ∧ φ ⊢ φ[x/y]
Therefore: C1→x=y ∧ C2→φ ⊢ (C1∧C2)→φ[x/y]

2. Reasoning

Cause(x,y) ∧ Prevent(y,z) ⊢ Prevent(x,z)
Therefore: C1→Cause(x,y) ∧ C2→Prevent(y,z) ⊢ (C1∧C2)→Prevent(x,z)

3. Log-linear disambiguation

Prop1(x) ∧ Prop2(x) ⊢ Count(Feature_n)
Therefore: C1→Prop1(x) ∧ C2→Prop2(x) ⊢ (C1∧C2)→Count(Feature_n)

Ambiguity-enabled components propagate choices; they can defer choosing and enumerating.
Summary: Contexted constraint satisfaction
  • Packed
    • facts not duplicated
    • facts not hidden in Boolean structure
  • Efficient
    • deductions not duplicated
    • fast fact processing (e.g. equality) can prune slow disjunctive processing
    • optimized for independence
  • General and simple
    • applies to any deductive system, uniform across modules
    • not limited to special-case disjunctions
    • mathematically trivial
  • Compositional free-choice system
    • enumeration of (exponentially many?) valid solutions deferred across module boundaries
    • enables backtrack-free, linear-time, on-demand enumeration
    • enables packed refinement by cross-module constraints: new nogoods
The remaining exponential
  • Contexted constraint satisfaction (typically) avoids the Boolean explosion in solving f-structure constraints for single trees
  • How can we suppress tree enumeration?

(and still determine satisfiability)

Ordering strategy: Easy things first
  • Do all c-structure before any f-structure processing
    • Chart is a free choice representation, guarantees valid trees
  • Only produce/solve f-structure constraints for constituents in complete, well-formed trees

[NB: Interleaved, bottom-up pruning is a bad idea]

Bets on inconsistency, not independence

Asking the right question
  • How can we make it faster?
    • More efficient unifier: undoable operations, better indexing, clever data structures, compiling.
    • Reordering for more effective pruning.
  • Why not cubic?
    • Intuitively, the problem isn’t that hard.
    • GPSG: Natural language is nearly context free.
    • Surely for context-free equivalent grammars!
No f-structure filtering, no nogoods... but still explosive

An LFG grammar for a context-free language:

S  →  S           S
      (↑ L)=↓     (↑ R)=↓

S  →  a
      (↑ A)=+

The chart packs all the trees for a string of a’s, but the f-structures enumerate them: each bracketing yields a distinct f-structure of nested L [A +] and R [A +] features, so f-structure processing explodes even though no constraints are ever inconsistent.

Disjunctive lazy copy

  • Pack functional information from alternative local subtrees.
  • Unpack/copy to higher consumers only on demand.

E.g. alternative subtrees under the left and right daughters contribute packed alternatives p→f1, q→f2, r→f3 and p→f6, q→f5, r→f4; the (↑ L)=↓ annotation on S doesn’t access the internal features, so nothing needs to be copied.

Automatically takes advantage of context-freeness, without grammar analysis or compilation.
The XLE wager
  • Most feature dependencies are restricted to local subtrees
    • mother/daughter/sister interactions
    • maybe a grandmother now and then
    • very rarely span an unbounded distance
  • Optimize for local case
    • bounded computation per subtree gives cubic curve
    • graceful degradation with non-local interactions … but still correct
Packing Equalities in F-structure

“Visiting relatives is boring.”

The subject NP is ambiguous: a gerund VP reading (context A1: visiting relatives, (↑ NUM)=sg) or a plural NP reading (context A2: relatives, (↑ NUM)=pl). The verb is contributes (↑ SUBJ NUM)=sg in the True context.

T→(SUBJ NUM)=sg ∧ A1→(SUBJ NUM)=sg ⊢ (T∧A1)→sg=sg
T→(SUBJ NUM)=sg ∧ A2→(SUBJ NUM)=pl ⊢ (T∧A2)→sg=pl ⊢ nogood(A2)

XLE Performance: HomeCentre Corpus

  • English HomeCentre: about 1100 sentences
  • French HomeCentre: R² = .80, 3.3 ms/subtree
  • German HomeCentre: R² = .44, 3.8 ms/subtree

Generation with LFG/XLE
  • Parse: string → c-structure → f-structure
  • Generate: f-structure → c-structure → string
  • Same grammar: shared development, maintenance
  • Formal criterion: s ∈ Gen(Parse(s))
  • Practical criterion: don’t generate everything
    • Parsing robustness → undesired strings, needless ambiguity
    • Use optimality marks to restrict generation grammar
    • Restricted (un)tokenizing transducer: don’t allow arbitrary white space, etc.
Mathematics and Computation

Formal properties

  • Gen(f) is a (possibly infinite) set
    • Equality is idempotent: x=y ∧ x=y ⇔ x=y
    • Longer strings with redundant equations map to same f-structure
  • What kind of set?

Context-free language (Kaplan & Wedekind, 2000)

Computation

XLE/LFG generation:

  • Convert the LFG grammar to a CFG only for strings that map to f
    • NP-complete, ambiguity managed (as usual)
    • All strings in the CFL are grammatical w.r.t. the LFG grammar
    • Composition with regular relations is crucial
  • The CFG is a packed, free-choice representation of all strings (see the sketch after this list)
    • Can use ordinary CF generation algorithms to enumerate strings
    • Can defer enumeration, give the CFG to a client to enumerate
    • Can apply other context-free technology
      • Choose the shortest string
      • Reduce to a finite set of unpumped strings (context-free pumping lemma)
      • Choose the most probable (for fluency, not grammaticality)
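For example, choosing the shortest string is a simple fixed-point computation over the packed CFG; here is a minimal Python sketch (toy grammar, not actual XLE generation output):

def shortest_yields(rules):
    """rules: {nonterminal: [tuple of symbols, ...]}; symbols not in rules
    are terminals. Iterate to a fixed point of shortest known yields."""
    best = {}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in rules.items():
            for alt in alternatives:
                if all(sym in best or sym not in rules for sym in alt):
                    s = " ".join(best[sym] if sym in rules else sym
                                 for sym in alt)
                    if nt not in best or len(s) < len(best[nt]):
                        best[nt] = s
                        changed = True
    return best

rules = {"S":  [("NP", "VP")],
         "NP": [("John",), ("the", "N")],
         "N":  [("dog",)],
         "VP": [("sleeps",), ("sleeps", "soundly")]}
print(shortest_yields(rules)["S"])   # John sleeps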
Generating from incomplete f-structures
  • Grammatical features can’t be read from
    • Back-end question-answering logic
    • F-structure translated from other language
  • Generating from a bounded underspecification of a complete f-structure is still context-free
    • Example: a skeleton of predicates
    • Proof: CFL’s are closed under union, bounded extensions produce finite alternatives
  • Generation from arbitrary underspecification is undecidable
    • Reduces to undecidable emptiness problem (= Hilbert’s 10th)(Dymetman, van Noord, Wedekind, Roach)
A (light-weight?) approach to QA

Pipeline: Question → Parse → F-structure → Generate → Queries → Search → Answers

Analyze the question; anticipate and search for possible answer phrases.

  • Question: What is the graph partitioning problem?
    • Generated Queries: “The graph partitioning problem is *”
    • Answer (Google): The graph partitioning problem is defined as dividing a graph into disjoint subsets of nodes …
  • Question: When were the Rolling Stones formed?
    • Generated Queries: “The Rolling Stones were formed *”, “* formed the Rolling Stones *”
    • Answer (Google): Mick Jagger, Keith Richards, Brian Jones, Bill Wyman, and Charlie Watts formed the Rolling Stones in 1962.

Pipeline for Answer Anticipation

Question → Parser (English grammar) → Question f-structures → Convert → Answer f-structures → Generator (English grammar) → Answer Phrases → Search (Google...)

Pargram project
  • Large-scale LFG grammars for several languages
    • English, German, Japanese, French, Norwegian
    • Coming along: Korean, Urdu, Chinese, Arabic, Welsh, Malagasy, Danish
    • Intuition + Corpus: Cover real uses of language--newspapers, documents, etc.
  • Parallelism: test LFG universality claims
    • Common c- to f-structure mapping conventions

(unless typologically motivated variation)

    • Similar underlying f-structures

Permits shared disambiguation properties, Glue interpretation premises

    • Practical: all grammars run on XLE software
  • International consortium of world-class linguists
    • PARC, Stuttgart, Fuji Xerox, Konstanz, Bergen, Copenhagen, Oxford, Dublin City University, PIEAS…
    • Full week meetings, twice a year
    • Contributions to linguistics and comp-ling: books and papers
    • Each group is self-funded, self-managed
Pargram goals
  • Practical
    • Create grammatical resources for NL applications
      • translation, question answering, information retrieval, ...
    • Develop discipline of grammar engineering
      • what tools, techniques, conventions make it easy to develop and maintain broad-coverage grammars?
      • how long does it take?
      • how much does it cost?
  • Theoretical
    • Refine and guide LFG theory through broad coverage of multiple languages
    • Refine and guide XLE algorithms and implementation
Pargram grammars

              German   English*   French   Japanese (Korean)
#Rules           251        388      180       56
#States        3,239     13,655    3,422      368
#Disjuncts    13,294     55,725   16,938    2,012

* English allows for shallow markup: labeled bracketing, named entities

Why Norwegian and Japanese?

Engineering assessment: given mature system, parallel grammar specs.

How hard is it?

  • Norwegian: best case
    • Well-trained LFG linguists
    • Users of previous PARC software
    • Closely related to existing Pargram languages
  • Japanese: worst case
    • One computer scientist, one traditional Japanese linguist--no LFG experience
    • Typologically different language
    • Character sets, typographical conventions

Conclusion: not that hard

For both languages: good coverage, accuracy in ~2 person years

Engineering results

  • Grammars and Lexicons
  • Grammar writer’s cookbook (Butt et al., 1999)
  • New practical formal devices
    • Complex categories for efficiency: NP[nom] vs. NP with (↑ CASE)=NOM
    • Optimality marks for robustness: enlarge the grammar without being overrun by peculiar analyses
    • Lexical priority: merging different lexicons
  • Integration of off-the-shelf morphology: from Inxight (based on earlier PARC research) and Kyoto
Accuracy and coverage

Riezler et al., 2002

  • WSJ F scores for English Pargram grammar
    • Produces dependencies, not labeled trees
    • Stochastic model trained on sections 2-22
    • Tested on dependencies for 700 sentences in section 23
    • Robustness: some output for every input

(Named Entities seem to bump these by ~3%)

“Meridian will pay a premium of $30.5 million to assume $2 billion in deposits.”

subj(assume~7, pro~8),

number($~9, billion~17),

adjunct($~9, in~11),

num($~9, pl),

pers($~9, 3),

adjunct_type(in~11, nominal),

obj(in~11, deposit~12),

num(deposit~12, pl),

pers(deposit~12, 3),

adjunct(billion~17, 2~19),

number_type(billion~17, cardinal),

number_type(2~19, cardinal),

obj(of~23, $~24),

number($~24, million~4),

num($~24, pl),

pers($~24, 3),

number_type(30.5~28, cardinal)

mood(pay~0, indicative),

tense(pay~0, fut),

adjunct(pay~0, assume~7),

obj(pay~0, premium~3),

stmt_type(pay~0, declarative),

subj(pay~0, Meridian~5),

det_type(premium~3, indef),

adjunct(premium~3, of~23),

num(premium~3, sg),

pers(premium~3, 3),

adjunct(million~4, 30.5~28),

number_type(million~4, cardinal),

num(Meridian~5, sg),

pers(Meridian~5, 3),

obj(assume~7, $~9),

stmt_type(assume~7, purpose),

Accuracy and coverage
  • Japanese Pargram grammar
    • ~97% coverage on large corpora
      • 10,000 newspaper sentences (EDR)
      • 460 copier manual sentences
      • 9,637 customer-relations sentences
    • F-scores against 200 hand-annotated sentences from newspaper corpus:
      • Best: 87%
      • Average: 80%

Recall: Grammar constructed with ~2 person-years of effort

(compare: Effort to create an annotated training corpus)

Sources of Brittleness
  • Vocabulary problems
    • Gaps in coverage, neologisms, terminology
    • Incorrect entries, missing frames…
  • Missing constructions
    • No theoretical guidance (or interest)

(e.g. dates, company names)

    • Core constructions overlooked
      • Intuition and corpus both limited
  • Ungrammatical input
    • Real world text is not perfect
    • Sometimes it’s horrendous
  • Strict performance limits (XLE parameters)
Real world input

  • Other weak blue-chip issues included Chevron, which went down 2 to 64 7/8 in Big Board composite trading of 1.3 million shares; Goodyear Tire & Rubber, off 1 1/2 to 46 3/4, and American Express, down 3/4 to 37 1/4. (WSJ, section 13)
  • “The croaker's done gone from the hook” (WSJ, section 13)
  • (SOLUTION 27000 20) Without tag P-248 the W7F3 fuse is located in the rear of the machine by the charge power supply (PL3 C14 item 15. (Copier repair tip)

LFG entries from Finite-State Morphologies

  • Broad-coverage inflectional transducers:

falls → fall +Noun +Pl
        fall +Verb +Pres +3sg
Mary → Mary +Prop +Giv +Fem +Sg
vienne → venir +SubjP +SG {+P1|+P3} +Verb

  • For listed words, the transducer provides
    • canonical stem form
    • inflectional information
On-the-fly LFG entries

  • “-unknown” head-word matches unrecognized stems
  • Grammar writer defines -unknown and affixes:

-unknown N (↑ PRED)=‘%stem’ (↑ NTYPE)=common;
         V (↑ PRED)=‘%stem<SUBJ,OBJ>’.   (transitive)
+Noun N-AFX (↑ PERS)=3.
+Pl   N-AFX (↑ NUM)=pl.
+Pres V-AFX (↑ TENSE)=present.
+3sg  V-AFX (↑ SUBJ PERS)=3 (↑ SUBJ NUM)=sg.

  • Pieces assembled by sublexical rules:

NOUN → N N-AFX*.
VERB → V V-AFX*.
Guessing for unlisted words

  • Use an FST guesser for general patterns
    • Capitalized words can be proper nouns
      • Saakashvili → Saakashvili +Noun +Proper +Guessed
    • -ed words can be past tense verbs or adjectives
      • fumped → fump +Verb +Past +Guessed
        fumped +Adj +Deverbal +Guessed
  • Languages with richer morphology allow better guessers
Subcategorization and Argument Mapping?
  • Transitive, intransitive, inchoative…
    • Not related to inflection
    • Can’t be inferred from shallow data
  • Fill in gaps from external sources
    • Machine readable dictionaries
    • Other resources: VerbNet, WordNet, FrameNet, Cyc
    • Not always easy, not always reliable
      • Current research
Grammatical failures

Fall-back approach

  • First try to get a complete analysis
    • Prefer standard rules, but
    • Allow for anticipated errors

E.g. subject/verb disagree, but interpretation is obvious

    • Optimality-theory marks to prefer standard analyses
  • If fail, enlarge grammar, try again
    • Build up fragments that get complete sub-parses (c-structure and f-structure)
    • Allow tokens that can’t be chunked
    • Link chunks and tokens in a single f-structure
Fall-back grammar for fragments
  • Grammar writer specifies REPARSECAT
    • Alternative c-structure root if no complete parse
    • Allows for fragments and linking
  • Grammar writer specifies possible chunks
    • Categories (e.g. S, NP, VP but not N, V)
    • Looser expansions
  • Optimality theory: grammar writer specifies marks to
    • Prefer standard rules over anticipated errors
    • Prefer the parse with fewest chunks
    • Disprefer using tokens over chunks
Example

“The the dog appears.”

Analyzed as

  • “token” the
  • sentence “the dog appears”
F-structure
  • Many chunks have useful analyses
  • XLE/LFG degrades to shallow parsing in worst case
Robustness summary
  • External resources for incomplete lexical entries
    • Morphologies, guessers, taggers
    • Current work: Verbnet, Wordnet, Framenet, Cyc
    • Order by reliability
  • Fall back techniques for missing constructions
    • Dispreferred rules
    • Fragment grammar
  • Current WSJ evaluation:
    • 100% coverage, ~85% full parses
    • F-score (esp. recall) declines for fragment parses
Finding the most probable parse
  • XLE produces many candidates
    • All valid (with respect to grammar and OT marks)
    • Not all equally likely
    • Some applications are ambiguity enabled (defer selection)
    • … But some require a single best guess
  • Grammar writers have only coarse preference intuitions
    • Many implicit properties of words and structures with unclear significance
  • Appeal to probability model to choose best parse
    • Assume: previous experience is a good guide for future decisions
    • Collect corpus of training sentences
    • Build probability model that optimizes for previous good results
    • Apply model to choose best analysis of new sentences
Issues
  • What kind of probability model?
  • What kind of training data?
  • Efficiency of training, disambiguation?
  • Benefit vs. random choice of parse?
    • Random is awful for treebank grammars
    • Hard LFG constraints restrict to plausible candidates
Probability model
  • Conventional models: stochastic branching process
    • Hidden Markov models
    • Probabilistic Context-Free grammars
  • Sequence of decisions, each independent of previous decisions, each choice having a certain probability
    • HMM: Choose from outgoing arcs at a given state
    • PCFG: Choose from alternative expansions of a given category
  • Probability of an analysis = product of choice probabilities
  • Efficient algorithms
    • Training: forward/backward, inside/outside
    • Disambiguation: Viterbi
  • Abney 1997 and others: Not appropriate for LFG, HPSG…
    • Choices are not independent: Information from different CFG branches interacts through f-structure
    • Relative-frequency estimator is inconsistent
Exponential models are appropriate (aka Log-linear models)
  • Assign probabilities to representations, not to choices in a derivation
  • No independence assumption
  • Arithmetic combined with human insight
    • Human:
      • Define properties of representations that may be relevant
      • Based on any computable configuration of f-structure features, trees
    • Arithmetic:
      • Train to figure out the weight of each property
Stochastic Disambiguation in XLE: All parses → Most probable

Discriminative ranking

  • Conditional log-linear model on c/f-structure pairs:

P(x | s) = exp(λ · f(x)) / Z(s)

where x is a parse of string s, f is a vector of feature values for x, λ is a vector of feature weights, and Z(s) is the normalizer over all parses of s.

  • Discriminative estimation of λ from partially labeled data (Riezler et al. ACL’02)
  • Combined l1-regularization and feature selection
    • Avoid over-fitting, choose best features (Riezler & Vasserman, EMNLP’04)
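A minimal Python sketch of the conditional log-linear computation (toy feature values and weights, not the trained XLE model):

import math

def parse_probabilities(feature_vectors, weights):
    """One feature vector per candidate parse of s; returns P(x | s)."""
    scores = [math.exp(sum(w * v for w, v in zip(weights, fv)))
              for fv in feature_vectors]
    z = sum(scores)                       # Z(s): normalizer over all parses
    return [score / z for score in scores]

weights = [0.9, -1.2]                     # lambda: one weight per property
candidates = [[1.0, 0.0],                 # parse 1 exhibits property 1
              [0.0, 1.0]]                 # parse 2 exhibits property 2
print(parse_probabilities(candidates, weights))   # parse 1 is preferred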
Coarse training data for XLE

“Correct” parses are consistent with weak annotation:

Considering/VBG (NP the naggings of a culture imperative), (NP-SBJ I) promptly signed/VBD up.

Compare with full PTB annotation:

(S (S-ADV (NP-SBJ (-NONE- *-1))
      (VP (VBG Considering)
          (NP (NP (DT the) (NNS naggings))
              (PP (IN of) (NP (DT a) (NN culture) (NN imperative))))))
   (, ,)
   (NP-SBJ-1 (PRP I))
   (VP (ADVP-MNR (RB promptly)) (VBD signed) (PRT (RB up)))
   (. .))

  • Sufficient for disambiguation, not for grammar induction
Classes of properties
  • C-structure nodes and subtrees
    • indicating certain attachment preferences
  • Recursively embedded phrases
    • indicating high vs. low attachment
  • F-structure attributes
    • presence of grammatical functions
  • Atomic attribute-value pairs in f-structure
    • particular feature values
  • Left/right branching behavior of c-structures
  • (Non)parallelism of coordinations in c- and f-structures
  • Lexical elements
    • tuples of head words, argument words, grammatical relations

~60,000 candidate properties, ~1000 selected

Some properties and weights

0.937481 cs_embedded VPv[pass] 1

-0.126697 cs_embedded VPv[perf] 3

-0.0204844 cs_embedded VPv[perf] 2

-0.0265543 cs_right_branch

-0.986274 cs_conj_nonpar 5

-0.536944 cs_conj_nonpar 4

-0.0561876 cs_conj_nonpar 3

0.373382 cs_label ADVPint

-1.20711 cs_label ADVPvp

-0.57614 cs_label AP[attr]

-0.139274 cs_adjacent_label DATEP PP

-1.25583 cs_adjacent_label MEASUREP PPnp

-0.35766 cs_adjacent_label NPadj PP

-0.00651106 fs_attrs 1 OBL-COMPAR

0.454177 fs_attrs 1 OBL-PART

-0.180969 fs_attrs 1 ADJUNCT

0.285577 fs_attr_val DET-FORM the

0.508962 fs_attr_val DET-FORM this

0.285577 fs_attr_val DET-TYPE def

0.217335 fs_attr_val DET-TYPE demon

0.278342 lex_subcat achieve OBJ,SUBJ,VTYPE SUBJ,OBL-AG,PASSIVE=+

0.00735123 lex_subcat acknowledge COMP-EX,SUBJ,VTYPE

Efficiency
  • Property counts
    • Associated with AND/OR tree of XLE contexts (a1, b2)
      • Detectors may add new nodes to tree: conjoined contexts
    • Shared among many parses
  • Training
    • Dynamic programming algorithm applied to AND/OR tree
      • Avoids unpacking of individual parses (Miyao and Tsujii HLT’02)
      • Similar to inside-outside algorithm of PCFG
    • Fast algorithm for choosing best properties
    • Can train on only sentences with relatively low ambiguity
      • Shorter, perhaps easier to annotate
    • 5 hours to train over WSJ (given file of parses)
  • Disambiguation
    • Viterbi algorithm applied to Boolean tree
    • 5% of parse time to disambiguate
    • 30% gain in F-score from random-parse baseline
Shallow mark-up of input strings
  • Part-of-speech tags (tagger?)

I/PRP saw/VBD her/PRP duck/VB.

I/PRP saw/VBD her/PRP$ duck/NN.

  • Named entities (named-entity recognizer)

<person>General Mills</person> bought it.

<company>General Mills</company> bought it

  • Syntactic brackets (chunk parser?)

[NP-S I] saw [NP-O the girl with the telescope].

[NP-S I] saw [NP-O the girl] with the telescope.

Hypothesis
  • Shallow mark-up
    • Reduces ambiguity
    • Increases speed
    • Without decreasing accuracy
    • (Helps development)
  • Issues
    • Markup errors may eliminate correct analyses
    • Markup process may be slow
    • Markup may interfere with existing robustness mechanisms (optimality, fragments, guessers)
    • Backoff may restore robustness but decrease speed in 2-pass system
Implementation in XLE

Two parallel pipelines, integrated with minimal changes to the existing system/grammar:

  • Plain: Input string → Tokenizer (FST) → Morphology (FST) → LFG grammar → c-structure/f-structure
  • Marked-up: Marked-up string → Tokenizer (FST, plus POS/NE converter) → Morphology (FST, plus POS filter) → LFG grammar (plus bracket metarule, NE sublexical rule) → c-structure/f-structure
Comparison: Shallow vs. Deep parsing

HLT, 2004

  • Popular myth
    • Shallow statistical parsers are fast, robust… and useful
    • Deep grammar-based parsers are slow and brittle
  • Is this true? Comparison on predicate-argument relations, not phrase-trees
    • Needed for meaning-sensitive applications (= usefulness) (translation, question answering… but maybe not IR)
    • Collins (1999) parser: state-of-the-art, marks arguments (for a fair test, wrote special code to make relations explicit--not so easy)
    • LFG/XLE with morphology, named-entities, disambiguation
    • Measured time, accuracy against the PARC 700 Gold Standard
    • Results:
      • Collins is a bit faster than LFG/XLE
      • LFG/XLE makes somewhat fewer errors, provides more useful detail
XLE System
  • Parser/generator for LFG grammars: multilingual
  • Composition with finite-state transductions
  • Careful ambiguity-management implementation
    • Preserves context-free locality in equational disjunctions
    • Exports ambiguity-enabling interfaces

Efficient implementation of clause-conjunction (C1∧C2)

  • Log-linear disambiguation
    • Appropriate for LFG representations
    • Ambiguity-enabled theory and implementation
  • Robustness: shallow in the worst-case
  • Scales to broad-coverage grammars, long sentences
  • Semantic interface: Glue
LFG/XLE: Current issues
  • Induction of LFG grammars from treebanks
    • Basic work in ParGram: Dublin City University
    • Principles of generalization, for human extension, combination with manual grammar

DCU + PARC

  • Large grammars for more language typologies
    • E.g. verb initial: Welsh, Malagasy, Arabic
  • Reduce performance variance; why not linear?
    • Competence vs. performance: limit center embedding?
    • Investigate speed/accuracy trade-off
  • Embedding in applications: XLE as a black box
    • Question answering(!), Translation, Sentence condensation …
    • Develop, combine with other ambiguity-enabled modules

Reasoning, transfer-rewriting…

Matching for Question Answering

Question → Parser (English grammar) → F-structure → Semantics
Sources → Parser (English grammar) → F-structure → Semantics

An Overlap detector compares the two semantic representations and yields the Answer.
Logical & Collocational Semantics
  • Logical Semantics
    • Map sentences to logical representations of meaning
    • Enables inference & reasoning
  • Collocational semantics
    • Represent word meanings as feature vectors
    • Typically obtained by statistical corpus analysis
    • Good for indexing, classification, language modeling, word sense disambiguation
    • Currently does not enable inference
  • Complementary, not conflicting, approaches
Example Semantic Representation

“The wire broke”

Syntax (f-structure):

PRED break<SUBJ>
TENSE past
SUBJ [PRED wire, SPEC def, NUM sg]

Semantics (logical form):

∃w. wire(w) & w=part25 &
∃t. interval(t) & t<now &
∃e. break_event(e) & occurs_during(e,t) &
    object_of_change(e,w) &
    ∃c. cause_of_change(e,c)

  • The f-structure gives basic predicate-argument structure, but lacks:
    • Standard logical machinery (variables, connectives, etc.)
    • Implicit arguments (events, causes)
    • Contextual dependencies (the wire = part25)
  • The mapping from f-structure to logical form is systematic, but non-trivial
Glue Semantics (Dalrymple, Lamping & Saraswat 1993 and subsequently)
  • Syntax-semantics mapping as linear logic inference
  • Two logics in semantics:
    • Meaning Logic (target semantic representation)

any suitable semantic representation

    • Glue Logic (deductively assembles target meaning)

fragment of linear logic

  • Syntactic analysis produces lexical glue premises
  • Semantic interpretation uses deduction to assemble final meaning from these premises
Linear Logic

  • Influential development in theoretical computer science (Girard 87)
  • Premises are resources consumed in inference (traditional logic: premises are non-resourced)

Traditional                        Linear
A, A→B ⊨ B                         A, A⊸B ⊨ B
A, A→B ⊨ A&B (A re-used)           A, A⊸B ⊭ A⊗B (A consumed)
A, B ⊨ B (A discarded)             A, B ⊭ B (cannot discard A)

  • Linguistic processing is typically resource sensitive
    • Words/meanings are used exactly once
Glue Interpretation (Outline)

  • Parsing a sentence instantiates lexical entries to produce lexical glue premises
  • Example lexical premise (the verb “saw” in “John saw Fred”):

see : g ⊸ (h ⊸ f)

Meaning term: see, a 2-place predicate. Glue formula: g ⊸ (h ⊸ f), where g, h, f are constituents in the parse: “consume the meanings of g and h to produce the meaning of f”.

  • Glue derivation: Γ ⊢ M : f
    • Consume all the lexical premises Γ
    • to produce a meaning, M, for the entire sentence, f
Glue Interpretation: Getting the premises

Syntactic analysis of “John saw Fred”: c-structure S → NP [John] VP [V saw, NP Fred]; f-structure f: [PRED see, SUBJ g: [PRED John], OBJ h: [PRED Fred]].

Lexicon:

John NP john : ↑
Fred NP fred : ↑
saw  V  see : (↑ SUBJ) ⊸ ((↑ OBJ) ⊸ ↑)

Instantiated premises:

john : g
fred : h
see : g ⊸ (h ⊸ f)

Glue Interpretation: Deduction with premises

Premises:

john : g
fred : h
see : g ⊸ (h ⊸ f)

Linear logic derivation, using linear modus ponens:

g ⊸ (h ⊸ f)    g
h ⊸ f          h
f

Derivation with meaning terms (linear modus ponens = function application):

see : g ⊸ (h ⊸ f)    john : g
see(john) : h ⊸ f     fred : h
see(john)(fred) : f
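A minimal Python sketch of this deduction (toy representation, not a full linear-logic prover: a glue formula is either an atomic constituent name or a pair (antecedent, consequent) standing for ⊸):

def derive(premises):
    """Apply linear modus ponens until no rule applies; premises are
    (meaning, glue) pairs and are consumed as linear resources."""
    agenda = list(premises)
    changed = True
    while changed:
        changed = False
        for fun in list(agenda):
            for arg in list(agenda):
                if isinstance(fun[1], tuple) and fun[1][0] == arg[1]:
                    agenda.remove(fun)
                    agenda.remove(arg)                # both resources consumed
                    meaning = f"{fun[0]}({arg[0]})"   # function application
                    agenda.append((meaning, fun[1][1]))
                    changed = True
                    break
            if changed:
                break
    return agenda

premises = [("john", "g"), ("fred", "h"),
            ("see", ("g", ("h", "f")))]               # see : g -o (h -o f)
print(derive(premises))    # [('see(john)(fred)', 'f')]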

Modus Ponens = Function Application: The Curry-Howard Isomorphism

Fun : g ⊸ f    Arg : g
Fun(Arg) : f

Curry-Howard isomorphism: pairs linear-logic inference rules with operations on meaning terms.

  • Propositional linear logic inference constructs meanings
  • LL inference is completely independent of the meaning language (modularity of meaning representation)
Semantic Ambiguity: Multiple derivations from a single set of premises

“Alleged criminal from London”; f-structure f: [PRED criminal, MODS {alleged, from London}]

Premises:

criminal : f
alleged : f ⊸ f
from-London : f ⊸ f

Two distinct derivations:

1. from-London(alleged(criminal))
2. alleged(from-London(criminal))
Semantic Ambiguity & Modifiers

  • Multiple derivations from a single premise set
    • Arise through different ways of permuting modifiers around a skeleton
  • Modifiers are given formal representation in glue as identities
    • E.g. an adjective is a noun ⊸ noun modifier
  • Modifiers are prevalent in natural language, and lead to combinatorial explosion
    • Given N f ⊸ f modifiers, there are N! ways of permuting them around the f skeleton
Ambiguity management in semantics
  • Efficient theorem provers that manage combinatorial explosion of modifiers
    • Packing of N! analyses
      • Represent all N! analyses in polynomial space
      • Compute representation in polynomial time
      • Free choice: Read off any given analysis in linear time
    • Packing through structure re-use
      • N! analyses through combinations of N sub-analyses
      • Compute each sub-analysis once, and re-use
Parc Linguistic Environment

A multidimensional architecture:

  • Applications: Translation, Question Answering, Dialog, Condensation, Knowledge tracking, Email Routing, Email Response
  • Operations: Parse, Generate, Select, Transfer, Interpret
  • Theories: Glue Semantics, LFG Syntax, FS Morphology
  • Tableware (grammars): English, French, German, Japanese, Urdu, Norwegian
  • Cross-cutting concerns: Ambiguity Management, Scale, Modularity, Robustness
  • Layers: Theory (Mathematics, Algorithms), Software (Programs, Data structures, Models, parameters)