Shallow Semantics



Semantics and Pragmatics

High-level Linguistics (the good stuff!)

Semantics: the study of meaning that can be determined from a sentence, phrase or word.

Pragmatics: the study of meaning, as it depends on context (speaker, situation, dialogue history).


Language to (Simplistic) Logic

  • John went to the book store.

    go(John, store1)

  • John bought a book.

    buy(John, book1)

  • John gave the book to Mary.

    give(John, book1, Mary)

  • Mary put the book on the table.

    put(Mary, book1, on table1)
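
The sketch below shows one way such predicate-argument facts might be represented in code; the Predication class and its printed form are illustrative choices, not part of any particular system.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Predication:
        """A predicate applied to an ordered list of arguments."""
        predicate: str
        arguments: List[str]

        def __str__(self) -> str:
            return f"{self.predicate}({', '.join(self.arguments)})"

    facts = [
        Predication("go", ["John", "store1"]),
        Predication("buy", ["John", "book1"]),
        Predication("give", ["John", "book1", "Mary"]),
        Predication("put", ["Mary", "book1", "on table1"]),
    ]

    for fact in facts:
        print(fact)   # go(John, store1) ... put(Mary, book1, on table1)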


What’s missing?

  • Word sense disambiguation

  • Quantification

  • Coreference

  • Interpreting within a phrase

  • Many, many more issues …

  • But it’s still more than you get from parsing!



Some problems in shallow semantics

  • Identifying entities

    • noun-phrase chunking

    • named-entity recognition

    • coreference resolution

      (involves discourse/pragmatics too)

  • Identifying relationship names

    • Verb-phrase chunking

    • Predicate identification (step 0 of semantic role labeling)

    • Synonym resolution (e.g., get = receive)

  • Identifying arguments to predicates

    • Information extraction

    • Argument identification (step 1 of semantic role labeling)

  • Assigning semantic roles (step 2 of semantic role labeling)

  • Sentiment classification

    • That is, does the relationship express an opinion?

    • If so, is the opinion positive or negative?



1. Identifying Entities

Named Entity Tagging: Identify all the proper names in a text.

[Person Sally] went to see [Film Up in the Air] at the local theater.

Noun Phrase Chunking: Find all base noun phrases
(that is, noun phrases that don’t have smaller noun phrases nested inside them).

[Sally] went to see [Up in the Air] at [the local theater] on [Elm Street].
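
As a concrete illustration, here is a minimal base-NP chunker using NLTK's RegexpParser; the POS tags are written in by hand so the example runs without tagger models, and the chunk grammar is a common textbook pattern, not the one any particular system uses.

    import nltk

    tagged = [("Sally", "NNP"), ("went", "VBD"), ("to", "TO"), ("see", "VB"),
              ("Up", "NNP"), ("in", "IN"), ("the", "DT"), ("Air", "NNP"),
              ("at", "IN"), ("the", "DT"), ("local", "JJ"), ("theater", "NN"),
              ("on", "IN"), ("Elm", "NNP"), ("Street", "NNP"), (".", ".")]

    # Base NP = optional determiner, any adjectives, one or more nouns.
    grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
    chunker = nltk.RegexpParser(grammar)
    tree = chunker.parse(tagged)

    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP"):
        print(" ".join(word for word, tag in subtree.leaves()))
    # Sally / Up / the Air / the local theater / Elm Street
    # (a simple base chunker splits the title "Up in the Air" apart)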



1. Identifying Entities (2)

Parsing: Identify all phrase constituents, which will of course include all noun phrases.

The slide’s parse tree, shown here as a bracketing:

    (S (NP Sally)
       (VP (V saw)
           (NP Up in the Air)
           (PP (P at)
               (NP (NP the theater)
                   (PP (P on) (NP Elm St.))))))
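
A small sketch that reads the bracketing above with NLTK and lists every NP constituent; the bracketing itself is a plausible reconstruction of the slide's tree, not a gold-standard parse.

    from nltk.tree import Tree

    parse = Tree.fromstring(
        "(S (NP Sally) (VP (V saw) (NP Up in the Air)"
        " (PP (P at) (NP (NP the theater) (PP (P on) (NP Elm St.))))))"
    )

    # Every NP, from largest to smallest.
    for np in parse.subtrees(filter=lambda t: t.label() == "NP"):
        print(" ".join(np.leaves()))
    # Sally / Up in the Air / the theater on Elm St. / the theater / Elm St.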



1. Identifying Entities (3)

Coreference Resolution: Identify all references (aka ‘mentions’) of people, places and things in text, and determine which mentions are ‘co-referential’.

John stuck his foot in his mouth.   (“John” and both occurrences of “his” refer to the same person.)



2. Identifying relationship names

Verb phrase chunking: the most common approach

Some issues:

  1. Often, prepositions/particles “belong” with the relation name:

     You’re ticking me off.

  2. Many relationships are expressed without a verb:

     Jack Welch, CEO of GE, …

  3. Some verbs don’t really express a meaningful relationship by themselves:

     Jim is the father of 12 boys.

  4. Verb sense disambiguation

  5. Synonymy:

     ticking off = bothering



2. Identifying relationship names (2)

Synonym Resolution:

Discovery of Inference Rules from Text (DIRT) (Lin and Pantel, 2001)

1. They collect millions of (Subject, Verb, Object) triples by parsing a Web corpus.

2. For a pair of verbs, v1 and v2, they compute mutual-information scores between:

   - the vector space model (VSM) for the subjects of v1 and the VSM for the subjects of v2

   - the VSM for the objects of v1 and the VSM for the objects of v2

3. They cluster verbs with high MI scores between them.

See (Yates and Etzioni, JAIR 2009) for a more recent approach using probabilistic models.
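
A toy sketch of the underlying idea: compare two verbs by the distributional similarity of their argument fillers. The triples below are invented, and raw counts with cosine similarity stand in for DIRT's mutual-information statistics.

    from collections import Counter
    from math import sqrt

    triples = [
        ("John", "buy", "book"), ("Mary", "buy", "car"), ("Sue", "buy", "house"),
        ("John", "purchase", "book"), ("Ann", "purchase", "car"),
        ("Bob", "purchase", "ticket"),
        ("John", "eat", "apple"), ("Mary", "eat", "soup"),
    ]

    def object_vector(verb):
        # Distribution over the objects a verb takes in the triples.
        return Counter(o for s, v, o in triples if v == verb)

    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in a)
        norm = lambda c: sqrt(sum(v * v for v in c.values()))
        return dot / (norm(a) * norm(b)) if a and b else 0.0

    print(cosine(object_vector("buy"), object_vector("purchase")))  # ~0.67: similar fillers
    print(cosine(object_vector("buy"), object_vector("eat")))       # 0.0: no shared objects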



5. Sentiment Classification

Given a review (about a movie, hotel, Amazon product, etc.), a sentiment classification system tries to determine what opinions are expressed in the review.

Coarse-level objective: is the review positive, negative, or neutral overall?

Fine-grained objective: what are the positive aspects (according to the reviewer), and what are the negative aspects?

Question: what technique(s) would you use to solve these two problems?
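
One plausible answer for the coarse-level objective is a standard bag-of-words classifier; the sketch below uses scikit-learn with a made-up four-review training set, purely for illustration.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    reviews = ["great movie, loved it", "terrible plot and awful acting",
               "what a wonderful film", "boring and way too long"]
    labels = ["pos", "neg", "pos", "neg"]

    # Bag-of-words features fed to a logistic regression classifier.
    clf = make_pipeline(CountVectorizer(), LogisticRegression())
    clf.fit(reviews, labels)
    print(clf.predict(["an awful, boring movie"]))   # ['neg']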



Semantic Role Labeling

a.k.a., Shallow Semantic Parsing



Semantic role labeling is the computational task of assigning semantic roles to phrases.

It’s usually divided into three subtasks:

  • Predicate identification

  • Argument identification

  • Argument classification (assigning semantic roles)

  • John broke the window with a hammer.

    John:    B-Arg   (Agent)
    broke:   Pred
    the:     B-Arg   (Patient)
    window:  I-Arg
    with:    B-Arg   (Means, or instrument)
    a:       I-Arg
    hammer:  I-Arg
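
A short sketch that turns the token-level tags above into labeled argument spans; the tag and role lists are transcribed from the example, and the span-collection logic is an illustrative convention.

    tokens = ["John", "broke", "the", "window", "with", "a", "hammer"]
    tags   = ["B-Arg", "Pred", "B-Arg", "I-Arg", "B-Arg", "I-Arg", "I-Arg"]
    roles  = ["Agent", None, "Patient", None, "Means", None, None]

    spans, current = [], None
    for tok, tag, role in zip(tokens, tags, roles):
        if tag == "B-Arg":                  # a new argument starts here
            current = [role, [tok]]
            spans.append(current)
        elif tag == "I-Arg" and current:    # continue the open argument
            current[1].append(tok)
        else:                               # Pred (or O): close any open span
            current = None

    for role, words in spans:
        print(role, "->", " ".join(words))
    # Agent -> John / Patient -> the window / Means -> with a hammer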



Same event - different sentences

  • John broke the window with a hammer.

  • John broke the window with the crack.

  • The hammer broke the window.

  • The window broke.


Same event - different syntactic frames

  • [SUBJ John] [VERB broke] [OBJ the window] [MODIFIER with a hammer].

  • [SUBJ John] [VERB broke] [OBJ the window] [MODIFIER with the crack].

  • [SUBJ The hammer] [VERB broke] [OBJ the window].

  • [SUBJ The window] [VERB broke].


Semantic role example

  • break(AGENT, INSTRUMENT, PATIENT)

  • [AGENT John] broke [PATIENT the window] with [INSTRUMENT a hammer].

  • [INSTRUMENT The hammer] broke [PATIENT the window].

  • [PATIENT The window] broke.

    • Fillmore 1968, “The Case for Case”



Shallow Semantics

  • [AGENT / SUBJ John] broke [PATIENT / OBJ the window] with [INSTRUMENT / MODIFIER a hammer].

  • [INSTRUMENT / SUBJ The hammer] broke [PATIENT / OBJ the window].

  • [PATIENT / SUBJ The window] broke.



Semantic roles

Semantic roles (or just roles) are slots, belonging to a predicate, which arguments can fill.

- There are different naming conventions, but one common set of role names is agent, patient, means/instrument, ….

Some constraints:

1. Only certain kinds of phrases can fill certain kinds of semantic roles:

   “with a crack” will never be an agent.

   But many phrases are ambiguous: is “hammer” a patient or an instrument?

2. Syntax provides a clue, but it is not the full answer:

   Does the subject map to Agent? Patient? Instrument?


Slot Filling

Argument Classification:

    Phrases             Slots
    John            →   Agent
    broke           →   Pred
    the window      →   Patient
    with a hammer   →   Means (or instrument)


Slot Filling

Argument Classification:

    Phrases         Slots
    The hammer  →   Means (or instrument)
    broke       →   Pred
    the window  →   Patient
                    Agent (unfilled)


Slot Filling

Argument Classification:

    Phrases         Slots
    The window  →   Patient
    broke       →   Pred
                    Agent (unfilled)
                    Means (or instrument) (unfilled)


Slot Filling and Shallow Semantics

    Phrases             Slots
    John            →   Agent
    broke           →   Pred
    the window      →   Patient
    with a hammer   →   Means (or instrument)

Shallow Semantics:  broke(John, the window, with a hammer)


Slot Filling and Shallow Semantics

    Phrases         Slots
    The window  →   Patient
    broke       →   Pred
                    Agent (unfilled)
                    Means (or instrument) (unfilled)

Shallow Semantics:  broke( ?x , the window, ?y )
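
A tiny sketch of the mapping shown on the last two slides: once slots are filled, the shallow-semantic form is just the predicate applied to its arguments in a fixed role order. The function name and the role order are assumptions made for illustration.

    def shallow_form(pred, slots, order=("Agent", "Patient", "Means")):
        # Unfilled slots become placeholders.
        args = [slots.get(role, "?") for role in order]
        return f"{pred}({', '.join(args)})"

    print(shallow_form("broke", {"Agent": "John", "Patient": "the window",
                                 "Means": "with a hammer"}))
    # broke(John, the window, with a hammer)

    print(shallow_form("broke", {"Patient": "the window"}))
    # broke(?, the window, ?)   (compare broke( ?x , the window, ?y ))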



Semantic Role Labeling Techniques

We’ll cover 3 approaches to SRL

  • Basic (Gildea and Jurafsky, Comp. Ling. 2002)

  • Joint inference for argument structure (Toutanova et al., Comp. Ling. 2008)

  • Open-domain (Huang and Yates, ACL 2010)



1. Gildea and Jurafsky

Main idea: start with a parse tree, and try to identify the constituents that are arguments.



G&J (1)

Build a (probabilistic) classifier for predicting:

- for each constituent, which role is it?

- Essentially, a maximum-entropy classifier, although it’s not described that way

Features for Argument Classification:

  • Phrase type of constituent

  • Governing category of NPs – S or VP (differentiates between subjects and objects)

  • Position w.r.t. predicate (before or after)

  • Voice of predicate (active or passive verb)

  • Head word of constituent

  • Parse tree path between predicate and constituent



G&J (2) – Parse Tree Path Feature

Parse tree path (or just path) feature:

Determines the syntactic relationship between predicate and current constituent.

For example, the path from the predicate verb up to the clause node and back down to the subject NP is:

VB ↑ VP ↑ S ↓ NP
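
A runnable sketch of the path feature over a hand-written parse of the running example "John broke the window with a hammer"; the helper below is illustrative and is not G&J's own implementation.

    from nltk.tree import ParentedTree

    tree = ParentedTree.fromstring(
        "(S (NP (NNP John)) (VP (VB broke) (NP (DT the) (NN window))"
        " (PP (IN with) (NP (DT a) (NN hammer)))))"
    )

    def path_feature(pred, const):
        # Chain of ancestors from a node up to the root (node itself included).
        def ancestors(node):
            chain = [node]
            while node.parent() is not None:
                node = node.parent()
                chain.append(node)
            return chain
        up, down = ancestors(pred), ancestors(const)
        # Lowest common ancestor, compared by identity rather than tree equality.
        common = next(n for n in up if any(n is m for m in down))
        i = next(k for k, n in enumerate(up) if n is common)
        j = next(k for k, n in enumerate(down) if n is common)
        up_labels = [n.label() for n in up[:i + 1]]
        down_labels = [n.label() for n in reversed(down[:j])]
        return " ↑ ".join(up_labels) + (" ↓ " + " ↓ ".join(down_labels) if down_labels else "")

    predicate = tree[1][0]   # the (VB broke) node
    subject = tree[0]        # the (NP (NNP John)) node
    print(path_feature(predicate, subject))   # VB ↑ VP ↑ S ↓ NP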



G&J (3)

4086 possible values of the Path feature in training data.

A sparse feature!



G&J (4)

Build a (probabilistic) classifier for predicting:

- for each constituent, is it an argument of the predicate at all?

- Essentially, a maximum-entropy classifier, although it’s not described that way

Features for Argument Identification:

  • Predicate word

  • Head word of constituent

  • Parse tree path between predicate and constituent



G&J (5): Results



2. Toutanova, Haghighi, and Manning

A Global Joint Model for SRL (Comp. Ling., 2008)

Main idea(s):

Include features that depend on multiple arguments

Use multiple parsers as input, for robustness



THM (1): Motivation

1. “The day that the ogre cooked the children is still remembered.”

2. “The meal that the ogre cooked the children is still remembered.”

Both sentences have identical syntax.

They differ in only 1 word (day vs. meal).

If we classify arguments one at a time, “the children” will be labeled the same way in both cases.

But in (1), “the children” is the Patient (thing being cooked).

And in (2), “the children” is the Beneficiary (people for whom the cooking is done).

Intuitively, we can’t classify these arguments independently.



THM(2): Features

  • Whole label sequence

    • [voice:active, Arg1, pred, Arg4, ArgM-TMP]

    • [voice:active, lemma:accelerated, Arg1, pred, Arg4, ArgM-TMP]

    • [voice:active, lemma:accelerated, Arg1, pred, Arg4] (no adjuncts)

    • [voice:active, lemma:accelerated, Arg, pred, Arg] (no adjuncts, no #s)

  • Syntax and semantics in the label sequence

    • [voice:active, NP-Arg1, pred, PP-Arg4]

    • [voice:active, lemma:accelerated, NP-Arg1, pred, PP-Arg4]

  • Repetition features: whether Arg1 (for example) appears multiple times
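
A rough sketch of how the whole-label-sequence templates above might be generated from one candidate argument frame; the function and its inputs are illustrative, not the paper's code.

    def label_sequence_features(voice, lemma, labels, adjunct_prefix="ArgM"):
        core = [l for l in labels if not l.startswith(adjunct_prefix)]   # drop adjuncts
        anon = ["Arg" if l.startswith("Arg") else l for l in core]       # drop argument numbers
        return [
            ("voice:" + voice, *labels),
            ("voice:" + voice, "lemma:" + lemma, *labels),
            ("voice:" + voice, "lemma:" + lemma, *core),   # no adjuncts
            ("voice:" + voice, "lemma:" + lemma, *anon),   # no adjuncts, no numbers
        ]

    for feat in label_sequence_features("active", "accelerated",
                                        ["Arg1", "pred", "Arg4", "ArgM-TMP"]):
        print(feat)
    # ('voice:active', 'Arg1', 'pred', 'Arg4', 'ArgM-TMP')
    # ('voice:active', 'lemma:accelerated', 'Arg1', 'pred', 'Arg4', 'ArgM-TMP')
    # ('voice:active', 'lemma:accelerated', 'Arg1', 'pred', 'Arg4')
    # ('voice:active', 'lemma:accelerated', 'Arg', 'pred', 'Arg')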



THM(3): Classifier

  • First, for each sentence, obtain the top-10 most likely parse tree/semantic role label outputs from G&J

  • Build a max-ent classifier to select from these 10, using the features above

  • Also, include top-10 parses from the Charniak parser



THM(4): Results

These are on a different data set from G&J, so the results are not directly comparable. But the local model is similar to G&J, so think of that as the point of comparison.

Results show F1 scores for identification and classification of arguments together.

WSJ is the Wall Street Journal test set, a collection of approximately 4,000 news sentences.

Brown is a smaller collection of fiction stories.

The system is trained on a separate set of WSJ sentences.



3. Huang and Yates

Open-Domain SRL by Modeling Word Spans, ACL 2010

Main Idea:

One of the biggest problems for SRL systems is that they need lexical features to classify arguments, but lexical features are sparse.

We build a simple SRL system that outperforms the previous state-of-the-art on out-of-domain data, by learning new lexical representations.


Simple, open-domain SRL

Baseline features, shown for each token of “Chris broke the window with a hammer”:

    Word      SRL label       dist. from predicate   Chunk tag   POS tag
    Chris     Breaker         -1                     B-NP        Proper Noun
    broke     Pred             0                     B-VP        Verb
    the       Thing Broken    +1                     B-NP        Det.
    window    Thing Broken    +2                     I-NP        Noun
    with      Means           +3                     B-PP        Prep.
    a         Means           +4                     B-NP        Det.
    hammer    Means           +5                     I-NP        Noun
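
A sketch that produces the baseline per-token features in the table above; Penn Treebank tags stand in for the coarse POS names on the slide, and the SRL labels are the targets the classifier learns to predict, so they are not included as features.

    tokens = ["Chris", "broke", "the", "window", "with", "a", "hammer"]
    pos    = ["NNP", "VBD", "DT", "NN", "IN", "DT", "NN"]
    chunk  = ["B-NP", "B-VP", "B-NP", "I-NP", "B-PP", "B-NP", "I-NP"]
    pred_index = 1                                  # position of "broke"

    features = [
        {"word": w, "pos": p, "chunk": c, "dist": i - pred_index}
        for i, (w, p, c) in enumerate(zip(tokens, pos, chunk))
    ]
    for f in features:
        print(f)    # e.g. {'word': 'Chris', 'pos': 'NNP', 'chunk': 'B-NP', 'dist': -1}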


Simple, open-domain SRL

Baseline + HMM features: the same per-token table as above, with an additional “HMM label” row giving each token’s label under the learned lexical representation (an HMM); the label values themselves are not shown on the slide.



The importance of paths

Chris [predicate broke] [thing broken a hammer].

Chris [predicate broke] a window with [means a hammer].

Chris [predicate broke] the desk, so she fetched [not an arg a hammer] and nails.


Simple, open-domain SRL

Baseline + HMM + Paths: a “Word path” feature is added, the hyphen-joined sequence of words strictly between the predicate and the token being classified (None when there are none):

    Word      Word path
    Chris     None
    broke     None
    the       None
    window    the
    with      the-window
    a         the-window-with
    hammer    the-window-with-a


Simple, open-domain SRL

Baseline + HMM + Paths: a “POS path” feature is added alongside the word path:

    Word      Word path            POS path
    Chris     None                 None
    broke     None                 None
    the       None                 None
    window    the                  Det
    with      the-window           Det-Noun
    a         the-window-with      Det-Noun-Prep
    hammer    the-window-with-a    Det-Noun-Prep-Det
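
A sketch of the word-path and POS-path features in the table above: the sequence of tokens strictly between the predicate and the word being classified, or None when there is nothing in between (following the table's convention). Penn Treebank tags stand in for the coarse POS names.

    tokens = ["Chris", "broke", "the", "window", "with", "a", "hammer"]
    pos    = ["NNP", "VBD", "DT", "NN", "IN", "DT", "NN"]
    pred_index = 1

    def span_path(seq, i, pred_index):
        if i <= pred_index + 1:                 # nothing strictly in between
            return "None"
        return "-".join(seq[pred_index + 1:i])

    for i, tok in enumerate(tokens):
        print(tok, span_path(tokens, i, pred_index), span_path(pos, i, pred_index))
    # hammer  the-window-with-a  DT-NN-IN-DT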


Simple, open-domain SRL

Baseline + HMM + Paths: finally, an “HMM path” feature is added, the analogue of the word and POS paths computed over the tokens’ HMM labels (the label values themselves are not shown on the slide).



Experimental results – F1

All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown).





Span-HMMs


Span-HMM features

[Figure: the Span-HMM feature for “hammer”, computed from the span between the predicate “broke” and “hammer” in “Chris broke the window with a hammer” (Breaker: Chris, Pred: broke, Thing Broken: the window, Means: with a hammer).]


Span-HMM features

[Figure: the same example, now showing the Span-HMM feature for “a” in “Chris broke the window with a hammer”.]


Span-HMM features

[Figure: Span-HMM feature values across “Chris broke the window with a hammer”; some tokens receive the value None.]



Experimental results – SRL F1

All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown).



Experimental results – feature sparsity



Benefit grows with distance from predicate

