Using Query Patterns to Learn the Durations of Events


Andrey Gusev, joint work with Nate Chambers, Pranav Khaitan, Divye Khilnani, Steven Bethard, Dan Jurafsky




Examples of Event Durations

  • Talk to a friend – minutes
  • Driving – hours
  • Study for an exam – days
  • Travel – weeks
  • Run a campaign – months
  • Build a museum – years

Why are we interested in durations?

  • Event Understanding
    • Duration is an important aspectual property
    • Can help build timelines and events
  • Event coreference
    • Duration may be a cue that events are coreferent
      • Gender (learned from the web) helps nominal coreference
  • Integration into search products
    • Query: “healthy sleep time for age groups”
    • Query: “president term length in [country x]”

How can we learn event durations?

Approach 1: Supervised System

Dataset (Pan et al., 2006)

Labeled 58 documents from TimeBank with event durations

Average of minimum and maximum labeled durations

A Brooklyn woman who was watching her clothes dry in a laundromat.

Min duration – 5 min

Max Duration – 1 hour

Average – 1950 seconds
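
The label in the laundromat example can be reproduced with a little arithmetic. This is a minimal sketch; the helper name `average_duration_seconds` is an assumption for illustration and does not come from the slides.

```python
def average_duration_seconds(min_seconds: float, max_seconds: float) -> float:
    """Return the midpoint of the annotated minimum and maximum durations."""
    return (min_seconds + max_seconds) / 2


# Laundromat example from the slide: minimum 5 minutes, maximum 1 hour.
label = average_duration_seconds(5 * 60, 60 * 60)
print(label)  # 1950.0
```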

Original Features (Pan et al., 2006)

Event Properties

Event token, lemma, POS tag

Subject and Object

Head word of syntactic subject and objects of the event, along with their lemmas and POS tags.


WordNet hypernyms for the event, its subject and its object.

Starting from the first synset of each lemma, three hypernyms were extracted from the WordNet hierarchy.

New Features

Event Attributes

Tense, aspect, modality, event class

Named Entity Class of Subjects and Objects

Person, organization, location, or other.

Typed Dependencies

Binary feature for each typed dependency

Reporting Verbs

Binary feature for reporting verbs (say, report, reply, etc.)

Limitations of the Supervised Approach

Need explicitly annotated datasets

Sparse and limited data

Limited to the annotated domain

Low inter-annotator agreement

More than a day vs. less than a day – 87.7%

Duration buckets – 44.4%

Approximate duration buckets – 79.8%

Overcoming Supervised Limitations

Statistical Web Count approach

Lots of text/data that can be used

Not limited to the annotated domain

Implicit annotations from many sources

Hearst (1998), Ji and Lin (2009)


How can we learn event durations?

Approach 2: Statistical Web Counts

Terms: Duration Buckets and Distributions

“talked for * seconds” – 1,638 hits

“talked for * minutes” – 61,816 hits

“talked for * hours” – 68,370 hits

“talked for * days” – 4,361 hits

“talked for * weeks” – 3,754 hits

“talked for * months” – 5,157 hits

“talked for * years” – 103,336 hits

[Chart: hits per duration bucket]
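
To make the step from hit counts to a distribution concrete, here is a minimal sketch. It assumes the hit counts on this slide pair with the patterns in the order listed; the function name `to_distribution` is hypothetical.

```python
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]

# Hit counts for "talked for * <bucket>", paired with buckets in slide order
# (the pairing is an assumption based on that order).
talked_hits = [1638, 61816, 68370, 4361, 3754, 5157, 103336]


def to_distribution(hits):
    """Normalize raw hit counts so that they sum to 1."""
    total = sum(hits)
    if total == 0:
        raise ValueError("no hits for any bucket")
    return [h / total for h in hits]


dist = to_distribution(talked_hits)
for bucket, p in zip(BUCKETS, dist):
    print(f"{bucket:8s} {p:.3f}")
```

On these raw counts the "years" bucket takes the largest share, which is the effect the later slides on base distributions set out to correct.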


Two Duration Prediction Tasks

Coarse grained prediction

“Less than a day” or “Longer than a day”

Fine grained prediction

Second, minute, hour, etc.

Yesterday Pattern for Coarse Grained Task

<eventpast> yesterday

<eventpastp> yesterday

eventpast = past tense

eventpastp = past progressive tense

Normalize yesterday event pattern counts with counts of event occurrence in general

Average the two ratios

Find threshold on the training set

Example: “to say” with Yesterday Pattern

“said yesterday” – 14,390,865 hits

“said” – 1,693,080,248 hits

“was saying yesterday” – 29,626 hits

“was saying” – 14,167,103 hits

Average Ratio = 0.0053
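
The average ratio for “to say” can be checked directly from the four hit counts above. This is a minimal sketch; the function name `yesterday_ratio` is an assumption for illustration.

```python
def yesterday_ratio(past_yesterday, past_total, prog_yesterday, prog_total):
    """Average the normalized "yesterday" counts of the two tense patterns."""
    past_ratio = past_yesterday / past_total  # "<eventpast> yesterday" / "<eventpast>"
    prog_ratio = prog_yesterday / prog_total  # "<eventpastp> yesterday" / "<eventpastp>"
    return (past_ratio + prog_ratio) / 2


# Hit counts for "said" and "was saying" from the slide.
ratio = yesterday_ratio(14_390_865, 1_693_080_248, 29_626, 14_167_103)
print(round(ratio, 4))  # 0.0053
```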


Fine Grained Durations from Web Counts


  • How long does the event “X” last?
  • Ask the web:
    • “X for * seconds”
    • “X for * minutes”
  • Output distribution over time units

Not All Time Units are Equal

  • Need to look at the base distribution
    • “for * seconds”
    • “for * minutes”
  • In habituals, etc. people like to say “for years”

Conditional Frequencies for Buckets


  • Divide
    • “X for * seconds”
  • By
    • “for * seconds”
  • Reduce credit for seeing “X for years”
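
The division described above can be sketched as follows. The event counts are the “talked for * <bucket>” hits from the earlier slide (paired in slide order); the base counts for “for * <bucket>” are hypothetical placeholders, since the slides do not give them, and the final renormalization is also an assumption.

```python
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def conditional_distribution(event_hits, base_hits):
    """Divide event-pattern hits by base-pattern hits, then renormalize."""
    if len(event_hits) != len(base_hits):
        raise ValueError("event and base counts must cover the same buckets")
    ratios = [e / b if b > 0 else 0.0 for e, b in zip(event_hits, base_hits)]
    total = sum(ratios)
    return [r / total for r in ratios] if total > 0 else ratios


# "talked for * <bucket>" hits from the slides.
event_hits = [1638, 61816, 68370, 4361, 3754, 5157, 103336]
# HYPOTHETICAL "for * <bucket>" base counts, chosen only for illustration.
base_hits = [2_000_000, 20_000_000, 15_000_000, 10_000_000,
             8_000_000, 12_000_000, 60_000_000]

cond = conditional_distribution(event_hits, base_hits)
raw_years_share = event_hits[-1] / sum(event_hits)
print(f"years share: raw {raw_years_share:.3f}, conditional {cond[-1]:.3f}")
```

With a large base count for “for * years”, the share of the “years” bucket drops, which is the intended reduction in credit.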
Double Peak Distribution
  • Two interpretations
    • Durative
    • Iterative
  • The distributions reflect this with two peaks

Merging Patterns


  • Multiple patterns
  • Distributions averaged
  • Reduces noise from individual patterns
  • A pattern needs more than 100 and fewer than 100,000 hits
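
The merging step above might look like the sketch below. It assumes that the hit threshold applies to a pattern's total hits across all buckets and that each pattern is normalized before averaging; the slides do not spell out either detail. All counts are hypothetical.

```python
MIN_HITS = 100      # a pattern must have more than this many hits
MAX_HITS = 100_000  # and fewer than this many


def merge_patterns(pattern_hits):
    """Average the per-bucket distributions of all patterns that pass the threshold.

    pattern_hits maps a pattern name to its list of per-bucket hit counts.
    Returns None when no pattern qualifies.
    """
    kept = []
    for hits in pattern_hits.values():
        total = sum(hits)
        if MIN_HITS < total < MAX_HITS:
            kept.append([h / total for h in hits])
    if not kept:
        return None
    return [sum(column) / len(kept) for column in zip(*kept)]


# HYPOTHETICAL counts for one event (buckets: seconds ... years).
patterns = {
    "<eventpast> for * <bucket>": [10, 200, 500, 100, 50, 40, 100],
    "<eventpastp> for * <bucket>": [0, 30, 60, 5, 3, 2, 0],  # 100 hits: dropped
    "spent * <bucket> <eventger>": [20, 400, 1000, 300, 100, 80, 100],
}
merged = merge_patterns(patterns)
print([round(p, 3) for p in merged])
```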

Fine Grained Patterns

  • Used Patterns
    • <eventpast> for * <bucket>
    • <eventpastp> for * <bucket>
    • spent * <bucket> <eventger>
  • Patterns not used
    • <eventpast> in * <bucket>
    • takes * <bucket> to <event>
    • <eventpast> last <bucket>


  • TimeBank annotations (Pan, Mulkar and Hobbs 2006)
    • Coarse Task: Greater or less than a day
    • Fine Task: Time units (seconds, minutes, hours, …, years)
      • Counted as correct if within 1 time unit
    • Baseline: Majority Class
      • Fine Grained – months
      • Coarse Grained – greater than a day
  • Compare with re-implementation of supervised (Pan, Mulkar and Hobbs 2006)
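
The fine-grained scoring rule (“correct if within 1 time unit”) can be sketched as below. The function names and the example predictions are hypothetical and only illustrate the rule.

```python
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def fine_correct(predicted, gold):
    """A prediction counts as correct if it is at most one bucket away from gold."""
    return abs(BUCKETS.index(predicted) - BUCKETS.index(gold)) <= 1


def fine_accuracy(predictions, golds):
    """Fraction of predictions that fall within one time unit of the gold label."""
    correct = sum(fine_correct(p, g) for p, g in zip(predictions, golds))
    return correct / len(golds)


# HYPOTHETICAL gold labels and predictions.
golds = ["minutes", "hours", "days", "years"]
preds = ["hours", "hours", "months", "months"]
acc = fine_accuracy(preds, golds)
print(acc)  # 0.75
```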
New Split for TimeBank Dataset

Train – 1664 events (714 unique verbs)

Test – 471 events (274 unique verbs)

TestWSJ – 147 events (84 unique verbs)

Split info is available at

Web Counts System Scoring

Fine grained

Smooth over the adjacent buckets and select top bucket

$\mathrm{score}(b_i) = b_{i-1} + b_i + b_{i+1}$

Coarse grained

“Yesterday” classifier with a threshold (t = 0.002)

Use fine grained approach

Select coarse grained bucket based on fine grained bucket
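
The fine-grained scoring and the fine-to-coarse mapping can be sketched as follows. Two assumptions are not stated on the slides: a missing neighbor at either end of the bucket list is treated as zero, and the buckets below “days” map to “less than a day”. The raw “talked for” hits from the earlier slide stand in for a real merged distribution.

```python
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def smooth(values):
    """Score each bucket as the sum of itself and its adjacent buckets."""
    n = len(values)
    return [
        (values[i - 1] if i > 0 else 0)        # assumption: no left neighbor -> 0
        + values[i]
        + (values[i + 1] if i + 1 < n else 0)  # assumption: no right neighbor -> 0
        for i in range(n)
    ]


def top_bucket(values):
    """Return the bucket name with the highest smoothed score."""
    scores = smooth(values)
    return BUCKETS[scores.index(max(scores))]


def coarse_from_fine(bucket):
    """Assumption: seconds, minutes and hours count as 'less than a day'."""
    if BUCKETS.index(bucket) < BUCKETS.index("days"):
        return "less than a day"
    return "longer than a day"


talked_hits = [1638, 61816, 68370, 4361, 3754, 5157, 103336]
fine = top_bucket(talked_hits)
coarse = coarse_from_fine(fine)
print(fine, "->", coarse)  # hours -> less than a day
```

Smoothing changes the outcome here: the single largest raw count is “years”, but the neighboring “minutes” and “hours” buckets together outweigh it.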



Web counts perform as well as the fully supervised system

Backoff Statistics (“Spent” Pattern)

Events in training dataset

Had at least 10 hits


Effect of the Event Context

  • Supervised classifiers use context in their features
  • Web counts system doesn’t use context of the events
    • Significantly fewer hits when including context
    • More hits without context give better accuracy than fewer hits with context
  • What is the effect of subject/object context on the understanding of event duration?

Can humans do this task without context?

Human Annotation: Mechanical Turk


MTurk Setup

  • 10 MTurk workers for each event
  • Without the context
    • Event – choice for each duration bucket
  • With the context
    • Event with subject/object – choice for each duration bucket

Web counts vs. Turk distributions

[Figures: web-count and MTurk duration distributions for “said”, “looking”, and “considering”]


Results: Mechanical Turk Annotations

  • Compare accuracy
    • Event with context
    • Event without context

Context significantly improves accuracy of MTurk annotations


Event Duration Lexicon

  • Distributions for 1000 most frequent verbs from the NYT portion of the Gigaword with 10 most frequent grammatical objects of each verb
  • Due to thresholds not all the events have distributions
  • Example entry: EVENT=to use, ID=e13-7, OBJ=computer, DISTR=[0.009;0.337;0.238;0.090;0.130;0.103;0.092;0.002;]
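
A lexicon entry like the example above could be read with a few lines of Python. This is a sketch under assumptions: it treats the entry as a single line of comma-separated `KEY=value` fields, which may not match the released file layout, and the function name `parse_entry` is hypothetical. The slides do not name the buckets behind the eight DISTR values, so the sketch leaves them unlabeled.

```python
import re


def parse_entry(line):
    """Parse 'KEY=value' fields; DISTR becomes a list of floats."""
    # A value is either a bracketed list or text up to the next comma.
    pairs = re.findall(r"(\w+)=([^,\[]+|\[[^\]]*\])", line)
    fields = {key: value.strip() for key, value in pairs}
    distr = fields.get("DISTR", "[]").strip("[]")
    fields["DISTR"] = [float(x) for x in distr.split(";") if x.strip()]
    return fields


entry = parse_entry(
    "EVENT=to use, ID=e13-7, OBJ=computer, "
    "DISTR=[0.009;0.337;0.238;0.090;0.130;0.103;0.092;0.002;]"
)
print(entry["EVENT"], entry["OBJ"], entry["DISTR"])
```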


  • We learned aspectual information from the web
  • Event durations from the web counts are as accurate as a supervised system
  • Web counts are domain-general and work well even without context
  • New lexicon with 1000 most frequent verbs with 10 most frequent objects
  • MTurk suggests that context can improve accuracy of event duration annotation