Learning TFC Meeting, SRI, March 2005: On the Collective Classification of Email “Speech Acts”
Learning TFC Meeting, SRI March 2005 On the Collective Classification of Email “Speech Acts” Vitor R. Carvalho & William W. Cohen Carnegie Mellon University Classifying Email into Acts From EMNLP-04, Learning to Classify Email into Speech Acts , Cohen-Carvalho-Mitchell

Learning TFC Meeting, SRI March 2005

On the Collective Classification of Email “Speech Acts”

Vitor R. Carvalho & William W. Cohen

Carnegie Mellon University


Classifying Email into Acts

  • From EMNLP-04, Learning to Classify Email into Speech Acts, Cohen-Carvalho-Mitchell
  • An Act is described as a verb-noun pair (e.g., propose meeting, request information). Not all pairs make sense. A single email message may contain multiple acts.
  • The goal is to describe commonly observed behaviors, rather than all possible speech acts in English. Non-linguistic uses of email (e.g., delivery of files) are also included.
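The verb-noun representation above can be sketched as a small data structure. This is only an illustration: the `VERBS` and `NOUNS` sets are hypothetical, partial vocabularies built from the examples mentioned in the slides, and the `Act` and `Message` classes are names introduced here, not part of the original work.

```python
from dataclasses import dataclass, field

# Hypothetical, partial vocabularies; the slides name only a few examples.
VERBS = {"request", "propose", "commit", "deliver"}
NOUNS = {"meeting", "information", "data"}


@dataclass(frozen=True)
class Act:
    """An email act as a (verb, noun) pair."""
    verb: str
    noun: str

    def __post_init__(self):
        # Reject pairs built from words outside the (assumed) vocabularies.
        if self.verb not in VERBS or self.noun not in NOUNS:
            raise ValueError(f"unknown act: {self.verb} {self.noun}")


@dataclass
class Message:
    """A message may carry several acts at once, so acts is a set."""
    msg_id: str
    text: str
    acts: set = field(default_factory=set)


msg = Message("m1", "Can we meet on Friday? Also, please send me the report.")
msg.acts.add(Act("propose", "meeting"))
msg.acts.add(Act("request", "information"))
```

Storing the acts as a set reflects the multi-label nature of the task: each act is decided separately for each message.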

[Figure: taxonomy of email-act Verbs and Nouns]


Idea: Predicting Acts from Surrounding Acts

  • Strong correlation with the previous and next message’s acts
  • An act has little or no correlation with other acts of the same message

[Figure: Example of Email Sequence. Messages linked by In-ReplyTo, labeled with the acts Delivery, Request, Proposal and Commit]


Related work on the Sequential Nature of Negotiations

  • Winograd and Flores, 1986: “Conversation for Action Structure”
  • Murakoshi et al., 1999: “Construction of Deliberation Structure in Email”
Data: CSPACE Corpus
  • Few large, free, natural email corpora are available
  • CSPACE corpus (Kraut & Fussell)
    • Emails associated with a semester-long project for Carnegie Mellon MBA students in 1997
    • 15,000 messages from 277 students, divided into 50 teams (4 to 6 students/team)
    • Rich in task negotiation.
    • More than 1500 messages (from 4 teams) were labeled in terms of “Speech Act”.
    • One of the teams was double labeled, and the inter-annotator agreement ranges from 72 to 83% (Kappa) for the most frequent acts.
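As a reminder of how a Kappa agreement score is computed, here is a minimal sketch. It assumes Cohen's kappa for two annotators who each assign one label per message for a given act; the slides do not say which kappa variant was used, and the function name `cohen_kappa` is introduced here for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels (1 = act present, 0 = act absent); not taken from the corpus.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In the toy example the annotators agree on 3 of 4 messages (0.75), chance agreement is 0.5, and kappa is therefore 0.5.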
Evidence of Sequential Correlation of Acts
  • Transition diagram for most common verbs from CSPACE corpus
  • It is NOT a Probabilistic DFA
  • Act sequence patterns: (Request, Deliver+), (Propose, Commit+, Deliver+), (Propose, Deliver+). The most common act was Deliver.
  • Less regularity than expected (considering previous deterministic negotiation state diagrams)
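A transition diagram of this kind can be estimated by counting which act follows which in each thread. The sketch below is a simplification that assumes one act per message and uses toy threads written to mirror the patterns listed above; it is not the procedure or the data behind the original diagram.

```python
from collections import Counter, defaultdict


def transition_probabilities(threads):
    """Estimate P(next act | previous act) from sequences of act labels."""
    counts = defaultdict(Counter)
    for thread in threads:
        # Pair each message's act with the act of the message that follows it.
        for prev_act, next_act in zip(thread, thread[1:]):
            counts[prev_act][next_act] += 1
    return {
        prev: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for prev, ctr in counts.items()
    }


# Toy threads, one act per message (illustrative only).
threads = [
    ["Request", "Deliver", "Deliver"],
    ["Propose", "Commit", "Deliver"],
    ["Propose", "Deliver"],
]
probs = transition_probabilities(threads)
```

With these toy threads, a Request is always followed by Deliver, while a Propose is followed by Commit or Deliver with equal frequency.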
Content versus Context

[Figure: a message with unknown acts (???) shown between its Parent message and Child message, whose acts (Request, Proposal, Delivery, Commit) are known]

  • Content: Bag of Words features only
  • Context: Parent and Child features only (table below)
  • 8 MaxEnt classifiers, trained on the 3F2 team dataset and tested on the 1F3 team dataset
  • Only the 1st child message was considered (the vast majority, more than 95%, of messages have at most one child)
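One way to picture the context-only setting is as a feature vector built purely from the acts of the parent and the first child message. The sketch below is hypothetical: the `ACTS` list and the feature names are assumptions made for illustration and are not the feature set from the table referred to above.

```python
# Illustrative subset of act labels; the real label set is larger.
ACTS = ["Request", "Propose", "Commit", "Deliver"]


def context_features(parent_acts, child_acts, acts=ACTS):
    """Binary features describing the acts of the parent and first child.

    parent_acts / child_acts are sets of act labels; pass an empty set
    when the message has no parent or no child.
    """
    features = {}
    for act in acts:
        features[f"parent_{act}"] = int(act in parent_acts)
        features[f"child_{act}"] = int(act in child_acts)
    return features


# A message whose parent was a Request and whose first child was a Deliver.
feats = context_features({"Request"}, {"Deliver"})
```

A dictionary like `feats` could then be fed to any maximum-entropy (logistic regression) learner, with one binary classifier per act.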

Kappa Values on 1F3 using Relational (Context) features and Textual (Content) features.

Set of Context Features (Relational)

Collective Classification using Dependency Networks

Dependency networks are probabilistic graphical models in which the full joint distribution of the network is approximated with a set of conditional distributions that can be learned independently. The conditional probability distributions in a DN are calculated for each node given its neighboring nodes (its Markov blanket).
  • No acyclicity constraint. Simple parameter estimation – approximate inference (Gibbs sampling)
  • In this case, Markov blanket = parent message and child message
  • Heckerman et al., JMLR-2000. Neville & Jensen, KDD-MRDM-2003.
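The inference step can be sketched as a Gibbs-style loop over the messages of a thread, for a single binary act. This is a minimal illustration under stated assumptions, not the authors' exact procedure: `content_prob` and `full_prob` are hypothetical callables standing in for trained classifiers, there is no burn-in period, and the marginals are simply averaged over all iterations.

```python
import random


def collective_classify(messages, parent, child, content_prob, full_prob,
                        iterations=50, seed=0):
    """Gibbs-style collective inference for one binary act.

    messages: list of message ids
    parent, child: dict mapping a message id to its parent / first child id
    content_prob(m): P(act | content of m), used for initialisation
    full_prob(m, parent_label, child_label): P(act | content, neighbour acts);
        a neighbour label is None when the neighbour does not exist
    Returns a dict of estimated marginal probabilities per message.
    """
    rng = random.Random(seed)
    # Initialise every label from the content-only classifier.
    labels = {m: content_prob(m) >= 0.5 for m in messages}
    totals = {m: 0 for m in messages}
    for _ in range(iterations):
        for m in messages:
            # The Markov blanket is the parent and the child message.
            parent_label = labels.get(parent.get(m))
            child_label = labels.get(child.get(m))
            p = full_prob(m, parent_label, child_label)
            labels[m] = rng.random() < p  # resample given the neighbours
            totals[m] += labels[m]
    return {m: totals[m] / iterations for m in messages}


# Toy two-message thread with deterministic stand-in classifiers.
msgs = ["m1", "m2"]
parent_of = {"m2": "m1"}
child_of = {"m1": "m2"}
marginals = collective_classify(
    msgs, parent_of, child_of,
    content_prob=lambda m: 0.9 if m == "m1" else 0.4,
    full_prob=lambda m, p, c: 1.0 if m == "m1" or p else 0.0,
)
```

In the toy run, `m2` looks negative from content alone (0.4), but it is resampled as positive on every iteration because its parent `m1` carries the act, which is the kind of label propagation collective classification is meant to capture.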
Agreement versus Iteration
  • Kappa versus iteration on 1F3 team dataset, using classifiers trained on 3F2 team data.
Leave-one-team-out Experiments

[Figure: Kappa Values, bag-of-words only versus collective classification]

  • 4 teams: 1F3 (170 msgs), 2F2 (137 msgs), 3F2 (249 msgs) and 4F4 (165 msgs)
  • x-axis = bag-of-words only
  • y-axis = collective classification results
  • Different teams present different styles for negotiations and task delegation.
Leave-one-team-out Experiments
  • Consistent improvement of Commissive, Commit and Meet acts

[Figure: Kappa Values]

Leave-one-team-out Experiments
  • Deliver and dData performance usually decreases
  • Associated with data distribution, FYI, file sharing, etc.
  • For “non-delivery” acts, the improvement in average Kappa is statistically significant (p=0.01 on a two-tailed t-test)

[Figure: Kappa Values]

Act by Act Comparative Results

Kappa values with and without collective classification, averaged over the four test sets in the leave-one-team-out experiment.

Discussion and Conclusion
  • Sequential patterns of email acts were observed in the CSPACE corpus.
  • These patterns, when studied in an artificial experiment, were shown to contain valuable information for the email-act classification problem.
  • Different teams present different styles for negotiations and task delegation.
  • We proposed a collective classification scheme for the Email Speech Acts of messages, based on a Dependency Network model.
Conclusion
  • Modest improvements over the baseline (bag of words) were observed on acts related to negotiation (Request, Commit, Propose, Meet, etc.). A performance deterioration was observed for Delivery/dData (acts less associated with negotiations).
  • Agrees with general intuition on the sequential nature of negotiation steps.
  • The degree of linkage in our dataset is small, which makes the observed results encouraging.