1
On the Collective Classification of Email “Speech Acts”
Vitor R. Carvalho & William W. Cohen
Carnegie Mellon University
SIGIR, August 2005, Salvador, Brazil
2
Outline
1. Email “Speech Acts” and Applications
2. Sequential Nature of Negotiations
3. Collective Classification and Results
3
Classifying Email into Acts [Cohen, Carvalho & Mitchell, EMNLP-04]
- An Act is a verb-noun pair (e.g., propose meeting).
- A single email message may contain multiple acts. Not all pairs make sense.
- Try to describe commonly observed behaviors, rather than all possible speech acts.
- Also include non-linguistic usage of email (delivery of files).
- Most of the acts can be learned (EMNLP-04).
[Figure: Verbs and Nouns]
4
Email Acts - Applications
- Email overload: improved email clients.
- Negotiating/managing shared tasks is a central use of email.
- Tracking commitments, delegations, pending answers.
- Integrating to-do/task lists into email, etc.
- Iterative Learning of Email Tasks and Speech Acts [Kushmerick & Khoussainov, 2005]
- Predicting Social Roles and Group Leadership [Leuski, 2004] [Carvalho et al., in progress]
5
Idea: Predicting Acts from Surrounding Acts
[Figure: Example of Email Thread Sequence; message labels: Delivery, Request, Commit, Proposal, Request, Commit, Delivery, Commit, Delivery]
- An act has little or no correlation with the other acts of the same message.
- An act has strong correlation with the acts of the previous and next messages.
6
Related work on the Sequential Nature of Negotiations
- [Winograd and Flores, 1986] “Conversation for Action Structure”
- [Murakoshi et al., 1999] “Construction of Deliberation Structure in Email”
7
Related work on the Sequential Nature of Negotiations
- [Kushmerick & Lau, 2005] “Learning the structure of interactions between buyers and e-commerce vendors”
8
Data: CSPACE Corpus
- Few large, free, natural email corpora are available.
- CSPACE corpus (Kraut & Fussell)
  o Emails associated with a semester-long project for Carnegie Mellon MBA students in 1997.
  o 15,000 messages from 277 students, divided into 50 teams (4 to 6 students per team).
  o Rich in task negotiation.
  o 1500+ messages (4 teams) had their “Speech Acts” labeled.
  o One of the teams was double labeled, and the inter-annotator agreement ranges from 72 to 83% (Kappa) for the most frequent acts.
9
Evidence of Sequential Correlation of Acts
[Figure: Transition diagram for the most common verbs in the CSPACE corpus]
- The diagram is NOT a probabilistic DFA.
- Act sequence patterns: (Request, Deliver+), (Propose, Commit+, Deliver+), (Propose, Deliver+).
- The most common act was Deliver.
- Less regularity than expected (considering previous deterministic negotiation state diagrams).
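A transition diagram like the one described on this slide can be approximated by counting how often one act follows another in a thread. The sketch below is only an illustration under simplifying assumptions: each message carries a single verb, threads are linear, and the example threads are hypothetical (they are not CSPACE data). The function names are made up for this example.

```python
from collections import Counter


def transition_counts(threads):
    """Count (previous_act, next_act) pairs over consecutive messages."""
    counts = Counter()
    for acts in threads:
        for previous_act, next_act in zip(acts, acts[1:]):
            counts[(previous_act, next_act)] += 1
    return counts


def transition_probabilities(counts):
    """Normalize the counts into estimates of P(next_act | previous_act)."""
    totals = Counter()
    for (previous_act, _), n in counts.items():
        totals[previous_act] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical threads, one act label per message (illustration only).
threads = [
    ["Request", "Deliver", "Deliver"],
    ["Propose", "Commit", "Deliver"],
    ["Propose", "Deliver"],
]
counts = transition_counts(threads)
probs = transition_probabilities(counts)
for (previous_act, next_act), p in sorted(probs.items()):
    print(f"{previous_act} -> {next_act}: {p:.2f}")
```

Because the outgoing probabilities of a state are estimated from data and a message can carry several acts at once, a table like this is a descriptive summary rather than a deterministic state machine.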
10
Content versus Context
- Content: bag-of-words features only.
- Context: parent and child features only (listed below).
- 8 MaxEnt classifiers, trained on the 3F2 team dataset and tested on the 1F3 team dataset.
- Only the 1st child message was considered (vast majority: more than 95%).
[Figure: Kappa values on 1F3 using relational (Context) features and textual (Content) features]
Set of Context Features (Relational)
Parent Boolean Features: Parent_Request, Parent_Deliver, Parent_Commit, Parent_Propose, Parent_Directive, Parent_Commissive, Parent_Meeting, Parent_dData
Child Boolean Features: Child_Request, Child_Deliver, Child_Commit, Child_Propose, Child_Directive, Child_Commissive, Child_Meeting, Child_dData
[Figure: Parent message and Child message; labels: Delivery, Request, Commit, Proposal, Request, ???]
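The context feature set above can be built mechanically from the acts of the parent message and the first child message. The following is a minimal sketch, assuming acts are available as sets of label strings; the function name `context_features` and the input representation are hypothetical, and only the feature names come from the slide.

```python
# Act names taken from the feature list on the slide.
ACTS = ["Request", "Deliver", "Commit", "Propose",
        "Directive", "Commissive", "Meeting", "dData"]


def context_features(parent_acts, child_acts):
    """Return the 16 boolean parent/child features for one message.

    parent_acts and child_acts are sets of act labels for the parent and
    the first child message; pass an empty set when a message has none.
    """
    features = {}
    for act in ACTS:
        features[f"Parent_{act}"] = act in parent_acts
        features[f"Child_{act}"] = act in child_acts
    return features


# Hypothetical example: the parent is a Request, the child is a Commit.
feats = context_features({"Request"}, {"Commit", "Commissive"})
print([name for name, value in feats.items() if value])
```

A context-only classifier would see just these 16 booleans, while a content-only classifier would see just the bag-of-words vector of the message itself.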
11
Dependency Network
- Dependency networks are probabilistic graphical models in which the full joint distribution of the network is approximated with a set of conditional distributions that can be learned independently. The conditional probability distributions in a DN are calculated for each node given its neighboring nodes (its Markov blanket).
- Approximate inference (Gibbs sampling).
- Markov blanket = parent message and child message.
- Heckerman et al., JMLR-2000. Neville & Jensen, KDD-MRDM-2003.
[Figure: Parent Message, Current Message, Child Message; each with acts Request, Commit, Deliver, …]
12
Collective Classification Procedure (based on Dependency Networks Model)
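The procedure diagram itself is not recoverable from this transcript. As a rough illustration of the general idea only, the sketch below starts from content-only predictions and then repeatedly re-predicts each message using the current acts of its parent and child. It is a deterministic, thresholded simplification rather than Gibbs sampling, and every name in it (`collective_classify`, `content_predict`, `combined_predict`, the toy models and their probabilities) is an assumption made for this example, not the authors' implementation.

```python
def collective_classify(messages, parent_of, child_of, content_predict,
                        combined_predict, acts, max_iterations=10,
                        threshold=0.5):
    """Iteratively relabel messages using their neighbors' current acts."""
    def decide(probs):
        return {act for act in acts if probs[act] >= threshold}

    # Step 1: initialize every message with content-only predictions.
    labels = {m: decide(content_predict(m)) for m in messages}

    # Step 2: re-predict with content plus context until nothing changes.
    for _ in range(max_iterations):
        changed = False
        for m in messages:
            parent_acts = labels.get(parent_of.get(m), set())
            child_acts = labels.get(child_of.get(m), set())
            new_acts = decide(combined_predict(m, parent_acts, child_acts))
            if new_acts != labels[m]:
                labels[m] = new_acts
                changed = True
        if not changed:
            break
    return labels


# Toy stand-ins for trained classifiers (hypothetical, illustration only).
text = {"m1": "could you send the report", "m2": "ok will do"}


def toy_content(m):
    return {"Request": 0.9 if "could you" in text[m] else 0.1,
            "Commit": 0.4 if "will do" in text[m] else 0.1}


def toy_combined(m, parent_acts, child_acts):
    probs = dict(toy_content(m))
    if "Request" in parent_acts:  # a reply to a request is more likely a commit
        probs["Commit"] = min(1.0, probs["Commit"] + 0.3)
    return probs


result = collective_classify(["m1", "m2"], {"m2": "m1"}, {"m1": "m2"},
                             toy_content, toy_combined, ["Request", "Commit"])
print(result)
```

In this toy run the content-only model misses the Commit in the reply, and the context of a parent Request pushes it over the threshold on the first pass.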
13
Improvement over Content-only baseline
- Kappa often improves after iteration.
- Kappa is unchanged for “deliver”.
14
Leave-one-team-out Experiments
- 4 teams:
  o 1f3 (170 msgs)
  o 2f2 (137 msgs)
  o 3f2 (249 msgs)
  o 4f4 (165 msgs)
- (x-axis) = Bag-of-words only
- (y-axis) = Collective classification results
- Different teams present different styles for negotiations and task delegation.
[Figure: Kappa Values]
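The leave-one-team-out protocol can be sketched as a loop that holds out each team in turn and trains on the other three. The team names and message counts below are the ones listed on the slide; the function name and the dictionary layout are assumptions made for this example.

```python
def leave_one_team_out(team_sizes):
    """Yield (training_teams, held_out_team) for every team in turn."""
    for held_out in team_sizes:
        training = [team for team in team_sizes if team != held_out]
        yield training, held_out


# Message counts per labeled team, as listed on the slide.
teams = {"1f3": 170, "2f2": 137, "3f2": 249, "4f4": 165}

splits = list(leave_one_team_out(teams))
for training, held_out in splits:
    n_train = sum(teams[team] for team in training)
    print(f"test on {held_out} ({teams[held_out]} msgs), "
          f"train on {training} ({n_train} msgs)")
```

Splitting by team rather than by message keeps every thread of a team on one side of the split, so parent and child messages never straddle the training and test sets.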
15
Leave-one-team-out Experiments
- Consistent improvement of Commissive, Commit and Meet acts.
[Figure: Kappa Values]
16
Leave-one-team-out Experiments
- Deliver and dData performance usually decreases.
- These acts are associated with data distribution, FYI, file sharing, etc.
- For “non-delivery” acts, the improvement in average Kappa is statistically significant (p=0.01 on a two-tailed t-test).
[Figure: Kappa Values]
17
Act by Act Comparative Results
[Figure: Kappa values with and without collective classification, averaged over the four test sets in the leave-one-team-out experiment]
18
Conclusion
- Sequential patterns of email acts were studied in the CSPACE corpus. Less regularity than expected.
- We proposed a collective classification procedure for Email Speech Acts based on a Dependency Network model.
- Modest improvements over the baseline on acts related to negotiation (Request, Commit, Propose, Meet, etc.).
- No improvement, or a deterioration, was observed for Deliver/dData (acts less associated with negotiations).
- The degree of linkage in our dataset is small, which makes the observed results encouraging.
19
Thank you!
21
Inter-Annotator Agreement
- Kappa Statistic: Kappa = (A - R) / (1 - R)
  o A = probability of agreement in a category
  o R = probability of agreement for 2 annotators labeling at random
  o Kappa range: -1…+1

Inter-Annotator Agreement
Email Act   Kappa
Deliver     0.75
Commit      0.72
Request     0.81
Amend       0.83
Propose     0.72
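With A and R defined as above, the standard Kappa statistic is (A - R) / (1 - R). The sketch below computes it for one act from two annotators' binary labels, assuming R is estimated from each annotator's own label frequencies (the usual Cohen's kappa convention; the slide does not say how R was estimated). The labels in the example are hypothetical.

```python
def kappa(agreement, chance_agreement):
    """Kappa = (A - R) / (1 - R)."""
    return (agreement - chance_agreement) / (1.0 - chance_agreement)


def kappa_from_labels(labels_a, labels_b):
    """Kappa for two annotators who labeled the same messages."""
    n = len(labels_a)
    # A: observed fraction of messages on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # R: agreement expected if each annotator labeled at random
    # according to their own label frequencies.
    categories = set(labels_a) | set(labels_b)
    chance = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                 for c in categories)
    return kappa(observed, chance)


# Hypothetical labels for one act (1 = act present, 0 = act absent).
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
k = kappa_from_labels(annotator_1, annotator_2)
print(round(k, 2))
```

A value of 0 means agreement no better than chance and 1 means perfect agreement, which is why Kappa is more informative than raw agreement for acts that are rare or very frequent.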