SIGIR, August 2005, Salvador, Brazil On the Collective Classification of Email “Speech Acts” Vitor R. Carvalho & William W. Cohen Carnegie Mellon University.


SIGIR, August 2005, Salvador, Brazil On the Collective Classification of “Speech Acts” Vitor R. Carvalho & William W. Cohen Carnegie Mellon University

Outline 1. “Speech Acts” and Applications 2. Sequential Nature of Negotiations 3. Collective Classification and Results

Classifying into Acts [Cohen, Carvalho & Mitchell, EMNLP-04]
An Act is a verb-noun pair (e.g., propose meeting). One single message may contain multiple acts, and not all pairs make sense. The taxonomy tries to describe commonly observed behaviors, rather than all possible speech acts, and also includes non-linguistic usage of email (delivery of files). Most of the acts can be learned (EMNLP-04). [Diagram: taxonomy of verbs and nouns]
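As an illustration, the verb-noun representation can be sketched as a small data structure. The verb and noun inventories below are abbreviated and illustrative, not the paper's full taxonomy:

```python
# Illustrative sketch of email "Speech Acts" as verb-noun pairs.
# The inventories are assumptions abbreviated from the slide, not the exact taxonomy.
VERBS = {"request", "propose", "commit", "deliver", "amend"}
NOUNS = {"meeting", "data", "task"}

def acts_of(message_labels):
    """Validate and return the set of (verb, noun) acts for one message."""
    acts = set()
    for verb, noun in message_labels:
        if verb not in VERBS or noun not in NOUNS:
            raise ValueError(f"not a recognized act: ({verb}, {noun})")
        acts.add((verb, noun))
    return acts

# One single message may carry several acts at once:
msg = acts_of([("propose", "meeting"), ("deliver", "data")])
```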

Acts - Applications
Email overload – improved email clients. Negotiating/managing shared tasks is a central use of email:
o Tracking commitments, delegations, pending answers
o Integrating to-do/task lists into email, etc.
o Iterative Learning of Tasks and Speech Acts [Kushmerick & Khoussainov, 2005]
o Predicting Social Roles and Group Leadership [Leuski, 2004][Carvalho et al., in progress]

Idea: Predicting Acts from Surrounding Acts
[Diagram: example of a thread sequence of acts – Delivery, Request, Commit, Proposal, etc.]
An act has little or no correlation with other acts of the same message, but strong correlation with the previous and next message's acts.

Related Work on the Sequential Nature of Negotiations
[Winograd and Flores, 1986] “Conversation for Action Structure”
[Murakoshi et al., 1999] “Construction of Deliberation Structure in Email”

Related Work on the Sequential Nature of Negotiations (cont.)
[Kushmerick & Lau, 2005] “Learning the structure of email interactions between buyers and e-commerce vendors”

Data: CSPACE Corpus
Few large, free, natural email corpora are available. CSPACE corpus (Kraut & Fussell):
o Emails associated with a semester-long project for Carnegie Mellon MBA students in 1997
o 15,000 messages from 277 students, divided in 50 teams (4 to 6 students/team)
o Rich in task negotiation
o The messages of 4 teams had their “Speech Acts” labeled
o One of the teams was double labeled, and the inter-annotator agreement ranges from 72 to 83% (Kappa) for the most frequent acts

Evidence of Sequential Correlation of Acts
[Diagram: transition diagram for most common verbs from the CSPACE corpus – it is NOT a probabilistic DFA]
Act sequence patterns: (Request, Deliver+), (Propose, Commit+, Deliver+), (Propose, Deliver+); most common act was Deliver. Less regularity than expected (considering previous deterministic negotiation state diagrams).
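The act sequence patterns above can be checked mechanically. A sketch that encodes each act as one letter and matches a thread against the patterns (the letter encoding is ours, purely for illustration):

```python
import re

# Encode acts by their first letter: R=Request, P=Propose, C=Commit, D=Deliver.
# The "+" in the slide's pattern notation maps directly onto regex repetition.
PATTERNS = {
    "(Request, Deliver+)": re.compile(r"^RD+$"),
    "(Propose, Commit+, Deliver+)": re.compile(r"^PC+D+$"),
    "(Propose, Deliver+)": re.compile(r"^PD+$"),
}

def matching_patterns(thread_acts):
    """Return the names of the sequence patterns a thread's acts match."""
    code = "".join(act[0] for act in thread_acts)  # ["Request","Deliver"] -> "RD"
    return [name for name, rx in PATTERNS.items() if rx.match(code)]

matching_patterns(["Propose", "Commit", "Deliver", "Deliver"])
# -> ["(Propose, Commit+, Deliver+)"]
```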

Content versus Context
Content: Bag of Words features only. Context: parent and child features only (table below). 8 MaxEnt classifiers, trained on the 3F2 team dataset and tested on 1F3. Only the 1st child message was considered (the vast majority – more than 95%).
Set of Context Features (Relational):
o Parent boolean features: Parent_Request, Parent_Deliver, Parent_Commit, Parent_Propose, Parent_Directive, Parent_Commissive, Parent_Meeting, Parent_dData
o Child boolean features: Child_Request, Child_Deliver, Child_Commit, Child_Propose, Child_Directive, Child_Commissive, Child_Meeting, Child_dData
[Diagram: acts of the parent message (Delivery, Request, Commit, Proposal) used to predict the child message's acts]
Kappa values on 1F3 using relational (context) features and textual (content) features.
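The boolean context features in the table can be sketched as a simple feature map. The dictionary encoding is an assumption for illustration; the slide only specifies which indicators exist:

```python
# The eight act labels that get one parent and one child boolean indicator each.
ACTS = ["Request", "Deliver", "Commit", "Propose",
        "Directive", "Commissive", "Meeting", "dData"]

def context_features(parent_acts, child_acts):
    """Map the acts of the parent and first child message to boolean features."""
    feats = {}
    for act in ACTS:
        feats[f"Parent_{act}"] = act in parent_acts
        feats[f"Child_{act}"] = act in child_acts
    return feats

f = context_features({"Request"}, {"Deliver", "Commit"})
# f["Parent_Request"] and f["Child_Deliver"] are True; the rest are False.
```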

Dependency Network
Dependency networks are probabilistic graphical models in which the full joint distribution of the network is approximated with a set of conditional distributions that can be learned independently. The conditional probability distributions in a DN are calculated for each node given its neighboring nodes (its Markov blanket). [Heckerman et al., JMLR; Neville & Jensen, KDD-MRDM]
o Approximate inference (Gibbs sampling)
o Markov blanket = parent message and child message
[Diagram: the current message's acts (Request, Commit, Deliver, …) conditioned on the parent and child messages]

Collective Classification Procedure (based on Dependency Networks Model)
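The slide gives the procedure only by name; below is a hedged sketch of what such a dependency-network-based loop could look like: bootstrap from content-only predictions, then Gibbs-style sweeps re-estimating each message given the current labels of its Markov blanket (parent and child message). The `Msg` structure and the classifier interfaces are assumptions, not the paper's implementation:

```python
import random
from collections import namedtuple

# Assumed thread structure: each message knows its parent and first child.
Msg = namedtuple("Msg", "id text parent_id child_id")

def collective_classify(messages, content_clf, context_clf, iters=20, seed=0):
    """Iteratively re-label each message conditioned on its neighbors' labels."""
    rng = random.Random(seed)
    # 1) Bootstrap: labels from textual (content) features only.
    labels = {m.id: content_clf(m.text) for m in messages}
    # 2) Gibbs-style sweeps: re-estimate each message's acts given the
    #    current labels of its Markov blanket (parent and child message).
    for _ in range(iters):
        order = list(messages)
        rng.shuffle(order)
        for m in order:
            parent = labels.get(m.parent_id)
            child = labels.get(m.child_id)
            labels[m.id] = context_clf(m.text, parent, child)
    return labels
```

With stub classifiers, a reply to a Request drifts toward Deliver even if its text alone is ambiguous, which is the intuition behind using the surrounding acts.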

Improvement over Content-only Baseline: Kappa often improves after iteration; Kappa unchanged for “Deliver”.

Leave-one-team-out Experiments
4 teams: 1f3 (170 msgs), 2f2 (137 msgs), 3f2 (249 msgs), 4f4 (165 msgs). x-axis = bag-of-words only; y-axis = collective classification results. Different teams present different styles for negotiations and task delegation. [Scatter plot of Kappa values]

Leave-one-team-out Experiments
Consistent improvement of Commissive, Commit and Meet acts. [Scatter plot of Kappa values]

Leave-one-team-out Experiments
Deliver and dData performance usually decreases – associated with data distribution, FYI, file sharing, etc. For “non-delivery” acts, the improvement in avg. Kappa is statistically significant (p=0.01 on a two-tailed T-test). [Scatter plot of Kappa values]

Act by Act Comparative Results
Kappa values with and without collective classification, averaged over the four test sets in the leave-one-team-out experiment.

Conclusion  Sequential patterns of acts were studied in the CSPACE corpus. Less regularity than expected.  We proposed a collective classification procedure for Speech Acts based on a Dependency Net model.  Modest improvements over the baseline on acts related to negotiation (Request, Commit, Propose, Meet, etc). No improvement/deterioration was observed for Deliver/dData (acts less associated with negotiations)  Degree of linkage in our dataset is small – which makes the observed results encouraging.

Thank you!

Inter-Annotator Agreement
Kappa statistic: K = (A − R) / (1 − R), where A = probability of agreement in a category and R = probability of agreement for 2 annotators labeling at random. Kappa range: -1…+1.

Act      Kappa
Deliver  0.75
Commit   0.72
Request  0.81
Amend    0.83
Propose  0.72
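The statistic can be computed directly from its definition:

```python
# Kappa as defined on the slide: K = (A - R) / (1 - R), where A is the observed
# agreement and R is the chance agreement of two annotators labeling at random.
def kappa(A, R):
    return (A - R) / (1.0 - R)

kappa(0.90, 0.60)  # -> 0.75 (e.g., 90% observed vs. 60% chance agreement)
```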