Dialogue Acts Julia Hirschberg CS 4706 11/18/2018.


Today
- Recognizing structural information: Dialogue Acts vs. Discourse Structure
- Speech Act theory
- Speech Acts → Dialogue Acts
  - Coding schemes (DAMSL)
  - Practical goals
- Recognition
  - Experimental results (Nickerson & Chu-Carroll ’99)
  - Corpus studies (Jurafsky et al ’98)
  - Automatic Detection (Rosset & Lamel ’04)

Speech Act Theory
- John Searle, Speech Acts ’69
  - Locutionary acts: semantic meaning
  - Illocutionary acts: request, promise, statement, threat, question
  - Perlocutionary acts: effect intended to be produced on Hearer: regret, fear, hope
- I’ll study tomorrow.
- Is the Pope Catholic?
- Can you open that window?

Dialogue Acts
- Roughly correspond to illocutionary acts
- Motivation: improving Spoken Dialogue Systems
- Many coding schemes (e.g. DAMSL)
- Many-to-many mapping between DAs and words
  - An Agreement DA can be realized by Okay, Um, Right, Yeah, …
  - But each of these can express multiple DAs, e.g.
      S: You should take the 10pm flight.
      U: Okay … that sounds perfect.
               … but I’d prefer an earlier flight.
               … (I’m listening)

A Possible Coding Scheme for ‘ok’
- Ritualistic?
  - Closing
  - You're Welcome
  - Other
  - No
- 3rd-Turn-Receipt?
  - Yes
- If Ritualistic==No, code all of these as well:
  - Task Management:
    - I'm done
    - I'm not done yet
    - None

  - Topic Management:
    - Starting new topic
    - Finished old topic
    - Pivot: finishing and starting
  - Turn Management:
    - Still your turn (=traditional backchannel)
    - Still my turn (=stalling for time)
    - I'm done, it is now your turn
    - None
  - Belief Management:
    - I accept your proposition
    - I entertain your proposition
    - I reject your proposition
    - Do you accept my proposition? (=ynq)
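
The coding scheme for ‘ok’ can be written down as a small data structure. The following is a minimal sketch in Python, assuming the dimension values listed on the two slides above. The class name `OkLabel`, the field names, and the shortened value strings are hypothetical, and the rule that the four management dimensions are left empty for ritualistic uses is an assumption read off "If Ritualistic==No".

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values per dimension, abbreviated from the slides.
RITUALISTIC = {"closing", "you're welcome", "other", "no"}
TASK = {"i'm done", "i'm not done yet", "none"}
TOPIC = {"starting new topic", "finished old topic", "pivot"}
TURN = {"still your turn", "still my turn", "i'm done, your turn", "none"}
BELIEF = {"accept", "entertain", "reject", "ynq"}


@dataclass
class OkLabel:
    """One annotation of a single 'ok' token (hypothetical container)."""
    ritualistic: str
    third_turn_receipt: bool = False
    task: Optional[str] = None
    topic: Optional[str] = None
    turn: Optional[str] = None
    belief: Optional[str] = None

    def validate(self) -> None:
        if self.ritualistic not in RITUALISTIC:
            raise ValueError(f"bad ritualistic value: {self.ritualistic}")
        dims = {"task": TASK, "topic": TOPIC, "turn": TURN, "belief": BELIEF}
        for name, allowed in dims.items():
            value = getattr(self, name)
            if self.ritualistic == "no":
                # Non-ritualistic 'ok': every management dimension must be coded.
                if value not in allowed:
                    raise ValueError(f"{name} must be one of {sorted(allowed)}")
            elif value is not None:
                # Assumption: ritualistic uses carry no management codes.
                raise ValueError(f"{name} is only coded when ritualistic == 'no'")


# A traditional backchannel 'ok'.
backchannel = OkLabel(
    ritualistic="no",
    task="none",
    topic="finished old topic",
    turn="still your turn",
    belief="entertain",
)
backchannel.validate()
```

Keeping each dimension as a separate field mirrors the slide's point that a single ‘ok’ is coded on several dimensions at once.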

Practical Goals
- In Spoken Dialogue Systems
  - Disambiguate current DA
    - Represent user input correctly
    - Respond appropriately
  - Predict next DA
    - Switch Language Models for ASR
    - Switch states in semantic processing
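
One simple way to illustrate the "predict next DA" goal is a bigram model over DA sequences. The sketch below is not taken from the lecture: the function names are hypothetical and the three DA sequences are made-up toy data.

```python
from collections import Counter, defaultdict


def train_bigrams(dialogues):
    """Count how often each DA follows each other DA."""
    counts = defaultdict(Counter)
    for das in dialogues:
        for prev, nxt in zip(das, das[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev_da):
    """Return the most frequent successor of prev_da, or None if unseen."""
    followers = counts.get(prev_da)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy, made-up DA sequences, purely for illustration.
toy = [
    ["question", "answer", "accept"],
    ["question", "answer", "backchannel"],
    ["question", "backchannel", "answer"],
]
counts = train_bigrams(toy)
print(predict_next(counts, "question"))
```

A dialogue system could use such a prediction to choose which language model or semantic state to activate before the next user turn.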

Disambiguating Ambiguous DAs Intonationally
- Nickerson & Chu-Carroll ’99: Can info-requests be disambiguated reliably from action-requests?
- Modal (Can/would/would … willing) questions
  - Can you move the piano?
  - Would you move the piano?
  - Would you be willing to move the piano?

Experiments
- Production studies:
  - Subjects read ambiguous questions in disambiguating contexts
  - Control for given/new and contrastiveness
  - Polite/neutral/impolite
- Problems:
  - Cells imbalanced
  - No pretesting
  - No distractors
  - Same speaker reads both contexts

Results
- Indirect requests (e.g. for action)
  - If L%, more likely (73%) to be indirect
  - If H%, 46% were indirect: differences in height of boundary tone?
- Politeness: ‘can’ differs in impolite (higher rise) vs. neutral
- Speaker variability

Corpus Studies: Jurafsky et al ’98
- Lexical, acoustic/prosodic/syntactic differentiators for yeah, ok, uhuh, mhmm, um…
- Labeling
  - Continuers: Mhmm (not taking floor)
  - Assessments: Mhmm (tasty)
  - Agreements: Mhmm (I agree)
  - Yes answers: Mhmm (That’s right)
  - Incipient speakership: Mhmm (taking floor)

Corpus
- Switchboard telephone conversation corpus
  - Hand segmented and labeled with DA information (initially from text)
  - Relabeled for this study
- Analyzed for
  - Lexical realization
  - F0 and rms features
  - Syntactic patterns

Results: Lexical Differences
- Agreements: yeah (36%), right (11%), ...
- Continuer: uhuh (45%), yeah (27%), …
- Incipient speaker: yeah (59%), uhuh (17%), right (7%), …
- Yes-answer: yeah (56%), yes (17%), uhuh (14%), ...
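
The lexical distributions above show why the word alone is a weak cue. The sketch below stores the slide's figures and ranks DAs for a given word, assuming each percentage is the share of that DA realized by that word. The function name `rank_das` is hypothetical, and the ranking deliberately ignores how frequent each DA is overall.

```python
# Share of each DA realized by each word, in percent, from the slide.
# Words elided on the slide ("...") are not filled in.
WORD_GIVEN_DA = {
    "agreement": {"yeah": 36, "right": 11},
    "continuer": {"uhuh": 45, "yeah": 27},
    "incipient speaker": {"yeah": 59, "uhuh": 17, "right": 7},
    "yes-answer": {"yeah": 56, "yes": 17, "uhuh": 14},
}


def rank_das(word):
    """List (DA, percent) pairs for DAs that list the word, highest percent first."""
    scored = [(da, dist[word]) for da, dist in WORD_GIVEN_DA.items() if word in dist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for da, pct in rank_das("yeah"):
    print(f"{da}: {pct}%")
```

"yeah" appears under all four DAs, so a recognizer needs prosodic or contextual cues in addition to the word itself.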

Results: Prosodic and Syntactic Cues
- Relabeling from speech produces only 2% changed labels overall (114/5757)
  - 43/987 continuers --> agreements
  - Why? Shorter duration, lower F0, lower energy, longer preceding pause
- Over all DAs, duration is the best differentiator, but…
  - Highly correlated with DA length in words
- Assessments: That’s X (good, great, fine, …)
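
The relabeling figures can be checked with a line of arithmetic each. This sketch assumes 987 is the total number of continuers, which the slide does not state explicitly; the variable names are illustrative only.

```python
changed, total = 114, 5757            # labels changed after listening / all labels
cont_to_agree, continuers = 43, 987   # continuers relabeled as agreements

overall_rate = changed / total
continuer_rate = cont_to_agree / continuers

print(f"{overall_rate:.1%} of all labels changed")
print(f"{continuer_rate:.1%} of continuers became agreements")
```

Under that assumption, continuers changed label at roughly twice the overall rate.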

Automatic DA Detection
- Rosset & Lamel ’04: Can we detect DAs automatically w/ minimal reliance on lexical content?
  - Lexicons are domain-dependent
  - ASR output is errorful
- Corpora (3912 utts total)
  - Agent/client dialogues in a French bank call center, in a French web-based stock exchange customer service center, and in an English bank call center

DA tags new again (44)
- Conventional (openings, closings)
- Information level (items related to the semantic content of the task)
- Forward Looking Function:
  - statement (e.g. assert, commit, explanation)
  - infl on Hearer (e.g. confirmation, offer, request)
- Backward Looking Function:
  - Agreement (e.g. accept, reject)
  - Understanding (e.g. backchannel, correction)
- Communicative Status (e.g. self-talk, change-mind)
- NB: each utt could receive a tag for each class, so utts represented as vectors
  - But… only 197 combinations observed
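
The note that utterances are represented as vectors of tags can be made concrete as follows. This is a sketch under assumptions: the seven-slot layout and the names `DAVector` and `distinct_combinations` are hypothetical, and the three example utterances are invented using tag names from the slide.

```python
from typing import NamedTuple, Optional


class DAVector(NamedTuple):
    """One tag slot per class from the slide; None means no tag in that class."""
    conventional: Optional[str] = None
    information_level: Optional[str] = None
    forward_statement: Optional[str] = None
    forward_influence: Optional[str] = None
    backward_agreement: Optional[str] = None
    backward_understanding: Optional[str] = None
    communicative_status: Optional[str] = None


def distinct_combinations(vectors):
    """Number of different tag vectors that actually occur."""
    return len(set(vectors))


# Invented utterances for illustration only.
utts = [
    DAVector(forward_statement="assert"),
    DAVector(backward_agreement="accept", backward_understanding="backchannel"),
    DAVector(forward_statement="assert"),
]
print(distinct_combinations(utts))
```

Counting distinct vectors in this way is how one would arrive at a figure like the 197 observed combinations, which is far fewer than the number of combinations the tag set allows in principle.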

Method: Memory-based learning (TIMBL)
- Uses all examples for classification
- Useful for sparse data

Features
- Speaker identity
- First 2 words of each turn
- # utts in turn
- Previously proposed DA tags for utts in turn

Results
- With true utt boundaries:
  - ~83% accuracy on test data from same domain
  - ~75% accuracy on test data from different domain
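
Memory-based learning stores every training example and labels a new instance by its nearest stored neighbours. The sketch below is plain Python, not the TiMBL API: it uses an unweighted overlap distance over the four feature types listed above, and the stored examples and tag names are made up for illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions on which two instances disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, instance, k=1):
    """Label an instance by majority vote among its k nearest stored examples."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Features: (speaker, word 1, word 2, # utts in turn, previous DA tag).
# Toy, made-up examples.
memory = [
    (("agent", "good", "morning", 1, "none"), "opening"),
    (("client", "i", "want", 2, "opening"), "request"),
    (("agent", "yes", "sure", 1, "request"), "accept"),
    (("client", "thank", "you", 1, "accept"), "closing"),
]

print(classify(memory, ("client", "i", "need", 2, "opening")))
```

Because nothing is abstracted away at training time, rare feature combinations remain available at classification time, which is why the approach suits sparse data.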

On automatically identified utt units: 3.3% ins, 6.6% del, 13.5% sub

Which DAs are easiest/hardest to detect?

DA           GE.fr    CAP.fr   GE.eng
Resp-to      52.0%    33.0%    55.7%
Backch       75.0%    72.0%    89.2%
Accept       41.7%    26.0%    30.3%
Assert       66.0%    56.3%    50.5%
Expression   89.0%    69.3%    56.2%
Comm-mgt     86.8%    70.7%    59.2%
Task         85.4%    81.4%    78.8%
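
The question of which DAs are easiest and hardest can be answered directly from the table. The sketch below assumes the percentages are per-tag detection scores where higher is better, which the slide does not state; `extremes` is a hypothetical helper name.

```python
# Per-DA scores (percent) for the three corpora, copied from the table.
SCORES = {
    "Resp-to":    {"GE.fr": 52.0, "CAP.fr": 33.0, "GE.eng": 55.7},
    "Backch":     {"GE.fr": 75.0, "CAP.fr": 72.0, "GE.eng": 89.2},
    "Accept":     {"GE.fr": 41.7, "CAP.fr": 26.0, "GE.eng": 30.3},
    "Assert":     {"GE.fr": 66.0, "CAP.fr": 56.3, "GE.eng": 50.5},
    "Expression": {"GE.fr": 89.0, "CAP.fr": 69.3, "GE.eng": 56.2},
    "Comm-mgt":   {"GE.fr": 86.8, "CAP.fr": 70.7, "GE.eng": 59.2},
    "Task":       {"GE.fr": 85.4, "CAP.fr": 81.4, "GE.eng": 78.8},
}


def extremes(corpus):
    """Return (easiest, hardest) DA for one corpus, by score."""
    by_score = sorted(SCORES, key=lambda da: SCORES[da][corpus])
    return by_score[-1], by_score[0]


for corpus in ("GE.fr", "CAP.fr", "GE.eng"):
    easiest, hardest = extremes(corpus)
    print(f"{corpus}: easiest={easiest}, hardest={hardest}")
```

On this reading, Accept has the lowest score in all three corpora, while the highest-scoring tag differs by corpus.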

Conclusions
- Strong ‘grammar’ of DAs in Spoken Dialogue systems
- A few initial words perform as well as more

Next Class
- Spoken Dialogue Systems
  - J&M (new): comments?
  - Walker et al ’97 on Paradise System for evaluation
  - Bell & Gustafson ’00