Dialogue Acts and Information State. Julia Hirschberg, CS 4706. 6/25/2015.


Information-State and Dialogue Acts
If we want a dialogue system to be more than just form-filling, it needs to:
–Decide when the user has asked a question, made a proposal, or rejected a suggestion
–Ground the user's utterance, ask clarification questions, and suggest plans
Good conversational agents need sophisticated models of interpretation and generation, beyond slot filling.

Information-State Architecture
Information-state representation
Dialogue act interpreter
Dialogue act generator
Set of update rules:
–Update the dialogue state as acts are interpreted
–Generate dialogue acts
Control structure to select which update rules to apply
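The architecture above can be sketched in a few lines of code. This is a minimal illustration, not any particular toolkit's API: the `InfoState` fields and the `update` rules are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class InfoState:
    common_ground: list = field(default_factory=list)  # jointly grounded propositions
    pending: list = field(default_factory=list)        # moves awaiting a response

def update(state: InfoState, act: dict) -> InfoState:
    """Update rules: modify the information state as each act is interpreted."""
    if act["type"] == "QUESTION":
        state.pending.append(act)          # a question raises an obligation to answer
    elif act["type"] == "ANSWER" and state.pending:
        question = state.pending.pop()     # resolve the most recent pending question
        state.common_ground.append((question["content"], act["content"]))
    elif act["type"] == "ASSERT":
        state.common_ground.append(act["content"])
    return state

state = InfoState()
update(state, {"type": "QUESTION", "content": "departure city?"})
update(state, {"type": "ANSWER", "content": "Baltimore"})
print(state.common_ground)  # [('departure city?', 'Baltimore')]
```

A real system would add a dialogue act generator consulting the same state (e.g., emitting a clarification question when `pending` goes unanswered), plus a control structure deciding which rule fires when several match.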

Information-State

Dialogue Acts
AKA conversational moves
Actions with (internal) structure related specifically to their dialogue function
Incorporate ideas of grounding, along with other dialogue and conversational functions not mentioned in classic Speech Act Theory

Speech Act Theory: Reminder
John Searle, Speech Acts ('69)
–Locutionary acts: semantic meaning/surface form
–Illocutionary acts: request, promise, statement, threat, question
–Perlocutionary acts: effect intended to be produced on the Hearer: regret, fear, hope

What Kind of Speech Acts Do We Need for a Real Task? Verbmobil
Two-party scheduling dialogues
Speakers were asked to plan a meeting at some future date
Data used to design conversational agents which would help with this task
Issues:
–Cross-language
–Machine translation
–Scheduling assistant

Verbmobil Dialogue Acts
THANK – thanks
GREET – Hello Dan
INTRODUCE – It's me again
BYE – Allright, bye
REQUEST-COMMENT – How does that look?
SUGGEST – June 13th through 17th
REJECT – No, Friday I'm booked all day
ACCEPT – Saturday sounds fine
REQUEST-SUGGEST – What is a good day of the week for you?
INIT – I wanted to make an appointment with you
GIVE_REASON – Because I have meetings all afternoon
FEEDBACK – Okay
DELIBERATE – Let me check my calendar here
CONFIRM – Okay, that would be wonderful
CLARIFY – Okay, do you mean Tuesday the 23rd?
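A first, naive pass at assigning these acts might use surface cue phrases. The cue lists below are illustrative, loosely read off the example utterances on this slide; a real Verbmobil-style tagger would be learned from labeled data, as later slides discuss.

```python
# Toy cue-phrase tagger over part of the Verbmobil act inventory.
VERBMOBIL_CUES = {
    "THANK": ("thanks", "thank you"),
    "GREET": ("hello",),
    "BYE": ("bye",),
    "REJECT": ("no,", "i'm booked"),
    "ACCEPT": ("sounds fine", "sounds good"),
    "REQUEST-SUGGEST": ("what is a good",),
    "CLARIFY": ("do you mean",),
}

def tag_utterance(utt: str) -> str:
    low = utt.lower()
    for act, cues in VERBMOBIL_CUES.items():
        if any(cue in low for cue in cues):
            return act
    return "INFORM"  # fallback act for anything without a matching cue

print(tag_utterance("Saturday sounds fine"))        # ACCEPT
print(tag_utterance("Okay, do you mean Tuesday?"))  # CLARIFY
```

As the next slides show, surface cues alone are not enough: the same string can realize different acts depending on prosody and context.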

Automatic Interpretation of Dialogue Acts
How do we automatically identify dialogue acts?
Given an utterance:
–Decide whether it is a QUESTION, STATEMENT, SUGGEST, or ACKNOWLEDGMENT
Recognizing illocutionary force will be crucial to building a dialogue agent
Perhaps we can just look at the form of the utterance to decide?

Can We Just Use the Surface Syntactic Form?
YES-NO-Qs have auxiliary-before-subject syntax:
–Will breakfast be served on USAir 1557?
STATEMENTs have declarative syntax:
–I don't care about lunch
COMMANDs have imperative syntax:
–Show me flights from Milwaukee to Orlando on Thursday night

Surface Form ≠ Speech Act Type
Utterance – Locutionary Force – Illocutionary Force
Can I have the rest of your sandwich? – Question – Request
I want the rest of your sandwich – Declarative – Request
Give me your sandwich! – Imperative – Request

Dialogue Act Disambiguation Is Hard! Who's on First?
Abbott: Well, Costello, I'm going to New York with you. Bucky Harris, the Yankees' manager, gave me a job as coach for as long as you're on the team.
Costello: Look Abbott, if you're the coach, you must know all the players.
Abbott: I certainly do.
Costello: Well you know I've never met the guys. So you'll have to tell me their names, and then I'll know who's playing on the team.
Abbott: Oh, I'll tell you their names, but you know it seems to me they give these ball players now-a-days very peculiar names.
Costello: You mean funny names?
Abbott: Strange names, pet names... like Dizzy Dean...
Costello: His brother Daffy
Abbott: Daffy Dean...
Costello: And their French cousin.
Abbott: French?
Costello: Goofe'
Abbott: Goofe' Dean. Well, let's see, we have on the bags, Who's on first, What's on second, I Don't Know is on third...
Costello: That's what I want to find out.
Abbott: I say Who's on first, What's on second, I Don't Know's on third...

Dialogue Act Ambiguity
Who's on first
–INFO-REQUEST
–or
–STATEMENT

Dialogue Act Ambiguity
Can you give me a list of the flights from Atlanta to Boston?
–Looks like an INFO-REQUEST.
–If so, the answer is: YES.
–But really it's a DIRECTIVE or REQUEST, a polite form of:
–Please give me a list of the flights...
What looks like a QUESTION can be a REQUEST

Dialogue Act Ambiguity
What looks like a STATEMENT can be a QUESTION:
Us: OPEN-OPTION – I was wanting to make some arrangements for a trip that I'm going to be taking uh to LA uh beginning of the week after next
Ag: HOLD – OK uh let me pull up your profile and I'll be right with you here. [pause]
Ag: CHECK – And you said you wanted to travel next week?
Us: ACCEPT – Uh yes.

Indirect Speech Acts
Utterances which use a surface statement to ask a question:
–And you want to...
Utterances which use a surface question to issue a request:
–Can you get me...

DA Interpretation as Statistical Classification
Lots of cues in each sentence can tell us which DA it is:
–Words and collocations:
Please or would you: good cue for REQUEST
Are you: good cue for INFO-REQUEST
–Prosody:
Rising pitch is a good cue for INFO-REQUEST
Loudness/stress can help distinguish yeah/AGREEMENT from yeah/BACKCHANNEL
–Conversational structure:
Yeah following a proposal is probably AGREEMENT; yeah following an INFORM is probably a BACKCHANNEL

Disambiguating Ambiguous DAs Intonationally
Nickerson & Chu-Carroll '99: Can info-requests be disambiguated reliably from action-requests?
Modal (can/would/would you be willing) questions:
–Can you move the piano?
–Would you move the piano?
–Would you be willing to move the piano?

Experiments
Production studies:
–Subjects read ambiguous questions in disambiguating contexts
–Controlled for given/new and contrastiveness
–Polite/neutral/impolite conditions
Problems:
–Cells imbalanced
–No pretesting
–No distractors
–Same speaker reads both contexts

Results
Indirect requests (e.g. for action):
–If L%, more likely (73%) to be indirect
–If H%, 46% were indirect: differences in height of boundary tone?
–Politeness: can differs in impolite (higher rise) vs. neutral contexts
–Speaker variability

Statistical Classifier Model of DA Interpretation
Goal: decide for each sentence what DA it is
Classification task: 1-of-N classification decision for each sentence
–With N classes (= number of dialogue acts)
–Three probabilistic models corresponding to the three kinds of cues from the input sentence:
Conversational structure: probability of one dialogue act following another, P(Answer|Question)
Words and syntax: probability of a sequence of words given a dialogue act, P("do you"|Question)
Prosody: probability of prosodic features given a dialogue act, P("rise at end of sentence"|Question)
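Combining the three cue models can be sketched as a naive-Bayes-style decision: pick the act maximizing P(act | previous act) × P(words | act) × P(prosody | act). The probability values below are invented for illustration; a real classifier would estimate them from a labeled corpus such as Switchboard.

```python
import math

ACTS = ("QUESTION", "STATEMENT")
# Conversational structure: P(act | previous act)
P_STRUCT = {("STATEMENT", "QUESTION"): 0.30, ("STATEMENT", "STATEMENT"): 0.70}
# Words and syntax: P(word sequence | act)
P_WORDS = {("do you", "QUESTION"): 0.20, ("do you", "STATEMENT"): 0.01}
# Prosody: P(prosodic feature | act)
P_PROS = {("final rise", "QUESTION"): 0.60, ("final rise", "STATEMENT"): 0.10}

def classify(prev_act: str, words: str, prosody: str) -> str:
    def log_score(act: str) -> float:
        # Sum of log probabilities; unseen events get a small floor value.
        return (math.log(P_STRUCT.get((prev_act, act), 1e-6))
                + math.log(P_WORDS.get((words, act), 1e-6))
                + math.log(P_PROS.get((prosody, act), 1e-6)))
    return max(ACTS, key=log_score)

print(classify("STATEMENT", "do you", "final rise"))  # QUESTION
```

Here the structure model alone prefers STATEMENT (0.70 vs. 0.30), but the word and prosody evidence outweighs it, so the classifier returns QUESTION.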

Corpus Studies: Jurafsky et al '98
Lexical, acoustic/prosodic/syntactic differentiators for yeah, ok, uhuh, mhmm, um...
Labeling:
–Continuers: Mhmm (not taking the floor)
–Assessments: Mhmm (tasty)
–Agreements: Mhmm (I agree)
–Yes answers: Mhmm (That's right)
–Incipient speakership: Mhmm (taking the floor)

Corpus
Switchboard telephone conversation corpus
–Hand-segmented and labeled with DA information (initially from text)
–Relabeled for this study
–Analyzed for:
Lexical realization
F0 and RMS features
Syntactic patterns

Results: Lexical Differences
Agreements:
–yeah (36%), right (11%), ...
Continuers:
–uhuh (45%), yeah (27%), ...
Incipient speakership:
–yeah (59%), uhuh (17%), right (7%), ...
Yes-answers:
–yeah (56%), yes (17%), uhuh (14%), ...

Results: Prosodic and Syntactic Cues
Relabeling from speech produces only 2% changed labels overall (114/5757)
–43/987 continuers → agreements
–Why? Shorter duration, lower F0, lower energy, longer preceding pause
Over all DAs, duration is the best differentiator, but...
–It is highly correlated with DA length in words
Assessments: That's X (good, great, fine, ...)

Generating Dialogue Acts
Confirmation
Rejection

Confirmation
Another reason for grounding:
–ASR errors: speech is a very errorful channel
–Even for humans in noisy conditions
–Humans use grounding to confirm that they've heard correctly
–ASR is much worse than humans!
Conclusion: spoken dialogue systems need to do even more grounding and confirmation than humans

Explicit Confirmation
S: Which city do you want to leave from?
U: Baltimore
S: Do you want to leave from Baltimore?
U: Yes

Explicit Confirmation
U: I'd like to fly from Denver Colorado to New York City on September 21st in the morning on United Airlines
S: Let's see then. I have you going from Denver Colorado to New York on September 21st. Is that correct?
U: Yes

Implicit Confirmation: Display
U: I'd like to travel to Berlin
S: When do you want to travel to Berlin?
U: Hi I'd like to fly to Seattle Tuesday morning
S: Traveling to Seattle on Tuesday, August eleventh in the morning. Your name?

Implicit vs. Explicit
Complementary strengths
Explicit: easier for users to correct the system's mistakes (they can just say "no"), but cumbersome and long
Implicit: much more natural, quicker, simpler (if the system guesses right)

Implicit and Explicit
Early systems: all-implicit or all-explicit
Modern systems: adaptive
How to decide?
–The ASR system can provide a confidence metric expressing how convinced it is of its transcription of the speech
–If high confidence, use implicit confirmation
–If low confidence, use explicit confirmation

Computing Confidence
Simplest: use the acoustic log-likelihood of the user's utterance
More features might help:
–Prosodic: utterances with longer pauses, F0 excursions, longer durations
–Backoff: did we have to back off in the LM?
–Cost of an error: explicit confirmation before moving money or booking flights

Rejection
e.g., VoiceXML "nomatch": "I'm sorry, I didn't understand that."
Reject when:
–ASR confidence is low
–The best interpretation is semantically ill-formed
Option: four-tiered levels of confidence:
–Below the confidence threshold, reject
–Above the threshold, explicit confirmation
–If even higher, implicit confirmation
–Even higher, no confirmation
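The four-tiered policy above amounts to a simple threshold cascade on ASR confidence. The threshold values below are placeholders a deployed system would tune on held-out data, possibly also weighing the cost of an error for the current slot.

```python
# Tuning knobs: placeholder thresholds, not values from any real system.
REJECT_T, EXPLICIT_T, IMPLICIT_T = 0.30, 0.60, 0.85

def confirmation_strategy(asr_confidence: float) -> str:
    if asr_confidence < REJECT_T:
        return "reject"            # "I'm sorry, I didn't understand that."
    if asr_confidence < EXPLICIT_T:
        return "explicit_confirm"  # "Do you want to leave from Baltimore?"
    if asr_confidence < IMPLICIT_T:
        return "implicit_confirm"  # "Traveling to Seattle on Tuesday. Your name?"
    return "no_confirm"            # accept the hypothesis silently

for conf in (0.10, 0.50, 0.70, 0.95):
    print(conf, confirmation_strategy(conf))
```

High-cost actions (moving money, booking a flight) could lower `IMPLICIT_T` and `EXPLICIT_T` so that explicit confirmation fires more often.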

DA Detection Example: Correction Detection
Despite clever confirmation/rejection strategies, dialogue systems still make mistakes
If the system misrecognizes an utterance and either:
–Rejects it
–Via confirmation, displays its misunderstanding
Then the user has a chance to make a correction:
–Repeating themselves
–Rephrasing
–Saying "no" to the confirmation question

Learning from Human Behavior (Krahmer et al '01)
–'Go on' and 'go back' signals in grounding situations (implicit/explicit verification)
–Positive: short turns, unmarked word order, confirmation, answers, no corrections or repetitions, new info
–Negative: long turns, marked word order, disconfirmation, no answer, corrections, repetitions, no new info

–Hypotheses supported, but:
Can these cues be identified automatically?
How might they affect the design of spoken dialogue systems?

Corrections
Unfortunately, corrections are harder to recognize than normal sentences
–Swerts et al (2000): corrections are misrecognized twice as often (in terms of WER) as non-corrections
–Why? Prosody seems to be the largest factor: hyperarticulation
Example from Liz Shriberg:
–"NO, I am DE-PAR-TING from Jacksonville"

A Labeled Dialogue (Swerts et al)

Distribution of Correction Types
Type – Add – Add/Omit – Omit – Par – Rep
All – 8% – 2% – 32% – 19% – 39%
After Misrec – 7% – 3% – 40% – 18% – 32%
After Rej – 6% – 0% – 7% – 28% – 59%

Machine Learning to Detect User Corrections
Build classifiers using features like:
–Lexical information (words "no", "correct", "I don't", swear words)
–Prosodic features (various increases in F0 range, pause duration, and word duration that correlate with hyperarticulation)
–Length
–ASR confidence
–LM probability
–Dialogue features (e.g., repetitions)
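A feature extractor over the cue types listed above might look like the sketch below. The turn fields (`text`, `f0_max`, `f0_min`, `asr_conf`, `prev_user_text`) and the feature names are illustrative assumptions, not a published feature set; the resulting dict could feed any off-the-shelf classifier.

```python
def correction_features(turn: dict) -> dict:
    words = turn["text"].lower().split()
    return {
        "has_no": int("no" in words),                 # lexical cue
        "n_words": len(words),                        # length cue
        "f0_range": turn["f0_max"] - turn["f0_min"],  # hyperarticulation cue
        "asr_conf": turn["asr_conf"],                 # recognizer confidence
        "is_repeat": int(turn["text"].lower() ==
                         turn.get("prev_user_text", "").lower()),  # dialogue cue
    }

feats = correction_features({
    "text": "NO I am departing from Jacksonville",
    "f0_max": 310.0, "f0_min": 140.0, "asr_conf": 0.42,
    "prev_user_text": "I am departing from Jacksonville",
})
print(feats)
```

Note that `f0_range` only becomes evidence of hyperarticulation relative to the speaker's earlier turns, so a real extractor would normalize per speaker.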

But...
What should the system do when it recognizes that a user is trying to correct it?

Summary
Dialogue acts and information state
Dialogue acts:
–Ambiguities and disambiguation
Dialogue act recognition:
–ML approaches to DA classification
Dialogue act generation:
–Confirmation strategies
–Rejection
Detecting corrections

Next
Evaluating Spoken Dialogue Systems