Dialogue Acts Julia Hirschberg LSA07 353 11/29/2018.


1 Dialogue Acts Julia Hirschberg LSA07 353 11/29/2018

2 Today
Recognizing structural information: Dialogue Acts vs. Discourse Structure
Speech Acts → Dialogue Acts
Coding schemes (DAMSL)
Practical goals
Identifying DAs
  Direct and indirect DAs: experimental results
  Corpus studies of DA disambiguation
  Automatic DA identification
  More corpus studies

3 Speech Acts
Wittgenstein ’53, Austin ’62 and Searle ’75
Contributions to dialogue are actions performed by speakers:
  I promise to make you very very sorry for that.
Performative verbs
Locutionary act: the act of conveying the ‘meaning’ of the sentence uttered (e.g. committing the Speaker to making the Hearer sorry)
Illocutionary act: the act associated with the verb uttered (e.g. promising)
Perlocutionary act: the act of producing an effect on the Hearer (e.g. threatening)

4 Searle’s Classification Scheme
Assertives: commit S to the truth of X (e.g. The world is flat)
Directives: attempt by S to get H to do X (e.g. Open the window please)
Commissives: commit S to do X (e.g. I’ll do it tomorrow)
Expressives: S’s description of his/her own feelings about X (e.g. I’m sorry I screamed)
Declarations: S brings about a change in the world by virtue of uttering X (e.g. I divorce you, I resign)

5 Dialogue Acts
Roughly correspond to illocutionary acts
Motivation: Modeling Spoken Dialogue
Many coding schemes (e.g. DAMSL)
Many-to-many mapping between DAs and words
  An Agreement DA can be realized by Okay, Um, Right, Yeah, …
  But each of these can express multiple DAs, e.g.
    S: You should take the 10pm flight.
    U: Okay
      …that sounds perfect.
      …but I’d prefer an earlier flight.
      …(I’m listening)

6 A Possible Coding Scheme for ‘ok’
Ritualistic?
  Closing
  You're welcome
  Other
  No
3rd-Turn-Receipt?
  Yes
If Ritualistic==No, code all of these as well:
Task Management:
  I'm done
  I'm not done yet
  None

7 A Possible Coding Scheme for ‘ok’: Topic, Turn, and Belief Management
Topic Management:
  Starting new topic
  Finished old topic
  Pivot: finishing and starting
Turn Management:
  Still your turn (=traditional backchannel)
  Still my turn (=stalling for time)
  I'm done, it is now your turn
  None
Belief Management:
  I accept your proposition
  I entertain your proposition
  I reject your proposition
  Do you accept my proposition? (=ynq)

8 Practical Goals
In Spoken Dialogue Systems
Disambiguate current DA
  Represent user input correctly
  Respond appropriately
Predict next DA
  Switch Language Models for ASR
  Switch states in semantic processing
Produce DA for next system turn appropriately
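One simple way to picture "predict next DA" is a bigram model over DA tags. The sketch below is only an illustration under stated assumptions: the tag names and the two training dialogues are hypothetical, and the slides do not say which prediction model a given system uses.

```python
from collections import Counter, defaultdict

def train_bigrams(dialogues):
    """Count how often each DA tag is followed by each other DA tag."""
    counts = defaultdict(Counter)
    for tags in dialogues:
        for prev_tag, next_tag in zip(tags, tags[1:]):
            counts[prev_tag][next_tag] += 1
    return counts

def predict_next(counts, current_tag):
    """Return the most frequent follower of current_tag, or None if unseen."""
    if current_tag not in counts:
        return None
    return counts[current_tag].most_common(1)[0][0]

# Hypothetical DA sequences, one list per dialogue.
dialogues = [
    ["opening", "request", "accept", "closing"],
    ["opening", "request", "reject", "request", "accept", "closing"],
]
counts = train_bigrams(dialogues)
predicted = predict_next(counts, "request")
```

A system could use such a prediction to choose which language model or semantic state to activate before the next user turn.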

9 Disambiguating Ambiguous DAs Intonationally
Modal (Can/would/would … willing) questions
  Can you move the piano?
  Would you move the piano?
  Would you be willing to move the piano?
Nickerson & Chu-Carroll ’99: Can info-requests be disambiguated reliably from action-requests?
  By prosodic information?
  Role of politeness

10 Production Studies
Design
  Subjects read ambiguous questions in disambiguating contexts
  Control for given/new and contrastiveness
  Polite/neutral/impolite readings
  ToBI-style labeling
Problems:
  Cells imbalanced; little data
  No pretesting
  No distractors
  Same speaker reads both contexts
  No perception checks

11 Results
Indirect requests (e.g. for action)
  If L%, more likely (73%) to be indirect
  If H%, 46% were indirect: differences in height of boundary tone?
Politeness: can differs in impolite (higher rise) vs. neutral cases
Speaker variability
Some production differences
  Limited utility in production of indirect DAs
  Beware too steep a rise

12 Corpus Studies: Jurafsky et al ’98
Can we distinguish different DA functions for affirmative words?
Lexical, acoustic/prosodic/syntactic differentiators for yeah, ok, uhuh, mhmm, um…
Functional categories to distinguish
  Continuers: Mhmm (not taking floor)
  Assessments: Mhmm (tasty)
  Agreements: Mhmm (I agree)
  Yes answers: Mhmm (That’s right)
  Incipient speakership: Mhmm (taking floor)

13 Questions
Are these terms important cues to dialogue structure?
Does prosodic variation help to disambiguate them?
Is there any difference in syntactic realization of certain DAs, compared to others?

14 Switchboard telephone conversation corpus
Hand segmented and labeled with DA information (initially from text) using the SWBD-DAMSL dialogue tagset
~60 labels that could be combined in different dimensions
84% inter-labeler agreement on tags
Tagset reduced to 42
7 CU-Boulder linguistics grad students labeling Switchboard conversations of human-to-human interaction

15 Relabeling from speech
Only 2% changed labels (114/5757)
  43/987 continuers → agreements
  Why? Shorter duration, lower F0, lower energy, longer preceding pause
DAs analyzed for
  Lexical realization
  F0 and intensity features
  Syntactic patterns

16 Results: Lexical Differences
Agreements: yeah (36%), right (11%), ...
Continuer: uhuh (45%), yeah (27%), …
Incipient speaker: yeah (59%), uhuh (17%), right (7%), …
Yes-answer: yeah (56%), yes (17%), uhuh (14%), ...
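The ambiguity in these figures can be made concrete with a small lookup. This is a minimal sketch, assuming the percentages are read as the share of each DA realized by a given word; ranking DAs by that share (in effect a uniform prior over DAs) is an illustrative assumption and not a method reported in the study. The function name `rank_das` is hypothetical.

```python
# Share of each DA realized by a given word, copied from the figures above.
WORD_SHARE_BY_DA = {
    "agreement": {"yeah": 0.36, "right": 0.11},
    "continuer": {"uhuh": 0.45, "yeah": 0.27},
    "incipient speaker": {"yeah": 0.59, "uhuh": 0.17, "right": 0.07},
    "yes-answer": {"yeah": 0.56, "yes": 0.17, "uhuh": 0.14},
}

def rank_das(word):
    """Rank the DAs that list `word`, highest share first."""
    scored = [
        (da, shares[word])
        for da, shares in WORD_SHARE_BY_DA.items()
        if word in shares
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

yeah_ranking = rank_das("yeah")
uhuh_ranking = rank_das("uhuh")
```

For "yeah" all four DAs appear in the ranking, with the top two only three points apart, which is why the word alone is a weak cue and prosodic or contextual features are examined next.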

17 Prosodic and Lexico/Syntactic Cues
Over all DAs, duration is the best differentiator
  Highly correlated with DA length in words
Assessments: Pro Term + Copula + (Intensifier) + Assessment Adjective
  That’s X (good, great, fine, …)

18 Observations
Yeah (and variations) ambiguous:
  agreement at 36%
  incipient speaker at 59%
  Yes-answer at 86%
Uh-huh (with its variations): a continuer at 45% (vs. yeah at 27%)
Continuers (compared to agreements) are:
  shorter in duration
  less intonationally ‘marked’
  preceded by longer pauses

19 Hypothesis
Prosodic information may be particularly helpful in distinguishing DAs with less lexical content

20 Automatic DA Detection
Rosset & Lamel ’04: Can we detect DAs automatically w/ minimal reliance on lexical content?
  Lexicons are domain-dependent
  ASR output is errorful
Corpora (3912 utts total)
  Agent/client dialogues in a French bank call center, in a French web-based stock exchange customer service center, and in an English bank call center

21 DA tags (44) similar to DAMSL
Conventional (openings, closings)
Information level (items related to the semantic content of the task)
Forward Looking Function:
  Statement (e.g. assert, commit, explanation)
  Influence on Hearer (e.g. confirmation, offer, request)
Backward Looking Function:
  Agreement (e.g. accept, reject)
  Understanding (e.g. backchannel, correction)
Communicative Status (e.g. self-talk, change-mind)
NB: each utt could receive a tag for each class, so utts are represented as vectors
But… only 197 combinations observed
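The "one tag per class" representation can be sketched as a small record type with one slot per class. The field names and sample utterances below are hypothetical, chosen only to show how distinct tag combinations would be counted; they are not taken from the Rosset & Lamel corpora.

```python
from collections import Counter
from typing import NamedTuple, Optional

class DAVector(NamedTuple):
    """One optional tag per class; None means no tag in that class."""
    conventional: Optional[str] = None
    information_level: Optional[str] = None
    forward_function: Optional[str] = None
    backward_function: Optional[str] = None
    communicative_status: Optional[str] = None

# Hypothetical tagged utterances.
utterances = [
    DAVector(conventional="opening"),
    DAVector(information_level="task", forward_function="request"),
    DAVector(information_level="task", backward_function="accept"),
    DAVector(information_level="task", forward_function="request"),
]

# Tuples are hashable, so identical vectors collapse into one combination.
combination_counts = Counter(utterances)
n_combinations = len(combination_counts)
```

Counting distinct vectors this way is how a figure like "only 197 combinations observed" would be obtained from a tagged corpus.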

22 Method: Memory-based learning (TiMBL)
Uses all examples for classification
Useful for sparse data
Features
  Speaker identity
  First 2 words of each turn
  # utts in turn
  Previously proposed DA tags for utts in turn
Results
  With true utt boundaries:
    ~83% accuracy on test data from same domain
    ~75% accuracy on test data from different domain
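Memory-based learning keeps every training example and labels a new item by its nearest stored neighbours. The sketch below is a plain k-nearest-neighbour classifier with an overlap (mismatch-count) distance, written from scratch; it does not use or reproduce the TiMBL software, and the feature tuples and labels are hypothetical stand-ins for the features listed above.

```python
from collections import Counter

def overlap_distance(a, b):
    """Number of feature positions where the two tuples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify(memory, features, k=1):
    """Majority label among the k stored examples closest to `features`.

    Ties in distance are broken by the order of examples in `memory`.
    """
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical memory: (speaker, word1, word2, utts_in_turn, previous_tag) -> DA
MEMORY = [
    (("agent", "good", "morning", 1, "<start>"), "opening"),
    (("client", "i", "would", 2, "opening"), "request"),
    (("agent", "yes", "of", 1, "request"), "accept"),
    (("client", "thank", "you", 1, "assert"), "closing"),
]

predicted_da = classify(MEMORY, ("client", "i", "want", 1, "opening"))
```

Because nothing is abstracted away at training time, rare feature combinations remain available at classification time, which is one reason the approach suits sparse data.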

23 Which DAs are easiest/hardest to detect?
On automatically identified utt units: 3.3% ins, 6.6% del, 13.5% sub

DA           GE.fr    CAP.fr   GE.eng
Resp-to      52.0%    33.0%    55.7%
Backch       75.0%    72.0%    89.2%
Accept       41.7%    26.0%    30.3%
Assert       66.0%    56.3%    50.5%
Expression   89.0%    69.3%    56.2%
Comm-mgt     86.8%    70.7%    59.2%
Task         85.4%    81.4%    78.8%

24 Conclusions
Strong ‘grammar’ of DAs in Spoken Dialogue systems
A few initial words perform as well as more

25 Phonetic, Prosodic, and Lexical Context Cues to DA Disambiguation
Hypothesis: Prosodic information may be important for disambiguating shorter DAs
Observation: ASR errors suggest it would be useful to limit the role of lexical content in DA disambiguation as much as possible
  …and that this is feasible
Experiment:
  Can people distinguish one (short) DA from another purely from phonetic/acoustic/prosodic cues?
  Are they better with lexical context?

26 The Columbia Games Corpus: Collection
12 spontaneous task-oriented dyadic conversations in Standard American English.
2 subjects playing a computer game, no eye contact.
Roles: Describer, Follower

27 The Columbia Games Corpus: Affirmative Cue Words
Cue words: alright, gotcha, huh, mm-hm, okay, right, uh-huh, yeah, yep, yes, yup
Functions
  Acknowledgment / Agreement
  Backchannel
  Cue beginning discourse segment
  Cue ending discourse segment
  Check with the interlocutor
  Stall / Filler
  Back from a task
  Literal modifier
  Pivot beginning
  Pivot ending
Most frequent words (count): the 4565, of 1534, okay 1151, and 886, like 753

28 Perception Study: Selection of Materials
Functions: Cue beginning discourse segment, Acknowledgment / Agreement, Backchannel
Speaker 1: yeah um there's like there's some space there's
Speaker 2: okay I think I got it
Speaker 1: but it's gonna be below the onion
Speaker 2: okay
Speaker 1: okay alright I'll try it okay
Speaker 2: okay the owl is blinking

29 Perception Study: Experiment Design
54 instances of ‘okay’ (18 for each function).
2 tokens for each ‘okay’:
  Isolated condition: Only the word ‘okay’.
  Contextualized condition: 2 full speaker turns:
    The turn containing the target ‘okay’; and
    The previous turn by the other speaker.

30 Perception Study: Experiment Design
Two conditions:
  Part 1: 54 isolated tokens
  Part 2: 54 contextualized tokens
Subjects asked to classify each token of ‘okay’ as:
  Acknowledgment / Agreement, or
  Backchannel, or
  Cue beginning discourse segment.

31 Perception Study: Definitions Given to the Subjects
Acknowledge/Agreement: The function of okay that indicates “I believe what you said” and/or “I agree with what you say”.
Backchannel: The function of okay in response to another speaker's utterance that indicates only “I’m still here” or “I hear you and please continue”.
Cue beginning discourse segment: The function of okay that marks a new segment of a discourse or a new topic. This use of okay could be replaced by now.

32 Perception Study: Subjects and Procedure
Subjects:
  20 paid subjects (10 female, 10 male).
  Ages between 20 and 60.
  Native speakers of English.
  No hearing problems.
Procedure:
  GUI on a laboratory workstation with headphones.

33 Results: Inter-Subject Agreement
Kappa measure of agreement with respect to chance (Fleiss ’71)

                          Isolated    Contextualized
                          Condition   Condition
Overall                   .120        .294
Ack / Agree vs. Other     .089        .227
Backchannel vs. Other     .118        .164
Cue beginning vs. Other   .157        .497
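For readers unfamiliar with the statistic, the standard Fleiss' kappa formula can be sketched as follows. The input layout (one row per token, one column per category, each cell the number of subjects choosing that category) and the helper `one_vs_other` for the "X vs. Other" rows are assumptions made for illustration; the toy ratings are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i][j] = raters assigning item i to category j.

    Every row must sum to the same number of raters.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Overall proportion of assignments falling in each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

def one_vs_other(ratings, j):
    """Collapse to two categories: category j versus everything else."""
    return [[row[j], sum(row) - row[j]] for row in ratings]

# Toy example: 3 tokens, 20 raters, 3 categories, full agreement on each.
perfect = [[20, 0, 0], [0, 20, 0], [0, 0, 20]]
kappa_perfect = fleiss_kappa(perfect)
```

Values near 0 indicate agreement no better than chance and 1 indicates full agreement, so the contextualized scores above reflect higher agreement than the isolated ones.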

34 Results: Cues to Interpretation
Phonetic transcription of okay:
Isolated Condition
  Strong correlation for realization of initial vowel
    Backchannel
    Ack/Agree, Cue Beginning
Contextualized Condition
  No strong correlations found for phonetic variants.

35 Results: Cues to Interpretation
Isolated Condition / Contextualized Condition
Ack / Agree
  Shorter /k/
  Shorter latency between turns
  Shorter pause before okay
Backchannel
  Higher final pitch slope
  Longer 2nd syllable
  Lower intensity
  More words by S2 before okay
  Fewer words by S1 after okay
Cue beginning
  Lower final pitch slope
  Lower overall pitch slope
  Longer latency between turns
  More words by S1 after okay
Pearson’s r for % of subjects choosing this interpretation w/ feature (t-tests to determine significance)
S1 = Utterer of the target ‘okay’. S2 = The other speaker.
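The correlation measure named here relates a feature value for each token to the percentage of subjects who chose a given interpretation for it. A minimal sketch of Pearson's r, written from the textbook definition, is shown below; the slope and percentage values are hypothetical and are not the study's measurements.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical tokens: final pitch slope and % of subjects answering "backchannel".
final_pitch_slope = [-2.0, -0.5, 0.0, 1.0, 2.5]
pct_backchannel = [5, 20, 30, 55, 80]

r = pearson_r(final_pitch_slope, pct_backchannel)
```

A positive r on data like this would correspond to the "higher final pitch slope" entry for Backchannel; a significance test would still be needed before reporting it.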

36 Conclusions
Agreement:
  Availability of context improves inter-subject agreement.
  Cue beginnings easier to disambiguate than the other two functions.
Cues to interpretation:
  Contextual features override word features
  Exception: Final pitch slope of okay in both conditions.
Guide to generation…

37 Summary: Dialogue Act Modeling for SDS
DA identification
  Looks potentially feasible, even when transcription is errorful
  Prosodic and lexical cues useful
DA generation
  Descriptive results may be more useful for generation than for recognition, ironically
  Choice of DA realization, lexical and prosodic

38 Next Class
J&M 22.5
Hirschberg et al ’04
Goldberg et al ’03
Krahmer et al ’01

