1
Speaking More Like You: Lexical, Acoustic/Prosodic, and Discourse Entrainment in Spoken Dialogue Systems
Julia Hirschberg, Columbia University
SIGdial 2008
11/26/2018
2
Entrainment/Adaptation/Accommodation/Alignment/Priming
Hypothesis: People tend to adapt their communicative behavior to that of their conversational partner
Consequences
  Key to successful communication (Goleman ’06)
  Entrainment leads subjects to like their conversational partners more and to perceive conversations as more successful (Chartrand & Bargh ’99)
  Reitter et al. ’07 found entrainment to be a good predictor of task success in the Map Task
  Norma Mendoza-Denton (Oprah and /ay/), 1999
3
Dimensions of Entrainment
Lexical and syntactic: Collaborating on referential choice
  A: It’s that thing that looks like a harpsichord.
  B: So the harpsichord-looking thing…
  B: The harpsichord…
Semantic (Mills & Healey, yesterday)
Phonological
  Word pronunciation/accent (e.g. Oprah)
Acoustic/Prosodic
  Speaking rate, pitch range, intensity, contour, voice quality
Socio-cultural dimensions
4
Facial expression and gesture
Who studies entrainment?
  Linguists, Psycholinguists, Sociolinguists, Computational Linguists (Spoken Dialogue Systems researchers)
Research questions
  What are the dimensions of entrainment?
  How can we measure it?
  Does everyone entrain? Along the same dimensions?
  What are the consequences of entrainment? Of non-entrainment? Of dis-entrainment?
  What types of entrainment can and should be modeled in SDS?
5
Outline
Previous work
  Lexical
  Acoustic-prosodic
  Discourse/social level
  SDS
Entrainment in the Columbia Games Corpus
  The Corpus
  New approaches to lexical entrainment
  Pilot studies
Conclusions and Future Directions
7
Lexical Entrainment
Gricean prediction for choice of referring expression
  People use descriptions that minimally but effectively distinguish among items in the discourse (Maxim of Quantity)
Garrod & Anderson ’87: Output/Input Principle
  Conversational partners formulate their current utterance according to the model used to interpret their partner’s most recent utterance
Clark, Brennan, et al.’s Conceptual Pacts
  People make Conceptual Pacts with respect to appropriate referring expressions with particular conversational partners
  Reluctant to abandon these even when shorter expressions would be sufficient (e.g. ‘red car’ even when no other cars are visible)
More on Clark & Brennan?
8
Entrainment in Acoustic/Prosodic Dimensions
How are speech timing and voice quality affected by an (unfamiliar) conversational partner? (Sherblom & La Riviere ’87)
Study: 65 undergraduate pairs asked to discuss a ‘problem situation’ together
  Each subject utters a single sentence before and after the conversation
  Sentences compared for speaking rate, utterance length and vocal jitter
Results: Substantial influence of partner on all 3 measures
  Gender, interpersonal uncertainty and differences in arousal influenced degree of adaptation
Other investigations of the effect of gender: only when the partner was male was there a significant effect of gender on adaptation
Switch to another example?
9
How early do we start to entrain?
Do children entrain to their mother’s speaking rate? (Guitar & Marchinkoski ’01)
Study: 6 mothers with their own (‘normal’) 3-yr-olds (3M, 3F)
  Mothers’ speaking rate significantly reduced (B) or not (A) in an A-B-A-B design
Results: 5/6 children reduced their rate when their mothers spoke more slowly
10
Do humans adapt to the behavior of non-human partners? (Coulston et al. ’02)
Study: children interacted with an extroverted, loud animated character and with an introverted, soft character (TTS voices)
  Multiple tasks using different amplitude ranges and response latencies
Results: 79-94% of children adapted their amplitude, bidirectionally
  Also adapted response latencies (mean 18.4%), bidirectionally
11
Social Entrainment
Do speakers adapt to the style of other social classes? (Azuma ’97)
Study: Emperor Hirohito visits the countryside
  Corpus-based study of the speech style of Japanese Emperor Hirohito during chihoo jyunkoo (‘visits to countryside’)
  Published transcripts of speeches
Findings: Emperor Hirohito converged his speech style to that of listeners lower in social status
  Choice of verb forms and pronouns no longer those of the person with highest authority
  Perceived as like those of a (low-status) mother
  Before, used pronouns used only by the highest authority (though there is only one recorded instance of such a speech)
12
Do speakers adapt in cultural markers? (Roth ’05)
Context
  High school in NE with predominantly African-American student body
  Co-teachers
    Cristobal: Cuban-African-American teacher
    Chris: new Italian-American teacher
Adaptation of Chris to Cristobal
  Catch phrases (e.g. right!, really really hot) and their production: pitch and intensity contours
  Pitch ‘matching’ across speakers
Mimesis vs entrainment
‘Right’ rises sharply
‘Really really’ is downstepped, then higher pitch on the modifier
13
Entrainment in SDS
If users entrain to systems, systems can
  Predict vocabulary better and improve recognition
  Influence other user behavior such as speaking rate or amplitude to improve recognition
If systems entrain to users, they might
  Improve task performance
  Enhance user satisfaction
Is entrainment feasible, given current technology?
14
KTH’s Waxholm System
15
Verb Priming: How often do you go abroad on holiday?
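Prompt with åker (‘go’): Hur ofta åker du utomlands på semestern?
Responses:
  jag åker en gång om året kanske (I go once a year maybe)
  jag åker ganska sällan utomlands på semester (I go abroad on holiday rather seldom)
  jag åker nästan alltid utomlands under min semester (I almost always go abroad during my holiday)
  jag åker ungefär 2 gånger per år utomlands på semester (I go abroad on holiday about 2 times per year)
  jag åker utomlands nästan varje år (I go abroad almost every year)
  jag åker utomlands på semestern varje år (I go abroad on holiday every year)
  jag åker utomlands ungefär en gång om året (I go abroad about once a year)
  jag är nästan aldrig utomlands (I am almost never abroad)
  en eller två gånger om året (once or twice a year)
  en gång per semester kanske (once per holiday maybe)
  en gång per år ungefär (about once per year)
  åtminståne en gång om året (at least once a year)
  nästan aldrig (almost never)
Prompt with reser (‘travel’): Hur ofta reser du utomlands på semestern?
Responses:
  jag reser en gång om året utomlands (I travel abroad once a year)
  jag reser inte ofta utomlands på semester det blir mera i arbetet (I don't often travel abroad on holiday, it is more for work)
  jag reser reser utomlands på semestern vartannat år (I travel travel abroad on holiday every other year)
  jag reser utomlands en gång per semester (I travel abroad once per holiday)
  jag reser utomlands på semester ungefär en gång per år (I travel abroad on holiday about once per year)
  jag brukar resa utomlands på semestern åtminståne en gång i året (I usually travel abroad on holiday at least once a year)
  en gång per år kanske (once per year maybe)
  en gång vart annat år (once every other year)
  varje år (every year)
  vart tredje år ungefär (about every third year)
  nu för tiden inte så ofta (nowadays not so often)
  varje år brukar jag åka utomlands (every year I usually go abroad)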
16
CMU’s Let’s Go Lab
17
Systems Entraining to Users
Let’s Go adapts confirmation prompts to speech of non-native users, finding the closest match to user input in its own grammar and lexicon (Raux & Eskenazi 2004)
Check this: nothing beyond?
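As a rough illustration of "finding the closest match to user input", the sketch below picks the system phrase most similar to what the user said. It uses the string similarity in Python's standard `difflib` as a stand-in; this is an assumption for illustration, not the matching method of Raux & Eskenazi, and the phrase list and function name are hypothetical.

```python
import difflib

# Hypothetical phrases the system's grammar can produce (illustrative only).
SYSTEM_PHRASES = [
    "going to the airport",
    "leaving from downtown",
    "arriving at forbes avenue",
]

def closest_system_phrase(user_input, phrases=SYSTEM_PHRASES, cutoff=0.0):
    """Return the system phrase most similar to the user's wording, or None."""
    matches = difflib.get_close_matches(
        user_input.lower(), phrases, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None

# A confirmation prompt could then reuse the matched phrase.
match = closest_system_phrase("i go to airport")
prompt = f"Did you mean: {match}?" if match else "Could you repeat that?"
```

Raising `cutoff` makes the system fall back to a generic prompt when nothing in its lexicon is close enough to the user's wording.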
19
Entrainment in the Columbia Games Corpus
Joint work with Agus Gravano and Ani Nenkova (ACL 2008)
Corpus-based approach to multiple dimensions of entrainment
Questions:
  What types of entrainment occur?
  How should we measure entrainment?
  What are the consequences of entrainment? Of dis-entrainment?
  How much should/can entrainment be modeled in SDS?
20
The Columbia Games Corpus
12 spontaneous task-oriented dyadic conversations (9h 8m of speech)
2 subjects play a series of computer games with no eye contact (mean session time 45m 39s)
2 sessions per subject, with different partners
Multiple games and game types
Recorded on separate channels in a soundproof booth, digitized and downsampled to 16 kHz
All user and system behaviors logged
~73K words
21
Cards Game #1
Player 1 (Describer), Player 2 (Searcher)
Short monologues
Vary frequency and order of occurrence of objects on the cards.
22
Cards Game #2
Player 1 (Describer), Player 2 (Searcher)
Dialogue
Vary frequency and order of occurrence of objects on the cards across speakers.
23
Objects Game
Follower must place the target object where it appears on the Describer’s screen, relying solely on the description provided (4h 19m)
Figure: Describer’s screen and Follower’s screen
24
Annotation
Orthographic transcription and alignment (~73k words)
Intonation, using ToBI conventions
Laughs, coughs, breaths, smacks, throat-clearings
Self-repairs
Function (10 categories) of affirmative cue words (alright, mm-hm, okay, right, uh-huh, yeah, yes, …)
Question form and function
Turn-taking behaviors
25
Entrainment in Referring Expressions
S13: the orange M&M looking kind of scared and then a one on the bottom left and a nine on the bottom right
S12: alright I have the exact same thing I just had it's an M&M looking scared that's orange
S13: yeah the scared M&M guy yeah
S12: framed mirror and the scared M&M on the lower right
S13: and it's to the right of the scared M&M guy
S13: yeah and the iron should be on the same line as the frightened M&M kind of like an L
S12: to the left of the scared M&M to the right of the onion and above the iron
Neither S12 nor S13 carried this description over to their second session.
26
Entrainment and High-Frequency Words
Lexical entrainment: agreeing on a common vocabulary (Niederhoffer & Pennebaker ’02)
How does entrainment to another’s use of HFWs (the N most common words in a corpus) affect task success in dialogue?
Data: subset of the Columbia Games Corpus (48 tasks)
Recall that:
  Game scores are available for each game
  The corpus is labeled for cue phrases, turn-taking, and other behaviors
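A minimal sketch of how a high-frequency word list could be extracted, assuming the transcripts are already tokenized and that case is ignored. The function name and the toy tokens are hypothetical, not taken from the corpus tools.

```python
from collections import Counter

def high_frequency_words(tokens, n=25):
    """Return the n most common word types in a list of tokens."""
    counts = Counter(t.lower() for t in tokens)
    return [word for word, _ in counts.most_common(n)]

# Toy token list standing in for a transcribed corpus (illustrative only).
tokens = "the lion is on the left of the mirror okay the lion okay".split()
top3 = high_frequency_words(tokens, n=3)
```

Running the same function over a whole corpus or over a single game would give corpus-level and game-level lists respectively.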
27
Experiments
Does entrainment in use of the 25 most frequent words in the corpus (HFW-C) or in the game (HFW-G) correlate with task success, as defined by game scores?
Does entrainment in use of affirmative cue words (ACW) correlate with task success?
  ACW represent 7.9% of all words in the corpus
Does entrainment in use of filled pauses (FP) correlate with task success?
Are dialogues smoother and more coordinated when entrainment occurs?
28
Entrainment Metric I
Intuition: Do speakers use particular word tokens in like proportions to all words they use in a conversation? The greater the difference, the less the degree of entrainment.
Where
  fraction(w, Si) = fraction of times Speaker i used word w in the conversation
  c = word class (the metric generalizes from single words to word classes)
Range: -1 (no entrainment) to 0 (complete entrainment)
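One reading of these definitions, offered as a sketch under stated assumptions: score a word by the negated absolute difference between the two speakers' usage fractions, ENTR1(w) = -|fraction(w, S1) - fraction(w, S2)|, and score a word class by summing over its words. Both forms are assumed here; the function names and toy data are hypothetical.

```python
from collections import Counter

def fraction(word, tokens):
    """Fraction of a speaker's tokens that are `word` (0.0 if no tokens)."""
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0

def entr1_word(word, s1_tokens, s2_tokens):
    """Assumed form: -|fraction(w, S1) - fraction(w, S2)|; 0 means identical usage."""
    return -abs(fraction(word, s1_tokens) - fraction(word, s2_tokens))

def entr1_class(word_class, s1_tokens, s2_tokens):
    """Assumed generalization: sum of the per-word scores over the class."""
    return sum(entr1_word(w, s1_tokens, s2_tokens) for w in word_class)

# Toy conversation sides (illustrative only).
s1 = ["okay", "the", "lion", "okay"]   # "okay" is 2/4 of S1's words
s2 = ["yeah", "the", "lion", "okay"]   # "okay" is 1/4 of S2's words
score_okay = entr1_word("okay", s1, s2)
score_acw = entr1_class({"okay", "yeah"}, s1, s2)
```

Because the scores are negated differences, values closer to 0 indicate more similar usage between the two speakers.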
29
Entrainment Metric II
Where
  countSi(w) = number of times Si used word w in the conversation
Range: -1 (no entrainment) to 0 (complete entrainment)
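A sketch of a count-based metric consistent with the definition and range given here, assuming the score is the negated sum of absolute count differences over a word class, divided by the two speakers' combined count for that class. This normalization is an assumption chosen to match the stated range of -1 to 0, and the function name is hypothetical.

```python
from collections import Counter

def entr2(word_class, s1_tokens, s2_tokens):
    """Assumed form: -sum|count_S1(w) - count_S2(w)| / sum(count_S1(w) + count_S2(w))."""
    c1, c2 = Counter(s1_tokens), Counter(s2_tokens)
    total = sum(c1[w] + c2[w] for w in word_class)
    if total == 0:
        # Convention for this sketch only: neither speaker used the class.
        return 0.0
    diff = sum(abs(c1[w] - c2[w]) for w in word_class)
    return -diff / total

# Toy data (illustrative only): S1 says "okay" three times, S2 once.
mixed = entr2({"okay"}, ["okay", "okay", "okay"], ["okay"])
```

Under this form, two speakers who use disjoint words from the class score -1, and speakers with identical counts score 0.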
32
Correlations with Game Score
NHFW-X: the N highest-frequency words in the Corpus (C) or Game (G)
ACW: Affirmative Cue Words
No correlation for filled pauses

Word class   ENTR1 cor (p)   ENTR2 cor (p)
25HFW-C      (0.02)          (0.20)
25HFW-G      (0.01)          (0.07)
ACW          (0.12)          (0.01)
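To make the "cor" column concrete, the sketch below computes a correlation between per-task entrainment scores and game scores. Pearson's r is an assumption (the slide does not name the coefficient), the numbers are invented toy values rather than corpus results, and the p-value computation is omitted.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)

# Invented toy values: one entrainment score and one game score per task.
entrainment = [-0.9, -0.5, -0.4, -0.1]
game_score = [12, 30, 28, 45]
r = pearson_r(entrainment, game_score)
```

A positive r here would mean that tasks with entrainment scores closer to 0 (more entrainment) tended to have higher game scores.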
33
Entrainment & Dialogue Coordination
Are dialogues more coordinated when entrainment occurs?
Columbia Games Corpus labeled for type of turn exchange (Beattie, 1982), including:
  Smooth Switch: S2 starts his turn after S1 has finished hers
  Interruption: S2 starts his turn before S1 has finished hers
  Overlap: S2 starts his turn just before S1 has finished hers, but S1 finishes her turn
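The three categories can be approximated from turn timestamps. The sketch below is a simplification under stated assumptions: times are in seconds, `s1_completed` is a hypothetical annotation saying whether S1 finished her turn, and "just before" is given no numeric threshold, so any simultaneous speech in which S1 still finishes is labeled an overlap.

```python
def classify_turn_exchange(s1_end, s2_start, s1_completed):
    """Label a speaker change as smooth switch, overlap, or interruption."""
    if s2_start >= s1_end:
        return "smooth switch"          # S2 waits until S1 has stopped
    # S2 started while S1 was still speaking.
    return "overlap" if s1_completed else "interruption"

def switch_latency(s1_end, s2_start):
    """Gap in seconds between S1's end and S2's start (for smooth switches)."""
    return s2_start - s1_end

# Toy timings (illustrative only).
label = classify_turn_exchange(s1_end=4.0, s2_start=4.5, s1_completed=True)
gap = switch_latency(4.0, 4.5)
```

Counting the labels per session gives proportions of overlaps and interruptions, and averaging the gaps over smooth switches gives a mean latency.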
34
Significant Correlations (p<.05)
ENTR1(ACW) & Prop. of Overlaps
ENTR2(ACW) & Prop. of Overlaps
ENTR2(ACW) & Mean Latency of Smooth Switches (cor = -0.76)
ENTR2(25MF-G) & Prop. of Overlaps
ENTR1(25MF-C) & Prop. of Interruptions (cor = -0.61)
Greater HFW/ACW entrainment is associated with more overlaps, fewer interruptions, and shorter inter-turn latencies
35
Acoustic/Prosodic Pilot Studies: Speaking Rate
Do speakers entrain in (mean) speaking rate to their (different) partners?
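One simple way to pose this question quantitatively, assuming speaking rate is measured in words per second (syllables per second is another common choice) and that each turn comes with a start and end time. The tuple layout, function names, and toy turns are hypothetical.

```python
def mean_speaking_rate(turns):
    """Mean words per second over (text, start_sec, end_sec) turns."""
    rates = [
        len(text.split()) / (end - start)
        for text, start, end in turns
        if end > start  # skip zero-length or malformed turns
    ]
    return sum(rates) / len(rates) if rates else 0.0

def rate_distance(turns_a, turns_b):
    """Absolute difference in mean rate; smaller means more similar speakers."""
    return abs(mean_speaking_rate(turns_a) - mean_speaking_rate(turns_b))

# Toy turns (illustrative only).
speaker_a = [("the lion is on the left", 0.0, 2.0)]   # 6 words in 2.0 s
speaker_b = [("okay got it", 0.0, 1.5)]               # 3 words in 1.5 s
distance = rate_distance(speaker_a, speaker_b)
```

Since each subject had two sessions with different partners, comparing a speaker's distance to each actual partner against their distance to non-partners would be one way to look for entrainment.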
37
Outline Previous work Lexical Acoustic-prosodic Discourse/social level
SDS Entrainment in the Columbia Games Corpus The Corpus New approaches to lexical entrainment Pilot studies Conclusions and Future Directions 11/26/2018
38
Current/Future Work
Examine additional dimensions of entrainment and how they correlate with task success and perceived naturalness
  Acoustic/prosodic
    Pitch, intensity, rate measures
    Voice quality, contour
  Discourse: Do people entrain in styles of topic shift, use of cue phrases, turn-taking behaviors?
  Laughter, disfluencies
  Personality?
Explore additional measures of entrainment
39
Experiment with system entrainment to users in SDS
40
Thank you!