Turn-taking and Disfluencies Julia Hirschberg CS 4706 11/23/2018
Today
  Turn-taking behaviors
    Conversational Analysis
    Importance in real systems
  Disfluencies
    How to model? Detect?
    Role in human-human interaction
    Importance in real systems?
Turn-taking
  Expected patterns of behavior
    Deviation is significant
  How do we find the patterns?
    Ordinary conversation
    Telephone talk
    Meetings
    Email?
  Who looks for these?
Terminology
  Examples:
    Adjacency pairs
    Preference
    Pre-sequence
    Repair
    Telephone openings, closings
    Broadcasts
Could this be useful when we build spoken dialogue systems (SDS)?
  What do we expect to hear?
  What should we produce?
Auditory Cues to Turn-Taking
  E. Schegloff, “Reflections on studying prosody in talk-in-interaction,” Language and Speech 41, 1998. (Michael Mu.)
  H. Koiso et al. ’98, “An analysis of turn-taking and backchannels based on prosodic and syntactic features…,” Language and Speech 41, 1998. (Sarah)
Disfluencies and Self-Repairs
  Are these just ‘noise’?
    For people
      S. Brennan & M. Williams, “The Feeling of Another’s Knowing,” J Memory and Language 34, 1995. (Judd)
      S. Brennan & M. Schober, “How listeners compensate for disfluencies in spontaneous speech,” J Memory and Language 44, 2001. (Aron)
    For parsers
    For speech recognizers
Hindle ’83: Finding the Edit Signal
  If we have it, can we ‘repair’ the self-repair automatically?
  Builds a correcting parser, Fidditch, for spontaneous speech
  Given a string with an edit signal marked, produces a ‘repaired’ version
    I was * I am really annoyed
  If X1 * X2 are similar linguistic elements separated by an edit signal, replace X1 with X2
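The replacement rule on this slide can be illustrated with a minimal sketch. It assumes the edit signal is already marked with `*` and approximates "similar linguistic elements" by a plain surface-word match; it is not Fidditch, which works on parsed constituents. The function name `repair_with_edit_signal` is hypothetical, and only the first marked signal is handled.

```python
def repair_with_edit_signal(utterance, signal="*"):
    """Apply a surface-match approximation of 'replace X1 with X2'.

    Assumption: X1 starts at the most recent word before the edit signal
    that equals the first word after it. Real systems compare categories
    and constituents, not just surface strings.
    """
    tokens = utterance.split()
    if signal not in tokens:
        return utterance  # nothing marked, nothing to repair
    cut = tokens.index(signal)
    before, after = tokens[:cut], tokens[cut + 1:]
    if not after:
        return " ".join(before)
    first = after[0].lower()
    # Search backwards for the word that the speaker restarts from.
    for i in range(len(before) - 1, -1, -1):
        if before[i].lower() == first:
            # Drop X1 (before[i:]) and keep X2 (after) in its place.
            return " ".join(before[:i] + after)
    # No surface match (e.g. a restart): keep the words, drop the marker.
    return " ".join(before + after)


print(repair_with_edit_signal("I was * I am really annoyed"))
```

On the slide's example this prints `I am really annoyed`. A restart such as "I just think * Do you want something to eat?" has no surface match, so this sketch leaves the abandoned words in place, which is exactly the case where string matching alone is not enough.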
What does it mean to be the same?
  Same surface string
    Well if they’d * if they’d…
  Same category
    I was just that * the kind of guy…
  Same constituent
    I think that you get * it’s more strict in Catholic schools
  Restarts are completely different…
    I just think * Do you want something to eat?
Bear et al ’92: Detecting and Correcting Self-Repairs
  Use multiple knowledge sources but not edit signal
    Lexical pattern matching
    Parsing failure + pattern matching + re-parsing
    Acoustic information: pause, peak F0
    Cue words: well, no
    Fragments
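As a rough illustration of lexical pattern matching without an edit signal, the sketch below flags word sequences that are immediately repeated and deletes the first copy. This is only one simple pattern chosen for illustration, not the pattern set used by Bear et al.; the function names and the `max_len` parameter are hypothetical.

```python
def find_repetitions(tokens, max_len=3):
    """Return (start, length) for each immediately repeated word sequence."""
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(max_len, 0, -1):  # prefer the longest repetition
            first = [t.lower() for t in tokens[i:i + n]]
            second = [t.lower() for t in tokens[i + n:i + 2 * n]]
            if len(second) == n and first == second:
                hits.append((i, n))
                i += n  # skip the first copy and continue at the second
                matched = True
                break
        if not matched:
            i += 1
    return hits


def delete_repetitions(tokens, max_len=3):
    """Remove the first copy of every repetition found."""
    out = list(tokens)
    # Delete from right to left so earlier indices stay valid.
    for start, n in reversed(find_repetitions(tokens, max_len)):
        del out[start:start + n]
    return out


words = "well if they'd if they'd go".split()
print(find_repetitions(words))
print(" ".join(delete_repetitions(words)))
```

A matcher like this also fires on fluent repetitions ("very very good"), which is one reason to combine it with the other knowledge sources on the slide, such as parsing failure and acoustic information.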
But…is there an edit signal?
RIM Model of Self-Repairs (Nakatani & Hirschberg ’94)
  ATIS corpus
    6414 turns with 346 (5.4%) repairs, 122 speakers
    Hand-labeled for repairs and prosodic features
  Findings:
    Reparanda
      73% end in fragments
      30% end in glottalization (short, no decrease in energy or f0)
      Co-articulatory gestures
    Disfluency interval (DI)
      Pause duration differs significantly from fluent boundaries, but cannot be used alone to predict disfluencies: too many false positives
      Small increase in f0 and amplitude across the DI into the repair
    Repairs
      Offsets occur at phrase boundaries
      Differences in phrasing compared to fluent speech
    CART prediction of repairs: 86% recall (192 of 223 interruption sites), 91% precision (19 false positives)
    Important features: duration of interval, fragment, pause filler, part of speech, lexical matching across DI
Does it identify self-repairs reliably?
  CART prediction: 86% recall, 91% precision
  Features: duration of interval, presence of fragment, pause filler, part of speech, lexical matching across DI
  Are there edit signals?
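The CART figures can be checked from the reported counts. The sketch below assumes that "192 of 223 interruption sites" means 192 true sites were detected out of 223, and that the 19 false positives are detections at fluent boundaries; the function name is hypothetical.

```python
def precision_recall(true_positives, false_positives, total_sites):
    """Precision = correct detections / all detections.
    Recall = correct detections / all true sites."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_sites
    return precision, recall


p, r = precision_recall(true_positives=192, false_positives=19, total_sites=223)
print(f"precision = {p:.1%}, recall = {r:.1%}")
```

Under that reading, 192/223 is the share of interruption sites that were found and 192/(192 + 19) is the share of detections that were correct, which gives the 86% and 91% figures respectively.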
Next Week
  Spoken Dialogue Systems
  Andy, David and Vera reporting