1 A suitable place to speak Jens Edlund & Mattias Heldner Presented 2004-09-21 at KTH, the Department for Speech, Music and Hearing, by Mattias & Jens.

Presentation transcript:

1 A suitable place to speak: On turn-taking for a conversational computer
Mattias Heldner & Jens Edlund
KTH Centre for Speech Technology (CTT)
Seminar given at KTH

2 Structure of conversation
Conversation is characterized by turn-taking
One participant, A, talks, stops; another participant, B, starts, talks, stops
A-B-A-B-A-B etc.
Gaps and overlaps minimized
How is this achieved?

3 Turns and turn-taking
Goodwin, C. (1981). Conversational organization: Interaction between speakers and hearers. New York: Academic Press.
In the abstract, the phenomenon of turn-taking seems quite easy to define. The talk of one party bounded by the talk of others constitutes a turn, with turn-taking being the process through which the party doing the talk of the moment is changed.

4 Conversation analysis (CA) theory of turn-taking
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4).
Turn-taking is (1) an emergent property of (2) local decisions based on (3) prediction by the participants
Turn-taking rules

5 TCUs and TRPs
Turns are composed of smaller turn-constructional units (TCUs)
The end of a TCU is a transition-relevance place (TRP)
TRPs are predictable to the listeners
A set of rules that govern the transition of speakers comes into play at the TRP

6 Turn-taking rule 1a
For any turn, at the initial TRP of an initial TCU:
If the turn-so-far is so constructed as to involve the use of a "current speaker selects next" technique, then the party so selected has the right and is obliged to take next turn to speak; no others have such rights or obligations, and transfer occurs at that place

7 Rule 1b
If the turn-so-far is so constructed as not to involve the use of a "current speaker selects next" technique, then self-selection for next speakership may, but need not, be instituted; first starter acquires rights to a turn, and transfer occurs at that place.

8 Rule 1c
If the turn-so-far is so constructed as not to involve the use of a "current speaker selects next" technique, then current speaker may, but need not, continue, unless another self-selects.

9 Rule 2
If, at the initial transition-relevance place of an initial turn-constructional unit, neither 1a nor 1b has operated, and, following the provision of 1c, current speaker has continued, then the rule-set a–c re-applies at the next transition-relevance place, and recursively at each next transition-relevance place, until transfer is effected.

10 Predictions by the rules
One speaker at a time
Overlaps occur either as competing first starts or where TRPs have been misprojected

11 Simplified rule 1
Rule 1 – applies initially at the first TRP of any turn (C = current speaker, N = next speaker)
(a) If C selects N in current turn, then C must stop speaking, and N must speak next, transition occurring at the first TRP after N selection
(b) If C does not select N, then any (other) party may self-select, first speaker gaining rights to the next turn
(c) If C has not selected N, and no other party self-selects under option (b), then C may (but need not) continue (i.e. claim rights to a further TCU)

12 Simplified rule 2
Rule 2 – applies at all subsequent TRPs
When Rule 1(c) has been applied by C, then at the next TRP Rules 1(a)–(c) apply, and recursively at the next TRP, until speaker change is effected
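The simplified rules can be read as a small decision procedure. The following is a minimal sketch of that reading, not part of the original slides; the names `TRPState`, `next_speaker` and `run_turn` are hypothetical, and the sketch assumes that who was selected, who self-selects, and whether the current speaker goes on are all already known at each TRP.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class TRPState:
    """What is assumed to be observable at one transition-relevance place."""
    current: str                   # C, the current speaker
    selected_next: Optional[str]   # N, if C used a "current speaker selects next" technique
    self_selectors: Sequence[str]  # parties who start speaking, in order of starting
    current_continues: bool        # whether C chooses to claim a further TCU


def next_speaker(state: TRPState) -> Optional[str]:
    """Apply simplified rule 1 (a)-(c) at a single TRP."""
    # (a) C selected N: N must speak next.
    if state.selected_next is not None:
        return state.selected_next
    # (b) No selection: the first other party to start gains the turn.
    others = [p for p in state.self_selectors if p != state.current]
    if others:
        return others[0]
    # (c) Nobody self-selected: C may, but need not, continue.
    if state.current_continues:
        return state.current
    return None  # nobody speaks at this TRP


def run_turn(trps: Sequence[TRPState]) -> Optional[Tuple[int, Optional[str]]]:
    """Simplified rule 2: re-apply rule 1 at each TRP while C keeps the floor.

    Returns (index of the TRP, speaker) for the first TRP at which the
    current speaker does not continue, or None if C holds the floor throughout.
    """
    for index, state in enumerate(trps):
        speaker = next_speaker(state)
        if speaker != state.current:
            return index, speaker
    return None
```

For example, if A continues past a first TRP and B starts before C at the second, `run_turn` reports a transfer to B at index 1. The sketch deliberately ignores how TRPs are projected, which is the hard part discussed in the rest of the talk.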

13 What exactly is a TCU?
Syntactic unit
TRPs occur at possible completion points of sentences, clauses, phrases, and one-word constructions
Intonation also important

14 Problems with CA theory of turn-taking
Syntactic (and semantic and pragmatic) categories can be very difficult to segment in spoken dialogue
Spontaneous conversation is not always well-formed – fragmentary and/or ungrammatical utterances common
Non-verbal signals not included

15 Psychologists working on conversation
Duncan, S., Jr. (1972). Some signals and rules for taking speaking turns in conversations. Journal of Personality and Social Psychology, 23(2).
Turn-taking is regulated by explicit signals
A current speaker signals when he/she intends to hand over the floor – turn-yielding
No single cue is required to display a signal

16 Turn-taking rules
The listener may take his speaking turn when the speaker gives a turn-yielding signal.
An attempt-suppressing signal displayed by the speaker maintains the turn for him, regardless of the number of yielding cues concurrently being displayed.
Back-channel communication does not constitute a turn or a claim for a turn.

17 Turn-yielding signals
Intonation: Rising or falling terminal junctures (boundary tones)
Paralanguage: Drawl on the final syllable (final lengthening)
Body motion: Termination of any hand gesticulation
Sociocentric sequences: e.g. but uh, or something, you know
Paralanguage: Drop in pitch and loudness
Syntax: Completion of a grammatical clause
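The signalling view above amounts to a simple gate: any one or more yielding cues make up a turn-yielding signal, and an attempt-suppressing signal overrides it. The sketch below is one illustrative reading of that logic; the cue labels and the function name `listener_may_take_turn` are hypothetical and are not taken from Duncan's paper.

```python
# Hypothetical labels for the six turn-yielding cues listed on the slide.
TURN_YIELDING_CUES = frozenset({
    "terminal_juncture",      # rising or falling boundary tone
    "final_drawl",            # drawl on the final syllable
    "gesture_termination",    # end of hand gesticulation
    "sociocentric_sequence",  # "but uh", "or something", "you know"
    "pitch_loudness_drop",    # drop in pitch and loudness
    "clause_completion",      # completion of a grammatical clause
})


def listener_may_take_turn(displayed_cues, attempt_suppressing=False):
    """Return True if the listener may take the turn under the rules above.

    No single cue is required: any non-empty set of yielding cues counts as
    a turn-yielding signal. An attempt-suppressing signal keeps the turn with
    the speaker regardless of how many yielding cues are displayed.
    """
    cues = set(displayed_cues)
    unknown = cues - TURN_YIELDING_CUES
    if unknown:
        raise ValueError(f"unknown cue labels: {sorted(unknown)}")
    if attempt_suppressing:
        return False
    return len(cues) > 0
```

Back-channels are outside this function on purpose: under the third rule they are neither a turn nor a claim for one, so they never need permission.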

18 Gaze
Kendon, A. (1967). Some functions of gaze-direction in social interaction. Acta Psychologica, 26.
A speaker will break mutual gaze while speaking, returning gaze to the addressee upon turn completion

19 Problems with signaling view
Simultaneous speaking occurs either because the listener attempts to take his speaking turn in the absence of a turn-yielding signal by the speaker, or because the speaker displays a yielding signal, the listener acts to take his turn, and the original speaker then continues to claim his speaking turn.

20 Synthesis of CA and psycho theories
Signals indicating the completion of turn-constructional units do indeed occur
Signals are the features that conversants use to identify the turn-constructional units and their boundaries.
Much subsequent work on turn-taking has tried to analyze what features are used to signal a TRP

21 Final major accents
Wells, B., & MacFarlane, S. (1998). Prosody as an interactional resource: turn projection and overlap. Language and Speech, 41(3-4).
Define the TRP as the space between the TRP-projecting accent of the current turn and the onset of the next turn
TRP-projecting accent = final major accent (focal accent)

22 Boundary tones
Caspers, J. (2003). On the function of low and high boundary tones in Dutch dialogue. In Proceedings ICPhS 2003 (pp ). Barcelona.
Caspers, J. (2003). Local speech melody as a limiting factor in the turn-taking system in Dutch. Journal of Phonetics, 31.

23 Boundary tones...
High boundary tone associated with obligatory aspects of turn-taking
  change of turn, e.g. answer to a question
  turn holding, e.g. continued speech after a pause following an incomplete message
Low boundary tone associated with optional aspects of turn-taking
  completeness of a domain

24 Incomings
French, P., & Local, J. (1983). Turn-competitive incomings. Journal of Pragmatics, 7(1).
Turn-competitive incomings, i.e. interruptions – before final major accent
Non-turn-competitive incomings, i.e. backchannels/asides – after final major accent

25 Resolution of overlap
One speaker drops out rapidly
Recycling of the part obscured by overlap
Competitive allocation – the speaker who upgrades (increases intensity, slows tempo, etc.) most wins the floor

26 Summarizing
Turn-endings predictable
Gap and overlap minimized
Syntactic, semantic, pragmatic completeness
Gaze, head nods, hand gestures, facial expressions
Prosody! Boundary tones, accents, speaking rate, silent pauses, voice quality etc.

27 Summarizing...

28 A suitable place to speak...

29 Ultimately a conversational computer should be able to:
perceive turn-keeping and turn-yielding signals
initiate turns after turn-yielding signals
make non-competitive and turn-competitive incomings
react to incomings from other participants
avoid interrupting human participants – it must be unobtrusive!

30 Prosodic boundaries
Turns where the speaker is allowed to finish end in a prosodic boundary
Prosodic boundary turn-taking position
Prosodic boundaries predictable to listeners from left-hand context only
Prosodic rather than lexico-grammatical information the primary cue
To some extent detectable using prosodic feature vectors and statistical classifiers

31 Goal
Ultimate goal: Online prediction of acceptable places for turn-takings, as well as of impossible ones, for a conversational computer
A step towards this goal: Exploring the relation between turn-taking and prosodic boundaries
Two experiments: A listening test and a production experiment

32 Listening test
Made-up turn-takings
Fragment of a seminar followed by fragment of a question: what about could you give us some rough idea what
Turn-takings in no boundary, weak boundary and strong prosodic boundary positions
Task: to rate whether the questions occurred in appropriate places on a five-point scale

33 1. Turn-taking at a strong boundary

34 2. Turn-taking at a weak boundary

35 3. Turn-taking at no boundary

36 No boundary, Weak boundary, Strong boundary

37 Table (values not preserved): Mean score and Std dev for No boundary, Weak boundary, Strong boundary, and TOTAL

38 Results by stimuli
All strong boundary stimuli got higher means than the total of the experiment
Nine out of ten no boundary stimuli got lower means than the total
More variation in the weak boundaries

39 Production experiment
Same speech material as in listening test
Subjects pressed a button when they thought it was appropriate to take the turn
Demo…

40 Demo

41 Results of production experiment
Clear preference for strong boundaries (77%)
Most of the strong boundaries (84%) used for turn-taking at least once

42 Timing differences
Silence before question dependent on boundary type: Strong boundary 1 s, weak boundary 0.6 s, no boundary 0.02 s.
Future work: Check whether the length of the silence should be governed by the prosodic boundary strength

43 Discussion
Strong boundaries preferred both in listening and production experiments
Exceptions to the general trend due to semantic and pragmatic factors:
  Simultaneous starts
  Enumerations

44 Conclusions
Turn-taking closely related to prosodic boundaries
Appropriate to take the turn after strong boundaries in this communicative situation
If we can predict strong boundaries, we can predict possible places for turn-taking

45 Your turn to work...

46 Finding a suitable place to speak
How to identify prosodic boundaries to find strong boundaries and to avoid weak and no boundaries?
Preliminary results of acoustic analysis in the rhymes of the last words before the turn-takings

47 Boundary tones
Level (less than 1 ST)
Fall
Rise
Fall-rise
Rise-fall
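The five categories could be assigned automatically from an F0 contour over the word-final rhyme. The sketch below is only one possible way to do that, not the procedure used in the study: the 1 ST level threshold comes from the slide, while the 100 Hz reference, the reuse of the same threshold for turning points, and the function names are assumptions made for illustration.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def classify_boundary_tone(f0_hz, level_threshold_st=1.0):
    """Label a voiced F0 contour as level, fall, rise, fall-rise or rise-fall.

    f0_hz is a sequence of at least two voiced F0 samples (Hz) in time order.
    """
    if len(f0_hz) < 2:
        raise ValueError("need at least two F0 samples")
    st = [hz_to_semitones(f) for f in f0_hz]
    start, end = st[0], st[-1]
    low, high = min(st), max(st)

    # Level: the whole contour stays within less than 1 ST.
    if high - low < level_threshold_st:
        return "level"
    # Fall-rise: a valley at least one threshold below both endpoints.
    if start - low >= level_threshold_st and end - low >= level_threshold_st:
        return "fall-rise"
    # Rise-fall: a peak at least one threshold above both endpoints.
    if high - start >= level_threshold_st and high - end >= level_threshold_st:
        return "rise-fall"
    # Otherwise decide on the direction between the endpoints.
    return "rise" if end > start else "fall"
```

A contour such as `[200, 170, 200]` Hz dips by almost 3 ST and recovers, so it is labelled a fall-rise; real F0 tracks would also need smoothing and octave-error handling, which this sketch leaves out.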

48 F0 range
Cumulative mean ±2 standard deviations based on semitone transformed F0 data
High, mid and low registers
Stabilizes after about 20 seconds
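A cumulative mean ±2 standard deviations can be updated frame by frame, which suits online use. The following is a minimal sketch under stated assumptions: it uses Welford's running-variance update, a population standard deviation, and a 100 Hz semitone reference, and the equal-thirds split into low, mid and high registers in `register` is a hypothetical choice, since the slide does not say how the registers are delimited.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def cumulative_f0_range(f0_hz_frames, ref_hz=100.0):
    """Return a list of (mean, low, high) in semitones, one per voiced frame.

    low and high are the cumulative mean minus and plus 2 standard deviations.
    Frames that are None or not positive are treated as unvoiced and skipped.
    """
    count, mean, m2 = 0, 0.0, 0.0
    ranges = []
    for f0 in f0_hz_frames:
        if f0 is None or f0 <= 0:
            continue
        x = hz_to_semitones(f0, ref_hz)
        # Welford's update for the running mean and sum of squared deviations.
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        sd = math.sqrt(m2 / count)  # population SD (an assumption)
        ranges.append((mean, mean - 2.0 * sd, mean + 2.0 * sd))
    return ranges


def register(x_st, low, high):
    """Place a semitone value in the low, mid or high third of [low, high]."""
    third = (high - low) / 3.0
    if x_st < low + third:
        return "low"
    if x_st < low + 2.0 * third:
        return "mid"
    return "high"
```

The last tuple returned is the current estimate of the speaker's range; according to the slide, an estimate of this kind settles after roughly 20 seconds of speech.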

49 F0 range with F0 curve

50 Other measures
Silent intervals
Final lengthening
  Average z-score normalized duration of the segments in the word-final rhyme
  Z-score normalized duration of the word-final segment
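The two final-lengthening measures can be computed once each segment has reference duration statistics. The sketch below assumes those statistics are given per phone label as a (mean, standard deviation) pair in seconds; the slide does not state what the durations were normalized against, so that choice, the function names and the example numbers in the usage note are all hypothetical.

```python
from statistics import mean


def z_score(duration_s, ref_mean_s, ref_sd_s):
    """Z-score of one segment duration against its reference statistics."""
    return (duration_s - ref_mean_s) / ref_sd_s


def final_lengthening(rhyme, reference):
    """Compute both lengthening measures for a word-final rhyme.

    rhyme: list of (phone_label, duration_s) in time order.
    reference: dict mapping phone_label to (mean_s, sd_s); an assumed input.
    """
    if not rhyme:
        raise ValueError("rhyme must contain at least one segment")
    zs = [z_score(dur, *reference[phone]) for phone, dur in rhyme]
    return {
        "rhyme_mean_z": mean(zs),   # average over the segments in the rhyme
        "final_segment_z": zs[-1],  # the word-final segment only
    }
```

With a made-up reference of `{"a": (0.10, 0.02), "n": (0.06, 0.01)}`, a rhyme `[("a", 0.14), ("n", 0.08)]` comes out about two standard deviations longer than usual on both measures.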

51 Region 11

52 Region 9

53 Region 10

54 Region 15

56 Region 24

57 Turn-keeping
Duncan, 1972; 2 2 | (English)
Selting, 1996; level pitch before pause (German)
Caspers, 2003; level boundary tone (Dutch)
Noguchi & Den, 1998; flat intonation at the end of pause-bounded phrases (Japanese)

58 Thank you!