Presentation transcript:

The Effects of Prosodic Features on the Interpretation of Clarification Ellipses
Jens Edlund, David House and Gabriel Skantze

Abstract
In this paper, the effects of prosodic features on the interpretation of elliptical clarification requests in dialogue are studied. An experiment is presented where subjects were asked to listen to short human-computer dialogue fragments in Swedish, where a synthetic voice was making an elliptical clarification after a user turn. The prosodic features of the synthetic voice were systematically varied, and the subjects were asked to judge what was actually intended by the computer. The results show that an early low F0 peak signals acceptance, that a late high peak is perceived as a request for clarification of what was said, and that a mid high peak is perceived as a request for clarification of the meaning of what was said.

The Problem
Elliptical one-word clarification requests are potentially ambiguous
  Little syntax and structure
  Prosody more critical
How do prosodic features affect the interpretation of these utterances?

Setting
The Higgins spoken dialog system for pedestrian navigation
Levels of understanding: Allwood et al. (1992), Clark (1996)
  Level           Description
  Acceptance      H accepts what S says
  Understanding   H understands what S says
  Perception      H hears what S says
  Contact         H hears that S speaks
Ellipsis interpretation

Errors and clarification in dialog
Dialog not always error free
Error detection often made using explicit or implicit spoken clarification/verification

Traditionally:
  User: […] on the right I see a red building.
  System (low conf.): Did you say ‘A red building’?
  System (high conf.): A red building… ok, take a left […]?
  Constructed as full propositions
  Often perceived as tedious
  Clarifies entire user utterances

Clarification Ellipses
  User: […] on the right I see a red building.
  System: red(?)
  Advantages:
    Fast
    Focuses on problematic fragment
    Often used in human-human dialog

Question intonation
Swedish question intonation
  Raised top-line and widened F0 range on focal accent (Gårding, 1998)
  Delayed focal peak (House, 2003)
German dialog: Rodriguez & Schlangen (2004)
  Rising boundary tones to clarify acoustic problems (perception)
  Used less for reference resolution (understanding)

Experiment
8 subjects judged the meaning of one-word elliptical clarification requests in dialogue context
Task: Select paraphrase for elliptical system utterance
Swedish
  System utterance: red, blue, yellow
  F0 peak position: early, mid, late
  F0 peak height: low, high
  Vowel duration: normal, long
  3 × 3 × 2 × 2 = 36 stimuli
LUKAS diphone MBROLA synthesis

  Level                   Paraphrase
  Signal Acceptance       Ok, red.
  Clarify Understanding   Do you really mean red?
  Clarify Perception      Did you say red?

Results
Clear distinction between two question intonations: perception and understanding level
Three distinct prototypes for different interpretations
No effects for:
  Subject
  Color
  Duration
Prototypes:
  Accept: Early low peak
  Clarify Understanding: Mid high peak
  Clarify Perception: Late high peak

Future work
Will be implemented and tested in Higgins to evaluate user responses and dialogue efficiency
Corpus study of human-human dialog
Multi-syllable, multi-word, accent II
Back-channels (hmm, eh)
Multimodal synthesis (max 48)
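
The factorial design described under Experiment and the three prototypes reported under Results can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' tooling: the names `build_stimuli`, `prototype_reading` and the dictionary keys are hypothetical, and the lookup table records just the three prototype combinations named on the poster, leaving every other combination unassigned.

```python
from itertools import product

# Factor levels as listed in the Experiment section.
COLORS = ("red", "blue", "yellow")
PEAK_POSITIONS = ("early", "mid", "late")
PEAK_HEIGHTS = ("low", "high")
VOWEL_DURATIONS = ("normal", "long")

# Prototypes as listed in the Results section.
# Combinations not named there are deliberately left out (assumption:
# the poster gives no reading for them).
PROTOTYPES = {
    ("early", "low"): "Accept",
    ("mid", "high"): "Clarify Understanding",
    ("late", "high"): "Clarify Perception",
}


def build_stimuli():
    """Enumerate every combination of the four factors (full factorial)."""
    return [
        {
            "word": word,
            "peak_position": position,
            "peak_height": height,
            "vowel_duration": duration,
        }
        for word, position, height, duration in product(
            COLORS, PEAK_POSITIONS, PEAK_HEIGHTS, VOWEL_DURATIONS
        )
    ]


def prototype_reading(stimulus):
    """Return the prototype label for a stimulus, or None if it is not a prototype."""
    key = (stimulus["peak_position"], stimulus["peak_height"])
    return PROTOTYPES.get(key)


if __name__ == "__main__":
    stimuli = build_stimuli()
    print(len(stimuli))  # 3 * 3 * 2 * 2 = 36
```

Color and vowel duration are absent from the lookup key because the poster reports no effect for them; only F0 peak position and height distinguish the prototypes.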