Detecting misrecognitions: Predicting with prosody



Misrecognitions - papers
  - "Predicting automatic speech recognition performance using prosodic cues" (TooT)
  - "Generalizing prosodic prediction of speech recognition errors" (W99)

Misrecognitions - generalities
  What are they?
  - WER: word error rate
  - CA: concept accuracy
  Why is it important to detect them?
  - Users have difficulty correcting system misunderstandings
  - Users are frustrated by unnecessary confirmations or rejections
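As a concrete reading of WER, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative and do not come from the slides.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One deletion ("to") and one substitution ("boston" -> "austin")
# over six reference words.
wer = word_error_rate("i want to go to boston", "i want go to austin")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.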

Prosody to the rescue!!!
  Prosodic features used:
  - Fundamental frequency (f0)
  - Energy (rms)
  - Duration of speaker turn (dur)
  - Pause preceding turn (ppau)
  - Speaking rate (tempo)
  - Silence in speaker turn (zeros)
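A minimal sketch of how these per-turn features might be derived from frame-level pitch and energy tracks. Everything here is an assumption for illustration: the function and field names are hypothetical, tempo is taken to be syllables per second, and zeros is approximated as the fraction of frames with no detected pitch. The slides do not give the exact definitions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class TurnFeatures:
    f0_max: float    # highest pitch over voiced frames (Hz)
    f0_mean: float   # mean pitch over voiced frames (Hz)
    rms_max: float   # peak energy
    rms_mean: float  # mean energy
    dur: float       # turn duration (s)
    ppau: float      # pause before the turn (s)
    tempo: float     # assumed: syllables per second
    zeros: float     # assumed: fraction of frames with f0 == 0


def turn_features(f0: Sequence[float], rms: Sequence[float],
                  turn_start: float, turn_end: float,
                  prev_end: float, n_syllables: int) -> TurnFeatures:
    """Summarize one speaker turn from per-frame f0 and rms values."""
    if not f0 or len(f0) != len(rms):
        raise ValueError("f0 and rms must be non-empty and equally long")
    dur = turn_end - turn_start
    if dur <= 0:
        raise ValueError("turn_end must be later than turn_start")

    # Frames where the pitch tracker reports 0 are treated as unvoiced/silent.
    voiced = [v for v in f0 if v > 0]
    return TurnFeatures(
        f0_max=max(voiced) if voiced else 0.0,
        f0_mean=mean(voiced) if voiced else 0.0,
        rms_max=max(rms),
        rms_mean=mean(rms),
        dur=dur,
        ppau=turn_start - prev_end,
        tempo=n_syllables / dur,
        zeros=(len(f0) - len(voiced)) / len(f0),
    )


# Toy input: four frames, two of them unvoiced.
feats = turn_features(
    f0=[0.0, 200.0, 220.0, 0.0],
    rms=[0.1, 0.5, 0.7, 0.1],
    turn_start=1.5, turn_end=3.5, prev_end=1.0, n_syllables=6,
)
```

A feature vector like this, one per turn, is the kind of input a rule learner could be trained on.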

Predicting misrecognitions - results
  Rule-based learner (RIPPER)
  Characteristics of misrecognitions:
  - Higher in pitch
  - Louder, longer
  - Less internal silence
  Improved prediction with prosody:
  - TooT: 6.53% vs 22.23%
  - W99: 22.77% vs 26.14%
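To put the two result pairs on a common scale, the sketch below computes the relative reduction between the figures in each pair. It assumes, since the slide does not say so explicitly, that both numbers in a pair are error percentages and that the smaller one is the result obtained with prosodic features. The function name is illustrative.

```python
def relative_reduction(with_prosody: float, without_prosody: float) -> float:
    """Fraction of the comparison error removed (assumes lower is better)."""
    if without_prosody <= 0:
        raise ValueError("comparison error must be positive")
    return (without_prosody - with_prosody) / without_prosody


# Figures copied from the slide; their interpretation is an assumption.
results = {
    "TooT": (6.53, 22.23),
    "W99": (22.77, 26.14),
}

reductions = {
    corpus: round(100 * relative_reduction(with_p, without_p), 1)
    for corpus, (with_p, without_p) in results.items()
}
# reductions == {"TooT": 70.6, "W99": 12.9}
```

Under that reading, the gain on TooT is far larger than on W99.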

Predicting misrecognitions - comments
  - Is WER an adequate measure?
  - Do we model the ASR's capabilities or its training set?
  - Is it OK to compare against learning from ASR confidence scores?

Detecting user corrections: Predicting with prosody

User corrections - papers
  - "Corrections in spoken dialog systems"
  - "Identifying user corrections automatically in spoken dialog systems"

User corrections - generalities
  What are they?
  Why is it important to detect them?
  - Recognized much more poorly
  - Tuning dialog strategies
  - ASR for hyperarticulated speech
  - Change of initiative and confirmation strategy

User corrections - insights
  Types:
  - REP: repetition
  - PAR: paraphrase
  - ADD: content added
  - OMIT: content omitted
  - ADD/OMIT
  Characterized by prosodic features associated with hyperarticulation, but the two are not the same

Predicting user corrections
  - Rule-based learner on the TooT corpus
  - Features: PROS, ASR, SYS, POS, DIA
  - 15.72% error rate on Raw+ASR+SYS+POS+PreTurn