Low Level Cues to Emotion

Low Level Cues to Emotion Julia Hirschberg CS 4995/6998 8/6/2019

Liscombe et al. ’05a
Domain: phone account information ("How May I Help You?" system)
Motivation: improve customer satisfaction
Emotions examined: negative vs. non-negative (collapsed from 7 classes)
Corpus: 5,690 dialogues, 20,013 user turns
Training/test split: 75% / 25%
ML method: BoostTexter, which combines multiple weak classifiers
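To make the "combines multiple weak classifiers" idea concrete: BoostTexter is a boosting algorithm over simple text-based weak learners. The sketch below is not BoostTexter itself, just a minimal AdaBoost over word-presence decision stumps, on invented toy turns (all words, labels, and data are illustrative, not from the paper's corpus).

```python
import math

# Toy labeled turns: (set of words, label), +1 = negative emotion, -1 = non-negative.
turns = [
    ({"this", "is", "ridiculous"}, +1),
    ({"i", "want", "a", "refund"}, +1),
    ({"no", "wrong"}, +1),
    ({"thanks", "that", "helps"}, -1),
    ({"yes", "please"}, -1),
    ({"account", "balance", "please"}, -1),
]
vocab = sorted(set().union(*(words for words, _ in turns)))

def stump_predict(word, x):
    """Weak classifier: +1 if `word` occurs in the turn, else -1."""
    return +1 if word in x else -1

def boost(turns, rounds=5):
    """A few rounds of AdaBoost over word-presence stumps."""
    n = len(turns)
    weights = [1.0 / n] * n
    ensemble = []                          # list of (alpha, word)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = min(vocab, key=lambda w: sum(
            wt for (x, y), wt in zip(turns, weights)
            if stump_predict(w, x) != y))
        err = sum(wt for (x, y), wt in zip(turns, weights)
                  if stump_predict(best, x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # stump weight
        ensemble.append((alpha, best))
        # Re-weight examples: misclassified turns gain weight.
        weights = [wt * math.exp(-alpha * y * stump_predict(best, x))
                   for (x, y), wt in zip(turns, weights)]
        z = sum(weights)
        weights = [wt / z for wt in weights]
    return ensemble

def classify(ensemble, x):
    score = sum(alpha * stump_predict(word, x) for alpha, word in ensemble)
    return +1 if score >= 0 else -1

model = boost(turns)
train_acc = sum(classify(model, x) == y for x, y in turns) / len(turns)
print(train_acc)
```

The real system boosts richer weak learners (including n-gram and real-valued prosodic tests), but the weighting-and-combining loop is the same shape.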

~80 Features
Lexical: bag of words from transcripts (1-, 2-, and 3-grams)
Prosodic:
- Energy: min, max, median, standard deviation
- F0: min, max, median, standard deviation, mean slope
- Ratio of voiced frames to total frames (rate)
- Slope after the final vowel (turn-final pitch contour)
- Mean F0 and energy over the longest normalized vowel (accent)
- Syllables per second (rate)
- Mean vowel length and percent internal silence (hesitation)
- Local jitter over the longest normalized vowel (voice quality)
The last 7 features are semi-automatically extracted.
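Most of the global prosodic statistics above are simple summaries over a frame-level pitch and energy track. As a rough illustration (the frame size, unvoiced-frame convention, and feature names here are assumptions, not the paper's exact recipe):

```python
import statistics

def prosodic_features(f0_track, energy_track):
    """Turn-level prosodic summaries from frame-level tracks.

    f0_track: one F0 value (Hz) per frame, 0.0 marking unvoiced frames.
    energy_track: one RMS energy value per frame.
    """
    voiced = [f for f in f0_track if f > 0]
    feats = {
        "f0_min": min(voiced),
        "f0_max": max(voiced),
        "f0_median": statistics.median(voiced),
        "f0_sd": statistics.stdev(voiced),
        "energy_min": min(energy_track),
        "energy_max": max(energy_track),
        "energy_median": statistics.median(energy_track),
        "energy_sd": statistics.stdev(energy_track),
        # Proportion of voiced frames: a rough speaking-rate correlate.
        "voiced_ratio": len(voiced) / len(f0_track),
    }
    # Mean F0 slope over successive voiced frames (Hz/frame), a crude contour cue.
    diffs = [b - a for a, b in zip(voiced, voiced[1:])]
    feats["f0_mean_slope"] = sum(diffs) / len(diffs)
    return feats

# Synthetic 10-frame track: rising pitch with two unvoiced frames.
f0 = [0.0, 180.0, 185.0, 190.0, 0.0, 200.0, 210.0, 215.0, 220.0, 225.0]
energy = [0.1, 0.5, 0.6, 0.6, 0.1, 0.7, 0.8, 0.8, 0.7, 0.6]
feats = prosodic_features(f0, energy)
print(feats["f0_min"], feats["f0_max"], feats["voiced_ratio"])
```

The vowel-anchored features (accent, hesitation, jitter) additionally need segment boundaries, which is why the slide notes they are only semi-automatically extracted.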

Results
Majority-class baseline: 73.1%
Lexical + prosodic features: 76.1%
Lexical + prosodic + dialogue act features: 77.0%
Lexical + prosodic + dialogue act + context features: 79.0%
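To read these numbers correctly, note that the baseline is just the frequency of the most common class (always guessing "non-negative"). A quick sketch, using an invented label distribution chosen only to mirror the 73.1% figure:

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Illustrative distribution (not the paper's actual counts):
labels = ["non-negative"] * 731 + ["negative"] * 269
print(majority_baseline(labels))
```

So the feature sets buy roughly 3–6 points over a classifier that ignores the input entirely, which is the honest yardstick for a skewed two-class task like this.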

Dialogue act (DA) of the current turn n
Context features:
- Change in prosodic feature values from turn n-1 to n and from n to n+1
- Bag of words from the two previous turns
- Edit distance between turns n-1 and n, and between n and n-2
- DAs of turns n-1 and n-2
- DAs of the system prompts eliciting turns n and n-1
- Hand-labeled emotion of turns n-1 and n-2
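The edit-distance feature between turns is plain word-level Levenshtein distance; a high value between a system prompt's expected reply and the user's repetition often signals trouble. A minimal sketch (the example turns are invented):

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two turns (lists of words)."""
    # prev[j] holds the distance between a[:i-1] and b[:j]; rolled row by row.
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete wa
                           cur[j - 1] + 1,              # insert wb
                           prev[j - 1] + (wa != wb)))   # substitute
        prev = cur
    return prev[-1]

turn_prev = "i said checking account".split()
turn_cur = "i said checking not savings".split()
print(edit_distance(turn_prev, turn_cur))
```

Users repeating themselves (low edit distance to the previous turn) after a failed system response is exactly the kind of contextual evidence the last rows of the results table exploit.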