1 Computational Approaches to Emotional Speech Julia Hirschberg


2 Why Study Emotional Speech?
Recognition
–Anger/frustration in call centers
–Confidence/uncertainty in online tutoring systems
–“Hot spots” in meetings
Generation
–TTS for computer games and IVR systems
Other applications: Speaker State
–Deception, charisma, sleepiness, interest…
–The Love Detector (available for Skype)

3 Assessing Health-Related Conditions
Assessing intoxication levels (Levit et al ’01)
Distinguishing between active and passive coping responses in patients with breast cancer (Zei Pollermann ’02)
Assessing schizophrenia (Bitouk et al ’09)
Classifying degree of autistic behavior (Columbia)
Suicide notes

4 Hard Questions in Emotion Recognition
How do we know what emotional speech is?
–Acted speech vs. natural (hand-labeled) corpora
What can we classify?
–Distinguish among multiple ‘classic’ emotions
–Distinguish dimensions:
–Valence: is it positive or negative?
–Activation: how strongly is it felt? (e.g. sad vs. despair)
What features best predict emotions?
Which techniques are best to use for classification?
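The valence/activation distinction can be pictured as a two-dimensional space. The sketch below places a few of these emotions in that space with purely illustrative coordinates (none of these numbers come from a corpus or any measurement):

```python
# Illustrative valence/activation coordinates -- hypothetical values,
# not measurements: valence in [-1, 1], activation in [0, 1].
EMOTION_SPACE = {
    "happy":   (0.8, 0.7),
    "angry":   (-0.8, 0.9),
    "sad":     (-0.6, 0.2),   # negative valence, low activation
    "despair": (-0.7, 0.9),   # similar valence to sad, far higher activation
    "bored":   (-0.3, 0.1),
}

def same_valence_sign(a, b):
    """True if two emotions are both positive or both negative in valence."""
    return (EMOTION_SPACE[a][0] > 0) == (EMOTION_SPACE[b][0] > 0)

# 'sad' and 'despair' differ mainly in activation, not valence:
print(same_valence_sign("sad", "despair"))   # True
print(same_valence_sign("happy", "sad"))     # False
```

In this picture, telling sad from despair requires activation cues (energy, pitch range) rather than valence cues alone.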

5 Acted Speech: LDC Emotional Speech Corpus
happy, sad, angry, confident, frustrated, friendly, interested, anxious, bored, encouraging

6 Is Natural Emotion Different? (thanks to Liz Shriberg)
Neutral
–“July 30”
–“Yes”
Disappointed/tired
–“No”
Amused/surprised
–“No”
Annoyed
–“Yes”
–“Late morning”
Frustrated
–“Yes”
–“No”
–“No, I am …”
–“…no Manila...”

7 Major Problems for Classification: Different Valence/Different Activation

8 But… Different Valence/Same Activation

9 Good Features Can Be Hard to Find
Useful features:
–Automatically extracted pitch, intensity, rate, voice quality (VQ)
–Hand-labeled, automatically stylized pitch contours
–Context
–Lexical information: Dictionary of Affect
–But… individual and cultural differences
Algorithms for classification:
–Machine learning (decision trees, Support Vector Machines, rule-induction algorithms, HMMs, …)
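These features and algorithms fit together as a minimal classification pipeline. The sketch below uses synthetic stand-in feature vectors (mean pitch, pitch range, intensity, speaking rate — real values would come from an acoustic front end such as Praat or openSMILE) and one of the listed learners, a scikit-learn decision tree; the two-class setup is only illustrative:

```python
from sklearn.tree import DecisionTreeClassifier

# Each row: [mean pitch (Hz), pitch range (Hz), mean intensity (dB), rate (syll/s)].
# Synthetic stand-in values -- not taken from any real corpus.
X = [
    [220, 90, 72, 5.5],   # "angry"-like: high pitch, wide range, loud, fast
    [230, 85, 74, 5.8],
    [140, 25, 58, 3.2],   # "sad"-like: low pitch, narrow range, quiet, slow
    [135, 30, 60, 3.0],
]
y = ["angry", "angry", "sad", "sad"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Classify a new (synthetic) utterance:
print(clf.predict([[210, 80, 70, 5.2]])[0])  # angry
```

Swapping in an SVM or an HMM-based sequence model changes only the learner; the feature-extraction step stays the same.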

10 Results: Different Emotions, Different Success Rates

Emotion       Baseline   Accuracy
angry         69.32%     77.27%
confident     75.00%
happy         57.39%     80.11%
interested    69.89%     74.43%
encouraging   52.27%     72.73%
sad           61.93%     80.11%
anxious       55.68%     71.59%
bored         66.48%     78.98%
friendly      59.09%     73.86%
frustrated    59.09%     73.86%
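One way to read these results is as relative error reduction over the baseline, which the short computation below derives from the reported figures (the `confident` row is omitted because only one of its two figures survives on the slide):

```python
# (baseline %, accuracy %) pairs from the results table on this slide.
results = {
    "angry":       (69.32, 77.27),
    "happy":       (57.39, 80.11),
    "interested":  (69.89, 74.43),
    "encouraging": (52.27, 72.73),
    "sad":         (61.93, 80.11),
    "anxious":     (55.68, 71.59),
    "bored":       (66.48, 78.98),
    "friendly":    (59.09, 73.86),
    "frustrated":  (59.09, 73.86),
}

def error_reduction(baseline, accuracy):
    """Fraction of the baseline error eliminated by the classifier."""
    return (accuracy - baseline) / (100.0 - baseline)

for emotion, (b, a) in sorted(results.items(),
                              key=lambda kv: -error_reduction(*kv[1])):
    print(f"{emotion:12s} {error_reduction(b, a):.1%}")
```

By this measure `happy` benefits most (over half the baseline error removed) and `interested` least, even though their raw accuracies look similar.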

11 Open Questions
New features and algorithms
New types of emotion/speaker state to identify
New ways of finding/collecting useful data
New applications of more-or-less successful emotion classification
Interspeech Paralinguistic Challenges

12 This Class
Goals:
–Learn what we know: readings and discussion participation
–Learn how to analyze speech, design a speech experiment, and classify speaker states
–Try to contribute something new: term project
–Practice doing research
Syllabus:
– abus11.htm

13 Readings and Discussion
Weekly readings
–Everyone prepares and hands in 3 discussion questions on each assigned paper or website
–If you read an optional paper, submit questions on that as well if you want ‘credit’
–Everyone participates in class discussion
–Each week one person leads discussion on one paper
–Submit a PDF in the CourseWorks shared files

14 Term Project
Everyone prepares a term project on a topic of their choice
–You may work alone or in teams of 2
Deliverables:
–Proposal
–Interim progress report
–Final report
–Short presentation/demo

15 Possible Topics
–Collect audio from children of different ages winning and losing a game, and see if adults can distinguish those who win (happy speech) from those who lose (sad speech).
–Create hybrid speech stimuli from tokens uttered with different emotions (mixing pitch, loudness, duration, speaking rate, ...) and see which features of emotional speech are most reliably associated with which emotions.
–Detect different emotions from Cantonese and Mandarin speakers, and compare the performance of an automatic program to that of human judges.
–Train machine learning algorithms on emotional speech corpora and see if you can improve over other approaches on the same corpora.
–Develop a reader that detects emotion in text and uses the appropriate emotional TTS system to read it to the user.
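The hybrid-stimuli topic above can be prototyped symbolically before touching real audio. The sketch below represents each recorded token as a dictionary of prosodic settings (hypothetical names and values, invented for illustration) and builds a hybrid stimulus by swapping one feature at a time:

```python
# Hypothetical prosodic settings for two recorded tokens of the same sentence.
happy_token = {"mean_pitch": 240, "loudness": 74, "speaking_rate": 5.6}
sad_token   = {"mean_pitch": 150, "loudness": 60, "speaking_rate": 3.4}

def hybrid(base, donor, feature):
    """Copy `base` but take one prosodic feature from `donor`."""
    out = dict(base)
    out[feature] = donor[feature]
    return out

# A "happy" token re-rendered with sad pitch -- does pitch or rate
# dominate listeners' emotion judgments?
stimulus = hybrid(happy_token, sad_token, "mean_pitch")
print(stimulus)  # {'mean_pitch': 150, 'loudness': 74, 'speaking_rate': 5.6}
```

Each such hybrid defines one resynthesis target; an actual experiment would render it with a tool like Praat's PSOLA manipulation and collect listener judgments.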

16 Important Details
Read the academic integrity paragraph in the syllabus and understand it.
Do all the readings when they are due, turn in all discussion questions by noon on the day of class, and come to every class.

17 Questions?