Published by Hope Malone. Modified over 8 years ago.
1
Emotion Recognition from Speech: Stress Experiment
Stefan Scherer, Hansjörg Hofmann, Malte Lampmann, Martin Pfeil, Steffen Rhinow, Friedhelm Schwenker, Günther Palm
Stefan Scherer | 24.09.2007 | LREC 2008
Institute of Neural Information Processing, Ulm University
stefan.scherer@uni-ulm.de
2
Motivation
Why stress recognition from speech?
– Safety and usability purposes
– More efficient and natural interfaces
– Several existing applications are based on speech only (e.g., call center applications)
Existing problems:
– Existing databases are limited
– Stress induced by increasing workload is missing from them
– Choosing representative features is difficult
3
Experimental Setup
4
Experimental Setup – Summary
Participants direct planes towards the corresponding exit
Four types of questions (personal, enumerations, general knowledge, Jeopardy)
Difficulty levels differ in plane speed, number of planes, and exit sizes
Points are earned or lost, and the current score is color coded
One game lasts 10 minutes
Participants are asked to self-assess their experienced stress three times
5
Evaluation and Labeling of Recordings
Everybody reacts differently to stress
No common labels are available for the recordings
→ Second labeling experiment to obtain fuzzy labels for each of the recordings
6
Evaluation and Labeling of Recordings

Speaker   Mean   P25     P75   Self-Assess.   Crashes
1         35.8   24      47    1/2/4          0/4/13
2         41.9   25      59    2/4/?          0/4/30
3         45.2   29.5    61    7/6/8          1/10/37
4         31.0   20      40    1/1/2          0/2/16
5         43.2   25      61    7/8/9          0/3/28
6         43.0   23      60    4/4/6          0/3/26
7         31.2   21      37    1/3/7          0/1/23
8         33.2   21      41    1/1/3-4        0/0/8
9         38.0   23      51    1/1-2/5        0/6/31
10        35.7   22      49    1/2/5          0/3/11
11        49.6   31.75   65    7/9/10         5/9/17
12        49.1   32      65    4/4/?          0/5/27
13        43.4   26      62    1/3/4          6/22/38
14        32.1   22      41    2/5/8          1/1/26
15        41.6   26      56    2/3/7          0/2/19
7
Evaluation and Labeling of Recordings
Spearman correlation tests:
– Mean (M) vs. self-assessment (SA)
– Mean (M) vs. crashes (C)
– Self-assessment (SA) vs. crashes (C)

            ρ      p-value
M vs. SA    0.61   0.01
M vs. C     0.68   0.005
C vs. SA    0.40   0.13
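As a reminder of what the Spearman test measures, the following is a minimal sketch of the rank correlation coefficient ρ: both variables are converted to ranks (ties receive their average rank) and the Pearson correlation of the ranks is computed. The function names and the toy values are hypothetical and are not the study's data; the p-value computation is omitted.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for illustration only (not taken from the experiment).
toy_mean_label = [1, 2, 3, 4, 5]
toy_self_assessment = [5, 6, 7, 8, 7]
rho = spearman_rho(toy_mean_label, toy_self_assessment)
```

Because only ranks enter the computation, ρ captures any monotonic relation between the two variables, which suits ordinal quantities such as self-assessment scores.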
8
Automatic Stress Recognition
Biologically motivated features
– Representing the rate of change of frequency
– Representative features
– Robust against noisy conditions
Echo state networks
– Easy to train using the direct pseudo-inverse method
– Using sequential characteristics of the features
– Robust against noisy conditions
9
Utilized Features
Motivation
– Pitch is not always easy to extract
– Statistics of pitch may not suffice
– Preliminary experiments with pitch-based features showed worse performance
– Goal: representative features that do not need to be aggregated over time
Modulation spectrum based features
– Representing the rate of change of frequency
– Extracted at 25 Hz
10
Modulation Spectrum Features
Rate of change of frequency
Standard procedures: FFT and Mel filtering
Most prominent energies are observed between 2 and 16 Hz
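The idea of a modulation spectrum can be illustrated with a deliberately simplified sketch: compute a short-time energy envelope of the signal, then take a Fourier transform of that envelope over time to see how fast the energy fluctuates. This is not the feature extraction used in the talk. It replaces the Mel filter bank with a single broadband energy envelope, uses a naive DFT instead of an FFT, and all names, the sample rate, and the frame sizes are assumptions chosen for the toy example.

```python
import cmath
import math


def frame_energies(signal, frame_len, hop):
    """Short-time energy of the signal, one value per hop."""
    return [
        sum(s * s for s in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def dft_magnitudes(values):
    """Magnitudes of a naive DFT (bins 0..N/2) of the mean-removed sequence."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        abs(sum(c * cmath.exp(-2j * math.pi * k * t / n)
                for t, c in enumerate(centered)))
        for k in range(n // 2 + 1)
    ]


def modulation_spectrum(signal, sample_rate, frame_len, hop):
    """Return (modulation frequencies in Hz, magnitudes) of the energy envelope."""
    envelope = frame_energies(signal, frame_len, hop)
    envelope_rate = sample_rate / hop  # envelope samples per second
    mags = dft_magnitudes(envelope)
    freqs = [k * envelope_rate / len(envelope) for k in range(len(mags))]
    return freqs, mags


# Toy signal (assumption): a 100 Hz tone whose amplitude fluctuates at 4 Hz.
sample_rate = 1000
signal = [
    (1.0 + 0.5 * math.cos(2 * math.pi * 4 * t / sample_rate))
    * math.sin(2 * math.pi * 100 * t / sample_rate)
    for t in range(sample_rate)  # one second of audio
]
freqs, mags = modulation_spectrum(signal, sample_rate, frame_len=20, hop=10)
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
peak_freq = freqs[peak_bin]  # expected to lie close to the 4 Hz modulation
```

In a fuller implementation, the envelope would be computed separately for each Mel band, giving one modulation spectrum per band.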
11
[Figure: waveform, spectrogram, and modulation spectrogram, each plotted over time]
12
Echo State Networks
Recurrent artificial neural network
Dynamic reservoir represents the input history → echo state property
W_out are the only connections that need to be adapted, using the pseudo-inverse method
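A minimal echo state network can be sketched as follows, under several assumptions that are not taken from the talk: a 10-unit reservoir, a toy task (predicting the next value of a sine wave), and reservoir weights rescaled so that the maximum absolute row sum is below 1, which is a simple sufficient condition for the echo state property with tanh units. The readout W_out is fitted by solving the least-squares normal equations with a tiny ridge term for numerical stability; this stands in for the pseudo-inverse, with which it nearly coincides when the state matrix has full column rank. All function names are hypothetical.

```python
import math
import random


def make_reservoir(n_units, seed=0, scale=0.9):
    """Random fixed input and reservoir weights; these are never trained."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]
    # Rescale so the maximum absolute row sum equals `scale` (< 1).
    max_row = max(sum(abs(v) for v in row) for row in w)
    return w_in, [[v * scale / max_row for v in row] for row in w]


def run_reservoir(w_in, w, inputs):
    """Drive the reservoir; return one feature row [1, state...] per time step."""
    n = len(w_in)
    state = [0.0] * n
    rows = []
    for u in inputs:
        state = [
            math.tanh(w_in[i] * u + sum(w[i][j] * state[j] for j in range(n)))
            for i in range(n)
        ]
        rows.append([1.0] + state)  # leading 1.0 acts as a bias feature
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def train_readout(rows, targets, ridge=1e-6):
    """Least-squares readout weights via (X^T X + ridge * I) w = X^T y."""
    n = len(rows[0])
    gram = [
        [sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(gram, rhs)


# Toy task (assumption): predict the next sample of a sine wave.
series = [math.sin(0.2 * t) for t in range(300)]
w_in, w = make_reservoir(10)
rows = run_reservoir(w_in, w, series[:-1])
washout = 50  # discard the initial transient of the reservoir
train_rows, train_targets = rows[washout:], series[1:][washout:]
w_out = train_readout(train_rows, train_targets)
predictions = [sum(c * f for c, f in zip(w_out, row)) for row in train_rows]
mse = sum((p - t) ** 2 for p, t in zip(predictions, train_targets)) / len(train_targets)
target_mean = sum(train_targets) / len(train_targets)
variance = sum((t - target_mean) ** 2 for t in train_targets) / len(train_targets)
```

Only `w_out` is fitted; `w_in` and `w` stay at their random initial values, which is what makes training a single linear solve rather than an iterative gradient procedure.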
13
Experiments and Results
No "true" label → the mean over all labelers is used as the target for each utterance
10-fold cross-validation
Human labelers vs. ESN: the ESN outperforms the labelers

            MSE     ME
Labeler 1   0.284   0.421
Labeler 2   0.151   0.281
Labeler 3   0.291   0.422
Labeler 4   0.241   0.384
Labeler 5   0.211   0.365
ESN         0.084   0.235
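The comparison against a mean-label target can be sketched as below. The label values are hypothetical, the label scale of [0, 1] is an assumption, and "ME" is interpreted here as the mean absolute error, which the slide does not spell out.

```python
def mean_labels(label_rows):
    """Per-utterance mean over all labelers; label_rows[labeler][utterance]."""
    n_labelers = len(label_rows)
    return [sum(column) / n_labelers for column in zip(*label_rows)]


def mean_squared_error(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def mean_absolute_error(predictions, targets):
    # Assumed reading of the "ME" column; not confirmed by the source.
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical labels: 3 labelers x 4 utterances, on an assumed [0, 1] scale.
labels = [
    [0.0, 0.5, 1.0, 0.5],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
targets = mean_labels(labels)
scores = [
    (mean_squared_error(row, targets), mean_absolute_error(row, targets))
    for row in labels
]
```

An automatic recognizer would be scored the same way, by passing its per-utterance outputs in place of a labeler's row.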
14
Conclusions
Experimental setup to record speech data with different levels of stress
Large vocabulary dataset is available (with additional video material and mouse movement data)
Method to label the individual stressed utterances by humans
Automatic stress recognizer based on recurrent neural networks
→ Outperforms human labelers in accuracy
15
Thank you for your attention!