Nataliya Nadtoka, James Edge, Philip Jackson, Adrian Hilton
CVSSP, Centre for Vision, Speech & Signal Processing, University of Surrey
Motivation
  Non-verbal cues convey additional information.
  Existing methods that generate visual speech from audio produce plausible animation of neutral speech, but fail to generate realistic expressive content.
  The factors that contribute to emotional speech are vastly understudied.
Aim
  Learn and model the emotional characteristics of speech.
Overview
Dataset
  4D sequences of geometry and texture (60 fps) with synchronized audio (44100 Hz), recorded with a 3dMD scanner.
  Emotions: Anger, Surprise, Fear, Happiness, Disgust, Sadness.
  All sentences are repeated in Neutral to facilitate cross-comparison.
  110 sentences with strong expressive content.
  Phonetically balanced.
  [Figure: capture setup, labelled IR projector, IR stereo cameras, colour camera.]
Post-processing
  Surface registration is done using painted visual markers.
  The lip contour is tracked using blue lipstick.
  The audio is used to phonetically annotate the data.
  Differences in duration are further used for emotion analysis.
Durational differences
  [Figure: timing of the sentence "Don’t ask me to carry an oily rag like that" in Neutral, Anger, Disgust, Fear, Happiness, Sadness and Surprise; horizontal axis: t (sec).]
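The durational comparison can be sketched as follows, assuming the phonetic annotation is available as (label, start, end) tuples in seconds and that the neutral and emotional takes share the same phone sequence. The tuple format, the function names and the timings are illustrative assumptions, not taken from the poster.

```python
def phone_durations(segments):
    """Return [(label, duration_sec)] from (label, start_sec, end_sec) tuples."""
    return [(label, end - start) for label, start, end in segments]


def duration_differences(neutral, emotional):
    """Per-phone duration difference (emotional minus neutral), in seconds.

    Assumes both takes were annotated with the same phone sequence.
    """
    n = phone_durations(neutral)
    e = phone_durations(emotional)
    if [label for label, _ in n] != [label for label, _ in e]:
        raise ValueError("phone sequences differ; align them first")
    return [(label, de - dn) for (label, dn), (_, de) in zip(n, e)]


# Made-up timings for illustration only (not measurements from the dataset).
neutral = [("d", 0.00, 0.05), ("ow", 0.05, 0.20), ("n", 0.20, 0.25)]
anger = [("d", 0.00, 0.04), ("ow", 0.04, 0.30), ("n", 0.30, 0.34)]
diffs = duration_differences(neutral, anger)
```

A positive difference means the phone is longer in the emotional take than in the neutral one.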
Isolated region analysis
  [Figure: the sentence "Don’t ask me to carry an oily rag like that" in Neutral, Anger, Disgust, Fear, Happiness, Sadness and Surprise.]
PCA based analysis
  The first principal component accounts for 55% of the total variance.
  [Figure: PC 1 against t (sec) for Surprise, Neutral and Happiness; word labels "Don’t", "an", "that".]
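How a first principal component and its share of the total variance can be obtained is sketched below. The sketch assumes each time frame is a flat feature vector (for example, stacked vertex coordinates of the analysed region); that representation, the function names and the toy data are assumptions, and the poster does not say how its PCA was computed. Power iteration is used here only to keep the example dependency-free.

```python
import math


def centre(frames):
    """Subtract the per-feature mean from every frame."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    return [[f[j] - mean[j] for j in range(d)] for f in frames]


def first_pc(frames, iters=200):
    """Return (unit direction of PC 1, its share of the total variance)."""
    centred = centre(frames)
    n, d = len(centred), len(centred[0])
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_var = sum(cov[i][i] for i in range(d))
    # Power iteration converges to the dominant eigenvector of cov,
    # provided the start vector is not orthogonal to it.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigval / total_var


def pc1_scores(frames, direction):
    """Project centred frames onto PC 1 (the trajectory plotted over time)."""
    return [sum(c[j] * direction[j] for j in range(len(direction)))
            for c in centre(frames)]


# Toy data: three frames lying on one line, so PC 1 carries all the variance.
frames = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
direction, share = first_pc(frames)
scores = pc1_scores(frames, direction)
```

On real data the share would be below 1; the 55% quoted on the poster corresponds to a value of 0.55 for `share`.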
Emotion Transfer
  [Diagram; box labels as recovered from the poster.]
  Inputs: Neutral Sentence A and Sentence A in Emotion, each with phonetic transcription and emphasis.
  Alignment: DTW.
  Δ = Emotion - Neutral
  Model of Emotion.
  Inputs for a new sentence: Audio of Emotion Sentence B (with emphasis and phonetic transcription) and its Neutral animation.
  Output: Animation of Emotion Sentence B.
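The alignment and difference steps can be illustrated with a minimal sketch: classic dynamic time warping (DTW) pairs the frames of a neutral and an emotional take, and Δ is then the aligned emotional value minus the neutral one. The sketch assumes a single scalar feature per frame (such as a PC 1 score) and aligns on that feature directly; the poster does not specify which signal the DTW runs on, or how Δ is turned into the emotion model, so those parts are assumptions or left out. All names and numbers are hypothetical.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW; returns the optimal list of aligned (i, j) index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(a[i - 1], b[j - 1]) + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def emotion_delta(neutral, emotion):
    """Delta per neutral frame: mean of the aligned emotional frames minus neutral."""
    aligned = {}
    for i, j in dtw_path(neutral, emotion):
        aligned.setdefault(i, []).append(emotion[j])
    return [sum(aligned[i]) / len(aligned[i]) - neutral[i]
            for i in range(len(neutral))]


# Made-up trajectories: the emotional take is slower and offset by 1.
neutral = [0.0, 10.0, 20.0]
emotion = [1.0, 1.0, 11.0, 11.0, 21.0, 21.0]
path = dtw_path(neutral, emotion)
delta = emotion_delta(neutral, emotion)
```

Averaging the emotional frames that map to the same neutral frame is one simple way to bring Δ back onto the neutral timeline; other choices (such as keeping the emotional timing) are equally plausible.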
Conclusions
  This work presents an isolated upper face region analysis for selected sentences.
  There is a promising relation between the principal component features and emotion.
  The observed dynamics reflect the non-constant nature of emotion within a sentence.
  Future work will focus on expressive features with respect to emotion transfer.