Speech emotion detection
General architecture of a speech emotion detection system:
What features?
Local approach: RFC & Tilt
P. Taylor, The Tilt Intonation Model, 1998
RFC Model: rise/fall/connection
Tilt Model:
▫Amplitude
▫Duration
▫Tilt (shape)
Problem: automatic event labeling
Intonation events ……
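The three Tilt parameters can be made concrete with a short sketch. It assumes an intonation event has already been labeled and described by its RFC rise and fall amplitudes (in Hz) and durations (in seconds), and it uses the formulas as they are commonly stated for the Tilt model. The names `TiltEvent` and `rfc_to_tilt` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class TiltEvent:
    amplitude: float  # total F0 excursion: |A_rise| + |A_fall|
    duration: float   # total event length: D_rise + D_fall
    tilt: float       # shape in [-1, 1]: +1 is a pure rise, -1 a pure fall


def rfc_to_tilt(a_rise, a_fall, d_rise, d_fall):
    """Convert the RFC description of one intonation event to Tilt parameters."""
    abs_rise, abs_fall = abs(a_rise), abs(a_fall)
    amplitude = abs_rise + abs_fall
    duration = d_rise + d_fall
    if amplitude == 0 or duration == 0:
        raise ValueError("an event needs a non-zero amplitude and duration")
    # Amplitude tilt and duration tilt each compare the rise part to the fall part.
    tilt_amp = (abs_rise - abs_fall) / amplitude
    tilt_dur = (d_rise - d_fall) / duration
    # The single tilt value is the average of the two.
    return TiltEvent(amplitude, duration, 0.5 * (tilt_amp + tilt_dur))
```

For example, `rfc_to_tilt(30.0, -10.0, 0.15, 0.05)` describes an event that is mostly a rise, so its tilt is positive. The sketch does not address the labeling problem itself: finding where events start and end still has to be done beforehand.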
Global approach
Breazeal & Aryananda, Recognition of Affective Communicative Intent in Robot-Directed Speech, 2002
Features:
▫Pitch: mean, variance, min, max, range, …
▫Energy: mean, variance, min, max, range, mean/variance, …
▫Other: pace, voiced percentage, …
Less precise, but simpler
Problem: needs training data!
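A minimal sketch of how such utterance-level features might be computed. It assumes a pitch tracker has already produced one F0 value per frame, with 0.0 marking unvoiced frames, plus one energy value per frame. The input convention and the function names are assumptions made for illustration. Only the pitch and energy statistics and the voiced percentage are covered.

```python
import statistics


def summarize(values):
    """Mean, variance, min, max and range of a non-empty sequence."""
    lo, hi = min(values), max(values)
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "min": lo,
        "max": hi,
        "range": hi - lo,
    }


def global_features(pitch_hz, energy):
    """Build one feature dictionary for a whole utterance.

    pitch_hz: per-frame F0 in Hz, 0.0 for unvoiced frames (assumed convention).
    energy:   per-frame energy values.
    """
    voiced = [f0 for f0 in pitch_hz if f0 > 0]
    if not voiced or not energy:
        raise ValueError("need at least one voiced frame and one energy frame")
    # Pitch statistics use voiced frames only; unvoiced frames carry no F0.
    features = {f"pitch_{name}": v for name, v in summarize(voiced).items()}
    features.update({f"energy_{name}": v for name, v in summarize(energy).items()})
    features["voiced_percentage"] = 100.0 * len(voiced) / len(pitch_hz)
    return features
```

Because every statistic is taken over the whole utterance, no event labeling is needed, which is what makes this approach simpler than the local one.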
Global approach (2)
First idea:
▫Decision tree
▫Data not very appropriate
▫Difficult to configure
Second idea:
▫Nearest neighbor
▫With Mahalanobis distance
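The nearest-neighbor idea can be sketched as follows. The Mahalanobis distance rescales each feature by the covariance of the data, so features with large numeric ranges (for example pitch in Hz) do not dominate small ones. This sketch assumes a single covariance matrix estimated from all training vectors; the slides do not say how the covariance was obtained, and all function names here are illustrative.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows):
    """Sample covariance matrix of the feature vectors (needs >= 2 rows)."""
    n, mu = len(rows), mean_vector(rows)
    dim = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, y, cov):
    """sqrt((x - y)^T cov^-1 (x - y)), computed without forming the inverse."""
    diff = [a - b for a, b in zip(x, y)]
    solved = solve(cov, diff)  # equals cov^-1 @ diff
    return max(0.0, sum(d * s for d, s in zip(diff, solved))) ** 0.5


def classify(sample, training, cov):
    """Return the label of the nearest training vector.

    training: list of (feature_vector, label) pairs.
    """
    return min(training, key=lambda item: mahalanobis(sample, item[0], cov))[1]
```

In use, `cov` would be computed once with `covariance([vec for vec, _ in training])` and reused for every query. With few training utterances and many features the matrix can be singular, in which case the sketch raises an error instead of returning a distance.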
Implementation
Edinburgh Speech Tools library (University of Edinburgh)
▫pitch tracking
▫sound recording
Online recognition
Results
Only tested with one person
OK for sad and happy (~100%)
More difficult for angry and neutral (~60%)
[Video]