Detection of Anger in Telephone Speech Using Support Vector Machine and Gaussian Mixture Model
Prepared by: Siti Marahaini Binti Mahamood
What is emotion? Emotion is often the driving force behind motivation, positive or negative.[2] An alternative definition of emotion is a "positive or negative experience that is associated with a particular pattern of physiological activity."[3]
Emotion recognition methods Several approaches to detecting emotional state have been proposed, based on speech, facial expression, gesture, autonomic nervous system (ANS) activity and electroencephalography (EEG) signals. Emotion recognition through speech is a complex task because there is often no unambiguous answer as to the correct emotion of the speaker.
Introduction A customer call center self-service system is one application of automatic speech recognition (ASR). Such a system can detect problems that arise from unsatisfactory interaction and respond by either offering the assistance of a human operator or reacting with appropriate dialog strategies. Monitoring anger in the caller’s voice is therefore important for detecting when the call flow should change [7].
Problem Statement
1. It is almost impossible to know with certainty the actual emotion of the speaker [9].
2. It is challenging to distinguish between certain emotions based only on speech [7].
3. It is challenging to determine robust features that describe emotions in speech [7].
4. It is difficult to detect emotion classes when the speech signal is contaminated by noise [4].
Research Question & Research Objective
Research Question
1. What is the method to pre-process the speech signal?
2. How can the connection between emotion state and speech signal be identified?
3. How can anger be detected in telephone speech?
Research Objective
1. To pre-process the speech signal using Support Vector Machine (SVM) and Gaussian Mixture Model (GMM).
2. To identify the connection between the anger emotion state and the speech signal.
3. To analyze a model to detect anger in telephone speech.
Methodology Figure 1: Diagram of Emotion Recognition System through speech [12]
Methodology phase | Methods and techniques | Expected output
Phase 1: Emotional speech input | Gathering the data from voice portal | Speech signal
Phase 2: Feature extraction | Mel-frequency cepstral coefficients (MFCC) | Speech intensity, pitch and speaking rate
Phase 3: Classification | Gaussian Mixture Model, Support Vector Machine | Classify emotion state: anger or not anger
Phase 4: Emotion recognition | Emotion state | Anger
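Two of the Phase 2 outputs listed above, speech intensity and pitch, can be illustrated with a minimal sketch. This is not MFCC extraction and not the method used in this work: it substitutes short-time log energy for intensity and a plain autocorrelation peak for pitch. The function names, the 8 kHz sample rate, the 40 ms frame length and the 60 to 400 Hz search range are all assumptions chosen for illustration, and speaking rate is not covered.

```python
import math

def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames of equal length."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame):
    """Short-time log energy in dB, used here as a simple proxy for intensity."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)  # small offset avoids log(0)

def pitch_autocorr(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) as the lag with the largest autocorrelation.

    The search range fmin..fmax is an assumed range for speech; the frame
    should be longer than sample_rate / fmin samples.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag

# Synthetic stand-in for a speech signal: a 200 Hz tone, 0.2 s long.
SAMPLE_RATE = 8000  # assumed narrowband telephone sampling rate
tone = [0.5 * math.sin(2 * math.pi * 200.0 * n / SAMPLE_RATE)
        for n in range(1600)]

# One (intensity, pitch) pair per 40 ms frame with a 20 ms hop.
features = [(log_energy(f), pitch_autocorr(f, SAMPLE_RATE))
            for f in frames(tone, frame_len=320, hop=160)]
print(features[0])
```

A real front end would also need voicing detection, since this pitch estimate is meaningless on silent or unvoiced frames.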
Summary Detection of anger in speech is analyzed with a particular focus on realistic, adverse acoustic conditions involving the telephone channel. A new method for focusing on particular modulation frequencies of the features in classification is proposed. With multiple classifiers (GMM and SVM) and several feature sets (prosodic and linguistic), adopting a meta-classifier could be a way to improve the robustness of anger detection.
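One of the classifiers named above, the GMM, can be sketched as a two-model decision: fit one mixture to anger data and one to non-anger data, then compare log-likelihoods. The sketch below is a minimal one-dimensional version trained with EM, not the system described in this work. The feature (a single pitch-like value per utterance), the training values, the two-component setting and the zero threshold are all invented for illustration. The SVM and the meta-classifier are not covered.

```python
import math
import random

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def fit_gmm(data, k=2, iters=50):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    n = len(data)
    srt = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mu = sum(data) / n
    variances = [sum((x - mu) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-3)  # floor keeps a component from collapsing
    return weights, means, variances

def log_likelihood(x, gmm):
    """Log of the mixture density at x."""
    weights, means, variances = gmm
    p = sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return math.log(p + 1e-300)

def is_anger(x, anger_gmm, other_gmm, threshold=0.0):
    """Log-likelihood-ratio decision; True means 'anger'."""
    return log_likelihood(x, anger_gmm) - log_likelihood(x, other_gmm) > threshold

# Synthetic training values, invented purely so the example runs.
rng = random.Random(0)
not_anger = ([rng.gauss(120, 15) for _ in range(100)]
             + [rng.gauss(200, 20) for _ in range(100)])
anger = ([rng.gauss(170, 20) for _ in range(100)]
         + [rng.gauss(280, 30) for _ in range(100)])

anger_gmm = fit_gmm(anger)
other_gmm = fit_gmm(not_anger)
print(is_anger(320.0, anger_gmm, other_gmm), is_anger(110.0, anger_gmm, other_gmm))
```

The log-likelihood ratio computed inside `is_anger` is the kind of score a meta-classifier could combine with an SVM output. A practical system would use multi-dimensional features and a tested library implementation rather than this hand-written EM loop.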
Thank you