Robust Recognition of Emotion from Speech in e-Learning Environment
Mohammed E. Hoque, Mohammed Yeasin, Max M. Louwerse


Robust Recognition of Emotion from Speech in e-Learning Environment
Mohammed E. Hoque (1,2), Mohammed Yeasin (1), Max M. Louwerse (2)

1. Computer Vision, Pattern and Image Analysis Laboratory, Department of Electrical and Computer Engineering, University of Memphis, TN
2. Multimodal Aspects of Discourse (MAD) Laboratory, Department of Psychology / Institute for Intelligent Systems, University of Memphis, TN

Acknowledgements
This research was partially supported by grant NSF-IIS awarded to the third author. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding institution.

Introduction
- Emotion in speech in a learning environment is a strong indicator of how effective the learning process is [1,2].
- This assertion has significantly shaped the study of emotion.
- Our aim is to identify salient words and observe their prosodic features.
- The same words can be uttered with different intonational patterns and convey entirely different meanings, as shown in Figure 1.
- We therefore argue that extracting lexical and prosodic features from salient words only will yield robust recognition of emotion from speech in a learning environment.

Figure 1: Pitch of the word "OK" in various emotional states: (a) uttered under confusion, (b) uttered under flow, (c) uttered under delight, (d) uttered normally.

Databases
Emotional utterances were clipped from three movies: (a) Fahrenheit 9/11, (b) Bowling for Columbine, (c) Before Sunset.

Categories of Emotion
Emotion is divided into Positive (Flow, Delight) and Negative (Confusion, Frustration).
Figure 2: Categories of emotion pertinent to e-Learning.

Novel Features of Speech
Pitch: minimum, maximum, mean, standard deviation, absolute value, quantile, ratio between voiced and unvoiced frames.
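The pitch statistics listed above can be sketched as a per-utterance feature vector computed from a frame-level pitch track. This is an illustrative sketch only: the function name, the convention that unvoiced frames are marked with 0, and the quantile choices are assumptions, not the authors' exact implementation.

```python
import numpy as np

def pitch_features(pitch_track):
    """Summary statistics over a frame-level pitch track (Hz).

    Assumes unvoiced frames are marked with 0, a common pitch-tracker
    convention; features follow the poster's list (min, max, mean,
    standard deviation, quantiles, voiced/unvoiced ratio).
    """
    pitch_track = np.asarray(pitch_track, dtype=float)
    voiced = pitch_track[pitch_track > 0]          # keep voiced frames only
    q25, q50, q75 = np.percentile(voiced, [25, 50, 75])
    return {
        "min": voiced.min(),
        "max": voiced.max(),
        "mean": voiced.mean(),
        "std": voiced.std(),
        "q25": q25, "q50": q50, "q75": q75,        # quantile features
        "voiced_ratio": len(voiced) / len(pitch_track),
    }

# Example: a short synthetic pitch contour with unvoiced gaps.
track = [0, 0, 180, 190, 200, 210, 0, 220, 0, 0]
feats = pitch_features(track)
print(feats["min"], feats["max"], feats["voiced_ratio"])  # 180.0 220.0 0.5
```

Formant and intensity statistics can be assembled the same way, concatenating everything into one feature vector per salient word.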
Formant: first formant, second formant, third formant, fourth formant, fifth formant, second formant / first formant, third formant / first formant.
Intensity: minimum, maximum, mean, standard deviation, quantile.

Figure 3: Clustered speech features, after reducing their dimensions using both Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).

Results
Table 1: Classification results for positive vs. negative emotion (accuracy, %), with and without PCA + LDA feature reduction.

Category    Classifier                       PCA + LDA    Features
Rules       PART
            NNge
            Ridor                            66.67
Trees       RandomForest
            J48
            LMT
Meta        AdaBoostM1
            Bagging
            Classification via Regression
            LogitBoost
            Multi Class Classifier
            Ordinal Class Classifier
            Threshold Selector
Functions   Logistic
            Multi-layer Perceptron
            RBF Network
            Simple Logistic
            SMO                              71.42
Bayes       Naïve Bayes                      66.67
            Naïve Bayes Simple
            Naïve Bayes Updateable

Table 2: Classification accuracy (%) in distinguishing delight vs. flow (positive) and confusion vs. frustration (negative).

Category    Classifier                       Delight + Flow    Confusion + Frustration
Rules       PART                             72.72 (66%)       100
            NNge                             80 (50%)          100
            Ridor                            66.67 (80%)       100
Trees       RandomForest                     (66%)
            J48                              (66%)             100
            LMT                              72.72 (66%)       100
Meta        AdaBoostM1                       (66%)             100
            Bagging                          63.64 (66%)
            Classification via Regression    (66%)             100
            LogitBoost                       63.64 (66%)       100
            Multi Class Classifier           (66%)             100
            Ordinal Class Classifier         (66%)             100
            Threshold Selector               (80%)             100
Functions   Logistic                         72.72 (66%)       100
            Multi-layer Perceptron           (80%)             100
            RBF Network                      (80%)             100
            Simple Logistic                  (66%)             100
            SMO                              72.72 (66%)       100
Bayes       Naïve Bayes                      (66%)             100
            Naïve Bayes Simple               (66%)             100
            Naïve Bayes Updateable           (66%)             100

Conclusion
- The hypothesis that extracting prosodic features from salient words yields robust emotion recognition has been successfully demonstrated.
- The results have been validated using 21 different classifiers, with 10-fold cross-validation.
- The results show that applying data projection and dimension reduction techniques, such as Principal Component Analysis and Linear Discriminant Analysis, yields better results.
- Classifiers achieved nearly 100% accuracy in distinguishing between frustration and confusion.
- Classifiers performed comparatively worse in distinguishing between positive patterns such as delight and flow.
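The PCA + LDA projection credited above can be sketched with plain NumPy. The toy two-class data, component counts, and the ridge term added for numerical stability are assumptions for illustration, not the poster's actual pipeline or data.

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    # eigh returns eigenvalues in ascending order, so reverse columns
    # and keep the first k (largest-variance) directions.
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ vecs[:, ::-1][:, :k]

def lda_direction(X, y):
    """Fisher discriminant direction for a two-class problem."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter; a small ridge keeps it invertible.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
# Toy "positive" vs. "negative" feature clusters in 5-D.
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(2, 1, (40, 5))])
y = np.array([0] * 40 + [1] * 40)

Z = pca(X, 3)            # PCA: 5-D -> 3-D
w = lda_direction(Z, y)  # LDA: 3-D -> 1-D discriminant
scores = Z @ w
# The two classes should separate along the discriminant direction.
print(scores[y == 0].mean() < scores[y == 1].mean())  # True
```

The reduced features (or the 1-D discriminant scores) would then be handed to the classifier bank for cross-validated evaluation.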
- The next phase of the project will test the algorithm on map-task data collected in the i-MAP project of the Institute for Intelligent Systems (IIS).
- Future efforts will fuse multimodal channels such as facial expression, speech, and gesture at both the decision and feature levels.

Twenty-one different classifiers were used to validate the robustness of our algorithm in distinguishing positive from negative emotions, as shown in Table 1, which also compares performance with and without applying data projection/reduction techniques to the features. Table 2 compares how the classifiers performed in distinguishing delight from flow within positive emotion and confusion from frustration within negative emotion; the results show that negative emotions are classified better than positive emotions.

References
[1] Craig, S. D., & Gholson, B. (2002, July). Does an agent matter? The effects of animated pedagogical agents on multimedia environments. In P. Barker & S. Rebelsky (Eds.), Proceedings of ED-MEDIA 2002: World Conference on Educational Multimedia, Hypermedia and Telecommunications. Norfolk, VA: Association for the Advancement of Computing in Education.
[2] Craig, S. D., Gholson, B., & Driscoll, D. (2002). Animated pedagogical agents in multimedia educational environments: Effects of agent properties, picture features, and redundancy. Journal of Educational Psychology, 94.