Published by Thomasina McCarthy. Modified over 9 years ago.
1
Multimodal Emotion Recognition
Colin Grubb
Advisor: Nick Webb
2
MOTIVATION
3
PREVIOUS RESEARCH
o Multimodal fusion
o Research looking at audio, visual, and gesture information
o Feature Level vs. Decision Level
4
RESEARCH QUESTION
o To what extent can we improve emotion recognition by using classification methods on audio and visual data?
5
DECISION LEVEL ANALYSIS
o Set of rules vs. training a classifier
o Rule set is too basic
o Will use classifier to learn outputs of unimodal systems
6
AUDIO SYSTEM
o EmoVoice (EMV)
o Real Time Audio Analysis
o Five emotional states w/ probabilities
o Published accuracy: 47.67%
https://www.informatik.uni-augsburg.de/en/chairs/hcm/projects/emovoice/
7
EMOVOICE CONFIDENCE LEVELS
o NA (Negative Active): Angry
o NP (Negative Passive): Sad
o NE (Neutral): Neutral
o PA (Positive Active): Happy
o PP (Positive Passive): Content
[Label shown on slide: negativeActive]
8
VISUAL SYSTEM
o Software created by Prof. Shane Cotter
o Uses still images
o Published accuracy: 93.4%
9
SYSTEM LAYOUT
[Diagram: speech ("I'm in a good mood!") -> EmoVoice -> Emotion: Happy; Images -> Video Software -> Emotion: Happy; both outputs -> Classifier -> Output: Happy]
10
DATA GATHERING
o 8 subjects
   o Five male, three female
o Audio Data
   o Read sample sentences
o Visual Data
   o Gather facial expressions from regular and long distance (6 ft.)
11
EXPERIMENTS
o Weka Data Mining Software
o Used J48 Classifier
   o C4.5 algorithm – decision tree
   o Each branch represents decision made at that node
[Diagram: example decision tree with nodes 1, 2, 3 and leaves Output 1, Output 2, Output 3, Output 4]
http://www.cs.waikato.ac.nz/ml/weka/
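A minimal sketch of how fused instances could be written in Weka's ARFF format for decision-level classification. The slides do not show the actual feature layout, so the attribute names (`emv_conf_*`, `emv_class`, `visual_class`, `emotion`) and the `to_arff` helper are assumptions made for illustration only.

```python
# Hypothetical feature layout: five EmoVoice confidences, the EmoVoice class,
# the visual system's class, and the true emotion as the target attribute.
EMV_CLASSES = ["NA", "NP", "NE", "PA", "PP"]   # EmoVoice states
EMOTIONS = ["happy", "angry", "neutral", "sad"]  # final dataset classes


def to_arff(instances, relation="multimodal_emotion"):
    """Serialize fused audio/visual instances as an ARFF document (a string)."""
    lines = [f"@relation {relation}", ""]
    for name in EMV_CLASSES:
        lines.append(f"@attribute emv_conf_{name} numeric")
    lines.append("@attribute emv_class {" + ",".join(EMV_CLASSES) + "}")
    lines.append("@attribute visual_class {" + ",".join(EMOTIONS) + "}")
    # Weka treats the last attribute as the class by default.
    lines.append("@attribute emotion {" + ",".join(EMOTIONS) + "}")
    lines += ["", "@data"]
    for inst in instances:
        confs = [f"{inst['conf'][name]:.2f}" for name in EMV_CLASSES]
        row = confs + [inst["emv_class"], inst["visual_class"], inst["label"]]
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"
```

A file produced this way could be loaded in Weka and run against J48. Dropping the `emv_conf_*` columns would correspond, under the same assumptions, to a "confidence levels removed" variant.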
12
EMOTION CLASSES
o Final dataset classifies between
   o Happy
   o Angry
   o Neutral
   o Sad
o Audio performance: 38.43%
o Visual performance: 77.43%
13
INITIAL PERFORMANCE
o Ran combined dataset against J48 classifier
o Multimodal data initially ineffective
o Needed a way to improve dataset

Experiment         Multimodal Data   EmoVoice Only   Visual Only
Regular Distance   76.64             38.43 *         77.43
Long Distance      65.60             38.43 *         67.01
14
IMPROVING ACCURACY
o How can we use the two individual systems to complement each other?
o Two pieces of information:
   o What does the visual system do poorly on?
   o What kind of biases does EmoVoice have?
15
MANUAL BIAS
o Visual System
   o Performs poorly at Neutral
   o Some inaccuracy for all emotions tested
o EmoVoice
   o Bias towards negative voice
   o Very strong bias towards active voice
16
EMOVOICE – MODIFICATION RULES
o Happy: For all happy training instances, if PP + PA > NA & NE & NP, change EMV Class to Happy
o Sad: If NP is 2nd to NA and within 0.05, change EMV Class to Sad
o Neutral:
   o If NE tied with another confidence level, change EMV Class to Neutral
   o If all probabilities within 0.05 of each other, change EMV Class to Neutral
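The modification rules above can be sketched as a single function. This is one possible reading, not the author's code. It assumes the following: the rules are tried in slide order (Happy, Sad, Neutral); "PP + PA > NA & NE & NP" means the sum exceeds each of the three; "tied" means NE shares the highest confidence; and "within 0.05" is inclusive. The function name and the `true_label` parameter are hypothetical.

```python
# Keys: NA = Negative Active, NP = Negative Passive, NE = Neutral,
# PA = Positive Active, PP = Positive Passive.
KEYS = ("NA", "NP", "NE", "PA", "PP")
MARGIN = 0.05  # threshold taken from the slide
EPS = 1e-9     # tolerance for treating two confidences as tied


def adjust_emv_class(conf, emv_class, true_label=None):
    """Return the (possibly rewritten) EmoVoice class for one instance."""
    # Happy rule: only applied to training instances labelled Happy.
    if true_label == "Happy":
        positive = conf["PP"] + conf["PA"]
        if all(positive > conf[k] for k in ("NA", "NE", "NP")):
            return "Happy"

    # Sad rule: NA is highest, NP is second, and the gap is within the margin.
    ranked = sorted(KEYS, key=lambda k: conf[k], reverse=True)
    if ranked[0] == "NA" and ranked[1] == "NP":
        if conf["NA"] - conf["NP"] <= MARGIN:
            return "Sad"

    # Neutral rule (a): NE shares the top confidence with another state.
    top = max(conf[k] for k in KEYS)
    tied_with_ne = [k for k in KEYS
                    if k != "NE" and abs(conf[k] - conf["NE"]) < EPS]
    if abs(conf["NE"] - top) < EPS and tied_with_ne:
        return "Neutral"

    # Neutral rule (b): all five confidences lie within the margin.
    values = [conf[k] for k in KEYS]
    if max(values) - min(values) <= MARGIN:
        return "Neutral"

    return emv_class  # no rule fired; keep EmoVoice's original class
```

Because the Happy rule depends on the true label, this sketch only applies it when `true_label` is supplied, which matches the slide's restriction to training instances.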
17
RESULTS

Experiment                                     Multimodal Data   EmoVoice Only   Visual Only
Regular Distance                               76.64             38.43 *         77.43
Long Distance                                  65.60             38.43 *         67.01
Regular Distance                               82.47             58.17 *         77.43 *
Long Distance                                  70.09             58.17 *         67.36
Regular Distance – Confidence Levels Removed   81.08             60.04 *         77.43 *
Long Distance – Confidence Levels Removed      73.98             60.04 *         67.36
* Post Man. Bias
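As a quick check on the table, the multimodal accuracy can be compared with the better of the two unimodal systems in each row. The figures below are copied from the slide. The row labels "initial" and "after rules" are assumptions about which rows precede and follow the manual bias rules, and `gain_over_best_unimodal` is a hypothetical helper.

```python
# (label, multimodal, EmoVoice only, visual only), accuracies in percent.
RESULTS = [
    ("Regular, initial", 76.64, 38.43, 77.43),
    ("Long, initial", 65.60, 38.43, 67.01),
    ("Regular, after rules", 82.47, 58.17, 77.43),
    ("Long, after rules", 70.09, 58.17, 67.36),
    ("Regular, confidence levels removed", 81.08, 60.04, 77.43),
    ("Long, confidence levels removed", 73.98, 60.04, 67.36),
]


def gain_over_best_unimodal(multimodal, audio, visual):
    """Percentage-point difference between fusion and the best single system."""
    return round(multimodal - max(audio, visual), 2)


for label, multi, audio, visual in RESULTS:
    print(f"{label}: {gain_over_best_unimodal(multi, audio, visual):+.2f}")
```

On these numbers, the initial multimodal runs fall slightly below the visual-only system (by 0.79 and 1.41 points), while the later rows exceed it by between 2.73 and 6.62 points.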
18
FUTURE WORK
o Spring Practicum
o Refine rules
o Automation
o Online Classifier
o Mount on robot; cause apocalypse
19
THANK YOU FOR LISTENING.
o Questions? Comments?