1
Building & Applying Emotion Recognition
Cristian Canton, Microsoft, @CristianCanton
Anna S. Roth, Microsoft, @AnnaSRoth
2
Microsoft Cognitive Services
Vision: Computer Vision | Emotion | Face | Video
Speech: Custom Recognition | Speech
Language: Bing Spell Check | Language Understanding | Linguistic Analysis | Text Analytics | Web Language Model
Knowledge: Academic Knowledge | Entity Linking | Knowledge Exploration | Recommendations
Search: Bing Autosuggest | Bing Image Search | Bing News Search | Bing Video Search | Bing Web Search
We’re hiring!
3
Goals
- Emotion as a subjective problem
- Building an image classifier end-to-end
4
The Recipe
- Data collection
- Tagging
- Aggregation
- Data preprocessing
- Architecture selection
- Cost function
- Training
5
“Emotion”
6
1) Sample 2) Why 3) Switch to demo
7
Machine Learning, Analytics, & Data Science Conference
8/7/2018 2:13 AM
Basic Emotions
- Neutral
- Happiness
- Surprise
- Sadness
- Anger
- Contempt
- Disgust
- Fear
© 2016 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
9
FACS
Emotion | Action Units
Happiness | 6+12
Sadness | 1+4+15
Surprise | 1+2+5B+26
Fear |
Anger |
Disgust |
Contempt | R12A+R14A
10
Circumplex
Axes: Arousal (Low to High), Valence (Negative to Positive)
11
Lots of models of emotion
- PAD
- Lövheim Cube. Image is CC-BY-SA-4.0 from Wikimedia user “Fred The Oyster”
- Plutchik wheel. Image public domain from:
12
Basic Emotions
- Neutral
- Happiness
- Surprise
- Sadness
- Anger
- Contempt
- Disgust
- Fear
13
Subjective
Dog image is CC-BY-SA-4.0 from Wikimedia user “Edmontcz”
14
Subjective
Your cat may be dead or alive, but it’s still a cat
15
Very Subjective
Your cat may be dead or alive, but it’s still a cat
16
Other Subjective Problems
- Attractiveness
- Personality traits
- Style
18
FER Data – used for early academic work
- 28k training images, 7k validation + test images
- 71.73% with augmentation
19
In-house Data Collection
- 4.5 million web-crawled images
- Emotional keywords
- Names
- Images preprocessed with a face detector before tagging
21
Tagging
FACS
- More accurate and less subjective.
- Easy to expand to more emotions.
- Con: Expensive and requires a certified tagger.
Appearance based
- Cheap and doesn’t require a certified tagger.
- Con: Crowdsourcing is very noisy.
22
Crowd Sourced Tagging
- Each tagger chooses one of the 8 emotions, "unknown", or "not a face".
- We started by requiring at least 2 taggers to agree, using up to 5 taggers.
- Quality was very bad, especially with subtle emotions.
- We retagged all our data with 10 taggers.
- Quality improved drastically (detailed next).
- Even after using gold standard, amount of time taken by each tagger… etc.
23
How many taggers do we need?
Taggers | Value
3 | 45.4
4 | 59
5 | 66.4
6 | 74.2
7 | 80.6
8 | 85.8
9 | 92.6
24
https://github.com/Microsoft/FERPlus
FER+
25
Unreliable?
27
Input data
Face detection
28
Input data
29
The Recipe
Input Data -> Data Preprocessing -> DNN Architecture -> Cost Function -> Training
32
Data pre-processing
- Present the data in a more or less homogeneous way to the system
- Reduce variability of the input data by exploiting any known characteristics
In our case:
- Grayscale conversion
- Image cropping and scaling to the input size
- No frontalization (DeepFace: Taigman et al., 2014)
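A minimal sketch of the three steps listed for this slide (grayscale conversion, cropping, scaling), not the presenters' actual code. The box format `(top, left, height, width)`, the input size of 64, the nearest-neighbour resize, and the function name `preprocess_face` are all assumptions made for illustration.

```python
import numpy as np

def preprocess_face(image, box, size=64):
    """Grayscale, crop to the face box, and rescale to size x size.

    image: H x W x 3 uint8 RGB array.
    box:   (top, left, height, width) from a face detector (hypothetical format).
    size:  network input side length; 64 is an assumed value.
    """
    # Grayscale conversion with ITU-R BT.601 luma weights.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = image.astype(np.float32) @ weights

    # Crop to the detected face, clamping the corner to the image bounds.
    top, left, height, width = box
    top, left = max(top, 0), max(left, 0)
    face = gray[top:top + height, left:left + width]

    # Nearest-neighbour scaling to the input size (no frontalization).
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    scaled = face[rows][:, cols]

    # Map pixel values to the range [0, 1].
    return scaled / 255.0

# Usage with a synthetic all-white image and a made-up face box.
demo_image = np.full((100, 120, 3), 255, dtype=np.uint8)
demo_face = preprocess_face(demo_image, box=(10, 20, 50, 40))
```

Every face reaches the network with the same shape and value range, which is the "homogeneous" presentation the slide asks for.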
33
Data pre-processing: Augmentation
Rotation
34
Data pre-processing: Augmentation
Translation
35
Data pre-processing: Augmentation
Scaling
36
Data pre-processing: Augmentation
Flip
Other augmentations:
- Affine, projective transformations, lens distortion
- Noise
- Be creative
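The four augmentations shown on these slides (rotation, translation, scaling, flip) can be sketched as a single inverse-mapped affine warp. This is an illustrative sketch only: the helper names `affine_warp` and `random_augment`, the nearest-neighbour sampling, the zero fill, and the parameter ranges are assumptions, not values from the talk.

```python
import numpy as np

def affine_warp(image, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0), flip=False):
    """Rotate, scale, translate, and optionally flip a 2-D grayscale image.

    Uses inverse mapping with nearest-neighbour sampling; output pixels whose
    source falls outside the image are filled with zeros.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    # Output coordinates relative to the centre, with the shift undone.
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    y = ys - cy - shift[0]
    x = xs - cx - shift[1]

    # Inverse of rotation + scaling gives the source coordinates.
    src_x = np.rint((cos_t * x + sin_t * y) / scale + cx).astype(int)
    src_y = np.rint((-sin_t * x + cos_t * y) / scale + cy).astype(int)

    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(image)
    out[valid] = image[src_y[valid], src_x[valid]]

    # Horizontal flip (mirror left-right).
    return out[:, ::-1] if flip else out

def random_augment(image, rng):
    """Draw one random augmentation; all ranges here are assumed values."""
    return affine_warp(
        image,
        angle_deg=rng.uniform(-10.0, 10.0),
        scale=rng.uniform(0.9, 1.1),
        shift=rng.uniform(-3.0, 3.0, size=2),
        flip=rng.random() < 0.5,
    )

# Usage on a small synthetic image.
demo = np.arange(16.0).reshape(4, 4)
augmented = random_augment(demo, np.random.default_rng(0))
```

Applying a fresh random draw each epoch means the network rarely sees exactly the same pixels twice.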
38
DNN Architecture
- It is very difficult to predict the performance of a given DNN architecture for a particular problem
- Explored several deep architectures: VGG16, VGG19, ResNet-50, ResNet-101
- Commodity architectures
40
Cost Function
- Links the information distilled from the tags to the cost function
- Softmax and entropy
41
Emotion Probability Distribution
Example: Happiness 5 | Surprise 4 | Fear 1
- Majority Voting (MV): Each face is associated with one emotion, the one that has the majority vote.
- Multi-Label Learning (ML): All emotions above a certain threshold are treated as valid emotions.
- Probabilistic Label Drawing (PLD): During training, draw the target emotion according to its probability.
- Cross-entropy loss (CEL): Learn the actual probability distribution.
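A sketch of how the four schemes could turn tag counts into training targets, using the slide's example of 5 / 4 / 1 votes, read here as tag counts. The function names, the ML threshold of 0.3, and the three-class restriction are assumptions for illustration; only target construction and a plain softmax cross-entropy are shown, not the presenters' training code.

```python
import numpy as np

def tag_distribution(counts):
    """Normalise per-emotion tag counts into a probability distribution."""
    counts = np.asarray(counts, dtype=np.float64)
    return counts / counts.sum()

def mv_target(p):
    """Majority Voting: one-hot target on the most voted emotion."""
    target = np.zeros_like(p)
    target[np.argmax(p)] = 1.0
    return target

def ml_target(p, threshold=0.3):
    """Multi-Label: mask of emotions above a threshold (0.3 is assumed)."""
    return (p > threshold).astype(np.float64)

def pld_target(p, rng):
    """Probabilistic Label Drawing: sample a one-hot target from p."""
    target = np.zeros_like(p)
    target[rng.choice(len(p), p=p)] = 1.0
    return target

def softmax(logits):
    """Numerically stable softmax over a 1-D vector of logits."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    return float(-np.sum(target * np.log(softmax(logits))))

# Slide example: Happiness 5, Surprise 4, Fear 1.
p = tag_distribution([5, 4, 1])
rng = np.random.default_rng(0)
targets = {
    "MV": mv_target(p),
    "ML": ml_target(p),
    "PLD": pld_target(p, rng),
    "CEL": p,  # CEL trains against the full distribution
}
loss_cel = cross_entropy(targets["CEL"], np.zeros(3))
```

MV discards the 4 votes for Surprise, while PLD and CEL both keep that information: PLD in expectation over many draws, CEL directly in every update.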
42
Emotion Probability Distribution Training result (on FER+)
Scheme | Accuracy
MV | 83.85 ± 0.63%
ML | 83.97 ± 0.36%
PLD | 84.99 ± 0.37%
CEL | 84.72 ± 0.24%
44
The Recipe
- Data collection
- Tagging
- Aggregation
- Data preprocessing
- Architecture selection
- Cost function
- Training
- ... Future work??
45
Video
46
Emotion in Video
Difficulties:
- Temporal component of expressions
- Necessity to track the face over time
- Data tagging
Potential approaches:
- Frame-by-frame analysis + temporal aggregation
- Fully train an RNN or LSTM (data hungry!)
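One simple reading of "frame-by-frame analysis + temporal aggregation" is to run the image classifier on each frame and smooth the per-frame probabilities over time. The sketch below assumes a centred moving average and a mean over the clip; the window size and the function names are hypothetical, and the talk does not say which aggregation was used.

```python
import numpy as np

def smooth_predictions(frame_probs, window=5):
    """Centred moving average over per-frame probabilities (T x C array)."""
    frame_probs = np.asarray(frame_probs, dtype=np.float64)
    n_frames = frame_probs.shape[0]
    half = window // 2
    smoothed = np.empty_like(frame_probs)
    for t in range(n_frames):
        # The window is truncated at the start and end of the clip.
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        smoothed[t] = frame_probs[lo:hi].mean(axis=0)
    return smoothed

def clip_label(frame_probs):
    """Single label for the whole clip: argmax of the mean distribution."""
    return int(np.argmax(np.mean(frame_probs, axis=0)))

# Usage: class 0 throughout, with a one-frame spike of class 1.
probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.2, 0.8], [0.9, 0.1], [0.9, 0.1]])
smoothed = smooth_predictions(probs, window=3)
label = clip_label(probs)
```

Smoothing suppresses single-frame flickers, at the cost of blurring genuinely brief expressions.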
47
Multimodal Future
48
Multimodal Emotion
- Combine audio + video in sequences to improve the recognition rate of emotions
- Combine audio + text to improve the recognition rate
49
Microsoft Cognitive Services
Vision: Computer Vision | Emotion | Face | Video
Speech: Custom Recognition | Speech
Language: Bing Spell Check | Language Understanding | Linguistic Analysis | Text Analytics | Web Language Model
Knowledge: Academic Knowledge | Entity Linking | Knowledge Exploration | Recommendations
Search: Bing Autosuggest | Bing Image Search | Bing News Search | Bing Video Search | Bing Web Search
We’re hiring!