A Survey of Autonomous Human Affect Detection Methods for Social Robots Engaged in Natural HRI (2015) Derek McColl[1], Alexander Hong[1][2], Naoaki Hatakeyama[1], Goldie Nejat[1], Beno Benhabib[2]

Presentation transcript:

A Survey of Autonomous Human Affect Detection Methods for Social Robots Engaged in Natural HRI (2015) Derek McColl[1], Alexander Hong[1][2], Naoaki Hatakeyama[1], Goldie Nejat[1], Beno Benhabib[2] [1] Autonomous Systems and Biomechatronics Laboratory, University of Toronto [2] Computer Integrated Manufacturing Laboratory, University of Toronto Vivek Kumar SEAS ‘20 4/9/2019

Natural HRI Detect common human communication cues/modalities for more natural social interaction. Affect: a complex combination of emotions, moods, interpersonal stances, attitudes, and personality traits that influence behavior. A robot capable of interpreting affect will be better at sustaining more effective and engaging interactions with users.

Timeline: 1. Affect Categorization Models; 2. Facial Expression; 3. Body Language; 4. Voice; 5. Physiological Signals; 6. Combination; 7. Future Work. HRI Scenarios: Collaborative HRI, Assistive HRI, Mimicry HRI, General HRI

Affect Categorization Models

Categorical Models Classify affect into discrete states. Darwin (1872), Tomkins (1962): [joy/happiness, surprise, anger, fear, disgust, sadness/anguish, dis-smell, shame], neutral. Ekman: Facial Action Coding System (FACS) codifies facial expressions with action units (AUs). Categorical models clearly distinguish affective states, but may have trouble with inclusion and combination of states. https://en.wikipedia.org/wiki/Facial_Action_Coding_System

Dimensional Models Classify affect within a continuous dimensional space. Wundt (1897), Schlosberg (1954): [pleasure/displeasure, arousal/calm, relaxation/tension]. Valence: positive or negative affectivity. Arousal: calming or exciting. Russell: two-dimensional circumplex model with valence and arousal dimensions. https://researchgate.net
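To make the circumplex model concrete, here is a minimal sketch of how a continuous valence/arousal estimate can be mapped onto a coarse affect quadrant. The thresholds and quadrant labels are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch: mapping a continuous valence-arousal estimate onto the
# four quadrants of Russell's circumplex model. Thresholds and labels are
# illustrative, not taken from the survey.

def circumplex_quadrant(valence: float, arousal: float) -> str:
    """Map valence/arousal in [-1, 1] to a coarse affect quadrant."""
    if valence >= 0 and arousal >= 0:
        return "excited/happy"      # positive valence, high arousal
    if valence >= 0 and arousal < 0:
        return "relaxed/content"    # positive valence, low arousal
    if valence < 0 and arousal >= 0:
        return "angry/afraid"       # negative valence, high arousal
    return "sad/bored"              # negative valence, low arousal

print(circumplex_quadrant(0.6, 0.3))   # -> "excited/happy"
print(circumplex_quadrant(-0.4, -0.7)) # -> "sad/bored"
```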

Facial Affect Recognition

Experiment: iCat Summary: iCat plays chess and provides real-time affective (empathetic) feedback to the opponent (a change in facial expression, a comment, or a deliberately bad move). Affect Classifier: Support Vector Machine (SVM). Input: smile probability (faceAPI), eye gaze (faceAPI), game state. Output: positive, neutral, or negative valence; action. Experiment: three groups of 40 chess players: no empathetic response, random empathetic response, adaptive empathetic response. Results tallied from interviews of the chess players afterward. https://azure.microsoft.com/en-us/services/cognitive-services/face/
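A minimal sketch of a valence classifier in the spirit of the iCat setup follows: an SVM over smile probability, eye gaze, and game-state features. The use of scikit-learn and the toy training data are assumptions; the original system used faceAPI outputs with its own trained SVM.

```python
# Sketch of an SVM valence classifier over smile probability, gaze, and game state.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy training data: [smile_probability, gaze_on_board (0/1), game_state_score]
X = np.array([
    [0.9, 1, +0.8],   # smiling, looking at the board, winning
    [0.8, 0, +0.5],
    [0.2, 1, -0.6],   # not smiling, losing
    [0.1, 0, -0.9],
    [0.5, 1,  0.0],
    [0.4, 0,  0.1],
])
y = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)

print(clf.predict([[0.7, 1, 0.4]]))  # e.g. ['positive']
```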

Results: iCat Demo: https://www.youtube.com/watch?v=Qk2_rlWvHts

Experiment: AIBO Summary: make AIBO respond to human affect with sound, light, or motion in a dog-like manner. Affect Classifier: probabilistic FACS lookup. Input: camera stream; feature points and AUs are extracted and mapped to a facial expression. Output: categorical [joy, sadness, surprise, anger, fear, disgust, neutral]. Experiment: using Q-learning, AIBO adapts its behavior to the human's affective state over time, which depends on appropriate affect recognition.
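The sketch below illustrates what a probabilistic FACS lookup can look like: given per-AU detection probabilities, each prototypical expression is scored by the average probability of its component AUs. The AU prototypes follow common EMFACS-style mappings and are an assumption, not the exact table used with AIBO.

```python
# Illustrative probabilistic FACS lookup: score each expression by the mean
# detection probability of its prototypical action units (AUs).

# Prototype AUs per basic expression (EMFACS-style, illustrative assumption).
EXPRESSION_AUS = {
    "joy":      [6, 12],
    "sadness":  [1, 4, 15],
    "surprise": [1, 2, 5, 26],
    "fear":     [1, 2, 4, 5, 20, 26],
    "anger":    [4, 5, 7, 23],
    "disgust":  [9, 15],
}

def classify_expression(au_probs: dict[int, float], neutral_threshold: float = 0.3) -> str:
    """Pick the expression whose AUs are, on average, most strongly detected."""
    scores = {
        expr: sum(au_probs.get(au, 0.0) for au in aus) / len(aus)
        for expr, aus in EXPRESSION_AUS.items()
    }
    best_expr, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_expr if best_score >= neutral_threshold else "neutral"

# Example: strong cheek raiser (AU6) and lip-corner puller (AU12) -> joy
print(classify_expression({6: 0.8, 12: 0.9, 4: 0.1}))
```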

Results: AIBO AIBO appeared to develop a friendly personality when treated with positive affect (joy). Demo: https://www.youtube.com/watch?v=z9mWWU-T6WU

Body Language Affect Recognition

Experiment: Nadine Summary: Nadine implements a gesture classification system (GUHRI) and responds appropriately. Affect Classifier: Large Margin Nearest Neighbors. Input: action gestures from a Kinect sensor and CyberGlove (skeleton joint locations). Output: confidence, praise (affects), or a responsive output gesture: wave, shake hand. Experiment: 25 participants of varying body types, races, and genders, each performing one of the designated gestures. https://en.wikipedia.org/wiki/Kinect http://www.cyberglovesystems.com/
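Below is a minimal sketch of gesture classification from skeleton joint positions using nearest neighbors. The GUHRI system uses Large Margin Nearest Neighbors, which first learns a Mahalanobis distance metric; plain k-NN with scikit-learn is used here as a stand-in, and the data are synthetic.

```python
# Sketch of gesture classification from flattened skeleton joint coordinates.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
N_JOINTS = 20  # e.g. a Kinect v1 skeleton

def fake_gesture_sample(offset: float) -> np.ndarray:
    """Flattened (x, y, z) joint coordinates for one frame (synthetic)."""
    return (rng.normal(size=(N_JOINTS, 3)) + offset).ravel()

# Synthetic training set: two gesture classes.
X = np.array([fake_gesture_sample(0.0) for _ in range(20)] +
             [fake_gesture_sample(2.0) for _ in range(20)])
y = ["wave"] * 20 + ["handshake"] * 20

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

print(clf.predict([fake_gesture_sample(2.0)]))  # likely ['handshake']
```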

GUHRI System Architecture

Nadine: Results Gesture recognition rate of 90-97%, followed by appropriate response. Demo: https://www.youtube.com/watch?v=usif8BOBHgA

Voice Affect Recognition

Feature Extraction from Human Voice: openEAR (OE) Baseline. Fundamental frequency: the lowest frequency of a periodic waveform. For a given waveform $x(t)$, we find the period $T$ such that $\sum_t \left(x(t) - x(t+T)\right)^2$ is minimized; the fundamental frequency is then $f_0 = 1/T$. Energy: the area under the squared magnitude of the waveform, $E = \sum_t |x(t)|^2$. Mel-frequency Cepstral Coefficients (MFCC): a representation of the short-term power spectrum on the mel scale. Phase distortion: a change in the shape of the waveform. Jitter/Shimmer: cycle-to-cycle variations in signal frequency/amplitude. http://geniiz.com/wp-content/uploads/sites/12/2012/01/26-TUM-Tools-openEAR.pdf
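A minimal numpy sketch of two of these features follows: frame energy and an autocorrelation-based fundamental-frequency estimate. This is illustrative only; the systems in the survey use the openEAR toolkit for feature extraction, and the sample rate, lag bounds, and test tone below are assumptions.

```python
# Illustrative extraction of energy and fundamental frequency from one frame.
import numpy as np

def frame_energy(frame: np.ndarray) -> float:
    """Energy as the sum of squared sample magnitudes."""
    return float(np.sum(frame.astype(float) ** 2))

def estimate_f0(frame: np.ndarray, sr: int, fmin: float = 60.0, fmax: float = 400.0) -> float:
    """Estimate f0 by picking the autocorrelation peak within a plausible lag range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

# Example on a synthetic 200 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(0, 0.05, 1 / sr)
tone = np.sin(2 * np.pi * 200 * t)
print(frame_energy(tone), estimate_f0(tone, sr))  # f0 ~ 200 Hz
```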

Experiment: Nao Summary: adults speak with Nao about a variety of topics, and Nao attempts to classify positive or negative valence from the user's voice. Affect Classifier: Support Vector Machine (SVM). Input: fundamental frequency, energy, MFCCs, phase distortion, jitter, shimmer, etc. Output: positive or negative valence. Experiment: 22 adults, each wearing a microphone, speak with Nao using pre-coded, highly aroused voice signals.

Results: Nao RD: relaxation coefficient; P-unvoiced: unvoiced consonants; HNR: harmonicity (harmonics-to-noise ratio); FPD: functions of phase distortion. Demo: https://www.youtube.com/watch?v=p1ID-gvUnWs

Affective Physiological Signal Recognition

Physiological Signals [diagram]: heart rate, skin conductance, and muscle tension are mapped to affect.

Experiment: Robotic Arm Summary: an affect detection system for a robotic arm engaged in a collaborative task with a human. Affect Classifier: Hidden Markov Model (HMM). Input: heart rate (EKG), skin conductance, muscle tension (forehead, MyoScan). Output: arousal (low, medium, high); valence (low, medium, high). Experiment: 36 participants performed pick-and-place and reach-and-grab tasks with varying speeds and obstacles.
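The sketch below shows one common way to use HMMs for this kind of classification: train one Gaussian HMM per arousal level and label a new physiological sequence by the model with the highest log-likelihood. The hmmlearn library and the synthetic data are assumptions; the original system defines its own HMM over EKG, skin-conductance, and EMG features.

```python
# Sketch: one Gaussian HMM per arousal class, classification by log-likelihood.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(1)

def synth_sequence(mean: float, n: int = 100) -> np.ndarray:
    """Synthetic [heart_rate, skin_conductance, muscle_tension] sequence."""
    return rng.normal(loc=mean, scale=0.5, size=(n, 3))

# Fit one HMM per arousal class on synthetic training sequences.
models = {}
for label, mean in [("low", 0.0), ("medium", 2.0), ("high", 4.0)]:
    model = GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
    model.fit(np.vstack([synth_sequence(mean) for _ in range(5)]),
              lengths=[100] * 5)
    models[label] = model

# Classify a new sequence by the best-scoring model.
test = synth_sequence(4.0)
scores = {label: m.score(test) for label, m in models.items()}
print(max(scores, key=scores.get))  # likely 'high'
```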

Results: Robotic Arm

Experiment: Basketball Summary: attach a basketball net to a robotic arm and move it based on the player's anxiety level. Affect Classifier: Support Vector Machine (SVM). Input: heart rate (EKG), skin conductance, muscle tension (neck, MyoScan), skin temperature. Output: anxiety classification (low, medium, high). Experiment: participants play a basketball game with the robot, which varies the difficulty based on each player's anxiety level.

Results: Basketball 79% of participants reported that the robot adapted appropriately to changes in their anxiety. https://researchgate.net

Multimodal Affect Recognition

Experiment: Maggie Summary: a general-purpose interactive robot. Affect Classifier: Decision Tree/Decision Rule, Bayes' Theorem. Input: voice features (see the voice section), CERT/SHORE face detection. Output: categorical [happy, neutral, sad, surprise]. Experiment: 40 students posed affective states with both voice and facial expression. https://www.iis.fraunhofer.de/en/ff/sse/ils/tech/shore-facedetection.html http://mplab.ucsd.edu/~marni/Projects/CERT.htm
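A minimal sketch of decision-level fusion in the spirit of Maggie's multimodal setup follows: per-modality class probabilities (voice, face) are combined with a naive Bayes-style product and renormalized. The probabilities are made up for illustration; the actual system uses CERT/SHORE facial outputs and voice features with decision trees/rules.

```python
# Sketch: naive Bayes-style fusion of per-modality affect posteriors.

CLASSES = ["happy", "neutral", "sad", "surprise"]

def fuse(voice_probs: dict[str, float], face_probs: dict[str, float]) -> dict[str, float]:
    """Combine modality posteriors assuming conditional independence."""
    joint = {c: voice_probs.get(c, 1e-6) * face_probs.get(c, 1e-6) for c in CLASSES}
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}

voice = {"happy": 0.4, "neutral": 0.3, "sad": 0.2, "surprise": 0.1}
face  = {"happy": 0.6, "neutral": 0.2, "sad": 0.1, "surprise": 0.1}
fused = fuse(voice, face)
print(max(fused, key=fused.get), fused)  # 'happy' dominates after fusion
```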

Results: Maggie Voice classification: 56% (DT | CERT), 57% (DR | SHORE). Facial classification: 76% (DT | CERT), 63% (DR | SHORE). Overall classification: 77-83%. https://researchgate.net

Conclusion

Future Work Improve affect categorization models. Improve sensors (the most frequently used are Kinect sensors and webcams). Use transfer learning. Develop systems that are robust across ages and cultural backgrounds.

Popular Resources Cohn-Kanade database, CMU database, JAFFE database, DaFEx database, EmoVoice, EmoDB database