Irfan Essa, Alex Pentland Facial Expression Recognition using a Dynamic Model and Motion Energy (a review by Paul Fitzpatrick for 6.892)

Overview
- Want to categorize facial motion
- Existing coding schemes not suitable
  – Oriented towards static expressions
  – Designed for human use
- Build better coding scheme
  – More detailed, sensitive to dynamics
- Categorize using templates constructed from examples of expression changes
  – Facial muscle actuation templates
  – Motion energy templates

Facial Action Coding System
- FACS lets psychologists code expressions from static facial “mug-shots”
- Facial configuration = combination of “action units”

Problems with action units
- Spatially localized
  – Real expressions are rarely local
- Poor time coding
  – Either no temporal coding, or heuristic
  – Co-articulation effects not represented

Solution: add detail
- Represent time course of all muscle activations during expression
- For recognition, match against templates derived from example activation histories
- To estimate muscle activation:
  – Register image of face with canonical mesh
  – Through mesh, locate muscle attachments on face
  – Estimate muscle activation from optic flow
  – Apply muscle activation to face model to generate “corrected” motion field, also used for recognition

Registering image with mesh
- Find eyes, nose, mouth
- Warp onto generic face mesh
- Use mesh to pick out further features on face
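As a rough illustration of this registration step, here is a minimal sketch that warps a face image onto canonical landmark positions with an affine transform. The canonical coordinates, output size, and use of OpenCV are assumptions for illustration; the paper's actual warp onto its face mesh is more involved.

```python
import numpy as np
import cv2

def register_face(img, landmarks):
    """landmarks: (3, 2) array - left eye, right eye, mouth center in img."""
    # Hypothetical canonical positions of the same landmarks on the mesh.
    canonical = np.float32([[80, 100], [176, 100], [128, 200]])
    M = cv2.getAffineTransform(np.float32(landmarks), canonical)
    # The result is in mesh coordinates; further features can be read off it.
    return cv2.warpAffine(img, M, (256, 256))
```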

Registering mesh with muscles
- Once face is registered with mesh, can relate to muscle attachments
- 36 muscles modeled; 80 face regions

Parameterize face motion
- Use continuous-time Kalman filter to estimate:
  – Shape parameters: mesh positions, velocities, etc.
  – Control parameters: time course of muscle activation
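The paper uses a continuous-time Kalman filter built around its physics-based face model; for a flavor of the estimation loop, here is a generic discrete predict/update sketch, with the state layout and the matrices A, H, Q, R left as illustrative assumptions.

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """One predict/update cycle.

    x : state estimate (mesh positions/velocities plus muscle controls)
    P : state covariance
    z : observation (optic-flow-derived measurements)
    """
    # Predict: propagate state and covariance through the dynamics model A.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the flow observation.
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```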

Driven by optic flow
- Computed using coarse-to-fine methods
- Use flow to estimate muscle actuation
- Then use muscle actuation to generate flow on model
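The slides do not name the flow algorithm beyond "coarse-to-fine"; as a stand-in, OpenCV's pyramidal Farneback estimator shows the same idea — flow estimated at a coarse pyramid level seeds estimation at the finer levels.

```python
import cv2

def dense_flow(prev_gray, next_gray):
    # Inputs are single-channel grayscale frames. levels=4 builds a
    # 4-level pyramid: coarse-level flow seeds each finer level.
    return cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=4, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
```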

Spatial patterning
- Can capture simultaneous motion across the entire face
- Can represent the detailed time course of muscle activation
- Both are important for typical expressions

Temporal patterning
- Application/release/relax structure – not a ramp
  – Application rises as a(e^{bx} − 1); release decays as a(e^{(c−bx)} − 1)
- Co-articulation effects present
[Figure: actuation time course fitted by a(e^{bx} − 1) and a(e^{(c−bx)} − 1), with a labeled “second peak”]
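A small sketch of the application/release profile those formulas describe; the constants a, b, c are arbitrary picks for illustration. The two exponentials meet at x = c/(2b), so the profile rises steeply and then decays rather than ramping linearly.

```python
import numpy as np

def actuation_profile(x, a=0.1, b=3.0, c=6.0):
    """Application a(e^{bx} - 1) up to the crossover x = c/(2b),
    then release a(e^{(c - bx)} - 1)."""
    peak = c / (2.0 * b)  # the two exponentials meet here
    return np.where(x <= peak,
                    a * (np.exp(b * x) - 1.0),
                    a * (np.exp(c - b * x) - 1.0))

x = np.linspace(0.0, 2.0, 200)   # normalized expression time
y = actuation_profile(x)         # steep rise, then decay - not a ramp
```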

Peak muscle actuation templates
- Normalize time period of expression
- For each muscle, measure peak value over application and release
- Use result as template for recognition
  – Normalizes out time course, doesn’t actually use it for recognition?
[Figure: peak muscle actuations during smile; dotted line is template]

Peak muscle actuation templates
- Randomly pick two subjects making expression, combine to form template
- Match against template using normalized dot product
[Figure: templates alongside peak muscle actuations for 5 subjects]
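A sketch of the peak-actuation scheme as the slides describe it: take per-muscle peaks over the time-normalized expression, combine two subjects into a template, and classify by highest normalized dot product. The 36-muscle array shape and the equal-weight average are assumptions.

```python
import numpy as np

def peak_vector(actuation_history):
    """actuation_history: (T, 36) per-frame muscle actuations (time already
    normalized). Returns the per-muscle peak over application and release."""
    return actuation_history.max(axis=0)

def make_template(history_a, history_b):
    # Template = combination of two randomly chosen subjects' peaks.
    return 0.5 * (peak_vector(history_a) + peak_vector(history_b))

def classify(peaks, templates):
    """templates: dict expression -> (36,) template vector.
    Highest normalized dot product wins."""
    def score(t):
        return peaks @ t / (np.linalg.norm(peaks) * np.linalg.norm(t))
    return max(templates, key=lambda name: score(templates[name]))
```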

Motion energy templates
- Use motion field on face model, not on original image
- Build template representing how much movement there is at each location on the face
  – Again, summarizes over time course, rather than representing it in detail
  – But does represent some temporal properties
[Figure: motion energy template for smile, shaded from high to low]

Motion energy templates
- Randomly pick two subjects making expression, combine to form template
- Match against template using Euclidean distance
[Figure: motion energy templates for smile, surprise, anger, disgust, raise brow, shaded from high to low]
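And a matching sketch for the motion-energy scheme: accumulate per-location motion magnitude on the face model over the time-normalized expression, then classify by lowest Euclidean distance to the templates. Array shapes are assumptions.

```python
import numpy as np

def motion_energy(flow_fields):
    """flow_fields: (T, H, W, 2) motion fields on the face model.
    Returns an (H, W) map of total movement at each face location."""
    return np.linalg.norm(flow_fields, axis=-1).sum(axis=0)

def classify(energy, templates):
    """templates: dict expression -> (H, W) energy template.
    Lowest Euclidean distance wins."""
    return min(templates,
               key=lambda name: np.linalg.norm(energy - templates[name]))
```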

Data acquisition
- Video sequences of 20 subjects making 5 expressions – smile, surprise, anger, disgust, raise brow
- Omitted hard-to-evoke expressions of sadness and fear
- Test set: 52 sequences across 8 subjects


Using peak muscle actuation
- Comparison of peak muscle actuation against templates across entire database
- 1.0 indicates complete similarity

Using peak muscle actuation
- Actual results for classification
- One misclassification over 51 sequences

Using motion energy templates
- Comparison of motion energy against templates across entire database
- Low scores indicate greater similarity

Using motion energy templates
- Actual results for classification
- One misclassification over 49 sequences

Small test set
- Test set is a little small to judge performance
- Simple simulation of the motion energy classifier using their tables of means and standard deviations shows:
  – Large variation in results for their sample size
  – Results are worse than the test data would suggest
  – Example: anger classification for a large sample size has accuracy of 67%, as opposed to 90%
- Caveat: the simulation itself rests on a Gaussian, uncorrelated assumption (and on means and deviations derived from that same small data set!)
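A hypothetical re-creation of the kind of simulation described here: draw the five template-match distances from independent Gaussians and classify by the minimum. The (mu, sigma) tables are placeholders — the review used the means and standard deviations published in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
classes = ["smile", "surprise", "anger", "disgust", "raise brow"]

def simulate(mu, sigma, true_class, n=100_000):
    """mu, sigma: (5, 5) arrays; row = true expression, column = template.
    Returns the fraction of trials classified as each expression."""
    i = classes.index(true_class)
    # Independent Gaussian match scores - the "naive" assumption flagged above.
    scores = rng.normal(mu[i], sigma[i], size=(n, 5))
    picks = scores.argmin(axis=1)  # lowest distance wins, as in the paper
    return np.bincount(picks, minlength=5) / n
```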

Naïve simulated results

             Smile    Surprise   Anger    Disgust   Raise brow
Smile        90.7%     1.4%       2.0%    19.4%      0.0%
Surprise      0.0%    64.8%       9.0%     0.1%      0.0%
Anger         0.0%    18.2%      67.1%     3.8%      9.9%
Disgust       9.3%    13.1%      21.4%    76.7%      0.0%
Raise brow    0.0%     2.4%       0.5%     0.0%     90.1%

(Columns are the true expressions – each column sums to 100% – and rows are the simulated classifications.)
Overall success rate: 78% (versus 98%)

Motion estimation vs. categorization
- The authors’ formulation allows detailed prior knowledge of the physics of the face to be brought to bear on motion estimation
- The categorization component of the paper seems a little primitive in comparison
- The template matching the authors use is:
  – Sensitive to irrelevant variation (facial asymmetry, intensity of action)
  – Does not fully use the time course data they have been so careful to collect

’95 paper – what came next?
- Real-time version with Trevor
[Video, and a gratuitous image of Trevor]