UNIVERSITY OF MURCIA (SPAIN) ARTIFICIAL PERCEPTION AND PATTERN RECOGNITION GROUP Estimating 3D Facial Pose in Video with Just Three Points Ginés García.

Presentation transcript:

UNIVERSITY OF MURCIA (SPAIN) ARTIFICIAL PERCEPTION AND PATTERN RECOGNITION GROUP Estimating 3D Facial Pose in Video with Just Three Points Ginés García Mateos, Alberto Ruiz García Dept. de Informática y Sistemas P.E. López-de-Teruel, A.L. Rodriguez, L. Fernández Dept. Ingeniería y Tecnología de Computadores University of Murcia - SPAIN

ESTIMATING 3D FACIAL POSE IN VIDEO WITH JUST THREE POINTS G. García A. Ruiz P.E. López A.L. Rodríguez L. Fernández 3DFP’2008 ANCHORAGE JUNE,
Introduction (1/3)
Main objective: to develop a new method to estimate the 3D pose of the head of a human user:
– Estimation through a video sequence
– Working with the minimum necessary information: a 2D location of the face
– A very simple method, without training, running in real time: fast processing
– Under realistic conditions: robust to facial expressions, light, movements
– Robustness preferred to accuracy

Introduction (2/3)
3D pose estimation using 3D tracking…
– Active Appearance Model
– Shape & texture models
– Cylindrical models
– 3D morphable mesh

Introduction (3/3)
In short, we want to obtain something like this:
The result is the 3D location (x, y, z) and the 3D orientation (roll, pitch, yaw): 6 D.O.F.

Index of the presentation
Overview of the proposed method
– 2D facial detection and location
– 2D face tracking
3D facial pose estimation
– 3D position
– 3D orientation
Experimental results
Conclusions

Overview of the Proposed Method
The key idea: separate the problems of 2D tracking and 3D pose estimation.
By introducing some assumptions and simplifications, pose is extracted with very little information.
The proposed 3D pose estimator could use any 2D facial tracker.
Pipeline: 2D face detection → 2D face tracking → 3D pose estimation

2D Face Detection, Location and Tracking Using I.P.
We use a method based on integral projections (I.P.), which is simple and fast.
Definition of I.P.: average of the gray levels of an image $i(x, y)$ along rows and columns.
Vertical projection: $PV_i : [y_{min}, \ldots, y_{max}] \to \mathbb{R}$, given by $PV_i(y) := \overline{i(\cdot, y)}$
Horizontal projection: $PH_i : [x_{min}, \ldots, x_{max}] \to \mathbb{R}$, given by $PH_i(x) := \overline{i(x, \cdot)}$
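A minimal sketch of the integral projections defined on this slide, assuming the image is a plain list of rows of gray levels and the region bounds are inclusive. The function name `integral_projections` and the toy image are illustrative and do not come from the presentation.

```python
def integral_projections(image, x_min, x_max, y_min, y_max):
    """Return (PV, PH) for the inclusive region [x_min..x_max] x [y_min..y_max].

    PV has one entry per row y: the mean gray level of that row.
    PH has one entry per column x: the mean gray level of that column.
    """
    # Crop the region of interest (rows first, then columns).
    region = [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]
    pv = [sum(row) / len(row) for row in region]        # average along x
    ph = [sum(col) / len(col) for col in zip(*region)]  # average along y
    return pv, ph


# Toy 3x4 image: a dark row, a horizontal gradient and a bright row.
image = [
    [0, 0, 0, 0],
    [10, 20, 30, 40],
    [100, 100, 100, 100],
]
pv, ph = integral_projections(image, 0, 3, 0, 2)
```

Each projection reduces a 2D region to a 1D signal, which is why matching or aligning projections is cheap compared with working on the full image.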

2D Face Detection with I.P.
Global view of the I.P. face detector:
Step 1. Vertical projections by strips
Step 2. Horizontal projection of the candidates
Step 3. Grouping of the candidates
[Figure: input image, the models PVface and PHeyes, and the final result]

2D Face Detection with I.P.
To improve the results, we combine two face detectors into a combined detector.
Face Detector 1: look for candidates.
Face Detector 2: verify face candidates.
Final detection result.
Detectors combined: Haar + AdaBoost [Viola and Jones, 2001] and Integral Projections [Garcia et al, 2007]
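The two-stage scheme on this slide (one detector proposes candidates, a second one verifies them) can be sketched as below. This is only an illustration of the control flow: `combined_detector`, `toy_find_candidates` and `toy_verify` are hypothetical names, and the toy functions are stand-ins, not the Haar + AdaBoost or integral-projection detectors.

```python
def combined_detector(image, find_candidates, verify):
    """Keep only the candidates from the first detector that the second accepts."""
    return [box for box in find_candidates(image) if verify(image, box)]


# Hypothetical stand-ins for the two detectors; boxes are (x, y, width, height).
def toy_find_candidates(image):
    return [(10, 10, 40, 40), (100, 50, 8, 8), (200, 80, 60, 60)]


def toy_verify(image, box):
    x, y, w, h = box
    return w >= 20 and h >= 20  # placeholder rule: reject tiny candidates


faces = combined_detector(None, toy_find_candidates, toy_verify)
```

Passing the detectors as functions keeps the combination independent of how each stage is implemented.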

2D Face Detection with I.P.
[Garcia et al, 2007]

2D Face Location with I.P.
Global view of the 2D face locator (input: image and face):
Step 1. Orientation estimation
Step 2. Vertical alignment
Step 3. Horizontal alignment
Final result
[Figure: plots of MVface(y), PV’face(y) and PVface(y) against y]

2D Face Location with I.P.
Location accuracy of the 2D face locator
Method       Av. time (Pentium IV, 2.6 GHz)
IntProj      1.7 ms
NeuralNet    323.6 ms
EigenFeat    20.5 ms

2D Face Tracking with I.P.

2D Face Tracking with I.P.
Sample result of the proposed tracker: 320x240 pixels, 312 frames at 25 fps, laptop webcam.
(e1x, e1y) = location of the left eye; (e2x, e2y) = location of the right eye; (mx, my) = location of the mouth

3D Facial Pose Estimation
In theory, 3 points should be enough to solve for the 6 degrees of freedom (if the focal length and the face geometry are known).
But… location errors are high at the mouth for non-frontal faces.
Some assumptions are introduced to avoid the effect of this error.

3D Facial Pose Estimation
Fixed-body assumption: the user's body is fixed and only the head moves. Consequently, 3D position is estimated in the first frame, and 3D orientation in the following frames.
A simple perspective projection model is used to estimate 3D position.

3D Position Estimation
$f$: focal length (known)
$(c_x, c_y)$: tracked center of the face
$c_x = (e_{1x} + e_{2x} + m_x)/3$, $c_y = (e_{1y} + e_{2y} + m_y)/3$
[Figure: perspective projection with origin (0, 0, 0) and face position $p = (p_x, p_y, p_z)$]

3D Position Estimation
We have: $c_x / f = p_x / p_z$ ; $c_y / f = p_y / p_z$
Where: $c_x = (e_{1x} + e_{2x} + m_x)/3$ ; $c_y = (e_{1y} + e_{2y} + m_y)/3$
So: $p_x = \frac{e_{1x} + e_{2x} + m_x}{3} \cdot \frac{p_z}{f}$ ; $p_y = \frac{e_{1y} + e_{2y} + m_y}{3} \cdot \frac{p_z}{f}$
The depth of the face, $p_z$, is computed with $p_z = f \cdot t / r$, where $r$ is the apparent face size* and $t$ is the real size.
* For more information, see the paper.
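The position equations on this slide can be sketched as follows. This is an illustration under stated assumptions: image coordinates are measured relative to the optical center (so that cx/f = px/pz holds directly), f is expressed in pixels, and the apparent face size r is passed in as a parameter because the slide defers its definition to the paper. The function name `estimate_position` and the sample numbers are made up for the example.

```python
def estimate_position(e1, e2, m, f, t, r):
    """Return (px, py, pz) from the two eye points, the mouth point and the face size.

    e1, e2, m: (x, y) image points, relative to the optical center (assumption).
    f: focal length in pixels; t: real face size; r: apparent face size in pixels.
    """
    # Tracked center of the face: mean of the three points.
    cx = (e1[0] + e2[0] + m[0]) / 3.0
    cy = (e1[1] + e2[1] + m[1]) / 3.0
    # Depth from the ratio of real to apparent size.
    pz = f * t / r
    # Back-project the face center to that depth.
    px = cx * pz / f
    py = cy * pz / f
    return px, py, pz


# Made-up sample: f = 500 px, real size t = 15 (e.g. cm), apparent size r = 100 px.
px, py, pz = estimate_position((-20, 20), (40, 20), (10, -10), f=500, t=15, r=100)
```

The result is expressed in the same unit as t, since pz inherits it from the ratio t/r.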

Estimation of Roll Angle
Roll angle can be approximately associated with the 2D rotation of the face in the image:
$roll = \arctan\left(\frac{e_{2y} - e_{1y}}{e_{2x} - e_{1x}}\right)$
This equation is valid in most practical situations, but it is not precise in all cases.
[Figure: four sample faces with roll = -43.7°, roll = -2.8°, roll = 15.9°, roll = 34.6°]
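As an illustration of the roll formula on this slide, the sketch below computes the angle of the line joining the two eyes. It uses `math.atan2` instead of a plain arctangent of the ratio, which is a substitution made here to avoid a division by zero; both agree whenever e2x > e1x. The function name `estimate_roll` is hypothetical, and the sign of the result depends on the image axis convention.

```python
import math


def estimate_roll(e1, e2):
    """Return the roll angle in degrees from the two eye locations (x, y)."""
    dy = e2[1] - e1[1]
    dx = e2[0] - e1[0]
    # atan2 equals arctan(dy / dx) when dx > 0 and stays defined when dx == 0.
    return math.degrees(math.atan2(dy, dx))


roll_level = estimate_roll((100, 120), (160, 120))   # eyes on a horizontal line
roll_tilted = estimate_roll((100, 120), (160, 180))  # eye line at 45 degrees
```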

Estimation of Pitch and Yaw
The head-neck system can be modeled as a robotic arm, with 3 rotational DOF.
In this model, any point of the head lies on a sphere, so its projection is related to pitch and yaw.
[Figure: orthographic, top and front views of the head-neck model, with axes X, Y, Z, the angles roll, pitch and yaw, and the dimensions a, b and c; image-plane circle of radius $r_i$ with the points $(dx_0, dy_0)$ and $(dx_t, dy_t)$]

Estimation of Pitch and Yaw
$r_w$: radius of the sphere where the center of the eyes lies: $r_w = \sqrt{a^2 + c^2}$
$r_i$: radius of the circle where that sphere is projected: $r_i = r_w \cdot f / p_z$
$(dx_0, dy_0)$: initial center of the eyes.
$(dx_t, dy_t)$: current center of the eyes: $(dx_t, dy_t) = ((e_{1x} + e_{2x})/2, (e_{1y} + e_{2y})/2)$
[Figure: image-plane circle of radius $r_i$ in axes $X_i$, $Y_i$, shown for the initial frame (pitch = 0, yaw = 0, eye center at $(dx_0, dy_0)$) and for instants t = 1 and t = 2 (eye centers at $(dx_1, dy_1)$ and $(dx_2, dy_2)$)]

Estimation of Pitch and Yaw
In essence, we have a problem of computing altitude and latitude for a given point in a circle.
The center of the circle is: $(dx_0, dy_0 - a \cdot f / p_z)$
So we have:
$pitch = \arcsin\left(\frac{dy_t - (dy_0 - a \cdot f / p_z)}{r_i}\right) - \arcsin\left(\frac{a}{c}\right)$
And:
$yaw = \arcsin\left(\frac{dx_t - dx_0}{r_i \cdot \cos(pitch + \arcsin(a/c))}\right)$
[Figure: image-plane circle of radius $r_i$ with the points $(dx_0, dy_0)$ and $(dx_t, dy_t)$]
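The pitch and yaw equations on this slide can be sketched as below, following the formulas as they appear in the transcript. Several things are assumptions of this sketch: a, c and pz share one length unit, f is in pixels, a < c so that arcsin(a/c) is defined, and the arcsine arguments are clamped to [-1, 1] to tolerate tracking noise (the clamping is not part of the slide). The function name `estimate_pitch_yaw` and the sample values are illustrative only.

```python
import math


def _clamp(v):
    """Clamp to the valid arcsine domain (added here for noisy input)."""
    return max(-1.0, min(1.0, v))


def estimate_pitch_yaw(dxt, dyt, dx0, dy0, a, c, f, pz):
    """Return (pitch, yaw) in radians from the current and initial eye centers."""
    r_w = math.hypot(a, c)       # sphere radius: sqrt(a^2 + c^2)
    r_i = r_w * f / pz           # radius of the projected circle, in pixels
    offset = math.asin(a / c)    # constant angular offset used on the slide
    center_y = dy0 - a * f / pz  # vertical coordinate of the circle center
    pitch = math.asin(_clamp((dyt - center_y) / r_i)) - offset
    yaw = math.asin(_clamp((dxt - dx0) / (r_i * math.cos(pitch + offset))))
    return pitch, yaw


# Made-up geometry: a = 2, c = 10, pz = 60 (same unit), f = 500 px.
geom = dict(dx0=160.0, dy0=120.0, a=2.0, c=10.0, f=500.0, pz=60.0)
pitch0, yaw0 = estimate_pitch_yaw(160.0, 120.0, **geom)  # initial frame
pitch1, yaw1 = estimate_pitch_yaw(180.0, 120.0, **geom)  # eyes moved 20 px in x
pitch2, yaw2 = estimate_pitch_yaw(160.0, 130.0, **geom)  # eyes moved 10 px in y
```

With these sample values the initial pitch comes out close to zero rather than exactly zero, because arcsin(a/c) and arcsin(a/r_w) differ slightly when a is small relative to c.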

Experimental Results (1/7)
Experiments carried out:
– Off-the-shelf webcams.
– Different individuals.
– Variations in facial expressions and facial elements (glasses).
Studies of robustness and efficiency, and comparison with a projection-based 3D estimation algorithm.
On a Pentium IV at 2.6 GHz: ~5 ms file reading, ~3 ms tracking, ~0.006 ms pose estimation

Experimental Results (2/7)
Sample input video: bego.a.avi
320x240 pixels, 312 frames at 25 fps, laptop webcam

Experimental Results (3/7)
3D pose estimation results
320x240 pixels, 312 frames at 25 fps, laptop webcam

Experimental Results (4/7)
[Figure: plots comparing the proposed method with the projection-based method; surviving panel label: Pitch]

Experimental Results (5/7)
Range of working angles… Approx. ±20° in pitch and ±40° in yaw.
The 2D tracker is not explicitly prepared for profile faces!

Experimental Results (6/7)
With glasses and without glasses

Experimental Results (7/7)
When the fixed-body assumption does not hold:
Body/shoulder tracking could be used to compensate for body movement.

Conclusions (1/3)
Our purpose was to design a fast, robust, generic and approximate 3D pose estimation method:
– Separation of 2D tracking and 3D pose.
– Fixed-body assumption.
– Robotic head model.
3D position is computed in the first frame. 3D orientation is estimated in the rest of the frames.
The estimation process is very simple, and avoids inaccuracies in the 2D tracker.

Conclusions (2/3)
Future work: using the 3D pose estimator in a perceptual interface.

Conclusions (3/3)
The simplifications introduced lead to several limitations of our system, but in general…
Human anatomy of the head/neck system could be used in 3D face trackers.
The human head cannot move independently of the body!
Taking advantage of these anatomical limitations could simplify and improve current trackers.

Last
This work has been supported by the project Consolider Ingenio-2010 CSD , and TIN C
Sample videos:
Grupo PARP web page:
Thank you very much