Face Detection, Pose Estimation, and Landmark Localization in the Wild

Slides:



Advertisements
Similar presentations
Latent SVMs for Human Detection with a Locally Affine Deformation Field Ľubor Ladický 1 Phil Torr 2 Andrew Zisserman 1 1 University of Oxford 2 Oxford.
Advertisements

O BJ C UT M. Pawan Kumar Philip Torr Andrew Zisserman UNIVERSITY OF OXFORD.
Weakly supervised learning of MRF models for image region labeling Jakob Verbeek LEAR team, INRIA Rhône-Alpes.
Learning Shared Body Plans Ian Endres University of Illinois work with Derek Hoiem, Vivek Srikumar and Ming-Wei Chang.
Face Alignment by Explicit Shape Regression
Pose Estimation and Segmentation of People in 3D Movies Karteek Alahari, Guillaume Seguin, Josef Sivic, Ivan Laptev Inria, Ecole Normale Superieure ICCV.
A Unified Framework for Context Assisted Face Clustering
Factorial Mixture of Gaussians and the Marginal Independence Model Ricardo Silva Joint work-in-progress with Zoubin Ghahramani.
Active Learning for Streaming Networked Data Zhilin Yang, Jie Tang, Yutao Zhang Computer Science Department, Tsinghua University.
Ľubor Ladický1 Phil Torr2 Andrew Zisserman1
Carolina Galleguillos, Brian McFee, Serge Belongie, Gert Lanckriet Computer Science and Engineering Department Electrical and Computer Engineering Department.
Hongliang Li, Senior Member, IEEE, Linfeng Xu, Member, IEEE, and Guanghui Liu Face Hallucination via Similarity Constraints.
Face Alignment with Part-Based Modeling
Many slides based on P. FelzenszwalbP. Felzenszwalb General object detection with deformable part-based models.
Parsing Clothing in Fashion Photographs
Mixture of trees model: Face Detection, Pose Estimation and Landmark Localization Presenter: Zhang Li.
Automatic Feature Extraction for Multi-view 3D Face Recognition
Semi-Supervised Hierarchical Models for 3D Human Pose Reconstruction Atul Kanaujia, CBIM, Rutgers Cristian Sminchisescu, TTI-C Dimitris Metaxas,CBIM, Rutgers.
Steerable Part Models Hamed Pirsiavash and Deva Ramanan
Outline  Facial Attributes Analysis  Animated Pose Templates(APT) for Modeling and Detecting Human Actions  Unsupervised Structure Learning of Stochastic.
Intelligent Systems Lab. Recognizing Human actions from Still Images with Latent Poses Authors: Weilong Yang, Yang Wang, and Greg Mori Simon Fraser University,
2.Our Framework 2.1. Enforcing Temporal Consistency by Post Processing  Human Detection from Yang and Ramanan [1] Articulated Pose Estimation using Flexible.
Human Action Recognition by Learning Bases of Action Attributes and Parts.
Enhancing Exemplar SVMs using Part Level Transfer Regularization 1.
More sliding window detection: Discriminative part-based models Many slides based on P. FelzenszwalbP. Felzenszwalb.
DISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION ECCV 12 Bharath Hariharan, Jitandra Malik, and Deva Ramanan.
Model: Parts and Structure. History of Idea Fischler & Elschlager 1973 Yuille ‘91 Brunelli & Poggio ‘93 Lades, v.d. Malsburg et al. ‘93 Cootes, Lanitis,
One-Shot Multi-Set Non-rigid Feature-Spatial Matching
Beyond Actions: Discriminative Models for Contextual Group Activities Tian Lan School of Computing Science Simon Fraser University August 12, 2010 M.Sc.
Locally Constraint Support Vector Clustering
Real-time Combined 2D+3D Active Appearance Models Jing Xiao, Simon Baker,Iain Matthew, and Takeo Kanade CVPR 2004 Presented by Pat Chan 23/11/2004.
3D Hand Pose Estimation by Finding Appearance-Based Matches in a Large Database of Training Views
Lecture 17: Parts-based models and context CS6670: Computer Vision Noah Snavely.
Image Matching via Saliency Region Correspondences Alexander Toshev Jianbo Shi Kostas Daniilidis IEEE Conference on Computer Vision and Pattern Recognition.
Proxemics Recognition
Generic object detection with deformable part-based models
Feature Matching Longin Jan Latecki. Matching example Observe that some features may be missing due to instability of feature detector, view change, or.
Marco Pedersoli, Jordi Gonzàlez, Xu Hu, and Xavier Roca
Learning Collections of Parts for Object Recognition and Transfer Learning University of Illinois at Urbana- Champaign.
Object Detection with Discriminatively Trained Part Based Models
Lecture 31: Modern recognition CS4670 / 5670: Computer Vision Noah Snavely.
Deformable Part Model Presenter : Liu Changyu Advisor : Prof. Alex Hauptmann Interest : Multimedia Analysis April 11 st, 2013.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
Deformable Part Models (DPM) Felzenswalb, Girshick, McAllester & Ramanan (2010) Slides drawn from a tutorial By R. Girshick AP 12% 27% 36% 45% 49% 2005.
Chao-Yeh Chen and Kristen Grauman University of Texas at Austin Efficient Activity Detection with Max- Subgraph Search.
Tell Me What You See and I will Show You Where It Is Jia Xu 1 Alexander G. Schwing 2 Raquel Urtasun 2,3 1 University of Wisconsin-Madison 2 University.
CVPR2013 Poster Detecting and Naming Actors in Movies using Generative Appearance Models.
VIP: Finding Important People in Images Clint Solomon Mathialagan Andrew C. Gallagher Dhruv Batra CVPR
Paper Reading Dalong Du Nov.27, Papers Leon Gu and Takeo Kanade. A Generative Shape Regularization Model for Robust Face Alignment. ECCV08. Yan.
Discussion of Pictorial Structures Pedro Felzenszwalb Daniel Huttenlocher Sicily Workshop September, 2006.
Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11.
Recognition Using Visual Phrases
Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations ZUO ZHEN 27 SEP 2011.
Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.
Learning video saliency from human gaze using candidate selection CVPR2013 Poster.
CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.
Scene Parsing with Object Instances and Occlusion Ordering JOSEPH TIGHE, MARC NIETHAMMER, SVETLANA LAZEBNIK 2014 IEEE CONFERENCE ON COMPUTER VISION AND.
1 Bilinear Classifiers for Visual Recognition Computational Vision Lab. University of California Irvine To be presented in NIPS 2009 Hamed Pirsiavash Deva.
Object detection with deformable part-based models
Compositional Human Pose Regression
Nonparametric Semantic Segmentation
Huazhong University of Science and Technology
Thesis Advisor : Prof C.V. Jawahar
Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science
A Convolutional Neural Network Cascade For Face Detection
Brief Review of Recognition + Context
Author: Ye Li, Meng Joo Er, and Dayong Shen Speaker: Kai-Wen, Weng
Outline Background Motivation Proposed Model Experimental Results
Active Appearance Models theory, extensions & cases
Deep Object Co-Segmentation
Presentation transcript:

Face Detection, Pose Estimation, and Landmark Localization in the Wild Xiangxin Zhu Deva Ramanan Dept. of Computer Science, University of California, Irvine

Outline Introduction Model Inference Learning Experimental Results Conclusion

Introduction A unified model for face detection, pose estimation,and landmark estimation. Based on a mixtures of trees with a shared pool of parts Use global mixtures to capture topological changes

Introduction

mixture-of-trees model

Tree structured part model We write each tree Tm =(Vm,Em) as a linearly-parameterized ,where m indicates a mixture and . I : image, and li = (xi, yi) : the pixel location of part i. We score a configuration of parts : a scalar bias associated with viewpoint mixture m

Tree structured part model Sums the appearance evidence for placing a template for part i, tuned for mixture m, at location li. Scores the mixture-specific spatial arrangement of parts L

Shape model the shape model can be rewritten : re-parameterizations of the shape model (a, b, c, d) : a block sparse precision matrix, with non-zero entries corresponding to pairs of parts i, j connected in Em.

The mean shape and deformation modes

Inference Inference corresponds to maximizing S(I, L,m) in Eqn.1 over L and m: Since each mixture Tm =(Vm,Em) is a tree, the inner maximization can be done efficiently with dynamic programming.

Learning Given labeled positive examples {In,Ln,mn} and negative examples {In}, we will define a structured prediction objective function similar to one proposed in [41]. To do so, let’s write zn = {Ln,mn}. Concatenating Eqn1’s parameters into a single vector [41] Y. Yang and D. Ramanan. Articulated pose estimation using flexible mixtures of parts. In CVPR 2011.

Learning Now we can learn a model of the form: The objective function penalizes violations of these constraints using slack variables write K for the indices of the quadratic spring terms (a, c) in parameter vector .

Experimental Results

Dataset CMU MultiPIE annotated face in-the-wild (AFW) (from Flickr images)

Dataset

Sharing We explore 4 levels of sharing, denoting each model with the number of distinct templates encoded. Share-99 (i.e. fully shared model) Share-146 Share-622 Independent-1050 (i.e. independent model)

In-house baselines We define Multi.HoG to be rigid, multiview HoG template detectors, trained on the same data as our models. We define Star Model to be equivalent to Share- 99 but defined using a “star” connectivity graph, where all parts are directly connected to a root nose part.

Face detection on AFW testset [22] Z. Kalal, J. Matas, and K. Mikolajczyk. Weighted sampling for large-scale boosting. In BMVC 2008.

Pose estimation

Landmark localization

Landmark localization

AFW image

Conclusion Our model outperforms state-of-the-art methods, including large-scale commercial systems, on all three tasks under both constrained and in-the-wild environments.

Thanks for listening