Joint Face Alignment The Recognition Pipeline

Presentation transcript:

Unsupervised Joint Alignment of Complex Images
Gary B Huang, Vidit Jain, Erik Learned-Miller

Joint Face Alignment
The Recognition Pipeline
- Most systems ignore the middle stage, alignment, relying on the initial detector to do a rough alignment
- Alignment reduces variability and allows for conditioning on spatial position and analysis of structure
- Two major drawbacks to current alignment methods
  - Designed for a single class
  - Require manual labeling of either specific features or pose
    - More involved than the simple discrete labels used for detection and recognition
    - AAM: ~80 landmarks for >100 training images
- Unsupervised method with congealing
  - No manually selected landmarks or hand-selected parts
  - No image explicitly labeled as canonical pose
  - End result entirely determined by data

Congealing
- Intuition
  - Intra-class images have similar structure and shape
  - Thus, low variability of pixel values at a specific location
- Distribution Field
  - Distribution over alphabet ({0,1} for binary images) at each pixel
  - Set of images defines an empirical distribution field
- Congealing
  - Update distribution field from transformed images
  - Increase likelihood of each image with respect to distribution field
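The distribution-field idea for binary images can be sketched in a few lines of NumPy. This is only an illustration under assumptions, not the presenters' code: the function names, the `eps` clipping constant, and the toy vertical-bar images are all hypothetical.

```python
import numpy as np

def distribution_field(images, eps=1e-6):
    """Empirical per-pixel probability of value 1 for a stack of binary images.

    images: array of shape (n_images, height, width) with values in {0, 1}.
    eps is an assumed clipping constant that keeps the logarithms finite.
    """
    p_one = np.asarray(images, dtype=float).mean(axis=0)
    return np.clip(p_one, eps, 1.0 - eps)

def field_entropy(p_one):
    """Sum over pixels of the Bernoulli entropy, in bits."""
    return float(-(p_one * np.log2(p_one) + (1 - p_one) * np.log2(1 - p_one)).sum())

def log_likelihood(image, p_one):
    """Log-likelihood of one binary image under a distribution field."""
    image = np.asarray(image, dtype=float)
    return float((image * np.log(p_one) + (1 - image) * np.log(1 - p_one)).sum())

def bar_image(col, size=5):
    """Toy binary image: a single vertical bar in the given column."""
    img = np.zeros((size, size), dtype=int)
    img[:, col] = 1
    return img

aligned = np.stack([bar_image(2) for _ in range(4)])
shifted = np.stack([bar_image(c) for c in (1, 2, 3, 2)])

# A well-aligned stack has lower total entropy than a misaligned one.
print(field_entropy(distribution_field(aligned)) < field_entropy(distribution_field(shifted)))
```

In this toy setting, an image whose bar sits where most of the stack has its bar scores a higher likelihood than one whose bar is far away, which is the quantity a congealing step tries to increase.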

Congealing
- How to align a new image after congealing?
  - Insert into training set, re-run algorithm
  - More efficient to save the sequence of distribution fields from congealing
  - High-entropy to low-entropy sequence → "Image Funnel"
- Funneling: increase likelihood of new image at each iteration according to the corresponding distribution field
[Figure: New Image → Image Funnel → Aligned Image]
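A minimal sketch of funneling, assuming binary images and a saved list of per-pixel fields. Candidate transformations are restricted here to one-pixel horizontal shifts via `np.roll`, a stand-in chosen for brevity rather than the transformation family used in the talk; `funnel`, `make_field`, and the toy fields are hypothetical.

```python
import numpy as np

def log_likelihood(image, p_one):
    """Log-likelihood of a binary image under a per-pixel Bernoulli field."""
    return float((image * np.log(p_one) + (1 - image) * np.log(1 - p_one)).sum())

def funnel(image, fields, shifts=(-1, 0, 1)):
    """Pass a new binary image through a saved sequence of distribution fields.

    At each stage, keep the candidate horizontal shift that gives the highest
    likelihood under that stage's field. Returns the aligned image and the
    accumulated shift.
    """
    current = np.asarray(image)
    total_shift = 0
    for p_one in fields:
        best = max(shifts, key=lambda s: log_likelihood(np.roll(current, s, axis=1), p_one))
        current = np.roll(current, best, axis=1)
        total_shift += best
    return current, total_shift

def make_field(col_probs, height=7):
    """Toy field that is constant down each column, clipped away from 0 and 1."""
    return np.tile(np.clip(col_probs, 1e-3, 1 - 1e-3), (height, 1))

# High-entropy (broad) field first, low-entropy (sharp) field last.
broad = make_field(np.array([0.05, 0.2, 0.5, 0.9, 0.5, 0.2, 0.05]))
sharp = make_field(np.array([0.0, 0.0, 0.1, 1.0, 0.1, 0.0, 0.0]))

new_image = np.zeros((7, 7), dtype=int)
new_image[:, 1] = 1  # vertical bar two columns left of the field's peak

aligned_image, total_shift = funnel(new_image, [broad, sharp])
print(total_shift)
```

The broad field pulls the bar one column toward the peak and the sharp field finishes the move, which is why the saved sequence is applied in high-entropy to low-entropy order.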

Congealing Complex Images
- Congealing has proven to work well on certain object classes
  - Traditionally applied directly to pixel values
  - Applied successfully to binary handwritten digits and MRI volumes
- Our goal: extend congealing to deal with noise in real-world images
  - Complex and variable lighting effects
  - Occlusions
  - Highly varied foreground objects (hair, hats, glasses…)
  - Highly varied backgrounds

Congealing Complex Images
Extending Congealing to Complex Images
- Traditionally congealing is done on pixel intensities
  - High variation due to lighting and variable foreground → high entropy even when correctly aligned
- Congealing on edge values
  - No "basin of attraction", plateaus in optimization landscape
- Integrate over window → SIFT descriptor at each pixel
  - Each descriptor is a 32-dimensional vector, too large to estimate entropy

Congealing Complex Images
Extending Congealing to Face Images (cont)
- Cluster SIFT descriptors using k-means
- Congealing on hard assignments forces pixels to take a relatively small number of values
  - Similar local minima problems as with edge values
  - Initial experiments with hard assignments led to congealing terminating early with no significant changes from initial alignment
- Use soft assignment of pixels to clusters
  - Each pixel is a multinomial distribution, with probabilities equal to the probability of belonging to each cluster
  - Does not change nature of distribution field
    - Distribution field is still a set of distributions, one at each pixel, over the possible clusters
  - Analogy with grayscale using binary alphabet
    - Gray pixels are treated as mixtures of underlying black and white "subpixels"
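One way to picture the soft assignment is the sketch below. The slides do not say how the cluster probabilities are computed, so the equal cluster priors and the isotropic Gaussian of width `sigma` around each k-means centre are assumptions, as are the function names.

```python
import numpy as np

def soft_assign(descriptors, centers, sigma=1.0):
    """Posterior cluster probabilities for each descriptor.

    descriptors: (n, d) array; centers: (k, d) array of k-means centres.
    Returns an (n, k) array whose rows sum to 1.
    Assumes equal priors and an isotropic Gaussian of width sigma per cluster.
    """
    diffs = descriptors[:, None, :] - centers[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def soft_distribution_field(posteriors):
    """Average per-pixel posteriors over images: (n_images, h, w, k) -> (h, w, k)."""
    return np.asarray(posteriors).mean(axis=0)

centers = np.array([[0.0, 0.0], [2.0, 0.0]])
descriptors = np.array([[1.0, 0.0],   # halfway between the two centres
                        [0.0, 0.0]])  # exactly on the first centre
posteriors = soft_assign(descriptors, centers)
print(posteriors)
```

A descriptor halfway between two centres is split evenly between them instead of being forced into one cluster, which mirrors the gray-pixel analogy: the field is still one distribution per pixel, but each image contributes fractional counts.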

Congealing Complex Images
[Figure: window around pixel → SIFT vector and clusters → posterior distribution]

Results (faces)
- Congealed with 300 images from "Faces in the Wild"
  - Realistic data set of news photos with different people, complex backgrounds, variable illumination and foreground appearance

Results (cars)
- Congealed with 125 rear car images (variable background/lighting)
- Achieved with no labeling and no changes to code

Results on Recognition
- Tested effect on recognition
  - Used trained hyper-feature based recognizer (Jain et al)
  - Tested using outputs of Viola-Jones, Zhou (supervised), and funneling
- Congealing improves recognition with no added supervision

Alignment       AUC
Unaligned       0.6870
Zhou aligned    0.7312
Congealing      0.7549

Future Work
- Two-tier alignment process
  - Score alignment results based on likelihood under final distribution field, align low-scoring images in a separate stage
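The scoring half of such a two-tier process might look like the sketch below, assuming binary images and a per-pixel Bernoulli field. The cutoff rule (flag a fixed fraction of the lowest scores) and all names are hypothetical; the slides leave the second stage unspecified, so it is not sketched.

```python
import numpy as np

def alignment_scores(aligned_images, p_one):
    """Log-likelihood of each aligned binary image under the final field."""
    imgs = np.asarray(aligned_images, dtype=float)
    per_pixel = imgs * np.log(p_one) + (1 - imgs) * np.log(1 - p_one)
    return per_pixel.sum(axis=(1, 2))

def low_scoring(scores, fraction=0.25):
    """Indices of the lowest-scoring fraction of images (assumed cutoff rule)."""
    n_flag = max(1, int(round(fraction * len(scores))))
    return np.argsort(scores)[:n_flag]

# Toy final field: a confident vertical bar in column 2.
final_field = np.full((5, 5), 0.01)
final_field[:, 2] = 0.99

def bar_image(col, size=5):
    """Toy binary image: a single vertical bar in the given column."""
    img = np.zeros((size, size), dtype=int)
    img[:, col] = 1
    return img

# Three well-aligned images and one that ended up with its bar in column 0.
images = np.stack([bar_image(2), bar_image(2), bar_image(2), bar_image(0)])
scores = alignment_scores(images, final_field)
flagged = low_scoring(scores)
print(flagged)
```

The flagged indices are the images that would be handed to the separate second stage.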