BEYOND SIMPLE FEATURES: A LARGE-SCALE FEATURE SEARCH APPROACH TO UNCONSTRAINED FACE RECOGNITION Nicolas Pinto Massachusetts Institute of Technology David.

Presentation transcript:

BEYOND SIMPLE FEATURES: A LARGE-SCALE FEATURE SEARCH APPROACH TO UNCONSTRAINED FACE RECOGNITION Nicolas Pinto Massachusetts Institute of Technology David Cox The Rowland Institute at Harvard, Harvard University International Conference on Automatic Face and Gesture Recognition (FG), 2011.

Outline
• Introduction
• Method
  • V1-like visual representation
  • High-throughput-derived multilayer visual representations
• Kernel Combination
• Experiment Result
• Discussion

Introduction
• “Biologically-inspired” representation
  • Capture aspects of the computational architecture of the brain and mimic its computational abilities

Introduction
• Large-scale feature search framework
  • Generate many candidate models with different parameters, then screen them

Method - V1-like visual representation
• “Null model”: represents only a first-order description of the primary visual cortex
• Details
  • Preprocessing: resize the image to 150 pixels, preserving the aspect ratio, using bicubic interpolation
  • Input normalization: divide each pixel’s intensity value by the norm of the pixels in its 3x3 neighborhood
  • Gabor wavelets: 16 orientations, 6 spatial frequencies
  • Output normalization: divide by the norm of the pixels in the 3x3 neighborhood
  • Thresholding and clipping: output values outside (0, 1) are set to the nearer bound, 0 or 1
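The normalization and clipping steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names, the edge padding, and the small `eps` term are assumptions made here so the example runs on any 2-D array.

```python
import numpy as np

def local_norm_3x3(img, eps=1e-8):
    """Divide each pixel by the L2 norm of its 3x3 neighborhood.

    Edge padding and the eps stabilizer are assumptions of this sketch.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img ** 2, 1, mode="edge")
    # Sum the nine shifted copies of the squared image: 3x3 local energy.
    energy = sum(padded[dy:dy + h, dx:dx + w]
                 for dy in range(3) for dx in range(3))
    return img / np.sqrt(energy + eps)

def clip_unit(x):
    """Clip responses into [0, 1]: values below 0 become 0, above 1 become 1."""
    return np.clip(x, 0.0, 1.0)
```

The same `local_norm_3x3` helper could be applied both to the input image and to each filter response map, since the slide describes the two normalizations in the same terms.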

V1-like visual representation
• Gabor filter
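A Gabor filter is a sinusoidal carrier multiplied by a Gaussian envelope. The sketch below builds a bank with 16 orientations and 6 spatial frequencies, matching the counts on the earlier slide. The kernel size, the envelope width `sigma`, and the specific frequency values are assumptions chosen for illustration; the transcript does not state them.

```python
import numpy as np

def gabor_kernel(size, theta, freq, sigma):
    """Real (cosine-phase) Gabor kernel, zero-mean and unit-norm."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Coordinate along the direction of the sinusoidal carrier.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * freq * xr)
    kernel -= kernel.mean()
    return kernel / np.linalg.norm(kernel)

# Hypothetical parameter values (cycles per pixel); only the counts 16 and 6
# come from the slides.
FREQS = [1 / 2, 1 / 3, 1 / 4, 1 / 6, 1 / 11, 1 / 18]
ORIENTATIONS = [np.pi * i / 16 for i in range(16)]

bank = [gabor_kernel(43, theta, f, sigma=9.0)
        for theta in ORIENTATIONS for f in FREQS]
```

Each image would then be convolved with every kernel in `bank`, giving 96 response maps per image.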

Method - High-throughput-derived multilayer visual representations
• Model architecture
  • Candidate models were composed of a hierarchy of two (HT-L2) or three (HT-L3) layers

High-throughput-derived multilayer visual representations
• Input size
  • HT-L2: 100 x 100 pixels
  • HT-L3: 200 x 200 pixels
• Input was converted to grayscale and locally normalized

High-throughput-derived multilayer visual representations

• Activation function
  • Output values were clipped to be within a parametrically defined range

High-throughput-derived multilayer visual representations

• Pooling
  • Outputs in neighboring regions were pooled together, and the resulting outputs were spatially downsampled
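A minimal sketch of pooling followed by spatial downsampling on a single 2-D feature map. The slide does not say which pooling rule is used, so the plain sum over each window, the window size, and the stride are all assumptions of this example.

```python
import numpy as np

def pool_and_downsample(fmap, size=3, stride=2):
    """Sum-pool size x size windows, keeping one output every `stride` pixels."""
    fmap = np.asarray(fmap, dtype=np.float64)
    h, w = fmap.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * stride:i * stride + size,
                          j * stride:j * stride + size]
            out[i, j] = window.sum()
    return out
```

Using a stride larger than 1 is what performs the downsampling: a 7 x 7 map with `size=3` and `stride=2` becomes 3 x 3.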

High-throughput-derived multilayer visual representations

• Normalization
  • Draws biological inspiration from the competitive interactions observed in natural neuronal systems (e.g. contrast gain control mechanisms in cortical area V1, and elsewhere)

High-throughput-derived multilayer visual representations

Method - Evaluation

Method
• Model overview

Method – Screening
• Screening (model selection)
  • Select the best five models on the LFW View 1 aligned set
  • Output dimensions range from 256 to
  • Number of models:
    • HT-L2: 5915
    • HT-L3: 6917
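Screening reduces to ranking candidate models by a validation score and keeping the top few. A minimal sketch, assuming each candidate has already been scored on the screening set; the `scores` dictionary and the model names in the usage lines are hypothetical.

```python
def screen_models(scores, k=5):
    """Return the names of the k models with the highest screening score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical scores, for illustration only.
example_scores = {"m0": 0.71, "m1": 0.80, "m2": 0.65, "m3": 0.78,
                  "m4": 0.74, "m5": 0.69, "m6": 0.82}
best_five = screen_models(example_scores, k=5)
```

With thousands of candidates per class, the expensive part is computing the scores; the selection itself is a single sort.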

Feature Augmentation
• Multiple rescaled crops
  • Three different centered crops: 250x x x75
  • Each crop is resized to the standard input size
  • SVMs are trained separately on each crop
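The crop-and-resize step can be sketched as below. The crop sizes in the transcript are not fully legible, so the sizes in `CROP_SIZES` are placeholders, and nearest-neighbor resizing is used only to keep the example dependency-free; neither is taken from the slides.

```python
import numpy as np

def center_crop(img, crop_h, crop_w):
    """Cut a crop_h x crop_w window from the center of the image."""
    h, w = img.shape[:2]
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return img[top:top + crop_h, left:left + crop_w]

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize (a stand-in for a proper interpolating resize)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]

# Placeholder (height, width) crop sizes, not the values used by the authors.
CROP_SIZES = [(250, 250), (150, 150), (100, 100)]

def multi_crop(img, out_size, crop_sizes=CROP_SIZES):
    """Return one rescaled centered crop per entry in crop_sizes."""
    return [resize_nearest(center_crop(img, ch, cw), out_size, out_size)
            for ch, cw in crop_sizes]
```

Each element of the returned list would feed its own feature extraction and its own SVM, as the slide describes.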

Kernel Combination
• Three strategies
  • Blend the kernels resulting from different crops: simple kernel addition, with each kernel trace-normalized
  • Blend the 5 models within the same class
  • Hierarchical blends across model classes: assign exponentially larger weights to higher-level representations (V1-like < HT-L2 < HT-L3)
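The blending strategies above amount to weighted sums of trace-normalized kernel matrices. A minimal sketch, assuming precomputed square kernel matrices; the weight base of 2 in `hierarchical_weights` is an assumption, since the slides say only that the weights grow exponentially with the level.

```python
import numpy as np

def trace_normalize(K):
    """Scale a kernel matrix so that its trace equals 1."""
    K = np.asarray(K, dtype=np.float64)
    return K / np.trace(K)

def blend(kernels, weights=None):
    """Weighted sum of trace-normalized kernels (plain addition by default)."""
    if weights is None:
        weights = np.ones(len(kernels))
    return sum(w * trace_normalize(K) for w, K in zip(weights, kernels))

def hierarchical_weights(n_levels, base=2.0):
    """Exponentially increasing weights; `base` is a hypothetical choice."""
    return base ** np.arange(n_levels)
```

For example, `blend([K_v1, K_l2, K_l3], hierarchical_weights(3))` would weight V1-like, HT-L2 and HT-L3 kernels as 1, 2 and 4, where the three kernel names are placeholders. A non-negative weighted sum of valid kernels is itself a valid kernel, so the result can be passed to an SVM that accepts precomputed kernels.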

Kernel Combination
• Kernel method example:

Kernel Combination
• The original formulation
• Is equivalent

Kernel Combination
• Multiple Kernel Learning (MKL)
  • Learn the kernel directly from data
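For reference, one common way to write the MKL idea is as a learned convex combination of base kernels. The notation below is an assumption (the slide's own equations did not survive in the transcript): K_m are the base kernels, β_m the learned weights, and α_i, b the usual SVM dual variables and bias.

```latex
K(x, x') = \sum_{m=1}^{M} \beta_m K_m(x, x'),
\qquad \beta_m \ge 0, \quad \sum_{m=1}^{M} \beta_m = 1

f(x) = \sum_{i=1}^{N} \alpha_i y_i \sum_{m=1}^{M} \beta_m K_m(x_i, x) + b
```

In this form, fixed-weight blending is the special case where the β_m are set by hand rather than learned.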

Kernel Combination
• Multiple Kernel Learning (MKL)

Experiment
• Screen models on LFW View 1
• Train SVMs and evaluate the results using 10-fold cross-validation on LFW View 2
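The evaluation protocol can be sketched as a loop over predefined folds: train on nine, test on the held-out one, and report the mean accuracy. In this sketch `fold_ids` (one fold label per sample) and the `evaluate` callback are hypothetical; `evaluate` stands in for whatever trains the SVM on the training indices and returns accuracy on the test indices.

```python
import numpy as np

def cross_validate(fold_ids, evaluate):
    """Run k-fold cross-validation over predefined folds.

    Returns (mean accuracy, standard error of the mean) across folds.
    """
    fold_ids = np.asarray(fold_ids)
    accuracies = []
    for fold in np.unique(fold_ids):
        test_idx = np.flatnonzero(fold_ids == fold)
        train_idx = np.flatnonzero(fold_ids != fold)
        accuracies.append(evaluate(train_idx, test_idx))
    accuracies = np.array(accuracies, dtype=np.float64)
    std_err = accuracies.std(ddof=1) / np.sqrt(len(accuracies))
    return accuracies.mean(), std_err
```

Keeping the fold assignment as an input, rather than shuffling inside the function, lets the same splits be reused for every model being compared.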

Result

• Some error cases

Discussion
• Uses whole-image pixel values and does not deal with pose variation
  • Takes advantage of background information?
  • Disturbed by the background: performance increases when different crops are added

16-GPU Monster-Class Supercomputer
• Environment
  • GNU/Linux
  • Python, C, C++, Cython
  • CUDA, PyCUDA