BEYOND SIMPLE FEATURES: A LARGE-SCALE FEATURE SEARCH APPROACH TO UNCONSTRAINED FACE RECOGNITION
Nicolas Pinto, Massachusetts Institute of Technology
David Cox, The Rowland Institute at Harvard, Harvard University
International Conference on Automatic Face and Gesture Recognition (FG), 2011
Outline
- Introduction
- Method
  - V1-like visual representation
  - High-throughput-derived multilayer visual representations
  - Kernel combination
- Experiment
- Results
- Discussion
Introduction
"Biologically-inspired" representations capture aspects of the computational architecture of the brain and mimic its computational abilities.
Introduction
Large-scale feature search framework: generate many candidate models with different parameters, then screen them for the best performers.
Method – V1-like visual representation
A "null model" that represents only a first-order description of the primary visual cortex.
Details:
- Preprocessing: resize the image to 150 pixels (aspect ratio preserved) using bicubic interpolation
- Input normalization: divide each pixel's intensity by the norm of the pixels in its 3x3 neighborhood
- Gabor wavelets: 16 orientations, 6 spatial frequencies
- Output normalization: divide by the norm of the pixels in the 3x3 neighborhood
- Thresholding and clipping: output values outside (0, 1) are set to 0 or 1
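The local input/output normalization step above divides each pixel by the norm of its 3x3 neighborhood. A minimal NumPy sketch of that operation (the zero-padding and `eps` stabilizer are illustrative choices, not taken from the paper):

```python
import numpy as np

def local_divisive_norm(img, eps=1e-6):
    """Divide each pixel by the L2 norm of its 3x3 neighborhood.

    `eps` guards against division by zero (an assumption, not from the paper).
    """
    h, w = img.shape
    padded = np.pad(img, 1, mode="constant")
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            out[i, j] = img[i, j] / (np.linalg.norm(patch) + eps)
    return out

img = np.random.rand(8, 8)
normed = local_divisive_norm(img)
```

Because the neighborhood norm always includes the center pixel, each normalized response is bounded by 1 in magnitude.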
V1-like visual representation – Gabor filters
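A Gabor filter is a sinusoidal carrier windowed by a Gaussian envelope. The sketch below builds a 16-orientation bank at one spatial frequency, matching the filterbank layout described on the previous slide; the kernel size, wavelength, and sigma values are illustrative, not the paper's:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, phase=0.0):
    """Real-valued Gabor: a cosine carrier windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the filter orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    return envelope * carrier

# A 16-orientation bank at a single spatial frequency (parameters illustrative).
bank = [gabor_kernel(15, wavelength=6.0, theta=k * np.pi / 16, sigma=4.0)
        for k in range(16)]
```

The full V1-like representation would repeat this bank at 6 spatial frequencies (wavelengths) and convolve each filter with the normalized image.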
Method – High-throughput-derived multilayer visual representations
Model architecture: candidate models were composed of a hierarchy of two (HT-L2) or three (HT-L3) layers.
High-throughput-derived multilayer visual representations
Input size:
- HT-L2: 100 x 100 pixels
- HT-L3: 200 x 200 pixels
Input was converted to grayscale and locally normalized.
High-throughput-derived multilayer visual representations
Activation Function
Output values were clipped to lie within a parametrically defined range.
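The clipping activation can be sketched in one line with NumPy; the sample bounds below are hypothetical (in the models, the range endpoints are themselves searched-over parameters):

```python
import numpy as np

def clip_activation(x, lo, hi):
    """Clip responses to a parametrically defined [lo, hi] range."""
    return np.clip(x, lo, hi)

x = np.array([-2.0, 0.3, 5.0])
y = clip_activation(x, 0.0, 1.0)  # values below 0 -> 0, above 1 -> 1
```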
High-throughput-derived multilayer visual representations
Pooling
Outputs from neighboring regions were pooled together, and the resulting outputs were spatially downsampled.
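A minimal sketch of pooling followed by downsampling, assuming an Lp-pooling rule over non-overlapping 2x2 neighborhoods (the neighborhood size, stride, and exponent are illustrative; in the models they are searched-over parameters):

```python
import numpy as np

def pool_and_downsample(fmap, size=2, stride=2, p=2):
    """Lp-pool size x size neighborhoods, downsampling by `stride`."""
    h, w = fmap.shape
    out = np.empty((h // stride, w // stride))
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            patch = fmap[i:i + size, j:j + size]
            # Lp pooling: (sum |x|^p)^(1/p); p=2 gives an L2 "energy" pool.
            out[i // stride, j // stride] = (np.abs(patch) ** p).sum() ** (1.0 / p)
    return out

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = pool_and_downsample(fmap)  # a 4x4 map pools down to 2x2
```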
High-throughput-derived multilayer visual representations
Normalization
Draws biological inspiration from the competitive interactions observed in natural neuronal systems (e.g., contrast gain control mechanisms in cortical area V1 and elsewhere).
High-throughput-derived multilayer visual representations
Method – Evaluation
Method – Model overview
Method – Screening
Screening (model selection): select the best five models on the LFW View 1 aligned set.
Output dimensions ranged from 256 to 73,984.
Number of models screened:
- HT-L2: 5,915
- HT-L3: 6,917
Feature Augmentation
Multiple rescaled crops: three different centered crops (250x250, 150x150, 125x75), each resized to the standard input size, with SVMs trained separately on each.
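Extracting a centered crop before resizing can be sketched as follows (the resize step itself is omitted; any bicubic resampler would do):

```python
import numpy as np

def center_crop(img, ch, cw):
    """Take a ch x cw crop around the image center."""
    h, w = img.shape[:2]
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

img = np.zeros((250, 250))
crops = [center_crop(img, 250, 250),   # full image
         center_crop(img, 150, 150),   # tighter face crop
         center_crop(img, 125, 75)]    # tightest, portrait-shaped crop
# Each crop would then be resized to the model's standard input size.
```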
Kernel Combination – three strategies:
1. Blend kernels resulting from different crops: simple kernel addition, with each kernel trace-normalized.
2. Blend the five screened models within the same model class.
3. Hierarchical blends across model classes: assign exponentially larger weights to higher-level representations (V1-like < HT-L2 < HT-L3).
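A sketch of trace-normalized kernel blending, covering strategies 1 and 3 above. The convention used here (rescale each kernel so its trace equals the number of samples) is one common choice and an assumption on my part, as is the uniform default weighting:

```python
import numpy as np

def trace_normalize(K):
    """Rescale a kernel matrix so its trace equals its size (one common convention)."""
    return K * (K.shape[0] / np.trace(K))

def blend_kernels(kernels, weights=None):
    """Weighted sum of trace-normalized kernels; uniform weights by default.

    Non-uniform weights cover the hierarchical blend: e.g. exponentially
    larger weights for V1-like < HT-L2 < HT-L3.
    """
    if weights is None:
        weights = np.ones(len(kernels)) / len(kernels)
    return sum(w * trace_normalize(K) for w, K in zip(weights, kernels))

X1, X2 = np.random.rand(5, 10), np.random.rand(5, 20)
K = blend_kernels([X1 @ X1.T, X2 @ X2.T])
```

Trace normalization puts kernels built from representations of very different dimensionality (256 to 73,984 features) on a comparable scale before they are added.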
Kernel Combination – kernel method example:
Kernel Combination
The original formulation is equivalent: adding kernels corresponds to concatenating the underlying feature spaces.
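For linear kernels this equivalence is a standard identity and easy to verify numerically: the sum of the Gram matrices of two feature sets equals the Gram matrix of their concatenation.

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.standard_normal((4, 3))   # features from one representation
X2 = rng.standard_normal((4, 5))   # features from another

K_sum = X1 @ X1.T + X2 @ X2.T                        # adding linear kernels...
X_cat = np.hstack([X1, X2])
K_cat = X_cat @ X_cat.T                              # ...equals concatenating features

assert np.allclose(K_sum, K_cat)
```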
Kernel Combination
Multiple Kernel Learning (MKL): learn the combined kernel directly from the data.
Kernel Combination – Multiple Kernel Learning (MKL)
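MKL typically learns a convex combination K = Σ μ_m K_m with μ_m ≥ 0. As a minimal illustration (this alignment heuristic is my own stand-in, not the MKL solver used in the paper), one can weight each base kernel by its kernel-target alignment with the labels:

```python
import numpy as np

def alignment(K, y):
    """Kernel-target alignment: cosine similarity between K and the ideal kernel yy^T."""
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

def heuristic_weights(kernels, y):
    """Weight each base kernel by its (clipped-to-nonnegative) label alignment."""
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    return a / a.sum()

rng = np.random.default_rng(1)
y = np.array([1, 1, -1, -1])
X = rng.standard_normal((4, 6))
kernels = [X @ X.T, np.eye(4)]          # two base kernels (illustrative)
w = heuristic_weights(kernels, y)       # learned convex-combination weights
K_learned = sum(wi * Ki for wi, Ki in zip(w, kernels))
```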
Experiment
Screen models on LFW View 1; train SVMs and evaluate using 10-fold cross-validation on LFW View 2.
Results
Some error cases
Discussion
- The models use whole-image pixel values and do not explicitly handle pose variation.
- Do they take advantage of background information, or are they disturbed by it?
- Performance increases when different crops are added.
16-GPU "Monster-Class" Supercomputer
Environment: GNU/Linux; Python, C, C++, Cython; CUDA, PyCUDA