
1 BEYOND SIMPLE FEATURES: A LARGE-SCALE FEATURE SEARCH APPROACH TO UNCONSTRAINED FACE RECOGNITION. Nicolas Pinto, Massachusetts Institute of Technology; David Cox, The Rowland Institute at Harvard, Harvard University. International Conference on Automatic Face and Gesture Recognition (FG), 2011.

2 Outline  Introduction  Method  V1-like visual representation  High-throughput-derived multilayer visual representations  Kernel Combination  Experimental Results  Discussion

3 Introduction  “Biologically-inspired” representations  Capture aspects of the computational architecture of the brain and mimic its computational abilities

4 Introduction  Large-Scale Feature Search Framework  Generate many candidate models with different parameter settings, then screen them for the best performers

5 Method - V1-like visual representation  “Null model”: represents only a first-order description of the primary visual cortex  Details  Preprocessing: resize the image to 150 pixels, preserving aspect ratio, using bicubic interpolation  Input normalization: divide each pixel’s intensity by the norm of the pixels in its 3x3 neighborhood  Gabor wavelets: 16 orientations, 6 spatial frequencies  Output normalization: divide by the norm of the pixels in the 3x3 neighborhood  Thresholding and clipping: output values outside [0, 1] are clipped to 0 or 1
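
A minimal NumPy/SciPy sketch of this pipeline (the 3x3 window, the 16 orientations and 6 frequencies, and the [0, 1] clipping follow the slide; the Gabor parameters and frequency values are assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.signal import fftconvolve

def local_divisive_norm(img, size=3, eps=1e-6):
    # Divide each pixel by the L2 norm of its size x size neighborhood.
    sq_sum = uniform_filter(img ** 2, size=size) * (size * size)
    return img / (np.sqrt(sq_sum) + eps)

def gabor_kernel(freq, theta, sigma=4.0, ksize=21):
    # Real-valued (cosine) Gabor kernel; sigma and ksize are assumed values.
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    rotated = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * freq * rotated)

def v1_like(img):
    # "Null model" pipeline: input normalization -> Gabor bank ->
    # output normalization -> threshold and clip to [0, 1].
    img = local_divisive_norm(img)
    responses = []
    for freq in np.linspace(0.05, 0.25, 6):          # 6 spatial frequencies (assumed values)
        for theta in np.arange(16) * np.pi / 16.0:   # 16 orientations
            resp = fftconvolve(img, gabor_kernel(freq, theta), mode='same')
            responses.append(local_divisive_norm(resp))
    return np.clip(np.stack(responses, axis=-1), 0.0, 1.0)
```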

6 V1-like visual representation  Gabor Filter
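
For reference, the standard 2-D Gabor filter has the form G(x, y) = exp(-(x'^2 + γ^2 y'^2) / (2σ^2)) · cos(2πx'/λ + ψ), with x' = x cos θ + y sin θ and y' = -x sin θ + y cos θ, where θ is the orientation, λ the wavelength (inverse spatial frequency), σ the envelope width, γ the aspect ratio, and ψ the phase. This is the textbook form; the exact parameterization used by the authors is not given on the slide.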

7 Method - High-throughput-derived multilayer visual representations  Model architecture:  Candidate models were composed of a hierarchy of two (HT-L2) or three (HT-L3) layers

8 High-throughput-derived multilayer visual representations  Input size  HT-L2: 100 x 100 pixels  HT-L3: 200 x 200 pixels  Input was converted to grayscale and locally normalized

9 High-throughput-derived multilayer visual representations

10

11  Activation Function  Output values were clipped to lie within a parametrically defined range
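
A sketch of this stage; the bounds are model parameters drawn during the search, and the defaults here are placeholders:

```python
import numpy as np

def activate(x, out_min=0.0, out_max=1.0):
    # Threshold and saturate: clip layer outputs to a parametrically
    # defined range [out_min, out_max] (placeholder values).
    return np.clip(x, out_min, out_max)
```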

12 High-throughput-derived multilayer visual representations

13  Pooling  Outputs within a neighboring region were then pooled together, and the resulting outputs were spatially downsampled
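
A rough sketch of this stage for a single 2-D feature map; the pool size, stride, and exponent are model parameters in the search, and the values below are placeholders:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def pool(feature_map, size=5, stride=2, order=2):
    # Lp-style pooling over a size x size neighborhood, followed by
    # spatial downsampling by `stride`.
    pooled = uniform_filter(np.abs(feature_map) ** order, size=size) ** (1.0 / order)
    return pooled[::stride, ::stride]
```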

14 High-throughput-derived multilayer visual representations

15  Normalization  Draws biological inspiration from the competitive interactions observed in natural neuronal systems (e.g. contrast gain control mechanisms in cortical area V1, and elsewhere)
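
One plausible form of this stage is a local divisive normalization (a rough stand-in for contrast gain control); the neighborhood size, threshold, and exact formula here are assumptions:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def divisive_normalize(x, size=3, threshold=1.0, eps=1e-6):
    # Divide each unit by the pooled activity of its local neighborhood,
    # but only where that activity exceeds a threshold (gain control).
    energy = np.sqrt(uniform_filter(x ** 2, size=size) * (size ** x.ndim))
    denom = np.where(energy > threshold, energy, 1.0)
    return x / (denom + eps)
```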

16 High-throughput-derived multilayer visual representations

17 Method - Evaluation

18 Method  Model overview
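
Schematically, each candidate model stacks a small number of identical-structure layers; a minimal sketch, with placeholder callables standing in for the filtering, activation, pooling, and normalization stages described above:

```python
def layer(x, filterbank, activate, pool, normalize):
    # One layer of a candidate model: filter -> activate -> pool -> normalize.
    return normalize(pool(activate(filterbank(x))))

def run_model(x, layer_stages):
    # HT-L2 candidates stack two such layers, HT-L3 candidates three;
    # each stage's parameters are drawn at random during the search.
    for stage in layer_stages:
        x = stage(x)
    return x
```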

19 Method – Screening  Screening (model selection)  Select the five best models on the aligned LFW View 1 set  Output dimensions range from 256 to 73984  Number of models screened:  HT-L2: 5915  HT-L3: 6917
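
A hypothetical sketch of the screening loop; the sampler, model constructor, and View-1 scoring function are placeholders for the actual pipeline:

```python
import heapq

def screen_models(n_candidates, sample_params, build_model, score_on_view1, top_k=5):
    # Draw random model parameters, score each candidate on the screening
    # set (aligned LFW View 1), and keep the best top_k models.
    scored = []
    for _ in range(n_candidates):
        params = sample_params()
        scored.append((score_on_view1(build_model(params)), params))
    return heapq.nlargest(top_k, scored, key=lambda item: item[0])
```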

20 Feature Augmentation  Multiple rescaled crops  Three different centered crops: 250x250, 150x150, 125x75  Each crop is resized to the standard input size  SVMs are trained on each crop separately
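
A sketch of the crop-and-resize augmentation; the crop sizes follow the slide, while the output size is a placeholder (100x100 for HT-L2 or 200x200 for HT-L3):

```python
import numpy as np
from PIL import Image

def centered_crops(img, crop_sizes=((250, 250), (150, 150), (125, 75)),
                   out_size=(200, 200)):
    # Take centered crops of several sizes and resize each back to the
    # model's standard input size; an SVM is then trained on each separately.
    width, height = img.size
    crops = []
    for cw, ch in crop_sizes:
        left, top = (width - cw) // 2, (height - ch) // 2
        patch = img.crop((left, top, left + cw, top + ch))
        crops.append(np.asarray(patch.resize(out_size, Image.BICUBIC)))
    return crops
```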

21 Kernel Combination  Three strategies  Blend kernels resulting from different crops: simple kernel addition, with each kernel trace-normalized  Blend the 5 screened models within the same model class  Hierarchical blends across model classes: assign exponentially larger weights to higher-level representations (V1-like < HT-L2 < HT-L3)
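
A sketch of the trace-normalized blending; the kernel matrix names and the exponential weights in the example are illustrative:

```python
import numpy as np

def blend_kernels(kernels, weights=None):
    # Trace-normalize each precomputed kernel (Gram) matrix and take a
    # weighted sum. With weights=None this is simple kernel addition.
    if weights is None:
        weights = [1.0] * len(kernels)
    combined = np.zeros_like(kernels[0], dtype=float)
    for K, w in zip(kernels, weights):
        combined += w * (K / np.trace(K))
    return combined

# Hierarchical blend across model classes, with exponentially larger weights
# for higher-level representations (the actual weight values are assumptions):
# K_blend = blend_kernels([K_v1like, K_htl2, K_htl3], weights=[1, 10, 100])
```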

22 Kernel Combination  Kernel Method Example:
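
A standard illustration of the kernel trick (not necessarily the one shown on the slide) is the degree-2 polynomial kernel on R^2: k(x, y) = (x^T y)^2 = (x_1 y_1 + x_2 y_2)^2 = ⟨φ(x), φ(y)⟩ with φ(x) = (x_1^2, √2 x_1 x_2, x_2^2), so the kernel evaluates an inner product in a higher-dimensional feature space without ever computing φ explicitly.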

23 Kernel Combination  The original formulation is equivalent to its kernelized form

24 Kernel Combination  Multiple Kernel Learning (MKL)  Learn the kernel weights directly from the data
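
In its generic form (not necessarily the exact variant used by the authors), MKL learns non-negative kernel weights jointly with the SVM: k(x, x') = Σ_m β_m k_m(x, x'), with β_m ≥ 0 and Σ_m β_m = 1, where the β_m are optimized together with the SVM objective rather than fixed by hand as in the blending strategies above.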

25 Kernel Combination  Multiple Kernel Learning (MKL)

26 Experiment  Screen models on LFW View 1  Train SVMs and evaluate using 10-fold cross-validation on LFW View 2
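
A sketch of a View-2 style evaluation with a precomputed (blended) kernel, using scikit-learn for illustration; the fold structure and variable names are assumptions:

```python
import numpy as np
from sklearn.svm import SVC

def evaluate_view2(K, labels, folds):
    # K: precomputed (blended) kernel over all pairs; labels: same/different
    # labels; folds: list of index arrays, one per predefined View-2 fold.
    accuracies = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(labels)), test_idx)
        clf = SVC(kernel='precomputed')
        clf.fit(K[np.ix_(train_idx, train_idx)], labels[train_idx])
        preds = clf.predict(K[np.ix_(test_idx, train_idx)])
        accuracies.append(np.mean(preds == labels[test_idx]))
    return np.mean(accuracies), np.std(accuracies)
```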

27 Results

28  Some error cases

29 Discussion  Models use whole-image pixel values and do not explicitly handle pose variation  Do they take advantage of background information?  Or are they disturbed by the background? Performance increases when different crops are added

30 16-GPU Monster-Class Supercomputer  Environment  GNU/Linux  Python, C, C++, Cython  CUDA, PyCuda

