1
“POOF: Part Based One-vs-One Features for Fine Grained Categorization, Face Verification, and Attribute Estimation”
Thomas Berg and Peter Belhumeur, CVPR 2013
VGG Reading Group, Eric Sommerlade
2
Summary: A POOF is a scalar feature, defined
for a discriminative region, between two classes, between two landmarks, and for a choice of base feature (e.g. HOG or colour histogram).
Perks:
- discriminative regions are learned automatically from the data set
- great performance
- transfers knowledge in from external datasets
3
Motivation: Standard approach to part-based recognition:
- extract a standard feature (SIFT, HOG, LBP)
- train a classifier
- relevant regions tuned by hand
Idea: “standard” features are hardly optimal for a specific problem; the “best” feature depends on
- domain (dog features != bird features)
- task (face recognition != gender classification)
4
POOF feature learning:
Start from a dataset with landmark annotations
5
POOF feature learning:
- Choose a feature part f and an alignment part a
- Align (rotate/scale so both parts lie at fixed positions) and crop a 128x64 region
- Larger/smaller distance between the parts -> coarser/finer effective scale
(see the alignment sketch below)
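A minimal sketch of this align-and-crop step in Python with OpenCV, assuming the two landmarks are mapped to fixed positions on the horizontal midline of the 128x64 patch; the exact target coordinates are an assumption, not taken from the paper:

```python
import numpy as np
import cv2

def align_and_crop(image, p_f, p_a, out_w=128, out_h=64):
    """Rotate/scale/translate so feature part p_f and alignment part p_a land at
    fixed positions on the horizontal midline, then crop an out_w x out_h patch.
    The target positions (one third / two thirds of the width) are an assumption."""
    targets = [(out_w / 3.0, out_h / 2.0),        # where p_f should land
               (2.0 * out_w / 3.0, out_h / 2.0)]  # where p_a should land
    # Solve q = [[a, -b], [b, a]] p + t (similarity transform, 4 unknowns)
    # from the two point correspondences.
    A, rhs = [], []
    for (px, py), (qx, qy) in zip([p_f, p_a], targets):
        A.append([px, -py, 1, 0]); rhs.append(qx)
        A.append([py,  px, 0, 1]); rhs.append(qy)
    a, b, tx, ty = np.linalg.solve(np.array(A, float), np.array(rhs, float))
    M = np.array([[a, -b, tx], [b, a, ty]])
    return cv2.warpAffine(image, M, (out_w, out_h))
```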
6
POOF feature learning:
Two tilings of the 128x64 patch: 8x8 pixel cells (16*8 = 128 cells) and 16x16 pixel cells (8*4 = 32 cells), 160 cells in total
7
POOF feature learning:
Per cell:
- 8-bin gradient direction histogram, Dg = 8 (‘gradhist’), or Felzenszwalb HOG, Dg = 31
- colour histogram, Dc = 32
Concatenated feature length: (Dg + Dc) * 160
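A rough sketch of the per-cell base features and the 160-cell concatenation; the gradient weighting and the colour binning scheme here are assumptions, and the paper's exact choices may differ:

```python
import numpy as np

def gradhist_cell(cell_gray, n_bins=8):
    """8-bin histogram of (unsigned) gradient orientations, magnitude-weighted."""
    gy, gx = np.gradient(cell_gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi                      # orientation in [0, pi)
    bins = (ang / np.pi * n_bins).astype(int).clip(0, n_bins - 1)
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)

def color_hist_cell(cell_rgb, n_bins=32):
    """Coarse 32-bin colour histogram (4x4x2 R/G/B quantisation; an assumption)."""
    r = cell_rgb[..., 0] // 64      # 4 levels
    g = cell_rgb[..., 1] // 64      # 4 levels
    b = cell_rgb[..., 2] // 128     # 2 levels
    idx = (r * 8 + g * 2 + b).ravel()
    return np.bincount(idx, minlength=n_bins).astype(float)

def base_features(patch_rgb, patch_gray):
    """Concatenate per-cell features over both tilings of the 128x64 patch:
    8x8 cells (16x8 = 128) plus 16x16 cells (8x4 = 32) -> 160 cells."""
    feats = []
    for cs in (8, 16):
        for y in range(0, 64, cs):
            for x in range(0, 128, cs):
                g = gradhist_cell(patch_gray[y:y + cs, x:x + cs])
                c = color_hist_cell(patch_rgb[y:y + cs, x:x + cs])
                feats.append(np.concatenate([g, c]))
    return np.concatenate(feats)    # length (8 + 32) * 160 with gradhist
```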
8
POOF feature learning:
For each scale (8x8, 16x16):
- learn a linear SVM on the two classes, get weight vector w
- per cell, keep the maximum |w_i| over that cell's dimensions
- keep cells whose max is >= the median over all cells, max(w_c) >= median(max(w_c))
- keep the connected component (4-connectivity?) of kept cells starting at f
(see the cell-selection sketch below; diagram on the slide: per-cell weights c1..cn -> per-cell max -> thresholded mask)
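A sketch of this cell-selection step, assuming the feature vector is ordered cell-major (all dimensions of one cell contiguous) and 4-connectivity; both are assumptions on top of what the slide says:

```python
import numpy as np
from scipy import ndimage

def select_cells(w, grid_shape, dims_per_cell, f_cell):
    """Pick the discriminative support region from a linear SVM weight vector w.
    grid_shape: (rows, cols) of the cell grid for one scale;
    f_cell: (row, col) of the cell containing the feature part f."""
    # 1. max |w_i| within each cell
    per_cell = np.abs(w).reshape(-1, dims_per_cell).max(axis=1).reshape(grid_shape)
    # 2. threshold at the median over cells
    mask = per_cell >= np.median(per_cell)
    # 3. keep only the 4-connected component containing the cell of part f
    #    (assumes that cell survives the threshold)
    labels, _ = ndimage.label(mask)          # default structure = 4-connectivity
    return labels == labels[f_cell]
```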
9
POOF feature learning:
Retrain the SVM on the selected cells only.
The resulting POOF is stored as a bitmap (cell mask) + SVM weight vector.
10
POOF feature extraction:
- Find the two corresponding landmarks (the authors use Belhumeur et al., CVPR 2011)
- Align & crop to the 128x64 region
- Compute the base features
- The POOF value is the SVM score of the features in the masked region
(see the extraction sketch below)
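A sketch of evaluating one learned POOF on a new image, reusing align_and_crop and base_features from the sketches above; the dictionary layout bundling the part ids, support mask, and SVM weights is an assumption for illustration:

```python
import numpy as np

def poof_value(img_rgb, img_gray, landmarks, poof):
    """Return the scalar POOF value (signed SVM score) for one image.
    poof: {"part_f": id, "part_a": id, "support_mask": bool array over feature
    dims, "w": weights over kept dims, "b": bias} (hypothetical structure)."""
    p_f = landmarks[poof["part_f"]]
    p_a = landmarks[poof["part_a"]]
    patch_rgb = align_and_crop(img_rgb, p_f, p_a)
    patch_gray = align_and_crop(img_gray, p_f, p_a)
    x = base_features(patch_rgb, patch_gray)
    x = x[poof["support_mask"]]               # keep only dims in the kept cells
    return float(poof["w"] @ x + poof["b"])   # SVM score = the POOF value
```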
11
Results: categorization
- UCSD birds dataset, 200 classes; 13 landmarks used
- About 5 million possible POOF combinations; a randomly chosen subset of 5000 POOFs is used
- The 5000 POOF scores form the feature vector for a one-vs-all linear SVM
- Evaluation on the ground-truth bounding box of the object, with ground-truth or detected landmarks
(see the pipeline sketch below)
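A minimal sketch of the categorization pipeline, reusing poof_value from above: describe each image by the scores of the same fixed set of 5000 POOFs, then train one-vs-all linear SVMs over the classes (variable names are hypothetical):

```python
import numpy as np
from sklearn.svm import LinearSVC

def poof_descriptor(img_rgb, img_gray, landmarks, poofs):
    """Stack the scores of a fixed list of POOFs into one feature vector."""
    return np.array([poof_value(img_rgb, img_gray, landmarks, p) for p in poofs])

# X_train: (n_images, 5000) matrix of POOF scores, y_train: class labels 0..199
# clf = LinearSVC()          # one-vs-rest by default
# clf.fit(X_train, y_train)
# predictions = clf.predict(X_test)
```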
12
Results: categorization
13
Results: categorization
Accuracy (%); rows = number of classes / landmark source (det = detected, gt = ground truth); columns on the slide: POOF (gradhist), POOF (HOG), low-level baseline, and prior work [27], [4] (MKL), [33] (RF), [32], [8], [35]:
200 det: gradhist 54, HOG 56, baseline/prior 28
14 det: gradhist 65, HOG 70, baseline/prior 57
200 gt: gradhist 69, HOG 73, baseline/prior 40, 17, 19
14 gt: gradhist 80, HOG 85, baseline/prior 44
5 det: 55
14
Results: Face Verification
Are two images of the same person?
- LFW dataset; 16 landmarks; 120 subjects; ~3.5 million possible POOF choices
- Each image I yields a vector f(I) of randomly chosen POOF scores
- For an image pair (I, J), concatenate [|f(I) - f(J)|, f(I) .* f(J)]
- Train a same-vs-different classifier on these pair features
(see the pair-feature sketch below)
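A short sketch of the pair representation from the slide and a same-vs-different classifier on top of it; the linear SVM matches the next slide's remark, but the surrounding data handling is illustrative:

```python
import numpy as np
from sklearn.svm import LinearSVC

def pair_feature(f_I, f_J):
    """Concatenate the absolute difference and elementwise product of the
    two images' POOF score vectors, as on the slide."""
    return np.concatenate([np.abs(f_I - f_J), f_I * f_J])

# F: (n_images, n_poofs) POOF scores; pairs: list of (i, j, same_label)
# X = np.stack([pair_feature(F[i], F[j]) for i, j, _ in pairs])
# y = np.array([same for _, _, same in pairs])
# verifier = LinearSVC().fit(X, y)   # same-vs-different classifier
```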
15
Results: Face Verification
16
Results: Face Verification
Performance equal to Tom-vs-Pete (BMVC 2012), but:
- support regions are learned automatically
- linear SVM instead of an RBF kernel -> faster
Uses the same “identity-preserving alignment” on the landmark detections [2]
(diagram on the slide: input -> affine -> canonical, using the mean of the closest examples in the dataset)
17
Results: Attribute classification
- Attributes such as gender, “big nose”, “eyeglasses” (Kumar [14])
- POOFs learned as before, on the LFW dataset
- Extract POOF scores on the attribute dataset
- Train a linear SVM for each attribute
- POOFs transfer discriminability learned for different classes -> no need for a fully labelled attribute dataset
(see the per-attribute sketch below)
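A small sketch of the per-attribute training step on top of POOF-score descriptors; the input names are hypothetical:

```python
from sklearn.svm import LinearSVC

def train_attribute_svms(X_attr, attribute_labels):
    """One linear SVM per attribute.
    X_attr: (n_images, n_poofs) POOF scores on the attribute dataset;
    attribute_labels: dict of attribute name -> array of +/-1 labels."""
    return {name: LinearSVC().fit(X_attr, y)
            for name, y in attribute_labels.items()}
```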
18
Results: Attribute classification
Even with a restricted number of attribute training samples, the POOF features don’t latch on to noise …