Published by Laureen Potter. Modified over 9 years ago.
1
SVMs for (x) Recognition
(From Moghaddam / Yang’s “Gender Classification with SVMs”)
Brian Whitman
2
“Commodity Intelligence”
‘Wow factor’ important
Collaborative filtering
‘Simple’ tasks sometimes the most useful
An SVM embedded evaluator…
Cameras with ‘common sense’
3
Why SVM for feature detection?
Quick evaluation model
Machines (SVs) are easily stored and small
4
Experiment: Gender ID
Using MITFaces dataset
~7500 faces with varying genders, races, ages, expressions, ‘extras’
All aligned 160x160 with left eye at 80,80
Face content is usually only 80x40
5
MITFaces examples
6
Representation?
Simple pixel values
Why?
7
Sample size
Maintain ‘ground rule’ of ML: Dimensions < Examples*2
At 3200 dims (80x40), this is hard
Training parameters (maximum Lagrangians, kernel width) help
We use 80x40 and 40x20 in our examples
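The rule of thumb on this slide can be checked with a few lines of arithmetic. This is only a sketch that reads "Dimensions < Examples*2" literally; the function name and the example counts passed to it are illustrative assumptions, not part of the original talk.

```python
def check_ground_rule(width, height, n_examples):
    """Return (dims, ok) where ok is True when dims < n_examples * 2.

    Assumption: the slide's rule is taken literally, with one
    dimension per pixel of the face window.
    """
    dims = width * height
    return dims, dims < n_examples * 2


# The two window sizes used in the talk, against a hypothetical
# training set of 3200 faces.
for w, h in [(80, 40), (40, 20)]:
    dims, ok = check_ground_rule(w, h, 3200)
    print(f"{w}x{h}: {dims} dims, rule satisfied: {ok}")
```

With fewer examples the larger window stops satisfying the rule long before the smaller one does, which is one way to read the choice of also trying 40x20.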
8
Training stage
Choose 3200 random adult faces for training and 3200 random faces for testing
Extract the 80x40 ‘face window’ from each face and treat the 3200 doubles (0..1) as a training example
Train the SVM on pixel values of the train set (dual P4 Xeon 2 GHz, Linux: 30 minutes)
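A minimal sketch of the data preparation for this stage, assuming each face is a 2-D list of 0..255 grey values. The window offsets (`top`, `left`), the function names, and the 0..255 input range are hypothetical; the slides give only the window size and the 0..1 output range. The SVM training call itself is not shown.

```python
import random


def extract_face_window(image, top, left, height, width):
    """Crop a window and flatten it to a list of floats in 0..1.

    `image` is a list of rows of 0..255 grey values (an assumption).
    """
    rows = image[top:top + height]
    if len(rows) != height or any(len(r) < left + width for r in rows):
        raise ValueError("window falls outside the image")
    return [px / 255.0 for row in rows for px in row[left:left + width]]


def split_train_test(n_faces, n_train, n_test, seed=0):
    """Pick disjoint random index sets for training and testing."""
    rng = random.Random(seed)
    picked = rng.sample(range(n_faces), n_train + n_test)
    return picked[:n_train], picked[n_train:]


# Hypothetical usage: a blank 160x160 face, cropped to 40 rows by 80 columns.
face = [[128] * 160 for _ in range(160)]
vector = extract_face_window(face, top=60, left=40, height=40, width=80)
train_idx, test_idx = split_train_test(7500, 3200, 3200)
```

Each cropped window becomes one 3200-element training example, matching the "3200 doubles" on the slide.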
9
Testing Stage
Take the other 3200 face vectors and present them to the learned SVM
If the output is > 0, male; if < 0, female
Confidence: some linear combination of # of support vectors and magnitude of result
Had no problem doing this at 10 Hz on a PIII 800 with tons of other things running
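Evaluating a trained SVM on a new face vector is just a weighted sum of kernel values against the stored support vectors, which is why it is cheap enough to run at frame rate. The sketch below assumes a Gaussian (RBF) kernel; the slides mention a kernel width but do not name the kernel, so that choice, the parameter names, and the "undecided" case for an output of exactly 0 are all assumptions.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Signed SVM output: sum_i coef_i * K(sv_i, x) + bias.

    Each dual coefficient is the Lagrange multiplier times the
    label (+1 or -1) of its support vector.
    """
    total = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, dual_coefs))
    return total + bias


def classify_gender(score):
    """Apply the slide's rule: > 0 is male, < 0 is female."""
    if score > 0:
        return "male"
    if score < 0:
        return "female"
    return "undecided"  # the slide does not say what happens at exactly 0
```

Only the support vectors, their coefficients, the bias, and the kernel parameter need to be stored to ship the classifier.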
10
In-class face gender results
80x40; C=100, aux=100: 93% of faces classified correctly (95% male, 90% female)
40x20; C=100, aux=10: 97% of faces classified correctly (98% male, 95% female)
11
Next step: Realtime
Media Lab is where webcams go to die
Webcam at 160x120, ‘face region’ to 80x40, downsampled to 40x20
Webcam gets frames at 10 Hz; we greyscale each frame and present it to the previously trained SVM
Results… mixed
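The per-frame preprocessing can be sketched as follows. The slides do not say how the greyscale conversion or the downsampling was done, so the Rec. 601 luma weights and the 2x2 block averaging used here are assumptions, as are the function names and the representation of a frame as rows of (r, g, b) tuples.

```python
def to_grey(frame):
    """Convert rows of (r, g, b) tuples in 0..255 to grey floats in 0..1.

    Uses the Rec. 601 luma weights (an assumption, not from the slides).
    """
    return [[(0.299 * r + 0.587 * g + 0.114 * b) / 255.0
             for (r, g, b) in row]
            for row in frame]


def downsample_2x(img):
    """Halve both dimensions by averaging each 2x2 block of pixels."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] +
              img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]


# Hypothetical face region of 40 rows by 80 columns, already cropped
# from the 160x120 webcam frame.
region = [[(200, 180, 160)] * 80 for _ in range(40)]
small = downsample_2x(to_grey(region))  # 20 rows by 40 columns
vector = [px for row in small for px in row]  # 800 values for the SVM
```

The flattened 800-value vector has the same layout as the 40x20 training examples, so it can be passed straight to the trained classifier.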
12
Realtime examples (If the demo crashes)
13
‘Creepybot’
With better control over alignment
Monitors Windows clipboard
Same architecture as the Creepycam
14
Creepybot Examples (If the demo crashes)
15
Other parameters
MITFaces has a great data label set
Train an SVM for appearance of each descriptor: Race, Age, Gender, Expression, Moustache
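Training one SVM per descriptor amounts to relabelling the same face vectors once per descriptor, with +1 where the descriptor is present and -1 where it is not. The sketch below shows only that relabelling step; the tag names and the set-of-tags representation are hypothetical and do not reflect the actual MITFaces label format.

```python
def binary_labels(face_tags, descriptor):
    """Return +1 for faces tagged with `descriptor`, -1 otherwise."""
    return [1 if descriptor in tags else -1 for tags in face_tags]


# Hypothetical tags for three faces.
face_tags = [
    {"adult", "smiling"},
    {"serious"},
    {"adult", "moustache"},
]

# One label vector per descriptor; each would train its own SVM.
labels_by_descriptor = {
    name: binary_labels(face_tags, name)
    for name in ("adult", "smiling", "serious", "moustache")
}
```

Each label vector pairs with the same pixel vectors, so the training pipeline is reused unchanged for every descriptor.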
16
Per-class results (40x20, etc…) “Adult or not”
Overall: 94%
Not adult: 403/516 (78%)
Adult: 2605/2684 (97%)
17
Per-class results… “Smiling or not”
Overall: 88%
Not smiling: 1354/1520 (89%)
Smiling: 1450/1672 (87%)
18
Per-class results “Serious or not”
Overall: 88%
Not serious: 1517/1712 (89%)
Serious: 1311/1484 (88%)
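The percentages on the three per-class result slides follow directly from the reported counts. A short sketch, assuming "Overall" is the pooled accuracy over both classes (the function name and dictionary layout are illustrative):

```python
def accuracy_summary(neg_correct, neg_total, pos_correct, pos_total):
    """Per-class and pooled accuracy from correct/total counts."""
    return {
        "negative": neg_correct / neg_total,
        "positive": pos_correct / pos_total,
        "overall": (neg_correct + pos_correct) / (neg_total + pos_total),
    }


# Counts as reported on the slides: (not X correct/total, X correct/total).
results = {
    "adult": accuracy_summary(403, 516, 2605, 2684),
    "smiling": accuracy_summary(1354, 1520, 1450, 1672),
    "serious": accuracy_summary(1517, 1712, 1311, 1484),
}

for task, acc in results.items():
    print(task, {k: f"{v * 100:.0f}%" for k, v in acc.items()})
```

Under that pooled reading, the rounded values match the slide figures: 94% / 78% / 97% for adult, 88% / 89% / 87% for smiling, and 88% / 89% / 88% for serious.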
19
Could we do better?
Representation is lacking
But results are surprisingly good
For realtime, need auto-alignment / rescaling, or a better representation
Could this lead to an invasion of cheap intelligent cameras, each with tacky switches for feature detection and marketing?