High-Level Vision Face Detection.


Face detection: Viola & Jones
Multiple view-based classifiers based on simple features that best discriminate faces vs. non-faces
Most discriminating features learned from thousands of samples of face and non-face image windows
Attentional mechanism: cascade of increasingly discriminating classifiers improves performance

- The problem is face detection, and the approach is the one developed by Viola & Jones: construct a classifier that takes a small window of the image and determines whether or not it is a face. Imagine a strategy where we scan the image, look at each small patch, and ask "is this a face?" In the sample result, the detector found all the faces, plus a soccer ball that looks a little like a face.
- To design the classifier, ask which features of the image within the window are most effective at distinguishing faces from non-faces. The features best for finding faces in roughly frontal view may differ from those best for detecting a profile face, so Viola & Jones designed multiple classifiers specialized for frontal or profile views (keep this point in the back of your mind).
- They started out with thousands of simple features that could be used for classification, and learned which ones are most effective by training the classifier on many examples of image patches labeled as faces or non-faces.
- Another key aspect of the approach is an attentional mechanism that greatly reduces the amount of computation needed. Suppose a couple of hundred features are sufficient to construct a classifier with good performance (for each window: 200 features, then a decision). There is no point wasting time computing all 200 features for every sub-window of the image, e.g. for regions of roughly uniform brightness. It would be better to use a few features as a quick filter that removes many regions that are clearly not faces, and then focus computational resources on the more promising regions, using more discriminating features to determine whether each one really is a face.
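The "scan the image, look at each small patch" strategy in the notes above can be sketched as follows. This is only an illustration: the function names and the one-pixel step are assumptions, and the 24 x 24 window size is borrowed from the later slides.

```python
from typing import Iterator, Tuple


def sliding_windows(height: int, width: int,
                    size: int = 24, step: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield the (row, col) top-left corner of every size x size sub-window."""
    for row in range(0, height - size + 1, step):
        for col in range(0, width - size + 1, step):
            yield row, col


def count_windows(height: int, width: int, size: int = 24, step: int = 1) -> int:
    """Count how many sub-windows a classifier would have to examine."""
    return sum(1 for _ in sliding_windows(height, width, size, step))


if __name__ == "__main__":
    # Every one of these windows would need a face / non-face decision.
    print(count_windows(400, 400))
```

Even a modest 400 x 400 image yields well over a hundred thousand windows at a single scale, which is why the notes stress not computing every feature for every window.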

Viola & Jones use simple features
Use simple rectangle features: Σ I(x,y) in gray area − Σ I(x,y) in white area, within 24 x 24 image sub-windows
• initially consider 160,000 potential features per sub-window!
• features computed very efficiently
Which features best distinguish face vs. non-face?
Learn most discriminating features from thousands of samples of face and non-face image windows

- What features are used? Simple rectangular features. The squares on the slide represent a sub-window of the image, 24 x 24 pixels. Each feature consists of 2-4 rectangular sub-regions at particular locations within the window. To calculate the value of a feature, add all intensities within the gray areas and subtract the sum of all intensities within the white areas. Then compare this difference in brightness to a threshold (e.g. is the difference larger than a certain amount?). The outcome provides evidence about whether or not the window is a face.
- Think of the first feature as measuring the change in brightness between left and right sub-regions. Where is its value large? Where there is a large difference between adjacent vertical strips of the image (left/right face boundaries, the side hairline).
- Viola & Jones defined a number of different geometric configurations of positive/negative regions, at different sizes and orientations, which provided an initial set of 160,000 potential features of this sort. That is a lot, but each is very simple and can be computed very efficiently. (There is a lot of redundancy in the regions whose intensities are added, so they computed an intermediate representation of sums over particular image regions that avoids the redundant computations.)
- Which features best distinguish whether a window is a face or not? This is learned. The two most informative features are shown here, overlaid on a typical face. Why are they good at distinguishing faces? A face typically has an eye region darker than the area below it, and the bridge of the nose between the eyes is typically brighter than the eyes. These templates capture those relationships.
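The "intermediate representation of sums" mentioned in the notes is known in the Viola & Jones paper as the integral image. A minimal sketch of it, and of one two-rectangle feature, is given below. The function names are hypothetical, and treating the left half as the gray region and the right half as the white region is an assumption, since the slide figure is not available here.

```python
from typing import List

Image = List[List[int]]


def integral_image(img: Image) -> Image:
    """ii[r][c] holds the sum of img over rows < r and columns < c."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Image, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using only four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii: Image, top: int, left: int, height: int, width: int) -> int:
    """Gray (left half, assumed) minus white (right half, assumed)."""
    half = width // 2
    gray = rect_sum(ii, top, left, height, half)
    white = rect_sum(ii, top, left + half, height, half)
    return gray - white


if __name__ == "__main__":
    img = [[1, 2, 3, 4],
           [5, 6, 7, 8]]
    ii = integral_image(img)
    print(rect_sum(ii, 0, 0, 2, 4), two_rect_feature(ii, 0, 0, 2, 4))
```

Once the table is built in one pass over the image, any rectangle sum costs the same four lookups regardless of its size, which is what makes evaluating many features per window affordable.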

Learning the best features
Weak classifier using one feature: x = image window, f = feature, p = +1 or -1, θ = threshold
n training samples, equal weights, known classes: (x1,w1,1) … (xn,wn,0)
Normalize weights → find next best weak classifier → use classification errors to update weights
Final classifier
AdaBoost
~200 features yields good results for a "monolithic" classifier

- What is the learning process? It is iterative: progressively find the next best feature until the system does a good enough job of classifying image patches as faces vs. non-faces. First define a weak classifier as one that uses just a single feature. The classifier h takes the value 1 (face) or 0 (non-face) and is a function of four things: the window x, the feature f, the threshold θ used to decide face or not, and an extra p = +/-1. The p is needed because, depending on the specific feature, a face may be more likely when the feature value is greater than the threshold or when it is less than the threshold, so p lets the decision be based on being either larger or smaller than the threshold.
- To learn good features we need training data: n training samples (n is large), each a sub-window with a known class (1 = face, 0 = non-face). Each sample has an associated weight. The weights start out equal and change later, because certain examples may be more challenging to classify (the soccer ball) and we want to give them more weight in the training process.
- In each iteration, the weights are normalized to sum to 1. Then find the next weak classifier (a combination of f, θ, p) that does the best job of correctly identifying the class of each image window in the training set. The error is the difference between what the feature indicates (0/1) and the correct class (0/1), summed over all windows, with each term multiplied by the current weight of that sample. This expression measures incorrect classifications, and we want to minimize the discrepancy between the true class and what the classifier gives us based on this feature.
- The chosen classifier is not perfect and gets some samples wrong, so increase the weight of those samples for the next time around, when searching for the next best feature. This strategy is AdaBoost.
- The result is many weak classifiers based on single features, each doing a better or worse job of classifying. Define a final classifier that integrates the evidence from all the features, weighting the contribution of each feature according to how good it was at distinguishing face from non-face during training. Sum all the evidence and check whether it is larger than a threshold: if yes, the window is a face.
- About 200 features give good results for this formulation of the classifier.
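The training loop described above can be sketched in a few lines. This is a toy illustration under several assumptions, not the authors' implementation: feature values are precomputed per sample, thresholds are searched exhaustively, and the inequality direction p·f(x) < p·θ, the update factor β = ε/(1 − ε), and the vote weight α = log(1/β) follow the AdaBoost variant in the Viola & Jones paper as I understand it. That variant shrinks the weights of correctly classified samples, which after normalization raises the relative weight of the misclassified ones. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One weighted weak classifier: (alpha, feature index, theta, p)
Weak = Tuple[float, int, float, int]


def weak_classify(value: float, theta: float, p: int) -> int:
    """Return 1 (face) if p * value < p * theta, else 0 (non-face)."""
    return 1 if p * value < p * theta else 0


def best_weak_classifier(features: Sequence[Sequence[float]],
                         labels: Sequence[int],
                         weights: Sequence[float]) -> Tuple[float, int, float, int]:
    """Pick the (feature, theta, p) with the lowest weighted error."""
    best = None
    for j, values in enumerate(features):
        candidates = [min(values) - 1] + sorted(set(values)) + [max(values) + 1]
        for theta in candidates:
            for p in (1, -1):
                err = sum(w * abs(weak_classify(v, theta, p) - y)
                          for v, y, w in zip(values, labels, weights))
                if best is None or err < best[0]:
                    best = (err, j, theta, p)
    return best


def adaboost(features: Sequence[Sequence[float]],
             labels: Sequence[int], rounds: int) -> List[Weak]:
    """features[j][i] is the value of feature j on training sample i."""
    n = len(labels)
    weights = [1.0 / n] * n                       # start with equal weights
    strong: List[Weak] = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]    # normalize to sum to 1
        err, j, theta, p = best_weak_classifier(features, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)     # avoid division by zero
        beta = err / (1.0 - err)
        # Shrink the weights of correctly classified samples only.
        weights = [w * (beta if weak_classify(v, theta, p) == y else 1.0)
                   for w, v, y in zip(weights, features[j], labels)]
        strong.append((math.log(1.0 / beta), j, theta, p))
    return strong


def strong_classify(strong: Sequence[Weak], sample: Sequence[float]) -> int:
    """Weighted vote: face if the evidence reaches half the total weight."""
    score = sum(a * weak_classify(sample[j], t, p) for a, j, t, p in strong)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0
```

A usage example with one made-up feature: `adaboost([[1.0, 2.0, 3.0, 4.0]], [1, 1, 0, 0], rounds=1)` selects a threshold that separates the two low-valued "faces" from the two high-valued "non-faces".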

"Attentional cascade" of increasingly discriminating classifiers
Early classifiers use a few highly discriminating features, low threshold
1st classifier uses two features, removes 50% non-face windows
Later classifiers distinguish harder examples
• Increases efficiency
• Allows use of many more features
• Cascade of 38 classifiers, using ~6000 features

- As said earlier, we don't want to compute the values of all 200 features everywhere in the image, because that is not efficient. Viola & Jones therefore used the idea of an attentional cascade: start with all sub-windows, use a small number of features to reject the many sub-windows that are clearly non-faces, and preserve only the more promising ones for further analysis with additional features.
- The first two features remove half of the non-face windows while preserving ~100% of the real faces. The second layer uses about 10 more features and rejects about 80% of the remaining non-faces.
- Because features are computed for far fewer windows, the cascade allows more features to be used in the long run. In practice, a cascade of 38 classifiers with about 6000 features in total yields a high level of performance.
- [User intervention: for each layer, the user sets a minimum detection rate (e.g. detect > 99% of faces) and a maximum tolerated false positive rate (e.g. up to 10% false positives). The learning process determines the number of features needed to meet these targets, and layers keep being added until overall performance is good enough.]
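The early-rejection logic of the cascade, and the way per-layer targets compound across layers, can be sketched as below. Each stage here is a hypothetical callable applied to a single made-up score standing in for a window; a real stage would be a boosted classifier over rectangle features. The compounding formula assumes every layer just meets the same per-layer rate.

```python
from typing import Callable, Sequence, Tuple

# A stage returns True to pass the window on, False to reject it.
Stage = Callable[[float], bool]


def cascade_classify(stages: Sequence[Stage], window: float) -> Tuple[bool, int]:
    """Return (is_face, number of stages actually evaluated)."""
    for used, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, used        # rejected early: later stages never run
    return True, len(stages)


def overall_rate(per_layer_rate: float, layers: int) -> float:
    """Rate after all layers, if each layer passes this fraction of its input."""
    return per_layer_rate ** layers


if __name__ == "__main__":
    toy_stages = [lambda s: s > 0.2, lambda s: s > 0.5, lambda s: s > 0.8]
    print(cascade_classify(toy_stages, 0.1))   # rejected by the first stage
    print(overall_rate(0.99, 38))              # detection rate over 38 layers
    print(overall_rate(0.10, 38))              # false positive rate over 38 layers
```

Under these assumptions, a 99% per-layer detection rate over 38 layers compounds to roughly 68% overall, which suggests why per-layer detection targets have to sit very close to 100%, while even a loose 10% per-layer false positive rate compounds to a vanishingly small overall rate.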

Training with normalized faces
Many more non-face patches
Faces are normalized for scale & rotation
Small variation in pose

- The slide shows a small sample of the training set, which has 5000 faces and many more non-face patches (9500 images with no faces, with many random image patches taken from each).
- The faces were normalized by hand to a common rotation and scale. There is small variation in pose across the samples, and some variation in facial expression and background.
- [ROC curve: correct detection rate (e.g. 0.5 to 1.0) vs. false positive rate (0.0 to 0.01). Varying the parameters of the algorithm gives different performance, and we want to be close to the upper left corner.]

Viola & Jones results
With additional diagonal features, classifiers were created to handle image rotations and profile views

- The original classifiers tolerate some rotation: about 45 deg in depth and 15 deg in the image plane. More diagonal features were added to create classifiers that tolerate more rotation in the image, along with specialized classifiers to recognize profile views (specialized classifiers work better than one general-purpose classifier).
- A detail swept under the rug: faces appear at different sizes in the image, but the classifier is always based on a 24 x 24 pixel window. So sample the image at different scales (e.g. a 400 x 400 image with 24 x 24 windows is resampled to a 300 x 300 image with 24 x 24 windows, which finds a larger face in the original), look for faces at multiple scales (12), and go with whichever assessment has the strongest evidence.
- Among the greatest challenges are occlusion, extreme illumination, and accessories, e.g. sunglasses.
- Summary: simple image features, learning which ones are most effective by training a classifier on a large dataset, and strategies for computational efficiency.
- Many approaches try to find the best features for classifying image regions into different object classes. They differ in the way they define features and in the way they go about the training process.
- One thing that makes Labeled Faces in the Wild not so wild: its faces were detected by the Viola-Jones face detector.
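The multi-scale step in the notes above can be illustrated by listing the image sizes that would be scanned with the fixed 24 x 24 window. The 0.75 scale factor is an assumption taken from the 400 → 300 example, the cap of 12 levels comes from the "(12)" in the notes, and the function names are hypothetical.

```python
from typing import List, Tuple


def pyramid_sizes(height: int, width: int, factor: float = 0.75,
                  max_levels: int = 12, window: int = 24) -> List[Tuple[int, int]]:
    """Image sizes to scan, stopping once the window no longer fits."""
    sizes = []
    for level in range(max_levels):
        h = int(round(height * factor ** level))
        w = int(round(width * factor ** level))
        if h < window or w < window:
            break
        sizes.append((h, w))
    return sizes


def face_size_in_original(level: int, factor: float = 0.75, window: int = 24) -> float:
    """Side length, in original pixels, of a face that fills the window at this level."""
    return window / factor ** level


if __name__ == "__main__":
    for level, size in enumerate(pyramid_sizes(400, 400)):
        print(level, size, round(face_size_in_original(level), 1))
```

With these assumed settings, a 24 x 24 detection in the 300 x 300 image corresponds to a 32 x 32 face in the original 400 x 400 image, which is the sense in which the smaller image "finds a larger face in the original".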

Faces everywhere...