CS262: Computer Vision Lect 06: Face Detection John Magee Slides Courtesy of Diane H. Theriault
Framing: In computer vision, we want to analyze and interpret images and video. First: How are images represented? How are images formed? What kinds of tasks are we interested in doing?
Framing: Next: How do you find things you are looking for? Version 1: thresholding / color analysis; post-processing with morphological operators (needed to clean up messy output); binary image analysis (try to reason about the objects that you find).
Framing: Next: How do you find things you are looking for? Version 2: compute features in the images. Examples: gradients; general filter responses (today); corners and keypoints (next week). How to combine and use image features will be our focus when considering object recognition and many other tasks.
Framing: There are different points of view about tasks in computer vision. Examples: algorithmic (e.g., represent the image as a graph); statistical (images are noisy measurements); signal processing (think of the image as a 2D signal); machine learning (train models using data).
Question of the Day: How can we find faces in images?
Face Detection: Compute features in the image (today), then apply a classifier. Reference: Viola & Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”.
Features with Image Filtering: Perform image filtering by convolving an image with a “filter” / “mask” / “kernel” to obtain a “result” / “response”. The value of the result will be large and positive in regions of the image that “look like” the filter. One type of image feature is the way the image responds to different types of filters. [Figure: example filter]
Image Filtering: What is the response of the image to the filter (the result) in the region denoted by the red box? To perform convolution: multiply each element of the filter by the corresponding element of the image, then sum the products. (Strictly speaking, multiply-and-sum without flipping the filter is cross-correlation; true convolution flips the filter first.) [Figure: Image: -1 1. Filter: -1 1.]
Image Filtering: What is the response of the image to the filter (the result) in the region denoted by the red box? To perform convolution: multiply each element of the filter by the corresponding element of the image, then sum the products. [Figure, shown on two slides: Image: 1 -1. Filter: -1 1.]
Image Filtering: What is the response of the image to the filters (the results) in the region denoted by the red box? To perform convolution: multiply each element of the filter by the corresponding element of the image, then sum the products. [Figure: Image: 1 -1. Filters: -1 1 and 1 -1.]
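The multiply-and-sum step can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `filter_response` and the sample pixel values (dark = 0, bright = 1) are assumptions chosen for the example.

```python
def filter_response(region, filt):
    """Response of one image region to a filter of the same shape.

    Multiplies each filter element by the corresponding image element
    and sums the products (no flipping of the filter).
    """
    same_rows = len(region) == len(filt)
    if not same_rows or any(len(r) != len(f) for r, f in zip(region, filt)):
        raise ValueError("region and filter must have the same shape")
    return sum(
        pixel * weight
        for region_row, filter_row in zip(region, filt)
        for pixel, weight in zip(region_row, filter_row)
    )


# Hypothetical 1x2 regions and the [-1, 1] filter.
edge_filter = [[-1, 1]]
dark_to_bright = [[0, 1]]  # "looks like" the filter
bright_to_dark = [[1, 0]]  # looks like the flipped filter
flat = [[1, 1]]            # no change in brightness

print(filter_response(dark_to_bright, edge_filter))  # 1
print(filter_response(bright_to_dark, edge_filter))  # -1
print(filter_response(flat, edge_filter))            # 0
```

With these assumed values, the response is positive where the region matches the filter's dark-to-bright pattern, negative where the pattern is reversed, and zero where the region is uniform.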
What do Faces “Look Like”? Make a “face filter”?
What do Faces “Look Like”? Chosen features are responses of the image to the “Haar-like” box filters
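Image With Faces [Figure: example image containing faces]
Filter Responses [Figure, shown on two slides: filter responses for the image with faces]
Image with Non-faces [Figure: example image containing no faces]
Filter Responses [Figure, shown on two slides: filter responses for the image with non-faces]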
Convolution is Expensive! The cost of brute-force convolution is proportional to the number of pixels in the image times the number of pixels in the filter. If your image is N×M and your filter is 3×3, the cost is about 9*N*M multiply-adds (and a 3×3 filter would only match a teeny face!). If your image is N×M and your filter is 20×20, the cost is about 400*N*M.
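The cost estimate can be checked with a brute-force filter that counts its own multiply-adds. This is a rough sketch under stated assumptions: the function name, the 30×40 test image, and the all-ones filters are made up for illustration. It evaluates only positions where the filter fits entirely inside the image, so the count for an f×f filter is f*f*(N-f+1)*(M-f+1), slightly below the f*f*N*M estimate.

```python
def filter_image_brute_force(image, filt):
    """Slide the filter over every position where it fits entirely.

    Returns (responses, multiply_add_count).
    """
    img_h, img_w = len(image), len(image[0])
    f_h, f_w = len(filt), len(filt[0])
    responses = []
    ops = 0
    for top in range(img_h - f_h + 1):
        row = []
        for left in range(img_w - f_w + 1):
            total = 0
            for dy in range(f_h):
                for dx in range(f_w):
                    total += image[top + dy][left + dx] * filt[dy][dx]
                    ops += 1  # one multiply-add per filter pixel
            row.append(total)
        responses.append(row)
    return responses, ops


# Hypothetical 30x40 image of ones, filtered at two sizes.
image = [[1] * 40 for _ in range(30)]
_, ops_small = filter_image_brute_force(image, [[1] * 3 for _ in range(3)])
_, ops_large = filter_image_brute_force(image, [[1] * 20 for _ in range(20)])
print(ops_small)  # 9 * 28 * 38 = 9576
print(ops_large)  # 400 * 11 * 21 = 92400
```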
Computing the Responses Efficiently: Viola and Jones chose “box” filters. To compute the response, take the difference between the sums of the image values in the boxes (red minus blue). What if you could compute the sum of the image values in a box without visiting every pixel in the box? [Figure: box filter]
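As a baseline for comparison, the box-filter response can be computed the slow way, by visiting every pixel in each box. This is only a sketch: the function names, the layout of the two boxes (red on the left half of the window, blue on the right half), and the sample image are assumptions for illustration, not the filters used in the paper.

```python
def box_sum_brute_force(image, top, left, height, width):
    """Sum a box by visiting every pixel in it (cost grows with box area)."""
    return sum(
        image[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    )


def two_box_response(image, top, left, height, width):
    """Response to a hypothetical two-piece box filter.

    The "red" box is the left half of the window and the "blue" box is
    the right half; the response is sum(red) - sum(blue).
    The window width is assumed to be even.
    """
    half = width // 2
    red = box_sum_brute_force(image, top, left, height, half)
    blue = box_sum_brute_force(image, top, left + half, height, half)
    return red - blue


# Hypothetical 2x4 image: bright on the left, dark on the right.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
print(two_box_response(image, 0, 0, 2, 4))  # 36 - 4 = 32
```

Each call to `box_sum_brute_force` touches height × width pixels, which is the cost the integral image is meant to avoid.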
Computing the Responses Efficiently: The Integral Image is the computational trick that made this paper a star. In the integral image, every pixel contains the sum of all of the image pixels above it and to its left, including the pixel itself.
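One way to build an integral image in a single pass is sketched below. The function name and the 3×3 sample image are assumptions for illustration; the sketch uses the inclusive definition (each entry includes its own pixel) and keeps a running sum along each row.

```python
def integral_image(image):
    """Return ii where ii[y][x] is the sum of image[0..y][0..x], inclusive."""
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += image[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = above + row_sum
    return ii


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
for row in integral_image(image):
    print(row)
# [1, 3, 6]
# [5, 12, 21]
# [12, 27, 45]
```

Each pixel is visited once, so building the integral image costs on the order of N*M additions regardless of the filter sizes used afterwards.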
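Computing the Responses Efficiently: Once the integral image has been computed, the sum of the pixels in a box of any size can be computed from 4 numbers read from the integral image: red – blue – green + orange, that is, (lower right) – (upper right) – (lower left) + (upper left). The upper and left values are taken just outside the box, so that the box's own top row and left column are included in the sum.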
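The four-lookup rule can be sketched as follows. The function name `box_sum`, the inclusive (top, left, bottom, right) box convention, and the hard-coded sample values are assumptions for illustration. The sample `ii` is the integral image of the 3×3 image [[1, 2, 3], [4, 5, 6], [7, 8, 9]] under the inclusive definition.

```python
def box_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right], inclusive, from 4 lookups."""

    def at(y, x):
        # Positions above or to the left of the image contribute a sum of 0.
        return ii[y][x] if y >= 0 and x >= 0 else 0

    return (
        at(bottom, right)        # lower right
        - at(top - 1, right)     # upper right (row just above the box)
        - at(bottom, left - 1)   # lower left (column just left of the box)
        + at(top - 1, left - 1)  # upper left (was subtracted twice)
    )


# Integral image of [[1, 2, 3], [4, 5, 6], [7, 8, 9]].
ii = [
    [1, 3, 6],
    [5, 12, 21],
    [12, 27, 45],
]
print(box_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
print(box_sum(ii, 0, 1, 1, 2))  # 2 + 3 + 5 + 6 = 16
print(box_sum(ii, 0, 0, 2, 2))  # whole image = 45
```

The number of lookups is the same for every box size, which is why the cost of a box-filter response no longer depends on the area of the filter.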
Discussion Questions: (1) Describe how you use the integral image to compute the sum of any region in an image. (2) Using the integral image, how many operations does it take to compute the sum of a region that is 3×3? 10×10? 10×30? (3) How would you use the integral image to efficiently compute the response of an image region to a box filter? (4) How many operations do you need to compute the response of an image region to a box filter containing two pieces? Three? Four? (5) What is a simple way you might try to classify image regions as containing a face or not, using the response of the image region to a box filter?