Local Invariant Features EECS 274 Computer Vision Local Invariant Features
:Local features Matching with Harris Detector CS4495/7495 2004 4/22/2017 :Local features Matching with Harris Detector Scale-invariant Feature Detection Scale Invariant Image Descriptors Affine-invariant Feature Detection Object Recognition SIFT Features Reading: S Chapter 4 Local Invariant Features
CS4495/7495 2004 4/22/2017 Examples Local Invariant Features
CS4495/7495 2004 4/22/2017 Features Local Invariant Features
What local features to use? CS4495/7495 2004 4/22/2017 What local features to use? Local Invariant Features
Aperture problem Ambiguity of 1-dimensional motion perception CS4495/7495 2004 4/22/2017 Aperture problem Ambiguity of 1-dimensional motion perception Stripes moved left 5 pixels Stripes moved upward 6 pixels Local Invariant Features
( ) Introduction Local invariant photometric descriptors local descriptor Local : robust to occlusion/clutter + no segmentation Photometric : distinctive Invariant : to image transformations + illumination changes
History - Recognition Color histogram [Swain 91] Each pixel is described by a color vector Distribution of color vectors is described by a histogram not robust to occlusion, not invariant, not distinctive
. . . . History - Recognition Eigenimages [Turk 91] Each face vector is represented in the eigenimage space eigenvectors with the highest eigenvalues = eigenimages The new image is projected into the eigenimage space determine the closest face . . . . not robust to occlusion, requires segmentation, not invariant, discriminant
History - Recognition Geometric invariants [Rothwell 92] Function with a value independent of the transformation Invariant for image rotation : distance of two points Invariant for planar homography : cross-ratio local and invariant, not discriminant, requires sub-pixel extraction of primitives
History - Recognition Problems : occlusion, clutter, image transformations, distinctiveness Solution : recognition with local photometric invariants [ Local greyvalue invariants for image retrieval, C. Schmid and R. Mohr, PAMI 1997 ]
Approach ( ) local descriptor 1) Extraction of interest points (characteristic locations) 2) Computation of local descriptors 3) Determining correspondences 4) Selection of similar images
Matching with interest points CS4495/7495 2004 4/22/2017 Matching with interest points Extraction of interest points with the Harris detector Comparison of points with cross-correlation Verification with the fundamental matrix Local Invariant Features
Moravec corner detector CS4495/7495 2004 4/22/2017 Moravec corner detector Developed for Stanford Cart in 1977 Local Invariant Features
Moravec corner detector Change of intensity for the shift [u,v]: Intensity Window function Shifted intensity Four shifts: (u,v) = (1,0), (1,1), (0,1), (-1, 1) Look for local maxima in min(E)
Problems of Moravec detector Noisy response due to a binary window function Only a set of shifts at every 45 degree is considered Only minimum of E is taken into account Harris corner detector (1988) solves these problems.
Harris detector Based on the idea of auto-correlation CS4495/7495 2004 4/22/2017 Harris detector Based on the idea of auto-correlation Important difference in all directions interest point Local Invariant Features
Interest points Geometric features 2D characteristics of the signal 4/22/2017 Interest points Geometric features repeatable under transformations 2D characteristics of the signal high informational content Comparison of different detectors [Schmid98] Harris detector Local Invariant Features
Discrete shifts can be avoided with the auto-correlation matrix CS4495/7495 2004 4/22/2017 Harris detector Auto-correlation function for a point and a shift Discrete shifts can be avoided with the auto-correlation matrix Local Invariant Features
Comparison of detectors: Rotation CS4495/7495 2004 4/22/2017 Comparison of detectors: Rotation repeatability = #good matches/mean(#points) [Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98] Local Invariant Features
Comparison of detectors: Perspective CS4495/7495 2004 4/22/2017 Comparison of detectors: Perspective repeatability = #good matches/mean(#points) [Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98] Local Invariant Features
Harris corner detector Intensity change in shifting window: eigenvalue analysis 1, 2 – eigenvalues of A Ellipse E(u,v) = const direction of the fastest change direction of the slowest change (max)-1/2 (min)-1/2 Shi and Tomasi use min(1, 2) to locate good features to track uncertainty ellipse
Harris corner detector Classification of image points using eigenvalues of M: 2 edge 2 >> 1 Corner 1 and 2 are large, 1 ~ 2; E increases in all directions 1 and 2 are small; E is almost constant in all directions edge 1 >> 2 flat 1
Harris detection Auto-correlation matrix Interest point detection captures the structure of the local neighborhood measure based on eigenvalues of this matrix 2 strong eigenvalues => interest point 1 strong eigenvalue => contour 0 eigenvalue => uniform region Interest point detection threshold on the eigenvalues local maximum for localization
Harris corner detector Measure of corner response: (k – empirical constant, k = 0.04-0.06)
Example
Example
Example
Good features Using auto-correlation or Hessian matrix CS4495/7495 2004 4/22/2017 Good features Using auto-correlation or Hessian matrix Local Invariant Features
Local descriptors ( ) local descriptor Descriptors characterize the local neighborhood of a point
Local jet Convolution of image I with Gaussian derivatives
N-Jet, local jet Invariance to image rotation : differential invariants [Koen87]
Local descriptors Robustness to illumination changes In case of an affine transformation or normalization of the image patch with mean and variance
Determining correspondences ? ( ) = ( ) Vector comparison using the Mahalanobis distance
Selection of similar images In a large database voting algorithm additional constraints Rapid acces with an indexing mechanism
Voting algorithm local characteristics vector of ( )
Voting algorithm 2 1 1 } } I is the corresponding model image 1 2
Additional constraints Semi-local constraints neighboring points should match angles, length ratios should be similar Global constraints robust estimation of the image transformation (homogaphy, epipolar geometry) 1 1 2 2 3 3
Results database with ~1000 images
Results
Harris detector Interest points extracted with Harris (~ 500 points) CS4495/7495 2004 4/22/2017 Harris detector Interest points extracted with Harris (~ 500 points) Local Invariant Features
Cross-correlation matching CS4495/7495 2004 4/22/2017 Cross-correlation matching Initial matches (188 pairs) Local Invariant Features
Global constraints Robust estimation of the fundamental matrix CS4495/7495 2004 4/22/2017 Global constraints Robust estimation of the fundamental matrix 99 inliers 89 outliers Local Invariant Features
Summary of Harris detector Very good results in the presence of occlusion and clutter local information discriminant greyvalue information invariance to image rotation and illumination Not invariance to scale and affine changes Solution for more general view point changes local invariant descriptors to scale and rotation extraction of invariant points and regions
Scale Invariant Feature Detection CS4495/7495 2004 4/22/2017 Scale Invariant Feature Detection Consider two images of the same scene, related by a scale change (i.e. zooming) How are their scale space representations related? Scale Space Theory (Lindeberg ’98): Normalized derivatives have the same value at corresponding relative scales Local Invariant Features
Harris detector + scale changes CS4495/7495 2004 4/22/2017 Harris detector + scale changes Local Invariant Features
Scale Adapted Harris Detector CS4495/7495 2004 4/22/2017 Scale Adapted Harris Detector Many corresponding points at which the scale factor corresponds to scale change between images Local Invariant Features
Scale invariant Harris points CS4495/7495 2004 4/22/2017 Scale invariant Harris points Multi-scale extraction of Harris interest points Selection of points at characteristic scale in scale space Chacteristic scale : - maximum in scale space - scale invariant Laplacian Local Invariant Features
Scale invariant interest points CS4495/7495 2004 4/22/2017 Scale invariant interest points multi-scale Harris points selection of points at the characteristic scale with Laplacian invariant points + associated regions [Mikolajczyk & Schmid’01] Harris-Laplacian Feature Local Invariant Features
Harris detector – adaptation to scale CS4495/7495 2004 4/22/2017 Harris detector – adaptation to scale Local Invariant Features
Evaluation of scale invariant detectors CS4495/7495 2004 4/22/2017 Evaluation of scale invariant detectors repeatability – scale changes Local Invariant Features
SIFT: Overview 1999 Generates image features, “keypoints” invariant to image scaling and rotation partially invariant to change in illumination and 3D camera viewpoint many can be extracted from typical images highly distinctive
Algorithm overview Scale-space extrema detection Keypoint localization Uses difference-of-Gaussian function Keypoint localization Sub-pixel location and scale fit to a model Orientation assignment 1 or more for each keypoint Keypoint descriptor Created from local image gradients
Scale space Definition: where
Scale space Keypoints are detected using scale-space extrema in difference-of-Gaussian function D D definition: Efficient to compute
Relationship of D to Close approximation to scale-normalized Laplacian of Gaussian, Diffusion equation: Approximate ∂G/∂σ: giving, When D has scales differing by a constant factor it already incorporates the σ2 scale normalization required for scale-invariance e.g.,
Scale space construction 2k2σ 2kσ 2σ kσ σ 2kσ 2σ kσ σ
CS4495/7495 2004 4/22/2017 Scale space A collection of images obtained by progressively smoothing the input image Analogous to gradually reducing image resolution See Vedaldi’s implementation (http://www.vlfeat.org/ ) for details Discretized scales Local Invariant Features
Scale space images … … … … … first octave … second octave … third octave … fourth octave
Difference-of-Gaussian images … … … … … first octave … second octave … third octave … fourth octave
Frequency of sampling There is no minimum Best frequency determined experimentally
Prior smoothing for each octave Increasing σ increases robustness, but costs σ = 1.6 a good tradeoff Doubling the image initially increases number of keypoints
Finding extrema Sample point D(x,y,σ) is selected only if it is a minimum or a maximum of these points Extrema in this image DoG scale space
Localization 3D quadratic function is fit to the local sample points Start with Taylor expansion with sample point as the origin where Take the derivative with respect to X, and set it to 0, giving is the location of the keypoint This is a 3x3 linear system
Localization Hessian and derivative approximated by finite differences, example: If X is > 0.5 in any dimension, process repeated
Filtering Contrast (use prev. equation): Edge-iness: If |D(X)| < 0.03, throw it out Edge-iness: Use ratio of principal curvatures to throw out poorly defined peaks Curvatures come from Hessian: Ratio of Trace(H)2 and Determinant(H) If ratio > (r+1)2/(r), throw it out (SIFT uses r=10)
Orientation assignment Descriptor computed relative to keypoint’s orientation achieves rotation invariance Precomputed along with mag. for all levels (useful in descriptor computation) Multiple orientations assigned to keypoints from an orientation histogram Significantly improve stability of matching
Keypoint images
Keypoint Selection Finding extrema with DoG Removing Removing CS4495/7495 2004 4/22/2017 Keypoint Selection Finding extrema with DoG Removing |D(X)| < 0.03 Removing |D(X)| < 0.03 Local Invariant Features
Select canonical orientation 4/22/2017 Select canonical orientation Create histogram of local gradient directions computed at selected scale Assign canonical orientation at peak of smoothed histogram Each key specifies stable 2D coordinates (x, y, scale, orientation)
Creating features stable to viewpoint change CS4495/7495 2004 4/22/2017 Creating features stable to viewpoint change Edelman, Intrator & Poggio (97) showed that complex cell outputs are better for 3D recognition than simple correlation Local Invariant Features
Stability to viewpoint change CS4495/7495 2004 4/22/2017 Stability to viewpoint change Classification of rotated 3D models (Edelman 97): Complex cells: 94% vs simple cells: 35% Local Invariant Features
Descriptor Descriptor has 3 dimensions (x,y,θ) Orientation histogram of gradient magnitudes Position and orientation of each gradient sample rotated relative to keypoint orientation
Descriptor Weight magnitude of each sample point by Gaussian weighting function Distribute each sample to adjacent bins by trilinear interpolation (avoids boundary effects)
Descriptor Best results achieved with 4x4x8 = 128 descriptor size Normalize to unit length Reduces effect of illumination change Cap each element to 0.2, normalize again Reduces non-linear illumination changes 0.2 determined experimentally
Object detection Create a database of keypoints from training images Match keypoints to a database Nearest neighbor search
PCA-SIFT Different descriptor (same keypoints) Apply PCA to the gradient patch Descriptor size is 20 (instead of 128) More robust, faster
SIFT: Summary Scale space Difference-of-Gaussian Localization Filtering Orientation assignment Descriptor, 128 elements
CS4495/7495 2004 4/22/2017 Object recognition Definition: Identify an object and determine its pose and model parameters Commercial object recognition $4 billion/year industry for inspection and assembly Almost entirely based on template matching Upcoming applications Mobile robots, toys, user interfaces Location recognition Digital camera panoramas, 3D scene modeling Local Invariant Features
Invariant local features CS4495/7495 2004 4/22/2017 Invariant local features Image content => local feature coordinates invariant to translation, rotation, scale Technical details regarding Keypoint matching (using 1st and 2nd nearest neighbors) Efficient nearest neighbor indexing Clustering with Hough transform (model location, orientation, scale) Account for affine distortion Features Local Invariant Features
Advantages of invariant local features CS4495/7495 2004 4/22/2017 Advantages of invariant local features Locality: features are local, so robust to occlusion and clutter (no prior segmentation) Distinctiveness: individual features can be matched to a large database of objects Quantity: many features can be generated for even small objects Efficiency: close to real-time performance Local Invariant Features
CS4495/7495 2004 4/22/2017 Local Invariant Features
CS4495/7495 2004 4/22/2017 Local Invariant Features
Experimental evaluation CS4495/7495 2004 4/22/2017 Experimental evaluation Local Invariant Features
Scale change (factor 2.5) Harris-Laplace DoG CS4495/7495 2004 4/22/2017 Scale change (factor 2.5) Harris-Laplace DoG Local Invariant Features
Viewpoint change (60 degrees) CS4495/7495 2004 4/22/2017 Viewpoint change (60 degrees) Harris-Affine (Harris-Laplace) Local Invariant Features
Descriptors - conclusion CS4495/7495 2004 4/22/2017 Descriptors - conclusion SIFT + steerable perform best Performance of the descriptor independent of the detector Errors due to imprecision in region estimation, localization Local Invariant Features
change in viewing angle CS4495/7495 2004 4/22/2017 Image retrieval … > 5000 images change in viewing angle Local Invariant Features
Matches 22 correct matches CS4495/7495 2004 4/22/2017 Local Invariant Features
change in viewing angle CS4495/7495 2004 4/22/2017 Image retrieval … > 5000 images change in viewing angle + scale change Local Invariant Features
Matches 33 correct matches CS4495/7495 2004 4/22/2017 Local Invariant Features
Multiple panoramas from an unordered image set CS4495/7495 2004 4/22/2017 Multiple panoramas from an unordered image set Local Invariant Features
Image registration and blending 4/22/2017 Image registration and blending
Location recognition CS4495/7495 2004 4/22/2017 Local Invariant Features
Robot localization Joint work with Stephen Se, Jim Little CS4495/7495 2004 4/22/2017 Robot localization Joint work with Stephen Se, Jim Little Local Invariant Features
Recognizing panoramas 4/22/2017 Recognizing panoramas Matthew Brown and David Lowe Recognize overlap from an unordered set of images and automatically stitch together SIFT features provide initial feature matching Image blending at multiple scales hides the seams Panorama automatically assembled from 143 images
Map continuously built over time CS4495/7495 2004 4/22/2017 Map continuously built over time Local Invariant Features
CS4495/7495 2004 4/22/2017 Planar recognition Planar surfaces can be reliably recognized at a rotation of 60° away from the camera Affine fit approximates perspective projection Only 3 points are needed for recognition Local Invariant Features
3D object recognition Extract outlines with background subtraction CS4495/7495 2004 4/22/2017 3D object recognition Extract outlines with background subtraction Local Invariant Features
CS4495/7495 2004 4/22/2017 3D object recognition Only 3 keys are needed for recognition, so extra keys provide robustness Affine model is no longer as accurate Local Invariant Features
Recognition under occlusion CS4495/7495 2004 4/22/2017 Recognition under occlusion Local Invariant Features
Comparison to template matching 4/22/2017 Comparison to template matching Costs of template matching 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations Does not easily handle partial occlusion and other variation without large increase in template numbers Viola & Jones cascade must start again for each qualitatively different template Costs of local feature approach 3000 evaluations (reduction by factor of 10,000) Features are more invariant to illumination, 3D rotation, and object variation Use of many small subtemplates increases robustness to partial occlusion and other variations
More local features Harris Laplace Harris Affine Hessian detector CS4495/7495 2004 4/22/2017 More local features Harris Laplace Harris Affine Hessian detector Hessian Laplace Hessian Affine MSER (Maximally Stable Extremal Regions) SURF (Speeded-Up Robust Feature) Local Invariant Features