Feature Extraction and Matching; Feature Tracking
Sudipta N Sinha
Sep 19, 2006
Outline
Feature Extraction and Matching (for larger motion)
- What are features?
- Tasks:
  - Detection: finding the feature locations
  - Representation: computing a compact descriptor
  - Matching: finding distances in feature space
- Algorithms: Harris Corner Detector, SIFT
- More complex (wide-baseline correspondence)
Tracking (for small motion)
- Track geometric primitives (points, lines, patches, objects, …) from frame to frame in video
- High temporal coherence
- Typically required in a real-time system
Matching comes up in all kinds of problems in computer vision
- Panoramas, mosaics
- Structure from Motion (F, T, …)
- Object recognition
- More: detecting objects in clutter, motion segmentation, image-based retrieval, video mining, … (check the papers in the References)
The Correspondence Problem and Invariance
Invariance: features need to be detected repeatably at the same locations, and the computed descriptors must be similar, in spite of the changes (such as scale, rotation, affine distortion and illumination) observed between two images of the same scene.
Point Features (Interest Points)
- Goal: to detect the same point in each image independently
- Challenges:
  - Need repeatability in the presence of scale, rotation, affine distortions and illumination change
  - Not all pixels are good candidates (texture-less regions, edges)
  - Effect of noise on feature extraction
- Examples: Harris Corner Detector, SIFT
Harris Corner Detector
- Idea: detect a patch that looks locally unique; shifting the patch in any direction gives a large change in intensity
- Texture-less region: no change in any direction
- Edge: no change along one direction (the edge direction)
- Corner: large change in all directions
A symmetric matrix represents an ellipse. The matrix is symmetric positive semi-definite.
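Harris Corner Detector
Eigenvalue analysis of the 2x2 matrix M: two small eigenvalues indicate a texture-less region, one large and one small eigenvalue indicate an edge, and two large eigenvalues indicate a corner.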
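A minimal NumPy sketch of this analysis. It assumes M is the 2x2 matrix of gradient products summed over a square window, and it uses the common response R = det(M) - k * trace(M)^2, which avoids computing the eigenvalues explicitly because det(M) is their product and trace(M) is their sum. The names `box_sum` and `harris_response`, the window radius `r` and the value k = 0.04 are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def box_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window, zero-padded at the border."""
    p = np.pad(a, r)
    out = np.zeros_like(a, dtype=float)
    h, w = a.shape
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=2, k=0.04):
    """Corner response per pixel: > 0 corner, < 0 edge, about 0 flat region."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)        # derivatives along rows (y) and columns (x)
    Sxx = box_sum(Ix * Ix, r)        # entries of M, summed over the window
    Syy = box_sum(Iy * Iy, r)
    Sxy = box_sum(Ix * Iy, r)
    det = Sxx * Syy - Sxy ** 2       # product of the two eigenvalues
    trace = Sxx + Syy                # sum of the two eigenvalues
    return det - k * trace ** 2

# Synthetic test image: a bright square on a dark background.
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R = harris_response(img)
```

On this image the response is positive at the square's corner (row 10, column 10), negative in the middle of an edge (row 10, column 20) and zero inside the square, matching the three cases on the slide.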
Corners: Feature Descriptors and Matching
- Simple descriptor: convert the patch of n x n pixels centered at the corner into a vector, i.e. a point in a high-dimensional feature space
- Matching: SAD (sum of absolute differences), SSD (sum of squared differences), ZMNCC (zero-mean normalized cross-correlation)
- Invariance:
  - Translation? Yes
  - Rotation? No, but the image patch could be re-sampled using the eigenvector pair as the local coordinate frame
  - Scale and affine? No
  - Brightness change? Yes, if the image intensity is normalized (ZMNCC)
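The three matching scores can be sketched in a few lines of NumPy. The function names and the small `eps` guard against division by zero are assumptions for illustration. The example shows why ZMNCC tolerates a brightness change (gain and offset) while SAD and SSD do not.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences (lower is more similar)."""
    return np.abs(a - b).sum()

def ssd(a, b):
    """Sum of squared differences (lower is more similar)."""
    return ((a - b) ** 2).sum()

def zmncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation in [-1, 1] (higher is more similar)."""
    a0 = a - a.mean()                # removing the mean cancels an intensity offset
    b0 = b - b.mean()
    # dividing by the norms cancels an intensity gain
    return (a0 * b0).sum() / (np.linalg.norm(a0) * np.linalg.norm(b0) + eps)

rng = np.random.default_rng(0)
patch = rng.random((7, 7))
brighter = 1.5 * patch + 0.2         # same patch under a gain + offset change
other = rng.random((7, 7))           # an unrelated patch
```

Here `zmncc(patch, brighter)` is 1 up to rounding, while `ssd(patch, brighter)` is clearly non-zero even though both patches show the same content.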
Point Features: SIFT
- First: scale-invariant feature detection
- Later: SIFT descriptors (rotational invariance)
The SIFT Algorithm (Lowe, IJCV’04)
- Create scale-space stack (figure panels: intensity, gradient, DoG)
Images from SIFT Tutorial [Thomas F. El-Maraghi, May 2004]
The SIFT Algorithm
- Find local extrema of the DoG (difference of Gaussians) in scale space
- Remove low-contrast points and points on edges
Images from SIFT Tutorial [Thomas F. El-Maraghi, May 2004]
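A rough sketch of the detection step, assuming a single octave and a plain 3x3x3 neighbourhood test. The helper names, the base scale 1.6, the scale ratio sqrt(2) and the contrast threshold are illustrative choices. The edge-response test and sub-pixel refinement of the full algorithm are omitted here.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge replication at the border."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    blur_1d = lambda v: np.convolve(np.pad(v, radius, mode="edge"), k, mode="valid")
    tmp = np.apply_along_axis(blur_1d, 1, img.astype(float))   # blur rows
    return np.apply_along_axis(blur_1d, 0, tmp)                # blur columns

def dog_stack(img, sigma0=1.6, ratio=2 ** 0.5, levels=5):
    """Difference-of-Gaussians layers between successive blur levels."""
    blurred = [gaussian_blur(img, sigma0 * ratio ** i) for i in range(levels)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(levels - 1)])

def local_extrema(dog, thresh=0.01):
    """Points that are the max or min of their 3x3x3 scale-space neighbourhood."""
    pts = []
    S, H, W = dog.shape
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                v = dog[s, y, x]
                if abs(v) <= thresh:          # discard low-contrast responses
                    continue
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if v == cube.max() or v == cube.min():
                    pts.append((s, y, x))
    return pts

# Synthetic test image: one Gaussian blob centred at (20, 20).
yy, xx = np.mgrid[0:41, 0:41]
blob = np.exp(-((yy - 20) ** 2 + (xx - 20) ** 2) / (2 * 2.7 ** 2))
pts = local_extrema(dog_stack(blob), thresh=0.05)
```

For this blob the list contains an extremum at the blob centre, in the DoG layer whose scale best matches the blob size.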
The SIFT Algorithm
- Descriptor represents local patch appearance
- Orientation histograms built from weighted gradients
Images from Lowe, IJCV’04
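As an illustration of the building block, the sketch below computes one gradient-orientation histogram for a patch, weighting each pixel by its gradient magnitude and by a Gaussian centred on the patch. It is not the full SIFT descriptor, which combines several such histograms over a grid of sub-regions. The function name, the 36 bins and the default `sigma` are assumptions.

```python
import numpy as np

def orientation_histogram(patch, bins=36, sigma=None):
    """Histogram of gradient directions, weighted by magnitude and a Gaussian."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                       # row (y) and column (x) derivatives
    mag = np.hypot(gx, gy)
    # Angle in degrees in [0, 360); y points down, as usual for image arrays.
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    h, w = patch.shape
    if sigma is None:
        sigma = 0.5 * max(h, w)
    yy, xx = np.mgrid[0:h, 0:w]
    weight = np.exp(-((yy - (h - 1) / 2) ** 2 + (xx - (w - 1) / 2) ** 2)
                    / (2 * sigma ** 2))
    idx = (ang // (360.0 / bins)).astype(int) % bins  # bin index per pixel
    return np.bincount(idx.ravel(), weights=(mag * weight).ravel(), minlength=bins)

# A ramp whose intensity increases along both x and y: gradient direction 45 degrees.
yy, xx = np.mgrid[0:16, 0:16]
hist_diag = orientation_histogram((xx + yy).astype(float))
# A ramp increasing along x only: gradient direction 0 degrees.
hist_x = orientation_histogram(xx.astype(float))
```

The peak of such a histogram gives a dominant orientation for the patch; measuring gradients relative to it is what provides rotational invariance.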
SIFT: Results
Wide Baseline Matching: Elliptical and Parallelogram features (Tuytelaars, Van Gool et al., IJCV 2004) Anchor point: traditional corners
Wide Baseline Matching: Elliptical and Parallelogram features (Tuytelaars, Van Gool et al., IJCV 2004) Anchor point: local intensity maxima
Tracking Corners – The KLT algorithm
- Main idea: assuming brightness constancy, find the new positions of some ‘salient’ image points in the second image (where the motion is small)
- Steps:
  - Detect salient points to track (in the current frame)
  - Track those features in the next frame
- Tracking could be done by searching (template matching), but the KLT algorithm does this analytically, hence it is faster
KLT equations: Assumption – Brightness Constancy
Find a displacement d such that the error, the squared intensity difference between the two images summed over a tracking window, is minimized.
KLT equations
A symmetric form of the error, in which each image is shifted by half the displacement, was later proposed by Tomasi. To estimate d, differentiate the error with respect to d.
KLT equations
Substituting first-order Taylor series expansions for J(.) and I(.), setting the derivative to zero at the minimum, and rearranging gives a linear system of equations for d.
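The steps above can be written out as follows. This is a sketch in the usual KLT notation, with I and J the two images and W the tracking window. The weighting function w and the symbols g, Z and e are assumptions that follow the common formulation rather than the slides.

```latex
\begin{align*}
% Error over the tracking window W
\epsilon &= \iint_W \left[ J(\mathbf{x} + \mathbf{d}) - I(\mathbf{x}) \right]^2 w(\mathbf{x}) \, d\mathbf{x} \\
% Symmetric form
\epsilon &= \iint_W \left[ J\!\left(\mathbf{x} + \tfrac{\mathbf{d}}{2}\right)
            - I\!\left(\mathbf{x} - \tfrac{\mathbf{d}}{2}\right) \right]^2 w(\mathbf{x}) \, d\mathbf{x} \\
% First-order Taylor expansions
J\!\left(\mathbf{x} + \tfrac{\mathbf{d}}{2}\right) &\approx J(\mathbf{x}) + \tfrac{1}{2}\, \mathbf{d}^{\top} \nabla J(\mathbf{x}), \qquad
I\!\left(\mathbf{x} - \tfrac{\mathbf{d}}{2}\right) \approx I(\mathbf{x}) - \tfrac{1}{2}\, \mathbf{d}^{\top} \nabla I(\mathbf{x}) \\
% Derivative with respect to d, with g the averaged gradient
\frac{\partial \epsilon}{\partial \mathbf{d}} &\approx
  2 \iint_W \left[ J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^{\top}(\mathbf{x})\, \mathbf{d} \right]
  \mathbf{g}(\mathbf{x})\, w(\mathbf{x}) \, d\mathbf{x},
  \qquad \mathbf{g} = \tfrac{1}{2} \left( \nabla I + \nabla J \right) \\
% Setting the derivative to zero gives a 2x2 linear system
Z\, \mathbf{d} &= \mathbf{e}, \qquad
Z = \iint_W \mathbf{g}\, \mathbf{g}^{\top} w \, d\mathbf{x}, \qquad
\mathbf{e} = \iint_W \left[ I - J \right] \mathbf{g}\, w \, d\mathbf{x}
\end{align*}
```

Z has the same structure as the matrix M used by the Harris detector, so the linear system is well conditioned exactly at corner-like points.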
KLT equations
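A minimal NumPy sketch of one KLT update, assuming the symmetric formulation with uniform weights over a square window: build the 2x2 matrix Z and the vector e from the averaged image gradients, then solve Z d = e. The function name `klt_step`, the window half-size and the synthetic test images are illustrative assumptions. A practical tracker would iterate this step and resample the window at sub-pixel positions.

```python
import numpy as np

def klt_step(I, J, center, half=7):
    """One symmetric KLT update for the window centred at `center` = (row, col).

    Returns the estimated displacement d = (dx, dy) of the patch from I to J.
    """
    I = I.astype(float)
    J = J.astype(float)
    r, c = center
    win = (slice(r - half, r + half + 1), slice(c - half, c + half + 1))
    Iy, Ix = np.gradient(I)
    Jy, Jx = np.gradient(J)
    gx = 0.5 * (Ix + Jx)[win].ravel()      # averaged gradient, x component
    gy = 0.5 * (Iy + Jy)[win].ravel()      # averaged gradient, y component
    diff = (I - J)[win].ravel()
    Z = np.array([[gx @ gx, gx @ gy],
                  [gx @ gy, gy @ gy]])     # 2x2 gradient matrix
    e = np.array([diff @ gx, diff @ gy])   # right-hand side
    return np.linalg.solve(Z, e)

# Synthetic smooth image and a copy shifted by a known sub-pixel displacement.
f = lambda x, y: np.sin(0.2 * x) + np.cos(0.15 * y)
yy, xx = np.mgrid[0:40, 0:40].astype(float)
true_d = (0.3, -0.2)                        # (dx, dy)
I = f(xx, yy)
J = f(xx - true_d[0], yy - true_d[1])       # J(x) = I(x - d)
d = klt_step(I, J, center=(20, 20))
```

For this small motion, a single step recovers a displacement close to `true_d`. Larger motions violate the first-order Taylor approximation, which is why the step is iterated and combined with an image pyramid.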
Multiscale and Iterative KLT
- Build an image pyramid
- Coarse-to-fine tracking increases the effective spatial range within which features can be tracked
View-Dependent Effects
- If the surface patch is small, then large perspective distortions can be approximated by an affine transformation (Affine KLT)
- Brightness change = gain + offset (2 more parameters), which gives invariance to illumination
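A small sketch of the pyramid idea, assuming 2x2 block averaging as the down-sampling filter (a simplification; Gaussian smoothing before subsampling is the more usual choice). The function name and the example numbers are illustrative. Each level halves the resolution, so a displacement of 12 pixels at full resolution shrinks to 3 pixels two levels up, which is small enough for the KLT linearization.

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Return [full resolution, half, quarter, ...] using 2x2 block averaging."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]                       # drop an odd last row or column
        pyr.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                           + a[0::2, 1::2] + a[1::2, 1::2]))
    return pyr

pyr = build_pyramid(np.arange(64.0).reshape(8, 8), levels=3)
shapes = [p.shape for p in pyr]

# A displacement measured at full resolution shrinks by 2 per pyramid level.
d_fine = 12.0
d_coarse = d_fine / 2 ** (len(pyr) - 1)
```

In coarse-to-fine tracking, the displacement is estimated at the coarsest level first, then doubled and used as the starting point for the next finer level.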
