SIFT paper.


1 SIFT paper

2 Terms & Definitions
Pose = position + orientation of an object
Image gradient = change in brightness (orientation dependent)
Scale = ‘size’ of features. Gaussian filters eliminate ‘smaller’ features
Octave = a set of scales spanning a doubling of sigma, e.g. 1, sqrt(2), 2 is an octave
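
The octave definition can be sketched in a few lines of Python. This assumes the scales are spaced by a constant factor k = 2^(1/s), which matches the 1, sqrt(2), 2 example; the function name and arguments are made up for illustration.

```python
def octave_scales(sigma0, s):
    """Return the s + 1 scales of one octave, from sigma0 up to 2 * sigma0."""
    k = 2.0 ** (1.0 / s)  # constant multiplicative step between scales
    return [sigma0 * k ** i for i in range(s + 1)]

# s = 2 intervals per octave reproduces the example on this slide
scales = octave_scales(1.0, 2)
```

With s = 2 the list is 1, sqrt(2), 2; the next octave would start again at sigma = 2.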

3 Overview
Scale space extrema detection: potential image points from DoG function, invariant to scale and orientation
Keypoint localization: choose stable locations & determine their exact position & scale
Orientation assignment: assign keypoint orientation based on image gradients
Keypoint descriptor = final representation

4 Using the SIFT
Collect keypoints for each reference image and store in database
Collect keypoints for the ‘unknown’ image
Look for clusters of matches that agree on object, scale, and image location
Compute probability of object, given features

5 Related Research
Corner & feature detectors (Moravec, Harris)
Feature matching into database (Schmid & Mohr)
Feature scale invariance (Lowe)
Affine invariant matching (many)
Alternative feature types (many)
Section 2 is well-written. It’s not just a listing of X did A and Y did B, but relates all the work to each other and to the current work.

6 Detection of Scale Space Extrema
Scale space: (x, y, sigma), where sigma is the parameter of a Gaussian function
Extrema in scale space: pixel whose value is the max (min) of a local window
Difference of Gaussian function: D(x, y, sigma) = (G(x, y, k*sigma) - G(x, y, sigma)) * I(x, y), where the last * is convolution
First create the DoG kernel, then convolve with the image
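
A minimal sketch of the "create DoG kernel" step, in plain Python. The kernel radius of about 3*k*sigma and the default k = sqrt(2) are assumptions made for the example, not values from these slides. Because convolution is linear, subtracting two Gaussian-blurred images gives the same result as convolving with this kernel.

```python
import math

def gaussian_kernel(sigma, radius):
    """2D Gaussian sampled on a (2*radius+1) x (2*radius+1) grid, normalized to sum to 1."""
    coords = range(-radius, radius + 1)
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in coords]
           for y in coords]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def dog_kernel(sigma, k=math.sqrt(2.0)):
    """G(k*sigma) - G(sigma), both sampled on the same grid (radius is an assumed 3*k*sigma)."""
    radius = int(math.ceil(3.0 * k * sigma))
    wide = gaussian_kernel(k * sigma, radius)
    narrow = gaussian_kernel(sigma, radius)
    return [[w - n for w, n in zip(row_w, row_n)]
            for row_w, row_n in zip(wide, narrow)]

kernel = dog_kernel(1.0)
```

The kernel entries sum to roughly zero, so a flat image region gives no DoG response; the center entry is negative because the narrower Gaussian is more peaked.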

7 Difference of Gaussian

8 Local Extrema Detection
s+3 Gaussian images per octave give s+2 DoG images, and s scales with complete 3x3x3 windows.
This example shows 2 samples per octave (Szeliski: Fig 4.11)
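
The 3x3x3 test can be sketched as below. The layout `dog[scale][row][col]` is an assumption for the example, and the caller is expected to stay one sample away from every border so that all 26 neighbours exist.

```python
def images_per_octave(s):
    """(Gaussian images, DoG images, scales with a full 3x3x3 window) for s samples per octave."""
    return s + 3, s + 2, s

def is_local_extremum(dog, scale, row, col):
    """True if dog[scale][row][col] is above or below all 26 neighbours in scale space."""
    center = dog[scale][row][col]
    neighbours = [
        dog[scale + ds][row + dr][col + dc]
        for ds in (-1, 0, 1)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (ds, dr, dc) != (0, 0, 0)
    ]
    return center > max(neighbours) or center < min(neighbours)

# Tiny 3-scale stack of 3x3 DoG images with a single bright center sample
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 1.0
found = is_local_extremum(stack, 1, 1, 1)
```

For the 2-samples-per-octave example, `images_per_octave(2)` gives 5 Gaussian images, 4 DoG images, and 2 scales that can be tested.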

9 How many scales?
In Section 3.2, experiments show:
Repeatability peaks at 3 scales / octave, then slowly drops off
Number of keypoints grows as the number of scales grows, but more slowly than linearly (appears approx. log)
Bottom line: they chose 3 scales / octave

10 Keypoint localization
Initial implementation: location/scale of keypoint taken from pixel coordinates in scale space
Improved implementation: fit a 3D quadratic function to the local sample points
Return coordinates of the peak of the fitted function (subpixel) – see equations 2 and 3
If the value of |D(x, y, sigma)| at the peak is too small, reject due to low contrast
Note: image pixel values apparently normalized to [0,1]
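
For reference, the quadratic fit written out, as it appears in Lowe's paper to the best of my reading. Here x = (x, y, sigma)^T is the offset from the sample point, and D and its derivatives are evaluated at the sample point (in practice with finite differences of neighbouring samples).

```latex
\begin{align}
D(\mathbf{x}) &= D + \frac{\partial D}{\partial \mathbf{x}}^{T}\mathbf{x}
  + \frac{1}{2}\,\mathbf{x}^{T}\,\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\,\mathbf{x}
  && \text{(eq. 2: Taylor expansion up to quadratic terms)} \\
\hat{\mathbf{x}} &= -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1}
  \frac{\partial D}{\partial \mathbf{x}}
  && \text{(eq. 3: offset of the extremum)} \\
D(\hat{\mathbf{x}}) &= D + \frac{1}{2}\,\frac{\partial D}{\partial \mathbf{x}}^{T}\hat{\mathbf{x}}
  && \text{(value used for the low-contrast test)}
\end{align}
```

The paper rejects extrema with |D(x̂)| below 0.03, which is where the [0,1] pixel range matters.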

11 Avoiding Edges
A point on an edge does not localize well (it can slide along the edge)
Compute Dxx, Dxy, Dyy (as we did last week)
Tr(H) = Dxx + Dyy
Det(H) = Dxx*Dyy - Dxy*Dxy
If (Tr(H)*Tr(H))/Det(H) > (r+1)*(r+1)/r (using r of 10), then location is eliminated (max curvature / min curvature > 10)
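
The edge test as a short Python sketch. The function name is made up, and rejecting points with a non-positive determinant (curvatures of opposite sign) is the paper's handling as I understand it rather than something stated on this slide.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if its principal curvature ratio is below r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:
        # Curvatures have different signs (or one is zero): not a usable extremum
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r

blob_ok = passes_edge_test(1.0, 1.0, 0.0)    # equal curvatures: ratio term is 4
edge_ok = passes_edge_test(20.0, 1.0, 0.0)   # curvature ratio 20: ratio term is 22.05
```

With r = 10 the threshold (r+1)^2 / r is 12.1, so the blob-like point passes and the edge-like point is eliminated. No eigenvalues need to be computed.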

12 Keypoint Orientation
Depends on local image properties (e.g. intensities)
Choose the Gaussian smoothed image L at the keypoint’s scale
Compute gradients using horizontal and vertical [1 0 -1] masks (call them H and V)
Gradient magnitude = sqrt(H*H + V*V)
Gradient direction = atan(V/H), evaluated as atan2(V, H) so that it covers the full 360 degrees
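
A small sketch of the gradient computation at one pixel, assuming L is a list of rows (`L[row][col]`) and the pixel is not on the border. Convolving with [1 0 -1] amounts to the pixel difference used here.

```python
import math

def gradient_at(L, row, col):
    """Gradient magnitude and direction (radians) from central pixel differences."""
    h = L[row][col + 1] - L[row][col - 1]   # horizontal [1 0 -1] response
    v = L[row + 1][col] - L[row - 1][col]   # vertical [1 0 -1] response
    magnitude = math.sqrt(h * h + v * v)
    direction = math.atan2(v, h)            # full-circle angle, safe when h == 0
    return magnitude, direction

# Brightness increases by 2 per pixel from left to right
ramp = [[2.0 * col for col in range(3)] for _ in range(3)]
mag, ang = gradient_at(ramp, 1, 1)
```

For this ramp the magnitude is 4 (a two-pixel difference) and the direction is 0, pointing along increasing brightness.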

13 Orientation Histogram
Collect gradient orientations in a window around the sample point
Each sample is weighted by its gradient magnitude and by a Gaussian with sigma that is 1.5 times the keypoint scale
Build histogram (36 bins) of weighted orientations
Peak of histogram is keypoint orientation
Any other peaks within 80% of the highest peak are additional keypoint orientations
Each peak is localized by fitting a parabola to the 3 closest values in the histogram
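
The histogram and peak steps can be sketched as follows. The input format (a list of angle/weight pairs, with the weight already combining gradient magnitude and the Gaussian window) and the convention that bin i is centered at (i + 0.5) * 10 degrees are assumptions for the example.

```python
import math

def orientation_histogram(samples, num_bins=36):
    """samples: iterable of (angle_radians, weight) pairs."""
    hist = [0.0] * num_bins
    for angle, weight in samples:
        fraction = (angle % (2.0 * math.pi)) / (2.0 * math.pi)
        hist[int(fraction * num_bins) % num_bins] += weight
    return hist

def dominant_orientations(hist, peak_ratio=0.8):
    """Angles (degrees) of local peaks within peak_ratio of the highest bin."""
    n = len(hist)
    bin_width = 360.0 / n
    top = max(hist)
    angles = []
    for i, c in enumerate(hist):
        left, right = hist[(i - 1) % n], hist[(i + 1) % n]
        if c >= peak_ratio * top and c > left and c > right:
            # Vertex of the parabola through (-1, left), (0, c), (1, right)
            offset = 0.5 * (left - right) / (left - 2.0 * c + right)
            angles.append(((i + 0.5 + offset) * bin_width) % 360.0)
    return angles

hist = [0.0] * 36
hist[2], hist[3], hist[4] = 5.0, 10.0, 5.0   # main peak in bin 3
hist[20] = 9.0                               # second peak at 90% of the main one
orientations = dominant_orientations(hist)
```

The example yields two orientations, so this location would produce two keypoints that differ only in orientation.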

14 Local image descriptor
Each keypoint has image location, scale, and orientation
Descriptor is an array of histograms of orientations surrounding the keypoint (see Fig. 7)
Array is normalized to reduce effects of lighting
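
A sketch of the normalization step. The slide only says the array is normalized; the clamp at 0.2 followed by a second normalization is taken from Lowe's paper and is included here as an assumption about what "normalized" covers.

```python
import math

def normalize_descriptor(vec, clamp=0.2):
    """Scale to unit length, cap large entries at `clamp`, then rescale to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)                      # nothing to normalize
    capped = [min(v / norm, clamp) for v in vec]
    norm2 = math.sqrt(sum(v * v for v in capped))
    return [v / norm2 for v in capped]

dim = normalize_descriptor([1.0, 2.0, 3.0, 4.0])
bright = normalize_descriptor([2.0, 4.0, 6.0, 8.0])   # same patch, doubled contrast
```

Both calls return the same vector: unit-length scaling removes a contrast change, and the clamp limits the influence of a few very large gradients.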

15 Application: Object Recognition
Match each keypoint independently to database (nearest neighbor)
Find clusters of at least 3 features that agree on object, position and orientation (pose)
Perform detailed geometric fit to model and accept or reject
Nearest neighbor must be significantly closer than the 2nd nearest neighbor: reject the match if the distance ratio (nearest / 2nd nearest) > 0.8
Clustering uses Hough transform – we’ll do this later
Geometric fit = affine transform – need to find values for the 6 DOF (p. 106)
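
The nearest-neighbor step with the distance ratio test, as a brute-force sketch. Descriptors are plain lists of floats here and the function names are hypothetical; a real system would use an approximate nearest-neighbor search over the database.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_keypoint(query, database, max_ratio=0.8):
    """Index of the nearest database descriptor, or None if the match is ambiguous."""
    ranked = sorted((euclidean(query, d), i) for i, d in enumerate(database))
    if len(ranked) < 2:
        return None
    (d1, best), (d2, _) = ranked[0], ranked[1]
    if d2 == 0.0 or d1 / d2 > max_ratio:
        return None   # nearest is not clearly closer than the 2nd nearest
    return best

clear = match_keypoint([0.0, 0.0], [[0.1, 0.0], [1.0, 0.0], [5.0, 5.0]])
ambiguous = match_keypoint([0.0, 0.0], [[1.0, 0.0], [0.0, 1.1]])
```

The first query has ratio 0.1 and matches descriptor 0; the second has ratio of about 0.91 and is rejected.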

