Presentation on theme: "Scale and interest point descriptors"— Presentation transcript:

1 Scale and interest point descriptors
T-11 Computer Vision, University of Ioannina, Christophoros Nikou.
Images and slides from:
James Hays, Brown University, Computer Vision course
Svetlana Lazebnik, University of North Carolina at Chapel Hill, Computer Vision course
D. Forsyth and J. Ponce. Computer Vision: A Modern Approach, Prentice Hall, 2011.
R. Gonzalez and R. Woods. Digital Image Processing, Prentice Hall, 2008.

2 Scale-invariant feature detection
Harris corners can be localized in space but not in scale

3 Scale-invariant feature detection
Goal: independently detect corresponding regions in scaled versions of the same image. We need a scale selection mechanism for finding a characteristic region size that is covariant with the image transformation. Idea: given a keypoint in two images, determine whether the surrounding neighborhoods contain the same structure up to scale. We could do this by sampling each image at a range of scales and comparing neighborhoods at each pixel to find a match, but this is impractical.

4 Automatic Scale Selection
How to find corresponding patch sizes? K. Grauman, B. Leibe

5 Automatic Scale Selection
Function responses for increasing scale (scale signature) K. Grauman, B. Leibe

11 Scale-invariant feature detection
The only operator fulfilling these requirements is the scale-normalized Gaussian. The scale-normalized LoG can detect blob-like features. T. Lindeberg. Scale-space theory: a basic tool for analysing structures at different scales. Journal of Applied Statistics, 21(2), pp. 224-270, 1994.

12 Scale-invariant feature detection
Based on the above idea, Lindeberg (1998) proposed a detector for blob-like features that searches for scale-space extrema of a scale-normalized LoG. T. Lindeberg. Feature detection with automatic scale selection. International Journal of Computer Vision, 30(2), pp. 79-116, 1998.

13 Scale-invariant feature detection

14 Recall: Edge Detection
(Figure: signal f filtered with the derivative of a Gaussian.) Edge = maximum of the first derivative. Source: S. Seitz

15 Recall: Edge Detection
(Figure: signal f filtered with the second derivative of a Gaussian, the Laplacian.) Edge = zero crossing of the second derivative. Source: S. Seitz

16 From edges to blobs
Edge = ripple. Blob = superposition of two edges (two ripples). Spatial selection: the magnitude of the Laplacian response will achieve a maximum at the center of the blob, provided the scale of the Laplacian is “matched” to the scale of the blob.

17 Scale selection
We want to find the characteristic scale of the blob by convolving it with Laplacians at several scales and looking for the maximum response. However, the Laplacian response decays as scale increases. (Figure: original signal of radius 8, filtered with increasing σ.) Why does this happen?

18 Scale normalization The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases.

19 Scale normalization The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases. To keep the response the same (scale-invariant), we must multiply the Gaussian derivative by σ. The Laplacian is the second derivative of the Gaussian, so it must be multiplied by σ2.
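The decay and its fix can be checked numerically in 1-D. The sketch below (NumPy assumed; `dog_kernel` is an illustrative helper, not code from the lecture) convolves a perfect step with Gaussian derivative kernels: the peak response is about 1/(σ√(2π)), so it decays like 1/σ, while σ times the peak stays constant.

```python
import numpy as np

def dog_kernel(sigma):
    # First derivative of a 1-D Gaussian, sampled on an integer grid
    x = np.arange(-int(4 * sigma), int(4 * sigma) + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -x / sigma**2 * g

step = np.zeros(401)
step[200:] = 1.0  # perfect step edge

peaks = {}
for sigma in (2.0, 4.0, 8.0):
    resp = np.convolve(step, dog_kernel(sigma), mode='same')
    # Peak magnitude decays like 1/sigma, but sigma * peak stays
    # approximately constant at 1/sqrt(2*pi)
    peaks[sigma] = np.abs(resp).max()
```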

20 Effect of scale normalization
(Figure: original signal; unnormalized Laplacian response; scale-normalized Laplacian response with its maximum marked.)

21 Blob detection in 2D Laplacian: Circularly symmetric operator for blob detection in 2D.

22 Blob detection in 2D Laplacian: Circularly symmetric operator for blob detection in 2D, ∇²g = ∂²g/∂x² + ∂²g/∂y². Scale-normalized: ∇²_norm g = σ² (∂²g/∂x² + ∂²g/∂y²).

23 Scale selection At what scale does the Laplacian achieve a maximum response for a binary circle of radius r? (Figure: image of a circle of radius r and its Laplacian response.)

24 Scale selection The 2D LoG is given (up to scale) by (x² + y² − 2σ²) e^(−(x² + y²)/(2σ²)).
For a binary circle of radius r, the LoG achieves a maximum at σ = r/√2. (Figure: Laplacian response at the circle center as a function of scale σ, peaking at σ = r/√2.)
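The characteristic scale σ = r/√2 is easy to verify numerically. A minimal sketch, assuming SciPy is available, scans scales for a binary disk of radius 8 and reads off the scale that maximizes the scale-normalized response at the center:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

r = 8
yy, xx = np.mgrid[-64:65, -64:65]
disk = (xx**2 + yy**2 <= r**2).astype(float)  # binary circle of radius r

# Scale-normalized LoG magnitude at the disk center, over a range of scales
sigmas = np.linspace(2.0, 12.0, 101)
center_resp = np.array([sigma**2 * abs(gaussian_laplace(disk, sigma)[64, 64])
                        for sigma in sigmas])
best_sigma = sigmas[center_resp.argmax()]  # close to r / sqrt(2) ~ 5.66
```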

25 Blob detection in 2D: scale selection
Laplacian-of-Gaussian = “blob” detector. (Figure: LoG filters at several scales matched to blobs in images.) Bastian Leibe

26 Characteristic scale We define the characteristic scale as the scale that produces a peak of the LoG response. characteristic scale T. Lindeberg (1998). "Feature detection with automatic scale selection." International Journal of Computer Vision, 30 (2): pp

27 Scale-space blob detector
Convolve the image with scale-normalized LoG at several scales. Find the maxima of squared LoG response in scale-space.
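The two steps above can be sketched directly (SciPy assumed; `detect_blobs` is an illustrative helper, not the lecture's code): build a stack of squared scale-normalized LoG responses, then keep points that dominate their 3×3×3 space-scale neighborhood.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def detect_blobs(img, sigmas, thresh=0.1):
    # 1. Scale-normalized LoG at several scales, squared, stacked (S, H, W)
    cube = np.stack([(s**2 * gaussian_laplace(img, s))**2 for s in sigmas])
    # 2. Keep points that are maxima of their 3x3x3 space-scale neighborhood
    peaks = (cube == maximum_filter(cube, size=3)) & (cube > thresh)
    return [(x, y, sigmas[i]) for i, y, x in zip(*np.nonzero(peaks))]
```

On a synthetic disk of radius r, the detector should fire near the disk center at a scale near r/√2.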

28 Scale-space blob detector

29 Example Original image at ¾ the size Kristen Grauman

30 Original image at ¾ the size
Kristen Grauman

31 Kristen Grauman

32 Local maximum in space and scale in the reduced image
Kristen Grauman

33 Local maximum in space and scale in the original image
Kristen Grauman

34 Kristen Grauman

35 Kristen Grauman

36 Scale invariant interest points
Interest points are local maxima in both position and scale (scanning scales s1, s2, s3, s4, s5) → List of (x, y, σ)

37 Scale-space blob detector: Example

38 Scale-space blob detector: Example

39 Scale-space blob detector: Example

40 Efficient implementation
Approximating the LoG with a difference of Gaussians: G(x, y, kσ) − G(x, y, σ) ≈ (k − 1) σ² ∇²G, i.e. the difference of Gaussians approximates the scale-normalized Laplacian up to the constant factor (k − 1). We have studied this topic in edge detection.
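How good the DoG approximation is can be checked numerically. A sketch, assuming SciPy is available: filter a smooth test image with both operators and compare the response fields, which should be nearly proportional for k close to 1.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

rng = np.random.default_rng(0)
img = gaussian_filter(rng.standard_normal((128, 128)), 2.0)  # smooth test image

s, k = 3.0, 1.25
dog = gaussian_filter(img, k * s) - gaussian_filter(img, s)   # DoG response
log = (k - 1) * s**2 * gaussian_laplace(img, s)               # (k-1) * normalized LoG

# The two response fields should be almost perfectly correlated
corr = np.corrcoef(dog.ravel(), log.ravel())[0, 1]
```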

41 Efficient implementation
Divide each octave into an equal number K of intervals such that k = 2^(1/K), so the scale doubles after K steps. Implementation by a Gaussian pyramid. D. G. Lowe. "Distinctive image features from scale-invariant keypoints." International Journal of Computer Vision, 60(2), pp. 91-110, 2004.
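The scale layout of such a pyramid can be sketched in a few lines (the base scale 1.6 below is only an illustrative value; `pyramid_sigmas` is a hypothetical helper):

```python
import numpy as np

def pyramid_sigmas(sigma0, K, octaves):
    # Scale of interval i in octave o: sigma0 * 2**(o + i/K); after K
    # intervals the scale has exactly doubled (one full octave).
    return [[sigma0 * 2 ** (o + i / K) for i in range(K + 1)]
            for o in range(octaves)]

levels = pyramid_sigmas(1.6, 3, 2)
```

The last scale of each octave equals the first scale of the next one, which is what lets the pyramid be built by repeated downsampling.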

42 Find local maxima in position and scale
The blob detector is distinct from the Harris corner detector. → List of (x, y, s) K. Grauman, B. Leibe

43 Results: Difference-of-Gaussian
K. Grauman, B. Leibe

44 The Harris-Laplace detector
It combines the Harris operator for corner-like structures with the scale-space selection mechanism of DoG. Two scale spaces are built: one for the Harris corners and one for the blob detector (DoG). A key point is a Harris corner with a simultaneously maximal DoG response at the same scale. It provides fewer key points than DoG alone due to the extra constraint. K. Mikolajczyk and C. Schmid, Scale and Affine invariant interest point detectors, International Journal of Computer Vision, 60(1):63-86, 2004.

45 Local features: main components
Step 1: feature detection Step 2: feature description: describe neighborhood compactly Step 3: image matching

46 Geometric transformations
e.g. scale, translation, rotation

47 Photometric transformations
Figure from T. Tuytelaars ECCV 2006 tutorial

48 Raw patches as local descriptors
The simplest way to describe the neighborhood around an interest point is to write down the list of intensities as a feature vector. But this is very sensitive to even small shifts and rotations.

49 Gradient orientations as descriptors
Gradient magnitude is affected by illumination scaling (e.g. multiplication by a constant) but the gradient direction is not.

50 Gradient orientations as descriptors
Gradient direction depends on the smoothing scale. Blurred edges generate new orientations

51 Gradient orientations as descriptors
Orientation fields are quite characteristic of particular arrangements (e.g. textures).

52 Building Orientation Representations
We would like to represent a pattern in an image patch
to detect things in images
to match points between images
Desirable features:
representation doesn't change much if the center is slightly wrong (use histograms)
representation doesn't change much if the size is slightly wrong (use histograms)
representation is distinctive (break the window into boxes and describe each one separately)
representation doesn't change much if the patch gets brighter/darker (use gradient orientations)
large gradients are more important than small gradients (weight the histogram of orientations by the gradient magnitude)

53 SIFT descriptor [Lowe 2004]
Use histograms to bin pixels within sub-patches according to their orientation (binned over [0, 2π)).

54 SIFT Features

55 SIFT descriptor Steve Seitz

56 Orientation ambiguity
We have to assign a unique orientation to each keypoint: Create the histogram of local gradient directions (over [0, 2π)) in the patch. Assign to the patch the orientation of the peak of the smoothed histogram.
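The two steps can be sketched as follows (NumPy assumed; `dominant_orientation` is an illustrative helper): bin gradient directions into a magnitude-weighted histogram, smooth it circularly, and return the center of the peak bin.

```python
import numpy as np

def dominant_orientation(patch, nbins=36):
    # Gradients by finite differences; angles mapped into [0, 2*pi)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Histogram of directions, weighted by gradient magnitude
    hist, edges = np.histogram(ang, bins=nbins, range=(0, 2 * np.pi),
                               weights=mag)
    # Circular smoothing before taking the peak
    hist = np.convolve(np.r_[hist[-1], hist, hist[0]],
                       [0.25, 0.5, 0.25], mode='valid')
    peak = hist.argmax()
    return (edges[peak] + edges[peak + 1]) / 2
```

On a horizontal intensity ramp the dominant orientation should come out near 0; on a vertical ramp, near π/2.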

57 SIFT descriptor [Lowe 2004]
Extraordinarily robust matching technique
Can handle changes in viewpoint: up to about 60 degrees of out-of-plane rotation
Can handle significant changes in illumination: sometimes even day vs. night (below)
Fast and efficient; can run in real time
Lots of code available
Steve Seitz

58 Histogram of oriented gradients (HOG) [Dalal 2005]
Variant of SIFT
Only nearby gradients are considered for the histogram computation
Rather than normalizing gradient contributions over the whole neighborhood, we normalize with respect to nearby gradients only
A single gradient may contribute to several histograms
Steve Seitz

59 HOG features

60 Matching local features

61 Matching local features
To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD). Simplest approach: compare them all, take the closest (or the closest k, or all within a thresholded distance).

62 Ambiguous matches (Image 1 vs. Image 2)
At what SSD value do we have a good match? To add robustness to matching, we can consider the ratio: distance to best match / distance to second-best match. If low, the first match looks good; if high, the match could be ambiguous.
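The ratio test can be sketched in a few lines (NumPy assumed; `match_descriptors` is an illustrative helper; Euclidean distance gives the same ordering as SSD):

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.8):
    # d1: (N, D), d2: (M, D) descriptor arrays, M >= 2
    matches = []
    for i, d in enumerate(d1):
        dist = np.linalg.norm(d2 - d, axis=1)   # distance to every d2 row
        j, j2 = np.argsort(dist)[:2]            # best and second-best match
        if dist[j] < ratio * dist[j2]:          # ratio test
            matches.append((i, j))
    return matches
```

A descriptor with one clearly closest counterpart passes the test; one that is roughly equidistant from two candidates is rejected as ambiguous.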

63 From scale invariance to affine invariance
For many problems, it is important to find features that are covariant under large viewpoint changes. The projective distortion cannot be corrected locally because of the small number of pixels, but a local affine approximation is usually sufficient.

64 From scale invariance to affine invariance
Affine adaptation of scale invariant detectors. Find local regions where an ellipse can be reliably and repeatedly extracted purely from local image properties.

65 Affine adaptation Recall:
We can visualize M as an ellipse with axis lengths determined by the eigenvalues and orientation determined by R: the long axis, of length (λmin)^(-1/2), points along the direction of slowest change, and the short axis, of length (λmax)^(-1/2), along the direction of fastest change. Ellipse equation: [u v] M [u v]ᵀ = const.
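The axis lengths follow directly from the eigenvalues of M. A small sketch, with a hypothetical second moment matrix chosen only for illustration:

```python
import numpy as np

# Hypothetical second moment matrix, for illustration only
M = np.array([[4.0, 1.0],
              [1.0, 2.0]])

evals, evecs = np.linalg.eigh(M)   # eigenvalues ascending: lambda_min, lambda_max
axis_slow = evals[0] ** -0.5       # long axis: direction of slowest change
axis_fast = evals[1] ** -0.5       # short axis: direction of fastest change
```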

66 Affine adaptation The “covarying” of Harris corner detector ellipse may be viewed as the “characteristic shape” of a region. We can normalize the region by transforming the ellipse into a unit circle. The normalized regions may be detected under any affine transformation.

67 Affine adaptation Problem: the second moment “window” determined by weights w(x,y) must match the characteristic shape of the region. Solution: iterative approach Use a circular window to compute the second moment matrix. Based on the eigenvalues, perform affine adaptation to find an ellipse-shaped window. Recompute second moment matrix in the ellipse and iterate.

69 Iterative affine adaptation
K. Mikolajczyk and C. Schmid, Scale and Affine invariant interest point detectors, International Journal of Computer Vision, 60(1):63-86, 2004.

70 Orientation ambiguity
There is no unique transformation from an ellipse to a unit circle: we keep the orientation of the peak in the histogram of gradients.

71 Affine adaptation example
Scale-invariant regions (blobs)

72 Affine adaptation example
Affine-adapted blobs

73 Summary: Feature extraction
Extract affine regions → Normalize regions → Eliminate rotational ambiguity → Compute appearance descriptors, e.g. SIFT (Lowe ’04)

74 Invariance vs. covariance
Invariance: features(transform(image)) = features(image) Covariance: features(transform(image)) = transform(features(image)) Covariant detection => invariant description

75 Neighborhoods and descriptors - Crucial Points
Algorithms to find neighborhoods
Represented by location, scale and orientation
Neighborhood is covariant: if the image is translated, scaled, or rotated, then the neighborhood is translated, scaled, or rotated
Important property for matching
Once found, describe them with local orientation histograms, comparable to HOG
Histograms reduce the effect of poor estimates of location and size
Gradient orientations are not affected by intensity scaling
Orientations with larger magnitude are more important
Normalized differently (from HOG)
Affine covariant versions are available

