Distinctive Image Features from Scale-Invariant Keypoints
Ronnie Bajwa, Sameer Pawar*
* Adapted from slides found online by Michael Kowalski, Lehigh University.

Goal
Extract distinctive, invariant features from an image that can be used for reliable object recognition and many other applications.

Key Requirements for a Good Feature
- Highly distinctive, with a low probability of mismatch
- Easy to extract
- Invariant to:
  - scale and rotation
  - changes in illumination
  - slight changes in viewing angle
  - image noise
  - clutter and occlusion

Motivation for SIFT
- Earlier methods:
  - The Harris corner detector finds locations in the image with large gradients in all directions, at a predetermined scale, but is sensitive to changes in image scale
  - No method was fully affine invariant
- Although the SIFT approach is not fully affine invariant, it allows for considerable affine change
- SIFT also allows for changes in 3D viewpoint

Brief History
- Moravec (1981): corner detector
- Harris (1988): selects locations that have large gradients in all directions (corners); invariant to image rotation and to slight affine intensity change, but has problems with scale change
- Zhang (1995): introduced a correlation window to match images across large-range motion
- Schmid (1997): suggested the use of local feature matching for image recognition; used a rotationally invariant descriptor of the local image region; multiple feature matches accomplish recognition under occlusion and clutter

Harris Corner Detector
Regions are classified by the eigenvalues λ1, λ2 of the second-moment matrix; the axes of the corresponding ellipse, (λmax)^(-1/2) and (λmin)^(-1/2), point along the directions of fastest and slowest change, respectively:
- "Flat": λ1 and λ2 are both small (~0)
- "Edge": λ2 >> λ1 (or vice versa)
- "Corner": λ1 ~ λ2, both large
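
This classification can be computed directly. Below is a minimal numpy/scipy sketch of the eigenvalue test (the helper name, smoothing scale, and gradient operator are my own illustrative choices, not fixed by the slide):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_eigenvalues(img, sigma=1.0):
    """Per-pixel eigenvalues of the Gaussian-weighted second-moment matrix."""
    ix = sobel(img, axis=1)  # horizontal gradient
    iy = sobel(img, axis=0)  # vertical gradient
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[ixx, ixy], [ixy, iyy]]
    mean = (ixx + iyy) / 2
    root = np.sqrt(((ixx - iyy) / 2) ** 2 + ixy ** 2)
    lam_max, lam_min = mean + root, mean - root
    # Flat: both small; edge: lam_max >> lam_min; corner: both large
    return lam_max, lam_min
```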

SIFT Algorithm Overview
A filtered approach:
1. Scale-space extrema detection: identify potential points invariant to scale and orientation, using a difference-of-Gaussian function
2. Keypoint localization: improve the location estimate by fitting a quadratic, and threshold the extrema to filter out insignificant points
3. Orientation assignment: assign an orientation to each keypoint based on the local gradient of the neighboring pixels
4. Keypoint descriptor construction: build a feature vector from the gradients of the local neighborhood
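
The whole pipeline sketched in these four steps is available off the shelf. A minimal usage sketch with OpenCV (assumes opencv-python 4.4 or later, where SIFT is in the main module, and a hypothetical input file image.jpg):

```python
import cv2

img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input file
sift = cv2.SIFT_create()  # steps 1-4 happen inside detectAndCompute
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each keypoint carries location (kp.pt), scale (kp.size), and orientation (kp.angle);
# descriptors is an N x 128 float32 array, one row per keypoint.
print(len(keypoints), descriptors.shape)
```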

Keypoint Selection: Scale Space
We express the image at different scales by filtering it with a Gaussian kernel: L(x, y, σ) = G(x, y, σ) * I(x, y).

Keypoint Selection: Why DoGs?
- Lindeberg (1994) and Mikolajczyk (2002) found that the maxima and minima of the scale-normalized Laplacian, σ²∇²G, provide the most stable scale-invariant features
- We can use the smoothed images to approximate this: D(x, y, σ) = L(x, y, kσ) - L(x, y, σ) ≈ (k - 1) σ²∇²G * I
- Efficient to compute: the smoothed images L are needed later anyway, so D can be computed by simple image subtraction

Back to Picking Keypoints
1. Upsample (double the size of) the original image
2. Compute the smoothed images L using different scales σ for the entire octave
3. Compute DoG images from adjacent scales for the entire octave
4. Isolate keypoints in each octave by detecting extrema in the DoG images relative to neighboring pixels and scales
5. Subsample the image with blur 2σ from the current octave and repeat steps 2-4 for the next octave
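
A compact sketch of steps 2-5 in numpy/scipy, assuming a grayscale float image with values in [0, 1]. The parameters (σ = 1.6, 3 scales per octave) are Lowe's defaults, but the code itself is illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def dog_octave(img, sigma=1.6, s=3):
    """Gaussian stack and the DoG images for one octave (s scales per octave)."""
    k = 2.0 ** (1.0 / s)
    gauss = [gaussian_filter(img, sigma * k ** i) for i in range(s + 3)]
    dogs = [g2 - g1 for g1, g2 in zip(gauss, gauss[1:])]
    return gauss, dogs

def local_extrema(dogs, thresh=0.03):
    """(x, y, scale) triples that are extrema among their 26 scale-space neighbors."""
    points = []
    for i in range(1, len(dogs) - 1):
        below, cur, above = dogs[i - 1], dogs[i], dogs[i + 1]
        for y in range(1, cur.shape[0] - 1):
            for x in range(1, cur.shape[1] - 1):
                patch = np.stack([d[y - 1:y + 2, x - 1:x + 2]
                                  for d in (below, cur, above)])
                v = cur[y, x]
                if abs(v) > thresh and (v == patch.max() or v == patch.min()):
                    points.append((x, y, i))
    return points

def next_octave_base(gauss, s=3):
    """Start of the next octave: the image with twice the initial blur, subsampled."""
    return zoom(gauss[s], 0.5)
```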

Visual representation of process

Visual representation of process (cont.)

A More Visual Representation: original image and starting image

A More Visual Representation (cont.) First Octave of L images

A More Visual Representation (cont.): second, third, and fourth octaves

A More Visual Representation (cont.) First Octave difference-of-Gaussians

A More Visual Representation (cont.)

Scale Space Sampling
- How many fine scales per octave?
- Extrema can be arbitrarily close together, but very close extrema are unstable
- After subsampling, and before computing the smoothed images of the octave, a prior smoothing of σ = 1.6 is applied
- To compensate for the loss of higher spatial frequencies, the original image is doubled in size

Accurate Keypoint Localization
- Difference-of-Gaussian local extrema detection gives approximate locations for the keypoints
- Originally these approximations were used directly
- For improved matching and stability, a fit to a 3D quadratic function is used

The Taylor Series Expansion
Take the Taylor series expansion of the scale-space function D(x, y, σ), keeping terms up to quadratic, with the origin shifted to the sample point:
D(x) = D + (∂D/∂x)ᵀ x + ½ xᵀ (∂²D/∂x²) x
where x = (x, y, σ)ᵀ is the offset from the sample point. To find the location of the extremum, take the derivative and set it to 0:
x̂ = -(∂²D/∂x²)⁻¹ (∂D/∂x)
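
A hedged numpy sketch of that quadratic fit, using finite differences over a stacked DoG array (the indexing convention and function name are mine):

```python
import numpy as np

def refine_extremum(d, x, y, s):
    """Sub-pixel refinement of a DoG extremum; d is indexed [scale, row, col].

    Returns the offset x_hat and the interpolated value D(x_hat), which is
    later used for the 0.03 contrast test.
    """
    # Gradient of D by central differences
    grad = 0.5 * np.array([d[s, y, x + 1] - d[s, y, x - 1],
                           d[s, y + 1, x] - d[s, y - 1, x],
                           d[s + 1, y, x] - d[s - 1, y, x]])
    # Hessian of D by finite differences
    dxx = d[s, y, x + 1] - 2 * d[s, y, x] + d[s, y, x - 1]
    dyy = d[s, y + 1, x] - 2 * d[s, y, x] + d[s, y - 1, x]
    dss = d[s + 1, y, x] - 2 * d[s, y, x] + d[s - 1, y, x]
    dxy = 0.25 * (d[s, y + 1, x + 1] - d[s, y + 1, x - 1]
                  - d[s, y - 1, x + 1] + d[s, y - 1, x - 1])
    dxs = 0.25 * (d[s + 1, y, x + 1] - d[s + 1, y, x - 1]
                  - d[s - 1, y, x + 1] + d[s - 1, y, x - 1])
    dys = 0.25 * (d[s + 1, y + 1, x] - d[s + 1, y - 1, x]
                  - d[s - 1, y + 1, x] + d[s - 1, y - 1, x])
    hess = np.array([[dxx, dxy, dxs], [dxy, dyy, dys], [dxs, dys, dss]])
    offset = -np.linalg.solve(hess, grad)        # x_hat = -H^(-1) * gradient
    value = d[s, y, x] + 0.5 * grad.dot(offset)  # D(x_hat)
    return offset, value
```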

Thresholding Keypoints (part 1)
- The function value at the extremum is used to reject unstable extrema with low contrast
- Evaluate D(x̂) = D + ½ (∂D/∂x)ᵀ x̂
- An absolute value less than 0.03 at the extremum location (with image values in [0, 1]) results in the extremum being discarded

Thresholding Keypoints (part 2)
- The difference-of-Gaussian function responds strongly along edges
- Some locations along edges are poorly determined and become unstable when even small amounts of noise are added
- These locations have a large principal curvature across the edge but a small principal curvature in the perpendicular direction
- Therefore we compute the principal curvatures at each location and compare the two

Computing the Principal Curvatures
- The curvatures are computed from the 2x2 Hessian matrix H = [[Dxx, Dxy], [Dxy, Dyy]], estimated at the location and scale of the keypoint
- The eigenvalues of H are proportional to the principal curvatures
- We are not concerned with the actual eigenvalues, just their ratio: with Tr(H) = Dxx + Dyy and Det(H) = Dxx·Dyy - Dxy², a keypoint is kept only if Tr(H)²/Det(H) < (r + 1)²/r
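
The comparison reduces to one inequality on the trace and determinant, so the eigenvalues never need to be computed explicitly. A small sketch (Lowe's paper uses r = 10; the finite-difference stencil is my own):

```python
def passes_edge_test(d, x, y, r=10.0):
    """Keep a keypoint only if its principal-curvature ratio is below r.

    d is a single DoG image; the Hessian entries come from finite differences.
    """
    dxx = d[y, x + 1] - 2 * d[y, x] + d[y, x - 1]
    dyy = d[y + 1, x] - 2 * d[y, x] + d[y - 1, x]
    dxy = 0.25 * (d[y + 1, x + 1] - d[y + 1, x - 1]
                  - d[y - 1, x + 1] + d[y - 1, x - 1])
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures have opposite signs: reject outright
        return False
    return tr * tr / det < (r + 1.0) ** 2 / r
```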

Stages of keypoint selection

Assigning an Orientation
- We finally have a keypoint that we are going to keep
- The next step is assigning an orientation to the keypoint, which is used to make the matching technique invariant to rotation

Assigning an Orientation (cont.)
- The Gaussian-smoothed image L with the closest scale is chosen (for scale invariance)
- Points in the region around the keypoint are selected, and the magnitude and orientation of the gradient are calculated for each
- An orientation histogram with 36 bins is formed; each sample is added to the appropriate bin, weighted by its gradient magnitude and by a Gaussian-weighted circular window with σ equal to 1.5 times the scale of the keypoint (see the sketch below)
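
A minimal sketch of the histogram construction, assuming L is a 2D numpy array and (x, y) a keypoint at a given scale; the window radius is my own illustrative choice:

```python
import numpy as np

def orientation_histogram(L, x, y, scale, radius=8, bins=36):
    """36-bin gradient orientation histogram around (x, y).

    Each sample is weighted by its gradient magnitude and a Gaussian window
    with sigma = 1.5 * scale, as described above.
    """
    sigma = 1.5 * scale
    hist = np.zeros(bins)
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            yy, xx = y + j, x + i
            if not (0 < yy < L.shape[0] - 1 and 0 < xx < L.shape[1] - 1):
                continue
            dx = L[yy, xx + 1] - L[yy, xx - 1]
            dy = L[yy + 1, xx] - L[yy - 1, xx]
            mag = np.hypot(dx, dy)
            theta = np.degrees(np.arctan2(dy, dx)) % 360.0
            weight = np.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
            hist[int(theta // (360.0 / bins)) % bins] += weight * mag
    return hist  # the dominant orientation is the histogram peak
```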

Assigning an Orientation (cont.)
- The highest peak in the orientation histogram is found, along with any other peaks within 80% of the highest peak
- The 3 histogram values closest to each peak are used to interpolate (by fitting a parabola) a more accurate peak position
- At this point each keypoint has 4 dimensions: x location, y location, scale, and orientation

Keypoint Descriptor
- Calculated over a region around the keypoint, rather than directly from the keypoint, for robustness
- As before, magnitudes and orientations are calculated for points in the region around the keypoint, using the L of the nearest scale
- To ensure orientation invariance, the gradient orientations and the coordinates of the descriptor are rotated relative to the orientation of the keypoint
- Provides invariance to changes in illumination and 3D camera viewpoint

Visual Representation of Keypoint Descriptor

Keypoint Descriptor (cont.)
- The magnitude of each point is weighted by a Gaussian with σ equal to one half the width of the descriptor window
  - Prevents sudden changes in the descriptor due to small changes in the position of the window
  - Gives less weight to gradients far from the keypoint
- Samples are divided into 4x4 subregions around the keypoint, which allows a sample to shift by up to 4 positions while still being included in the same histogram

Keypoint Descriptor (cont.)
- Avoiding boundary effects between histograms: trilinear interpolation is used to distribute the value of each sample's gradient into adjacent histogram bins, with weight 1 - d, where d is the distance of the sample from the center of a bin
- Vector normalization, done at the end (see the sketch below):
  - The entire vector is normalized to unit length, ensuring invariance to affine illumination change
  - To combat non-linear illumination changes (e.g. camera saturation), values in the feature vector are thresholded to no larger than 0.2 and the vector is then renormalized
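
The two-stage normalization is only a few lines. A sketch for a single 128-D descriptor vector (clip value 0.2 as stated above):

```python
import numpy as np

def normalize_descriptor(vec, clip=0.2):
    """Unit-normalize, clip at 0.2, renormalize (illumination invariance)."""
    v = vec / max(np.linalg.norm(vec), 1e-7)  # cancels affine illumination change
    v = np.minimum(v, clip)                   # damps non-linear effects (saturation)
    return v / max(np.linalg.norm(v), 1e-7)
```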

We Now Have Features
Up to this point we have:
- Found rough approximations for features by looking at the difference-of-Gaussians
- Localized the keypoints more accurately
- Removed poor keypoints
- Determined the orientation of each keypoint
- Calculated a 128-dimensional feature vector for each keypoint
What do we do now?

Matching Keypoints Between Images
A tale of two images (or more):
- One image is the training sample of what we are looking for
- The other image is the world picture that might contain instances of the training sample
- Both images have features associated with them, across different octaves
How do we match features between the two?

Matching Keypoints Between Images (cont.)
- Nearest-neighbor algorithm (Euclidean distance): independently match all keypoints in all octaves of one image with all keypoints in all octaves of the other image
- How do we handle features that have no correct match in the other image (background noise, clutter)?
  - A global threshold on distance (did not perform well)
  - The ratio of the distance to the closest neighbor to the distance to the second-closest neighbor

Matching Keypoints Between Images (cont.)
Thresholding the distance ratio at 0.8:
- Eliminates 90% of false matches
- Eliminates less than 5% of correct matches
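
A brute-force sketch of nearest-neighbor matching with this ratio test, for two descriptor arrays of shape (N, 128) and (M, 128); not optimized, just the logic:

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Indices (i, j) of matches that pass Lowe's distance-ratio test."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)  # Euclidean distance to every row
        j1, j2 = np.argsort(dists)[:2]             # closest and second-closest
        if dists[j1] < ratio * dists[j2]:          # keep only distinctive matches
            matches.append((i, j1))
    return matches
```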

Efficient Nearest-Neighbor Indexing
No algorithms are known that find exact nearest neighbors in high-dimensional spaces more efficiently than exhaustive search; for example, an exact k-dimensional (k-d) tree search costs roughly O(2^k log n), which is no better than brute force for k = 128.

Efficient Nearest-Neighbor Indexing (cont.)
- Use an approximate algorithm called Best-Bin-First (BBF), which operates in feature space:
  - Bins are searched in order of increasing distance from the query, using a heap-based priority queue
  - Returns the closest neighbor with high probability
  - The search is cut off after checking the 200 nearest-neighbor candidates
- Works well because we only keep matches in which the nearest neighbor is closer than 0.8 times the distance to the second-nearest neighbor
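
In practice an approximate k-d tree index plays the role of BBF. A sketch using OpenCV's FLANN matcher (randomized k-d trees rather than Lowe's original BBF, but the same idea; des1 and des2 are assumed to come from cv2.SIFT_create()):

```python
import cv2

index_params = dict(algorithm=1, trees=5)  # 1 = FLANN_INDEX_KDTREE
search_params = dict(checks=200)           # cap on candidates examined, cf. BBF's 200
flann = cv2.FlannBasedMatcher(index_params, search_params)
pairs = flann.knnMatch(des1, des2, k=2)    # two nearest neighbors per descriptor
good = [m for m, n in pairs if m.distance < 0.8 * n.distance]  # ratio test
```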

Clustering: Hough Transform
- Small or highly-occluded objects: 3 feature matches are sufficient, but up to 99% of matches may be outliers, and RANSAC does not perform well with such a large fraction of outliers
- The Hough transform uses each feature match to vote for all object poses consistent with it, predicting an approximate model for the similarity transformation
- Bin sizes must be large to account for large error bounds: 30 degrees for orientation, a factor of 2 for scale, and 0.25 times the maximum projected training image dimension for location
- Each keypoint votes for the two closest bins in each dimension, creating 16 entries for each hypothesis (see the sketch below)
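
A toy sketch of the voting step, assuming each match has already been reduced to a pose hypothesis (translation, rotation difference in degrees, scale ratio); the location bin size is passed in, and the vote-for-two-closest-bins refinement is omitted for brevity:

```python
import numpy as np
from collections import defaultdict

def hough_cluster(poses, loc_bin):
    """Return the largest cluster of matches voting for a consistent pose.

    poses: list of (dx, dy, dtheta_deg, scale_ratio), one per feature match.
    Bin sizes follow the slide: 30 degrees for orientation, factor of 2 for scale.
    """
    votes = defaultdict(list)
    for k, (dx, dy, dtheta, dscale) in enumerate(poses):
        key = (int(dtheta // 30) % 12,       # orientation bin
               int(round(np.log2(dscale))),  # scale bin
               int(dx // loc_bin),           # location bins
               int(dy // loc_bin))
        votes[key].append(k)
    return max(votes.values(), key=len)
```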

Model Verification: Affine Transformation
Geometric verification of the clusters: each cluster is checked by fitting an affine transformation between the model and the image by least squares, and matches inconsistent with the fit are discarded (see the sketch below).
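
A least-squares affine fit from matched point pairs, as a numpy sketch (at least 3 pairs are needed to determine the 6 parameters):

```python
import numpy as np

def fit_affine(model_pts, image_pts):
    """Solve A p = b for p = [m1, m2, m3, m4, tx, ty] by least squares."""
    A, b = [], []
    for (x, y), (u, v) in zip(model_pts, image_pts):
        A.append([x, y, 0, 0, 1, 0]); b.append(u)
        A.append([0, 0, x, y, 0, 1]); b.append(v)
    p, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    M, t = p[:4].reshape(2, 2), p[4:]
    return M, t  # image_pt ~ M @ model_pt + t; inconsistent matches are then dropped
```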

Results (Object Recognition)

Results (cont.)

Questions?