Recognizing Specific Objects: Matching with SIFT. Original suggestion: Lowe, 1999, 2004.

Presentation transcript:

Recognizing Specific Objects: Matching with SIFT. Original suggestion: Lowe, 1999, 2004.

Applications of SIFT
– Object recognition
– Object categorization
– Location recognition
– Robot localization
– Image retrieval
– Image panoramas

Object Recognition Object Models

Object Categorization

Location recognition

Robot Localization

Map continuously built over time

SIFT Computation – Steps
(1) Scale-space extrema detection – extract scale- and rotation-invariant interest points (i.e., keypoints).
(2) Keypoint localization – determine the location and scale of each interest point; eliminate “weak” keypoints.
(3) Orientation assignment – assign one or more orientations to each keypoint.
(4) Keypoint descriptor – use local image gradients at the selected scale.
D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60(2):91-110, 2004.
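
As a concrete illustration of this four-stage pipeline, here is a minimal sketch using OpenCV's SIFT implementation (assuming opencv-python 4.4 or later, where SIFT lives in the main module; the image file name is a placeholder):

```python
import cv2

# Placeholder input image, loaded as grayscale
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
# detectAndCompute runs all four stages: scale-space extrema detection,
# keypoint localization and filtering, orientation assignment, and
# computation of the 128-dimensional descriptors.
keypoints, descriptors = sift.detectAndCompute(img, None)

print(len(keypoints), descriptors.shape)  # N keypoints, descriptors of shape (N, 128)
```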

4. Keypoint Descriptor (cont’d)
16 histograms × 8 orientations = 128 features
1. Take a 16×16 window around the detected interest point.
2. Divide it into a 4×4 grid of cells.
3. Compute an 8-bin orientation histogram in each cell.
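
To make the construction concrete, here is a toy version of this descriptor in NumPy. It is a sketch of steps 1-3 only and omits the Gaussian weighting, trilinear interpolation, and normalization of real SIFT:

```python
import numpy as np

def sift_like_descriptor(patch):
    """Toy 128-D descriptor from a 16x16 grayscale patch."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))        # image gradients
    mag = np.hypot(gx, gy)                           # gradient magnitude
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # orientation in [0, 2*pi)
    bins = (ang / (2 * np.pi) * 8).astype(int) % 8   # 8 orientation bins
    desc = np.zeros((4, 4, 8))
    for i in range(16):
        for j in range(16):
            # each pixel votes into the 8-bin histogram of its 4x4 cell
            desc[i // 4, j // 4, bins[i, j]] += mag[i, j]
    return desc.ravel()                              # 16 cells x 8 bins = 128
```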

Matching SIFT features
Given a feature in I1, how do we find the best match in I2?
1. Define a distance function that compares two descriptors.
2. Test all the features in I2 and pick the one with minimum distance.
[Figure: query image I1 and search image I2]

Matching SIFT features (cont’d)
[Figure: a feature f1 in I1 and its candidate match f2 in I2]

Matching SIFT features (cont’d)
Accept a match if SSD(f1, f2) < t. How do we choose t?

Matching SIFT features (cont’d)
A better distance measure is the ratio SSD(f1, f2) / SSD(f1, f2'), where:
– f2 is the best SSD match to f1 in I2
– f2' is the 2nd-best SSD match to f1 in I2
[Figure: f1 in I1 with its best match f2 and second-best match f2' in I2]

Matching SIFT features (cont’d)
Accept a match if SSD(f1, f2) / SSD(f1, f2') < t.
t = 0.8 has given good results in object recognition:
– 90% of false matches were eliminated.
– Less than 5% of correct matches were discarded.
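
A minimal sketch of this ratio test, assuming descriptors are stored as NumPy arrays of shape (N, 128) and (M, 128) (the function name is mine):

```python
import numpy as np

def ratio_test_match(desc1, desc2, t=0.8):
    """Match descriptors in desc1 against desc2 with Lowe's ratio test.
    Returns (i, j) index pairs for accepted matches."""
    matches = []
    for i, f1 in enumerate(desc1):
        ssd = np.sum((desc2 - f1) ** 2, axis=1)  # SSD to every feature in I2
        j, j2 = np.argsort(ssd)[:2]              # best and 2nd-best match
        if ssd[j] / ssd[j2] < t:                 # keep only unambiguous matches
            matches.append((i, j))
    return matches
```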

Variations of SIFT features: PCA-SIFT, SURF, GLOH

SURF: Speeded Up Robust Features
Speeds up computation through fast approximations of (i) the Hessian matrix and (ii) the descriptor, using “integral images”. What is an “integral image”?
Herbert Bay, Tinne Tuytelaars, and Luc Van Gool, “SURF: Speeded Up Robust Features”, European Conference on Computer Vision (ECCV), 2006.

Integral Image
The integral image I_Σ(x, y) of an image I(x, y) is the sum of all pixels of I inside the rectangle spanned by (0, 0) and (x, y):
  I_Σ(x, y) = Σ_{i ≤ x} Σ_{j ≤ y} I(i, j)
Using integral images, it takes only four array references to calculate the sum of the pixels over a rectangular region of any size.
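
A minimal sketch of both ideas (the function names are mine):

```python
import numpy as np

def integral_image(img):
    """I_sum(x, y): sum of all pixels in the rectangle from (0, 0) to (x, y)."""
    return img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1+1, c0:c1+1] from at most four integral-image lookups."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]          # subtract strip above the box
    if c0 > 0:
        total -= ii[r1, c0 - 1]          # subtract strip left of the box
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]      # add back the doubly-subtracted corner
    return total
```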

SURF: Speeded Up Robust Features (cont’d)
Approximate L_xx, L_yy, and L_xy with box filters, which can be computed very fast using integral images! (The box filters shown in the paper are 9×9 – good approximations for a Gaussian with σ = 1.2.)
[Figure: Gaussian second-derivative filters and their box-filter approximations]
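
As an illustration, here is one way to build two of these 9×9 box filters. The lobe layout follows the filters pictured in the SURF paper, but the exact indices are my reconstruction; each filter response reduces to a few box sums, which is where integral images pay off:

```python
import numpy as np

def surf_dyy_filter():
    """9x9 box approximation of the Gaussian second derivative L_yy:
    three 3-tall, 5-wide horizontal bands weighted +1, -2, +1."""
    f = np.zeros((9, 9))
    f[0:3, 2:7] = 1
    f[3:6, 2:7] = -2
    f[6:9, 2:7] = 1
    return f

def surf_dxy_filter():
    """9x9 box approximation of L_xy: four 3x3 blocks with alternating
    signs around the center pixel."""
    f = np.zeros((9, 9))
    f[1:4, 1:4] = 1     # top-left: positive lobe
    f[1:4, 5:8] = -1    # top-right: negative lobe
    f[5:8, 1:4] = -1    # bottom-left: negative lobe
    f[5:8, 5:8] = 1     # bottom-right: positive lobe
    return f
```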

SURF: Speeded Up Robust Features (cont’d)
In SIFT, images are repeatedly smoothed with a Gaussian and subsequently sub-sampled in order to achieve a higher level of the pyramid.

SURF: Speeded Up Robust Features (cont’d)
Alternatively, we can apply filters of increasing size to the original image. Thanks to integral images, filters of any size can be applied at exactly the same speed! (See Tuytelaars’ paper for details.)

SURF: Speeded Up Robust Features (cont’d)
Approximation of the Hessian H:
[Figure: Hessian entries using DoG vs. using box filters]

SURF: Speeded Up Robust Features (cont’d)
Instead of using different measures for selecting the location and the scale of interest points (e.g., Hessian and DoG in SIFT), SURF uses the determinant of the approximated Hessian to find both. The determinant elements must be weighted to obtain a good approximation:
  det(H_approx) = D_xx · D_yy – (0.9 · D_xy)²

SURF: Speeded Up Robust Features (cont’d) Once interest points have been localized both in space and scale, the next steps are: (1) Orientation assignment (2) Keypoint descriptor

SURF: Speeded Up Robust Features (cont’d)
Orientation assignment:
– Compute Haar wavelet responses (side length 4σ) in x and y over a circular neighborhood of radius 6σ around the interest point, where σ is the scale at which the point was detected; weight the responses with a Gaussian.
– The dominant orientation is estimated with a sliding window covering a 60° angle.
– Can be computed very fast using integral images!

SURF: Speeded Up Robust Features (cont’d)
Keypoint descriptor: take a square region of size 20σ around the keypoint and divide it into a 4×4 grid of sub-regions.
– Sum the wavelet responses d_x and d_y over each sub-region separately.
– To bring in information about the polarity of the intensity changes, also extract the sums of the absolute values |d_x| and |d_y|.
– Each sub-region contributes (Σd_x, Σd_y, Σ|d_x|, Σ|d_y|), so the feature vector size is 4 × 16 = 64.

SURF: Speeded Up Robust Features (cont’d)
SURF-128:
– The sums of d_x and |d_x| are computed separately for points where d_y < 0 and where d_y ≥ 0.
– Similarly, the sums of d_y and |d_y| are split according to the sign of d_x.
– More discriminatory!

SURF: Speeded Up Robust Features
Has been reported to be 3 times faster than SIFT, but less robust to illumination and viewpoint changes than SIFT.
K. Mikolajczyk and C. Schmid, “A Performance Evaluation of Local Descriptors”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.

About matching…
– Recognition can be done with as few as 3 matched features.
– Use the Hough transform to cluster matched features in pose space.
– Broad bins are needed, since each match constrains only 4 pose parameters while the full 3D pose has 6 degrees of freedom; each match votes for the 2 closest bins in each dimension.
– After the Hough transform finds clusters with at least 3 entries, verify them with an affine constraint.

Hough Transform Clustering
Create a 4D Hough transform (HT) space for each reference image:
1. Orientation bin = 30°
2. Scale bin = factor of 2
3. X location bin = 0.25 × ref image width
4. Y location bin = 0.25 × ref image height
If a keypoint “votes” for a reference image, tally its vote in the 4D HT space.
– This gives an estimate of location and pose.
– Keep a list of which keypoints vote for each bin.
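
A minimal voting sketch under these bin sizes. The match record format is hypothetical, and for simplicity each match votes for a single bin rather than the 2 closest bins per dimension as described above:

```python
import numpy as np
from collections import defaultdict

def hough_vote(matches, ref_w, ref_h):
    """Cluster matches in 4D pose space (orientation, scale, x, y).
    Each match is a dict holding the pose it predicts for the reference
    image: 'rot' (degrees), 'scale' (factor), 'x' and 'y' (pixels)."""
    bins = defaultdict(list)
    for m in matches:
        b = (int(m["rot"] % 360 // 30),        # 30-degree orientation bins
             int(round(np.log2(m["scale"]))),  # factor-of-2 scale bins
             int(m["x"] // (0.25 * ref_w)),    # x bins: quarter image width
             int(m["y"] // (0.25 * ref_h)))    # y bins: quarter image height
        bins[b].append(m)                      # remember which matches voted
    # keep only clusters with at least 3 consistent votes
    return {b: v for b, v in bins.items() if len(v) >= 3}
```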

Hough Transform Example (Simplified)
For the current view, color-code each feature match with the database image. If we align the database image at each matched feature, we can vote for the x position of the object’s center and for the object’s orientation (theta), based on all the poses that align.
[Figure: Hough space with axes “X position” and “Theta” (0-90°)]

Hough Transform Example (Simplified)
Assume there are 4 possible x locations and only 4 possible rotations (thetas). Then the Hough space can look like the diagram shown.
[Figure: database image, current item, and the resulting 4×4 Hough space over “X position” and “Theta”]

3. Feature matching
3D object recognition: solve for pose.
Affine transform of [x, y] to [u, v]:
  u = m1·x + m2·y + t_x
  v = m3·x + m4·y + t_y
Rewrite to solve for the transform parameters, stacking two rows per correspondence:
  [x y 0 0 1 0; 0 0 x y 0 1] · (m1, m2, m3, m4, t_x, t_y)^T = (u, v)^T
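
A least-squares sketch of this solve, needing at least 3 correspondences (the function name is mine):

```python
import numpy as np

def solve_affine(src, dst):
    """Solve for (m1, m2, m3, m4, tx, ty) mapping src (x, y) points
    to dst (u, v) points, stacking two equations per correspondence."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 0, 0, 1, 0])   # row for u
        A.append([0, 0, x, y, 0, 1])   # row for v
        b.extend([u, v])
    params, *_ = np.linalg.lstsq(np.asarray(A, float),
                                 np.asarray(b, float), rcond=None)
    return params
```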

3. Feature matching
3D object recognition: verify the model.
1) Discard outliers from the pose solution of the previous step.
2) Perform a top-down check for additional features.
3) Evaluate the probability that the match is correct.

Two View Geometry (simple cases)
In two cases the views are related by a homography:
1. The camera rotates around its focal point.
2. The scene is planar.
Then:
– Point correspondence forms a one-to-one mapping.
– Depth cannot be recovered.

Camera Rotation
For a pure rotation about the focal point, corresponding image points are related by the homography H = K R K⁻¹, where R is the 3×3 (non-singular) rotation matrix and K is the camera calibration matrix.

Planar Scenes
Intuitively: the mapping is a sequence of two perspectivities (scene plane → camera 1 image, scene plane → camera 2 image), each of which is a homography, so their composition is again a homography.
Algebraically: need to show that corresponding points satisfy p' ≅ H p for a single 3×3 matrix H.
[Figure: a planar scene viewed by Camera 1 and Camera 2]

Summary: Two Views Related by Homography
– Two images are related by a homography: a one-to-one mapping from p to p'.
– H contains 8 degrees of freedom.
– Given correspondences, each point determines 2 equations.
– 4 points are required to recover H.
– Depth cannot be recovered.

OpenCV packages
– findHomography
– FlannBasedMatcher: FLANN stands for Fast Library for Approximate Nearest Neighbors. It contains a collection of algorithms optimized for fast nearest-neighbor search in large datasets and for high-dimensional features; it uses k-d trees.
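
A minimal end-to-end sketch tying these pieces together (file names are placeholders; assumes opencv-python with SIFT available):

```python
import cv2
import numpy as np

img1 = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)   # reference object
img2 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)    # cluttered scene

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN matcher with k-d trees (algorithm index 1 = FLANN_INDEX_KDTREE)
matcher = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
knn = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in knn if m.distance < 0.8 * n.distance]  # ratio test

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # needs >= 4 matches
```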