Object Recognition using Local Descriptors
Javier Ruiz-del-Solar and Patricio Loncomilla
Center for Web Research, Universidad de Chile

Outline
Motivation & Recognition Examples
Dimensionality problems
Object Recognition using Local Descriptors
Matching & Storage of Local Descriptors
Conclusions

Motivation
Object recognition approaches based on local invariant descriptors (features) have become increasingly popular and have developed impressively in recent years.
Invariance against: scale, in-plane rotation, partial occlusion, partial distortion, and partial change of viewpoint.
The recognition process consists of two stages:
1. Scale-invariant local descriptors (features) of the observed scene are computed.
2. These descriptors are matched against descriptors of object prototypes already stored in a model database. These prototypes correspond to images of objects under different view angles.

Recognition Examples (1/2)

Recognition Examples (2/2)

Image Matching Examples (1/2)

Image Matching Examples (2/2)

Some applications
Object retrieval in multimedia databases (e.g. Web)
Image retrieval by similarity in multimedia databases
Robot self-localization
Binocular vision
Image alignment and matching
Movement compensation
…

However … there are some problems
Dimensionality problems:
A given image can produce ~100-1,000 descriptors of 128 components (real values).
The model database can contain up to 1,000-10,000 objects in some special applications.
=> large number of comparisons => large processing time => large database size
Main motivation of this talk: to get some ideas about how to compare local descriptors efficiently and how to store them efficiently …
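
To give a feeling for these numbers, here is a rough back-of-the-envelope sketch of exhaustive matching. It assumes, purely for illustration, one reference image per object with as many descriptors as the input image, and 4-byte values; the function name and both assumptions are ours, not from the slides.

```python
def exhaustive_matching_cost(desc_per_image, num_objects, dims=128, bytes_per_value=4):
    """Rough cost of comparing every input descriptor with every stored one.

    Assumptions (illustrative only): one reference image per object, each with
    desc_per_image descriptors, and bytes_per_value bytes per component.
    """
    stored = desc_per_image * num_objects          # descriptors in the database
    comparisons = desc_per_image * stored          # distance computations per input image
    scalar_ops = comparisons * dims                # component differences to evaluate
    storage_bytes = stored * dims * bytes_per_value
    return stored, comparisons, scalar_ops, storage_bytes


# Upper end of the ranges quoted on the slide: 1,000 descriptors, 10,000 objects.
print(exhaustive_matching_cost(1_000, 10_000))
# -> (10000000, 10000000000, 1280000000000, 5120000000)
```

Under these assumptions a single input image would need about 10^10 distance computations against roughly 5 GB of descriptors, which is why an indexed, approximate search is attractive.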

Recognition Process
The recognition process consists of two stages:
1. Scale-invariant local descriptors (features) of the observed scene are computed.
2. These descriptors are matched against descriptors of object prototypes already stored in a model database. These prototypes correspond to images of objects under different view angles.

[Figure: block diagram of the recognition pipeline]
Offline Database Creation: Reference Image -> Interest Points Detection -> Scale Invariant Descriptors (SIFT) Calculation -> SIFT Database
Recognition: Input Image -> Interest Points Detection -> Scale Invariant Descriptors (SIFT) Calculation -> SIFT Matching (against the SIFT Database) -> Affine Transform Calculation -> Affine Transform Parameters

Interest Points Detection (1/2)
Interest points correspond to maxima of the SDoG (Subsampled Difference of Gaussians) scale-space (x, y, σ).
[Figure: scale space and SDoG]
Ref: Lowe 1999

Interest Points Detection (2/2)
[Figure: examples of detected interest points]
Our improvement: sub-pixel location of interest points by a 3D quadratic approximation around the detected interest point in the scale-space.
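
The 3D quadratic approximation can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the 3x3x3 neighbourhood of scale-space values around the detected sample is available as nested lists `D[s][y][x]`, fits a quadratic through finite differences, and solves for the offset where its gradient vanishes. The names `det3` and `refine_extremum` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def refine_extremum(D):
    """Return the offset (dx, dy, ds) of the fitted extremum from D[1][1][1].

    D[s][y][x] holds a 3x3x3 neighbourhood of scale-space values
    (assumed layout: scale, then row, then column).
    """
    c = D[1][1][1]
    # Gradient by central differences.
    g = [(D[1][1][2] - D[1][1][0]) / 2.0,
         (D[1][2][1] - D[1][0][1]) / 2.0,
         (D[2][1][1] - D[0][1][1]) / 2.0]
    # Hessian by second differences.
    hxx = D[1][1][2] - 2 * c + D[1][1][0]
    hyy = D[1][2][1] - 2 * c + D[1][0][1]
    hss = D[2][1][1] - 2 * c + D[0][1][1]
    hxy = (D[1][2][2] - D[1][2][0] - D[1][0][2] + D[1][0][0]) / 4.0
    hxs = (D[2][1][2] - D[2][1][0] - D[0][1][2] + D[0][1][0]) / 4.0
    hys = (D[2][2][1] - D[2][0][1] - D[0][2][1] + D[0][0][1]) / 4.0
    H = [[hxx, hxy, hxs], [hxy, hyy, hys], [hxs, hys, hss]]
    d = det3(H)
    if abs(d) < 1e-12:
        return (0.0, 0.0, 0.0)  # flat neighbourhood: keep the sampled location
    # Solve H * offset = -g with Cramer's rule.
    offset = []
    for col in range(3):
        M = [row[:] for row in H]
        for r in range(3):
            M[r][col] = -g[r]
        offset.append(det3(M) / d)
    return tuple(offset)


# Toy check: sample a known quadratic whose maximum lies off the grid.
def f(x, y, s):
    return 10 - (x - 0.2) ** 2 - 2 * (y + 0.1) ** 2 - 0.5 * (s - 0.3) ** 2

D = [[[f(x, y, s) for x in (-1, 0, 1)] for y in (-1, 0, 1)] for s in (-1, 0, 1)]
print(refine_extremum(D))  # approximately (0.2, -0.1, 0.3)
```

In a real detector an offset larger than 0.5 in any dimension usually means the extremum lies closer to a neighbouring sample, so the fit would be repeated there.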

SIFT Calculation
For each obtained keypoint, a descriptor or feature vector that considers the gradient values around the keypoint is computed. These descriptors are called SIFT (Scale-Invariant Feature Transform). SIFT descriptors provide invariance to scale and orientation.
Ref: Lowe 2004

SIFT Matching
The Euclidean distance between the SIFT descriptors (vectors) is employed.

Affine Transform Calculation (1/2)
Several stages are employed:
1. Object Pose Prediction: In the pose space, a Hough transform is employed to obtain a coarse prediction of the object pose; each matched keypoint votes for all object poses that are consistent with it. A candidate object pose is obtained if at least 3 entries are found in a Hough bin.
2. Affine Transformation Calculation: A least-squares procedure is employed to find an affine transformation that correctly accounts for each obtained pose.

Affine Transform Calculation (2/2)
3. Affine Transformation Verification Stages:
Verification using a probabilistic model (Bayes classifier)
Verification based on Geometrical Distortion
Verification based on Spatial Correlation
Verification based on Graphical Correlation
Verification based on the Object Rotation
4. Transformations Merging based on Geometrical Overlapping
The verification stages shown in blue on the original slide were proposed by us to improve the detection of robot heads.
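
Stage 2 (the least-squares fit) can be sketched as below. This is only an illustration under our own assumptions: the matches that voted for one Hough bin are given as pairs of model and image points, the affine model is u = a*x + b*y + tx, v = c*x + d*y + ty, and the two rows are fitted separately through the 3x3 normal equations. The names `det3`, `solve3` and `fit_affine` are hypothetical; the Hough voting and the verification stages are not covered.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(M, rhs):
    """Solve the 3x3 system M * p = rhs with Cramer's rule."""
    d = det3(M)
    if abs(d) < 1e-12:
        raise ValueError("degenerate configuration (e.g. collinear points)")
    solution = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = rhs[r]
        solution.append(det3(Mc) / d)
    return solution


def fit_affine(matches):
    """Least-squares affine transform from model points to image points.

    matches: list of ((x, y), (u, v)) pairs, at least 3 non-collinear.
    Returns (a, b, tx, c, d, ty) with u = a*x + b*y + tx and v = c*x + d*y + ty.
    """
    sxx = sxy = syy = sx = sy = 0.0
    bu = [0.0, 0.0, 0.0]
    bv = [0.0, 0.0, 0.0]
    for (x, y), (u, v) in matches:
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        bu[0] += x * u
        bu[1] += y * u
        bu[2] += u
        bv[0] += x * v
        bv[1] += y * v
        bv[2] += v
    n = float(len(matches))
    # Normal equations: the same 3x3 matrix serves both rows of the transform.
    N = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    a, b, tx = solve3(N, bu)
    c, d, ty = solve3(N, bv)
    return a, b, tx, c, d, ty


# Toy usage: points transformed by a known affine map are recovered.
model = [(0, 0), (1, 0), (0, 1), (2, 3)]
image = [(2 * x - y + 5, 0.5 * x + 1.5 * y - 2) for x, y in model]
print(fit_affine(list(zip(model, image))))
```

With more than three matches the system is overdetermined, so noisy keypoint locations are averaged out instead of being fitted exactly.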

AIBO Head Pose Detection Example
[Figure: input image and reference images]

Matching & Storage of Local Descriptors
Each reference image gives a set of keypoints. Each keypoint has a graphical descriptor, which is a 128-component vector. All the (keypoint, vector) pairs corresponding to a set of reference images are stored in a set T.
[Figure: reference image with keypoints (1), (2), (3), (4), ...]
T = { (x, y, n, v1, v2, ..., v128) for keypoints (1), (2), (3), (4), ... }

Matching & Storage of Local Descriptors
More compact notation, with p for a keypoint and d for its descriptor:
T = { (p1, d1), (p2, d2), (p3, d3), (p4, d4), ... }

Matching & Storage of Local Descriptors
In the matching-generation stage, an input image gives another set of keypoints and vectors (p, d). For each input descriptor d, the first and second nearest descriptors in T, d_FIRST and d_SEC, must be found. The pair of nearest descriptors (d, d_FIRST) then gives a pair of matched keypoints (p, p_FIRST).
[Figure: each input pair (p, d) is searched in T = { (p1, d1), (p2, d2), ... }, returning (p_FIRST, d_FIRST) and (p_SEC, d_SEC)]

Matching & Storage of Local Descriptors
The match is accepted if the ratio between the distance to the first nearest descriptor and the distance to the second nearest descriptor is lower than a given threshold. This indicates that the search result is unambiguous.
Accepted if: distance(d, d_FIRST) < threshold * distance(d, d_SEC)
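
The acceptance rule can be sketched with an exhaustive search over T. This is a minimal illustration, not the authors' code: T and the input are lists of (keypoint, descriptor) pairs, the toy descriptors are 2D instead of 128D, and the default threshold of 0.8 is an assumption (the slides do not state the value used). The function names are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(inputs, T, threshold=0.8):
    """Return the accepted (p, p_first) keypoint pairs.

    inputs, T: lists of (keypoint, descriptor) pairs.
    threshold: assumed ratio threshold, not taken from the slides.
    """
    matches = []
    for p, d in inputs:
        # First and second nearest descriptors in T (exhaustive search).
        nearest = heapq.nsmallest(2, T, key=lambda entry: euclidean(d, entry[1]))
        if len(nearest) < 2:
            continue  # the ratio needs two candidates
        (p_first, d_first), (_, d_sec) = nearest
        if euclidean(d, d_first) < threshold * euclidean(d, d_sec):
            matches.append((p, p_first))
    return matches


T = [("a", (0, 0)), ("b", (10, 0)), ("c", (0, 10))]
inputs = [("q1", (1, 0)), ("q2", (5, 0))]
print(match_descriptors(inputs, T))  # [('q1', 'a')]
```

Here q1 is clearly nearest to a, so the match is accepted, while q2 is equally far from a and b and is rejected as ambiguous.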

Storage: Kd-trees
One way to store the set T in an ordered way is to use a kd-tree. In this case, we will use a 128d-tree.
As is well known, in a kd-tree the elements are stored in the leaves (storage nodes). The other nodes (division nodes) are divisions of the space in some dimension.
[Figure: example kd-tree with division nodes and storage nodes. The root division node tests "1 > 2": all the vectors with more than 2 in the first dimension are stored on the right side.]

Storage: Kd-trees
Generation of balanced kd-trees:
We have a set of vectors a = (a1, a2, …), b = (b1, b2, …), c = (c1, c2, …), d = (d1, d2, …), …
We calculate the mean and variance of each dimension i.

Storage: Kd-trees
Tree construction:
1. Select the dimension i_MAX with the largest variance.
2. Order the vectors with respect to the i_MAX dimension.
3. Select the median M in this dimension.
4. Create a division node "i_MAX > M": vectors whose i_MAX component is less than M go to one side, and vectors whose i_MAX component is greater than M go to the other side.
5. Repeat the process recursively on each side.
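
The construction steps can be sketched as follows. This is a simplified illustration under our own assumptions: the tree is represented with plain dictionaries, the vectors are split at the middle of the ordered list so that both sides have (almost) the same size, and vectors equal to the median may therefore end up on either side. The names `variance`, `build_kdtree` and `leaves` are hypothetical.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def build_kdtree(vectors):
    """Build a balanced kd-tree.

    Leaves (storage nodes) are {"leaf": vector}; division nodes are
    {"dim": i_max, "split": M, "left": ..., "right": ...}, where the left
    side holds components <= M and the right side components >= M.
    """
    if not vectors:
        raise ValueError("at least one vector is required")
    if len(vectors) == 1:
        return {"leaf": vectors[0]}
    dims = len(vectors[0])
    # Dimension with the largest variance.
    i_max = max(range(dims), key=lambda i: variance([v[i] for v in vectors]))
    # Order the vectors along that dimension and split at the median.
    ordered = sorted(vectors, key=lambda v: v[i_max])
    mid = len(ordered) // 2
    M = ordered[mid - 1][i_max]
    return {"dim": i_max, "split": M,
            "left": build_kdtree(ordered[:mid]),
            "right": build_kdtree(ordered[mid:])}


def leaves(tree):
    """Collect the stored vectors, left to right."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["left"]) + leaves(tree["right"])


tree = build_kdtree([(1, 0), (2, 0), (8, 1), (9, 1)])
print(tree["dim"], tree["split"])  # 0 2
print(leaves(tree))                # [(1, 0), (2, 0), (8, 1), (9, 1)]
```

Splitting on the dimension with the largest variance tends to separate the data as much as possible at each level, and splitting at the median keeps the tree balanced.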

Search Process
Search for the nearest neighbors, two alternatives:
1. Compare almost all the descriptors in T with the given descriptor and return the nearest one, or
2. Compare at most Q nodes and return the nearest of them (to compare means to calculate the Euclidean distance). This alternative requires a good search strategy. It can fail, but the failure probability is controllable through Q.
We choose the second option and use the BBF (Best Bin First) algorithm.

Search Process: BBF Algorithm
Set:
v: query vector
Q: priority queue ordered by distance to v (initially empty)
r: initially the root of T
v_FIRST: initially undefined, with an infinite distance to v
ncomp: number of comparisons, initially zero
While (!finish):
  Search for v in T starting from r => arrive at a leaf c
  Add all the directions not taken during the search to Q in an ordered way (each division node on the path gives one not-taken direction)
  If c is nearer to v than v_FIRST, then v_FIRST = c
  Make r = the first node in Q (the nearest to v), ncomp++
  If distance(r,v) > distance(v_FIRST,v), finish=1
  If ncomp > ncomp_MAX, finish=1
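
A runnable sketch of the loop above, under our own assumptions: the kd-tree uses the dictionary layout built by the compact `build_kdtree` below, the distance from v to a not-taken branch is approximated by the distance to its splitting plane, and each visited leaf counts as one comparison. The names and the default `ncomp_max` are hypothetical, not the authors' values.

```python
import heapq
import itertools
import math


def build_kdtree(vectors):
    """Compact balanced kd-tree: split on the largest-variance dimension."""
    if len(vectors) == 1:
        return {"leaf": vectors[0]}

    def spread(i):
        col = [v[i] for v in vectors]
        mean = sum(col) / len(col)
        return sum((c - mean) ** 2 for c in col)

    dim = max(range(len(vectors[0])), key=spread)
    ordered = sorted(vectors, key=lambda v: v[dim])
    mid = len(ordered) // 2
    return {"dim": dim, "split": ordered[mid - 1][dim],
            "left": build_kdtree(ordered[:mid]),
            "right": build_kdtree(ordered[mid:])}


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bbf_search(tree, v, ncomp_max=200):
    """Best Bin First search: returns (v_first, distance, ncomp)."""
    v_first, best = None, math.inf
    tie = itertools.count()           # tie-breaker so the heap never compares dicts
    queue = [(0.0, next(tie), tree)]  # (lower bound on distance to v, tie, node)
    ncomp = 0
    while queue and ncomp < ncomp_max:
        bound, _, node = heapq.heappop(queue)
        if bound >= best:
            break                     # no queued branch can beat v_first
        while "leaf" not in node:     # descend to a leaf
            diff = v[node["dim"]] - node["split"]
            if diff > 0:
                taken, other = node["right"], node["left"]
            else:
                taken, other = node["left"], node["right"]
            # Queue the not-taken direction with its distance to the split.
            heapq.heappush(queue, (max(bound, abs(diff)), next(tie), other))
            node = taken
        ncomp += 1
        d = dist(node["leaf"], v)
        if d < best:
            v_first, best = node["leaf"], d
    return v_first, best, ncomp


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(bbf_search(build_kdtree(points), (9, 2)))
```

With a generous `ncomp_max` this search returns the exact nearest neighbor; lowering it trades accuracy for speed, which is the controllable failure probability mentioned earlier.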

Search Example
[Figure sequence: step-by-step BBF search in a small kd-tree whose root division node is "1 > 2". The tree drawings and most node values are not recoverable from the transcript; the recoverable annotations follow.]
1. The requested vector has 20 in the first dimension. At the root, 20 > 2, so the search goes right. The not-taken option is stored in the queue with distance 18 (the distance between 2 and 20). comparisons: 0
2. The search keeps descending (go right), storing each not-taken option in the queue. comparisons: 0
3. We arrive at a leaf. Store the nearest leaf in C_MIN. comparisons: 1
4. The distance from the best node in the queue is less than the distance from C_MIN: start a new search from the best node in the queue and delete that node from the queue.
5. Go down from there. comparisons: 1
6. We arrive at a leaf. Store the nearest leaf in C_MIN. comparisons: 2
7. The distance from the best node in the queue is NOT less than the distance from C_MIN: finish. comparisons: 2

Conclusions
BBF + kd-trees: a good trade-off between short search time and high success probability.
However, BBF + kd-trees is perhaps not the optimal solution.
Finding a better methodology is very important for massive applications (for example, Web image retrieval).