Geometric Hashing: A General and Efficient Model-Based Recognition Scheme. Yehezkel Lamdan and Haim J. Wolfson, ICCV 1988. Presented by Budi Purnomo, Nov 23rd 2004.


Geometric Hashing: A General and Efficient Model-Based Recognition Scheme
Yehezkel Lamdan and Haim J. Wolfson, ICCV 1988
Presented by Budi Purnomo, Nov 23rd 2004

Motivation
Object recognition is the ultimate goal of much computer vision research.
Inputs:
- A database of objects.
- A scene or image to recognize.
Problems:
1. Objects in the scene undergo some transformation.
2. Objects may partially occlude each other.
3. It is computationally expensive to retrieve each object from the database and compare it against the observed scene.

Problem Statement
Recognition under similarity transformation: "Is there a transformed (rotated, translated and scaled) subset of some model point-set which matches a subset of the scene point-set?"

Outline
1. Key Idea
2. General Framework
3. Recognition under Various Transformations
4. Recognition of 3D Objects from 2D Images
5. Recognition of Polyhedral Objects
6. Comparisons: Alignment and the Generalized Hough Transform

Key Idea (1/8) Recognizing a pentagon in an image

Key Idea (2/8) Blue: 1

Key Idea (3/8) Red: 1

Key Idea (4/8) Green: 5

Key Idea (5/8) Purple: 1

Key Idea (6/8) Brown: 1

Key Idea (7/8) Blue: 1 Red: 1 Green: 5 Purple: 1 Brown: 1 Object is a pentagon!

Key Idea (8/8) Blue: 1 Red: 2 Green: 2 Purple: 1 Brown: 1 Object is NOT a pentagon!

Brute Force Recognition
Let m be the number of points on the model and n the number of points in the scene. Recognizing a single model costs O((m × n)^2 × t), where t is the complexity of verifying the model against the scene. If m = n and t = n, recognizing a single model is O(n^5).
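The count above corresponds to a hypothesize-and-verify loop: every ordered model pair against every ordered scene pair, each hypothesis verified against the whole scene. A minimal sketch (my illustration, not the paper's; the 2D similarity transform is recovered with the complex-plane trick, and all points are assumed distinct):

```python
import itertools

def similarity_from_pairs(m1, m2, s1, s2):
    """Similarity transform (rotation + scale + translation) mapping model
    points m1 -> s1 and m2 -> s2, written as a complex map z -> a*z + b."""
    mz1, mz2 = complex(*m1), complex(*m2)
    sz1, sz2 = complex(*s1), complex(*s2)
    a = (sz2 - sz1) / (mz2 - mz1)   # rotation and scale as one complex factor
    b = sz1 - a * mz1               # translation
    return a, b

def brute_force_match(model, scene, tol=1e-6):
    """Enumerate all O((m*n)^2) ordered pair correspondences; verify each
    hypothesis by counting model points that land on some scene point."""
    best_hits, best_ab = 0, None
    scene_c = [complex(*p) for p in scene]
    for m1, m2 in itertools.permutations(model, 2):
        for s1, s2 in itertools.permutations(scene, 2):
            a, b = similarity_from_pairs(m1, m2, s1, s2)
            hits = sum(
                any(abs(a * complex(*p) + b - q) < tol for q in scene_c)
                for p in model)
            if hits > best_hits:
                best_hits, best_ab = hits, (a, b)
    return best_hits, best_ab
```

For a scene that really is a rotated, scaled, translated copy of the model, some hypothesis explains every model point, so the hit count equals the model size.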

General Framework (1/2)
A two-stage algorithm:
1. Preprocessing (for each model): for each pair of feature points:
   - Define a local coordinate basis on this pair.
   - Compute and quantize all other feature points in this coordinate basis.
   - Record (model, basis) in a hash table.

General Framework (2/2)
2. Online recognition (given a scene, extract feature points):
   a) Pick an arbitrary ordered pair and compute the other points using this pair as a basis. For each transformed point, vote for all (model, basis) records appearing in the corresponding hash-table entry, and histogram the votes.
   b) Matching candidates: (model, basis) pairs with a large number of votes.
   c) Recover the transformation that gives the best least-squares match between all corresponding feature points.
   d) Transform the features and verify against the input image features (if this fails, go back to step a).
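The two stages can be sketched for the 2D similarity case. This is a minimal illustration with an assumed uniform quantization step; the function names are mine, not the paper's:

```python
import itertools
from collections import defaultdict

def invariant_coords(p, b1, b2):
    """Coordinates of p in the frame of the ordered basis (b1, b2),
    invariant to rotation, translation and scale (complex-plane trick)."""
    return (complex(*p) - complex(*b1)) / (complex(*b2) - complex(*b1))

def quantize(z, step=0.1):
    """Map invariant coordinates to a discrete hash-table key."""
    return (round(z.real / step), round(z.imag / step))

def preprocess(models, step=0.1):
    """Stage 1: for every model, hash every remaining point in every
    ordered 2-point basis, recording (model, basis) at that entry."""
    table = defaultdict(list)
    for name, pts in models.items():
        for b1, b2 in itertools.permutations(pts, 2):
            for p in pts:
                if p is b1 or p is b2:
                    continue
                key = quantize(invariant_coords(p, b1, b2), step)
                table[key].append((name, (b1, b2)))
    return table

def recognize(table, scene, step=0.1):
    """Stage 2: pick one arbitrary ordered scene pair as a basis and
    vote for every (model, basis) record hit by the remaining points."""
    votes = defaultdict(int)
    b1, b2 = scene[0], scene[1]
    for p in scene[2:]:
        key = quantize(invariant_coords(p, b1, b2), step)
        for record in table[key]:
            votes[record] += 1
    return max(votes.items(), key=lambda kv: kv[1]) if votes else None
```

Because the hash keys are invariant coordinates, a scene basis that corresponds to a model basis makes every other matched point vote for the same (model, basis) record, regardless of how the scene instance is rotated, translated, or scaled.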

Two Stages Algorithm (1/2) [1]

Two Stages Algorithm (2/2) [1]

Complexity
Assume m = n, and let k be the number of points needed to define a basis.
- Preprocessing: O(n^(k+1)) for a single model.
- Recognition: O(n^(k+1)) against all objects in the database.

Under Various Transformations (1/2)
1. Translation in 2D and 3D: 1-point basis, O(n^2).
2. Similarity transformation in 2D: 2-point basis, O(n^3).
3. Similarity transformation in 3D: 3-point basis, O(n^4).

Under Various Transformations (2/2)
4. Affine transformation: 3-point basis, O(n^4).
5. Projective transformation: 4-point basis, O(n^5).

Recognition of 3D Objects from 2D Images (1/5)
1. Correspondence of planes
- Preprocessing: consider planar sections of the 3D object which contain three or more interest points.
- Hash the (model, plane, basis) triplet.
- Use either a projective or an affine transformation.
- Once the plane correspondences have been established, the position of the entire 3D body is solved.

Recognition of 3D Objects from 2D Images (2/5)
2. Singular affine transformation
A x + b = u, where
- A: 2×3 affine matrix
- x: 3×1 3D point
- b: 2×1 2D translation vector
- u: 2×1 image point
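A quick numerical check of this projection model; the particular values of A, b and x below are arbitrary illustrations:

```python
import numpy as np

# A: 2x3 singular affine matrix projecting a 3D point onto the image plane
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
b = np.array([10.0, 20.0])      # 2D translation in the image
x = np.array([1.0, 2.0, 4.0])   # a 3D model point

u = A @ x + b                   # resulting 2D image point
```

The 2×3 shape is what makes the map singular: a whole line of 3D points (the viewing direction, here the null space of A) projects to the same image point u.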

Recognition of 3D Objects from 2D Images (3/5)
A set of four non-coplanar points in 3D defines a 3D affine basis:
- One point serves as the origin.
- The vectors from the origin to the other three points form the unit (oblique) coordinate system.
Preprocess the model points in this four-point basis.

Recognition of 3D Objects from 2D Images (4/5)
Recognition:
- Pick four points p0, p1, p2, p3 in the 2D image, giving three vectors v1, v2, v3.
- There exist (α, β, γ) ≠ 0 such that α v1 + β v2 + γ v3 = 0.
- For a point p in the image, let v be the vector from p0 to p.
- Vote for all t ≠ 0 (a line with parameter t):
  v = (ξ + tα) v1 + (η + tβ) v2 + (tγ) v3,
  where (ξ, η) are the coordinates of v in the (v1, v2) basis.
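Writing the dependency coefficients as (α, β, γ) and the coordinates of v in the (v1, v2) basis as (ξ, η), the vote line reproduces v for every t, because the extra term is t·(α v1 + β v2 + γ v3) = 0. A small numpy illustration (the example vectors are mine):

```python
import numpy as np

# Three image vectors v1, v2, v3 obtained from four picked image points.
v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])
v3 = np.array([2.0, 3.0])

# Any three 2D vectors are linearly dependent: find (alpha, beta, gamma) != 0
# with alpha*v1 + beta*v2 + gamma*v3 = 0 via the null space of [v1 v2 v3].
M = np.column_stack([v1, v2, v3])       # 2x3 matrix, rank 2
_, _, Vt = np.linalg.svd(M)
alpha, beta, gamma = Vt[-1]             # last right singular vector = null space

# A further image point p gives a vector v from p0; express v in (v1, v2).
v = np.array([4.0, 5.0])
xi, eta = np.linalg.solve(np.column_stack([v1, v2]), v)

# Every point on the vote line reconstructs v exactly: the added term is t * 0.
for t in (0.0, 1.0, -2.5):
    recon = (xi + t * alpha) * v1 + (eta + t * beta) * v2 + (t * gamma) * v3
    assert np.allclose(recon, v)
```

This is why a single image point cannot pin down its 3D affine coordinates: it only constrains them to a one-parameter line, and the scheme casts a vote for every quantized point along that line.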

Recognition of 3D Objects from 2D Images (5/5)
3. Establishing a viewing angle with a similarity transformation
- Tessellate a viewing sphere (uniform in spherical coordinates).
- Record (model, basis, angle) in the hash table.
- 2-point basis: O(n^3), the same order as without the viewing angle, because the viewing angle introduces only a constant factor, independent of the scene.

Recognition of Polyhedral Objects
Polygonal objects:
- Choose an edge as the basis and record (model, basis edge) in the hash table.
- Preprocessing and recognition are O(n^2).
[1]

Comparisons (1/2)
1. With the alignment method, which uses exhaustive enumeration of all possible pairs in the objects and the images:
- Geometric hashing can process all models simultaneously, while the alignment method processes models sequentially.
- The alignment method does not require any additional memory, while geometric hashing requires a large memory to store the hash table.
- Geometric hashing is more efficient if:
  - the scene contains enough features (6-10) for efficient recognition by voting, and
  - there are many models.

Comparisons (2/2)
2. With the Generalized Hough Transform (GHT):
- GHT quantizes all possible (continuous) transformations between the model and the scene into a set of bins, while
- geometric hashing quantizes just the (discrete) transformations represented by the bases.

Summary
- Recognizes objects that have undergone an arbitrary transformation.
- Can perform partial matching.
- Efficient, and easily parallelized.
- Uses a transformation-invariant access key to the hash table.
- Two phases (preprocessing and recognition).
- Requires a large memory to store the hash table.

References [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition Scheme, ICCV, 1988.