Algebraic Functions of Views for 3D Object Recognition CS773C Advanced Machine Intelligence Applications Spring 2008: Object Recognition.

Object Appearance The appearance of an object can have a large range of variation due to: –Photometric effects –Scene clutter –Changes in shape (e.g., non-rigid objects) –Viewpoint changes

Algebraic Functions of Views (AFoVs) A powerful mathematical foundation for investigating variations in the geometrical appearance of an object due to viewpoint changes. “The variety of 2D views depicting the geometrical appearance of a 3D object can be expressed as a combination of a small number of 2D views of the object.” S. Ullman and R. Basri, "Recognition by Linear Combinations of Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 10, 1991.

Orthographic Projection Case of 3D rigid transformations (3 ref. views)

Orthographic Projection Case of 3D linear transformations (2 ref views)
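The equations on the two projection slides did not survive in this transcript. As a hedged sketch only, assume the two-reference-view form commonly given for the linear-transformation case: each novel coordinate is a linear combination of the x- and y-coordinates of the first reference view, the x-coordinate of the second reference view, and a constant. The function name `predict_view` and the coefficient layout are illustrative assumptions, not taken from the slides.

```python
def predict_view(ref1, ref2, a, b):
    """Predict novel-view points from two reference views.

    ref1, ref2: lists of (x, y) for the same points, in corresponding order.
    a, b: 4-tuples of coefficients for the novel x- and y-coordinates
          (assumed layout: x1, y1, x2, constant).
    """
    novel = []
    for (x1, y1), (x2, _y2) in zip(ref1, ref2):
        x = a[0] * x1 + a[1] * y1 + a[2] * x2 + a[3]
        y = b[0] * x1 + b[1] * y1 + b[2] * x2 + b[3]
        novel.append((x, y))
    return novel


ref1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
ref2 = [(0.1, 0.0), (0.9, 0.0), (0.8, 1.0), (0.0, 1.0)]
# Coefficients that reproduce the first reference view exactly.
same = predict_view(ref1, ref2, (1, 0, 0, 0), (0, 1, 0, 0))
# Coefficients that blend both views and shift the result in x.
blend = predict_view(ref1, ref2, (0.5, 0, 0.5, 0.2), (0, 1, 0, 0))
```

Only points visible in both reference views can be passed in, which matches the caution that only common features can be predicted.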

More Results: Perspective projection (2 ref. views, obtained under orthographic projection). Objects with smooth surfaces and non-rigid objects –More reference views are required. A. Shashua, “Algebraic functions for recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, 1995.

A Word of Caution! Only features common to the reference views can be predicted in a novel view. [Figure: a novel view next to a reference view]

Recognition Framework Using AFoVs “Novel 2D views of a 3D object can be recognized by matching them to combinations of a small number of known 2D views of the object.”

Representation and Matching using AFoVs Representation –Objects are represented by a small number of views. –Each view is represented by some geometric features (e.g., points) Matching – Predict the geometric appearance of an object in a novel view by combining a small number of reference views of the object.

Advantages of the Method No 3D models or camera calibration are required. Only a small number of 2D views are required. Novel views can be different from the stored ones. Simpler verification scheme. More general framework (“family” of methods). Evidence that the human visual system works similarly.

Main Challenges Which model views to combine to predict a novel view? How to establish the correspondences between novel and reference views? How to find the coefficients of the combination? How to handle occlusions? How to choose the reference views? Proposed answer: integrate AFoVs with indexing.

Method Overview (G. Bebis, M. Georgiopoulos, M. Shah, and N. da Vitoria Lobo, "Indexing Based on Algebraic Functions of Views", Computer Vision and Image Understanding (CVIU), Vol. 72, No. 3, 1998). Preprocessing step: (1) Extract groups of points from each model. (2) Sample the space of appearances of each group. (3) Store information about the groups in an index table. Recognition step: (1) Extract groups of points from the scene. (2) Predict their appearance. (3) Verify the predictions.

Overview of the Method (cont’d)

Which Model Groups to Choose? Cluster geometric features into higher level descriptions. Consider properties that are unlikely to occur at random. Property used in our work: convexity.

Which Model Groups to Choose? (cont’d)

How to generate the appearances of a group? Estimate each parameter’s range of values Sample the space of parameter values Generate a new appearance for each sample of values
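The three steps above can be sketched as a grid sweep over the coefficient ranges. This is a minimal illustration under stated assumptions: the four-coefficient layout (x1, y1, x2, constant), the uniform step, and the names `frange` and `generate_appearances` are all hypothetical, and only the x-coordinates are generated.

```python
import itertools


def frange(lo, hi, step):
    """Yield lo, lo + step, ... up to and including hi (within rounding)."""
    n = int(round((hi - lo) / step))
    for i in range(n + 1):
        yield lo + i * step


def generate_appearances(ref1, ref2, ranges, step):
    """Enumerate sampled x-coordinate appearances of one point group.

    ranges: four (lo, hi) intervals, one per coefficient (assumed layout).
    Returns a list of (coefficients, x_coordinates) pairs.
    """
    grids = [list(frange(lo, hi, step)) for lo, hi in ranges]
    appearances = []
    for a in itertools.product(*grids):
        xs = [a[0] * x1 + a[1] * y1 + a[2] * x2 + a[3]
              for (x1, y1), (x2, _y2) in zip(ref1, ref2)]
        appearances.append((a, xs))
    return appearances


ref1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
ref2 = [(0.1, 0.0), (0.9, 0.0), (0.8, 1.0)]
ranges = [(0.0, 1.0), (-0.5, 0.5), (0.0, 1.0), (0.0, 0.0)]
samples = generate_appearances(ref1, ref2, ranges, step=0.5)
```

The number of samples is the product of the per-parameter grid sizes, which is why the sampling step directly drives the storage cost of the index table.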

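Estimate the Range of Values of the Parameters. The systems relating the reference views to the novel view are solved using SVD. [Equations not preserved in this transcript]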
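As a rough illustration of the estimation step, the sketch below fits the coefficients of one coordinate by least squares. It is not the slide's method in detail: the slide names SVD, while this dependency-free version swaps in the normal equations with Gaussian elimination, which is less robust for ill-conditioned matrices (an SVD-based solver such as `numpy.linalg.lstsq` would be the usual practical choice). The row layout (x1, y1, x2, 1) and the function names are assumptions.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: reference views are degenerate")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def estimate_coefficients(ref1, ref2, novel_coords):
    """Least-squares fit of four coefficients so that P a ~= novel_coords.

    Each row of P is (x1, y1, x2, 1) for one point (assumed layout).
    """
    P = [[x1, y1, x2, 1.0] for (x1, y1), (x2, _y2) in zip(ref1, ref2)]
    PtP = [[sum(row[i] * row[j] for row in P) for j in range(4)]
           for i in range(4)]
    Ptp = [sum(row[i] * v for row, v in zip(P, novel_coords))
           for i in range(4)]
    return solve_linear(PtP, Ptp)


ref1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.2)]
ref2 = [(0.1, 0.0), (0.9, 0.0), (0.7, 1.0), (0.0, 1.0), (0.6, 0.2)]
true_a = (0.5, -0.2, 0.3, 0.1)
novel_x = [true_a[0] * x1 + true_a[1] * y1 + true_a[2] * x2 + true_a[3]
           for (x1, y1), (x2, _y2) in zip(ref1, ref2)]
a = estimate_coefficients(ref1, ref2, novel_x)
```

The same call with the novel y-coordinates gives the second coefficient vector, since both systems share the matrix P.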

Estimate the Range of Values of the Parameters (cont’d) Assume normalized coordinates. Use Interval Arithmetic (Moore, 1966) (note that the solutions will be identical).
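To make the interval-arithmetic idea concrete, here is a small sketch under two assumptions that the transcript does not confirm: normalized image coordinates lie in [0, 1], and each coefficient is a dot product between one row of a pseudo-inverse and the vector of image coordinates. The `Interval` class and `coefficient_range` are hypothetical helpers.

```python
class Interval:
    """Closed interval [lo, hi] with the arithmetic needed for a dot product."""

    def __init__(self, lo, hi):
        if lo > hi:
            raise ValueError("lo must not exceed hi")
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k):
        """Multiply by a scalar; a negative scalar swaps the endpoints."""
        a, b = k * self.lo, k * self.hi
        return Interval(min(a, b), max(a, b))


def coefficient_range(pinv_row, coord_interval=Interval(0.0, 1.0)):
    """Bound sum(k_j * p_j) when every p_j ranges over coord_interval."""
    total = Interval(0.0, 0.0)
    for k in pinv_row:
        total = total + coord_interval.scale(k)
    return total


r = coefficient_range([0.5, -0.25, 1.0])
# r.lo is the sum of the negative entries, r.hi the sum of the positive ones.
```

Large entries in the pseudo-inverse widen these intervals, which is one way to see why the conditioning of the reference-view matrix matters.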

Example

Preconditioning the Reference Views: Transform the original views to new views such that the matrix P has the best possible condition number. [Figure: effect of the condition number of P on the intervals]

Preconditioning the Reference Views (cont’d) Choosing: This implies: Thus:

Example (preconditioned views)

Decouple Image Coordinates Same transformation generates the x- and y-coordinates: Represent only the x-coordinates in the index table. For each group, store the following entry:

Hypothesis Generation and Verification 1. Take the intersection of the hypotheses. 2. Apply constraints to reject invalid hypotheses.
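One possible reading of step 1, given that only x-coordinates are stored in the index table: the scene group's x- and y-coordinates each retrieve a set of (model, group) hypotheses, and only those returned by both lookups are kept. This interpretation and the names below are assumptions, not taken from the slides.

```python
def intersect_hypotheses(hyps_from_x, hyps_from_y):
    """Keep (model, group) hypotheses supported by both coordinate lookups."""
    return sorted(set(hyps_from_x) & set(hyps_from_y))


hyps_x = [("car", 3), ("cup", 1), ("car", 5)]
hyps_y = [("car", 3), ("phone", 2), ("car", 5)]
candidates = intersect_hypotheses(hyps_x, hyps_y)
```

The surviving candidates would then go through the constraint checks of step 2 before verification.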

How to Choose the Scene Groups? Use convex grouping to extract salient scene groups.

Implementation Issues Space requirements –select salient groups –reject groups giving rise to ill-conditioned matrices –coarse sampling of parameters. Index computation and table size.

Important Implementation Issues (cont’d) Sampling step (i.e., parameters of AFoVs). Noise tolerance: the actual and predicted appearances can differ, so make additional entries in a neighborhood around the indexed location.
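The neighborhood idea can be sketched with a dictionary-based table. The quantization scheme, the bin size, and the names `quantize` and `insert_with_neighbors` are illustrative assumptions; the slides do not specify how the index is computed.

```python
import itertools
from collections import defaultdict


def quantize(coords, bin_size):
    """Map real-valued coordinates to an integer index tuple."""
    return tuple(int(c // bin_size) for c in coords)


def insert_with_neighbors(table, coords, entry, bin_size, radius=1):
    """Store entry at the indexed cell and at every cell within radius bins."""
    center = quantize(coords, bin_size)
    offsets = range(-radius, radius + 1)
    for delta in itertools.product(offsets, repeat=len(center)):
        key = tuple(c + d for c, d in zip(center, delta))
        table[key].add(entry)


table = defaultdict(set)
insert_with_neighbors(table, [0.42, 0.77], ("model_A", 0), bin_size=0.1)
# A noisy measurement of the same group still lands in a populated cell.
hits = table.get(quantize([0.38, 0.81], 0.1), set())
```

The number of extra entries grows as (2 * radius + 1) raised to the number of indexed coordinates, so this tolerance trades directly against the table size discussed above.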

Experiments and Results model objects and reference views used in our experiments

Experiments and Results (cont’d) reference views novel view

Experiments and Results (cont’d) reference views novel view

Experiments and Results (cont’d) reference views novel view

Criticism of the Method It relies heavily on feature extraction. It has high memory requirements. The index table might represent unrealistic model appearances. Indexing based on hashing is not very efficient. No explicit ranking of hypotheses.

Improving the AFoVs Recognition Framework Reject unrealistic appearances. Reduce storage requirements and improve speed. Develop a probabilistic hypothesis generation scheme –Learn shape appearance –Rank hypotheses. Represent object appearance more efficiently using improved indexing schemes and probabilistic models. W. Li, G. Bebis, and N. Bourbakis, "Integrating Algebraic Functions of Views with Indexing and Learning for 3D Object Recognition", IEEE Workshop on Learning in Computer Vision and Pattern Recognition (in conjunction with CVPR04), Washington DC, June 28, 2004.

Combine Indexing with Learning Sample the space of appearances sparsely and represent the samples in a K-d tree. Sample the space of views densely and represent the samples using probabilistic models. Given a novel view: (1) Use the K-d tree to retrieve a small number of candidate models. (2) For each candidate model, compute the probability that it might have produced the novel view. (3) Verify the most likely hypotheses first.

Combine Indexing with Learning (cont’d) The first stage provides hypothetical matches fast. The second stage evaluates the feasibility of hypothetical matches fast, without having to apply verification explicitly. Only “highly likely” hypotheses are verified explicitly.

Improved Framework [Flowchart] Training phase: reference views, extract model groups, estimate the range of AFoVs parameters (using SVD & IA), sample the AFoVs parameter space, validate views; coarse samples go to the K-d tree, dense samples go through random projection to a low-dimensional representation, followed by manifold learning using EM. Recognition phase: new image, extract image groups, access the K-d tree, retrieve hypothetical matches, rank hypotheses, estimate AFoVs parameters, verify hypotheses, recognition results.

Eliminate Unrealistic Model Appearances Under the assumption of linear transformations, many unrealistic views could be generated. Impose rigidity constraints to eliminate them. –Storage requirements can be reduced significantly. –Recognition becomes faster and more efficient.

Eliminate Unrealistic Model Appearances Unrealistic Views (without constraints) Realistic Views (with constraints)

Indexing Appearances Sample the space of views “coarsely” and represent the samples in an index table. Hashing might not work very well in this case, so an improved indexing scheme is needed.

Range Search vs Nearest Neighbor Search [Figure: nearest-neighbor search compared with range search] Range search is not appropriate when only a sparse set of views is stored. K-d trees perform a nearest-neighbor search.

K-d Trees for Indexing [Figure: K-d tree partition and tree over points P1 to P4] K-d trees perform a nearest-neighbor search.
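For readers unfamiliar with the structure, a compact K-d tree with nearest-neighbor lookup might look like the following. It is a generic textbook-style sketch in plain Python, not the implementation used in the presented work; the sample points stand in for stored view descriptors.

```python
def build_kdtree(points, depth=0):
    """Build a K-d tree as nested dicts; points are equal-length tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(node, query, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


views = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(views)
match = nearest(tree, (9, 2))
```

Unlike a range query with a fixed radius, this lookup always returns a candidate, which suits a sparsely sampled space of views.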

Learning Geometric Appearance We can pre-compute the views that an object can produce off-line. These views form a manifold in lower dimensional space. Model object appearance using a pdf. –Sample the space of appearances. –Fit a parametric model (e.g., mixtures of Gaussians using EM). –Use mutual information theory to choose the number of components. EM has problems when the dimensionality of the data is high. Apply “Random Projection” first, then run EM algorithm.
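The random-projection step that precedes EM can be sketched as multiplication by a Gaussian random matrix. The dimensions, the seed, and the 1/sqrt(d_out) scaling below are illustrative assumptions; fitting the mixture model itself is not shown.

```python
import random


def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian random matrix with entries scaled by 1/sqrt(d_out)."""
    rng = random.Random(seed)
    scale = 1.0 / d_out ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(R, x):
    """Map a d_in-dimensional vector to d_out dimensions."""
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in R]


R = random_projection_matrix(d_in=16, d_out=3)
view = [i / 16 for i in range(16)]
low = project(R, view)
```

The same matrix R has to be reused at recognition time so that novel views are mapped into the space in which the mixture model was fitted.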

Manifolds of Real Objects: An Example Need to store a small number of parameters only for each model

Hypothesis Ranking Each hypothesis generated by the K-d tree is ranked by computing its probability using mixture models. For each test group, we compute two probabilities, one from the x-coordinates and the other from the y-coordinates. The overall probability for a particular hypothesis is computed from these two probabilities.
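The combining equation is not preserved in this transcript. Purely as an assumption, the sketch below multiplies the x- and y-probabilities (treating them as independent) and evaluates each with a diagonal-covariance Gaussian mixture. The model dictionary, feature vectors, and function names are hypothetical.

```python
import math


def mixture_pdf(x, components):
    """Density of a diagonal-covariance Gaussian mixture at point x.

    components: list of (weight, means, variances) tuples.
    """
    total = 0.0
    for weight, means, variances in components:
        log_p = 0.0
        for xi, m, v in zip(x, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        total += weight * math.exp(log_p)
    return total


def rank_hypotheses(hypotheses, models):
    """Sort hypotheses by p_x * p_y, highest first (product is an assumption)."""
    scored = []
    for name, x_feat, y_feat in hypotheses:
        px = mixture_pdf(x_feat, models[name]["x"])
        py = mixture_pdf(y_feat, models[name]["y"])
        scored.append((px * py, name))
    return sorted(scored, reverse=True)


models = {
    "A": {"x": [(1.0, [0.0], [1.0])], "y": [(1.0, [0.0], [1.0])]},
    "B": {"x": [(1.0, [3.0], [1.0])], "y": [(1.0, [3.0], [1.0])]},
}
hypotheses = [("B", [0.1], [0.0]), ("A", [0.1], [0.0])]
ranked = rank_hypotheses(hypotheses, models)
```

Verification would then start from the top of `ranked`, so that unlikely hypotheses are never verified explicitly.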

Reference Views [Figure: 1st reference view, 2nd reference view]

Reference Views (cont’d) [Figure: 1st reference view, 2nd reference view]

Test Views (a)(b)(c) (d) (e) (f)

Test Views (cont’d) Hypothesis rejected

Integrate Geometric Appearance with Intensity Appearance Geometrical information alone does not provide enough discrimination for objects that have a similar “geometric” appearance but a different “intensity” appearance. Geometric and intensity appearance are therefore integrated during hypothesis verification to improve discrimination power and robustness. W. Li, G. Bebis, and N. Bourbakis, "3D Object Recognition Using 2D Views", IEEE Transactions on Image Processing (under revision).

Dense Correspondences For each group of corresponding points, apply triangulation recursively to get denser correspondences. Divide triangles into four sub-triangles by considering the middle point of each side of each triangle.
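The midpoint subdivision described above can be sketched as a short recursion. The function names and the example triangle are illustrative; applying the same routine to the corresponding triangle in each view produces vertices in the same order, which is one way to read "denser correspondences".

```python
def midpoint(p, q):
    """Midpoint of the segment between two 2D points."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def subdivide(triangle, levels):
    """Recursively split a triangle into four by joining its side midpoints."""
    if levels == 0:
        return [triangle]
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    result = []
    for child in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        result.extend(subdivide(child, levels - 1))
    return result


tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
small = subdivide(tri, levels=2)
# Distinct vertices across all sub-triangles: the denser point set.
points = {p for t in small for p in t}
```

Each level multiplies the triangle count by four, so a few levels are enough to go from a handful of group points to a dense set.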

Refine AFoVs parameters (before refinement)(after refinement)

Predict Intensity Appearance - Example [Figure: reference view 1, reference view 2, test view, prediction]

Predict Intensity Appearance - Example [Figure: reference view 1, reference view 2, test view, prediction]

Predict Intensity Appearance - Example (hypothesis rejected) (hypothesis accepted)