3D Models and Matching representations for 3D object models

Slides:



Advertisements
Similar presentations
Matching in 2D Readings: Ch. 11: Sec alignment point representations and 2D transformations solving for 2D transformations from point correspondences.
Advertisements

Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
電腦視覺 Computer and Robot Vision I
Surface normals and principal component analysis (PCA)
Extended Gaussian Images
Chapter 8 Content-Based Image Retrieval. Query By Keyword: Some textual attributes (keywords) should be maintained for each image. The image can be indexed.
Automatic Feature Extraction for Multi-view 3D Face Recognition
Complex Feature Recognition: A Bayesian Approach for Learning to Recognize Objects by Paul A. Viola Presented By: Emrah Ceyhan Divin Proothi Sherwin Shaidee.
Image Segmentation and Active Contour
Instructor: Mircea Nicolescu Lecture 13 CS 485 / 685 Computer Vision.
Computer Vision Spring ,-685 Instructor: S. Narasimhan Wean 5403 T-R 3:00pm – 4:20pm Lecture #20.
Principal Component Analysis CMPUT 466/551 Nilanjan Ray.
© 2003 by Davi GeigerComputer Vision September 2003 L1.1 Face Recognition Recognized Person Face Recognition.
Stockman CSE/MSU Fall 2005 Models and Matching Methods of modeling objects and their environments; Methods of matching models to sensed data for recogniton.
Project 4 out today –help session today –photo session today Project 2 winners Announcements.
Face Recognition Jeremy Wyatt.
1 3D Models and Matching representations for 3D object models particular matching techniques alignment-based systems appearance-based systems GC model.
Fitting a Model to Data Reading: 15.1,
3D Geometry for Computer Graphics
1 High-Level Computer Vision Detection of classes of objects (faces, motorbikes, trees, cheetahs) in images Detection of classes of objects (faces, motorbikes,
Computer Vision I Instructor: Prof. Ko Nishino. Today How do we recognize objects in images?
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
Face Detection and Recognition Readings: Ch 8: Sec 4.4, Ch 14: Sec 4.4
CS 485/685 Computer Vision Face Recognition Using Principal Components Analysis (PCA) M. Turk, A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive.
3D Models and Matching representations for 3D object models
כמה מהתעשייה? מבנה הקורס השתנה Computer vision.
SVD(Singular Value Decomposition) and Its Applications
Empirical Modeling Dongsup Kim Department of Biosystems, KAIST Fall, 2004.
Summarized by Soo-Jin Kim
Recognition Part II Ali Farhadi CSE 455.
Presented By Wanchen Lu 2/25/2013
CS 376b Introduction to Computer Vision 04 / 29 / 2008 Instructor: Michael Eckmann.
Multimodal Interaction Dr. Mike Spann
BACKGROUND LEARNING AND LETTER DETECTION USING TEXTURE WITH PRINCIPAL COMPONENT ANALYSIS (PCA) CIS 601 PROJECT SUMIT BASU FALL 2004.
Intelligent Vision Systems ENT 496 Object Shape Identification and Representation Hema C.R. Lecture 7.
1 Recognition by Appearance Appearance-based recognition is a competing paradigm to features and alignment. No features are extracted! Images are represented.
Generalized Hough Transform
Face Recognition: An Introduction
Vision-based human motion analysis: An overview Computer Vision and Image Understanding(2007)
CSE 185 Introduction to Computer Vision Face Recognition.
Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd.
Content-Based Image Retrieval QBIC Homepage The State Hermitage Museum db2www/qbicSearch.mac/qbic?selLang=English.
776 Computer Vision Jan-Michael Frahm Spring 2012.
Specific Object Recognition: Matching in 2D engine model image containing an instance of the model Is there an engine in the image? If so, where is it.
Matching Geometric Models via Alignment Alignment is the most common paradigm for matching 3D models to either 2D or 3D data. The steps are: 1. hypothesize.
1 Review and Summary We have covered a LOT of material, spending more time and more detail on 2D image segmentation and analysis, but hopefully giving.
Face detection and recognition Many slides adapted from K. Grauman and D. Lowe.
CSE 554 Lecture 8: Alignment
Face Detection and Recognition Readings: Ch 8: Sec 4.4, Ch 14: Sec 4.4
University of Ioannina
Recognizing Deformable Shapes
Recognition: Face Recognition
3D Models and Matching particular matching techniques
Announcements Project 1 artifact winners
In summary C1={skin} C2={~skin} Given x=[R,G,B], is it skin or ~skin?
Specific Object Recognition: Matching in 2D
3D Models and Matching representations for 3D object models
Matching in 2D Readings: Ch. 11: Sec
Application: Geometric Hashing
RIO: Relational Indexing for Object Recognition
Principal Component Analysis
Outline H. Murase, and S. K. Nayar, “Visual learning and recognition of 3-D objects from appearance,” International Journal of Computer Vision, vol. 14,
CS4670: Intro to Computer Vision
CS4670: Intro to Computer Vision
Announcements Project 2 artifacts Project 3 due Thursday night
Announcements Project 4 out today Project 2 winners help session today
Where are we? We have covered: Project 1b was due today
Announcements Artifact due Thursday
Announcements Artifact due Thursday
The “Margaret Thatcher Illusion”, by Peter Thompson
Presentation transcript:

3D Models and Matching representations for 3D object models particular matching techniques alignment-based systems appearance-based systems GC model of a screwdriver

3D Models Many different representations have been used to model 3D objects. Some are very coarse, just picking up the important features. Others are very fine, describing the entire surface of the object. Usually, the recognition procedure depends very much on the type of model.

Mesh Models Mesh models were originally for computer graphics. With the current availability of range data, they are now used for 3D object recognition. What types of features can we extract from meshes for matching ? In addition to matching, they can be used for verification.

Surface-Edge-Vertex Models SEV models are at the opposite extreme from mesh models. They specify the (usually linear) features that would be extracted from 2D or 3D data. They are suitable for objects with sharp edges and corners that are easily detectable and characterize the object. surface edge vertex surface surface

Generalized-Cylinders A generalized cylinder is a volumetric primitive defined by: a space curve axis a cross section function This cylinder has - curved axis - varying cross section standard cylinder rectangular cross sections

Generalized-Cylinder Models Generalized cylinder models include: a set of generalized cylinders the spatial relationships among them the global properties of the object 2 1 2 3 1 How can we describe the attributes of the cylinders and of their connections? 3

Finding GCs in Intensity Data Generalized cylinder models have been used for several different classes of objects: - airplanes (Brooks) - animals (Marr and Nishihara) - humans (Medioni) - human anatomy (Zhenrong Qian) The 2D projections of standard GCs are - ribbons - ellipses

Octrees - Octrees are 8-ary tree structures that compress a voxel representation of a 3D object. - Kari Puli used them to represent the 3D objects during the space carving process. - They are sometimes used for medical object representation. M 1 5 4 4 6 2 F F F E F F F E 0 1 2 3 4 5 6 7

Superquadrics Superquadrics are parameterized equations that describe solid shapes algebraically. They have been used for graphics and for representing some organs of the human body, ie. the heart.

2D Deformable Models A 2D deformable model or snake is a function that is fit to some real data, along its contours. The fitting mimizes: - internal energy of the contour (smoothness) - image energy (fit to data) - external energy (user- defined constraints)

3D Deformable Models In 3D, the snake concept becomes a balloon that expands to fill a point cloud of 3D data.

Matching Geometric Models via Alignment Alignment is the most common paradigm for matching 3D models to either 2D or 3D data. The steps are: 1. hypothesize a correspondence between a set of model points and a set of data points 2. From the correspondence compute a transformation from model to data 3. Apply the transformation to the model features to produce transformed features 4. Compare the transformed model features to the image features to verify or disprove the hypothesis

3D-3D Alignment of Mesh Models to Mesh Data Older Work: match 3D features such as 3D edges and junctions or surface patches More Recent Work: match surface signatures - curvature at a point - curvature histogram in the neighborhood of a point - Medioni’s splashes - Johnson and Hebert’s spin images

The Spin Image Signature P is the selected vertex. X is a contributing point of the mesh.  is the perpendicular distance from X to P’s surface normal.  is the signed perpendicular distance from X to P’s tangent plane. X n  tangent plane at P P 

Spin Image Construction A spin image is constructed - about a specified oriented point o of the object surface - with respect to a set of contributing points C, which is controlled by maximum distance and angle from o. It is stored as an array of accumulators S(,) computed via: For each point c in C(o) 1. compute  and  for c. 2. increment S (,) o

Spin Image Matching ala Sal Ruiz

Spin Images Object Recognition Offline: Compute spin images of each vertex of the object model(s) 1. Compute spin images at selected points of a scene. 2. Compare scene spin images with model scene images by correlation or related method. 3. Group strong matches as in pose clustering and eliminate outliers. 4. Use the winning pose transformation to align the model to the image points and verify or disprove.

2D-3D Alignment single 2D images of the objects 3D object models - full 3D models, such as GC or SEV - view class models representing characteristic views of the objects

View Classes and Viewing Sphere The space of view points can be partitioned into a finite set of characteristic views. Each view class represents a set of view points that have something in common, such as: 1. same surfaces are visible 2. same line segments are visible 3. relational distance between pairs of them is small V v

3 View Classes of a Cube 1 surface 2 surfaces 3 surfaces

TRIBORS: view class matching of polyhedral objects Each object had 4-5 view classes (hand selected) The representation of a view class for matching included: - triplets of line segments visible in that class - the probability of detectability of each triplet determined by graphics simulation

RIO: Relational Indexing for Object Recognition RIO worked with more complex parts that could have - planar surfaces - cylindrical surfaces - threads

Object Representation in RIO 3D objects are represented by a 3D mesh and set of 2D view classes. Each view class is represented by an attributed graph whose nodes are features and whose attributed edges are relationships. For purposes of indexing, attributed graphs are stored as sets of 2-graphs, graphs with 2 nodes and 2 relationships. share an arc coaxial arc cluster ellipse

RIO Features ellipses coaxials coaxials-multi parallel lines junctions triples close and far L V Y Z U

RIO Relationships share one arc share one line share two lines coaxial close at extremal points bounding box encloses / enclosed by

Hexnut Object What other features and relationships can you find?

Graph and 2-Graph Representations 1 coaxials- multi encloses 1 1 2 3 2 3 3 2 encloses 2 ellipse e e e c encloses 3 parallel lines coaxial

Relational Indexing for Recognition Preprocessing (off-line) Phase for each model view Mi in the database encode each 2-graph of Mi to produce an index store Mi and associated information in the indexed bin of a hash table H

Matching (on-line) phase Construct a relational (2-graph) description D for the scene For each 2-graph G of D Select the Mis with high votes as possible hypotheses Verify or disprove via alignment, using the 3D meshes encode it, producing an index to access the hash table H cast a vote for each Mi in the associated bin

The Voting Process

RIO Verifications 1. The matched features of the hypothesized incorrect hypothesis 1. The matched features of the hypothesized object are used to determine its pose. 2. The 3D mesh of the object is used to project all its features onto the image. 3. A verification procedure checks how well the object features line up with edges on the image.

Functional Models (Stark and Bowyer) Classes of objects are defined through their functions. Knowledge primitives are parameterized procedures that check for basic physical concepts such as - dimensions - orientation - proximity - stability - clearance - enclosure

Example: Chair

Functional Recognition Procedure Segment the range data into surfaces Use a bottom-up analysis to determine all functional properties From this, construct indexes that are used to rank order the possible objects and prune away the impossible ones Use a top-down approach to fully test for the most highly ranked categories. What are the strengths and weaknesses of this approach?

Recognition by Appearance Appearance-based recognition is a competing paradigm to features and alignment. No features are extracted! Images are represented by basis functions (eigenvectors) and their coefficients. Matching is performed on this compressed image representation.

Linear subspaces Consider the sum squared distance of a point x to all of the orange points: What unit vector v minimizes SSD? What unit vector v maximizes SSD? Solution: v1 is eigenvector of A with largest eigenvalue v2 is eigenvector of A with smallest eigenvalue

Principle component analysis Suppose each data point is N-dimensional Same procedure applies: The eigenvectors of A define a new coordinate system eigenvector with largest eigenvalue captures the most variation among training vectors x eigenvector with smallest eigenvalue has least variation We can compress the data by only using the top few eigenvectors corresponds to choosing a “linear subspace” represent points on a line, plane, or “hyper-plane”

The space of faces + = An image is a point in a high dimensional space An N x M image is a point in RNM We can define vectors in this space

Dimensionality reduction We can find the best subspace using PCA Suppose it is K dimensional This is like fitting a “hyper-plane” to the set of faces spanned by vectors v1, v2, ..., vK any face x  a1v1 + a2v2 + , ..., + aKvK The set of faces is a “subspace” of the set of images.

Turk and Pentland’s Eigenfaces: Training Let F1, F2,…, FM be a set of training face images. Let F be their mean and i = Fi – F Use principal components to compute the eigenvectors and eigenvalues of the covariance matrix of the i s Choose the vector u of most significant M eigenvectors to use as the basis. Each face is represented as a linear combination of eigenfaces u = (u1, u2, u3, u4, u5); F27 = a1*u1 + a2*u2 + … + a5*u5

Matching unknown face image I convert to its eigenface representation Find the face class k that minimizes k = ||  - k ||

training images 3 eigen- images mean image linear approxi- mations

Extension to 3D Objects Murase and Nayar (1994, 1995) extended this idea to 3D objects. The training set had multiple views of each object, on a dark background. The views included multiple (discrete) rotations of the object on a turntable and also multiple (discrete) illuminations. The system could be used first to identify the object and then to determine its (approximate) pose and illumination.

Significance of this work The extension to 3D objects was an important contribution. Instead of using brute force search, the authors observed that All the views of a single object, when transformed into the eigenvector space became points on a manifold in that space. Using this, they developed fast algorithms to find the closest object manifold to an unknown input image. Recognition with pose finding took less than a second.

Sample Objects Columbia Object Recognition Database

Appearance-Based Recognition Training images must be representative of the instances of objects to be recognized. The object must be well-framed. Positions and sizes must be controlled. Dimensionality reduction is needed. It is still not powerful enough to handle general scenes without prior segmentation into relevant objects.-- my comment Hybrid systems (features plus appearance) seem worth pursuing.