Pose Estimation in Heavy Clutter using a Multi-Flash Camera
Ming-Yu Liu, Oncel Tuzel, Ashok Veeraraghavan, Rama Chellappa, Amit Agrawal, and Haruhisa Okuda
Cambridge, Massachusetts

Object Pose Estimation for Robot Assembly Tasks
From human labor to robot labor: since the invention of interchangeable parts, assembly has been progressively automated, yet objects must still be carefully placed before the robot operates. How about objects piled randomly in a bin?
Computer-vision-based solution: the goal is to detect and localize a target object in a cluttered bin and to accurately estimate its pose using cameras. The robot can then use this estimate to grasp the object and perform subsequent manipulation.

System Overview
(Figure: algorithmic layout.)

Multi-Flash Camera
LEDs are sequentially switched on and off to create different illumination patterns.
We filter out the contribution of ambient light by computing J_i = I_i - I_ambient.
We normalize the illumination changes by computing the ratio images RI_i = J_i / J_max.
Depth edges are detected as bright-to-dark transitions in the ratio images.
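The ratio-image computation and the bright-to-dark transition test can be sketched in a few lines of numpy. The sketch below is only an illustration: the four axis-aligned flash directions, the threshold value, and the function name depth_edges_from_mfc are assumptions, not the authors' implementation.

```python
import numpy as np

def depth_edges_from_mfc(flash_images, ambient, eps=1e-6, thresh=0.2):
    """Minimal sketch of MFC depth-edge detection.

    flash_images: array of grayscale images, one per LED flash (num_flashes, H, W).
    ambient:      grayscale image captured with no flash (H, W).
    Returns a binary depth-edge map of shape (H, W).
    """
    J = np.maximum(np.asarray(flash_images, dtype=np.float64) - ambient, 0.0)  # remove ambient light
    J_max = J.max(axis=0) + eps                  # per-pixel maximum over all flashes
    RI = J / J_max                               # ratio images, one per flash

    edges = np.zeros(J.shape[1:], dtype=bool)
    # Each flash casts a shadow on the far side of a depth discontinuity, so a
    # depth edge shows up as a bright-to-dark transition in the corresponding
    # ratio image.  Here the flash layout is simplified to four axis-aligned
    # directions (left, right, up, down), which is an assumption of this sketch.
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    for ri, (dy, dx) in zip(RI, directions):
        shifted = np.roll(ri, shift=(dy, dx), axis=(0, 1))
        edges |= (ri - shifted) > thresh         # bright pixel next to a dark shadow pixel
    return edges
```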

Depth Edges
Comparison: edge detection using the Canny edge detector vs. depth edges obtained with the MFC.

Database Generation
The database is generated by rendering the CAD model of the object under sampled 3D rotations at a fixed location.
We sample k out-of-plane rotations uniformly over the viewing sphere and generate the corresponding depth-edge templates.
In-plane rotations are excluded from the database; the optimal in-plane rotation parameter is solved for during matching.
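One way to obtain approximately uniform out-of-plane viewpoints is a Fibonacci (golden-angle) sampling of the sphere. The sketch below is a hedged illustration of this database-generation step: the Fibonacci scheme and the placeholder render_depth_edges call are assumptions, since the slides do not specify how viewpoints are sampled or rendered.

```python
import numpy as np

def sample_viewpoints(k):
    """Approximately uniform out-of-plane viewpoints via a Fibonacci sphere.

    Returns k unit viewing directions; in-plane (roll) rotation is deliberately
    left out, matching the database design described above.
    """
    i = np.arange(k)
    golden = (1 + 5 ** 0.5) / 2
    z = 1 - 2 * (i + 0.5) / k            # uniform in z gives uniform area on the sphere
    theta = 2 * np.pi * i / golden       # golden-ratio spacing in azimuth
    r = np.sqrt(1 - z ** 2)
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

# Hypothetical database construction loop (render_depth_edges is a placeholder
# for a CAD renderer producing depth-edge templates; it is not specified here):
# templates = [render_depth_edges(cad_model, view) for view in sample_viewpoints(300)]
```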

Directional Chamfer Matching
We define the distance between a template edge map U = {u_i} (n points) and a query edge map V = {v_j} as
d_DCM(U, V) = (1/n) * sum_i min_j ( |u_i - v_j| + lambda * |phi(u_i) - phi(v_j)| ),
where phi(x) denotes the edge orientation at point x and lambda weights the orientation term. We then solve for the optimal alignment parameters
s_hat = argmin_s d_DCM( W(U; s), V ),
where W(U; s) warps the template edge points by the transformation s.
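Before any of the speed-ups described on the following slides, this cost can be evaluated by brute force. The numpy sketch below is a minimal illustration of the cost definition above; the value of the orientation weight lam is an assumption.

```python
import numpy as np

def dcm_cost(template_pts, template_ori, query_pts, query_ori, lam=6.0):
    """Brute-force directional chamfer cost between two edge maps.

    template_pts, query_pts: (n, 2) and (m, 2) arrays of edge-pixel coordinates.
    template_ori, query_ori: edge orientations in [0, pi) for each pixel.
    lam: weight of the orientation term (illustrative value).
    """
    # Pairwise location distances, shape (n, m).
    d_loc = np.linalg.norm(template_pts[:, None, :] - query_pts[None, :, :], axis=2)
    # Circular orientation difference, wrapped to [0, pi/2].
    d_ori = np.abs(template_ori[:, None] - query_ori[None, :])
    d_ori = np.minimum(d_ori, np.pi - d_ori)
    # For each template edge pixel, take the best-matching query pixel, then average.
    return np.mean(np.min(d_loc + lam * d_ori, axis=1))
```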

Search Optimization
The search problem requires optimization over the three parameters of a planar Euclidean transformation (2D translation and in-plane rotation) for each of the k templates stored in the database.
Given a 640x480 query image and a database of k = 300 edge templates, a brute-force search requires on the order of 10^10 evaluations of the cost function.
We perform search optimization in two stages:
1. A sublinear-time algorithm for computing the matching score.
2. A reduction of the three-dimensional search problem to one-dimensional queries.

Line Representation
We fit line segments to the depth edges, so each template pose is represented by a collection of m line segments.
Compared with the original set of edge points, which has cardinality n, the linear representation is more concise: it requires only O(m) memory to store an edge map, where m << n.
We use a variant of the RANSAC algorithm to compute the linear representation of an edge map.
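A greedy, RANSAC-style line-fitting routine of the kind described here might look as follows. This is a sketch under stated assumptions (iteration count, inlier tolerance, and minimum support are illustrative), not the authors' exact variant.

```python
import numpy as np

def fit_line_segments(edge_pts, n_iters=300, inlier_tol=1.5, min_support=20, rng=None):
    """Greedy RANSAC-style extraction of line segments from edge pixels.

    edge_pts: (n, 2) array of (x, y) edge-pixel coordinates.
    Returns a list of segments as ((x1, y1), (x2, y2)) endpoint pairs.
    """
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(edge_pts, dtype=np.float64)
    segments = []
    while len(pts) >= min_support:
        best_inliers, best_dir = None, None
        for _ in range(n_iters):
            i, j = rng.choice(len(pts), size=2, replace=False)
            d = pts[j] - pts[i]
            norm = np.linalg.norm(d)
            if norm < 1e-9:
                continue
            d = d / norm
            n_hat = np.array([-d[1], d[0]])           # unit normal of the candidate line
            dist = np.abs((pts - pts[i]) @ n_hat)     # point-to-line distances
            inliers = np.flatnonzero(dist < inlier_tol)
            if best_inliers is None or len(inliers) > len(best_inliers):
                best_inliers, best_dir = inliers, d
        if best_inliers is None or len(best_inliers) < min_support:
            break
        seg = pts[best_inliers]
        t = (seg - seg.mean(axis=0)) @ best_dir       # 1D coordinates along the fitted line
        p0 = seg.mean(axis=0)
        segments.append((tuple(p0 + t.min() * best_dir), tuple(p0 + t.max() * best_dir)))
        pts = np.delete(pts, best_inliers, axis=0)    # remove explained points and repeat
    return segments
```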

3D Distance Transform
A distance transform is an intermediate image representation in which each pixel is labeled with the distance to the nearest zero (edge) pixel.
Edge orientations are quantized, and a three-dimensional distance transform is built jointly over image location and orientation.
The 3D DT can be computed in time linear in the size of the image using dynamic programming.
Given the DT, the matching cost can be evaluated in O(n) operations, where n is the number of template edge pixels.
(Figure panels: input image, quantization, 2D distance transform, 3D distance transform.)
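The sketch below illustrates one way to build such a location-and-orientation distance-transform tensor with scipy: a 2D distance transform per quantized orientation channel, then a combination across channels with an orientation penalty. The channel count, the weight lam, and the brute-force combination over channel pairs (standing in for the linear-time dynamic-programming recursion mentioned on the slide) are assumptions made for clarity.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def dt3(edge_pts, edge_ori, shape, n_channels=16, lam=6.0):
    """Sketch of a joint location/orientation distance-transform tensor.

    edge_pts: (m, 2) integer (row, col) coordinates of query edge pixels.
    edge_ori: (m,) edge orientations in [0, pi).
    Returns a tensor of shape (n_channels, H, W) whose entry [q, y, x] approximates
    the minimum over edge pixels of (spatial distance + lam * orientation distance).
    """
    H, W = shape
    edge_pts = np.asarray(edge_pts, dtype=int)
    bins = (np.asarray(edge_ori) / np.pi * n_channels).astype(int) % n_channels

    # One 2D distance transform per quantized orientation channel.
    dt2 = np.empty((n_channels, H, W))
    for q in range(n_channels):
        sel = edge_pts[bins == q]
        if len(sel) == 0:
            dt2[q] = np.hypot(H, W)            # no edges with this orientation: large constant
            continue
        mask = np.ones((H, W), dtype=bool)
        mask[sel[:, 0], sel[:, 1]] = False     # zeros at the edge pixels of this channel
        dt2[q] = distance_transform_edt(mask)  # distance to the nearest zero pixel

    # Combine channels with a circular orientation penalty.  The linear-time
    # forward/backward recursion is replaced here by a brute-force pass over
    # channel pairs, which is simpler to read but O(q^2) per pixel.
    chan_step = np.pi / n_channels
    out = np.full_like(dt2, np.inf)
    for q in range(n_channels):
        for p in range(n_channels):
            d_ori = min(abs(q - p), n_channels - abs(q - p)) * chan_step
            out[q] = np.minimum(out[q], dt2[p] + lam * d_ori)
    return out
```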

Directional Integral Images
Summing the cost over each template edge pixel still requires O(n) operations.
This summation can be computed for all the points on a line segment in constant time using directional integral images.
We compute 1D directional integral images in one pass over the 3D distance transform tensor.
Using the integral representation, the matching cost of the template at a hypothesized location can be computed in O(m) operations, where m is the number of lines in the template and m << n.
(Figure: integral distance transform.)
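To make the constant-time segment cost concrete, the sketch below shows the idea for the horizontal-direction channel only, where the directional integral image reduces to a cumulative sum along image rows; the full method keeps one such integral image per quantized direction. Function names are illustrative.

```python
import numpy as np

def directional_integral(dt3_channel):
    """Prefix sums of one orientation channel of the DT3 tensor along its direction.

    Simplification: only the horizontal-direction channel is handled here, so the
    integral is a cumulative sum along image rows.
    """
    return np.cumsum(dt3_channel, axis=1)

def segment_cost(integral, y, x1, x2):
    """Cost summed over the horizontal segment (x1, y)..(x2, y), in O(1).

    Equivalent to dt3_channel[y, x1:x2 + 1].sum(), but with two lookups.
    """
    left = integral[y, x1 - 1] if x1 > 0 else 0.0
    return integral[y, x2] - left
```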

1D Line Search
The linear representation provides an efficient way to reduce the size of the search space.
We rotate and translate the template so that its major line segment is aligned with the direction of the major query-image line segment; the template is then translated along the query segment.
The search time is therefore invariant to the size of the image and is only a function of the number of template and query-image lines.
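The sketch below generates the candidate in-plane rotations and translations produced by such line-to-line alignment; scoring each candidate (with the directional chamfer cost via the integral images) is a separate step. Pairing every template line with every query line and the sliding step size are assumptions made for illustration.

```python
import numpy as np

def line_search_candidates(template_lines, query_lines, step=2.0):
    """Candidate 2D rigid transforms obtained by aligning template lines with query lines.

    Each line is ((x1, y1), (x2, y2)).  For every (template line, query line) pair,
    the template is rotated so the two directions agree and then slid along the
    query segment in increments of `step` pixels.
    """
    def endpoints_and_angle(seg):
        p1, p2 = np.asarray(seg[0], float), np.asarray(seg[1], float)
        return p1, p2, np.arctan2(*(p2 - p1)[::-1])       # angle of the segment

    candidates = []
    for t_seg in template_lines:
        t1, _, t_ang = endpoints_and_angle(t_seg)
        for q_seg in query_lines:
            q1, q2, q_ang = endpoints_and_angle(q_seg)
            angle = q_ang - t_ang                         # rotation aligning the two lines
            c, s = np.cos(angle), np.sin(angle)
            R = np.array([[c, -s], [s, c]])
            length = np.linalg.norm(q2 - q1)
            u = (q2 - q1) / max(length, 1e-9)
            for t in np.arange(0.0, length, step):        # slide along the query segment
                trans = q1 + t * u - R @ t1               # maps the template endpoint onto the query line
                candidates.append((angle, trans))
    return candidates
```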

Pose Refinement
The scene is imaged with the MFC from a second location.
We refine the pose by jointly minimizing the reprojection error in the two views via continuous optimization (ICP-style correspondences and Gauss-Newton updates).
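A generic Gauss-Newton refinement loop, shown below only to illustrate the optimization step, iterates pose updates against a reprojection-residual function. The residual_fn interface and the numerical Jacobian are assumptions of this sketch; as stated on the slide, the actual system combines ICP-style correspondence updates between the model edges and the two MFC views with the Gauss-Newton iterations.

```python
import numpy as np

def gauss_newton(residual_fn, x0, n_iters=20, eps=1e-6):
    """Generic Gauss-Newton refinement (illustration only).

    residual_fn(x) should return the stacked reprojection residuals of the model
    edge points in both views for pose parameters x (e.g. a 6-vector of rotation
    and translation increments).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        r = residual_fn(x)
        # Numerical Jacobian, one column per pose parameter.
        J = np.column_stack([
            (residual_fn(x + eps * e) - r) / eps
            for e in np.eye(len(x))
        ])
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]   # least-squares Gauss-Newton step
        x = x + dx
        if np.linalg.norm(dx) < 1e-8:                # converged
            break
    return x
```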

Experiments on Synthetic Data
Detection performance comparison: detection rates of the proposed method, OCM [1], and Chamfer Matching [2] on the Circuit Breaker, Mitsubishi Logo, Ellipse, Toy, T-Nut, Knob, and Wheel objects, together with the averages (table).
(Figure: pose estimation in heavy clutter.)
[1] J. Shotton, A. Blake, and R. Cipolla, "Multiscale categorical object recognition using contour fragments," PAMI 2008.
[2] H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, and H. C. Wolf, "Parametric correspondence and chamfer matching: Two new techniques for image matching," in Proc. 5th Int. Joint Conf. Artificial Intelligence, 1977.

Experiments on Real Data (Matching)

Experiments on Real Data (Pose Refinement)

Pose Estimation Performance on Real Data
Normalized histograms of the deviations of the pose estimates from their medians.

Conclusion
1. The Multi-Flash Camera provides accurate separation of depth edges from texture edges and can be utilized for object pose estimation even in heavy clutter.
2. The Directional Chamfer Matching cost function provides a robust matching measure for detecting objects in heavy clutter.
3. The line representation, 3D distance transform, and directional integral images enable efficient template matching.
4. Experimental results show that the proposed system is highly accurate (about 1 mm in translation and 2° in rotation).

Thank You & System Demo