Shape Context and Chamfer Matching in Cluttered Scenes

Slides:

Advertisements

Similar presentations

Distinctive Image Features from Scale-Invariant Keypoints

Advertisements

Shape Matching and Object Recognition using Low Distortion Correspondence Alexander C. Berg, Tamara L. Berg, Jitendra Malik U.C. Berkeley.

Coherent Laplacian 3D protrusion segmentation Oxford Brookes Vision Group Queen Mary, University of London, 11/12/2009 Fabio Cuzzolin.

1 Hierarchical Part-Based Human Body Pose Estimation * Ramanan Navaratnam * Arasanathan Thayananthan Prof. Phil Torr * Prof. Roberto Cipolla * University.

Likelihood Models for Template Matching Using the PDF Projection Theorem Arasanathan Thayananthan Ramanan Navaratnam Dr. Phil Torr Prof. Roberto Cipolla.

Object Recognition Using Locality-Sensitive Hashing of Shape Contexts Andrea Frome, Jitendra Malik Presented by Ilias Apostolopoulos.

Distinctive Image Features from Scale-Invariant Keypoints David Lowe.

Chamfer Distance for Handshape Detection and Recognition CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.

Medical Image Registration Kumar Rajamani. Registration Spatial transform that maps points from one image to corresponding points in another image.

Presented by Xinyu Chang

Recovering Human Body Configurations: Combining Segmentation and Recognition Greg Mori, Xiaofeng Ren, and Jitentendra Malik (UC Berkeley) Alexei A. Efros.

The SIFT (Scale Invariant Feature Transform) Detector and Descriptor

Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.

Cambridge, Massachusetts Pose Estimation in Heavy Clutter using a Multi-Flash Camera Ming-Yu Liu, Oncel Tuzel, Ashok Veeraraghavan, Rama Chellappa, Amit.

Silhouette Lookup for Automatic Pose Tracking N ICK H OWE.

Qualifying Exam: Contour Grouping Vida Movahedi Supervisor: James Elder Supervisory Committee: Minas Spetsakis, Jeff Edmonds York University Summer 2009.

Contour Based Approaches for Visual Object Recognition Jamie Shotton University of Cambridge Joint work with Roberto Cipolla, Andrew Blake.

Simultaneous Segmentation and 3D Pose Estimation of Humans or Detection + Segmentation = Tracking? Philip H.S. Torr Pawan Kumar, Pushmeet Kohli, Matt Bray.

Computer Vision Group University of California Berkeley Shape Matching and Object Recognition using Shape Contexts Jitendra Malik U.C. Berkeley (joint.

1 Interest Operators Find “interesting” pieces of the image –e.g. corners, salient regions –Focus attention of algorithms –Speed up computation Many possible.

A Study of Approaches for Object Recognition

Human Posture Recognition with Convex Programming Hao Jiang, Ze-Nian Li and Mark S. Drew School of Computing Science Simon Fraser University Burnaby, BC,

Distinctive Image Feature from Scale-Invariant KeyPoints

Chamfer Matching & Hausdorff Distance Presented by Ankur Datta Slides Courtesy Mark Bouts Arasanathan Thayananthan.

Object Recognition Using Distinctive Image Feature From Scale-Invariant Key point D. Lowe, IJCV 2004 Presenting – Anat Kaspi.

Recognizing and Tracking Human Action Josephine Sullivan and Stefan Carlsson.

Scale Invariant Feature Transform (SIFT)

Face Recognition from Face Motion Manifolds using Robust Kernel RAD Ognjen Arandjelović Roberto Cipolla Funded by Toshiba Corp. and Trinity College, Cambridge.

Computer Vision Group University of California Berkeley Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA Greg Mori and Jitendra Malik.

Hand Signals Recognition from Video Using 3D Motion Capture Archive Tai-Peng Tian Stan Sclaroff Computer Science Department B OSTON U NIVERSITY I. Introduction.

Scale-Invariant Feature Transform (SIFT) Jinxiang Chai.

DVMM Lab, Columbia UniversityVideo Event Recognition Video Event Recognition: Multilevel Pyramid Matching Dong Xu and Shih-Fu Chang Digital Video and Multimedia.

כמה מהתעשייה? מבנה הקורס השתנה Computer vision.

Distinctive Image Features from Scale-Invariant Keypoints By David G. Lowe, University of British Columbia Presented by: Tim Havinga, Joël van Neerbos.

Computer vision.

Final Exam Review CS485/685 Computer Vision Prof. Bebis.

Internet-scale Imagery for Graphics and Vision James Hays cs195g Computational Photography Brown University, Spring 2010.

2D Shape Matching (and Object Recognition)

1 Interest Operators Harris Corner Detector: the first and most basic interest operator Kadir Entropy Detector and its use in object recognition SIFT interest.

CSE554AlignmentSlide 1 CSE 554 Lecture 5: Alignment Fall 2011.

Local invariant features Cordelia Schmid INRIA, Grenoble.

A 3D Model Alignment and Retrieval System Ding-Yun Chen and Ming Ouhyoung.

Video Based Palmprint Recognition Chhaya Methani and Anoop M. Namboodiri Center for Visual Information Technology International Institute of Information.

Shape Matching Tuesday, Nov 18 Kristen Grauman UT-Austin.

CS654: Digital Image Analysis Lecture 25: Hough Transform Slide credits: Guillermo Sapiro, Mubarak Shah, Derek Hoiem.

Local invariant features Cordelia Schmid INRIA, Grenoble.

Looking at people and Image-based Localisation Roberto Cipolla Department of Engineering Research team

Supervisor: Nakhmani Arie Semester: Winter 2007 Target Recognition Harmatz Isca.

Course14 Dynamic Vision. Biological vision can cope with changing world Moving and changing objects Change illumination Change View-point.

Presented by: Idan Aharoni

776 Computer Vision Jan-Michael Frahm Spring 2012.

Robotics Chapter 6 – Machine Vision Dr. Amit Goradia.

Tracking Hands with Distance Transforms Dave Bargeron Noah Snavely.

CSCI 631 – Foundations of Computer Vision March 15, 2016 Ashwini Imran Image Stitching.

Fast Human Detection in Crowded Scenes by Contour Integration and Local Shape Estimation Csaba Beleznai, Horst Bischof Computer Vision and Pattern Recognition,

Deformation Modeling for Robust 3D Face Matching Xioguang Lu and Anil K. Jain Dept. of Computer Science & Engineering Michigan State University.

Lecture 07 13/12/2011 Shai Avidan הבהרה: החומר המחייב הוא החומר הנלמד בכיתה ולא זה המופיע / לא מופיע במצגת.

Lecture 26 Hand Pose Estimation Using a Database of Hand Images

Image gradients and edges

Paper Presentation: Shape and Matching

Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science

Object recognition Prof. Graeme Bailey

CSE 455 – Guest Lectures 3 lectures Contact Interest points 1

Aim of the project Take your image Submit it to the search engine

The SIFT (Scale Invariant Feature Transform) Detector and Descriptor

Analyzing CAPTCHAs.

Comparing Images Using Hausdorff Distance

CSE 185 Introduction to Computer Vision

Presented by Xu Miao April 20, 2005

Shape Matching and Object Recognition

Presentation transcript:

Shape Context and Chamfer Matching in Cluttered Scenes Arasanathan Thayananthan Björn Stenger Dr. Phil Torr Prof. Roberto Cipolla

What? Why? How? What: track articulated hand motion through video This work: tracker initialization Why: to drive 3D avatar (HCI) How: Using shape matching Two competing methods: chamfer matching shape context matching

Goal: Hand Tracking [ICCV 03]

How to detect a hand? Comparison of matching methods Shape context vs. Chamfer matching Enhancements for shape context Robustness to clutter

Overview Shape Context matching in clutter Applications Difficulties Proposed enhancements Comparison with original shape context Applications Hand detection EZ-Gimpy recognition

Previous Work Shape Context [Belongie et al., 00] Invariance to translation and scale High performance in Digit recognition : MNIST dataset Silhouettes : MPEG-7 database Common household objects: COIL-20 database Chamfer Matching [Barrow et al., 77] efficient hierarchical matching [Borgefors, 88] pedestrian detection [Gavrila, 00]

Shape Context: Histogram Shape context of a point: log-polar histogram of the relative positions of all other points Similar points on shapes have similar histograms

Shape Context: Matching j Cost Matrix C Image Points Template Points 2 Test Optimal Correspondence Cost Function Bi-partite Graph Matching

Shape-Context: Matching

Scale Invariance in Clutter ? Median of pairwise point distances is used as scale factor Clutter will affect this scale factor 50.5 41.6 50.5 121.9

Scale Invariant in Clutter ? Significant clutter Unreliable scale factor Incorrect correspondences Solution Calculate shape contexts at different scales and match at different scales Computationally expensive

No Figural Continuity No continuity constraint Adjacent points in one shape are matched to distant points in the other

Multiple Edge Orientations Edge pixels are divided into 8 groups based on orientation Shape contexts are calculated separately for each group Total matching score is obtained by adding individual 2 scores

Single vs. Multiple Orientation

Imposing Figural Continuity v(i-2) v(i) v(i-1) ui and ui-1 are neighboring points on the model shape u  is the correspondence between two shape points Corresponding points v(i) and v(i-1) need to be neighboring points on target shape v

Imposing Figural Continuity v(i-2) v(i) v(i-1)

Imposing Figural Continuity Minimize the cost function for  Ordering of the model shape is known Use Viterbi Algorithm

With Figural Continuity Similar Shapes

With Figural Continuity Different Scale

With Figural Continuity Small Rotation

With Figural Continuity Shape Variation

With Figural Continuity Clutter

Chamfer Matching Matching technique cost is integral along contour Distance transform of the Canny edge map

Distance Transform (x,y) d Distance image gives the distance to the nearest edgel at every pixel in the image Calculated only once for each frame

Chamfer Matching Chamfer score is average nearest distance from template points to image points Nearest distances are readily obtained from the distance image Computationally inexpensive

Chamfer Matching Distance image provides a smooth cost function Efficient searching techniques can be used to find correct template

Chamfer Matching

Chamfer Matching

Chamfer Matching

Chamfer Matching

Chamfer Matching

Multiple Edge Orientations Similar to Gavrila, edge pixels are divided into 8 groups based on orientation Distance transforms are calculated separately for each group Total matching score is obtained by adding individual chamfer scores

Applications: Hand Detection Initializing a hand model for tracking Locate the hand in the image Adapt model parameters No skin color information used Hand is open and roughly fronto-parallel

Results: Hand Detection Shape Context with Continuity Constraint Original Shape Context Chamfer Matching

Results: Hand Detection Shape Context with Continuity Constraint Original Shape Context Chamfer Matching

Applications: CAPTCHA Completely Automated Public Turing test to tell Computers and Humans Apart [Blum et al., 02] Used in e-mail sign up for Yahoo accounts Word recognition with shape variation and added noise Examples:

EZ-Gimpy results Chamfer cost for each letter template Word matching cost: average chamfer cost + variance of distances Top 3 matches (dictionary 561 words) right 25.34 fight 27.88 night 28.42 93.2% correct matches using 2 templates per letter Shape context 92.1% [Mori & Malik, 03]

Discussion The original shape context matching Not invariant in clutter Iterative matching is used in the original shape context paper Correct point correspondence in the initial matching is quite small in substantial clutter Iterative matching will not improve the performance

Discussion Shape Context with Continuity Constraint Includes contour continuity & curvature Robust to substantial amount of clutter Much better correspondences and model alignment just from initial matching No need for iteration More robust to small variations in scale, rotation and shape.

Discussion Chamfer Matching Variant to scale and rotation More sensitive to small shape changes than shape context Need large number of template shapes But Robust to clutter Computationally cheap compared to shape context

Conclusion Use shape context when There is not much clutter There are unknown shape variations from the templates (e.g. two different types of fish) Speed is not the priority

Conclusion Chamfer matching is better when There is substantial clutter All expected shape variations are well-represented by the shape templates Robustness and speed are more important

Forthcoming work [ICCV 2003]

Webpage For more information on initialization and articulated hand tracking http://svr-www.eng.cam.ac.uk/~bdrs2/hand/hand.html