Silhouette-based Object Phenotype Recognition using 3D Shape Priors Yu Chen 1 Tae-Kyun Kim 2 Roberto Cipolla 1 University of Cambridge, Cambridge, UK 1.


Silhouette-based Object Phenotype Recognition using 3D Shape Priors. Yu Chen (1), Tae-Kyun Kim (2), Roberto Cipolla (1). (1) University of Cambridge, Cambridge, UK; (2) Imperial College, London, UK.

Problem Description Task: identify the phenotype class of a deformable object, given a gallery of canonical-pose silhouettes from different phenotype classes. Can we find out which class a query silhouette belongs to?

Problem Description Motivation: –Pose recognition is widely investigated; –Phenotype recognition is somewhat overlooked; –Applications? Difficulty: –Pose and camera viewpoint variations are more dominant than the phenotype variation.

Problem Description 2D approaches hardly work in this case. Our strategy: make use of the 3D shape prior of deformable objects. Shall we use a purely generative approach? No! Too expensive to perform for a recognition task!

Solution: Two-Stage Model Main Ideas: Discriminative + Generative Two stages: 1. Hypothesising –Discriminative; –Using random forests; 2. Shape Synthesis and Verification –Generative; –Synthesising 3D shapes using shape priors; –Silhouette verification. Recognition by a model selection process.

Parameter Hypothesizing Use 3 random forests (RFs) to quickly hypothesize phenotype, pose, and camera parameters. Learned on synthetic silhouettes generated by the shape priors. F_A: pose classifier; F_C: camera pose classifier; F_S: phenotype classifier (canonical pose).

Examples of Tree Classifiers [Figure: the phenotype classifier; the pose classifier]

Training RF Classifiers Random Features: –Rectangle pairs with random sizes and locations. –Difference of mean intensity values [Shotton et al. 09] –Feature error compensation for phenotype classifier; Criteria Function: –Similarity-aware diversity index.
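
The rectangle-pair feature listed above can be sketched in a few lines. This is a minimal illustration, assuming a binary silhouette stored as a list of rows and a summed-area table for fast rectangle means; the function names (`integral_image`, `rect_mean`, `random_rect`, `rect_pair_feature`) are hypothetical and not taken from the slides or from [Shotton et al. 09].

```python
import random

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def rect_mean(sat, top, left, height, width):
    """Mean intensity inside a rectangle, read from the table in O(1)."""
    bottom, right = top + height, left + width
    total = sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]
    return total / (height * width)

def random_rect(h, w, rng):
    """Draw a rectangle (top, left, height, width) that fits in an h x w image."""
    height = rng.randint(1, h)
    width = rng.randint(1, w)
    top = rng.randint(0, h - height)
    left = rng.randint(0, w - width)
    return (top, left, height, width)

def rect_pair_feature(sat, rect_a, rect_b):
    """Feature value: difference of the mean intensities of two rectangles."""
    return rect_mean(sat, *rect_a) - rect_mean(sat, *rect_b)

# Toy 4x4 silhouette (1 = foreground), used only for illustration.
silhouette = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
sat = integral_image(silhouette)
centre_vs_whole = rect_pair_feature(sat, (1, 1, 2, 2), (0, 0, 4, 4))
```

In a forest, each split node would draw its own rectangle pair with `random_rect` and threshold the resulting feature value; how the thresholds are chosen is not specified in the slides.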

Shape Synthesis and Verification Generate 3D shapes V –From candidate parameters given by RFs. –Use GPLVM shape priors [Chen et al. ’10]. Compare the projection of V with the query silhouette S_q. –Oriented Chamfer matching (OCM) [Stenger et al. ’03].
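
The verification step compares edge points of the projected shape with those of the query silhouette. The sketch below is a simplified, brute-force stand-in for oriented Chamfer matching, not the formulation of the cited work: each edge point is assumed to be a tuple (x, y, theta), the cost is one-directional (template to query), and the `weight` that trades off position against orientation is a hypothetical parameter.

```python
import math

def orientation_gap(a, b):
    """Smallest angle between two undirected edge orientations, in [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def oriented_chamfer(template, query, weight=1.0):
    """Mean, over template edge points, of the cheapest match in the query.

    The cost of a match is the Euclidean distance between the two points
    plus `weight` times their orientation gap (an assumed combination).
    """
    total = 0.0
    for tx, ty, tt in template:
        best = min(
            math.hypot(tx - qx, ty - qy) + weight * orientation_gap(tt, qt)
            for qx, qy, qt in query
        )
        total += best
    return total / len(template)

# Toy contours: a horizontal edge and the same edge shifted down by one pixel.
projected = [(0, 0, 0.0), (1, 0, 0.0)]
observed = [(0, 1, 0.0), (1, 1, 0.0)]
cost = oriented_chamfer(projected, observed)
```

A practical implementation would typically replace the inner minimum with a precomputed distance transform per orientation channel; the brute-force loop is kept here only for readability.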

Experiments Testing data: –Manually segmented silhouettes; Current Datasets –Human jumping jack (13 instances, 170 images); –Human walking (16 instances, 184 images); –Shark swimming (13 instances, 168 images). Phenotype Categorisation

Comparative Approaches: Learn a single RF phenotype classifier; Histogram of Shape Context (HoSC) –[Agarwal and Triggs, 2006] Inner-Distance Shape Context (IDSC) –[Ling and Jacobs, 2007] 2D Oriented Chamfer matching (OCM) –[Stenger et al. 2006] Mixture of Experts for shape reconstruction –[Sigal et al. 2007] –Modified into a recognition algorithm.

Comparative Approaches: Internal comparisons: –Proposed method with both feature error modelling and similarity-aware criteria function (G+D); –Proposed method without feature error modelling (G+D–E); –Proposed method without similarity-aware criteria function (G+D–S), using the standard diversity index instead.

Recognition Performance Cross-validation by splitting the dataset instances. 5 phenotype categories for every test. Selecting one instance from each category.

Recognition Performance How do the parameters of the RFs affect performance? –Maximum tree depth d_max –Number of trees N_T

Qualitative Results of Single View Reconstruction (SVR) Left: input image/silhouette; Centre: using the RF hypotheses; Right: using the optimisation-based approach.


Take-Home Messages Phenotype recognition is difficult but still possible; Combining discriminative and generative cues can greatly speed up the inference; A divide-and-conquer strategy can help improve the recognition rate.

Future Work Explore the application on more complicated poses and more categories. –E.g. Boxing, gardening, other sports, etc. Data collection; Automate the silhouette extraction. –E.g. Kinect.

The End Questions?

Feature Error Modelling Purpose: –For learning the phenotype classifier F_S; –To reduce systematic errors between synthetic and real silhouettes. Error modelling dataset: –Several pairs of synthetic and real silhouettes with different error modes; –Compute the feature difference for each pair. Error compensation of the training data: –Find nearest neighbour silhouettes in the error modelling dataset. –Compensate the features of synthetic silhouettes with corresponding error vectors. [Figure: synthetic silhouette, real silhouette, error vector e]
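
The compensation procedure above can be sketched as follows. This is an illustration under stated assumptions: silhouettes are already reduced to feature vectors, the error vector is taken as real minus synthetic, the nearest neighbour is found by Euclidean distance in feature space, and the names `build_error_model` and `compensate` are hypothetical. The slides do not specify any of these details.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def build_error_model(pairs):
    """Turn (synthetic_features, real_features) pairs into
    (synthetic_features, error_vector) entries, with error = real - synthetic."""
    return [(syn, [r - s for s, r in zip(syn, real)]) for syn, real in pairs]

def compensate(features, error_model):
    """Add the error vector of the nearest synthetic entry to `features`."""
    _, error = min(error_model, key=lambda entry: squared_distance(features, entry[0]))
    return [f + e for f, e in zip(features, error)]

# Made-up feature vectors for two synthetic/real silhouette pairs.
pairs = [
    ([0.0, 0.0], [0.1, -0.1]),
    ([1.0, 1.0], [0.8, 1.2]),
]
error_model = build_error_model(pairs)
corrected = compensate([0.9, 1.1], error_model)  # nearest entry is [1.0, 1.0]
```

The corrected vectors would then replace the raw synthetic features when training the phenotype classifier.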

Similarity-aware Criteria Functions Tree learning: maximise the drop in the criteria function (impurity) at each node. Observation: some classes are more similar to each other than others. –Generalise Gibbs and Martin’s diversity index; –Take the class similarity into account; –The similarity matrix W is defined by the average 3D mesh difference between phenotype classes. [Figure: class pairs ordered from more similar to less similar]
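
The slide does not give the formula, so the sketch below uses one plausible generalisation, stated here as an assumption: D_W(p) = 1 - sum_ij W_ij p_i p_j, where p holds the class proportions at a node. With W equal to the identity this reduces to Gibbs and Martin's index 1 - sum_i p_i^2. The matrix values are invented for illustration and are not derived from 3D meshes.

```python
def class_proportions(labels, num_classes):
    """Fraction of samples in each class at a node."""
    counts = [0] * num_classes
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]

def similarity_aware_index(p, W):
    """Assumed form: 1 - p^T W p. Equals the Gibbs-Martin index when W = I."""
    n = len(p)
    return 1.0 - sum(W[i][j] * p[i] * p[j] for i in range(n) for j in range(n))

def impurity_drop(parent, left, right, num_classes, W):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    def impurity(labels):
        return similarity_aware_index(class_proportions(labels, num_classes), W)
    n = len(parent)
    return impurity(parent) - len(left) / n * impurity(left) - len(right) / n * impurity(right)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical similarities: classes 0 and 1 are alike, class 2 is distinct.
W = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]

parent = [0, 0, 1, 1, 2, 2]
split_a = ([0, 0, 1, 1], [2, 2])  # separates the distinct class first
split_b = ([0, 0, 2, 2], [1, 1])  # separates one of the two similar classes

drop_a = impurity_drop(parent, *split_a, 3, W)
drop_b = impurity_drop(parent, *split_b, 3, W)
```

Under the standard index both splits score the same, whereas under this assumed similarity-aware form the split that isolates the dissimilar class scores higher, which is the kind of behaviour the observation on the slide points to.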

Comparative Approaches: Learn a single RF phenotype classifier. Training Data: –Generated by the 3D shape priors; –Phenotype classes are uniformly sampled from the latent space; –Various poses/camera viewpoints for each class. Recognition: –Compare the RF histograms between the query and each gallery image. –χ² distance. [Figure: random forest with outputs Phenotype 1, Phenotype 2, ..., Phenotype N]
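
The recognition step of this baseline can be sketched as a nearest-neighbour search under the χ² distance. The 1/2 factor and the small `eps` guard against empty bins are common conventions assumed here, and the gallery names and histogram values are made up for illustration.

```python
def chi2_distance(h, g, eps=1e-12):
    """Chi-squared distance between two histograms of equal length."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h, g))

def nearest_gallery(query_hist, gallery):
    """Return the name of the gallery entry whose histogram is closest to the query."""
    return min(gallery, key=lambda name: chi2_distance(query_hist, gallery[name]))

# Hypothetical RF output histograms over three phenotype classes.
gallery = {
    "instance_a": [0.7, 0.2, 0.1],
    "instance_b": [0.1, 0.3, 0.6],
}
query = [0.6, 0.3, 0.1]
match = nearest_gallery(query, gallery)
```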

Approximate Single View Reconstruction Use the 3D shape hypotheses V. Contrast with the results of the optimisation-based approach [Chen et al. ’10]. Performance: –Fairly good accuracy; –More than 50x faster than the optimisation-based approach.