Bilinear models for action and identity recognition Oxford Brookes Vision Group 26/01/2009 Fabio Cuzzolin.

Bilinear models for invariant gaitID
The identity recognition problem
View-invariance in gaitID
Bilinear models
HMMs and a three-layer model
Four experiments on the Mobo database

Identity recognition from gait
Biometrics are increasingly popular.
Cooperative methods: face recognition, retinal analysis.
Surveillance context: non-cooperative users.
The problem: recognizing the identity of humans from their gait.
Methods: dimensionality reduction, silhouette analysis.
Issues: nuisance factors, viewpoint dependence.

A brief review
Gait signatures: silhouettes [Collins 02, Wang 03], optical flow, velocity moments, shape symmetry, static body parameters.
Baseline algorithm [Sarkar 05]: computes similarity scores between a probe sequence and each gallery (training) sequence by pairwise frame correlation.
Methodologies: mostly pattern recognition after dimensionality reduction, using eigenspaces [Abdelkader 01] or PCA/MDA [Tolliver 03, Han 04].
Stochastic models (HMMs) [Kale 02, Debrunner 00]: KL-divergence between Markov models.

The view-invariance issue
Many different nuisance factors are involved: viewpoint, illumination, clothes, shoes, carried objects, trajectory.
The big issue is view-invariance.
Possible approaches: 3D tracking, virtual view reconstruction, static body parameters.

Approaches to view-invariant gait ID
[Cunado 99]: evidence gathering technique; coupled oscillators, Fourier description, inclination of thigh and leg.
[Urtasun, Fua 04]: fitting 3D temporal motion models to synchronized video sequences; the motion parameters are the coefficients of the singular value decomposition of the estimated model angles.
[Bhanu, Han 02]: matching a 3D kinematic model to 2D silhouettes and extracting a number of feature angles from the fitted model.
[Kale 03]: synthetic side-view of the moving person using a single camera.
[Shakhnarovich 01]: view-normalization from volumetric intersection of the visual hulls.
[Johnson, Bobick 01]: static body parameters recovered across multiple views.

Bilinear models
From view-invariance to style invariance: motions usually possess several labels (action, identity, viewpoint, emotional state, etc.).
Bilinear models (Tenenbaum) can be used to separate the influence of two of those factors, called style and content (the label to classify):
$y^{SC} = A^S b^C$
Here the $y^{SC}$ form a training set of $k$-dimensional observations with labels $S$ and $C$, $b^C$ is a parameter vector representing content, and $A^S$ is a style-specific linear map from the content space onto the observation space.

Bilinear models
The content (identity, action) of an observation can be thought of as a vector $b^C$ in an abstract content space of some dimension $J$.
Observations $y^{SC}$ are then derived from the content vector linearly, through a map $A^S$ which depends on the style parameter $S$.
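The generative reading of the model above can be sketched in a few lines of NumPy. This is only an illustration: the dimensions, the random parameters and the helper name `observe` are assumptions made for the example, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed): observation dimension K, content dimension J.
K, J = 6, 3
n_styles, n_contents = 2, 4

# One K x J linear map per style, one J-dimensional vector per content class.
A = rng.normal(size=(n_styles, K, J))
b = rng.normal(size=(n_contents, J))

def observe(s, c):
    """Observation for style s and content c: y = A[s] @ b[c]."""
    return A[s] @ b[c]

# All observations at once: Y[s, c] is the K-vector for the pair (s, c).
Y = np.einsum("skj,cj->sck", A, b)
```

Changing `s` while keeping `c` fixed changes only the linear map applied to the same content vector, which is what lets the two factors be separated.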

Learning an asymmetric bilinear model
Given an observation sequence $y^{SC}$, an asymmetric bilinear model can be fitted to the data through the SVD $Y = U \Sigma V^T$ of a stacked observation matrix $Y$.
The asymmetric model can be written as $Y = AB$, where the least-squares optimal style and content parameters are the first $J$ columns of $U \Sigma$ (style, $A$) and the first $J$ rows of $V^T$ (content, $B$).
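A minimal sketch of the SVD fit, assuming the stacked matrix has one column per content class and stacks the K-dimensional (mean) observations of all styles vertically. The function name `fit_asymmetric` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def fit_asymmetric(Y, J):
    """Fit Y ~ A @ B with a rank-J truncated SVD.

    Y: (S*K) x C stacked observation matrix (assumed layout: the K rows of
       style 0 first, then style 1, and so on; one column per content class).
    Returns A ((S*K) x J, stacked style maps) and B (J x C, content vectors).
    """
    U, sv, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * sv[:J]   # first J columns of U * Sigma
    B = Vt[:J, :]           # first J rows of V^T
    return A, B

# Synthetic rank-J data, purely for illustration.
rng = np.random.default_rng(0)
S, K, C, J = 3, 5, 4, 2
Y = rng.normal(size=(S * K, J)) @ rng.normal(size=(J, C))

A, B = fit_asymmetric(Y, J)
A_per_style = A.reshape(S, K, J)   # A_per_style[s] is the K x J map of style s
```

Because the synthetic `Y` has rank exactly `J`, the product `A @ B` reproduces it; on real data the truncation gives the best rank-`J` approximation in the least-squares sense.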

Content classification of unknown style
Consider a training set in which persons (content = ID) are seen walking from different viewpoints (style = viewpoint).
When new motions are acquired in which a known person is walking from a different viewpoint (unknown style), an iterative EM procedure can be set up to classify the content (identity).
E step: estimation of p(c|s), the probability of the content given the current estimate s of the style.
M step: estimation of the linear map for the unknown style s.
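The EM loop can be sketched as follows, under assumptions that are not stated on the slide: isotropic Gaussian noise with variance `sigma2` around the prediction of each content class, a uniform prior over content classes, and an initial guess `A_init` for the new style map. The function name and the toy data are hypothetical.

```python
import numpy as np

def classify_unknown_style(Y_new, B, A_init, sigma2=0.1, n_iter=20):
    """Y_new: N x K observations in the new style; B: J x C known content
    vectors; A_init: K x J initial guess for the new style's linear map."""
    A = A_init.copy()
    for _ in range(n_iter):
        # E step: posterior over content classes given the current map A.
        pred = (A @ B).T                                   # C x K predictions
        d2 = ((Y_new[:, None, :] - pred[None, :, :]) ** 2).sum(axis=2)
        logp = -d2 / (2.0 * sigma2)
        logp -= logp.max(axis=1, keepdims=True)            # numerical stability
        P = np.exp(logp)
        P /= P.sum(axis=1, keepdims=True)                  # N x C
        # M step: weighted least-squares update of the style map.
        num = Y_new.T @ P @ B.T                            # K x J
        den = (B * P.sum(axis=0)) @ B.T                    # J x J
        A = num @ np.linalg.pinv(den)
    return P.argmax(axis=1), A

# Toy problem: 4 content classes, 5 noisy observations each, K = 8, J = 3.
rng = np.random.default_rng(0)
B = np.array([[2.0, 0.0, 0.0, 1.0],
              [0.0, 2.0, 0.0, 1.0],
              [0.0, 0.0, 2.0, 1.0]])
A_true = np.vstack([np.eye(3), np.eye(3), np.ones((2, 3))])
true_labels = np.repeat(np.arange(4), 5)
Y_new = (A_true @ B[:, true_labels]).T + 0.01 * rng.normal(size=(20, 8))
A_init = A_true + 0.05 * rng.normal(size=A_true.shape)   # illustration only

labels, A_hat = classify_unknown_style(Y_new, B, A_init)
```

Here the initial map is just a perturbed copy of the true one so that the example stays short; in practice the starting point would have to come from the styles seen in training, and EM may converge to a poor local optimum from a bad start.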

Hidden Markov models
Finite-state representation of an observation process; the state process {X_k} is a Markov chain.
Given a sequence of observations (feature matrix), the parameters are learned with the EM algorithm (Moore).
A: transition probabilities (motion dynamics).
C: means of the state-output distributions (poses).
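To make the roles of A and C concrete, here is a small sampling sketch. It assumes a row-stochastic A (row i holds the probabilities of the next state given current state i) and Gaussian output noise around the columns of C; both conventions, the function name `sample_hmm` and the toy matrices are assumptions for the example.

```python
import numpy as np

def sample_hmm(A, C, n_steps, noise_std=0.1, seed=0):
    """A: n x n, A[i, j] = P(next state j | current state i) (assumed convention).
    C: d x n, column j is the mean output (pose) of state j.
    Returns the state path (n_steps,) and the observations (n_steps x d)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    states = np.empty(n_steps, dtype=int)
    states[0] = 0
    for k in range(1, n_steps):
        states[k] = rng.choice(n, p=A[states[k - 1]])
    noise = noise_std * rng.normal(size=(n_steps, C.shape[0]))
    obs = C[:, states].T + noise
    return states, obs

# Cyclic 3-state chain, loosely mimicking a periodic motion.
A = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.8, 0.2],
              [0.2, 0.0, 0.8]])
C = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, -1.0]])
states, obs = sample_hmm(A, C, n_steps=50)
```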

Motions as stacked HMMs
Interpretation of the C matrix: the columns of C are the means of the output distributions associated with the states of the model.
In gaitID (cyclic motions) the dynamics is the same for all sequences, so A is neglected.
A sequence can then be represented as a collection of poses: the stacked columns of the C matrix.
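Stacking the columns of C into a single observation vector is a one-line reshape. The helper name `stack_poses` and the toy matrix are illustrative; the sketch also assumes that states are ordered consistently across sequences, since otherwise the stacked vectors would not be comparable.

```python
import numpy as np

def stack_poses(C):
    """C: d x n matrix whose columns are state-output means (poses).
    Returns a vector of length d*n with the columns stacked one after another."""
    return C.reshape(-1, order="F")   # column-major: column 0, then column 1, ...

C = np.array([[1, 2, 3],
              [4, 5, 6]])
v = stack_poses(C)
```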

Three-layer model
First layer (feature representation): projection of the contour of the silhouette on a sheaf of lines passing through the center.
Second layer: each sequence is encoded as a Markov model, and its C matrix is stacked in an observation vector.
Third layer (bilinear model of HMMs): a bilinear model is trained over those vectors.

Mobo database
25 people performing 4 different walking actions, seen from 6 cameras.
Each sequence has three labels: action, id, view.

Four experiments
We can set up four experiments in which one label is chosen as content, another one as style, and the remaining one is considered a nuisance factor.
experiment                          content  style   nuisance
view-invariant action recognition   action   view    ID
ID-invariant action recognition     action   ID      view
action-invariant gaitID             ID       action  view
view-invariant gaitID               ID       view    action

Results – ID versus view
Performance compared with the baseline algorithm and with straight k-NN on sequence HMMs.
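For reference, a k-NN comparison of this kind can be sketched as below. The slide does not say which distance or which value of k was used; the Euclidean distance on stacked HMM vectors, the function name `knn_predict` and the toy data are assumptions.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k training vectors closest to x
    (Euclidean distance, assumed here)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: each row stands in for the stacked C matrix of one sequence.
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
train_y = np.array([0, 0, 1, 1])
pred = knn_predict(train_X, train_y, np.array([4.8, 5.1]), k=3)
```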

Results – ID versus action
Performance of the bilinear classifier in the ID vs action experiment as a function of the nuisance (view = 1:5), averaged over all the possible choices of the test action.
The average best-match performance of the bilinear classifier is shown in solid red (minimum and maximum in magenta); the best-3 matches ratio is shown in dotted red.

Feature extraction
Type 1: projection of the contour of the silhouette on a sheaf of lines passing through the center.
Type 2: size functions [Frosini 90].
Type 3: Lee's moments.

Results - influence of features
Left: ID-invariant action recognition using the bilinear classifier. The entire dataset is considered, regardless of the viewpoint. The correct classification percentage is shown as a function of the test identity in black (for models using Lee's features) and red (contour projections). The related mean levels are drawn as dotted lines.
Right: view-invariant action recognition.

Conclusions
Covariance factors are of paramount importance in gaitID.
Bilinear and multilinear models provide a way to separate different factors.
We proposed a three-layer model in which sequences are represented through HMMs.
Some approaches to view-invariance are expensive and sensitive.
Experiments on the Mobo database show how effective separating factors is for motion classification.
Future work: multilinear models, testing on more realistic setups (many factors, USF database).