Lectureship Early Career Fellowship School of Technology, Oxford Brookes University 19/6/2008 Fabio Cuzzolin INRIA Rhone-Alpes.

Presentation transcript:

Career path: Master's thesis on gesture recognition at the University of Padova; visiting student at ESSRL, Washington University in St. Louis, and at the University of California at Los Angeles (2000); Ph.D. thesis on belief functions and uncertainty theory (2001); researcher at Politecnico di Milano with the Image and Sound Processing group ( ); post-doc at the University of California at Los Angeles, UCLA Vision Lab ( ); Marie Curie fellow at INRIA Rhone-Alpes

Scientific production and collaborations: currently 4+10 journal papers and 31+8 conference papers; collaborations with journals: IEEE PAMI, IEEE SMC-B, CVIU, Information Fusion, Int. J. Approximate Reasoning; PC member for VISAPP, FLAIRS, IMMERSCOM, ISAIM; collaborations with several groups: SIPTA, Setubal, CMU, Pompeu Fabra, EPFL-IDIAP, UBoston

My background research. Discrete math: linear independence on lattices and matroids. Uncertainty theory: geometric approach, algebraic analysis, generalized total probability. Machine learning: manifold learning for dynamical models. Computer vision: gesture and action recognition, 3D shape analysis and matching, gait ID, pose estimation

Computer Vision: action and gesture recognition; Laplacian segmentation and matching of 3D shapes; bilinear models for invariant gaitID. Machine learning: manifold learning for dynamical models. Uncertainty theory: geometric approach to measures; generalized total probability; vision applications and developments. Discrete math: unification of the notion of independence. New directions

HMMs for gesture recognition: transition matrix A -> gesture dynamics; state-output matrix C -> collection of hand poses. Hand poses were represented by size functions (BMVC'97)

Gesture classification: EM is used to learn HMM parameters from an input sequence; a new sequence is fed to the learnt gesture models (HMM 1, HMM 2, …, HMM n), each of which produces a likelihood; the most likely model is chosen (if above a threshold), OR the new model is attributed the label of the closest one (using K-L divergence or other distances)
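
The likelihood-based decision described on this slide can be sketched as follows. This is a minimal illustration only: it assumes discrete observation symbols, and the two toy models (`wave`, `hold`) have hand-picked parameters rather than parameters learnt by EM. The function names and the threshold convention (a log-likelihood bound) are hypothetical.

```python
import math

def log_likelihood(obs, pi, A, C):
    """Scaled forward algorithm: log p(obs | HMM) for discrete symbols.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    C[i][k] : probability of emitting symbol k while in state i
    """
    n = len(pi)
    alpha = [pi[i] * C[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * C[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def classify(obs, models, threshold=float("-inf")):
    """Return (label of the most likely model or None, all scores)."""
    scores = {label: log_likelihood(obs, *params)
              for label, params in models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

# Toy models (assumed values): "wave" alternates states, "hold" stays put.
emission = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "wave": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], emission),
    "hold": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emission),
}

label, scores = classify([0, 1, 0, 1, 0, 1], models)
print(label, scores)
```

The alternative decision rule on the slide (labelling a newly learnt model by its closest known model under K-L divergence) is not covered by this sketch.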

Compositional behavior of HMMs: the model of the action of interest is embedded in the overall model; clustering. Figures: cluttered model for two overlapping motions; reduced model for the fly gesture after clustering

Volumetric action recognition: 2D approaches, features extracted from images: viewpoint dependence; now available: synchronized multi-camera systems (Milano, BBC R&D); volumetric approach: features are extracted from volumetric reconstructions of the body (ICIP'04)

3D feature extraction: locally linear embedding to find a topological representation of the moving body; k-means clustering of bodyparts; linear discriminant analysis (LDA) to estimate motion direction

Unsupervised coherent 3D segmentation: to recognize actions we need to extract features; segmenting moving articulated 3D bodies into parts: along sequences, in a consistent way; in an unsupervised fashion; robustly, with respect to changes of the topology of the moving body; as a building block of a wider motion analysis and capture framework (ICCV-HM'07, CVPR'08, to submit to IJCV)

Clustering after Laplacian embedding: generates a lower-dim, widely separated embedded cloud; less sensitive to topology changes than other methods; less expensive than ISOMAP (refs. Jenkins, Chellappa); local neighbourhoods are stable under articulated motion

Algorithm: K-wise clustering in the embedding space
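
A rough sketch of clustering after a Laplacian embedding is given below. Several choices here are assumptions made for illustration and are not taken from the slides: a dense Gaussian affinity graph with an arbitrary `sigma`, the unnormalised graph Laplacian, and ordinary k-means with farthest-point seeding substituted for the K-wise clustering step. The eight-point cloud is a made-up toy example.

```python
import numpy as np

def laplacian_embedding(points, sigma, dim):
    """Embed points using eigenvectors of the graph Laplacian L = D - W."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))  # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    # Skip the constant eigenvector (eigenvalue 0), keep the next `dim`.
    return eigvecs[:, 1:dim + 1]

def kmeans(X, k, iters=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy 3D cloud (assumed data): two small blobs four units apart.
blob = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0],
                 [0.0, 0.3, 0.0], [0.0, 0.0, 0.3]])
cloud = np.vstack([blob, blob + np.array([4.0, 0.0, 0.0])])

embedded = laplacian_embedding(cloud, sigma=2.0, dim=1)
labels = kmeans(embedded, k=2)
print(labels)
```

On real voxel sets a sparse neighbourhood graph would normally replace the dense affinity matrix used here.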

Seed propagation along time: to ensure time consistency, cluster seeds have to be propagated along time; old positions of clusters in 3D are added to the new cloud and embedded. Result: new seeds

Results: coherent clustering along a sequence; example of model recovery

Results - 2: handling of topology changes, missing data

Laplacian matching of dense meshes or voxelsets: as embeddings are pose-invariant (for articulated bodies), they can be used to match dense shapes by simply aligning their images after embedding (ICCV '07 – NTRL, ICCV '07 – 3dRR, CVPR '08, submitted to ECCV'08, to submit to PAMI)

Eigenfunction histogram assignment. Algorithm: compute the Laplacian embedding of the two shapes; find an assignment between eigenfunctions of the two shapes; this selects a section of the embedding space; embeddings are orthogonally aligned there by EM

Results. Appls: graph matching, protein analysis, motion capture; to propagate bodypart segmentation in time; motion field estimation, action segmentation

Bilinear models for gait-ID: to recognize the identity of humans from their gait (CVPR '06, book chapter in progress); nuisance factors: emotional state, illumination, appearance, view invariance... (literature: randomized trees); each motion possesses several labels: action, identity, viewpoint, emotional state, etc.; bilinear models [Tenenbaum] can be used to separate the influence of style and content (to classify)

Content classification of unknown style: given a training set in which persons (content=ID) are seen walking from different viewpoints (style=viewpoint), an asymmetric bilinear model can be learned from it through SVD; when new motions are acquired in which a known person is seen walking from a different viewpoint (unknown style), an iterative EM procedure can be set up to classify the content. E step -> estimation of p(c|s), the prob. of the content given the current estimate s of the style; M step -> estimation of the linear map for unknown style s
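
For reference, the asymmetric bilinear model and the two EM steps can be written out as below. The notation ($A^{s}$, $b^{c}$, $K$, $J$, $\sigma$) follows the usual presentation of Tenenbaum's style/content models and is an assumption here, as is the isotropic Gaussian noise model; none of these symbols appear on the slide.

```latex
% Asymmetric bilinear model: observation y^{sc} \in \mathbb{R}^{K}
% for style s (viewpoint) and content c (identity)
y^{sc} = A^{s} b^{c}, \qquad A^{s} \in \mathbb{R}^{K \times J}, \quad b^{c} \in \mathbb{R}^{J}

% Training: stack the mean observations into an (SK) \times C matrix
% and keep the first J components of its SVD
Y = \begin{bmatrix}
      \bar{y}^{11} & \cdots & \bar{y}^{1C} \\
      \vdots       &        & \vdots       \\
      \bar{y}^{S1} & \cdots & \bar{y}^{SC}
    \end{bmatrix}
  = U \Sigma V^{\top}, \qquad
A = [\,U \Sigma\,]_{\text{first } J \text{ columns}}, \quad
B = [\,V^{\top}\,]_{\text{first } J \text{ rows}}

% E step: posterior of the content given the current estimate of the
% unknown style \tilde{s} (assumed Gaussian noise of variance \sigma^{2})
p(c \mid y_i, \tilde{s}) \propto
  \exp\!\left( -\frac{\lVert y_i - A^{\tilde{s}} b^{c} \rVert^{2}}{2 \sigma^{2}} \right) p(c)

% M step: weighted least-squares update of the linear map for \tilde{s}
A^{\tilde{s}} =
  \Big( \sum_{i} \sum_{c} p(c \mid y_i, \tilde{s}) \, y_i \, (b^{c})^{\top} \Big)
  \Big( \sum_{i} \sum_{c} p(c \mid y_i, \tilde{s}) \, b^{c} (b^{c})^{\top} \Big)^{-1}
```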

Three-layer model: each sequence is encoded as an HMM; its C matrix is stacked in a single observation vector; a bilinear model is learnt from those vectors. Features: projections of silhouette's contours onto a line through the center

Results on the CMU Mobo database: 25 people performing 4 different walking actions, from 6 cameras. Three labels: action, id, view. Compared performances with baseline algorithm and straight k-NN on sequence HMMs

Learning manifolds of dynamical models: classify movements represented as dynamical models; for instance, each image sequence can be mapped to an ARMA or AR linear model, or an HMM; motion classification then reduces to finding a suitable distance function in the space of dynamical models, e.g. Kullback-Leibler, Fisher metric [Amari]; when some a-priori info is available (training set), we can learn in a supervised fashion the best metric for the classification problem! To submit to ECCV'08 – MLVMA Workshop

Learning pullback metrics: many algorithms take a dataset as input and map it to an embedded space, but fail to learn a full metric (LLE, ISOMAP); consider then a family of diffeomorphisms F between the original space M and a metric space N; the diffeomorphism F induces on M a pullback metric; maximizing inverse volume finds the manifold which better interpolates the data (geodesics pass through crowded regions)

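Pullback metrics - detail. Diffeomorphism on M: $F: M \to M$, $m \mapsto F(m)$. Push-forward map: $F_*: TM \to TM$, $v \in T_m M \mapsto F_* v \in T_{F(m)} M$. Given a metric on M, $g: TM \times TM \to \mathbb{R}$, the pullback metric is $g_*(u, v) = g(F_* u, F_* v)$. Case of linear maps: Xing and Jordan'02, Shental'02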

Space of AR(2) models: given an input sequence, we can identify the parameters of the linear model which better describes it; autoregressive models of order 2, AR(2); Fisher metric on AR(2); compute the geodesics of the pullback metric on M
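
The identification step (finding the AR(2) parameters that best describe a sequence) can be illustrated with an ordinary least-squares fit. This is only a sketch: it assumes a scalar, zero-mean sequence following y[t] = a1*y[t-1] + a2*y[t-2], the function name `fit_ar2` and the test sequence are hypothetical, and the Fisher metric and geodesic computations mentioned on the slide are not attempted.

```python
def fit_ar2(y):
    """Least-squares estimate of (a1, a2) in y[t] = a1*y[t-1] + a2*y[t-2]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(y)):
        x1, x2 = y[t - 1], y[t - 2]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * y[t]
        r2 += x2 * y[t]
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("sequence too short or degenerate for an AR(2) fit")
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (s11 * r2 - s12 * r1) / det
    return a1, a2

# Noise-free sequence generated with assumed parameters a1 = 0.5, a2 = -0.3.
sequence = [1.0, 0.5]
for _ in range(18):
    sequence.append(0.5 * sequence[-1] - 0.3 * sequence[-2])

a1, a2 = fit_ar2(sequence)
print(a1, a2)
```

Each sequence then becomes a point (a1, a2) in the parameter space on which a metric can be defined.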

Results on action and ID rec scalar feature, AR(2) and ARMA models

Uncertainty measures: intervals, credal sets. A number of formalisms have been proposed to extend or replace classical probability; assumption: not enough evidence to determine the actual probability describing the problem; second-order distributions (Dirichlet), interval probabilities, credal sets; belief functions [Shafer 76]: special case of credal sets

Multi-valued maps and belief functions: suppose you have two different but related problems, that you have a probability distribution for the first one, and that the two are linked by a one-to-many map [Dempster'68, Shafer'76]; the probability P on S induces a belief function on T

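Belief functions as random sets. Probability on a finite set: function $p: 2^\Theta \to [0,1]$ with $p(A) = \sum_{x \in A} m(x)$, where $m: \Theta \to [0,1]$ is a mass function; probabilities are additive: if $A \cap B = \emptyset$ then $p(A \cup B) = p(A) + p(B)$. Belief function: $b: 2^\Theta \to [0,1]$ s.t. $b(A) = \sum_{B \subseteq A} m(B)$, where $m: 2^\Theta \to [0,1]$ is a mass function s.t. $\sum_{B \subseteq \Theta} m(B) = 1$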
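
As a small illustration of the random-set view, belief values can be computed by summing the masses of the subsets contained in an event. The sketch below also computes plausibility (the mass of the subsets intersecting the event), a standard companion quantity that is not mentioned on this slide. The mass assignment and function names are example choices.

```python
def belief(mass, event):
    """b(A): total mass of the focal sets B contained in A."""
    event = frozenset(event)
    return sum(w for focal, w in mass.items() if focal <= event)

def plausibility(mass, event):
    """pl(A): total mass of the focal sets B that intersect A."""
    event = frozenset(event)
    return sum(w for focal, w in mass.items() if focal & event)

# Example mass assignment (assumed values) with two focal sets.
mass = {
    frozenset({"a1"}): 0.7,
    frozenset({"a1", "a2"}): 0.3,
}

print(belief(mass, {"a1"}), plausibility(mass, {"a1"}))
print(belief(mass, {"a2"}), plausibility(mass, {"a2"}))
```

When every focal set is a singleton, `belief` and `plausibility` coincide and reduce to an ordinary additive probability.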

Geometric approach to uncertainty: belief functions can be seen as points of a Cartesian space of dimension $2^n - 2$; each subset is a coordinate in this space; belief space: the space of all the belief functions on a given frame; it has the shape of a simplex (IEEE Tr. SMC-C '08, Ann. Combinatorics '06, FSS '06, IS '06, IJUFKS'06)
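
To make the coordinate picture concrete, the sketch below lists the belief values of a belief function on all non-empty proper subsets of a frame with n = 3 elements, giving a vector with 2^3 - 2 = 6 entries. The subset ordering, the function names and the example masses are arbitrary choices made for illustration.

```python
from itertools import combinations

def belief(mass, event):
    """b(A): total mass of the focal sets contained in A."""
    return sum(w for focal, w in mass.items() if focal <= event)

def belief_vector(mass, frame):
    """Belief values on every non-empty proper subset of the frame."""
    elements = sorted(frame)
    subsets = [frozenset(combo)
               for size in range(1, len(elements))
               for combo in combinations(elements, size)]
    return subsets, [belief(mass, s) for s in subsets]

frame = {"a1", "a2", "a3"}
mass = {
    frozenset({"a1"}): 0.7,
    frozenset({"a1", "a2"}): 0.3,
}

subsets, coords = belief_vector(mass, frame)
for s, value in zip(subsets, coords):
    print(sorted(s), value)
```

The empty set and the whole frame are left out because their belief values are fixed at 0 and 1.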

Approximation problem: probabilities, fuzzy sets, possibilities are special cases of b.f.s; how to transform a measure of a certain family into a different uncertainty measure: can be done geometrically (IEEE Tr. SMC-B '07, IEEE Tr. Fuzzy Systems '07, AMAI '08, AI '08, IEEE Tr. Fuzzy Systems '08)

Total belief theorem: generalization of the total probability theorem; a-priori constraint; conditional constraint; introduces Kalman-like filtering for random sets

Graph of all solutions: whole graph of candidate solutions; an admissible solution is found by following a path on the graph; links to combinatorics and linear systems; to submit to JRSS-B

Model-free pose estimation: estimating the pose (internal configuration) of a moving body from the available images (t=0, …, t=T), if you do not have an a-priori model of the object... Sun & Torr BMVC'06, Rosales, Urtasun, Brand, Grauman ICCV'03, Agarwal

Learning feature-pose maps: learn a map between features and poses directly from the data; given pose and feature sequences $q_1, \ldots, q_T$ and $y_1, \ldots, y_T$ acquired by motion capture, a multi-modal Gaussian density is set up on the feature space, together with a map from each cluster to the set of poses whose feature values fall inside it (regression functions, EM)

Evidential model and approximate parameter space.... form the evidential model similar to propagation in qualitative Markov trees MTNS'00, ISIPTA'05, to submit to Information Fusion

Information fusion by Dempster's rule: several aggregation or elicitation operators proposed; original proposal: Dempster's rule. Example on the frame $\Theta = \{a_1, a_2, a_3, a_4\}$: $b_1$: m({a1}) = 0.7, m({a1, a2}) = 0.3; $b_2$: m($\Theta$) = 0.1, m({a2, a3, a4}) = 0.9; $b_1 \oplus b_2$: m({a1}) = 0.7*0.1/0.37 = 0.19, m({a2}) = 0.3*0.9/0.37 = 0.73, m({a1, a2}) = 0.3*0.1/0.37 = 0.08
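
The numbers on this slide can be reproduced with a short sketch of Dempster's rule: multiply the masses of every pair of focal sets, assign each product to the intersection, discard the mass that falls on the empty set (0.7*0.9 = 0.63 here) and renormalise by 1 - 0.63 = 0.37. The function name and the dictionary representation are choices made for this illustration.

```python
def dempster(m1, m2):
    """Combine two mass assignments (dicts: frozenset -> mass)."""
    combined = {}
    conflict = 0.0
    for focal1, w1 in m1.items():
        for focal2, w2 in m2.items():
            inter = focal1 & focal2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2  # mass falling on the empty set
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("total conflict: the two bodies of evidence cannot be combined")
    return {focal: w / norm for focal, w in combined.items()}

theta = frozenset({"a1", "a2", "a3", "a4"})
b1 = {frozenset({"a1"}): 0.7, frozenset({"a1", "a2"}): 0.3}
b2 = {theta: 0.1, frozenset({"a2", "a3", "a4"}): 0.9}

fused = dempster(b1, b2)
for focal, w in fused.items():
    print(sorted(focal), round(w, 2))
```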

Performances: comparison of three models: left view only, right view only, both views. Figure labels: pose estimation yielded by the overall model; estimate associated with the right model; ground truth; left model

JPDA with shape info: JPDA model: independent targets; shape model: rigid links; Dempster's fusion; robustness: clutter does not meet shape constraints; occlusions: occluded targets can be estimated (CDC'02, CDC'04)

Belief graphical models what happens when the original probability distribution belongs to a certain class? In particular: belief functions induced by graphical models?

Imprecise classifiers: application of robust statistics to vision problems; the class estimate is a belief function or a credal set [Zaffalon, Cozman]; exploit only available evidence, represent ignorance

Credal networks: belief networks or credal networks [Shafer and Shenoy]; at each node a BF or a convex set of probs; similar to generalized belief propagation: message passing between nodes representing groups of variables; algorithms to reduce complexity already exist

Boolean independence: independence can be defined in different ways in Boolean algebras, semi-modular lattices, and matroids; Boolean independence is important in uncertainty theory; example: collection of power sets of the partitions of a given finite set; a set of sub-algebras $\{A_t\}$ of a Boolean algebra B are independent (IB) if

Relation with matroids? Matroids: paradigm of abstract independence. Matroid $(E, \mathcal{I} \subseteq 2^E)$: $\emptyset \in \mathcal{I}$; $A \in \mathcal{I}$, $A' \subseteq A$ then $A' \in \mathcal{I}$; $A_1 \in \mathcal{I}$, $A_2 \in \mathcal{I}$, $|A_2| > |A_1|$ then $\exists x \in A_2$ s.t. $A_1 \cup \{x\} \in \mathcal{I}$. Graphic matroids: dependent sets are circuits. They have significant relationships BUT is Boolean independence a form of anti-matroidicity? (BCC'01, BCC'07, ISAIM'08, UNCLOG'08, subm. to Discrete Mathematics)
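
The three matroid axioms listed on this slide can be checked by brute force on small examples. The sketch below does so for the graphic matroid of a triangle, whose independent sets are the edge subsets containing no cycle (every subset with at most two edges), and for a family that violates the augmentation axiom. The function name and edge labels are hypothetical.

```python
from itertools import combinations

def is_matroid(independent_sets):
    """Brute-force check of the matroid axioms on a family of subsets."""
    family = {frozenset(s) for s in independent_sets}
    # Axiom 1: the empty set is independent.
    if frozenset() not in family:
        return False
    # Axiom 2: every subset of an independent set is independent.
    for A in family:
        for size in range(len(A)):
            for sub in combinations(A, size):
                if frozenset(sub) not in family:
                    return False
    # Axiom 3: a smaller independent set can be augmented from a larger one.
    for A1 in family:
        for A2 in family:
            if len(A2) > len(A1):
                if not any(A1 | {x} in family for x in A2 - A1):
                    return False
    return True

# Graphic matroid of a triangle with edges e1, e2, e3: all forests.
edges = ["e1", "e2", "e3"]
forests = [set(c) for size in range(3) for c in combinations(edges, size)]

# A family that is closed under subsets but fails augmentation.
not_a_matroid = [set(), {"e1"}, {"e2"}, {"e3"}, {"e2", "e3"}]

print(is_matroid(forests), is_matroid(not_a_matroid))
```

In the second family, {e1} cannot be extended by any element of {e2, e3}, so the augmentation axiom fails.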

A multi-layer framework for human motion analysis: feedbacks act between different layers (e.g. integrated detection, segmentation, classification and pose estimation). Layers: image data; multiple views; fusion; 3D reconstruction; unsupervised body-part segmentation; model fitting (stick-articulated); motion capture; action segmentation; action recognition; identity recognition; surveillance, HMI

Spatio-temporal action segmentation: problem: segmenting parts of the video(s) containing interesting motions; global approach: working on multidimensional volumes; previous works: object segmentation on the spatio-temporal volume for single frames [Collins, Natarajan]; idea: in a multi-camera setup, working on 3D clouds (hulls) + motion fields + time = 7D volume; proposal: smoothing using tensor voting [Medioni PAMI'05] + shape detection on the obtained manifold

Stereo correspondence based on local image structure: problem: finding correspondences between points in different views, using the local structure of the image; Markov random fields: disparity = hidden variable; one direction: using the local direction of the gradient or the structure tensor to help the correspondence [Zucker]; second option: FRAME -> large scale structures in the MRF potential; general potential for MRFs, local texture for correspondence?

Other developments: 3D markerless motion capture. Proposal: data-driven pose estimation based on 3D representations; unsupervised body model learning; shape classification/recognition in embedding spaces; surveillance in crowded areas: impossible to recover a 3D model; information fusion techniques on multiple images; handle conflict between different pieces of evidence