
Lectureship Early Career Fellowship School of Technology, Oxford Brookes University 19/6/2008 Fabio Cuzzolin INRIA Rhone-Alpes.


1 Lectureship Early Career Fellowship School of Technology, Oxford Brookes University 19/6/2008 Fabio Cuzzolin INRIA Rhone-Alpes

2 Career path: Master's thesis on gesture recognition at the University of Padova; visiting student, ESSRL, Washington University in St. Louis, and at the University of California at Los Angeles (2000); Ph.D. thesis on belief functions and uncertainty theory (2001); researcher at Politecnico di Milano with the Image and Sound Processing group (2003-2004); post-doc at the University of California at Los Angeles, UCLA Vision Lab (2004-2006); Marie Curie fellow at INRIA Rhone-Alpes

3 Scientific production and collaborations: currently 4+10 journal papers and 31+8 conference papers; collaborations with journals: IEEE PAMI, IEEE SMC-B, CVIU, Information Fusion, Int. J. Approximate Reasoning; PC member for VISAPP, FLAIRS, IMMERSCOM, ISAIM; collaborations with several groups: SIPTA, Setubal, CMU, Pompeu Fabra, EPFL-IDIAP, UBoston

4 My background research: Discrete math (linear independence on lattices and matroids); Uncertainty theory (geometric approach, algebraic analysis, generalized total probability); Machine learning (manifold learning for dynamical models); Computer vision (gesture and action recognition, 3D shape analysis and matching, gait ID, pose estimation)

5 Outline: Computer vision (action and gesture recognition; Laplacian segmentation and matching of 3D shapes; bilinear models for invariant gait ID); Machine learning (manifold learning for dynamical models); Uncertainty theory (geometric approach to measures; generalized total probability; vision applications and developments); Discrete math (unification of the notion of independence); New directions

6 HMMs for gesture recognition: transition matrix A -> gesture dynamics; state-output matrix C -> collection of hand poses; hand poses were represented by size functions (BMVC'97)

7 Gesture classification: EM is used to learn the HMM parameters from an input sequence; a new sequence is fed to the learnt gesture models (HMM 1, HMM 2, ..., HMM n); each of them produces a likelihood; the most likely model is chosen (if above a threshold), OR the new model is attributed the label of the closest one (using K-L divergence or other distances)
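
The likelihood-based branch of this classification scheme can be sketched as follows. This is only a minimal illustration, not the implementation used in the talk: it assumes discrete output symbols (the slides use a state-output matrix C over hand-pose features), and the names `log_likelihood`, `classify`, the two toy models and the threshold values are all hypothetical.

```python
import math

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM.

    obs is a non-empty list of symbol indices; pi[i] is the initial state
    probability, A[i][j] the transition probability, B[i][k] the probability
    of emitting symbol k in state i (an assumed discrete stand-in for C).
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

def classify(obs, models, threshold):
    """Return (label of the most likely model or None if below threshold, all scores)."""
    scores = {label: log_likelihood(obs, *params) for label, params in models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

# Two toy gesture models (hypothetical parameters).
emission = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "wave": ([1.0, 0.0], [[0.1, 0.9], [0.9, 0.1]], emission),   # alternating states
    "point": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emission),  # persistent states
}
sequence = [0, 1, 0, 1, 0, 1]
label, scores = classify(sequence, models, threshold=-5.0)
rejected, _ = classify(sequence, models, threshold=0.0)
print(label, scores, rejected)
```

When no model clears the threshold the function returns `None`; the alternative on the slide (labelling by the closest model under K-L divergence) is not covered by this sketch.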

8 Compositional behavior of HMMs: the model of the action of interest is embedded in the overall model; clustering. Figures: cluttered model for two overlapping motions; reduced model for the fly gesture after clustering

9 Volumetric action recognition: in 2D approaches, features are extracted from images, hence viewpoint dependence; synchronized multi-camera systems are now available (Milano, BBC R&D); volumetric approach: features are extracted from volumetric reconstructions of the body (ICIP'04)

10 3D feature extraction: locally linear embedding to find a topological representation of the moving body; k-means clustering of body parts; linear discriminant analysis (LDA) to estimate motion direction

12 Unsupervised coherent 3D segmentation: to recognize actions we need to extract features; this means segmenting moving articulated 3D bodies into parts: along sequences, in a consistent way; in an unsupervised fashion; robustly, with respect to changes of the topology of the moving body; as a building block of a wider motion analysis and capture framework (ICCV-HM'07, CVPR'08, to submit to IJCV)

13 Clustering after Laplacian embedding: local neighbourhoods are stable under articulated motion; the embedding generates a lower-dimensional, widely separated embedded cloud; less sensitive to topology changes than other methods; less expensive than ISOMAP (refs. Jenkins, Chellappa)

14 Algorithm K-wise clustering in the embedding space

15 Seed propagation along time: to ensure time consistency, cluster seeds have to be propagated along time; old positions of clusters in 3D are added to the new cloud and embedded; result: new seeds

16 Results: coherent clustering along a sequence; example of model recovery

17 Results - 2: handling of topology changes; missing data

18 Laplacian matching of dense meshes or voxel sets: as embeddings are pose-invariant (for articulated bodies), they can be used to match dense shapes by simply aligning their images after embedding (ICCV '07 – NTRL, ICCV '07 – 3dRR, CVPR '08, submitted to ECCV'08, to submit to PAMI)

19 Eigenfunction histogram assignment. Algorithm: compute the Laplacian embedding of the two shapes; find an assignment between the eigenfunctions of the two shapes; this selects a section of the embedding space; the embeddings are orthogonally aligned there by EM

20 Results. Applications: graph matching, protein analysis, motion capture; propagating body-part segmentation in time; motion field estimation, action segmentation

22 Bilinear models for gait-ID: to recognize the identity of humans from their gait (CVPR '06, book chapter in progress); nuisance factors: emotional state, illumination, appearance, view invariance... (literature: randomized trees); each motion possesses several labels: action, identity, viewpoint, emotional state, etc.; bilinear models [Tenenbaum] can be used to separate the influence of style and content (to classify)

23 Content classification of unknown style: given a training set in which persons (content=ID) are seen walking from different viewpoints (style=viewpoint), an asymmetric bilinear model can be learned from it through SVD; when new motions are acquired in which a known person is seen walking from a different viewpoint (unknown style), an iterative EM procedure can be set up to classify the content; E step -> estimation of p(c|s), the probability of the content given the current estimate s of the style; M step -> estimation of the linear map for the unknown style s

24 Three-layer model: each sequence is encoded as an HMM; its C matrix is stacked in a single observation vector; a bilinear model is learnt from those vectors. Features: projections of the silhouette's contours onto a line through the center

25 Results on CMU database: Mobo database: 25 people performing 4 different walking actions, from 6 cameras; three labels: action, id, view; performances compared with a baseline algorithm and straight k-NN on sequence HMMs

27 Learning manifolds of dynamical models: classify movements represented as dynamical models; for instance, each image sequence can be mapped to an ARMA or AR linear model, or an HMM; motion classification then reduces to finding a suitable distance function in the space of dynamical models, e.g. Kullback-Leibler, Fisher metric [Amari]; when some a-priori info is available (training set), we can learn in a supervised fashion the best metric for the classification problem! To submit to ECCV'08 – MLVMA Workshop

28 Learning pullback metrics: many algorithms take a dataset as input and map it to an embedded space, but fail to learn a full metric (LLE, ISOMAP); consider then a family of diffeomorphisms F between the original space M and a metric space N; the diffeomorphism F induces on M a pullback metric; maximizing inverse volume finds the manifold which best interpolates the data (geodesics pass through crowded regions)

29 Pullback metrics - detail. Diffeomorphism on M: $F: M \to M$. Push-forward map: $F_*: T_m M \to T_{F(m)} M$. Given a metric on M, $g: TM \times TM \to \mathbb{R}$, the pullback metric is $g^*_m(u, v) = g_{F(m)}(F_* u, F_* v)$. Case of linear maps: Xing and Jordan'02, Shental'02
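
For the linear case cited on this slide, a short worked instance may help. It assumes, purely for illustration, that $M = \mathbb{R}^n$, that the map is $F(m) = A m$ with $A$ an invertible matrix (the symbol $A$ is hypothetical and does not appear on the slide), and that $g$ is the Euclidean inner product. Under those assumptions the pullback of $g$ through $F$ reduces to a Mahalanobis-type form:

```latex
F(m) = A m, \qquad F_* u = A u,
\qquad
g^*(u, v) = g(F_* u, F_* v) = (A u)^\top (A v) = u^\top A^\top A \, v,
\qquad
d^*(x, y) = \sqrt{(x - y)^\top A^\top A \, (x - y)}.
```

In this setting, learning the map amounts to learning a single global matrix $A^\top A$, whereas a nonlinear diffeomorphism lets the induced metric vary from point to point.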

30 Space of AR(2) models: given an input sequence, we can identify the parameters of the linear model which best describes it; autoregressive models of order 2, AR(2); Fisher metric on AR(2); compute the geodesics of the pullback metric on M

31 Results on action and ID recognition: scalar feature, AR(2) and ARMA models
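
The identification step (mapping a scalar sequence to a point in the space of AR(2) models) can be sketched with ordinary least squares. This is an assumed, simplified estimator for the model $y_t = a_1 y_{t-1} + a_2 y_{t-2} + e_t$; the talk does not say which identification method was used, and `fit_ar2` is a hypothetical name. The Fisher metric and geodesic computation are not covered here.

```python
def fit_ar2(y):
    """Least-squares estimate of (a1, a2) in y[t] = a1*y[t-1] + a2*y[t-2] + noise."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(y)):
        x1, x2 = y[t - 1], y[t - 2]
        # Accumulate the 2x2 normal equations  [[s11, s12], [s12, s22]] a = [b1, b2].
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y[t]
        b2 += x2 * y[t]
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("sequence too short or degenerate for AR(2) identification")
    # Solve the 2x2 system by Cramer's rule.
    a1 = (b1 * s22 - s12 * b2) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2

# Noise-free toy sequence generated by a stable AR(2) model (hypothetical values).
true_a1, true_a2 = 1.5, -0.7
y = [1.0, 0.5]
for _ in range(48):
    y.append(true_a1 * y[-1] + true_a2 * y[-2])

a1, a2 = fit_ar2(y)
print(a1, a2)
```

Each sequence then becomes a point $(a_1, a_2)$ in parameter space, which is where a distance or learned metric between models would be evaluated.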

33 Uncertainty measures: intervals, credal sets. Assumption: not enough evidence to determine the actual probability describing the problem; a number of formalisms have been proposed to extend or replace classical probability: second-order distributions (Dirichlet), interval probabilities, credal sets; belief functions [Shafer 76]: special case of credal sets

34 Multi-valued maps and belief functions: suppose you have two different but related problems, that we have a probability distribution for the first one, and that the two are linked by a one-to-many map [Dempster'68, Shafer'76]; the probability P on S induces a belief function on T
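
A small sketch of this construction, under the standard reading that each outcome s of the first problem transfers its probability to the subset of T it maps to. The names `induced_mass`, `p` and `gamma`, and the toy values, are illustrative assumptions rather than anything given in the talk.

```python
from collections import defaultdict

def induced_mass(p, gamma):
    """Mass function on T induced by a probability p on S through a one-to-many map.

    p maps each s in S to its probability; gamma maps each s to the set of
    elements of T compatible with it. Outcomes with the same image pool their mass.
    """
    mass = defaultdict(float)
    for s, prob in p.items():
        mass[frozenset(gamma[s])] += prob
    return dict(mass)

# Toy example: three outcomes in S, three candidate answers in T.
p = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
gamma = {"s1": {"t1"}, "s2": {"t1", "t2"}, "s3": {"t2", "t3"}}

mass = induced_mass(p, gamma)
for focal, m in mass.items():
    print(sorted(focal), m)
```

The resulting masses sit on subsets of T rather than on single elements, which is what distinguishes the induced belief function from an ordinary probability on T.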

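35 Belief functions as random sets: probability on a finite set: function $p: 2^\Theta \to [0,1]$ with $p(A) = \sum_{x \in A} m(x)$, where $m: \Theta \to [0,1]$ is a mass function; probabilities are additive: if $A \cap B = \emptyset$ then $p(A \cup B) = p(A) + p(B)$; belief function $b: 2^\Theta \to [0,1]$ s.t. $b(A) = \sum_{B \subseteq A} m(B)$, if $m: 2^\Theta \to [0,1]$ is a mass function s.t. $\sum_{B \subseteq \Theta} m(B) = 1$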
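
To make the contrast with additive probabilities concrete, here is a minimal sketch that evaluates a belief function as the sum of the masses of all subsets contained in an event. The function name `belief` and the toy mass assignments are assumptions made for the example.

```python
def belief(mass, event):
    """b(A): total mass of the focal sets B that are contained in the event A."""
    event = frozenset(event)
    return sum(m for focal, m in mass.items() if focal <= event)

# Mass on subsets of the frame {x, y, z} (hypothetical values).
mass = {
    frozenset({"x"}): 0.5,
    frozenset({"x", "y"}): 0.3,
    frozenset({"x", "y", "z"}): 0.2,
}
b_x = belief(mass, {"x"})
b_y = belief(mass, {"y"})
b_xy = belief(mass, {"x", "y"})
print(b_x, b_y, b_xy)  # b({x,y}) exceeds b({x}) + b({y}): not additive

# When all the mass sits on singletons, the belief function is an ordinary probability.
bayesian = {frozenset({"x"}): 0.6, frozenset({"y"}): 0.4}
p_x = belief(bayesian, {"x"})
p_y = belief(bayesian, {"y"})
p_xy = belief(bayesian, {"x", "y"})
print(p_x, p_y, p_xy)
```

The second example recovers additivity because every focal set is a single element, which is the sense in which probabilities are a special case.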

36 Geometric approach to uncertainty: belief functions can be seen as points of a Cartesian space of dimension $2^n - 2$; belief space: the space of all the belief functions on a given frame; each subset is a coordinate in this space; it has the shape of a simplex (IEEE Tr. SMC-C '08, Ann. Combinatorics '06, FSS '06, IS '06, IJUFKS'06)

37 Approximation problem: probabilities, fuzzy sets, possibilities are special cases of b.f.s; how to transform a measure of a certain family into a different uncertainty measure; this can be done geometrically (IEEE Tr. SMC-B '07, IEEE Tr. Fuzzy Systems '07, AMAI '08, AI '08, IEEE Tr. Fuzzy Systems '08)

39 Total belief theorem: generalization of the total probability theorem; a-priori constraint; conditional constraint; introduces Kalman-like filtering for random sets

40 Graph of all solutions: whole graph of candidate solutions; an admissible solution is found by following a path on the graph; links to combinatorics and linear systems; to submit to JRSS-B

42 Model-free pose estimation: estimating the pose (internal configuration) of a moving body from the available images (from t=0 to t=T), if you do not have an a-priori model of the object. Related work: Sun & Torr BMVC'06, Rosales, Urtasun, Brand, Grauman ICCV'03, Agarwal

43 Learning feature-pose maps: learn a map between features and poses directly from the data; given pose and feature sequences ($q_1, ..., q_T$ and $y_1, ..., y_T$) acquired by motion capture, a multi-modal Gaussian density is set up on the feature space; a map is built from each cluster to the set of poses whose feature values fall inside it (regression functions, EM)

44 Evidential model: ... and approximate parameter space ... form the evidential model; similar to propagation in qualitative Markov trees (MTNS'00, ISIPTA'05, to submit to Information Fusion)

45 Information fusion by Dempster's rule: several aggregation or elicitation operators have been proposed; original proposal: Dempster's rule. Example on the frame $\Theta = \{a_1, a_2, a_3, a_4\}$: $b_1$: $m(\{a_1\}) = 0.7$, $m(\{a_1, a_2\}) = 0.3$; $b_2$: $m(\Theta) = 0.1$, $m(\{a_2, a_3, a_4\}) = 0.9$; $b_1 \oplus b_2$: $m(\{a_1\}) = 0.7 \cdot 0.1 / 0.37 = 0.19$, $m(\{a_2\}) = 0.3 \cdot 0.9 / 0.37 = 0.73$, $m(\{a_1, a_2\}) = 0.3 \cdot 0.1 / 0.37 = 0.08$
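
The numbers on this slide can be reproduced with a short sketch of Dempster's rule: multiply masses pairwise, assign each product to the intersection of the two focal sets, discard the products that fall on the empty set, and renormalise. One assumption is made in the example: the focal set of the second belief function whose symbol is missing from the transcript is taken to be the whole frame {a1, a2, a3, a4}, which is the reading consistent with the normalisation factor 0.37. The name `dempster_combine` is illustrative.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts: frozenset -> mass) by Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass that would fall on the empty set
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("total conflict: the two belief functions cannot be combined")
    return {focal: w / norm for focal, w in combined.items()}

frame = frozenset({"a1", "a2", "a3", "a4"})  # assumed frame of discernment
b1 = {frozenset({"a1"}): 0.7, frozenset({"a1", "a2"}): 0.3}
b2 = {frame: 0.1, frozenset({"a2", "a3", "a4"}): 0.9}

fused = dempster_combine(b1, b2)
for focal, w in fused.items():
    print(sorted(focal), round(w, 2))
```

Here the conflicting product is 0.7 * 0.9 = 0.63, so the remaining mass is divided by 1 - 0.63 = 0.37, matching the denominators on the slide.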

46 Performances: comparison of three models: left view only, right view only, both views. Figure legend: pose estimation yielded by the overall model; estimate associated with the right model; ground truth; left model

47 JPDA with shape info: JPDA model: independent targets; shape model: rigid links; Dempster's fusion; robustness: clutter does not meet shape constraints; occlusions: occluded targets can be estimated (CDC'02, CDC'04)

48 Belief graphical models what happens when the original probability distribution belongs to a certain class? In particular: belief functions induced by graphical models?

49 Imprecise classifiers: application of robust statistics to vision problems; the class estimate is a belief function or a credal set [Zaffalon, Cozman]; exploit only the available evidence, represent ignorance

50 Credal networks: belief networks or credal networks [Shafer and Shenoy]; at each node a BF or a convex set of probabilities; similar to generalized belief propagation: message passing between nodes representing groups of variables; algorithms to reduce complexity already exist

52 Boolean independence: independence can be defined in different ways in Boolean algebras, semi-modular lattices, and matroids; Boolean independence is important in uncertainty theory; example: collection of power sets of the partitions of a given finite set; a set of sub-algebras $\{A_t\}$ of a Boolean algebra B are independent (IB) if

53 Relation with matroids? Matroids: paradigm of abstract independence. Matroid $(E, \mathcal{I} \subseteq 2^E)$: $\emptyset \in \mathcal{I}$; $A \in \mathcal{I}$, $A' \subseteq A$ then $A' \in \mathcal{I}$; $A_1 \in \mathcal{I}$, $A_2 \in \mathcal{I}$, $|A_2| > |A_1|$ then $\exists x \in A_2$ s.t. $A_1 \cup \{x\} \in \mathcal{I}$. Graphic matroids: dependent sets are circuits. They have significant relationships BUT is Boolean independence a form of anti-matroidicity? (BCC'01, BCC'07, ISAIM'08, UNCLOG'08, subm. to Discrete Mathematics)
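
The three matroid axioms (the empty set is independent; subsets of independent sets are independent; a smaller independent set can always be extended by an element of a larger one) can be checked by brute force on small examples. The sketch below is illustrative only: `is_matroid` is a hypothetical helper, and the triangle graph is a toy instance of a graphic matroid, not an example from the talk.

```python
from itertools import combinations

def is_matroid(ground, independent):
    """Brute-force check of the matroid axioms for a family of subsets of `ground`."""
    ground = frozenset(ground)
    ind = {frozenset(s) for s in independent}
    if any(not s <= ground for s in ind):
        return False
    # Axiom 1: the empty set is independent.
    if frozenset() not in ind:
        return False
    # Axiom 2 (hereditary): every subset of an independent set is independent.
    for a in ind:
        for r in range(len(a)):
            if any(frozenset(sub) not in ind for sub in combinations(a, r)):
                return False
    # Axiom 3 (exchange): a larger independent set can always extend a smaller one.
    for a1 in ind:
        for a2 in ind:
            if len(a2) > len(a1) and not any(a1 | {x} in ind for x in a2 - a1):
                return False
    return True

# Graphic matroid of a triangle: every edge set except the full cycle is a forest.
edges = ["e1", "e2", "e3"]
forests = [set(c) for r in range(3) for c in combinations(edges, r)]
triangle_ok = is_matroid(edges, forests)

# A family that violates the exchange axiom: {e3} cannot be extended from {e1, e2}.
broken = [set(), {"e1"}, {"e2"}, {"e3"}, {"e1", "e2"}]
broken_ok = is_matroid(edges, broken)
print(triangle_ok, broken_ok)
```

The check is exponential in the size of the ground set, so it is only meant for experimenting with small families such as the one above.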

55 A multi-layer framework for human motion analysis: feedbacks act between different layers (e.g. integrated detection, segmentation, classification and pose estimation); layers and tasks: action recognition, action segmentation, multiple views, 3D reconstruction, unsupervised body-part segmentation, image data fusion, model fitting (stick-articulated), motion capture, identity recognition, surveillance, HMI

56 Spatio-temporal action segmentation: problem: segmenting parts of the video(s) containing interesting motions; global approach: working on multidimensional volumes; previous works: object segmentation on the spatio-temporal volume for single frames [Collins, Natarajan]; idea: in a multi-camera setup, working on 3D clouds (hulls) + motion fields + time = 7D volume; proposal: smoothing using tensor voting [Medioni PAMI'05] + shape detection on the obtained manifold

57 Stereo correspondence based on local image structure: problem: finding correspondences between points in different views, using the local structure of the image; Markov random fields: disparity = hidden variable; one direction: using the local direction of the gradient or the structure tensor to help the correspondence [Zucker]; second option: FRAME -> large scale structures in the MRF potential; general potential for MRFs, local texture for correspondence?

58 Other developments: 3D markerless motion capture. Proposal: data-driven pose estimation based on 3D representations; unsupervised body model learning; shape classification/recognition in embedding spaces; surveillance in crowded areas: impossible to recover a 3D model; information fusion techniques on multiple images; handle conflict between different pieces of evidence

