Lectureship A proposal for advancing computer graphics, imaging and multimedia design at RGU Robert Gordon University Aberdeen, 20/6/2008 Fabio Cuzzolin.

Presentation transcript:

Lectureship A proposal for advancing computer graphics, imaging and multimedia design at RGU Robert Gordon University Aberdeen, 20/6/2008 Fabio Cuzzolin INRIA Rhone-Alpes

Career path: Master's thesis on gesture recognition at the University of Padova; visiting student at ESSRL, Washington University in St. Louis, and at the University of California at Los Angeles (2000); Ph.D. thesis on belief functions and uncertainty theory (2001); researcher at Politecnico di Milano with the Image and Sound Processing group; post-doc at the University of California at Los Angeles, UCLA Vision Lab; Marie Curie fellow at INRIA Rhone-Alpes.

Scientific production and collaborations: currently 4+10 journal papers and 31+8 conference papers; collaborations with journals: IEEE PAMI, IEEE SMC-B, CVIU, Information Fusion, Int. J. Approximate Reasoning; PC member for VISAPP, FLAIRS, IMMERSCOM, ISAIM; collaborations with several groups: SIPTA, Setubal, CMU, Pompeu Fabra, EPFL-IDIAP, UBoston.

My background research: discrete math (linear independence on lattices and matroids); uncertainty theory (geometric approach, algebraic analysis, generalized total probability); machine learning (manifold learning for dynamical models); computer vision (gesture and action recognition, 3D shape analysis and matching, gait ID, pose estimation).

A multi-layer framework for human motion analysis: different tasks, integrated in a series of layers; feedbacks act between different layers. Diagram labels: multiple views, 3D reconstruction, unsupervised body-part segmentation, image data fusion, model fitting (stick-articulated), motion capture, action segmentation, action recognition, identity recognition, surveillance, HMI.

A multi-layer framework for human motion analysis: action and gesture recognition; Laplacian unsupervised segmentation; matching of 3D shapes by embedded orthogonal alignment; bilinear models for invariant gait ID; manifold learning for dynamical models; the role of uncertainty measures; information fusion for model-free pose estimation.

HMMs for gesture recognition: transition matrix A -> gesture dynamics; state-output matrix C -> collection of hand poses. Hand poses were represented by size functions (BMVC'97).

Gesture classification (HMM 1, HMM 2, …, HMM n): EM is used to learn HMM parameters from an input sequence; the new sequence is fed to the learnt gesture models, each of which produces a likelihood; the most likely model is chosen (if above a threshold), OR the model learnt from the new sequence is attributed the label of the closest one (using K-L divergence or other distances).
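
The likelihood-based branch of this scheme can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: it assumes discrete output symbols (the slides describe a state-output matrix C over hand poses), parameters that have already been learnt, and two hypothetical gesture labels, "wave" and "point".

```python
import math

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete-output HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

def classify(obs, models, threshold=float("-inf")):
    """Return (label of the most likely model or None if below threshold, all scores)."""
    scores = {label: log_likelihood(obs, *params) for label, params in models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

# Hypothetical models: "wave" alternates between two states, "point" stays put.
models = {
    "wave":  ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], [[0.9, 0.1], [0.1, 0.9]]),
    "point": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.1, 0.9]]),
}
label, scores = classify([0, 1, 0, 1, 0, 1], models)
```

The threshold plays the role of the rejection test mentioned on the slide: a sequence that no model explains well is left unlabelled rather than forced into a class.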

Volumetric action recognition. 2D approaches: features are extracted from single views -> viewpoint dependence. Volumetric approach: features are extracted from a volumetric reconstruction of the moving body (ICIP'04).

Unsupervised coherent 3D segmentation: to recognize actions we need to extract features; segmenting moving articulated 3D bodies into parts along sequences, in a consistent way; in an unsupervised fashion; robustly, with respect to changes of the topology of the moving body; as a building block of a wider motion analysis and capture framework. (ICCV-HM'07, CVPR'08, to submit to IJCV)

Clustering after Laplacian embedding: generates a lower-dimensional, widely separated embedded cloud; less sensitive to topology changes than other methods; less computationally expensive than ISOMAP; local neighborhoods -> stable under articulated motion.

Algorithm K-wise clustering in the embedding space
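
As a rough illustration of clustering in a Laplacian embedding, the sketch below handles only the simplest case: a one-dimensional embedding and two clusters, split by sign. It is a deliberate simplification of the K-wise clustering named above, and the toy graph (two triangles joined by one edge), the power-iteration solver and the function name `fiedler_vector` are assumptions made for the example.

```python
def fiedler_vector(adj, iters=500):
    """Approximate the Laplacian eigenvector of the smallest nonzero eigenvalue.

    Uses power iteration on (c*I - L), where L = D - W is the graph Laplacian,
    with the constant vector projected out at every step. Assumes a connected,
    undirected graph given as a dense 0/1 adjacency matrix.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    c = 2.0 * max(deg)  # upper bound on the largest eigenvalue of L
    v = [i - (n - 1) / 2.0 for i in range(n)]  # deterministic zero-mean start
    for _ in range(iters):
        # w = (c*I - L) v = (c - deg_i) v_i + sum_j W_ij v_j
        w = [(c - deg[i]) * v[i] + sum(adj[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant (eigenvalue 0) component
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Toy "body": two tightly connected parts joined by a single edge (2-3).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
embedding = fiedler_vector(adj)
labels = [1 if x > 0 else 0 for x in embedding]  # 2-way clustering by sign
```

Because the embedding depends only on local neighborhood structure, the two parts stay separated however the connecting "joint" is bent, which is the stability property the previous slide refers to.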

Seed propagation along time: to ensure time consistency, cluster seeds have to be propagated along time; old positions of clusters in 3D are added to the new cloud and embedded. Result: new seeds.

Results: coherent clustering along a sequence; handling of topology changes.

Laplacian matching of dense meshes or voxelsets: as embeddings are pose-invariant (for articulated bodies), they can be used to match dense shapes by simply aligning their images after embedding. (ICCV '07 – NTRL, ICCV '07 – 3dRR, CVPR '08, submitted to ECCV'08, to submit to PAMI)

Eigenfunction histogram assignment. Algorithm: compute the Laplacian embedding of the two shapes; find an assignment between the eigenfunctions of the two shapes (this selects a section of the embedding space); the embeddings are orthogonally aligned there by EM.

Results. Applications: graph matching, protein analysis, motion capture; propagating bodypart segmentation in time; motion field estimation, action segmentation.

Application: spatio-temporal action segmentation. Problem: segmenting parts of the video(s) containing interesting motions. Global approach: working on the entire sequence (multidimensional volume). Previous works: object segmentation on the spatio-temporal volume for single frames. Idea: in a multi-camera setup, working on 3D clouds (hulls) + motion fields + time = 7D volume. Outline of an approach: smoothing using message passing + shape detection on the obtained manifold.

Bilinear models for gait-ID: to recognize the identity of humans from their gait (CVPR '06, book chapter in progress); nuisance factors: emotional state, illumination, appearance, view invariance... (literature: randomized trees); each motion possesses several labels: action, identity, viewpoint, emotional state, etc.; bilinear models (Tenenbaum) can be used to separate the influence of style and content (the label to classify).

Content classification of unknown style: given a training set in which persons (content = ID) are seen walking from different viewpoints (style = viewpoint), an asymmetric bilinear model can be learned from it through SVD; when new motions are acquired in which a known person is seen walking from a different viewpoint (unknown style), an iterative EM procedure can be set up to classify the content. E step -> estimation of p(c|s), the probability of the content given the current estimate s of the style; M step -> estimation of the linear map for the unknown style s.
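
For reference, the asymmetric bilinear model and its SVD fit can be written compactly as below. The notation is an assumption that follows the usual style/content formulation rather than anything printed on the slide: $A^{s}$ is the style-specific (viewpoint) linear map, $b^{c}$ the content (identity) vector, $\bar y^{sc}$ the mean training observation for style $s$ and content $c$, and $J$ the chosen model dimension.

```latex
\[
  y^{sc} = A^{s} b^{c}, \qquad
  Y =
  \begin{bmatrix}
    \bar y^{11} & \cdots & \bar y^{1C} \\
    \vdots      &        & \vdots      \\
    \bar y^{S1} & \cdots & \bar y^{SC}
  \end{bmatrix}
  =
  \begin{bmatrix} A^{1} \\ \vdots \\ A^{S} \end{bmatrix}
  \begin{bmatrix} b^{1} & \cdots & b^{C} \end{bmatrix}
  = A B,
\]
\[
  Y = U \Sigma V^{\top}
  \;\Longrightarrow\;
  A \approx (U \Sigma)_{[:,\,1:J]}, \qquad
  B \approx (V^{\top})_{[1:J,\,:]} .
\]
```

With $B$ fixed from training, classifying a sequence seen from a new viewpoint amounts to alternating between a soft assignment of the observations to the known content vectors and a re-estimate of the new style matrix.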

Three-layer model: each sequence is encoded as an HMM; its C matrix is stacked in a single observation vector; a bilinear model is learnt from those vectors. Features: projections of the silhouette's contours onto a line through the center.

Results on the CMU Mobo database: 25 people performing 4 different walking actions, from 6 cameras. Three labels: action, id, view. Compared performances with a baseline algorithm and straight k-NN on sequence HMMs.

Learning manifolds of dynamical models: classify movements represented as dynamical models; for instance, each image sequence can be mapped to an ARMA or AR linear model. Motion classification then reduces to finding a suitable distance function in the space of dynamical models; when some a-priori information is available (a training set), we can learn in a supervised fashion the best metric for the classification problem! (To submit to ECCV'08 – MLVMA Workshop)

Learning pullback metrics: many unsupervised algorithms take a dataset as input and map it to an embedded space, but fail to learn a full metric; consider then a family of diffeomorphisms F between the original space M and a metric space N; the diffeomorphism F induces on M a pullback metric; maximizing inverse volume finds the manifold which better interpolates the data (geodesics pass through crowded regions).
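
The pullback construction can be stated explicitly. The following is the standard differential-geometric definition, written with assumed symbols: $g$ is the metric on $N$, $G$ its matrix in local coordinates, and $J_F(m)$ the Jacobian of $F$ at the point $m \in M$.

```latex
\[
  (F^{*} g)_{m}(u, v)
  \;=\; g_{F(m)}\big(dF_{m}\,u,\; dF_{m}\,v\big)
  \;=\; u^{\top} J_{F}(m)^{\top}\, G\big(F(m)\big)\, J_{F}(m)\, v,
  \qquad u, v \in T_{m}M .
\]
```

Under this metric, lengths of curves in $M$ are measured after mapping them through $F$, so choosing $F$ within a parameterized family amounts to choosing the metric used for classification.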

Space of AR(2) models: given an input sequence, we can identify the parameters of the linear model which best describes it (autoregressive models of order 2, AR(2)); Fisher metric on AR(2); compute the geodesics of the pullback metric on M.
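
The identification step can be sketched for a scalar sequence as below. Ordinary least squares is an assumption made for the example (the slides do not say which identification method was used), and `fit_ar2` is a hypothetical helper name.

```python
def fit_ar2(x):
    """Least-squares estimate of (a1, a2) in x[t] = a1*x[t-1] + a2*x[t-2] + noise."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(x)):
        p1, p2 = x[t - 1], x[t - 2]
        # Accumulate the 2x2 normal equations [s11 s12; s12 s22][a1; a2] = [r1; r2].
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * x[t]
        r2 += p2 * x[t]
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; AR(2) parameters not identifiable")
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (r2 * s11 - r1 * s12) / det
    return a1, a2

# Synthetic, noise-free sequence generated by a known stable AR(2) model.
x = [1.0, 0.5]
for _ in range(58):
    x.append(1.5 * x[-1] - 0.9 * x[-2])
a1, a2 = fit_ar2(x)
```

Each sequence is thereby reduced to a point (a1, a2) in the parameter space of AR(2) models, which is where a metric such as the one discussed above would be evaluated.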

Results on action and ID recognition: scalar feature, AR(2) and ARMA models.

Uncertainty measures: intervals, credal sets. Assumption: not enough evidence to determine the actual probability describing the problem. A number of formalisms have been proposed to extend or replace classical probability: second-order distributions (Dirichlet), interval probabilities, credal sets; belief functions (Shafer 76): special case of credal sets.

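Belief functions as random sets. Probability on a finite set: function $p: 2^\Theta \to [0,1]$ with $p(A) = \sum_{x \in A} m(x)$, where $m: \Theta \to [0,1]$ is a mass function. Probabilities are additive: if $A \cap B = \emptyset$ then $p(A \cup B) = p(A) + p(B)$. Belief function: $b: 2^\Theta \to [0,1]$ with $b(A) = \sum_{B \subseteq A} m(B)$, if $m$ is a mass function on $2^\Theta$ s.t. $m(\emptyset) = 0$ and $\sum_{B \subseteq \Theta} m(B) = 1$.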
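
A small sketch may help contrast the two definitions. It uses the standard formula b(A) = sum of m(B) over the focal sets B contained in A; the representation of focal sets as `frozenset` objects and the example masses are choices made for illustration.

```python
def belief(mass, A):
    """b(A): total mass of the focal sets B that are subsets of A."""
    return sum(m for B, m in mass.items() if B <= A)

# Mass assigned to subsets of the frame, not to single elements.
mass = {frozenset({"a1"}): 0.7, frozenset({"a1", "a2"}): 0.3}

b_a1 = belief(mass, frozenset({"a1"}))
b_a2 = belief(mass, frozenset({"a2"}))
b_a1_a2 = belief(mass, frozenset({"a1", "a2"}))
```

Here b({a1}) + b({a2}) is 0.7 while b({a1, a2}) is 1: unlike a probability, a belief function is not additive, and the gap of 0.3 is the mass left uncommitted between a1 and a2.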

Information fusion by Dempster's rule: several aggregation or elicitation operators proposed; original proposal: Dempster's rule. Example on $\Theta = \{a_1, a_2, a_3, a_4\}$: $b_1$: $m(\{a_1\}) = 0.7$, $m(\{a_1, a_2\}) = 0.3$; $b_2$: $m(\Theta) = 0.1$, $m(\{a_2, a_3, a_4\}) = 0.9$; $b_1 \oplus b_2$: $m(\{a_1\}) = 0.7 \cdot 0.1 / 0.37 = 0.19$, $m(\{a_2\}) = 0.3 \cdot 0.9 / 0.37 = 0.73$, $m(\{a_1, a_2\}) = 0.3 \cdot 0.1 / 0.37 = 0.08$.
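
The numbers in this example can be reproduced with a few lines of code. The sketch assumes that the 0.1 mass of the second belief function sits on the whole frame {a1, a2, a3, a4}, which is the reading consistent with the normalization factor 0.37 = 1 - 0.7*0.9 used on the slide; the dictionary representation is an illustrative choice.

```python
def dempster(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) by Dempster's rule."""
    joint = {}
    conflict = 0.0
    for A, wa in m1.items():
        for B, wb in m2.items():
            C = A & B
            if C:
                joint[C] = joint.get(C, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass falling on the empty intersection
    if conflict >= 1.0:
        raise ValueError("total conflict: the combination is undefined")
    # Renormalize by the non-conflicting mass.
    return {C: w / (1.0 - conflict) for C, w in joint.items()}

theta = frozenset({"a1", "a2", "a3", "a4"})
m1 = {frozenset({"a1"}): 0.7, frozenset({"a1", "a2"}): 0.3}
m2 = {theta: 0.1, frozenset({"a2", "a3", "a4"}): 0.9}
m12 = dempster(m1, m2)
```

The only conflicting pair is {a1} with {a2, a3, a4}, carrying mass 0.63, so the three surviving products 0.07, 0.27 and 0.03 are each divided by 0.37.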

Imprecise classifiers and credal networks. Imprecise classifiers: the class estimate is a belief function; exploit only available evidence, represent ignorance. Belief networks or credal networks: at each node a belief function or a convex set of probabilities; robust version of Bayesian networks.

Model-free pose estimation: estimating the pose (internal configuration) of a moving body from the available images (t=0, ..., t=T), if you do not have an a-priori model of the object...

Learning feature-pose maps: learn a map between features and poses directly from the data; given pose and feature sequences $q_1, \dots, q_T$ and $y_1, \dots, y_T$ acquired by motion capture, a Gaussian density for each state is set up on the feature space -> approximate feature space; the map sends each cluster to the set of training poses $q_k$ whose feature $y_k$ falls inside it.

Evidential model and approximate parameter space: ... form the evidential model. (MTNS'00, ISIPTA'05, to submit to Information Fusion)

Results on human body tracking: comparison of three models: left view only, right view only, both views. Figure legend: pose estimation yielded by the overall model, estimate associated with the right model, ground truth, left model.

Conclusions - Research. Hot topic in computer vision and machine learning: human motion analysis. Applications: motion capture, surveillance, human-machine interaction, biometric identification. Different tools from machine learning, robust statistics and differential geometry can be useful. Several tasks are involved in a hierarchical fashion. Tasks are not isolated, but interact and generate feedbacks to help the solution of the others.

Conclusions - Teaching plans. Machine vision involves notions coming from different branches of pure and applied mathematics: robust statistics, differential geometry, discrete math. All of them are considered as useful tools to solve real-world problems. Students then have the chance to improve their mathematical background and learn at the same time how to develop real products on the ground. Integrated courses can be designed along this line.

Conclusions - Commercial partnerships. Several opportunities to develop technology transfer activities involving companies. Biometrics: in particular, behavioral (non-controlled) identification. Surveillance: multi-camera human motion detection and classification. Image and video browsing: internet-based content retrieval. Personal links with companies like Honeywell Labs (surveillance), Riya (image googling), MS Research.