Two-level GMM Clustering of Human Poses for Automatic Human Behavior Analysis
Víctor Ponce, Miguel Reyes, Xavier Baró, Mario Gorga, Sergio Escalera

Departament de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Spain
Centre de Visió per Computador, Universitat Autònoma de Barcelona, Spain

Abstract

Detecting human poses is a key step in human behavior analysis. Our goal is an unsupervised method that determines the key poses of a human gesture for subsequent analysis of human behavior using machine learning and computer vision techniques. We use the Kinect system to extract body features from the joint positions of an articulated human model, and then cluster global and local features with a two-level Gaussian Mixture Model (GMM) approach. The clustering levels are compared through their visualizations for expert evaluation.

Keywords: Human Pose, Human Behavior Analysis, Clustering.

Methodology

Human Pose Representation

Depth maps are acquired with the public OpenNI API [1]. The software can detect and track people up to a maximum distance of six meters from the multi-sensor device. The method of [2] is used to detect the human body and its skeletal model. It uses a very large set of human samples to infer pixel labels through Random Forest estimation, and defines the skeletal model as the centroids of mass of the different dense regions, found with the mean shift algorithm. The articulated human model is defined by a set of 15 reference points. This model is highly deformable and can therefore fit complex human poses.

Figure 1: The 3D articulated human model consisting of 15 distinctive points.

Human Pose Clustering

Our goal is to group the set of frame pose descriptions into clusters so that subsequent learning algorithms can improve generalization in Human Behavior Analysis systems. The pose descriptions are grouped into pose clusters with a standard Gaussian Mixture Model. We use a full-covariance GMM of K components, and obtain a likelihood value from the probability distributions p(·) of the GMM.

1. First-level clustering: for each joint i, i ∈ [1, …, 14], use the three spatial components of the descriptor V and fit a GMM of k_1 clusters, namely GMM_i^1.
2. Second-level clustering:
   1. Define for each pose a new feature vector of size 14 x k_1, where v_j^i is the probability of applying model GMM_i^1 to the features of V that correspond to the spatial coordinates of the j-th joint.
   2. Use the components of descriptor V and fit a GMM of k_2 clusters, namely GMM^2.

Results

Data: A set of gestures recorded with the Kinect device, comprising seven different categories. Ten different actors and different environments were considered, for a total of 130 data sequences with 32 frame gestures. The data set therefore contains the high variability of uncontrolled environments. The resolution of the depth video sequences is 340x280.

Methods and parameters: People detection is provided by the public OpenNI library. This library detects people with high accuracy and allows multiple detections even under partial occlusion.

Figure 2: Samples from clusters defined using one-level GMM (left) and two-level GMM (right), respectively.

Figure 3: Cluster assignment for some gesture pose sequences.

Figure 3 shows consecutive visual descriptions of some gestures from the data set. Below each sequence, the first row gives the cluster number assigned by a one-level GMM, and the second row gives the cluster assigned by the two-level GMM. In most cases both grouping techniques assign consecutive poses to the same clusters. However, the clusters assigned by the one-level GMM show more visual variability, which makes them inefficient for human behavior generalization purposes.

Conclusion

In this paper, we designed a data set of human actions and described individual frames using pose skeleton models from depth map information. We proposed a two-level GMM clustering algorithm to group similar poses so that subsequent Human Behavior Analysis techniques can improve generalization. We showed preliminary qualitative results comparing our approach with the classical one-level GMM clustering strategy, which show a more visually coherent grouping of poses.

References

[1] Open natural interface. November Last viewed :00.
[2] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-time human pose recognition in parts from single depth images. CVPR,
[3] V. Ponce, M. Gorga, X. Baró, S. Escalera. Human Behavior Analysis from Video Data using Bag-of-Gestures. International Joint Conference on Artificial Intelligence (IJCAI 2011). ISBN Vol 3, ISBN (electronic proceedings), AAAI Press, pp
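
The equations for the full-covariance GMM of K components mentioned under Human Pose Clustering are not preserved in this transcript. As a reminder only, the usual textbook form of such a model is sketched below. The symbols λ, w_k, μ_k, Σ_k and the feature dimension d are standard notation assumed here; they are not taken from the poster, whose own notation may differ.

```latex
\begin{align}
\lambda &= \{ w_k, \mu_k, \Sigma_k \}_{k=1}^{K}, \qquad \sum_{k=1}^{K} w_k = 1, \\
p(x \mid \lambda) &= \sum_{k=1}^{K} w_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \\
\mathcal{N}(x \mid \mu_k, \Sigma_k) &= \frac{\exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \right)}{(2\pi)^{d/2} \, |\Sigma_k|^{1/2}}.
\end{align}
```

Here x is one feature vector (for the first level, the three spatial coordinates of a single joint, so d = 3), and p(x | λ) is the likelihood value the text refers to.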
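
The two-level procedure described under Human Pose Clustering can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `GaussianMixture`, treats "the probability of applying" a first-level model as the posterior component probabilities, assumes the second level clusters the new 14 x k_1 descriptor, and runs on random synthetic data. The function name `two_level_gmm`, the values of `k1` and `k2`, and the `reg_covar` settings are all hypothetical choices.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def two_level_gmm(poses, k1=2, k2=3, seed=0):
    """Cluster poses of shape (n_frames, n_joints, 3).

    Returns one cluster label per frame and the second-level descriptor.
    """
    n_frames, n_joints, _ = poses.shape
    level1_probs = []
    for i in range(n_joints):
        # First level: one full-covariance GMM per joint on its (x, y, z) values.
        gmm_i = GaussianMixture(n_components=k1, covariance_type="full",
                                reg_covar=1e-4, random_state=seed)
        gmm_i.fit(poses[:, i, :])
        # Assumption: use the posterior probability of each of the k1 components.
        level1_probs.append(gmm_i.predict_proba(poses[:, i, :]))

    # New descriptor per pose: n_joints * k1 probabilities, grouped joint by joint.
    descriptor = np.hstack(level1_probs)

    # Second level: a single GMM of k2 clusters over the new descriptor.
    # The probabilities of each joint sum to 1, so a larger reg_covar keeps
    # the full covariance matrices well conditioned.
    gmm2 = GaussianMixture(n_components=k2, covariance_type="full",
                           reg_covar=1e-3, random_state=seed)
    labels = gmm2.fit_predict(descriptor)
    return labels, descriptor


# Synthetic stand-in for skeleton data: 200 frames, 14 joints, 3 coordinates.
rng = np.random.default_rng(0)
demo_poses = rng.normal(size=(200, 14, 3))
demo_labels, demo_descriptor = two_level_gmm(demo_poses)
print(demo_labels.shape, demo_descriptor.shape)
```

A one-level baseline in the same spirit would flatten each pose to a single 42-dimensional vector and fit one GMM directly, which is the comparison the Results section discusses qualitatively.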