Learning to Estimate Human Pose with Data Driven Belief Propagation. Gang Hua, Ming-Hsuan Yang, Ying Wu. CVPR 2005.



Project Goal
- Learning human motion: need to know the human body configurations
- Detect human body parts from a single image: where are the head, arms, legs, and torso?
- Estimate human body configurations: what are the size, location, and orientation?
- First step (i.e., initialization) for full human body tracking
- [Figure: statistical inference maps the input image to an output giving the location, size, and orientation of the head, arms, and legs.]

Challenges
- Large variation in pose
- Occlusion: some parts are not visible
- Lighting variation: affects appearance
- Cluttered background: noisy visual cues
- High-dimensional state variables

Main Idea
- Analysis by synthesis (i.e., hypothesize and test)
- Statistical inference
- Locate body parts using cues
- Importance sampling
- Learn the shapes of human body parts
- Intelligently guess some possible answers, i.e., assemblies of body parts
- Match each guessed answer with the image observation using the shape prior and geometry constraints
- [Figure: image → potential body parts (visual cues & importance sampling) → assemblies of body parts → best assembly (local observation & belief propagation). Panels: head sample, torso sample, upper leg sample, lower arm sample. Which observed assembly looks most likely to be a human?]

In Plain English
- Learning shape → collect prior knowledge of body parts
- Importance sampling → intelligent guess of the answer
- Observation → what is seen in the image, such as appearance, color, and edges
- Belief → local evidence
- Belief propagation → inference using all relevant local evidence
- Potential functions → encode constraints
- [Figure: head, torso, upper leg, and lower arm samples. Which observed assembly looks most likely to be a human?]

Markov Network
- $X_i$: pose state of each limb
- $Z_i$: image observation of each limb
- $\Psi_{ij}(X_i, X_j)$: each undirected link represents a potential function
- $\Phi_i(Z_i \mid X_i)$: each directed link represents an observation likelihood
- Goal: infer $P(X_i \mid Z)$, i.e., P(state variables | image observations)

Learning Body Shapes
- For each body part, normalize the labeled shape and learn a low-dimensional representation, $ps_i$, using probabilistic Principal Component Analysis (PCA)
- Pose parameters: $X_i = \{ps_i, s_x, s_y, \theta, t_x, t_y\}$
- [Figure: normalizing the labeled shape. (1) Normalized shape, (2) originally labeled shape, and (3) reconstructed shape.]
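The slides do not say how the probabilistic PCA model is fitted. As a rough sketch, the closed-form maximum-likelihood solution of Tipping and Bishop could be used. The function names (`fit_ppca`, `encode`, `decode`) and the layout of the input (one flattened, normalized contour per row) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def fit_ppca(shapes, q):
    """Closed-form maximum-likelihood probabilistic PCA.

    shapes: (N, d) array, one flattened normalized contour per row (assumed layout).
    q: latent dimension, must satisfy q < d.
    Returns the mean (d,), loading matrix W (d, q), and noise variance sigma2.
    """
    shapes = np.asarray(shapes, dtype=float)
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    cov = centered.T @ centered / shapes.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                 # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = float(eigvals[q:].mean())                # average discarded variance
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale                        # scale each principal axis
    return mean, W, sigma2

def encode(shape, mean, W, sigma2):
    """Posterior mean of the latent code for one shape vector."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (np.asarray(shape, dtype=float) - mean))

def decode(code, mean, W):
    """Map a latent code back to shape space (slightly shrunk toward the mean)."""
    return mean + W @ code
```

In this sketch the vector returned by `encode` would play the role of $ps_i$, and `decode` gives a reconstructed shape of the kind shown in panel (3) of the slide.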

Face Detection for Head Pose
- AdaBoost-based face detector
- Detection results are good but not precise
- 2-class k-means algorithm to cluster skin color pixels
- The head pose hypothesis $Ix_h$ is obtained by re-centering the face rectangle to the centroid of the skin color cluster and then projecting to the head PCA space
- Gaussian importance function
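The re-centering step can be illustrated with a small sketch. It assumes the face detector has already returned a rectangle `(x, y, w, h)` lying inside the image, and it picks the skin cluster as the one closest to a reference color. That selection rule, the `skin_reference` parameter, and the function names are hypothetical. The projection to the head PCA space is not shown.

```python
import numpy as np

def two_means(pixels, iters=20):
    """2-class k-means (Lloyd's algorithm) on an (N, C) array of color vectors."""
    pixels = np.asarray(pixels, dtype=float)
    brightness = pixels.sum(axis=1)
    # Deterministic initialization: darkest and brightest pixel.
    centers = np.stack([pixels[brightness.argmin()], pixels[brightness.argmax()]])
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        dist = ((pixels[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pixels[labels == k].mean(axis=0)
    return labels, centers

def recenter_box(image, box, skin_reference):
    """Shift a face rectangle so its center sits on the skin-cluster centroid.

    image: (H, W, C) array; box: (x, y, w, h) in pixels, inside the image.
    skin_reference: (C,) color used to decide which cluster is skin (assumption).
    """
    x, y, w, h = box
    patch = image[y:y + h, x:x + w].reshape(-1, image.shape[2])
    labels, centers = two_means(patch)
    skin = int(np.argmin(((centers - skin_reference) ** 2).sum(axis=1)))
    rows, cols = np.nonzero((labels == skin).reshape(h, w))
    cx, cy = x + cols.mean(), y + rows.mean()         # centroid in image coordinates
    return (cx - w / 2.0, cy - h / 2.0, w, h)
```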

Arm/Leg Importance Functions
- Image-specific skin color segmentation
- Least-squares rectangle fitting for lower-arm & upper-leg hypotheses
- Upper-arm & lower-leg hypotheses from constrained local search
- Gaussian mixture importance function
- [Figure panels: skin color segmentation, rectangle fitting, upper-arm & lower-leg search.]

Torso Pose Importance Function
- Probabilistic Hough transform to detect line segments
- Lines are assembled into quad-shapes and pruned
- A Canny-edge-masked likelihood, indexed by hypothesis $n$, is evaluated for each good hypothesis $Ix_t^{(n)}$
- Gaussian mixture importance function
- [Figure panels: results from the Hough transform, torso hypothesis.]

Potential Constraint
- Encode physical constraints of human body parts
- Link points are defined between two adjacent body parts
- The potential function is defined by a Gaussian radial basis function
- [Figure: defined link points.]
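A Gaussian radial basis potential over link points might look like the sketch below. The bandwidth `sigma`, the scale-then-rotate-then-translate order in `place_link_point`, and both function names are assumptions, since the slide gives only the general form.

```python
import numpy as np

def place_link_point(local_point, sx, sy, theta, tx, ty):
    """Map a link point from a part's normalized frame into the image.

    Assumed order: scale by (sx, sy), rotate by theta, translate by (tx, ty).
    """
    x, y = local_point[0] * sx, local_point[1] * sy
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * x - s * y + tx, s * x + c * y + ty])

def link_potential(link_i, link_j, sigma=5.0):
    """Gaussian radial basis potential between the link points of adjacent parts.

    Returns 1.0 when the two link points coincide and decays with their distance.
    """
    diff = np.asarray(link_i, dtype=float) - np.asarray(link_j, dtype=float)
    return float(np.exp(-np.sum(diff ** 2) / (2.0 * sigma ** 2)))
```

With this form, two adjacent parts whose link points coincide get the highest potential, and the value falls off smoothly as the joint is pulled apart, which is one way to express the physical constraint as a soft penalty.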

Likelihood Model
- Average normalized steered edge response in the R, G, and B bands
- The likelihood is the maximum of the three
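The following is a simplified sketch of such a likelihood. It uses plain finite-difference gradients in place of steerable filters and normalizes each band by its peak gradient magnitude. Both choices, along with the function name and argument layout, are assumptions made for illustration.

```python
import numpy as np

def steered_edge_likelihood(image, points, normals):
    """Maximum over R, G, B of the average normalized steered edge response.

    image: (H, W, 3) float array.
    points: (N, 2) integer (row, col) samples along a part boundary.
    normals: (N, 2) unit vectors (d_row, d_col) normal to the boundary.
    """
    image = np.asarray(image, dtype=float)
    points = np.asarray(points, dtype=int)
    normals = np.asarray(normals, dtype=float)
    r, c = points[:, 0], points[:, 1]
    scores = []
    for band in range(3):
        g_row, g_col = np.gradient(image[:, :, band])
        peak = np.hypot(g_row, g_col).max()
        if peak == 0.0:                               # flat band: no edge evidence
            scores.append(0.0)
            continue
        # Gradient component along the boundary normal at each sample point.
        steered = np.abs(g_row[r, c] * normals[:, 0] + g_col[r, c] * normals[:, 1])
        scores.append(float(np.mean(steered / peak)))
    return max(scores)
```

Taking the maximum over the three bands means an edge that is visible in only one color channel still contributes full evidence.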

Experiment: Likelihood Model
- [Figure: translation of the left lower leg, and the resulting curve of the likelihood value.]

Joint Posterior Distribution
- The joint posterior distribution of the Markov network is defined over $X = \{X_1, X_2, \ldots, X_9\}$ [equation not preserved in the transcript]
- The goal is to infer the marginal posterior $P(X_i \mid Z)$, i.e., P(configuration of body part i | image observation)
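The equation on this slide did not survive extraction. Assuming the standard factorization of a pairwise Markov network, and using the potentials $\Psi_{ij}$ and likelihoods $\Phi_i$ from the Markov Network slide, the joint posterior would take roughly the following form. The edge set $E$ (the undirected links between adjacent body parts) is notation introduced here, not taken from the slide.

```latex
P(X \mid Z) \;\propto\; \prod_{(i,j) \in E} \Psi_{ij}(X_i, X_j) \; \prod_{i=1}^{9} \Phi_i(Z_i \mid X_i),
\qquad X = \{X_1, X_2, \ldots, X_9\}
```

The marginal posterior $P(X_i \mid Z)$ is then obtained by integrating this product over all $X_j$ with $j \neq i$, which is what belief propagation approximates through local messages.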

Belief Propagation
- Message passing: evidence from neighboring nodes combines with local evidence from the observation
- Non-Gaussian distributions make a closed-form implementation intractable
- Belief propagation Monte Carlo
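To make the idea concrete, here is a minimal sketch of sample-based message passing with a fixed set of importance samples per node. It is a simplification under stated assumptions: states are scalars, samples are not redrawn between iterations, and the function name and argument layout are hypothetical. It is not the authors' algorithm.

```python
import numpy as np

def bp_monte_carlo(samples, evidence, proposal, edges, potential, iters=3):
    """Sample-based belief propagation on an undirected graph.

    samples:  dict node -> (N,) array of state samples drawn from a proposal.
    evidence: dict node -> (N,) local likelihood evaluated at each sample.
    proposal: dict node -> (N,) proposal density evaluated at each sample.
    edges:    list of (i, j) undirected links.
    potential(a, b): pairwise potential, must broadcast over arrays.
    Returns dict node -> (N,) normalized belief weights over that node's samples.
    """
    nodes = list(samples)
    neighbors = {i: [] for i in nodes}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # messages[(j, i)][n]: message from j to i, evaluated at i's n-th sample.
    messages = {(j, i): np.ones(len(samples[i])) for i in nodes for j in neighbors[i]}
    for _ in range(iters):
        updated = {}
        for (j, i) in messages:
            # Importance weight of each sample at j, excluding what i sent to j.
            w = evidence[j] / proposal[j]
            for k in neighbors[j]:
                if k != i:
                    w = w * messages[(k, j)]
            # Rows index samples of i, columns index samples of j.
            psi = potential(samples[i][:, None], samples[j][None, :])
            m = psi @ w
            updated[(j, i)] = m / m.sum()
        messages = updated
    beliefs = {}
    for i in nodes:
        b = evidence[i] / proposal[i]
        for j in neighbors[i]:
            b = b * messages[(j, i)]
        beliefs[i] = b / b.sum()
    return beliefs
```

Each belief combines the node's own local evidence with the messages from its neighbors, mirroring the description on the slide. The weighted samples stand in for the non-Gaussian distributions that have no closed form.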

Belief Propagation Monte Carlo

Experimental Results

State of the Art
Comparison of the proposed method with the USC and Brown approaches.

Proposed method
- Set-up: single frame
- Algorithm: Data Driven Belief Propagation Monte Carlo (DDBPMC)
- Characteristics: efficient (+); well-posed problem (+); more robust to lighting change (+); can be applied to ASIMO directly (++); easily extended to a full body tracker (+); numerous experiments (++)
- Overall: ++
- Speed: 2 to 3 minutes per frame

USC
- Algorithm: Markov Chain Monte Carlo (MCMC)
- Characteristics: ad hoc (-); not a well-posed problem (-); may be sensitive to lighting change (-); not applicable to ASIMO directly (-); may not be extended to a full body tracker (-); few results are available (--)
- Overall: -
- Speed: 5+ minutes per frame

Brown
- Set-up: multi-view and video
- Algorithm: Belief Propagation (BP) and PAMPAS
- Characteristics: systematic (+); works for a specific environment (-); may be sensitive to lighting change (-); requires multiple cameras (--); may be extended to a full body tracker (+); few results are available (--)
- Overall: +
- Speed: unknown, but should be more than 3 minutes per frame

Limitations of Current Work
The method currently requires:
- Some skin color regions
- Face in frontal pose
- Reasonable contrast (visible edges)
- Low degree of occlusion

Concluding Remarks
- A novel algorithm for pose estimation
- Principled statistical formulation for recovering human pose in 2-D
- A working prototype
- Work towards full human body tracking