Combining Geometric- and View-Based Approaches for Articulated Pose Estimation
David Demirdjian, MIT Computer Science and Artificial Intelligence Laboratory

Overview
We propose an efficient real-time approach for articulated body tracking that combines:
- a geometric model-based tracker: a local fit of a 3D articulated CAD model to stereo data, and
- a view-based model: a (global) search of the current image in a collection of views, followed by a local pose prediction.

We introduce a view-based model that contains views of a person under various articulated poses; it is built and updated online. Our main contribution is to model, in each key frame, the pose change as a linear transformation of the view change.

[System diagram: stereo data feeds the model-based (ICP) tracker, which outputs P(g); the intensity image I is matched against the key frames J1, ..., JN of the view-based model, and the detected key frame Jk yields the prediction P(v); the two estimates are fused and the view-based model is updated.]

Key frames
The view-based model consists of a collection of key frames. Each key frame maps a view J and its local variation to a pose P. Variations of the view J are modeled by the apparent motion dx (image flow) of a set of support points x in the view, and the pose is modeled linearly as P = P0 + L dx. A key frame is therefore characterized by {J, P0, x, L}: the view, a reference pose, the support points, and the motion-pose Jacobian.

Pose prediction
Given a new image I, a prediction of the corresponding pose P is estimated by first searching for the key frame Jk closest to I with respect to an image distance d(I1, I2), e.g. L2 (weighted by foreground weights when applicable).

Fusion
Given the estimate P(g) from the geometric model-based tracker and the estimate P(v) from the view-based model prediction, the final pose P is chosen as the candidate minimizing the 3D fitting error function E2():
P = arg min E2(P), P ∈ {P(g), P(v)}

[Figure: comparative results (re-projection of the 3D articulated model) on a sequence of more than 1500 images.]
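The search-predict-fuse pipeline described above can be sketched as follows. This is a minimal illustration with hypothetical names, assuming key frames are stored as dicts holding the view J, the reference pose P0 and the Jacobian L, and that the flow dx of the support points is computed elsewhere:

```python
import numpy as np

def closest_key_frame(I, key_frames):
    """Step 1: find the key frame Jk whose view minimizes the L2 image distance to I."""
    return min(key_frames, key=lambda kf: np.linalg.norm(I - kf["J"]))

def predict_pose(kf, dx):
    """Local linear model of a key frame: P = P0 + L @ dx."""
    return kf["P0"] + kf["L"] @ dx

def fuse(P_g, P_v, E2):
    """Fusion: keep whichever candidate pose has the smaller 3D fitting error E2."""
    return P_g if E2(P_g) <= E2(P_v) else P_v
```

The fusion step is a hard selection between the two candidates, not a blend; the fitting error E2 is the same one the geometric tracker minimizes.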
The linear motion-pose model allows for:
- refining the estimate of the reference pose P0 of a key frame, and
- predicting the pose in a new image.

The articulated pose is computed by fusing the estimates provided by the two techniques. Pose prediction is completed by estimating the flow dx of the support points x between I and the detected key frame Jk, then predicting the pose P with the local linear model.

Learning/updating the view-based model
The view-based model is learned online; at the beginning it is bootstrapped using the estimates of the geometric model-based tracker.
- Key frame selection. Goal: cover as much of the appearance space as possible with a fixed number N of key frames. Key frames Jk are selected so as to maximize the inter-image distance within the collection.
- Key frame update. The support points x are estimated as the image points belonging to the subject (e.g. using foreground detection, optical flow, ...). The parameters P0 and L are estimated from a set of n observations (dx(n), P(n)) using a robust estimation technique.

Experiments
[Plot: average error between the estimated 3D articulated model and the 3D scene reconstruction vs. number of frames; peaks in the ICP-only curve correspond to tracking failures.]
[Table: average percentage of frames correctly tracked over 20 sequences of about 1000 frames each, using ICP (geometric model-based) only vs. ICP + the view-based model.]

Geometric model-based tracking
The geometric model-based tracker estimates the pose P that minimizes a fitting error E2(P). Given an initial pose P0, this amounts to estimating a body transformation D that minimizes F2(D) = E2(D(P0)). The constrained minimization of F2(D) is performed in two steps [Demirdjian et al., ICCV'03]:
1. an unconstrained minimization, followed by
2. the resolution of a quadratic programming problem.

Unconstrained minimization: find a body transformation D (and uncertainty L) that minimizes F2(D). This is done by applying the ICP algorithm independently to each limb, without accounting for the articulated constraints between limbs.
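The core of each per-limb ICP iteration is a rigid alignment between matched point sets. Below is a minimal sketch assuming the correspondences between the limb model and the 3D data are already known (full ICP recomputes them as nearest neighbors at every iteration); it uses the standard SVD-based (Kabsch) solution rather than anything specific to the poster:

```python
import numpy as np

def rigid_fit(St, Sr):
    """Find rotation R and translation t minimizing ||R @ St[i] + t - Sr[i]||^2
    over matched point sets St, Sr of shape (n, 3) (Kabsch, via SVD)."""
    mu_t, mu_r = St.mean(axis=0), Sr.mean(axis=0)
    H = (St - mu_t).T @ (Sr - mu_r)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_r - R @ mu_t
    return R, t
```

In ICP proper this solve alternates with nearest-neighbor matching until the fitting error stops decreasing.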
ICP
ICP finds the rigid transformation that maps a shape St (the limb model) onto a shape Sr (the 3D data).

Quadratic programming problem
The articulated constraints are then enforced by finding the constrained transformation D* that minimizes the Mahalanobis distance to the unconstrained estimate, with D* satisfying the articulation constraints.

Future work
- Modeling appearance: improving the view-based model to account for appearance (e.g. texture) variation across people.
- Adding dynamic constraints to improve robustness and reduce tracking "jumpiness".
- Probabilistic fusion to account for the uncertainty of the pose estimates.
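Returning to the key-frame update step, the parameters P0 and L of the linear model could be fitted from the observations (dx(n), P(n)) as sketched below, with plain least squares standing in for the robust estimator used in the poster, and hypothetical array shapes:

```python
import numpy as np

def fit_motion_pose_model(dX, P):
    """Estimate (P0, L) of the linear model P = P0 + L @ dx from n observations.
    dX: (n, d) stacked support-point flows; P: (n, p) stacked poses.
    Plain least squares; the poster uses a robust estimator instead."""
    n = dX.shape[0]
    A = np.hstack([np.ones((n, 1)), dX])          # design matrix [1, dx]
    coef, *_ = np.linalg.lstsq(A, P, rcond=None)  # shape (1 + d, p)
    P0 = coef[0]      # reference pose, shape (p,)
    L = coef[1:].T    # motion-pose Jacobian, shape (p, d)
    return P0, L
```

A robust variant (e.g. iteratively reweighted least squares or RANSAC over the observations) would down-weight flow outliers from bad foreground masks.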