1
Learning and Recognizing Human Dynamics in Video Sequences (Christoph Bregler)
Presented by: Anand D. Subramaniam, Electrical and Computer Engineering Dept., University of California, San Diego
Seminar on Vision and Learning, University of California, San Diego, September 20, 2001
2
Outline
Gait recognition
The layering approach
Layer one: input image sequence (optical flow)
Layer two: coherence blob hypotheses (EM clustering)
Layer three: simple dynamical categories (Kalman filters)
Layer four: complex movement sequences (hidden Markov models)
Model training
Simulation results
3
Gait Recognition: running, walking, skipping
4
The Layering Approach: Layer 1, Layer 2, Layer 3, Layer 4
5
Input Image Sequence (Layer 1)
The feature vector comprises optical flow, color value, and pixel value.
Key ingredients: the optical flow equation, an affine motion model, and an affine warp.
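The equations named above are shown as images on the original slide and are not in this transcript. As a rough sketch of the standard setup (my notation, an assumption rather than the slide's): brightness constancy gives I_x u + I_y v + I_t = 0, and an affine motion model writes the flow at pixel (x, y) as u = a1 + a2 x + a3 y, v = a4 + a5 x + a6 y, so the six affine parameters can be estimated from image gradients by linear least squares:

```python
import numpy as np

def affine_flow(Ix, Iy, It, xs, ys):
    """Least-squares fit of an affine flow field u = a1 + a2*x + a3*y,
    v = a4 + a5*x + a6*y from the brightness-constancy constraint
    Ix*u + Iy*v + It = 0. Inputs are 1-D arrays over the pixels of one region."""
    # Each pixel contributes one row of the linear system M a = -It.
    M = np.column_stack([Ix, Ix * xs, Ix * ys, Iy, Iy * xs, Iy * ys])
    a, *_ = np.linalg.lstsq(M, -It, rcond=None)
    return a  # (a1..a6): translation terms plus linear deformation terms
```

The corresponding affine warp maps a pixel (x, y) to (x + u(x, y), y + v(x, y)).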
6
(Figure-only slide.)
7
Expectation Maximization Algorithm
EM is an iterative algorithm that computes locally optimal solutions to certain cost functions.
EM simplifies a complex cost function into a set of easily solvable cost functions by introducing "missing" data.
Here the missing data is the indicator function recording which cluster each sample belongs to.
8
Expectation Maximization Algorithm
EM iterates between two steps:
E-step: compute the conditional expectation of the missing data, given the previous estimate of the model parameters and the observations.
M-step: re-estimate the model parameters given the soft clustering produced by the E-step.
EM is numerically stable: the likelihood is non-decreasing with every iteration.
EM converges to a local optimum, with linear convergence.
9
Density Estimation using EM
Gaussian mixture models can approximate any given probability density function to arbitrary accuracy, given a sufficient number of clusters (curve fitting with Gaussian kernels).
For a given number of clusters, EM effectively minimizes the Kullback-Leibler divergence between the target pdf and the class of Gaussian mixture models with that many clusters.
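To make the EM recipe concrete, here is a minimal 1-D Gaussian-mixture fit (a generic sketch under my own choices of K, initialization, and iteration count, not the paper's layer-2 code):

```python
import numpy as np

def gmm_em_1d(x, K=3, iters=50):
    """Fit a K-component 1-D Gaussian mixture to samples x by EM."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    pi = np.full(K, 1.0 / K)                    # mixing weights
    mu = np.random.choice(x, K, replace=False)  # initial means
    var = np.full(K, np.var(x))                 # initial variances
    for _ in range(iters):
        # E-step: responsibilities r[i, k] = P(cluster k | x_i)
        d = (x[:, None] - mu) ** 2
        log_p = -0.5 * (d / var + np.log(2 * np.pi * var)) + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances from the soft counts
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```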
10
Coherence Blob Hypotheses (Layer 2)
Key ingredients: the likelihood equation, the mixture model, the missing data, and the simplified cost functions.
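The four equations referred to above are not reproduced in this transcript. In standard mixture-model notation (an assumed form, with f_i the feature vector of pixel i and \theta_k the parameters of blob k), the likelihood and missing data would read roughly

\log L(\theta) = \sum_i \log \sum_{k=1}^{K} \pi_k \, p(f_i \mid \theta_k), \qquad z_{ik} \in \{0, 1\} indicating whether pixel i belongs to blob k.

The E-step then computes the soft assignments E[z_{ik}] = \pi_k p(f_i \mid \theta_k) / \sum_j \pi_j p(f_i \mid \theta_j), and the M-step re-estimates each blob's parameters from the pixels weighted by those assignments; these weighted per-blob fits are the simplified cost functions.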
11
EM Initialization
To initialize EM for a given frame, we need to track the temporal variation of the blob parameters. Two options:
Kalman filters
Recursive EM using conjugate priors
12
(Figure-only slide.)
13
All Roads Lead From Gauss (1809)
"… since all our measurements and observations are nothing more than approximations to the truth, the same must be true of all calculations resting upon them, and the highest aim of all computations made concerning concrete phenomena must be to approximate, as nearly as practicable, to the truth. But this can be accomplished in no other way than by a suitable combination of more observations than the number absolutely requisite for the determination of the unknown quantities. This problem can only be properly undertaken when an approximate knowledge of the orbit has been already attained, which is afterwards to be corrected so as to satisfy all the observations in the most accurate manner possible."
From Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections, Gauss, 1809
14
Estimation Basics
Problem statement: the observation random variable X and the joint probability density f(x, y) are given; the target random variable Y is unknown.
What is the best estimate y_opt = g(x) that minimizes the expected mean-square error between y_opt and Y?
Answer: the conditional mean, g(x) = E[Y | X = x].
The estimate g(x) can be nonlinear and unavailable in closed form; when X and Y are jointly Gaussian, g(x) is linear.
What is the best linear estimate y_lin = W x that minimizes the mean-square error?
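A standard result that fills in the step between the last two bullets (not spelled out in the transcript): when X and Y are jointly Gaussian,

E[Y \mid X = x] = \mu_Y + R_{YX} R_{XX}^{-1} (x - \mu_X),

which is affine in x; for zero-mean variables it reduces to W x with W = R_{YX} R_{XX}^{-1}, exactly the Wiener solution on the next slide.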
15
Wiener Filter (1940)
Wiener-Hopf solution: W = R_YX (R_XX)^{-1}
Involves a matrix inversion.
Applies only to stationary processes.
Not amenable to online recursive implementation.
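As a quick illustration (a toy example of my own, not from the talk), the Wiener solution can be estimated from sample covariances; note the batch matrix inversion the slide warns about:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.standard_normal((n, 3))                       # observations X (zero mean)
true_w = np.array([[1.0, -2.0, 0.5]])
y = x @ true_w.T + 0.1 * rng.standard_normal((n, 1))  # target Y

# Wiener-Hopf: W = R_YX (R_XX)^{-1}, with covariances estimated from samples.
Rxx = x.T @ x / n
Ryx = y.T @ x / n
W = Ryx @ np.linalg.inv(Rxx)

y_hat = x @ W.T                                       # linear MMSE estimate of Y
print("W =", W, "MSE =", np.mean((y - y_hat) ** 2))
```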
16
Kalman Filter
The estimate can be obtained recursively.
Can be applied to non-stationary processes.
If the measurement noise and process noise are white and Gaussian, then the filter is "optimal": it gives the minimum-variance unbiased estimate.
In the general case, the Kalman filter is the best among all linear estimators.
Built on a state-space model: a process model plus a measurement model.
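The process and measurement equations shown on the slide are not in this transcript; the standard discrete-time linear form (as in the Kalman-filter tutorial at the web site cited in the references) is

x_k = A x_{k-1} + B u_{k-1} + w_{k-1}, with process noise w_{k-1} ~ N(0, Q),
z_k = H x_k + v_k, with measurement noise v_k ~ N(0, R).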
17
The Water Tank Problem: a worked example, defined by its own process model and measurement model.
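The slide's equations are not in the transcript. A common textbook form of the water-tank example (an assumption about the exact model used here) treats the tank level as a scalar state that is nominally constant and is observed through a noisy sensor:

x_k = x_{k-1} + w_{k-1},   z_k = x_k + v_k,

i.e. A = 1 and H = 1 in the general state-space model above.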
18
What does a Kalman filter do?
The Kalman filter propagates the conditional density of the state in time.
19
How does it do it?
The Kalman filter iterates between two steps:
Time update (predict): project the current state and covariance forward to the next time step, that is, compute the next a priori estimates.
Measurement update (correct): update the a priori quantities using the noisy measurements, that is, compute the a posteriori estimates.
The Kalman gain K_k is chosen to minimize the a posteriori error covariance.
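A minimal sketch of the two steps for the general linear model (generic numpy code, not the paper's tracker; the control input is omitted and the water-tank numbers in the demo loop are invented):

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """One Kalman iteration: time update (predict), then measurement update (correct)."""
    # Time update: project state and covariance forward (a priori estimates).
    x_prior = A @ x
    P_prior = A @ P @ A.T + Q
    # Measurement update: the gain K minimizes the a posteriori error covariance.
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post

# Toy water-tank run: scalar level, A = H = 1, noisy level readings.
A = H = np.eye(1)
Q, R = np.array([[1e-4]]), np.array([[0.25]])
x, P = np.zeros(1), np.eye(1)
for z in [5.2, 4.9, 5.1, 5.0]:
    x, P = kalman_step(x, P, np.array([z]), A, H, Q, R)
print(x, P)   # the estimate converges toward the true level as measurements arrive
```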
20
Applications: GPS, satellite orbit computation, active noise control, tracking.
21
The Layering Approach: Layer 1, Layer 2, Layer 3, Layer 4
22
Simple Dynamical Categories (Layer 3)
A sequence of blob estimates for blob k at times t, t+1, ..., t+d is grouped into dynamical categories; the group assignment is "soft".
The dynamical categories are represented by a set of M second-order linear dynamical systems.
Each category corresponds to a certain phase of a gait cycle; the categories are called "movemes" (by analogy with "phonemes").
D_m(t, k): the probability that blob k at time t belongs to dynamical category m.
Q(t) = A0_m Q(t-1) + A1_m Q(t-2) + B_m w
where Q(t) is the motion estimate of blob k at time t, w is the system noise, and C_m = B_m (B_m)^T is the system covariance.
The dynamical systems form the states of a hidden Markov model.
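A sketch of how one such moveme model could be fit and scored (my own illustration; the least-squares fit and the Gaussian residual score are assumptions, not necessarily the paper's exact estimator):

```python
import numpy as np

def fit_moveme(Q):
    """Fit Q(t) = A0 Q(t-1) + A1 Q(t-2) + B w to a trajectory Q of shape (T, d)
    by least squares; return A0, A1 and the residual covariance C = B B^T."""
    Y = Q[2:]                          # targets Q(t)
    X = np.hstack([Q[1:-1], Q[:-2]])   # regressors [Q(t-1), Q(t-2)]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    d = Q.shape[1]
    A0, A1 = coef[:d].T, coef[d:].T
    resid = Y - X @ coef
    return A0, A1, resid.T @ resid / len(resid)

def moveme_loglik(Q, A0, A1, C):
    """Log-likelihood of a trajectory under one moveme m; the soft assignment
    D_m(t, k) follows by normalising such scores across the M movemes."""
    resid = Q[2:] - Q[1:-1] @ A0.T - Q[:-2] @ A1.T
    d = Q.shape[1]
    _, logdet = np.linalg.slogdet(C)
    quad = np.einsum('ti,ij,tj->t', resid, np.linalg.inv(C), resid)
    return -0.5 * np.sum(quad + logdet + d * np.log(2 * np.pi))
```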
23
(Figure-only slide.)
24
The Model
25
Trellis representation
26
HMMs in speech
27
HMM Model Parameters
State transition matrix: A
Observation probability distribution: B
Number of states: N
Number of observation symbols: M
Initial state distribution: π
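In Rabiner's notation the model is the triple λ = (A, B, π). A tiny concrete instance (numbers invented purely for illustration, with N = 3 states and M = 2 observation symbols):

```python
import numpy as np

N, M = 3, 2
A = np.array([[0.7, 0.2, 0.1],    # state transition matrix, rows sum to 1
              [0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7]])
B = np.array([[0.9, 0.1],         # B[i, k] = P(observe symbol k | state i)
              [0.5, 0.5],
              [0.1, 0.9]])
pi = np.array([0.6, 0.3, 0.1])    # initial state distribution
lam = (A, B, pi)                  # the HMM "model" lambda
```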
28
Three Basic Problems
1. Given the observation sequence O = O_1 O_2 ... O_T and a model λ, how do we efficiently compute P(O | λ), the probability of the observation sequence given the model? (Forward-Backward algorithm)
2. Given the observation sequence O = O_1 O_2 ... O_T and the model λ, how do we choose a corresponding state sequence Q = q_1 q_2 ... q_T that best "explains" the observations? (Viterbi algorithm)
3. How do we adjust the model parameters λ to maximize P(O | λ)? (Baum-Welch algorithm)
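As an illustration of problem 2, here is a minimal log-domain Viterbi decoder for a discrete-observation HMM (a generic sketch, not the paper's implementation); the forward algorithm for problem 1 has the same recursive structure with a sum in place of the max:

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most likely state sequence q_1 ... q_T for observation symbol indices obs."""
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))           # best log-score of any path ending in each state
    psi = np.zeros((T, N), dtype=int)  # back-pointers
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA     # scores[i, j]: from state i to state j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack from the best final state.
    q = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        q.append(int(psi[t][q[-1]]))
    return q[::-1]
```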
29
How do they work? Key ideas:
Both the Forward-Backward algorithm and the Viterbi algorithm solve their respective problems by induction (recursively); the induction is a consequence of the Markov property of the model.
Baum-Welch is exactly the EM algorithm with a different "missing parameter": here the missing parameter is the state that a particular observation belongs to.
30
The Layering Approach: Layer 1, Layer 2, Layer 3, Layer 4
31
Complex Movement Sequences (Layer 4)
Each dynamical system becomes a state of a hidden Markov model.
Different gaits are modeled using different HMMs.
The paper uses 33 sequences of 5 different subjects performing 3 different gait categories.
Classification: choose the HMM with the maximum likelihood given the observation.
The proportion of correctly classified gait cycles in the test set varied from 86% to 93%.
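A sketch of that classification rule: score the observed sequence under each gait's HMM with the (scaled) forward algorithm and pick the maximum (generic code; the per-gait parameters are assumed to come from Baum-Welch training, and the function and variable names are my own):

```python
import numpy as np

def forward_loglik(obs, A, B, pi):
    """log P(O | lambda) via the scaled forward algorithm, discrete observations."""
    alpha = pi * B[:, obs[0]]
    s = alpha.sum()
    logp, alpha = np.log(s), alpha / s
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # forward recursion
        s = alpha.sum()
        logp, alpha = logp + np.log(s), alpha / s
    return logp

def classify_gait(obs, models):
    """models: dict mapping gait name -> (A, B, pi). Return the maximum-likelihood gait."""
    return max(models, key=lambda g: forward_loglik(obs, *models[g]))
```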
32
References: EM Algorithm
A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm", Journal of the Royal Statistical Society, Series B, vol. 39, 1977.
R. A. Redner and H. F. Walker, "Mixture Densities, Maximum Likelihood and the EM Algorithm", SIAM Review, vol. 26, no. 2, April 1984.
G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, 1997.
J. A. Bilmes, "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models", available online.
33
References: Kalman Filter
B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, NJ, 1979.
H. Sorenson, Kalman Filtering: Theory and Application, IEEE Press, 1985.
P. Maybeck, Stochastic Models, Estimation, and Control, Volume 1, Academic Press, 1979.
Web site: http://www.cs.unc.edu/~welch/kalman/
34
References: Hidden Markov Models
L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, 1989.
L. R. Rabiner and B. H. Juang, "An Introduction to Hidden Markov Models", IEEE ASSP Magazine, 1986.
M. I. Jordan and C. M. Bishop, "An Introduction to Graphical Models and Machine Learning" (ask Serge).