An Introduction to Kalman Filtering by Arthur Pece
Generative model for a generic signal
Basic concepts in tracking/filtering State variables x; observation y: both are vectors Discrete time: x(t), y(t), x(t+1), y(t+1) Probability P pdf [density] p(v) of vector variable v : p(v*) = lim P(v* < v < v*+dv) / dv dv->0.
Basic concepts: Gaussian pdf A Gaussian pdf is completely characterized by 2 parameters: its mean vector its covariance matrix
Basic concepts: prior and likelihood Prior pdf of variable v: in tracking, this is usually the probability conditional on the previous estimate: p[ v(t) | v(t-1) ] Likelihood: pdf of the observation, given the state variables: p[ y(t) | x(t) ]
Basic concepts: Bayes’ theorem Posterior pdf is proportional to prior pdf times likelihood: p[ x(t) | x(t-1), y(t) ] = p[ x(t) | x(t-1) ] p[ y(t) | x(t) ] / Z where Z = p[ y(t) ]
Basic concepts: recursive Bayesian estimation Posterior pdf given the set y(1:t) of all observations up to time t: p[ x(t) | y(1:t) ] = p[ y(t) | x(t) ]. p[ x(t) | x(t-1) ]. p[ x(t-1) | y(1:t-1) ] / Z 1
Basic concepts: recursive Bayesian estimation p[ x(t) | y(1:t) ] = p[ y(t) | x(t) ]. p[ x(t) | x(t-1) ]. p[ y(t-1) | x(t-1) ]. p[ x(t-1) | x(t-2) ]. p[ x(t-2) | y(1:t-2) ] / Z 2
Basic concepts: recursive Bayesian estimation p[ x(t) | y(1:t) ] = p[ y(t) | x(t) ]. p[ x(t) | x(t-1) ]. p[ y(t-1) | x(t-1) ]. p[ x(t-1) | x(t-2) ]. p[ y(t-2) | x(t-2) ]. p[ x(t-2) | x(t-3) ]. … / Z*
Kalman model
Kalman model in words Dynamical model: the current state x(t) is a linear (vector) function of the previous state x(t-1) plus additive Gaussian noise Observation model: the observation y(t) is a linear (vector) function of the state x(t) plus additive Gaussian noise
Problems in visual tracking Dynamics is nonlinear, non-Gaussian Pose and shape are nonlinear, non-Gaussian functions of the system state Most important: what is observed is not image coordinates, but pixel grey-level values: a nonlinear function of object shape and pose, with non-additive, non-Gaussian noise
More detailed model
Back to Kalman A Gaussian pdf, propagated through a linear system, remains Gaussian If Gaussian noise is added to a variable with Gaussian pdf, the resulting pdf is still Gaussian (sum of covariances) ---> The predicted state pdf is Gaussian if the previous state pdf was Gaussian ---> The observation pdf is Gaussian if the state pdf is Gaussian
Kalman dynamics
Kalman observation
Kalman posterior pdf The product of 2 Gaussian densities is still Gaussian (sum of inverse covariances) ---> the posterior pdf of the state is Gaussian if prior pdf and likelihood are Gaussian
Kalman filter Operates in two steps: prediction and update Prediction: propagate mean and covariance of the state through the dynamical model Update: combine prediction and innovation (defined below) to obtain the state estimate with maximum posterior pdf
Note on the symbols From now on, the symbol x represents no longer the ”real” state (which we cannot know) but the mean of the posterior Gaussian pdf The symbol A represents the covariance of the posterior Gaussian pdf x and A represent mean and covariance of the prior Gaussian pdf
Kalman prediction Prior mean: previous mean vector times dynamical matrix: x(t) = D x(t-1) Prior covariance matrix: previous covariance matrix pre- AND post- multiplied by dynamical matrix PLUS noise covariance: A(t) = D T A(t-1) D + N
Kalman update In the update step, we must reason backwards, from effect (observation) to cause (state): we must ”invert” the generative process. Hence the update is more complicated than the prediction.
Kalman update (continued) Basic scheme: Predict the observation from the current state estimate Take the difference between predicted and actual observation (innovation) Project the innovation back to update the state
Kalman innovation Observation matrix F The innovation v is given by: v = y - F x Observation-noise covariance R The innovation has covariance W: W = F T A F + R
Kalman update: state mean vector Posterior mean vector: add weighted innovation to predicted mean vector weigh the innovation by the relative covariances of state and innovation: larger covariance of the innovation --> larger uncertainty of the innovation --> smaller weight of the innovation
Kalman gain Predicted state covariance A Innovation covariance W Observation matrix F Kalman gain K = A F T W -1 Posterior state mean: s = s + K v
Kalman update: state covariance matrix Posterior covariance matrix: subtract weighted covariance of the innovation weigh the covariance of the innovation by the Kalman gain: A = A - K T W K Why subtract? Look carefully at the equation: larger innovation covariance --> smaller Kalman gain K --> smaller amount subtracted!
Kalman update: state covariance matrix (continued) Another equivalent formulation requires matrix inversion (sum of inverse covariances) Advanced note: The equations given here are for the usual covariance form of the Kalman filter It is possible to work with inverse covariance matrices all the time (in prediction and update): this is called the information form of the Kalman filter
Summary of Kalman equations Prediction : x(t) = D x(t-1) A(t) = D T A(t-1) D + N Update: innovation: v = y - F x innov. cov: W = F T A F + R Kalman gain: K = A F T W -1 posterior mean: s = s + K v posterior cov: A = A - K T W K
Kalman equations with control input u Prediction : x(t) = D x(t-1) + C u(t-1) A(t) = D T A(t-1) D + N Update: innovation: v = y - F x innov. cov: W = F T A F + R Kalman gain: K = A F T W -1 posterior mean: s = s + K v posterior cov: A = A - K T W K