MCMC: Particle Theory By Marc Sobel

Particle Theory: Can we understand it?

The Dynamic Model Setup: Heavy Theory

Particle Filters: Lighter Theory  More familiar dynamic model setting: a hidden signal process X_t evolving as a Markov chain, observed only through noisy measurements Y_t.  Even more familiar: X_t = g(X_{t-1}) + V_t and Y_t = h(X_t) + W_t (for unknown g, h).

Particle Filter: Goal  The goal in particle filter theory is to simulate (all or part of) the unobserved posterior distribution of the signal process {X_t : t=1,…} = X_{1:t}, i.e., the law of (X_{1:t} | Y_{1:t}), as well as additional parameters if necessary. Specifically, we would like to simulate the signal process up to time t, i.e., from the posterior distribution of X_{1:t}. The particles referred to above take the form X_{1:t}^{(1)},…,X_{1:t}^{(k)}.  These particles are designed to approximate the posterior distribution (X_{1:t} | Y_{1:t}) through the introduction of appropriate weights (see next slide).

Particle Filter: Weights  The (normalized) weights W^{(1)},…,W^{(k)} attached to the particles are designed so that the posterior probability at a particle X^{(i)} is given by the weight W^{(i)} (i=1,…,k). Thus, for example, a posterior expectation E[h(X_{1:t}) | Y_{1:t}] is approximated by the weighted average Σ_i W^{(i)} h(X^{(i)}).  Since weights are always assumed to be normalized, we need only specify them up to a constant of proportionality.
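A minimal Matlab sketch of this weighted-average approximation (the particle values and weights below are random placeholders, not taken from the slides):

% Sketch: approximating posterior quantities from weighted particles.
k = 1000;
Xpart = randn(1,k);                   % particle values (placeholders)
W = exp(-0.5*(Xpart - 1).^2);         % unnormalized weights (placeholder likelihood)
WW = W / sum(W);                      % normalized weights W^(1),...,W^(k)
postMean = sum(WW .* Xpart);          % E[X | Y] approximated by a weighted average
postProbPos = sum(WW(Xpart > 0));     % P(X > 0 | Y) approximated by summing weights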

Convergence  The goal of convergence is that the estimated posterior distribution of x n ’s given observations Y 1:n  corresponds to what it should be. Typically the posterior distribution is defined via weighting w t [k].  This induces, using ‘K’ as a kernel, the density,

convergence  We’d like the resulting measures to converge,

Particle Filters: Bootstrap  Assume that, given the signal, the observations are conditionally independent of one another. Then we can build particle filters sequentially by defining the weights through the incremental factor p(Y_t | X_t) p(X_t | X_{t-1}) / q(X_t | X_{1:t-1}, Y_{1:t}), for particles formed by appending a draw X_t from the proposal q to the existing path X_{1:t-1}.  But, by normalization, the constant term p(Y_t | Y_{1:t-1}) drops out. So we can define the weights on the new X_t's, up to proportionality, as the previous weight times this incremental factor.

Particles PSLAM (continued)  To get the bootstrap particle filter, we take the proposal to be the prior, i.e., we simulate the new X's from the prior distribution of X_t. This leaves us to define the weights by the likelihood p(Y_t | X_t), for X_t's selected from the prior. We might then cull the particles, choosing only those with reasonably large weights.
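A minimal Matlab sketch of one bootstrap step, assuming the state-space model given in the comment header of the Matlab code at the end of the deck (X_t = 0.5 X_{t-1} + 3 V_t, Y_t = 2 X_t + 2 W_t); the previous particles and the observation yt are placeholders:

% Sketch: one bootstrap particle filter step (propagate from the prior,
% weight by the likelihood, resample so low-weight particles tend to die).
k = 1000;
Xprev = randn(1,k);                        % particles at time t-1 (placeholders)
yt = 1.7;                                  % current observation (placeholder)
Xnew = 0.5*Xprev + 3*randn(1,k);           % extend each particle using the prior
W = normpdf(yt, 2*Xnew, 2*ones(1,k));      % weight = likelihood p(Y_t | X_t)
WW = W / sum(W);                           % normalize
Xres = randsample(Xnew, k, true, WW);      % cull/resample: weights reset to 1/k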

Particle Selection: Elementary  Shephard and Pitt (in 1994) proposed dividing the particles into homogeneous groups (e.g., based on the values of functions of the particles) and resampling from each group separately. This has the advantage that we aren't working with collections of particles that mix apples and oranges.

A Probabilistic Approach  The following algorithms take a probabilistic approach.

Full vs. Online SLAM: Full SLAM is an example of a particle filter which is not a bootstrap filter.  Full SLAM calculates the robot state over all times up to time t.  Online SLAM calculates the robot state only for the current time t.

Weights  We’ve already said that weights reflect the posterior probability of a particle. In Sequential MCMC we construct these weights sequentially. In bootstrap particle analysis we simulate using the prior, but sometimes this doesn’t make sense. (see examples later). One mechanism is to use another distribution (i.e., less prone to outliers) to generate particle extensions.

Weights (continued)  Liu (2001) recommends using a trial (proposal) distribution q to reconstruct the current posterior: use q to simulate the new X's. The weights then take the form w_t ∝ w_{t-1} p(X_t | X_{t-1}) p(Y_t | X_t) / q(X_t | X_{1:t-1}, Y_{1:t}).
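A minimal Matlab sketch of this weight update with a generic proposal q, here taken (purely for illustration) to be a normal centered at 0.5 times the previous particle with an inflated spread; the model coefficients match the code at the end of the deck, and all variable names are placeholders:

% Sketch: sequential importance weights with a proposal q instead of the prior,
% w_t proportional to w_{t-1} * p(X_t|X_{t-1}) * p(Y_t|X_t) / q(X_t|X_{t-1},Y_t).
k = 1000;
Xprev = randn(1,k);  Wprev = ones(1,k)/k;         % previous particles and weights (placeholders)
yt = 1.7;                                         % current observation (placeholder)
qsd = 5;                                          % proposal spread (assumed, wider than the prior)
Xq = 0.5*Xprev + qsd*randn(1,k);                  % draw extensions from q
prior = normpdf(Xq, 0.5*Xprev, 3*ones(1,k));      % p(X_t | X_{t-1})
lik   = normpdf(yt, 2*Xq, 2*ones(1,k));           % p(Y_t | X_t)
qdens = normpdf(Xq, 0.5*Xprev, qsd*ones(1,k));    % q(X_t | X_{t-1}, Y_t)
Wq = Wprev .* prior .* lik ./ qdens;              % unnormalized updated weights
Wq = Wq / sum(Wq);                                % normalize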

Kernel Density Estimate-based Particle Filters  Use a kernel (Gaussian or otherwise) as the proposal density q. This puts likelihood weights on the points selected by the prior.

Doucet Particle Filters  Doucet recommends using the kernel K as a proposal density. His weighting then divides the prior-times-likelihood term by the kernel proposal density.

Effective Sample Size  The effective sample size of a weighted particle set is the effective number of distinct particles it represents.  When it gets too small, we impose a threshold; a particle survives if:  A) its weight is above the threshold, or  B) it survives with probability (w/thresh).  All rejected particles are restarted from time t=0.
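The standard formula for the effective sample size of normalized weights is ESS = 1/Σ_i W_i² (this formula is not on the slide but is the usual definition). A minimal Matlab sketch of the survival rule above, with placeholder weights and illustrative cutoffs:

% Sketch: effective sample size and the survival rule above.
k = 1000;
W = rand(1,k);  WW = W/sum(W);            % placeholder normalized weights
Xres = randn(1,k);                        % placeholder particle values
ESS = 1 / sum(WW.^2);                     % ESS: k for uniform weights, near 1 if one weight dominates
thresh = median(WW);                      % survival threshold (assumed choice)
if ESS < k/2                              % cull only when ESS is too small (k/2 is illustrative)
    keep = (WW >= thresh) | (rand(1,k) < WW/thresh);   % keep surely if above, else with prob w/thresh
    Xkept = Xres(keep);
end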

Culling or Sampling Particles  We can see that for non-bootstrap particle filters, particles tend to multiply beyond all restriction if no culling is used. Culling (or resampling) removes some (unlikely) particles so they don't multiply too rapidly.  A) Residual resampling: at step t-1, extend the particles and recompute the weights w^{(j)} for the new particles; retain m^{(j)} = floor(k w^{(j)}) copies of particle j, and draw the remaining m_0 = k − m^{(1)} − … − m^{(k)} particles from the particle stream using the residual weights. Residual resampling has the effect of killing particles with small weights and emphasizing particles with large weights.
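A minimal Matlab sketch of residual resampling as just described (particle values and weights are placeholders):

% Sketch: residual resampling of k particles Xnew with normalized weights WW.
k = 1000;
Xnew = randn(1,k);  W = rand(1,k);  WW = W/sum(W);      % placeholders
m = floor(k * WW);                          % guaranteed copies of each particle
idx = repelem(1:k, m);                      % deterministic part
m0 = k - sum(m);                            % remaining slots
resid = k*WW - m;  resid = resid/sum(resid);            % residual weights
extra = randsample(1:k, m0, true, resid);   % random part drawn from the residuals
idx = [idx(:).', extra(:).'];
Xres = Xnew(idx);                           % resampled stream; weights reset to 1/k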

Cull Particles (continued)  Thus, a particle with weight .01 in a stream of size 50 is killed. Project: should you first 'extend' particles and then resample, or vice versa?  B) Simple resampling: extend particles as above, then sample particles from the full stream according to the new weights.  This has the effect of 'keeping' some particles with low weights.  Thus, a particle with weight .01 in a stream of size 50 has a .01 chance of being selected (whereas under residual resampling it has a chance of 0).

Resampling (continued)  C) General resampling: define 'rescaled' probability weights a^{(1)},…,a^{(k)} (usually related to the weights w_t). Choose particles based on these weights and assign the new weights w_t^{(*,j)} = w_t^{(j)} / a^{(j)} to the corresponding particles; then renormalize the new weights.  D) Effective sample size together with general sampling: sample until the effective sample size falls below a threshold. Accept particles whose weights are bigger than an appropriate weight threshold c (e.g., the weight median); accept particles whose weights w are below the threshold c with probability (w/c), and otherwise reject them.

General Resampling: We can generalize residual sampling by choosing only large weights to resample from, and then resampling with scaled weights. Suppose some particles have different numbers of continuations than others. In this case we might want to select them with weighted probabilities that take this into account, i.e., make the a's larger for particles with more continuations and smaller for those with fewer. Once these continuations have been implemented, we reweight by w* = w/a.

The Central Problem with Particle Filters  At each time stage t we are trying to reconstruct the posterior distribution, and we build future particles on this estimate. Mistakes at each stage amplify into mistakes later on. The posterior reconstruction is typically done using weights. For this reason, there are many algorithms offering ways to improve the posterior reconstruction:  A) Use kernel density estimators to reconstruct the posterior.  B) Use Metropolis-Hastings to reconstruct the posterior.  C) Divide particles into separate streams, reconstructing the posterior for each one separately.

Pitt-Shephard Particle Theory  Suppose the new X's which are sampled a posteriori are heterogeneous, i.e., they can be divided into homogeneous streams. Call the streams s=1,…,k. Define, for each stream s, with Z_i = s denoting that X_i is in the s-th stream, the total weight of the particles in that stream.  We then divide sampling into two parts. First we sample a stream using these aggregated stream weights.

Pitt-Shephard (continued)  Then, within the chosen stream, we sample particles using their within-stream weights.  Suppose we have very heterogeneous particles and there is a way to divide them into homogeneous streams; then this approach makes sense  (e.g., mixture Kalman filter type examples).
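A minimal Matlab sketch of the two-stage idea: pick a stream with probability proportional to its total weight, then pick a particle within the chosen stream by its within-stream weight. The stream labels Z and all other names are placeholders:

% Sketch: two-stage (stream-wise) sampling of a single particle.
k = 1000;  S = 3;
W = rand(1,k);  WW = W/sum(W);               % placeholder normalized weights
Z = randi(S, 1, k);                          % placeholder stream labels Z_i in 1..S
streamW = accumarray(Z', WW')';              % total weight of each stream
s = randsample(1:S, 1, true, streamW);       % stage 1: sample a stream
inS = find(Z == s);
ws = WW(inS) / sum(WW(inS));                 % within-stream weights
pick = inS(randsample(numel(inS), 1, true, ws));   % stage 2: sample a particle in stream s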

Example of a Linear Dynamic Model  In what follows we consider the following linear dynamic model (the one stated in the comment header of the Matlab code at the end of the deck): X_t = 0.5 X_{t-1} + 3 V_t and Y_t = 2 X_t + 2 W_t, with V_t and W_t independent standard normals.  We compute the filtering (posterior) distribution of X_t given Y_{1:t}.  The Kalman filter gives us an exact solution for this posterior distribution, which we can check against the bootstrap and other filters.
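For reference, a minimal Matlab sketch of the exact scalar Kalman recursion for a model of this form; the coefficients follow the comment header of the code at the end of the deck, while the observations y and the prior are placeholders:

% Sketch: exact scalar Kalman filter for x_t = a*x_{t-1} + sq*v_t, y_t = c*x_t + sr*w_t.
a = 0.5;  c = 2;  sq = 3;  sr = 2;        % coefficients from the code's comment header
T = 100;  y = randn(1,T);                 % placeholder observations
m = 0;  P = 10;                           % prior mean and variance at time 0 (assumed)
mFilt = zeros(1,T);
for t = 1:T
    mp = a*m;              Pp = a^2*P + sq^2;    % predict
    K  = Pp*c / (c^2*Pp + sr^2);                 % Kalman gain
    m  = mp + K*(y(t) - c*mp);                   % update posterior mean
    P  = (1 - K*c)*Pp;                           % update posterior variance
    mFilt(t) = m;                                % exact filtering mean at time t
end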

Comparing bootstrap particles with the real thing: computing Λ_t over 500 time periods.

Histogram of the difference between bootstrap particle and real parameters

A More Complicated Example of a Dynamic Model  In what follows we consider the following quadratic dynamic models, with quadratic coefficients κ_1 = .005 and κ_2 = .01 (as in the figures that follow).  The Kalman filter again gives us stepwise solutions to the posterior distribution. We can check these against the bootstrap and other filters.

Quadratic model for κ_1 = .005: bootstrap versus real parameters.

Absolute difference between the bootstrap particle filter and the real parameter for κ_1 = .005.

Switch to Harder Model  For the second quadratic model, when κ_2 = .01, the bootstrap particle filter breaks down entirely before time t=50.  We switch to residual resampling.

Residual resampling versus real parameters when κ_2 = .01.

Histogram: note how much larger the differences are in this case.

Switch to General Sampling  Now we use residual sampling as long as the effective sample size stays above 500.  When it falls below 500, we use only those particles:  A) with weights above .002, or  B) accepted with probability (w/.002).

General sampling versus real parameters for the quadratic model: coefficient = .01.

Histogram of the differences between general sampling and real parameters: note how small the differences are.

Mixture Kalman Filters: Tracking Models  Define models which have a switching mechanism P(KF_i | KF_j) in going from Kalman filter KF_j to KF_i (i, j = 1,…,d). Updates use the standard Kalman filter for a given model KF_j and then, with probability P(KF_i | KF_j), switch to filter KF_i.
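A minimal Matlab sketch of the switching mechanism alone: each particle carries a filter index that jumps from KF_j to KF_i with probability P(KF_i | KF_j). The transition matrix and the number of filters are made up for illustration:

% Sketch: Markov switching among d candidate Kalman filters for k particles.
d = 3;  k = 1000;
Pswitch = [0.90 0.05 0.05;                 % P(KF_i | KF_j): row j gives switch probabilities from KF_j
           0.05 0.90 0.05;
           0.05 0.05 0.90];
model = randi(d, 1, k);                    % current filter index carried by each particle
for j = 1:k
    model(j) = randsample(1:d, 1, true, Pswitch(model(j), :));   % switch according to its row
end
% Each particle would then be updated with the standard Kalman recursion of its chosen filter.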

Kalman Filters  We have the updating formula:  Which can generate particles consisting of Kalman Filters.  We can use Pitt-Shepard to sample these models. In effect, we divide the Kalman Filters into homogeneous groups, and then sample them as separate streams.

MKFs  Mixture Kalman filters are useful for tracking. The Kalman filters represent different objects at a particular location. The probabilities represent the chance that one object rather than another is actually visible, subject to the usual occlusion. The true posterior distribution for MKFs is easy to calculate.

Matlab Code: Part I
% Particle Model
% X[t]=.5X[t-1]+3*V[t]
% Y[t]=2*X[t]+2*W[t]
XP = zeros(100,1000);                                        % draws from the exact posterior (filled below)
X(1,1:1000) = 5 + sqrt(3)*normrnd(0,1,1,1000);               % initial particles
Y(1,1:1000) = 2*X(1,1:1000) + sqrt(2)*normrnd(0,1,1,1000);   % simulated observations at t=1
W = normpdf(Y(1,:), 2*X(1,:), ones(1,1000));                 % likelihood weights
WW = W/sum(W);                                               % normalize the weights
XBP(1,:) = randsample(X(1,:), 1000, true, WW);               % bootstrap resample
mm = ((2*Y(1,:)/4) + (5/9)) / (1 + (1/9));                   % posterior mean under a N(5,9) prior, obs variance 4
sg = sqrt(1/(1 + (1/9)));                                    % posterior standard deviation
XP(1,:) = normrnd(mm, sg*ones(1,1000));                      % draws from the exact posterior

Matlab Code: Part II
for tim = 2:100
    for jj = 1:1000
        X(tim,jj) = .5*X(tim-1,jj) + 5 + 3*randn;            % propagate each particle through the state equation
        Y(tim,jj) = 2*X(tim,jj) + 2*randn;                   % simulate the corresponding observation
    end
    W = normpdf(Y(tim,:), 2*X(tim,:), ones(1,1000));         % likelihood weights
    WW = W/sum(W);                                           % normalize the weights
    XBP(tim,:) = randsample(X(tim,:), 1000, true, WW);       % bootstrap resample
    mm = ((2*Y(tim,:)/4) + (5/9) + .5*X(tim,:)/9) / (1 + (1/9));   % posterior mean under a N(.5*X+5, 9) prior, obs variance 4
    sg = sqrt(1/(1 + (1/9)));                                % posterior standard deviation
    XP(tim,:) = normrnd(mm, sg*ones(1,1000));                % draws from the exact posterior
end