Presentation on theme: "BAYESIAN INFERENCE Sampling techniques" — Presentation transcript:

1 BAYESIAN INFERENCE Sampling techniques
Andreas Steingötter

2 Motivation & Background
Exact inference is intractable, so we have to resort to some form of approximation.

3 Motivation & Background
Variational Bayes: a deterministic approximation, not exact even in principle.
Alternative approximation: perform inference by numerical sampling, also known as Monte Carlo techniques.

4 Motivation & Background
The posterior distribution p(z) is required (primarily) for the purpose of evaluating expectations
E[f] = ∫ f(z) p(z) dz
where f(z) are predictions made by the model with parameters z. If p(z) is the parameter prior and f(z) = p(y|z) is the likelihood, then E[f] is the marginal likelihood (evidence) for the model.

5 Motivation & Background
Classical Monte Carlo approximation:
E[f] ≈ (1/L) Σ_{l=1}^{L} f(z^(l))
where the z^(l) are random (not necessarily independent) draws from p(z). The estimator converges to the right answer in the limit of a large number of samples L.
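As a concrete illustration (a minimal R sketch, not from the slides; the target p(z) = N(0, 1) and f(z) = z², whose exact expectation is 1, are assumptions):

```r
# Monte Carlo estimate of E[f(z)] with f(z) = z^2 and p(z) = N(0, 1).
# The exact expectation is Var(z) = 1, so the estimate should approach 1.
set.seed(42)
L <- 10000
z <- rnorm(L)          # L independent draws from p(z)
f_hat <- mean(z^2)     # (1/L) * sum over f(z^(l))
f_hat                  # close to 1 for large L
```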

6 Motivation & Background
Problems:
- How can we obtain independent samples from p(z)?
- The expectation may be dominated by regions of small probability, so large sample sizes will be required to achieve sufficient accuracy.
However, if the z^(l) are independent draws from p(z), then a small number of samples suffices to estimate the expectation.

7 How to do sampling?
- Basic sampling algorithms: restricted mainly to 1-/2-dimensional problems
- Markov chain Monte Carlo: a very general and powerful framework

8 Basic sampling: special cases for a model with a directed graph
Ancestral sampling: easy sampling of the joint distribution by drawing each node in order from its conditional, given its already-sampled parents.
Logic sampling (with observed nodes): compare the sampled value for z_i with the observed value at node i. If they do NOT agree, discard the whole sample and start again with the first node. A sketch of ancestral sampling follows below.
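A minimal R sketch of ancestral sampling (the toy chain z1 → z2 → z3 and its conditional distributions are illustrative assumptions, not taken from the slides):

```r
# Ancestral sampling for a toy directed model z1 -> z2 -> z3:
# draw each node from its conditional given its already-sampled parent.
set.seed(1)
sample_joint <- function() {
  z1 <- rbinom(1, 1, 0.5)                          # root: p(z1)
  z2 <- rbinom(1, 1, if (z1 == 1) 0.8 else 0.2)    # p(z2 | z1)
  z3 <- rnorm(1, mean = 2 * z2)                    # p(z3 | z2)
  c(z1 = z1, z2 = z2, z3 = z3)
}
samples <- t(replicate(1000, sample_joint()))      # 1000 joint samples
head(samples)
```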

9 Random sampling
Computers can generate only pseudorandom numbers. Typical defects:
- correlation of successive values
- lack of uniformity of the distribution
- poor dimensional distribution of the output sequence
- distances between where certain values occur are distributed differently from those in a truly random sequence

10 Random sampling from the Uniform Distribution
Assumption: a good pseudo-random generator for uniformly distributed data is available.
Alternative: "true" random numbers, with randomness coming from atmospheric noise.

11 Random sampling from a standard non-uniform distribution
Goal: sample from a non-uniform distribution p(y) that is a standard distribution, i.e. given in analytical form.
Suppose: we have uniformly distributed random numbers z over (0, 1).
Solution: transform the numbers z using the function that is the inverse of the indefinite integral (the CDF) of the desired distribution.

12 Random sampling from a standard non-uniform distribution
Step 1: calculate the cumulative distribution function h(y) = ∫_{-∞}^{y} p(ŷ) dŷ.
Step 2: transform samples z ~ U(0, 1) by y = h⁻¹(z).
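For instance, a minimal R sketch using the exponential distribution (an assumed example; the slide does not name a distribution): for p(y) = λ exp(−λy), the CDF is h(y) = 1 − exp(−λy), so the inverse transform is y = −ln(1 − z)/λ.

```r
# Inverse-transform sampling from p(y) = lambda * exp(-lambda * y).
# CDF: h(y) = 1 - exp(-lambda * y)  =>  inverse: y = -log(1 - z) / lambda.
set.seed(7)
lambda <- 2
z <- runif(10000)                          # uniform samples on (0, 1)
y <- -log(1 - z) / lambda                  # transformed samples follow Exp(lambda)
c(mean = mean(y), target = 1 / lambda)     # sample mean should be near 1/lambda
```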

13 Rejection sampling
Suppose: direct sampling from p(z) is difficult, but p(z) can be evaluated for any given value of z up to some normalization constant Z_p; that is, p(z) = p̃(z)/Z_p, where Z_p is unknown but p̃(z) can be evaluated.
Approach: define a simple proposal distribution q(z) and a constant k such that k q(z) ≥ p̃(z) for all z.

14 Rejection sampling: simple visual example
The constant k should be as small as possible: the fraction of rejected points depends on the ratio of the area under the unnormalized distribution p̃(z) to the area under the envelope curve k q(z).

15 Rejection sampling: the rejection sampler
Generate two random numbers:
- a number z0 from the proposal distribution q(z)
- a number u0 from the uniform distribution over [0, k q(z0)]
If u0 > p̃(z0), reject the pair. The remaining pairs are uniformly distributed under the curve p̃(z), so the accepted z0 are distributed according to p(z).
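A minimal R sketch of this sampler (the target, an unnormalized Beta(3, 4) density, and the uniform proposal are illustrative assumptions):

```r
# Rejection sampling from an unnormalized target p_tilde(z) = z^2 * (1 - z)^3 on (0, 1)
# (a Beta(3, 4) density up to its normalizing constant), with uniform proposal q(z) = 1.
set.seed(123)
p_tilde <- function(z) z^2 * (1 - z)^3
k <- 0.035                       # k * q(z) >= p_tilde(z) everywhere (max of p_tilde ~ 0.0346)
n <- 20000
z0 <- runif(n)                   # draws from the proposal q(z)
u0 <- runif(n, 0, k)             # uniform on [0, k * q(z0)]; q(z0) = 1 here
accepted <- z0[u0 <= p_tilde(z0)]
c(acceptance_rate = length(accepted) / n,
  sample_mean = mean(accepted),
  beta_mean = 3 / 7)             # Beta(3, 4) has mean 3/7
```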

16 Adaptive rejection sampling
Suppose: it is difficult to determine a suitable analytic form for the proposal distribution q(z).
Approach: construct the envelope function "on the fly" based on observed values of the distribution p(z). If p(z) is log concave (ln p(z) has non-increasing derivatives), use the derivatives to construct the envelope.

17 Adaptive rejection sampling
Step 1: at an initial set of grid points z_1, …, z_M, evaluate ln p(z_i) and its gradient, and compute the tangent lines at each ln p(z_i), i = 1, …, M.
Step 2: sample from the envelope distribution; if the sample is accepted, use it to evaluate p(z), otherwise add it to the grid to refine the envelope. The envelope is a piecewise exponential distribution: each tangent line (slope and offset k) in log space exponentiates to an exponential segment.

18 Adaptive rejection sampling
Problem of rejection sampling: finding a proposal distribution q(z) close to the required distribution, so as to minimize the rejection rate. The rejection rate grows rapidly with dimensionality (curse of dimensionality), so rejection sampling is restricted mainly to univariate distributions. However, it remains useful as a potential subroutine within more general methods.

19 Importance sampling
A framework for approximating expectations E_p[f(z)] directly with respect to p(z); it does NOT provide samples from p(z) itself.
Suppose (again): direct sampling from p(z) is difficult, but p(z) can be evaluated for any given value of z up to some normalization constant Z.

20 Importance sampling
As in rejection sampling, use a proposal distribution q(z) from which it is easy to draw samples:
E[f] = ∫ f(z) (p(z)/q(z)) q(z) dz ≈ (1/L) Σ_{l=1}^{L} (p(z^(l))/q(z^(l))) f(z^(l))
with the samples z^(l) drawn from q(z).

21 Importance sampling
Expectation formula for unnormalized distributions, with importance weights r_l = p̃(z^(l)) / q(z^(l)):
E[f] ≈ Σ_{l=1}^{L} w_l f(z^(l)),  where w_l = r_l / Σ_m r_m
Key points (a sketch follows below):
- The importance weights correct the bias introduced by sampling from the proposal distribution.
- Accuracy depends on how well q(z) approximates p(z) (similar to rejection sampling).
- Choose sample points in input space where f(z) p(z) is large, or at least where p(z) is large.
- If p(z) > 0 in some region, then q(z) > 0 there is necessary.
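A minimal R sketch of self-normalized importance sampling (the standard-normal target, the wider Gaussian proposal, and f(z) = z² are illustrative assumptions):

```r
# Self-normalized importance sampling: estimate E[f(z)] with f(z) = z^2
# under an unnormalized target p_tilde(z) = exp(-z^2 / 2) (standard normal up to Z),
# using a wider Gaussian proposal q(z) = N(0, 2^2). Exact answer: 1.
set.seed(99)
L <- 20000
z <- rnorm(L, mean = 0, sd = 2)     # draws from the proposal q(z)
p_tilde <- exp(-z^2 / 2)            # unnormalized target density
q <- dnorm(z, mean = 0, sd = 2)     # proposal density
r <- p_tilde / q                    # importance weights r_l
w <- r / sum(r)                     # normalized weights w_l
sum(w * z^2)                        # estimate of E[z^2], close to 1
```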

22 Importance sampling: attention
Suppose none of the samples falls in the regions where f(z) p(z) is large. In that case, the apparent variances of r_l and r_l f(z^(l)) may be small even though the estimate of the expectation is severely wrong. Hence a major drawback of importance sampling is its potential to produce results that are arbitrarily in error, with no diagnostic indication.
q(z) should NOT be small where p(z) may be significant!

23 Markov Chain Monte Carlo (MCMC) sampling
MCMC is a general framework for sampling from a large class of distributions that scales well with the dimensionality of the sample space.
Goal: generate samples from a distribution p(z).
Idea: build a machine that uses the current sample to decide which sample to produce next, in such a way that the overall distribution of the samples is p(z).

24 Markov Chain Monte Carlo (MCMC) sampling
Approach:
- Generate a candidate sample z* from a proposal distribution q(z | z^(τ)) that depends on the current state z^(τ) and is sufficiently simple to draw samples from directly.
- The current sample z^(τ) is known (i.e. a record of the current state is maintained); the samples z^(1), z^(2), z^(3), … form a Markov chain.
- Accept or reject the candidate sample z* according to an appropriate criterion.

25 MCMC - Metropolis algorithm
Suppose: p(z) can be evaluated for any given value of z up to some normalization constant Z, i.e. the unnormalized p̃(z) is available.
Algorithm:
Step 1: choose a symmetric proposal distribution, q(z_A | z_B) = q(z_B | z_A).
Step 2: accept the candidate sample z* with probability
A(z*, z^(τ)) = min(1, p̃(z*) / p̃(z^(τ)))

26 MCMC - Metropolis algorithm
Algorithm (cont.):
Step 2.1: choose a random number u with uniform distribution in (0, 1).
Step 2.2: acceptance test: accept z* if u < A(z*, z^(τ)) = min(1, p̃(z*) / p̃(z^(τ))).
Step 3: if accepted, update the state, z^(τ+1) = z*; otherwise keep the old state, z^(τ+1) = z^(τ).

27 Metropolis algorithm
Notes:
- Rejection of a point leads to the previous sample being counted again (different from rejection sampling).
- If q(z_A | z_B) > 0 for all values z_A, z_B, then the distribution of z^(τ) tends to p(z) as τ → ∞.
- z^(1), z^(2), z^(3), … are not independent samples from p(z) (serial correlation). To obtain approximately independent samples, retain only every Mth sample.

28 Examples: Metropolis algorithm
Implementation in R: elliptical (correlated 2D Gaussian) distribution.
(Figure: flowchart of the sampler: test u < p̃(z*)/p̃(z^(τ)); if true, update the state, z^(τ+1) = z*, else keep the old state, z^(τ+1) = z^(τ).)
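A minimal R sketch of such a sampler (the slides' actual code is not reproduced here; the correlated 2D Gaussian target, the starting point z0 = (−2, 2), and the step size are assumptions mirroring the settings on the following slides):

```r
# Random-walk Metropolis for a 2D correlated ("elliptical") Gaussian target.
# log_p_tilde is the unnormalized log-density; the proposal is a symmetric Gaussian step.
set.seed(2024)
Sigma_inv <- solve(matrix(c(1, 0.9, 0.9, 1), 2, 2))   # inverse covariance, correlation 0.9
log_p_tilde <- function(z) -0.5 * sum(z * (Sigma_inv %*% z))

metropolis <- function(n, step = 0.3, z0 = c(-2, 2)) {
  samples <- matrix(NA_real_, n, 2)
  z <- z0
  for (tau in 1:n) {
    z_star <- z + rnorm(2, sd = step)                 # symmetric proposal q(z* | z)
    if (log(runif(1)) < log_p_tilde(z_star) - log_p_tilde(z)) {
      z <- z_star                                     # accept: update state
    }                                                 # else: keep old state
    samples[tau, ] <- z
  }
  samples
}

out <- metropolis(15000)
cor(out[, 1], out[, 2])   # should approach the target correlation (0.9 here)
```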

29 Examples: Metropolis algorithm
Implementation in R: initialization in [-2, 2], step size = 0.3. (Figures: sampled chains after n = 1500 and n = 15000 iterations.)

30 Examples: Metropolis algorithm
Implementation in R: initialization in [-2, 2], step size = 0.5. (Figures: sampled chains after n = 1500 and n = 15000 iterations.)

31 Examples: Metropolis algorithm
Implementation in R: initialization in [-2, 2], step size = 1. (Figures: sampled chains after n = 1500 and n = 15000 iterations.)

32 Validation of MCMC
Properties of Markov chains: transition probabilities T_m(z', z) = p(z^(m+1) = z | z^(m) = z') specify the chain z^(1) → z^(2) → … → z^(m) → z^(m+1).
Homogeneous: T_m is the same for all m.

33 Validation of MCMC
Properties of Markov chains (cont.): a distribution p*(z) is invariant (stationary) with respect to the chain if
p*(z) = Σ_{z'} T(z', z) p*(z')
A sufficient condition is detailed balance,
p*(z) T(z, z') = p*(z') T(z', z)
and a chain whose transitions satisfy detailed balance is called reversible. (Summing detailed balance over z' gives Σ_{z'} p*(z') T(z', z) = p*(z) Σ_{z'} T(z, z') = p*(z), i.e. invariance.)

34 Validation of MCMC: ergodicity
Goal: an invariant Markov chain that converges to the desired distribution p*(z).
Ergodicity: p*(z) = lim_{m→∞} p(z^(m)) for ANY initial distribution p(z^(0)). An ergodic Markov chain has only one equilibrium distribution, the invariant p*(z).

35 Properties and validation of MCMC
Approach: construct appropriate transition probabilities T(z', z) from a set of base transitions B_k, either in mixture form,
T(z', z) = Σ_k α_k B_k(z', z)
with mixing coefficients α_k ≥ 0, Σ_k α_k = 1, or through successive application,
T(z', z) = Σ_{z_1} … Σ_{z_{K−1}} B_1(z', z_1) … B_K(z_{K−1}, z)

36 Metropolis-Hastings algorithm
Generalization of the Metropolis algorithm: no symmetric proposal distribution q is required. A candidate z* drawn from q_k(z | z^(τ)) is accepted with probability
A_k(z*, z^(τ)) = min(1, [p̃(z*) q_k(z^(τ) | z*)] / [p̃(z^(τ)) q_k(z* | z^(τ))])
The choice of proposal distribution is critical. If q is symmetric, this reduces to the Metropolis criterion.

37 Metropolis-Hastings algorithm
Common choice: a Gaussian proposal centered on the current state.
- Small variance: high acceptance rate, but a slow random walk through the state space and strongly dependent samples.
- Large variance: high rejection rate.

38 Gibbs sampling
A special case of the Metropolis-Hastings algorithm in which the candidate value is always accepted.
Suppose: target p(z1, z2, z3).
Step 1: choose initial values z1^(0), z2^(0), z3^(0).
Step 2 (repeated): draw
z1^(τ+1) ~ p(z1 | z2^(τ), z3^(τ))
z2^(τ+1) ~ p(z2 | z1^(τ+1), z3^(τ))
z3^(τ+1) ~ p(z3 | z1^(τ+1), z2^(τ+1))
Repeat by cycling through the variables, or randomly choose the variable to be updated at each step. A sketch follows below.
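A minimal R sketch of this scheme (the bivariate Gaussian target with correlation ρ is an assumption, chosen because its full conditionals are known in closed form):

```r
# Gibbs sampling from a bivariate normal with correlation rho:
# the full conditionals are z1 | z2 ~ N(rho * z2, 1 - rho^2), and symmetrically for z2.
set.seed(11)
rho <- 0.8
n <- 10000
z <- matrix(NA_real_, n, 2)
z1 <- 0; z2 <- 0                                  # Step 1: initial values
for (tau in 1:n) {                                # Step 2: cycle through the variables
  z1 <- rnorm(1, rho * z2, sqrt(1 - rho^2))       # z1 ~ p(z1 | z2)
  z2 <- rnorm(1, rho * z1, sqrt(1 - rho^2))       # z2 ~ p(z2 | z1)
  z[tau, ] <- c(z1, z2)
}
cor(z[, 1], z[, 2])                               # should be close to rho
```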

39 Gibbs sampling
Why p(z) is invariant:
- The marginal p(z_\i) of the remaining variables is invariant (unchanged), because z_\i is fixed at each step.
- The univariate conditional p(z_i | z_\i) is invariant by definition, since each step samples from it exactly.
- Hence the joint distribution p(z) is invariant.

40 Gibbs sampling
Sufficient condition for ergodicity: none of the conditional distributions is anywhere zero, i.e. any point in z space can be reached from any other point in a finite number of steps.
(Figure: Gibbs moves alternating along the coordinate axes through states z^(1), z^(2), z^(3).)

41 Gibbs sampling
Obtaining m independent samples:
- Run the chain through a "burn-in" period to remove dependence on the initial values.
- Then sample at set intervals (e.g. keep every Mth sample).
The Gibbs sequence converges to a stationary (equilibrium) distribution that is independent of the starting values, and by construction this stationary distribution is the target distribution we are trying to simulate. (See the usage sketch below.)
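A short usage sketch (the burn-in length and thinning interval M are illustrative; `z` is assumed to be the n × 2 output matrix from the Gibbs sketch above):

```r
# Discard a burn-in period, then keep every Mth sample to reduce serial correlation.
burn_in <- 1000
M <- 10
z_kept <- z[seq(burn_in + 1, nrow(z), by = M), ]   # approximately independent samples
nrow(z_kept)
```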

42 Gibbs sampling
Practicability depends on the feasibility of drawing samples from the conditional distributions p(z_i | z_\i). Models specified by directed graphs often lead to conditional distributions for Gibbs sampling that are log concave, so adaptive rejection sampling methods can be used to sample from them.

