BAYESIAN INFERENCE Sampling techniques
Andreas Steingötter
Motivation & Background
Exact inference is often intractable, so we have to resort to some form of approximation.
Motivation & Background
Variational Bayes is a deterministic approximation, not exact even in principle. Alternative approximation: perform inference by numerical sampling, also known as Monte Carlo techniques.
Motivation & Background
The posterior distribution p(z) is required (primarily) for the purpose of evaluating expectations
E[f] = ∫ f(z) p(z) dz.
Examples: f(z) are predictions made by the model with parameters z; or, if p(z) is the parameter prior and f(z) = p(y|z) is the likelihood, the expectation evaluates the marginal likelihood (evidence) for the model.
Motivation & Background
Classical Monte Carlo approximation:
E[f] ≈ (1/L) Σ_{l=1}^{L} f(z^(l)),
where the z^(l) are random (not necessarily independent) draws from p(z). The estimate converges to the right answer in the limit of a large number of samples L.
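The estimator above can be sketched in a few lines of code (Python here, although the examples later in the slides use R; the standard-normal target and the test function f(z) = z² are our own choices for the demo):

```python
import random

def mc_expectation(f, sampler, L):
    """Classical Monte Carlo: E[f] ~= (1/L) * sum of f(z_l) over draws z_l ~ p(z)."""
    return sum(f(sampler()) for _ in range(L)) / L

random.seed(0)
# Example: p(z) is the standard normal and f(z) = z^2, so E[f] = Var(z) = 1.
estimate = mc_expectation(lambda z: z * z, lambda: random.gauss(0.0, 1.0), 100_000)
print(estimate)  # close to 1 for large L
```

With independent draws, the error shrinks like 1/sqrt(L) regardless of the dimensionality of z.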
Motivation & Background
Problems:
How to obtain independent samples from p(z)?
The expectation may be dominated by regions of small probability, so large sample sizes may be required to achieve sufficient accuracy.
Note that the plain Monte Carlo estimate weights all draws z^(l) equally; but if the z^(l) are independent draws from p(z), then a small number of samples suffices to estimate the expectation.
How to do sampling?
Basic sampling algorithms: restricted mainly to 1-/2-dimensional problems.
Markov chain Monte Carlo: a very general and powerful framework.
Basic sampling: special cases for a model with a directed graph.
Ancestral sampling: easy sampling of the joint distribution by drawing each node in turn, in parent-first order, conditioned on the already-sampled values of its parents.
Logic sampling (with observed nodes): compare the sampled value for z_i with the observed value at node i; if they do NOT agree, discard all previous samples and start again with the first node.
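Ancestral and logic sampling can be illustrated on a toy two-node directed graph (a Python sketch; the network structure and its probabilities are invented for the illustration, and logic sampling is implemented as discarding whole joint samples that disagree with the observation):

```python
import random

random.seed(1)

def ancestral_sample():
    """Sample the joint p(z1, z2) = p(z1) p(z2 | z1) by visiting nodes parent-first."""
    z1 = random.random() < 0.3                   # root node: p(z1 = True) = 0.3
    z2 = random.random() < (0.9 if z1 else 0.2)  # child node: p(z2 = True | z1)
    return z1, z2

# Logic sampling: keep only joint samples that agree with the observed value of z2.
observed_z2 = True
accepted = [s for s in (ancestral_sample() for _ in range(50_000)) if s[1] == observed_z2]
# The fraction of accepted samples with z1 = True approximates p(z1 | z2 = True).
post = sum(z1 for z1, _ in accepted) / len(accepted)
print(post)
```

For these numbers the exact posterior is 0.3·0.9 / (0.3·0.9 + 0.7·0.2) ≈ 0.66; the discard step also shows why logic sampling is wasteful when the evidence is unlikely.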
Random sampling: computers can generate only pseudorandom numbers, which may exhibit:
Correlation of successive values.
Lack of uniformity of the distribution.
Poor dimensional distribution of the output sequence.
Distances between the occurrences of certain values distributed differently from those in a truly random sequence.
Random sampling from the Uniform Distribution
Assumption: good pseudo-random generator for uniformly distributed data is implemented Alternative: “true” random numbers with randomness coming from atmospheric noise
Random sampling from a standard non-uniform distribution
Goal: sample from a non-uniform distribution p(y) which is a standard distribution, i.e. given in analytical form.
Suppose: we have uniformly distributed random numbers from (0,1).
Solution: transform the random numbers z over (0,1) using the function which is the inverse of the indefinite integral of the desired distribution.
Random sampling from a standard non-uniform distribution
Step 1: calculate the cumulative distribution function h(y) = ∫_{-∞}^{y} p(ŷ) dŷ.
Step 2: transform samples z ~ U(0,1) by y = h^{-1}(z).
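The two steps can be illustrated with the exponential distribution, whose CDF h(y) = 1 − exp(−λy) inverts in closed form (a Python sketch; the rate λ = 2 is an arbitrary choice):

```python
import math
import random

random.seed(2)
lam = 2.0  # rate of the exponential distribution p(y) = lam * exp(-lam * y), y >= 0

def sample_exponential(lam):
    z = random.random()               # Step 2 input: z ~ U(0, 1)
    return -math.log(1.0 - z) / lam   # y = h^{-1}(z), with h(y) = 1 - exp(-lam * y)

samples = [sample_exponential(lam) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(mean)  # the exponential mean is 1/lam = 0.5
```

The method works whenever h can be both computed and inverted, which is what restricts it to standard analytical distributions.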
Rejection sampling
Suppose: direct sampling from p(z) is difficult, but p(z) can be evaluated for any given value of z up to some normalization constant Z_p, i.e. p(z) = p̃(z)/Z_p, where Z_p is unknown and p̃(z) can be evaluated.
Approach: define a simple proposal distribution q(z) such that k q(z) ≥ p̃(z) for all z.
Rejection sampling: simple visual example (figure: the unnormalized distribution p̃(z) lying under the envelope k q(z)).
The constant k should be as small as possible: the fraction of rejected points depends on the ratio of the area under the unnormalized distribution p̃(z) to the area under the curve k q(z).
Rejection sampler: generate two random numbers,
a number z0 from the proposal distribution q(z), and
a number u0 from the uniform distribution over [0, k q(z0)].
If u0 > p̃(z0), reject! The remaining pairs (z0, u0) have uniform distribution under the curve p̃(z), so the accepted values z0 are distributed according to p(z).
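A minimal rejection sampler in Python, assuming an unnormalized target p̃(z) = z(1 − z) on (0, 1) with a uniform proposal q(z) = 1 and envelope constant k = 1/4 (all concrete choices are ours for the demo):

```python
import random

random.seed(3)

def p_tilde(z):
    return z * (1.0 - z)  # unnormalized target on (0, 1); maximum value 1/4 at z = 1/2

k = 0.25  # smallest k with k * q(z) >= p_tilde(z) for the uniform proposal q(z) = 1

def rejection_sample():
    while True:
        z0 = random.random()               # z0 ~ q(z)
        u0 = random.uniform(0.0, k * 1.0)  # u0 ~ U(0, k * q(z0))
        if u0 <= p_tilde(z0):              # accept iff the pair lies under p_tilde
            return z0

samples = [rejection_sample() for _ in range(50_000)]
mean = sum(samples) / len(samples)
print(mean)  # the normalized target is Beta(2, 2), whose mean is 0.5
```

Here the acceptance rate is the area ratio (1/6)/(1/4) = 2/3; a looser envelope (larger k) would waste proportionally more draws.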
Adaptive rejection sampling
Suppose: it is difficult to determine a suitable analytic form for the proposal distribution q(z).
Approach: construct the envelope function "on the fly", based on observed values of the distribution p(z). If p(z) is log-concave (the derivative of ln p(z) is non-increasing), use the derivatives to construct the envelope.
Adaptive rejection sampling
Step 1: at an initial set of grid points z_1, …, z_M, evaluate the function ln p(z_i) and its gradient, and calculate the tangent lines at the points z_i, i = 1, …, M.
Step 2: sample from the envelope distribution; if the sample is accepted, use it, otherwise add it to the grid and refine the envelope.
The envelope distribution is a piecewise exponential distribution: each tangent to ln p, with its slope and offset, corresponds to one exponential segment.
Adaptive rejection sampling
Problem of rejection sampling: one must find a proposal distribution q(z) that is close to the required distribution in order to minimize the rejection rate. In high dimensions the acceptance rate falls off exponentially (curse of dimensionality), so rejection sampling is restricted mainly to univariate distributions. However: it is a potential subroutine within more general methods.
Importance sampling
A framework for approximating expectations E_p[f(z)] directly with respect to p(z).
Does NOT provide samples from p(z).
Suppose (again): direct sampling from p(z) is difficult, but p(z) can be evaluated for any given value of z up to some normalization constant Z.
Importance sampling: as for rejection sampling, apply a proposal distribution q(z) from which it is easy to draw samples:
E[f] = ∫ f(z) p(z) dz = ∫ f(z) [p(z)/q(z)] q(z) dz ≈ (1/L) Σ_{l=1}^{L} [p(z^(l))/q(z^(l))] f(z^(l)), with samples z^(l) drawn from q(z).
Importance sampling: expectation formula for unnormalized distributions, with importance weights r_l = p̃(z^(l)) / q(z^(l)):
E[f] ≈ Σ_l w_l f(z^(l)), where w_l = r_l / Σ_m r_m are the normalized weights.
Key points:
The importance weights correct the bias introduced by sampling from the proposal distribution.
Accuracy depends on how well q(z) approximates p(z) (similar to rejection sampling).
Choose sample points in input space where f(z) p(z) is large (or at least where p(z) is large).
Wherever p(z) > 0 in some region, q(z) > 0 is necessary there.
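A self-normalized importance-sampling sketch in Python: the target is an unnormalized standard Gaussian p̃(z) = exp(−z²/2) (so the constant Z is treated as unknown), the proposal is uniform on [−10, 10], and we estimate E[z²] = 1. All concrete choices are ours:

```python
import math
import random

random.seed(4)

def p_tilde(z):
    return math.exp(-0.5 * z * z)  # unnormalized standard normal; Z = sqrt(2*pi) unused

a, b = -10.0, 10.0  # support of the uniform proposal q(z) = 1/(b - a)
L = 200_000
zs = [random.uniform(a, b) for _ in range(L)]

# Importance weights r_l = p_tilde(z_l) / q(z_l); the constant q cancels on normalizing.
r = [p_tilde(z) for z in zs]
total = sum(r)
w = [ri / total for ri in r]  # normalized weights w_l

estimate = sum(wi * z * z for wi, z in zip(w, zs))  # E[z^2] under p(z)
print(estimate)  # true value: 1
```

Note that the broad uniform proposal is a poor match to p(z): most weights are tiny, so far more draws are needed here than a well-matched q(z) would require.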
Importance sampling. Attention:
Consider the case where none of the samples falls in the regions where f(z) p(z) is large. In that case, the apparent variances of r_l and r_l f(z^(l)) may be small even though the estimate of the expectation is severely wrong. Hence a major drawback of the importance sampling method is the potential to produce results that are arbitrarily in error, with no diagnostic indication.
q(z) should NOT be small where p(z) may be significant!
Markov Chain Monte Carlo (MCMC) sampling
MCMC is a general framework for sampling from a large class of distributions that scales well with the dimensionality of the sample space.
Goal: generate samples from a distribution p(z).
Idea: build a machine which uses the current sample to decide which sample to produce next, in such a way that the overall distribution of the samples will be p(z).
Markov Chain Monte Carlo (MCMC) sampling
Approach:
The current sample z^(τ) is known (i.e. we maintain a record of the current state).
Generate a candidate sample z* from a proposal distribution q(z|z^(τ)) that depends on the current state z^(τ) and is sufficiently simple to draw samples from directly.
Accept or reject the candidate sample z* according to an appropriate criterion.
The samples z^(1), z^(2), z^(3), … form a Markov chain.
MCMC - Metropolis algorithm
Suppose: p(z) can be evaluated, via p̃(z), for any given value of z up to some normalization constant Z.
Algorithm:
Step 1: choose a symmetric proposal distribution, q(z_A|z_B) = q(z_B|z_A).
Step 2: accept the candidate sample z* with probability A(z*, z^(τ)) = min(1, p̃(z*) / p̃(z^(τ))).
MCMC - Metropolis algorithm
Algorithm (cont.):
Step 2.1: draw a random number u uniformly from (0,1).
Step 2.2: acceptance test: accept z* if u < p̃(z*) / p̃(z^(τ)).
Step 3: if accepted, update the state, z^(τ+1) = z*; otherwise keep the old state, z^(τ+1) = z^(τ).
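The full loop, as a Python sketch for an unnormalized 1-D standard-normal target with a symmetric Gaussian random-walk proposal (the target, step size, and chain length are our own choices):

```python
import math
import random

random.seed(5)

def p_tilde(z):
    return math.exp(-0.5 * z * z)  # unnormalized standard normal target

def metropolis(n_steps, step=1.0, z0=0.0):
    z, chain = z0, []
    for _ in range(n_steps):
        z_star = random.gauss(z, step)        # symmetric proposal q(z*|z) = N(z, step^2)
        u = random.random()                   # Step 2.1: u ~ U(0, 1)
        if u < p_tilde(z_star) / p_tilde(z):  # Step 2.2: acceptance test
            z = z_star                        # Step 3: update the state ...
        chain.append(z)                       # ... or keep (and record) the old one
    return chain

chain = metropolis(50_000)
mean = sum(chain) / len(chain)
var = sum((z - mean) ** 2 for z in chain) / len(chain)
print(mean, var)  # should approach 0 and 1
```

Note that a rejected proposal still appends the current state, so repeated values appear in the chain; this is required for the samples to follow p(z).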
Metropolis algorithm, notes:
Rejection of a point leads to the previous sample being repeated (different from rejection sampling).
If q(z_A|z_B) > 0 for any values z_A, z_B, then the distribution of z^(τ) tends to p(z) as τ → ∞.
z^(1), z^(2), z^(3), … are not independent samples from p(z): there is serial correlation. Instead retain only every Mth sample.
Examples: Metropolis algorithm
Implementation in R: elliptical (correlated Gaussian) distribution. At each step, accept if u < p̃(z*) / p̃(z^(τ)): update the state z^(τ+1) = z*, otherwise keep the old state z^(τ+1) = z^(τ).
Examples: Metropolis algorithm
Implementation in R: Initialization [-2,2], step size = 0.3 n=1500 n=15000
Examples: Metropolis algorithm
Implementation in R: Initialization [-2,2], step size = 0.5 n=1500 n=15000
Examples: Metropolis algorithm
Implementation in R: Initialization [-2,2], step size = 1 n=1500 n=15000
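The R code itself is not reproduced in this transcript. A Python sketch of the same experiment (random-walk Metropolis on a correlated "elliptical" 2-D Gaussian, comparing the step sizes above) shows how the acceptance rate falls as the step size grows; the target covariance is our own choice:

```python
import math
import random

random.seed(6)

def p_tilde(z):
    """Unnormalized 'elliptical' target: a 2-D Gaussian with correlation 0.9."""
    x, y = z
    return math.exp(-(x * x - 1.8 * x * y + y * y) / (2.0 * (1.0 - 0.9 ** 2)))

def acceptance_rate(step, n_steps=20_000):
    z = (random.uniform(-2, 2), random.uniform(-2, 2))  # initialization in [-2, 2]
    accepted = 0
    for _ in range(n_steps):
        z_star = (random.gauss(z[0], step), random.gauss(z[1], step))
        if random.random() < p_tilde(z_star) / p_tilde(z):
            z = z_star
            accepted += 1
    return accepted / n_steps

rates = {step: acceptance_rate(step) for step in (0.3, 0.5, 1.0)}
print(rates)  # the acceptance rate drops as the step size grows
```

Small steps are accepted often but explore the elongated target slowly; large steps frequently overshoot into low-probability regions and get rejected.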
Validation of MCMC. Properties of Markov chains:
A first-order Markov chain is a sequence z^(1), z^(2), …, z^(m), z^(m+1), … in which each state depends only on its predecessor, with transition probabilities T_m(z^(m), z^(m+1)) = p(z^(m+1) | z^(m)).
Homogeneous: T_m is the same for all m.
Validation of MCMC. Properties of Markov chains (cont.):
Invariant (stationary): a distribution p*(z) is invariant with respect to a homogeneous Markov chain with transitions T if p*(z) = Σ_{z'} T(z', z) p*(z').
A sufficient condition for invariance is detailed balance: p*(z) T(z, z') = p*(z') T(z', z). Transitions T that satisfy detailed balance are called reversible.
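Detailed balance can be checked numerically on a small discrete chain. Here we build the Metropolis transition matrix for a 3-state target (the probabilities are our choice) with a uniform symmetric proposal, and verify both detailed balance and invariance of p*:

```python
p = [0.2, 0.3, 0.5]  # target distribution p*(z) over three states
n = len(p)
q = 1.0 / (n - 1)    # symmetric proposal: jump uniformly to one of the other states

# Metropolis transitions: T(i, j) = q * min(1, p[j]/p[i]) for j != i.
T = [[q * min(1.0, p[j] / p[i]) if j != i else 0.0 for j in range(n)] for i in range(n)]
for i in range(n):
    T[i][i] = 1.0 - sum(T[i])  # remaining mass: stay put (rejected proposals)

# Detailed balance: p*_i T(i, j) == p*_j T(j, i) for all pairs (i, j).
db = all(abs(p[i] * T[i][j] - p[j] * T[j][i]) < 1e-12
         for i in range(n) for j in range(n))
# Invariance: (p* T)(j) == p*_j for all j.
pT = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
inv = all(abs(pT[j] - p[j]) < 1e-12 for j in range(n))
print(db, inv)  # True True
```

Invariance here follows from detailed balance by summing the balance condition over z'; the check confirms both properties to floating-point accuracy.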
Validation of MCMC: ergodicity
Goal: an invariant Markov chain that converges to the desired distribution p*(z).
Ergodicity: p*(z) = lim_{m→∞} p(z^(m)) for any initial distribution p(z^(0)).
An ergodic Markov chain has only one equilibrium distribution, the invariant distribution p*(z).
Properties and validation of MCMC
Approach: construct appropriate transition probabilities T(z', z) from a set of base transitions B_k, either in
mixture form: T(z', z) = Σ_k α_k B_k(z', z), with mixing coefficients α_k ≥ 0 and Σ_k α_k = 1, or by
successive application: T(z', z) = Σ_{z_1} … Σ_{z_{K-1}} B_1(z', z_1) … B_K(z_{K-1}, z).
Metropolis-Hastings algorithm
A generalization of the Metropolis algorithm: no symmetric proposal distribution q(z) is required.
The candidate z* is accepted with probability A_k(z*, z^(τ)) = min(1, [p̃(z*) q_k(z^(τ)|z*)] / [p̃(z^(τ)) q_k(z*|z^(τ))]).
If the proposal is symmetric, this reduces to the standard Metropolis criterion.
The choice of proposal distribution is critical.
Metropolis-Hastings algorithm
A common choice is a Gaussian proposal centered on the current state:
Small variance: high acceptance rate, but a slow random walk and strongly dependent samples.
Large variance: high rejection rate.
Gibbs sampling: a special case of the Metropolis-Hastings algorithm in which the candidate value is always accepted.
Suppose: the target is p(z1, z2, z3).
Step 1: choose initial values z1, z2, z3.
Step 2 (repeated): sample
z1' ~ p(z1 | z2, z3)
z2' ~ p(z2 | z1', z3)
z3' ~ p(z3 | z1', z2')
The cycle is repeated, either by cycling through the variables in order or by randomly choosing the variable to be updated at each step.
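A Gibbs sampler for a zero-mean bivariate Gaussian with correlation ρ, whose full conditionals are the standard univariate Gaussians z1|z2 ~ N(ρ z2, 1 − ρ²) and symmetrically for z2 (ρ = 0.8 and the chain length are our choices):

```python
import math
import random

random.seed(7)
rho = 0.8
cond_sd = math.sqrt(1.0 - rho * rho)  # conditional std dev of z1|z2 (and of z2|z1)

z1, z2, chain = 0.0, 0.0, []
for _ in range(50_000):
    z1 = random.gauss(rho * z2, cond_sd)  # z1 ~ p(z1 | z2); always accepted
    z2 = random.gauss(rho * z1, cond_sd)  # z2 ~ p(z2 | z1), using the NEW z1
    chain.append((z1, z2))

# The sample correlation should approach rho.
n = len(chain)
m1 = sum(a for a, _ in chain) / n
m2 = sum(b for _, b in chain) / n
cov = sum((a - m1) * (b - m2) for a, b in chain) / n
v1 = sum((a - m1) ** 2 for a, _ in chain) / n
v2 = sum((b - m2) ** 2 for _, b in chain) / n
corr = cov / math.sqrt(v1 * v2)
print(corr)  # close to 0.8
```

Each update conditions on the most recent value of the other variable; this is the "cycling in order" scheme from the slide.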
Gibbs sampling leaves the target invariant:
The marginal p(z\i) is invariant (unchanged), because z\i is fixed at each step.
The univariate conditional distribution p(z_i | z\i) is sampled exactly, so it is invariant by definition.
Hence the joint distribution p(z) is invariant.
Gibbs sampling, sufficient condition for ergodicity:
None of the conditional distributions may be anywhere zero, i.e. any point in z space can be reached from any other point in a finite number of steps.
Gibbs sampling: obtaining (approximately) independent samples:
Run the MCMC sampler through a "burn-in" period to remove the dependence on the initial values.
Then sample at set time points (e.g. retain only every Mth sample).
The Gibbs sequence converges to a stationary (equilibrium) distribution that is independent of the starting values; by construction, this stationary distribution is the target distribution we are trying to simulate.
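The effect of burn-in and thinning can be demonstrated on a deliberately sticky Metropolis chain (standard-normal target, small step size, bad starting value; all settings are our own): the lag-1 autocorrelation of the raw post-burn-in chain is high, while keeping only every 10th sample reduces it sharply.

```python
import math
import random

random.seed(8)

def lag1_autocorr(xs):
    """Approximate lag-1 autocorrelation of a sequence."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    return sum((xs[i] - m) * (xs[i + 1] - m) for i in range(n - 1)) / (n * var)

# Sticky random-walk Metropolis chain on a standard normal target (step size 0.3).
z, chain = -2.0, []  # deliberately poor starting value
for _ in range(60_000):
    z_star = random.gauss(z, 0.3)
    if random.random() < math.exp(0.5 * (z * z - z_star * z_star)):
        z = z_star
    chain.append(z)

burned = chain[5_000:]  # discard the burn-in period
thinned = burned[::10]  # then retain only every 10th sample
print(lag1_autocorr(burned), lag1_autocorr(thinned))
```

The thinned subsequence behaves much more like independent draws, at the cost of keeping only a fraction of the chain.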
Gibbs sampling: practicability depends on the feasibility of drawing samples from the conditional distributions p(z_i | z\i). Directed graphs (with suitable conditional distributions at the nodes) lead to conditional distributions for Gibbs sampling that are log-concave, so adaptive rejection sampling methods can be applied.