Empirical Bayes approaches to thresholding
Bernard Silverman, University of Bristol (joint work with Iain Johnstone, Stanford)
IMS meeting, 30 July 2002
Finding needles or hay in haystacks
- Archetypal problem: n noisy observations Y_i = μ_i + ε_i, i = 1, ..., n
- The sequence (μ_i) may well be sparse, but not necessarily
- Assume the noise has variance 1; in practice this is no real restriction
Examples
- Wavelet coefficients at each level of an unknown function
- Coefficients in some more general dictionary
- Pixels of a nearly, or not so nearly, black object or image
Needles or hay?
- Needles: rare objects in the noise; if the observation were 3, we would be inclined to think it was straw
- Hay: common objects, a non-sparse signal; if the observation were 3, we would be inclined to think it was a nonzero "object"
- A good method will adapt to either situation automatically
Thresholding
- Choose a threshold t
- If |Y_i| ≤ t, estimate μ_i by 0
- If |Y_i| > t, estimate μ_i by Y_i (a minimal code sketch of this hard-thresholding rule follows below)
- Gain strength from sparsity: if the signal is sparse, a high threshold gives great accuracy
- A data-dependent choice of threshold is essential to adapt to sparsity: a high threshold can be disadvantageous for a 'dense' signal
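As a minimal illustration of the rule just described, here is a sketch of hard thresholding in Python; the function name and the NumPy usage are mine, not part of the talk.

    import numpy as np

    def hard_threshold(y, t):
        # Keep Y_i unchanged when |Y_i| > t, and estimate mu_i = 0 otherwise.
        y = np.asarray(y, dtype=float)
        return np.where(np.abs(y) > t, y, 0.0)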
Aims for a thresholding method
- Adaptive to sparse and dense signals
- Stable to small changes in the data
- Tractable, with available software
- Performs well in simulations
- Performs well on real data
- Has good theoretical properties
Our method does all of these!
Bayesian formulation
- The prior for each parameter μ_i is a mixture of an atom at zero (probability 1 - w) and a suitable heavy-tailed density (probability w)
- The posterior median is a true thresholding rule; denote its threshold by t(w) (a numerical sketch of t(w) follows below)
- Small w means a large threshold, so we want small w for sparse signals and large w for dense ones
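To make the posterior-median threshold t(w) concrete, here is a hedged sketch that assumes the heavy-tailed component is a Laplace density with scale a; the talk does not fix the density, and the choice a = 0.5 is purely illustrative. It uses the closed-form convolution of a Laplace density with the standard normal and locates t(w) by root-finding.

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import brentq

    A = 0.5  # scale of the assumed Laplace component (illustrative, not from the talk)

    def g_marginal(y, a=A):
        # Convolution of the Laplace(a) density with the N(0,1) noise density:
        # the marginal density of Y_i when mu_i comes from the non-zero part of the prior.
        y = np.abs(y)
        return 0.5 * a * np.exp(0.5 * a**2) * (
            np.exp(-a * y) * norm.cdf(y - a) + np.exp(a * y) * norm.cdf(-y - a))

    def prob_mu_positive(y, w, a=A):
        # Posterior probability that mu_i > 0, for an observation y > 0.
        numer = w * 0.5 * a * np.exp(0.5 * a**2 - a * y) * norm.cdf(y - a)
        denom = (1 - w) * norm.pdf(y) + w * g_marginal(y, a)
        return numer / denom

    def median_threshold(w, a=A):
        # t(w): the posterior median of mu_i is zero exactly when |Y_i| <= t(w).
        return brentq(lambda y: prob_mu_positive(y, w, a) - 0.5, 0.0, 50.0)

Under this assumed prior, a small w gives a noticeably larger t(w) than a w near 1, matching the "small w, large threshold" bullet above.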
Other possible thresholding rules
- Hard or soft thresholding with the same threshold t(w)
- Posterior mean: not a strict thresholding rule
- The posterior probability of being non-zero gives the probability that a pixel/coefficient/feature is 'really there' (sketched below)
  - threshold if this probability is < 0.5
  - or threshold only above some larger probability?
- The mean and Bayes factor rules generalize to the complex and multivariate cases
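Continuing the sketch above, with the same assumed Laplace component, the "probability that the feature is really there" rule can be written as follows.

    def prob_nonzero(y, w, a=A):
        # Posterior probability that mu_i != 0; estimating mu_i = 0 whenever this
        # falls below 0.5 (or below some larger chosen cutoff) is the rule on this slide.
        phi = norm.pdf(y)
        g = g_marginal(y, a)
        return w * g / ((1 - w) * phi + w * g)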
Empirical Bayes: data-based choice of w
- Let g be the convolution of the heavy-tailed density γ with the normal density
- The marginal log likelihood of w, ℓ(w) = Σ_i log{(1 - w) φ(Y_i) + w g(Y_i)}, is computationally tractable to maximize (sketched below)
- This is automatically adaptive: it gives large w if a large number of the Y_i are large, and vice versa
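A sketch of the marginal maximum likelihood step, reusing g_marginal from the sketch above; the particular optimizer is my choice, not the talk's.

    from scipy.optimize import minimize_scalar

    def estimate_w(y, a=A):
        # Maximize the marginal log likelihood
        #   l(w) = sum_i log{ (1 - w) * phi(Y_i) + w * g(Y_i) }
        # over w in (0, 1); many large |Y_i| push the maximizer towards large w.
        phi = norm.pdf(y)
        g = g_marginal(y, a)
        neg_loglik = lambda w: -np.sum(np.log((1 - w) * phi + w * g))
        return minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6),
                               method="bounded").x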
Example
- Six signals of varying sparsity
- Each has 10000 values, arranged as an image for display
- Independent Gaussian noise added (a toy version is sketched below)
- The excellent behaviour of the MML automatic thresholding method is borne out by other simulations, including in the wavelet context
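A toy version of this kind of experiment, stitching the sketches above together; the sparsity level, signal values and seed are invented for illustration and are not the talk's six signals.

    rng = np.random.default_rng(0)
    n = 10_000
    mu = np.zeros(n)
    mu[:50] = rng.uniform(3.0, 7.0, size=50)   # 50 "needles" among 10000 values
    y = mu + rng.standard_normal(n)            # unit-variance Gaussian noise

    w_hat = estimate_w(y)                      # data-based mixing weight
    t_hat = median_threshold(w_hat)            # corresponding threshold t(w_hat)
    mu_hat = hard_threshold(y, t_hat)          # hard thresholding at the estimated threshold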
[Figure: root mean square error plotted against threshold]
Root mean square error plotted against threshold
- Much lower RMSE can be obtained for sparse signals with suitable thresholding (traced below for the toy data)
- The best threshold decreases as the density increases
- The MML automatic choice of threshold is excellent
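For the toy data generated above, the curve described on this slide can be traced with a few lines; this is a sketch only, and the talk's figures use its own six signals.

    thresholds = np.linspace(0.0, 5.0, 51)
    rmse = [np.sqrt(np.mean((hard_threshold(y, t) - mu) ** 2)) for t in thresholds]
    # For a sparse signal the curve dips well below its value at t = 0, and the
    # minimizing threshold moves towards 0 as the signal is made denser.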
[Figure: estimates obtained with the optimal threshold]
Theoretical properties
- Characterize sparseness by n^{-1} Σ_i |μ_i|^p ≤ η^p for some small p > 0
- Among all signals with given energy (sum of squares), the sparsest are those with small ℓ_p norm
- For signals with this level of sparsity, the best possible estimation MSE is O(η^p |log η|^{(2-p)/2})
Automatic adaptivity
- The MML thresholding method achieves this best mean square error rate, without being told p or η, all the way down to p = 0
- The price to pay is an additional O(n^{-1} log^3 n) term
- The result also holds if the error is measured in q-norm, for 0 < q ≤ 2
Adaptivity for the standard wavelet transform
- Assume the MML method is applied level by level
- Assume the array of coefficients lies in some Besov class with 0 < p ≤ 2; this allows a very wide range of function classes, including very inhomogeneous ones
- Allow mean q-norm error
- Apart from an O(n^{-1} log^4 n) term, the minimax rate is achieved regardless of the parameters
Conclusion to this part
- Empirical Bayes thresholding has great promise as an adaptive method
- Wavelets are only one of many contexts where this approach can be used
- The Bayesian aspects have not been considered much in practical contexts; if you want 95% posterior probability that a feature is there, you just increase the threshold