Bayesian Inference
Presenting: Assaf Tzabari
Agenda
- Basic concepts
- Conjugate priors
- Generalized Bayes rules
- Empirical Bayes
- Admissibility
- Asymptotic efficiency
Basic concepts
θ - unknown parameter with prior density π(θ)
x - random vector with density f(x|θ)
Joint density of x and θ: f(x,θ) = f(x|θ)π(θ)
Marginal density of x: m(x) = ∫ f(x|θ)π(θ) dθ
Posterior density of θ: π(θ|x) = f(x|θ)π(θ) / m(x)
Basic concepts (cont.)
Elements of a decision problem:
A - the set of all possible actions
L(θ,a) - loss function, defined for all θ ∈ Θ and a ∈ A
δ(x) - decision rule
Risk function: R(θ,δ) = E[L(θ,δ(x)) | θ]
Bayes risk: r(π,δ) = E[R(θ,δ)], with the expectation taken over the prior π
Basic concepts (cont.)
A Bayes rule is a decision rule δπ which minimizes the Bayes risk r(π,δ).
A Bayes rule can be found by choosing, for each x, an action which minimizes the posterior expected loss E[L(θ,a) | x], or, equivalently, which minimizes ∫ L(θ,a) f(x|θ)π(θ) dθ.
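The following is a minimal numerical sketch (not part of the original deck) of finding a Bayes action by minimizing the posterior expected loss on a grid; the N(1,1) posterior, the grid, and the absolute-error loss are all illustrative choices.

```python
import numpy as np

# Hypothetical setup: a posterior pi(theta | x) discretized on a uniform grid.
theta = np.linspace(-5.0, 5.0, 1001)
dtheta = theta[1] - theta[0]
post = np.exp(-0.5 * (theta - 1.0) ** 2)   # unnormalized N(1, 1) posterior
post /= post.sum() * dtheta                # normalize numerically

def posterior_expected_loss(a, loss):
    """Approximate E[L(theta, a) | x] by a Riemann sum on the grid."""
    return np.sum(loss(theta, a) * post) * dtheta

# Absolute-error loss: the Bayes action is the posterior median.
loss = lambda t, a: np.abs(t - a)
actions = np.linspace(-5.0, 5.0, 1001)
risks = [posterior_expected_loss(a, loss) for a in actions]
print(actions[int(np.argmin(risks))])      # approx. 1.0, the posterior median
```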
Basic concepts (cont.)
Example: Bayesian estimation under MSE. With the squared-error loss L(θ,a) = (θ-a)², the posterior expected loss E[(θ-a)² | x] is minimized at a = E[θ|x], so the Bayes estimator is the posterior mean.
Conjugate priors
Definition: A class P of prior distributions is a conjugate family for a class F of sample densities if the posterior π(θ|x) belongs to P for every f ∈ F, π ∈ P, and x.
Example: the class of normal priors is a conjugate family for the class of normal sample densities.
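As a sketch of the normal-normal case, the standard conjugate-update identities can be coded directly; the function name and the numeric values below are illustrative, not from the slides.

```python
def normal_posterior(x, sigma2, mu, tau2):
    """Posterior of theta for x ~ N(theta, sigma2) with prior theta ~ N(mu, tau2).

    Standard conjugate-update identities:
      mean = (tau2 * x + sigma2 * mu) / (tau2 + sigma2)
      var  = sigma2 * tau2 / (sigma2 + tau2)
    """
    post_var = sigma2 * tau2 / (sigma2 + tau2)
    post_mean = (tau2 * x + sigma2 * mu) / (tau2 + sigma2)
    return post_mean, post_var

print(normal_posterior(x=2.0, sigma2=1.0, mu=0.0, tau2=4.0))  # N(1.6, 0.8)
```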
Using conjugate priors
Step 1: Find a conjugate prior. Choose a class of priors with the same functional form as the likelihood functions.
Step 2: Calculate the posterior. Gather the factors involving θ in f(x|θ)π(θ).
Using conjugate priors (cont.)
Example: finding a conjugate prior for the Poisson distribution.
x = (x1,…,xn) where the xi ~ P(θ) are iid, so f(x|θ) ∝ θ^(Σxi) e^(-nθ).
These factors fit a gamma distribution of θ: taking θ ~ Gamma(a,b) with density p(θ) ∝ θ^(a-1) e^(-θ/b) (shape a, scale b) gives the posterior θ|x ~ Gamma(a + Σxi, b/(nb+1)).
Using conjugate priors (cont.)
Example (cont.): finding a conjugate prior for the Poisson distribution. The Bayes estimator under MSE is then the posterior mean, δ(x) = (a + Σxi)·b/(nb+1). The ML estimator is the sample mean, x̄ = (1/n)Σxi.
[Figure: gamma prior densities p(θ) for (a,b) = (1,2), (2,2), (3,2), (10,0.5).]
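A sketch of the Poisson-gamma update above, using the shape/scale parameterization assumed in the reconstruction; the simulated data and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 3.0
x = rng.poisson(theta_true, size=20)

# Gamma(a, b) prior with shape a and scale b: p(theta) ∝ theta^(a-1) e^(-theta/b).
a, b = 2.0, 2.0
a_post = a + x.sum()                    # conjugate update
b_post = b / (len(x) * b + 1.0)

bayes_mse = a_post * b_post             # posterior mean = Bayes estimator under MSE
print(bayes_mse, x.mean())              # x.mean() is the ML estimator
```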
Using conjugate priors (cont.)
More conjugate priors for common statistical distributions:
Binomial: x ~ b(p,n) with a Beta(a,b) prior on p gives the posterior p|x ~ Beta(a + x, b + n - x).
[Figure: beta prior densities p(θ) for (a,b) = (2,2), (0.5,0.5), (2,5), (5,2).]
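A short sketch of the binomial-beta update, assuming SciPy is available; the counts and hyperparameters are illustrative.

```python
from scipy.stats import beta

# x successes in n Bernoulli(p) trials with a Beta(a, b) prior on p.
n, x = 20, 7
a, b = 2.0, 2.0
posterior = beta(a + x, b + n - x)      # conjugate update: Beta(a+x, b+n-x)
print(posterior.mean())                 # Bayes estimate under MSE: (a+x)/(a+b+n) = 0.375
print(x / n)                            # ML estimate for comparison: 0.35
```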
Using conjugate priors (cont.)
Uniform: iid x = (x1,…,xn), xi ~ U(0,θ), with a Pareto(α, x0) prior on θ gives the posterior θ|x ~ Pareto(α + n, max(x0, x(n))), where x(n) = max_i xi.
[Figure: Pareto prior densities p(θ) for α = 1, 2, 3.]
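A sketch of the uniform-Pareto update under the parameterization assumed above; the data and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true = 4.0
x = rng.uniform(0.0, theta_true, size=15)

# Pareto(alpha, x0) prior: p(theta) = alpha * x0**alpha / theta**(alpha+1), theta >= x0.
alpha, x0 = 2.0, 1.0
alpha_post = alpha + len(x)             # posterior is Pareto(alpha+n, max(x0, max_i x_i))
x0_post = max(x0, x.max())

post_mean = alpha_post * x0_post / (alpha_post - 1.0)   # Bayes estimator under MSE
print(post_mean, x.max())               # x.max() is the ML estimator
```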
Conjugate priors (cont.)
Advantages:
- Easy to calculate
- Intuitive
- Useful for sequential estimation
Can a conjugate prior be a reasonable approximation to the true prior? Not always!
Conjugate priors (cont.)
Example: estimating θ under MSE based on x ~ N(θ,1).
Step 1: subjectively determine α-fractiles. A point z(α) is the α-fractile if P(θ ≤ z(α)) = α.
Step 2: look for matching priors and find the Bayes estimator.
Only π1 is a conjugate prior, but which prior yields the better estimator?
Improper priors
Improper prior - a prior with infinite mass.
- The Bayes risk has no meaning
- The posterior function usually still exists
Useful in the following cases:
- Prior information is not available (noninformative priors are usually improper)
- The parameter space is restricted
Generalized Bayes rules
Definition: If π(θ) is an improper prior, a generalized Bayes rule, for given x, is an action which minimizes ∫ L(θ,a) f(x|θ)π(θ) dθ or, if the marginal m(x) is finite, which minimizes the posterior expected loss.
Example: estimating θ > 0 under MSE based on x ~ N(θ,σ²), with the flat prior π(θ) = 1 on (0,∞).
Generalized Bayes rules (cont.)
[Figure: the generalized Bayes estimators for σ = 2, 1, 1/2 versus the ML estimator, as functions of x.]
Generalized Bayes rules (cont.)
Generalized Bayes rules are useful in solving problems which don't include prior information.
Example: location parameter estimation under a loss L(a-θ).
f(x|θ) is a location density with location parameter θ if f(x|θ) = f(x-θ).
Using π(θ) = 1 we get π(θ|x) = f(x-θ) / ∫ f(x-t) dt.
Generalized Bayes rules (cont.)
Example (cont.): location parameter estimation under L(a-θ).
The generalized Bayes rule is the action a(x) which minimizes ∫ L(a-θ) f(x-θ) dθ.
This is a group of invariant rules, and the best invariant rule is the generalized Bayes rule with the prior π(θ) = 1.
Generalized Bayes rules (cont.)
Example (cont.): location parameter estimation under L(a-θ).
Under MSE, δ(x) is the posterior mean; for x = (x1,…,xn), Pitman's estimator is derived:
δ(x) = ∫ θ ∏i f(xi-θ) dθ / ∫ ∏i f(xi-θ) dθ
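A numerical sketch of Pitman's estimator via grid discretization (an illustrative approximation, not the slides' derivation); for the normal location density it should reproduce the sample mean.

```python
import numpy as np

def pitman_estimator(x, f, grid):
    """Evaluate delta(x) = ∫ theta Π f(x_i - theta) dtheta / ∫ Π f(x_i - theta) dtheta
    by discretizing theta on a uniform grid."""
    # Work in logs for numerical stability.
    loglik = np.sum(np.log(f(x[:, None] - grid[None, :])), axis=0)
    w = np.exp(loglik - loglik.max())
    return np.sum(grid * w) / np.sum(w)

# Example: N(0, 1) location density; Pitman's estimator reduces to the sample mean.
f = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
x = np.array([1.2, 0.7, 2.1, 1.5])
grid = np.linspace(-10.0, 10.0, 4001)
print(pitman_estimator(x, f, grid), x.mean())   # the two agree
```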
Empirical Bayes
Development of Bayes rules using auxiliary empirical (past or current) data.
Methods:
- Using past data to construct the prior
- Using past data to estimate the marginal distribution
- Dealing simultaneously with several decision problems
xn+1 - sample information with density f(xn+1|θn+1)
x1,…,xn - past observations with densities f(xi|θi)
Determination of the prior from past data
Assumption: θ1,…,θn,θn+1 are parameters from a common prior π(θ).
μf(θ), σf²(θ) - conditional mean and variance of xi given θi = θ
μm, σm² - marginal mean and variance of xi
Lemma 1: μm = E[μf(θ)] and σm² = E[σf²(θ)] + Var(μf(θ)), with moments taken over the prior.
Result 1: if μf(θ) = θ and σf² is constant, then the prior mean and variance satisfy μπ = μm and σπ² = σm² - σf².
Determination of the prior from past data (cont.)
Step 1: Assume a certain functional form for π. A conjugate family of priors is convenient.
Step 2: Estimate μπ, σπ² based on x1,…,xn (xn+1 can be included too). If μf(θ) = θ and σf² is constant then:
Step 2a: Estimate μm, σm² from the data, e.g. by the sample mean and sample variance.
Step 2b: Use Result 1 to calculate μπ, σπ².
Determination of the prior from past data (cont.)
Example: xi ~ N(θi, σf²) and π(θ) is assumed to be normal (a conjugate prior). Estimation of μπ and σπ² is needed for determining the prior, as in the sketch below.
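A sketch of Steps 2a-2b for this normal example, assuming xi ~ N(θi, σf²) with known σf²; all numeric values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma_f2 = 1.0                              # known sampling variance, x_i ~ N(theta_i, 1)
mu_pi, sigma_pi2 = 2.0, 3.0                 # true (unknown) normal prior
theta = rng.normal(mu_pi, np.sqrt(sigma_pi2), size=500)
x = rng.normal(theta, np.sqrt(sigma_f2))

# Step 2a: estimate the marginal moments from the data.
mu_m_hat = x.mean()
sigma_m2_hat = x.var(ddof=1)

# Step 2b: Result 1 with mu_f(theta) = theta and constant sigma_f2:
#   mu_pi = mu_m,   sigma_pi2 = sigma_m2 - sigma_f2.
mu_pi_hat = mu_m_hat
sigma_pi2_hat = max(sigma_m2_hat - sigma_f2, 0.0)   # truncate at 0 for stability
print(mu_pi_hat, sigma_pi2_hat)             # close to (2.0, 3.0)
```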
Estimation of the marginal distribution from past data
Assumption: the Bayes rule can be represented in terms of m(x).
Advantage: no need to estimate the prior.
Step 1: Estimate m(x). x1,…,xn,xn+1 are a sample from the distribution with density m(x); e.g. in the discrete case, m̂(x) = #{i : xi = x} / (n+1).
Step 2: Estimate the Bayes rule using m̂(x).
Estimation of the marginal distribution from past data (cont.)
Example: the Bayes estimation of θn+1 under MSE, expressed through the estimated marginal m̂.
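Assuming this example refers to the classic Poisson case (an assumption; the slide does not spell out the sampling density), Robbins' empirical Bayes estimator E[θ|x] = (x+1)·m(x+1)/m(x) can be sketched with the empirical marginal.

```python
import numpy as np

rng = np.random.default_rng(3)
theta = rng.gamma(2.0, 1.5, size=2000)      # unknown prior, never used below
x = rng.poisson(theta)                       # past observations x_1..x_n

def robbins_estimate(x_new, past):
    """Robbins' estimate E[theta | x] = (x+1) m(x+1) / m(x), with the marginal m
    replaced by empirical frequencies of the past observations."""
    n = len(past)
    m = lambda k: np.sum(past == k) / n
    return (x_new + 1) * m(x_new + 1) / max(m(x_new), 1.0 / n)  # guard m(x)=0

print(robbins_estimate(3, x))
```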
Compound decision problems
Independent x1,…,xn are observed, where the θi are from a common prior π(θ).
Goal: simultaneously make decisions involving θ1,…,θn, where the loss is L(θ1,…,θn,a).
Solution: determine the prior from x1,…,xn using empirical Bayes methods.
Admissibility of Bayes rules
Bayes rules with finite Bayes risk are typically admissible:
- If a Bayes rule δπ is unique, then it is admissible (e.g. under MSE the Bayes rule is unique). Proof: any rule R-better than δπ must be a Bayes rule itself, contradicting uniqueness.
- For discrete θ, assuming that π is positive everywhere, δπ is admissible.
- For continuous θ, if R(θ,δ) is continuous in θ for every δ, then δπ is admissible.
Admissibility of Bayes rules (cont.)
Generalized Bayes rules can be inadmissible, and verifying their admissibility can be difficult.
Example: the generalized Bayes estimator of θ based on x ~ Np(θ, I), namely δ(x) = x under the prior π(θ) = 1, versus the James-Stein estimator.
Admissibility of Bayes rules (cont.)
Example (cont.): the generalized Bayes estimator δ(x) = x versus the James-Stein estimator δJS(x) = (1 - (p-2)/||x||²)·x, which has uniformly smaller risk when p ≥ 3.
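A Monte Carlo sketch comparing the two estimators, assuming the standard setting x ~ Np(θ, I); the dimension, true mean, and trial count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
p, trials = 10, 20000
theta = np.ones(p) * 2.0

x = rng.normal(theta, 1.0, size=(trials, p))            # x ~ N_p(theta, I)
gb = x                                                  # generalized Bayes rule: delta(x) = x
norm2 = np.sum(x**2, axis=1, keepdims=True)
js = (1.0 - (p - 2) / norm2) * x                        # James-Stein estimator

mse_gb = np.mean(np.sum((gb - theta) ** 2, axis=1))     # approx. p = 10
mse_js = np.mean(np.sum((js - theta) ** 2, axis=1))     # strictly smaller for p >= 3
print(mse_gb, mse_js)
```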
Admissibility of Bayes rules (cont.)
Theorem: if x is continuous with a p-dimensional exponential density and Θ is closed, then any admissible estimator is a generalized Bayes rule.
f(x|θ) is a p-dimensional exponential density if f(x|θ) = h(x)·exp(θᵀx - ψ(θ)), e.g. the normal distribution.
Asymptotic efficiency of Bayes estimators
x1,…,xn are iid samples with density f(xi|θ).
Definitions:
- An estimator δn(x1,…,xn) of θ is asymptotically unbiased if E[δn] → θ as n → ∞.
- An asymptotically unbiased estimator is asymptotically efficient if √n(δn - θ) → N(0, v(θ)) in distribution with v(θ) = 1/I(θ), where v(θ) is the asymptotic variance and I(θ) is the Fisher information in a single sample.
Asymptotic efficiency of Bayes estimators (cont.)
Assumptions for the next theorems:
- The posterior is a proper, continuous, and positive density (the prior itself can be improper!)
- The likelihood function l(θ) = f(x|θ) satisfies regularity conditions
Asymptotic efficiency of Bayes estimators (cont.)
Theorem: for large values of n, the posterior distribution is approximately normal, N(θ̂n, [n·I(θ̂n)]⁻¹), where θ̂n is the ML estimate.
Conclusion: Bayes estimators such as the posterior mean are asymptotically unbiased, and the effect of the prior declines as n increases.
Asymptotic efficiency of Bayes estimators (cont.)
Theorem: if δn is the Bayes estimator under MSE, then √n(δn - θ) → N(0, 1/I(θ)) in distribution.
Conclusion: the Bayes estimator δn under MSE is asymptotically efficient.
Asymptotic efficiency of Bayes estimators (cont.)
Example: estimating p based on a binomial sample x ~ b(p,n) under MSE. With a Beta(a,b) prior, the Bayes estimator is δ(x) = (x+a)/(n+a+b), which approaches the ML estimator x/n as n grows; see the sketch below.
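A small sketch showing the Bayes and ML estimates converging as n grows; the prior hyperparameters and the true p are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
p_true, a, b = 0.3, 2.0, 2.0

for n in (10, 100, 10000):
    x = rng.binomial(n, p_true)
    bayes = (x + a) / (n + a + b)          # posterior mean with Beta(a, b) prior
    ml = x / n
    print(n, bayes, ml)                    # the two estimates converge as n grows
```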
Asymptotic efficiency of Bayes estimators (cont.)
If the prior is concentrated, it determines the estimator: "Don't confuse me with the facts!"
[Figure: the Bayes estimator δπ(x) versus the ML estimator for Beta priors with a=b=2 and a=b=2000.]
Asymptotic efficiency of Bayes estimators (cont.)
For large samples, the Bayes estimator tends to become independent of the prior.
[Figure: the Bayes estimator with a Beta(2,2) prior versus the ML estimator, for n=10 and n=1000.]
Asymptotic efficiency of Bayes estimators (cont.)
More examples of asymptotically efficient Bayes estimators:
- Location distributions: if the likelihood function l(θ) = f(x-θ) satisfies the regularity conditions, then the Pitman estimator after one observation is asymptotically efficient.
- Exponential distributions: if f(x|θ) has exponential form, then it satisfies the regularity conditions, and the asymptotic efficiency depends on the prior.
Conclusions
- Bayes rules are designed for problems with prior information, but are useful in other cases as well.
- Determining the prior is a crucial step, which affects the admissibility and the computational complexity.
- Bayes estimators under MSE perform well on large samples.