1 Bayesian Essentials Slides by Peter Rossi and David Madigan
2 Distribution Theory 101
Marginal and Conditional Distributions
[Figure: joint distribution of (X, Y), uniform over a region of the unit square; both axes run from 0 to 1]
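3 Simulating from Joint
To draw from the joint:
i. Draw from the marginal on X
ii. Condition on this draw, and draw from the conditional of Y|X

    library(triangle)                   # provides rtriangle()
    NumDraws <- 1000                    # number of draws; value assumed, not given on the slide
    x <- rtriangle(NumDraws, 0, 1, 1)   # marginal of X: triangular on (0, 1) with mode at 1
    y <- runif(NumDraws, 0, x)          # conditional of Y | X = x: uniform on (0, x)
    plot(x, y)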
4 Triangular Distribution
If U ~ unif(0,1), then: sqrt(U) has the standard triangle distribution (density 2x on 0 < x < 1)
If U1, U2 ~ unif(0,1) independently, then: Y = max{U1,U2} has the standard triangle distribution
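Both constructions can be checked by simulation. The sketch below is a minimal illustration in base R; the number of draws and the checkpoints in `grid` are arbitrary choices, not values from the slides. It compares the empirical CDFs with the CDF F(x) = x^2 implied by a density of 2x on (0, 1).

```r
set.seed(1)
n <- 100000                              # number of draws (arbitrary)
x_sqrt <- sqrt(runif(n))                 # square root of one uniform
x_max  <- pmax(runif(n), runif(n))       # maximum of two independent uniforms

# Both should have CDF F(x) = x^2 on (0, 1), i.e. density 2x.
grid <- c(0.25, 0.5, 0.75)               # arbitrary checkpoints
cdf_sqrt <- sapply(grid, function(q) mean(x_sqrt <= q))
cdf_max  <- sapply(grid, function(q) mean(x_max <= q))
print(rbind(theory = grid^2, sqrt_U = cdf_sqrt, max_U = cdf_max))
```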
5 Sampling Importance Resampling
f: target density; g: density we can sample from
- draw a big sample from g
- sub-sample from that sample with probability proportional to f/g
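A minimal sketch of sampling importance resampling in base R. The target f(x) = 2x on (0, 1) and the uniform g are assumptions chosen for illustration (the slide's plotted f and g are not recoverable), as are the sample sizes.

```r
set.seed(2)
f <- function(x) 2 * x                   # assumed target density on (0, 1)
g <- function(x) dunif(x, 0, 1)          # assumed density we can sample from

big <- runif(50000)                      # big sample from g
w <- f(big) / g(big)                     # importance weights f/g

# Sub-sample with probability proportional to the weights;
# sample() normalises `prob` internally.
sir_draws <- sample(big, size = 5000, replace = TRUE, prob = w)
mean(sir_draws)                          # the target mean is 2/3
```

The resample is only approximately distributed as f; the approximation improves as the big sample grows relative to the sub-sample.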
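6 Metropolis
f: target density; g: proposal density
- start with current = 0.5
- to get the next value: draw a “proposal” from g
- keep it with probability min(1, f(proposal)/f(current)), else keep current
This acceptance ratio is correct when g is a symmetric or uniform proposal; a general independent proposal needs the ratio [f(proposal)/g(proposal)] / [f(current)/g(current)].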
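The steps above can be sketched in base R as follows. The target f(x) = 2x on (0, 1) and the uniform proposal g are assumptions made for illustration; with a uniform g the ratio f(proposal)/f(current) is the full acceptance ratio.

```r
set.seed(3)
f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)   # assumed target density

n_iter <- 20000                          # chain length (arbitrary)
draws <- numeric(n_iter)
current <- 0.5                           # starting value from the slide

for (i in seq_len(n_iter)) {
  proposal <- runif(1)                   # draw a proposal from g = unif(0, 1)
  # runif(1) < ratio accepts with probability min(1, ratio)
  if (runif(1) < f(proposal) / f(current)) current <- proposal
  draws[i] <- current                    # a rejected proposal repeats the current value
}
mean(draws)                              # the target mean is 2/3
```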
7 The Goal of Inference
Make inferences about unknown quantities using available information.
- Inference: make probability statements
- Unknowns: parameters, functions of parameters, states or latent variables, “future” outcomes, outcomes conditional on an action
- Information:
  - data-based
  - non data-based: theories of behavior; subjective views; mechanism; parameters are finite or in some range
8 Bayes theorem
p(θ|D) ∝ p(D|θ) p(θ)
Posterior ∝ “Likelihood” × Prior
Modern Bayesian computing: simulation methods for generating draws from the posterior distribution p(θ|D).
9 Summarizing the posterior
Output from Bayesian Inference: a possibly high dimensional distribution
Summarize this object via simulation:
- marginal distributions of θ and of functions h(θ)
- don’t just compute the posterior mean and variance
Contrast with Sampling Theory:
- point est/standard error
- summary of irrelevant dist
- bad summary (normal)
- limitations of asymptotics
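10 Metropolis
- Start somewhere with θ_current
- To get the next value, generate a proposal θ_proposal
- Accept with “probability” min{1, [p(D|θ_proposal) p(θ_proposal)] / [p(D|θ_current) p(θ_current)]} (for a symmetric proposal), else keep current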
11 Example
Believe these measurements (D) come from N(μ,1).
Prior for μ? p(μ) = 2μ on 0 < μ < 1
12 Example continued
p(D|μ)? y_1,…,y_10
switch to R…
other priors? unif(0,1), norm(0,1), norm(0,100)
generating good candidates?
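One way the example might look in R is sketched below. The slide's measurements are not reproduced in this text, so `y` is simulated here as a stand-in; the random-walk candidate and its step size of 0.3 are also assumptions. The prior p(μ) = 2μ is taken to have support (0, 1), and a grid approximation of the posterior mean serves as a cross-check.

```r
set.seed(4)
y <- rnorm(10, mean = 0.7, sd = 1)       # hypothetical stand-in for y_1,...,y_10

# Log posterior up to a constant: N(mu, 1) likelihood times prior 2*mu on (0, 1)
log_post <- function(mu) {
  if (mu <= 0 || mu >= 1) return(-Inf)   # outside the prior support
  sum(dnorm(y, mean = mu, sd = 1, log = TRUE)) + log(2 * mu)
}

n_iter <- 20000
draws <- numeric(n_iter)
current <- 0.5
for (i in seq_len(n_iter)) {
  proposal <- current + rnorm(1, 0, 0.3) # symmetric random-walk candidate (assumed)
  if (log(runif(1)) < log_post(proposal) - log_post(current)) current <- proposal
  draws[i] <- current
}

# Cross-check: posterior mean from a fine grid over (0, 1)
grid <- seq(0.0005, 0.9995, by = 0.001)
dens <- exp(sapply(grid, log_post))
grid_mean <- sum(grid * dens) / sum(dens)
c(metropolis = mean(draws), grid = grid_mean)
```

Swapping in another prior only changes the `log(2 * mu)` term and the support check; the step size of the candidate controls how often proposals are accepted.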
13 Prediction
See D, compute the “Predictive Distribution” of a future observable ỹ:
p(ỹ|D) = ∫ p(ỹ|θ) p(θ|D) dθ
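The predictive distribution can be simulated by drawing a parameter from the posterior and then drawing a future observable given that parameter. The sketch below assumes the N(μ, 1) model with a N(0, 1) prior, for which the posterior is normal in closed form; the data are simulated stand-ins, not the slide's measurements.

```r
set.seed(5)
y <- rnorm(10, mean = 0.7, sd = 1)       # hypothetical data
n <- length(y)

# Conjugate update for mu: prior precision 1 plus data precision n
post_var  <- 1 / (n + 1)
post_mean <- sum(y) / (n + 1)

mu_draws <- rnorm(50000, post_mean, sqrt(post_var))    # draws from p(mu | D)
y_future <- rnorm(50000, mean = mu_draws, sd = 1)      # draws from p(y_future | mu)

# The predictive is N(post_mean, 1 + post_var): wider than the sampling model
c(mean = mean(y_future), var = var(y_future), theory_var = 1 + post_var)
```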
14 Bayes/Classical Estimators
- As the sample grows, the prior washes out: it is locally uniform over the region where the likelihood is concentrated.
- Bayes is consistent unless you have a dogmatic prior.
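A small numerical sketch of the prior washing out, assuming the N(μ, 1) model with a normal prior centred at 0. The sample mean `ybar` is a hypothetical value held fixed as n grows, and the two prior variances (1 and 100) are read as variances, which is an assumption about the notation norm(0,1) and norm(0,100).

```r
ybar <- 0.7                               # hypothetical sample mean, held fixed

# Posterior mean of mu under a N(0, prior_var) prior and N(mu, 1) data
post_mean <- function(n, prior_var) n * ybar / (n + 1 / prior_var)

n_grid  <- c(10, 100, 1000)
tight   <- post_mean(n_grid, prior_var = 1)
diffuse <- post_mean(n_grid, prior_var = 100)

# Both posterior means approach the MLE (ybar) as n grows
print(data.frame(n = n_grid, tight_prior = tight, diffuse_prior = diffuse, mle = ybar))
```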
15 Bayesian Computations
Before simulation methods, Bayesians used posterior expectations of various functions as a summary of the posterior:
E[h(θ)|D] = ∫ h(θ) p(θ|D) dθ
If p(θ|D) is in a convenient form (e.g. normal), then I might be able to compute this for some h.
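With draws from the posterior, such an expectation is just a sample average. The sketch below assumes a normal posterior with hypothetical mean `m` and standard deviation `s`, and takes h(θ) = exp(θ) because its expectation has a known closed form to compare against.

```r
set.seed(6)
m <- 0.5; s <- 0.3                        # hypothetical posterior mean and sd
theta <- rnorm(100000, m, s)              # draws from the assumed posterior

h <- function(t) exp(t)                   # function of interest (illustrative)
mc_estimate <- mean(h(theta))             # Monte Carlo estimate of the posterior expectation
exact <- exp(m + s^2 / 2)                 # closed form for a normal posterior

c(monte_carlo = mc_estimate, exact = exact)
```

The same average works for any h and any posterior we can draw from, which is what makes simulation more general than the analytic route.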
16 Conjugate Families
Models with convenient analytic properties almost invariably come from conjugate families.
Why do I care now?
- conjugate models are used as building blocks
- build intuition re: functions of Bayesian inference
Definition: A prior is conjugate to a likelihood if the posterior is in the same class of distributions as the prior.
Basically, conjugate priors are like the posterior from some imaginary dataset with a diffuse prior.
17 Beta-Binomial model Need a prior!
18 Beta distribution
19 Posterior
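The slide's formulas are not reproduced in this text, so the sketch below relies on the standard textbook result: with a Beta(a, b) prior and y successes in n binomial trials, the posterior is Beta(a + y, b + n - y). The values of `a`, `b`, `n` and `y` are hypothetical. A numerical integration of likelihood times prior is included as a check.

```r
a <- 2; b <- 2                            # hypothetical prior parameters
n <- 20; y <- 14                          # hypothetical data: 14 successes in 20 trials

post_a <- a + y                           # conjugate update
post_b <- b + n - y
post_mean <- post_a / (post_a + post_b)

# Check: normalise likelihood x prior numerically and compare the means
unnorm <- function(t) dbinom(y, n, t) * dbeta(t, a, b)
const <- integrate(unnorm, 0, 1)$value
num_mean <- integrate(function(t) t * unnorm(t), 0, 1)$value / const

c(conjugate = post_mean, numerical = num_mean)
```

Read this way, the prior acts like a + b imaginary trials with a successes, which is the "imaginary dataset" interpretation of a conjugate prior.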
20 Prediction
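Prediction in the Beta-Binomial model can be sketched by simulation: draw θ from the Beta posterior, then draw the future count given θ. The posterior parameters and the number of future trials `m` below are hypothetical values, not taken from the slide.

```r
set.seed(7)
post_a <- 16; post_b <- 8                 # hypothetical Beta posterior parameters
m <- 5                                    # number of future trials (assumed)

theta <- rbeta(50000, post_a, post_b)     # draws from the posterior of theta
y_future <- rbinom(50000, size = m, prob = theta)   # future counts given theta

# Predictive probabilities for 0..m future successes
pred <- table(factor(y_future, levels = 0:m)) / length(y_future)
print(round(pred, 3))
mean(y_future)                            # predictive mean is m * post_a / (post_a + post_b)
```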
21 Regression model
22 Bayesian Regression
Prior: p(β, σ²) = p(β|σ²) p(σ²), with an Inverted Chi-Square prior on σ²
Interpretation as from another dataset.
Draw from prior?
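Drawing from the prior might look like the sketch below. The slide's formulas are not reproduced in this text, so a common natural conjugate parametrisation is assumed: σ² ~ ν0 s0² / χ²(ν0) and β | σ² ~ N(betabar, σ² A⁻¹). The names `nu0`, `s0sq`, `betabar` and `A` and their values are all hypothetical.

```r
set.seed(8)
k <- 2                                    # number of regression coefficients (assumed)
betabar <- rep(0, k)                      # prior mean of beta (assumed)
A <- diag(k)                              # prior precision scale for beta (assumed)
nu0 <- 10; s0sq <- 1                      # prior degrees of freedom and scale (assumed)

n_draws <- 50000
# Step 1: sigma^2 from the scaled inverted chi-square
sigma2 <- nu0 * s0sq / rchisq(n_draws, df = nu0)

# Step 2: beta | sigma^2 ~ N(betabar, sigma^2 * A^{-1})
U <- chol(solve(A))                       # t(U) %*% U equals A^{-1}
Z <- matrix(rnorm(n_draws * k), n_draws, k)
beta <- sweep(sqrt(sigma2) * (Z %*% U), 2, betabar, "+")   # row i scaled by sqrt(sigma2[i])

c(mean_sigma2 = mean(sigma2), theory = nu0 * s0sq / (nu0 - 2))
```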
23 Posterior
24 Combining quadratic forms
25 Posterior
26 IID Simulations
Scheme: [y|X, β, σ²] [β|σ²] [σ²] ∝ [β, σ²|y, X] = [σ²|y, X] [β|σ², y, X]
1) Draw [σ²|y, X]
2) Draw [β|σ², y, X]
3) Repeat
27 IID Simulator, cont.
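A minimal sketch of an iid simulator for the regression posterior, under stated assumptions. It uses the standard natural conjugate result for the prior σ² ~ ν0 s0² / χ²(ν0), β | σ² ~ N(betabar, σ² A⁻¹); this parametrisation, the simulated data, and every name and value below are assumptions, since the slide's formulas are not reproduced in this text.

```r
set.seed(9)
n <- 200
X <- cbind(1, rnorm(n))                   # hypothetical design: intercept plus one regressor
beta_true <- c(1, 2)
y <- as.vector(X %*% beta_true + rnorm(n))
k <- ncol(X)

betabar <- rep(0, k); A <- 0.01 * diag(k) # assumed diffuse prior on beta
nu0 <- 3; s0sq <- 1                       # assumed prior on sigma^2

# Posterior quantities under the natural conjugate prior
XtX_A <- crossprod(X) + A
betatilde <- as.vector(solve(XtX_A, crossprod(X, y) + A %*% betabar))
resid <- y - as.vector(X %*% betatilde)
d <- betatilde - betabar
nu1 <- nu0 + n
s1sq <- as.numeric(nu0 * s0sq + sum(resid^2) + t(d) %*% A %*% d) / nu1

R <- 2000                                 # number of iid draws
sigma2 <- nu1 * s1sq / rchisq(R, df = nu1)               # 1) draw sigma^2 | y, X
U <- chol(solve(XtX_A))                                  # t(U) %*% U = (X'X + A)^{-1}
Z <- matrix(rnorm(R * k), R, k)
beta <- sweep(sqrt(sigma2) * (Z %*% U), 2, betatilde, "+")   # 2) draw beta | sigma^2, y, X

colMeans(beta)                            # posterior means of the coefficients
```

Because each draw is produced independently of the others, the output needs no burn-in or convergence checks, unlike a Metropolis chain.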