Bayesian Data Analysis in R

Welcome

Overview
Bayesian statistics
The BayesFactor package: ANOVAs, linear regression models
Mixed effects models with brms
Specifying custom models with JAGS
FACULTY RESEARCH OFFICE | HUMAN SCIENCES

Likelihoods and Bayes Factors

Using data to revise beliefs
P(H): The prior. My belief in the hypothesis before seeing the data.
P(D|H): The likelihood. The probability of observing the data if my hypothesis is true.
P(H|D): The posterior. My revised belief in the hypothesis after seeing the data.
P(D): The marginal likelihood. The probability of observing the data, regardless of which hypothesis is true.

P(H|D) = P(D|H) × P(H) / P(D)
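A quick numeric check of the rule above, in base R (the prior of 0.2 and the two likelihoods are made-up numbers for illustration):

```r
prior  <- 0.2    # P(H): belief before seeing the data
lik_h  <- 0.9    # P(D|H): probability of the data if H is true
lik_nh <- 0.3    # P(D|not H): probability of the data otherwise

# P(D): marginal likelihood, averaging over both hypotheses
marginal <- lik_h * prior + lik_nh * (1 - prior)

# Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D)
posterior <- lik_h * prior / marginal
round(posterior, 3)   # 0.429 -- the data roughly double the belief in H
```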

Using likelihood ratios
Comparing the evidence for two models
The likelihood quantifies the evidence the observed data provide for a hypothesis.
Compare the likelihoods of two competing hypotheses to determine which one is more plausible given the observed data.
A likelihood value is meaningless by itself; only ratios of likelihoods are interpretable.
https://humburg.shinyapps.io/likelihood/
Figure based on https://alexanderetz.com/2015/04/15/understanding-bayes-a-look-at-the-likelihood/
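The comparison takes one line of base R; the data (7 heads in 10 tosses) and the two point hypotheses are assumptions chosen for illustration:

```r
heads <- 7; tosses <- 10

# Likelihood of the data under each point hypothesis about P(heads)
lik_fair   <- dbinom(heads, tosses, prob = 0.5)
lik_biased <- dbinom(heads, tosses, prob = 0.7)

# Each likelihood is meaningless alone, but the ratio is interpretable:
# the data are about 2.3 times more probable if P(heads) = 0.7
lik_biased / lik_fair
```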

Updating priors with likelihoods
The prior encodes the initial belief in possible parameter values (e.g. the probability of heads).
The likelihood quantifies the plausibility of different hypotheses in light of the available data.
The two combine to produce an updated set of beliefs: the posterior.
https://humburg.shinyapps.io/updating_prior/
Figure based on https://alexanderetz.com/2015/04/15/understanding-bayes-a-look-at-the-likelihood/
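The update can be sketched with a grid approximation in base R; the Beta(2, 2) prior and the 7-heads-in-10-tosses data are assumptions for illustration:

```r
p <- seq(0, 1, by = 0.01)        # grid of possible values of P(heads)
prior <- dbeta(p, 2, 2)          # mild prior centred on a fair coin
likelihood <- dbinom(7, 10, p)   # plausibility of each value given the data

# Posterior is proportional to prior * likelihood; normalise over the grid
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

p[which.max(posterior)]          # posterior mode: pulled from 0.5 toward 0.7
```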

What about the marginal likelihood?
Can be difficult to compute: it requires integrating over all possible hypotheses, weighted by their priors.
It is a normalizing constant that can often be avoided.

Bayes Factor
The likelihood ratio compares two point hypotheses. What if we have more complex hypotheses?
The coin is fair vs. it isn’t fair: if the coin is biased, how likely are all the possible probabilities of heads?
Compare the likelihood that the coin is fair to the average of the likelihoods for all possible values of P(heads), weighted by their priors.
https://humburg.shinyapps.io/bayes_factor/
Figure taken from https://alexanderetz.com/2015/08/09/understanding-bayes-visualization-of-bf/
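This weighted average can be computed directly with base R's integrate(); the 7-heads-in-10-tosses data and the flat Beta(1, 1) prior under the alternative are assumptions for illustration:

```r
heads <- 7; tosses <- 10

# Marginal likelihood under H0: the coin is fair (a point hypothesis)
m0 <- dbinom(heads, tosses, prob = 0.5)

# Marginal likelihood under H1: average the likelihood over all possible
# values of P(heads), weighted by a flat Beta(1, 1) prior
m1 <- integrate(function(p) dbinom(heads, tosses, p) * dbeta(p, 1, 1),
                lower = 0, upper = 1)$value

bf01 <- m0 / m1
bf01   # about 1.3: weak evidence in favour of the fair coin
```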

Bayes Factor

P(H1|D) / P(H0|D) = [P(H1) / P(H0)] × [P(D|H1) / P(D|H0)]
posterior odds = prior odds × Bayes factor

Bayes factors quantify the relative weight of evidence.
Can be used to decide in favour of one hypothesis over another based on the evidence provided by the data.
As data accumulate, the Bayes factor converges to either 0 or ∞, depending on which hypothesis is true. Beware small samples!

Bayes Factor (H1/H0)   Strength of evidence
< 1                    Evidence favours the null
1 – 3                  Barely worth mentioning
3 – 10                 Substantial
10 – 30                Strong
30 – 100               Very strong
> 100                  Decisive

Does this remind you of anything?

Beyond Bayes Factors
Point estimates and credible intervals
Bayes factors summarise the evidence for two competing models, but the posterior holds information about plausible parameter values.
The posterior can tell us which effects are credible after considering the data.
Can construct credible intervals that include 95% of the probability mass such that all values inside the interval have higher density than those outside: the highest-density interval (HDI).
The mode of the posterior provides a point estimate: the maximum a posteriori (MAP) estimate.
Kruschke, J. K. (2014). Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd Edition. Academic Press / Elsevier.
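The HDInterval package used later provides hdi() for this; as a sketch of the idea, the shortest 95% interval can also be found by hand from posterior draws (the Beta(9, 5) posterior below is an assumed example):

```r
set.seed(1)
draws <- rbeta(1e5, 9, 5)   # e.g. posterior for P(heads) after 7/10 heads

# Shortest interval containing 95% of the draws
hdi_simple <- function(x, mass = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(mass * n)                # draws each interval must contain
  widths <- x[k:n] - x[1:(n - k + 1)]   # width of every candidate interval
  i <- which.min(widths)
  c(lower = x[i], upper = x[i + k - 1])
}
hdi_simple(draws)

# MAP estimate: mode of a kernel density fit to the draws
d <- density(draws)
d$x[which.max(d$y)]
```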

The BayesFactor package Bayesian Hypothesis Testing with R

Before we get started
Ensure you have these R packages installed: BayesFactor, brms, rjags, HDInterval
All datasets are available from https://humburg.github.io/bayesian-data-analysis-in-r/data.html

BayesFactor Overview
Provides a convenient interface for commonly used models
Bayesian versions of typical frequentist tests
Computes Bayes factors
Can sample from the posterior to obtain parameter estimates

> library(BayesFactor)
> bf <- ttestBF(...)
> bf <- anovaBF(...)
> bf <- lmBF(...)
> summary(posterior(bf))

Bayesian ANOVA
General Concepts
Model observation y_i as normally distributed around mean μ_i with standard deviation σ_y:
μ_i = β_0 + Σ_j β_j x_ij
Hierarchical prior structure.
This model mimics frequentist ANOVA in assuming equal variances; different variances could be allowed by modifying the prior structure.
The distribution of y_i does not have to be normal.
Copyright © Kruschke, J. K. (2014). Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd Edition. Academic Press / Elsevier.
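Simulating from this generative model in base R makes the structure concrete (the intercept, group effects, and σ_y below are arbitrary, assumed values):

```r
set.seed(42)
beta0 <- 10
beta  <- c(A = 0, B = 1.5, C = -1)   # hypothetical group effects

# 90 observations assigned to the three groups at random
group <- sample(names(beta), size = 90, replace = TRUE)

mu <- beta0 + beta[group]                   # mu_i = beta0 + sum_j beta_j * x_ij
y  <- rnorm(length(mu), mean = mu, sd = 2)  # y_i ~ Normal(mu_i, sigma_y)

tapply(y, group, mean)   # group means scatter around 10, 11.5 and 9
```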

BayesFactor ANOVA
JZS model
Standard approach developed by Jeffreys, Zellner, and Siow.
Allows for heterogeneous variances.
The value of g can be estimated from the data.
There is no analytical solution for the Bayes factor, but numerical approximation is relatively easy.
Example data: https://humburg.github.io/bayesian-data-analysis-in-r/data/anova_data.csv

Let’s try this

Linear Regression
General Concepts
Model observation y_i as distributed around mean μ_i with standard deviation σ_y.
Can use a t distribution to increase robustness against outliers.
Uses a hyper-prior on the slope coefficients.

Linear Regression with BayesFactor
JZS
Standard approach developed by Jeffreys, Zellner, and Siow.
Models the variance of the slope coefficients as inverse gamma.
Example data: https://humburg.github.io/bayesian-data-analysis-in-r/data/regression_data.csv

Let’s try this

Exercise: Weight and Height
Weight (lbs) and height (in) of 14–20 year olds.
Take a look at the data.
Use BayesFactor to model weight as a function of height, age, and gender.
Which model seems reasonable? What are the effects of the variables you included in the model?
Dataset: https://humburg.github.io/bayesian-data-analysis-in-r/data/weight_perception.csv

The brms package Bayesian mixed effect models

brms Overview
Designed to fit mixed models.
Supports a variety of response distributions.
Uses formula syntax similar to lme4.
Models require compilation (slow); repeated compilation can be avoided with update().
The sampling required to approximate the posterior may be slow. Use multiple cores as appropriate.
https://github.com/paul-buerkner/brms
https://xkcd.com/303/

Accuracy & Reaction Time
An Example
Simulated data of reaction time and accuracy: 16 participants provide 100 responses each, single condition.
Can model this as mixed effects logistic regression. Could fit this with lme4.
Dataset: https://humburg.github.io/bayesian-data-analysis-in-r/data/mixed_sim1.csv
Resp ~ RT + (1 | Sbj)
or
Resp ~ RT + (RT | Sbj)
D.J. Davidson, A.E. Martin, Modeling accuracy as a function of response time with the generalized linear mixed effects model, Acta Psychologica, 144:1 (2013).

Exercise: Accuracy & Reaction Time
Simulated data of reaction time and accuracy: 16 participants provide 100 responses each, two conditions.
Can model this as mixed effects logistic regression. Could fit this with lme4.
Dataset: https://humburg.github.io/bayesian-data-analysis-in-r/data/mixed_sim2.csv
Resp ~ RT + Cond + (1 | Sbj)
or
Resp ~ RT + Cond + (1 | Sbj) + (0 + RT | Sbj) + (0 + Cond | Sbj)
Resp ~ RT * Cond + (1 | Sbj) + (0 + RT + Cond | Sbj)
Resp ~ RT * Cond + (1 + RT * Cond | Sbj)
D.J. Davidson, A.E. Martin, Modeling accuracy as a function of response time with the generalized linear mixed effects model, Acta Psychologica, 144:1 (2013).

Specifying your own models Describing models with JAGS

What if you want a different model?
You don’t have to use the pre-defined classes of models.
Can use JAGS or Stan to specify models directly.
Lots of flexibility, though it may take some getting used to.
JAGS uses a language very similar to R.
Can interact with JAGS via the rjags package.

Example: linear regression with JAGS
Use the same data as before: https://humburg.github.io/bayesian-data-analysis-in-r/data/regression_data.csv
Fit a simple model including only the treatment effect.
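A JAGS specification for such a model might look like the sketch below; the names y, treatment, and N are assumptions and must match the data list passed to rjags, and note that JAGS parameterises dnorm by precision rather than variance:

```
model {
  for (i in 1:N) {
    mu[i] <- beta0 + beta1 * treatment[i]
    y[i] ~ dnorm(mu[i], tau)      # dnorm(mean, precision) in JAGS
  }
  beta0 ~ dnorm(0, 1.0E-4)        # vague priors on the coefficients
  beta1 ~ dnorm(0, 1.0E-4)
  tau   ~ dgamma(0.01, 0.01)      # prior on the residual precision
  sigma <- 1 / sqrt(tau)          # monitor the standard deviation instead
}
```

Such a model string would typically be passed to rjags::jags.model() together with a data list and then sampled with coda.samples(); treat this as a sketch, not the workshop's exact model.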

Exercise: Linear regression with JAGS
Continuing from the previous example, add the covariate to the model.
Start with separate priors for each coefficient.
What happens if you instead use a single prior whose parameters themselves have priors?

Where to go from here?

Next steps
Try to apply this to your own data. Simulations can be instructive.
Further reading:
More about brms: https://paul-buerkner.github.io/blog/brms-blogposts/
Andrew Gelman’s blog: https://andrewgelman.com/
John Kruschke’s blog: http://doingbayesiandataanalysis.blogspot.com/

May all your posteriors be large.