PSY 626: Bayesian Statistics for Psychological Science
Bayesian style ANOVA
Greg Francis
Fall 2018, Purdue University
11/18/2018
Smiles and Leniency
LaFrance & Hecht (1995): a vignette describes a person committing a minor infraction. A photo shows the person with one of several smile types (or a neutral expression). The question is how lenient subjects are in punishing the person in the vignette. Each subject contributes one score.
Standard analysis
Summary statistics, one-way ANOVA
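A minimal sketch of this standard analysis in base R, assuming the SmilesLeniency.csv file that is loaded later in these slides:

# A sketch of the standard analysis (assumes the SmilesLeniency.csv
# file that is loaded later in these slides)
SLdata <- read.csv(file = "SmilesLeniency.csv", header = TRUE, stringsAsFactors = TRUE)

# Summary statistics: mean, SD, and count of Leniency per smile type
aggregate(Leniency ~ SmileType, data = SLdata,
          FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))

# One-way ANOVA
summary(aov(Leniency ~ SmileType, data = SLdata))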
Standard analysis
Contrasts comparing the Neutral condition to each of the other conditions
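One way to get these contrasts (a sketch, not necessarily the method used in the original analysis): relevel a copy of the data so Neutral is the reference level, which makes each regression coefficient a Neutral-versus-other comparison.

# Neutral-versus-each-condition contrasts via releveling; each
# slope then compares one smile type against Neutral
SLdata2 <- SLdata
SLdata2$SmileType <- relevel(SLdata2$SmileType, ref = "Neutral")
summary(lm(Leniency ~ SmileType, data = SLdata2))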
Bayesian variation of ANOVA
We start with brms's default priors (uniform and improper for the regression slopes).
Follow along by downloading “SmilesLeniency.csv” and “SmilesLeniency1.R” from the class web site.

# load data file
SLdata <- read.csv(file = "SmilesLeniency.csv", header = TRUE, stringsAsFactors = TRUE)

> head(SLdata)
  SmileType Leniency
1     False      2.5
2     False      5.5
3     False      6.5
4     False      3.5
5     False      3.0
6     False      3.5
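Since each subject contributes one score, a quick check of the per-condition subject counts confirms the design:

# How many subjects per smile type?
table(SLdata$SmileType)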
Define the model
ANOVA is just linear regression, which makes it a bit awkward to interpret the results:
The first level of SmileType (False) becomes the intercept
Every other condition is a slope (deviation) relative to that intercept

library(brms)
model1 = brm(Leniency ~ SmileType, data = SLdata,
             iter = 2000, warmup = 200, chains = 3, thin = 2)
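brms can report the default priors it would assign to this formula before fitting; a sketch using get_prior():

# Inspect the default priors brms would use for this model
# (a flat, improper prior on the slopes, consistent with the
# "uniform, improper" description above)
get_prior(Leniency ~ SmileType, data = SLdata)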
Output
print(summary(model1))
Prints a table of estimates for the Intercept (the False SmileType) and deviations for the other conditions.

 Family: gaussian
  Links: mu = identity; sigma = identity
Formula: Leniency ~ SmileType
   Data: SLdata (Number of observations: 136)
Samples: 3 chains, each with iter = 2000; warmup = 200; thin = 2;
         total post-warmup samples = 2700

Population-Level Effects:
                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept              5.37      0.28     4.81     5.95       2470 1.00
SmileTypeFelt         -0.46      0.40    -1.24     0.30       2442 1.00
SmileTypeMiserable    -0.46      0.40    -1.25     0.28       2464 1.00
SmileTypeNeutral      -1.24      0.40    -2.01    -0.41       2576 1.00

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     1.64      0.10     1.46     1.87       2700 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
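Because False is the reference level, each condition mean is the Intercept plus that condition's coefficient. A sketch using fixef() to recover the four estimated means:

# Recover the estimated condition means from the coefficients
fe <- fixef(model1)[, "Estimate"]
c(False     = unname(fe["Intercept"]),
  Felt      = unname(fe["Intercept"] + fe["SmileTypeFelt"]),
  Miserable = unname(fe["Intercept"] + fe["SmileTypeMiserable"]),
  Neutral   = unname(fe["Intercept"] + fe["SmileTypeNeutral"]))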
Output
# Check on model convergence and posteriors
plot(model1)
Looks good!
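Beyond the visual check, numeric diagnostics are available (a sketch; rhat() and neff_ratio() work on brmsfit objects once brms is loaded):

# Numeric convergence checks
rhat(model1)        # should all be close to 1.00
neff_ratio(model1)  # effective sample size as a fraction of total draws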
Output
# Plot model fit to data
# (in newer brms versions, marginal_effects() is called conditional_effects())
dev.new()
plot(marginal_effects(model1), points = TRUE)
Like Tukey
Look at differences between conditions.

# Compare each condition to every other condition (similar to Tukey)
mcmc = model1
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(x = levels(SLdata$SmileType))

# A Tukey's contrast matrix
library(multcomp)
# table(newdata$x) gets the number of replicates of each level
tuk.mat <- contrMat(n = table(newdata$x), type = "Tukey")
Xmat <- model.matrix(~x, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

# plot posteriors of differences
library(bayesplot)
dev.new()
mcmc_areas(coefs %*% t(pairwise.mat))
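brms also offers hypothesis() for comparing parameters directly, which is an alternative route to some of these pairwise differences (a sketch):

# A directional comparison of Neutral vs. Felt within brms itself
hypothesis(model1, "SmileTypeNeutral - SmileTypeFelt < 0")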
HPDI
You might be interested in the Highest Posterior Density Interval (HPDI) for the differences of means.

# Generate HPDIs for differences
library(broom)   # provides tidyMCMC() (in newer versions, broom.mixed)
comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE,
                 conf.method = "HPDinterval")
print(comps)

                 term     estimate std.error   conf.low   conf.high
1        Felt - False -0.458715814 0.3984868 -1.1789537  0.35172475
2   Miserable - False -0.462241088 0.3998117 -1.2402750  0.29442586
3     Neutral - False -1.239229116 0.4043405 -2.0430631 -0.47332554
4    Miserable - Felt -0.003525275 0.3985073 -0.7327741  0.78471045
5      Neutral - Felt -0.780513302 0.4038958 -1.5626832 -0.01610910
6 Neutral - Miserable -0.776988028 0.4006923 -1.5729656 -0.03355703
Plot HPDI
I find I have to run this directly from the command line rather than through the file.

library(ggplot2)
dev.new()
ggplot(comps, aes(y = estimate, x = term)) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_y_continuous("Effect size") +
  scale_x_discrete("") +
  coord_flip() +
  theme_classic()
ANOVA versus Bayesian
Note, we get the same pattern of results either way.
It looks like the Neutral condition is different from each of the other conditions.
E.g., it is tempting to note that the HPDI of the Neutral - Felt difference does not include 0. Significance?! Wait, we can do better.
Don't we worry about multiple comparisons? Not really.
The Bayesian analysis is not about making decisions; it is about estimating parameter values, given the prior and the data.
You might make Type I errors, but you cannot control the Type I error rate anyhow.
Posteriors
You can compute things that are not possible with the standard ANOVA, e.g., the probability that the mean Leniency rating for a False smile is bigger than the mean Leniency rating for a Miserable smile.

# Extract posteriors from samples
post <- posterior_samples(model1)

# Compute probability that mean leniency rating is bigger
# for a False smile than for a Miserable smile
FalseLeniency <- post$b_Intercept
MiserableLeniency <- post$b_Intercept + post$b_SmileTypeMiserable
difLeniency <- FalseLeniency - MiserableLeniency
length(difLeniency[difLeniency > 0]) / length(difLeniency)
[1] 0.8703704
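The same proportion can be computed more idiomatically, since mean() of a logical vector is the proportion of TRUEs:

# Equivalent one-liner
mean(difLeniency > 0)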
Using priors
model2 = brm(Leniency ~ SmileType, data = SLdata,
             iter = 2000, warmup = 200, chains = 3, thin = 2,
             prior = c(prior(normal(0, 10), class = "Intercept"),
                       prior(normal(0, 10), class = "b"),
                       prior(cauchy(0, 5), class = "sigma")))

 Family: gaussian
  Links: mu = identity; sigma = identity
Formula: Leniency ~ SmileType
   Data: SLdata (Number of observations: 136)
Samples: 3 chains, each with iter = 2000; warmup = 200; thin = 2;
         total post-warmup samples = 2700

Population-Level Effects:
                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept              5.36      0.28     4.83     5.90       2585 1.00
SmileTypeFelt         -0.45      0.40    -1.21     0.33       2530 1.00
SmileTypeMiserable    -0.45      0.40    -1.24     0.31       2183 1.00
SmileTypeNeutral      -1.24      0.40    -2.01    -0.43       2423 1.00

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     1.64      0.10     1.45     1.86       2561 1.00
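To confirm which priors were actually used in a fit, prior_summary() reports them (a sketch):

# Report the priors used by model2
prior_summary(model2)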
Using priors
The same model2 call as above: normal(0, 10) priors on the Intercept and slopes, cauchy(0, 5) on sigma.
Class activity: try to “break” the analysis by using crazy priors.
Conclusions
Bayesian ANOVA
No multiple comparisons (because there are no decisions, yet)
Priors