PSY 626: Bayesian Statistics for Psychological Science


1 PSY 626: Bayesian Statistics for Psychological Science
Bayes Factors and more
Greg Francis
Fall 2016, Purdue University

2 Bayes Factor
The ratio of the likelihood of the data under the null model to its likelihood under the alternative
Nothing special about the null: a Bayes factor can compare any two models
Each model's likelihood is averaged over the parameter values allowed by that model, weighted by its prior distribution
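As a worked illustration of that averaging (my own sketch, with made-up counts), the marginal likelihood of a point-null binomial model is just the likelihood at p = 0.5, while the alternative model averages the likelihood over a uniform prior on p:

# Hypothetical data: 62 successes in 100 trials
k <- 62; n <- 100
# Null model: p fixed at 0.5, so no averaging is needed
mlik_null <- dbinom(k, n, 0.5)
# Alternative model: average the likelihood over a uniform prior on p
mlik_alt <- integrate(function(p) dbinom(k, n, p) * dunif(p), 0, 1)$value
# Bayes factor in favor of the alternative over the null
BF10 <- mlik_alt / mlik_null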

3 What does it mean? Guidelines
BF 1 – 3: Anecdotal evidence
BF 3 – 10: Substantial evidence
BF 10 – 30: Strong evidence
BF 30 – 100: Very strong evidence
BF > 100: Decisive evidence
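A tiny helper (my own addition, just encoding the table above; it assumes BF >= 1, so invert a BF below 1 before labeling it):

# Map a Bayes factor to its verbal evidence label
bf_label <- function(bf) {
  cuts <- c(1, 3, 10, 30, 100, Inf)
  labels <- c("Anecdotal", "Substantial", "Strong", "Very strong", "Decisive")
  labels[findInterval(bf, cuts)]
}
bf_label(4.2)   # "Substantial"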

4 Proportions
We have already seen how to do binomial regression with the rethinking library and Stan
In the NHST framework, proportions are often compared with a chi-squared test
There is a Bayesian equivalent implemented in the BayesFactor library
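For reference, the classical chi-squared comparison of two proportions in base R looks like this (the counts here are hypothetical):

# x = number of "successes" in each group, n = group sizes
prop.test(x = c(23, 13), n = c(32, 32), correct = FALSE)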

5 Decision making
People seem to base their decisions on more than just the value of the final (expected) outcome state
Statement 1: Imagine the country is preparing for the outbreak of an unusual disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences are as follows:
If Program A is adopted, 200 people will be saved.
If Program B is adopted, there is a one third probability that 600 people will be saved and a two thirds probability that no people will be saved.

6 Decision making
People seem to base their decisions on more than just the value of the final (expected) outcome state
Statement 2: Imagine the country is preparing for the outbreak of an unusual disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences are as follows:
If Program A is adopted, 400 people will die.
If Program B is adopted, there is a one third probability that nobody will die and a two thirds probability that 600 people will die.

7 Framing matters
Statement 1 (n=32): Proportion choosing A = 0.719
Big differences, even though the expected outcomes are the same!

8 NHST analysis
Create a contingency table:

> DMdata <- read.csv(file="DecisionMaking.csv", header=TRUE, stringsAsFactors=FALSE)
>
> # Make a contingency table for BayesFactor
> DMtable <- table(c("Program A", "Program B"), c("Statement1", "Statement2"))
> # Assign counts to table
> DMtable[1,1] = sum(DMdata$Problem1[DMdata$StatementType==1])
> DMtable[2,1] = length(DMdata$Problem1[DMdata$StatementType==1]) - sum(DMdata$Problem1[DMdata$StatementType==1])
> DMtable[1,2] = sum(DMdata$Problem1[DMdata$StatementType==2])
> DMtable[2,2] = length(DMdata$Problem1[DMdata$StatementType==2]) - sum(DMdata$Problem1[DMdata$StatementType==2])
> print(DMtable)
          Statement1 Statement2
Program A
Program B
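The same table can be built a little more directly; this sketch assumes, as the code above does, that Problem1 is coded 1 when a participant chose Program A:

# Each column s gives (chose A, chose B) counts for statement type s
counts <- sapply(1:2, function(s) {
  choseA <- sum(DMdata$Problem1[DMdata$StatementType == s])
  c(choseA, sum(DMdata$StatementType == s) - choseA)
})
dimnames(counts) <- list(c("Program A", "Program B"), c("Statement1", "Statement2"))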

9 Chi-square test
freq <- chisq.test(DMtable, correct=FALSE)
print(freq)

        Pearson's Chi-squared test

data:  DMtable
X-squared = , df = 1, p-value =

10 BayesFactor library
# load the BayesFactor library
library(BayesFactor)
# run Bayesian analysis to test statement type
bf = contingencyTableBF(DMtable, sampleType = "indepMulti", fixedMargin = "cols")
print(bf)

Loading required package: coda
Loading required package: Matrix
************
Welcome to BayesFactor. If you have questions, please contact Richard Morey.
Type BFManual() to open the manual.
************

Bayes factor analysis
--------------
[1] Non-indep. (a=1) :  ±0%

Against denominator:
  Null, independence, a = 1
---
Bayes factor type: BFcontingencyTable, independent multinomial
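To pull out just the numeric Bayes factor from the bf object, use the package's extractBF() function:

# Returns a data frame; the bf column holds the Bayes factor itself
extractBF(bf)$bf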

11 Posterior Distribution
# Sample to estimate posterior distributions
chains = posterior(bf, iterations = 10000)
print(summary(chains))
plot(chains)

Iterations = 1:10000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000

1. Empirical mean and standard deviation for each variable, plus standard error of the mean:

           Mean SD Naive SE Time-series SE
pi[1,1]
pi[2,1]
pi[1,2]
pi[2,2]
pi[*,1]
pi[*,2]
omega[1,1]
omega[2,1]
omega[1,2]
omega[2,2]

2. Quantiles for each variable:

           2.5% 25% 50% 75% 97.5%
pi[1,1]
pi[2,1]
pi[1,2]
pi[2,2]
pi[*,1]
pi[*,2]
omega[1,1]
omega[2,1]
omega[1,2]
omega[2,2]

12 Rethinking: Null model
Estimate the probability of picking Program A

cleanNullData <- data.frame(Score=DMdata$Problem1)
DMNullModel1 <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data=cleanNullData, iter=1e4, warmup=2000 )

> precis(DMNullModel1)
  Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
a

> aValueNull <- coef(DMNullModel1)["a"]  # intercept on the logit scale
> cat("Null probability of choosing A=", logistic(aValueNull))
Null probability of choosing A=

13 Rethinking: Alternative model
Different probabilities for different statements

cleanAltData <- data.frame(StatementType=DMdata$StatementType, Score=DMdata$Problem1)
DMaltModel1 <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a[StatementType],
    a[StatementType] ~ dnorm(0, 10)
  ),
  data=cleanAltData, iter=1e4, warmup=2000 )

> aValues <- c()
> for(p in unique(cleanAltData$StatementType)){
+   code <- sprintf("a[%d]", p)
+   aValues <- c(aValues, coef(DMaltModel1)[code])
+ }
> cat("Alt probability of responding A: ", logistic(aValues), "\n")
Alt probability of responding A:

> precis(DMaltModel1, depth=2)
     Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
a[1]
a[2]

14 WAIC
compare(DMNullModel1, DMaltModel1)

             WAIC pWAIC dWAIC weight SE dSE
DMaltModel1                              NA
DMNullModel1

Strong support for the alternative model

Bayes factor analysis
[1] Non-indep. (a=1) :  ±0%
Against denominator:
  Null, independence, a = 1
---
Bayes factor type: BFcontingencyTable, independent multinomial
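The weight column is computed from dWAIC as an Akaike-style weight; a hand computation (with a hypothetical dWAIC vector) looks like this:

# w_i = exp(-dWAIC_i / 2) / sum_j exp(-dWAIC_j / 2)
dWAIC <- c(0, 12)   # hypothetical differences from the best model
weights <- exp(-dWAIC / 2) / sum(exp(-dWAIC / 2))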

15 Posteriors
> post <- extract.samples(DMaltModel1)
> mean(logistic(post$a[,1]))
[1]
> sd(logistic(post$a[,1]))
[1]
> mean(logistic(post$a[,2]))
[1]
> sd(logistic(post$a[,2]))
[1]
> dens(logistic(post$a[,1]))
> dens(logistic(post$a[,2]))

16 Using the posterior distribution
Any function of the posterior distribution also has a posterior distribution

diff <- logistic(post$a[,1]) - logistic(post$a[,2])
dens(diff)
> mean(diff)
[1]

What is the probability that the difference of probabilities is bigger than 0.3?

> cat("Probability that probability difference is >0.3=", length(diff[diff>0.3])/length(diff))
Probability that probability difference is >0.3=
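The same tail probability can be written more idiomatically as the mean of a logical vector:

# Proportion of posterior samples where the difference exceeds 0.3
mean(diff > 0.3)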

17 Using the posterior distribution
You can do the same thing with the posterior distribution from the BayesFactor library

> head(chains)
Markov Chain Monte Carlo (MCMC) output:
Start = 1
End = 7
Thinning interval = 1
     pi[1,1] pi[2,1] pi[1,2] pi[2,2] pi[*,1] pi[*,2] omega[1,1] omega[2,1] omega[1,2] omega[2,2]
[1,]
[2,]
[3,]
[4,]
[5,]
[6,]
[7,]

> diffBF <- chains[,7] - chains[,9]
> dens(diffBF)
> cat("Probability that probability difference is >0.3=", length(diffBF[diffBF>0.3])/length(diffBF))
Probability that probability difference is >0.3=
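Columns 7 and 9 are omega[1,1] and omega[1,2], the conditional probabilities of choosing Program A under each statement; selecting them by name is less fragile than by position:

# Same difference of probabilities, selected by column name
diffBF <- chains[, "omega[1,1]"] - chains[, "omega[1,2]"]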

18 Precision
With different priors we get (slightly) different answers:
Probability that probability difference is >0.3=  (rethinking model)
Probability that probability difference is >0.3=  (BayesFactor model)
There is no sense in which one of the answers is “better”
Except insofar as one prior/model is better than another
Bayesians are often skeptical of excessive decimal places, because they imply a level of precision/certainty that is unjustified

19 BayesFactor vs. rethinking
They give similar answers for the situations we have considered
Both use somewhat noninformative priors
The default priors in the BayesFactor library are pretty reasonable, and it is easier to use than the rethinking package
With more informative priors, the Bayes factor calculation can produce very different interpretations of a data set
Bayes factors are also essentially incomputable in some situations
The analysis approach advocated by the rethinking package is more general
You can build models with the rethinking library that are not possible with the BayesFactor library (at least I cannot see how to do it)

20 Serial Position
Shown a sequence of 10 letters
Then click on the letters you just saw (participants must guess when unsure)
15 trials, 109 subjects

21 Typical data
Primacy and recency effects
Where is the “bottom” of the curve?

22 Data file
Structured to list the number of correctly recalled items at each position across the 15 trials
Note: we have lost trial-order information here. We could recover it from the raw data files if we wanted.

23 Data frame
SPdata <- read.csv(file="SerialPosition.csv", header=TRUE, stringsAsFactors=FALSE)

## Re-organize with one score for each row
Participant <- c()
Position <- c()
Score <- c()
# Loop through each participant
for( i in c(1:109)){
  # loop through each position
  for(j in c(1:10)){
    Participant <- c(Participant, SPdata$Participant[i])
    Position <- c(Position, j)
    Score <- c(Score, SPdata[i, (j+1)])
  }
}
SPdata2 <- data.frame(Participant=Participant, Position=Position, Score=Score)
colnames(SPdata2) <- c("Participant", "Position", "Score")
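The same reshaping can be done without loops; a sketch in base R, assuming the ten position scores occupy columns 2 through 11 of SPdata:

# Long format: each participant contributes one row per position
SPdata2 <- data.frame(
  Participant = rep(SPdata$Participant, each = 10),
  Position    = rep(1:10, times = nrow(SPdata)),
  Score       = as.vector(t(as.matrix(SPdata[, 2:11])))
)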

24 Simple model
Different estimated probability for each serial position

SPsimple <- map2stan(
  alist(
    Score ~ dbinom(15, p),
    logit(p) <- a[Position],
    a[Position] ~ dnorm(0, 10)
  ),
  data=SPdata2, iter=1e4, warmup=2000 )

> precis(SPsimple, depth=2)
      Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
a[1]
a[2]
a[3]
a[4]
a[5]
a[6]
a[7]
a[8]
a[9]
a[10]

25 Extracting details
aValuesSimple <- c()
for(p in unique(SPdata2$Position)){
  code <- sprintf("a[%d]", p)
  aValuesSimple <- c(aValuesSimple, coef(SPsimple)[code])
}
cat("Position probability of correct response: ", logistic(aValuesSimple))

sp.sequence <- c(1:10)
plot(sp.sequence, logistic(aValuesSimple))
logitPEst <- link(SPsimple, data=data.frame(Position=sp.sequence))
logitPEst.mean <- apply(logitPEst, 2, mean)
logitPEst.HPDI <- apply(logitPEst, 2, HPDI, prob=0.89)
lines(sp.sequence, logitPEst.mean)
shade(logitPEst.HPDI, sp.sequence)

26 Serial position curve
Where is the bottom of the curve?
The data suggest position 7
The simple model suggests position 7
Lots of uncertainty

27 Fit a quadratic
Define the bottom of the curve as a parameter to be estimated: c
It has to be some intermediate value (2–9)

SPcurve <- map2stan(
  alist(
    Score ~ dbinom(15, p),
    logit(p) <- a + b*(Position-c)*(Position-c),
    a ~ dnorm(0, 10),
    b ~ dnorm(0, 10),
    c ~ dunif(2, 9)
  ),
  data=SPdata2, iter=1e4, warmup=2000 )

> precis(SPcurve)
  Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
a
b
c

28 Fit a shape
Predict performance from the model

# Plot points for data means
dataMeans <- c()
for(j in c(1:10)){
  subData <- subset(SPdata2, SPdata2$Position==j)
  dataMeans <- c(dataMeans, mean(subData$Score))
}
plot(sp.sequence, dataMeans/15)

# Add model curve fit and HPDI
logitPEst <- link(SPcurve, data=data.frame(Position=sp.sequence))
logitPEst.mean <- apply(logitPEst, 2, mean)
logitPEst.HPDI <- apply(logitPEst, 2, HPDI, prob=0.89)
lines(sp.sequence, logitPEst.mean)
shade(logitPEst.HPDI, sp.sequence)

29 Posterior
> post <- extract.samples(SPcurve)
> cat("Bottom of curve posterior: Mean=", mean(post$c), " sd=", sd(post$c))
Bottom of curve posterior: Mean=  sd=
> dens(post$c)

30 Model comparison
> compare(SPcurve, SPsimple)
         WAIC pWAIC dWAIC weight SE dSE
SPcurve                             NA
SPsimple

Favors the quadratic model

31 Conclusions
Both the rethinking and BayesFactor libraries are useful
BayesFactor:
Uses default priors that are fairly reasonable
Non-informative
That's a strength (you don't have to think about priors)
That's a weakness (you cannot consider some kinds of models)
Rethinking:
More flexible
That's a strength (you can fit specialized models)
That's a weakness (you sometimes don't know how to set priors)
For both: you can extract posterior information
That's the key benefit of a Bayesian approach!

