PSY 626: Bayesian Statistics for Psychological Science


1 PSY 626: Bayesian Statistics for Psychological Science
Binomial regression
Greg Francis
Fall 2018, Purdue University

2 Zenner Cards Guess which card appears next:

3 Zenner Cards Guess which card appears next:

4 Zenner Cards Guess which card appears next:

5 Data Score indicates whether you predicted correctly (1) or not (0)
File ZennerCards.csv contains the data for 22 participants

# load full data file
ZCdata <- read.csv(file = "ZennerCards.csv", header = TRUE, stringsAsFactors = FALSE)

6 Binomial model y_i is the number of observed outcomes (e.g., correct responses) from n draws when the probability of a correct response is p_i
We know n for any trial is 1
We estimate p_i
It is convenient to actually estimate the logit of p_i: logit(p_i) = log(p_i / (1 - p_i))

7 Binomial regression We will model the logit as a linear equation
Among other things, this ensures that our p_i value is always between 0 and 1 (as all probabilities must be)
Makes it easier to identify priors
Our model posterior will give us logit(p_i); to get back to p_i we have to invert it (the plogis function)
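The logit and its inverse are available directly in base R; a quick sketch of the round trip (the probability value here is just illustrative):

```r
# qlogis() is the logit (log-odds) and plogis() is its inverse,
# the logistic function that maps any real number back into (0, 1)
p <- 0.2
lg <- qlogis(p)       # log(p / (1 - p)) = log(0.25), about -1.386
p_back <- plogis(lg)  # recovers 0.2
```

This is why a linear model on the logit scale can never predict an impossible probability: whatever real number the linear equation produces, plogis() maps it into (0, 1).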

8 Model set up No independent variables
# Estimate probability of correct
model1 = brm(Score ~ 1, data = ZCdata, family = "binomial")
print(summary(model1))

Family: binomial
  Links: mu = logit
Formula: Score ~ 1
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept

9 Odd warning No independent variables
> model1 = brm(Score ~ 1, data = ZCdata, family = "binomial")
Using the maximum response value as the number of trials.
Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
Compiling the C++ model
Start sampling
SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1).
Gradient evaluation took seconds
1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)

10 Odd warning Data can be coded as 0 – 1 or as a frequency
For the latter, brm needs to know how many trials there were so that it can judge the proportion of successes
The number of trials might vary across reported frequencies, so this should be provided for each score
In our case it is 1 trial for each score

model1 = brm(Score|trials(1) ~ 1, data = ZCdata, family = "binomial")

Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
Compiling the C++ model
Start sampling
SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1).
Gradient evaluation took seconds
1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 1
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept
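A sketch of the two codings with toy numbers (not the Zener data): the long 0/1 format used here could equivalently be collapsed into successes out of n trials per row, with the trial count passed through trials():

```r
# Long Bernoulli format: one 0/1 score per trial
long <- data.frame(Score = c(1, 0, 0, 1, 0, 0, 0, 0, 1, 0))
# Aggregated format: successes out of n trials in a single row;
# these ten trials collapse to 3 successes out of 10
agg <- data.frame(Successes = sum(long$Score), Trials = nrow(long))
# The aggregated brm() call would then be written as
#   brm(Successes | trials(Trials) ~ 1, data = agg, family = "binomial")
```

Both codings describe the same likelihood; the aggregated form is just more compact when many trials share the same predictors.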

11 Model results Convert posterior logit to proportion
# compute means of posteriors
post <- posterior_samples(model1)
props <- plogis(post$b_Intercept)
dev.new()
plot(density(props))

12 Model results Can quickly answer some questions such as
Probability that the success rate is better than pure chance (0.2)?

betterThanChance <- length(props[props > 0.2]) / length(props)
cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n")

Probability that success rate is better than pure chance =
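The computation above is just the fraction of posterior draws above 0.2; a self-contained sketch with simulated draws standing in for the real posterior:

```r
set.seed(1)
# rbeta() draws stand in for plogis() applied to real posterior samples
props <- rbeta(4000, 230, 900)
# the mean of a logical vector is the proportion TRUE, equivalent to
# length(props[props > 0.2]) / length(props)
betterThanChance <- mean(props > 0.2)
```

The mean() form and the length() form on the slide give exactly the same number; mean() is just the more idiomatic R idiom.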

13 Skeptical model Set a prior to be tight around 0.2
Have to set it for the logit of 0.2

# Skeptical prior
lgSkeptical = qlogis(0.2)
stanvars <- stanvar(lgSkeptical, name = 'lgSkeptical')
prs <- c(prior(normal(lgSkeptical, 0.01), class = "Intercept"))
model2 = brm(Score|trials(1) ~ 1, data = ZCdata, family = "binomial", prior = prs, stanvars = stanvars)
print(summary(model2))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 1
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept
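To see how tight this prior really is, map its plus-or-minus 3 SD range back to the probability scale (a quick check that needs only base R, not brms):

```r
lgSkeptical <- qlogis(0.2)   # about -1.386 on the logit scale
# A normal(lgSkeptical, 0.01) prior keeps nearly all its mass
# within 3 SD of the center on the logit scale...
lo <- plogis(lgSkeptical - 3 * 0.01)
hi <- plogis(lgSkeptical + 3 * 0.01)
# ...which is roughly 0.195 to 0.205 on the probability scale
```

So this prior says, before seeing the data, that the success rate is almost certainly within about half a percentage point of chance.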

14 Model results Convert posterior logit to proportion
# compute means of posteriors (from the skeptical model)
post <- posterior_samples(model2)
props <- plogis(post$b_Intercept)
dev.new()
plot(density(props))

15 Model results Can quickly answer some questions such as
Probability that the success rate is better than pure chance (0.2)?

betterThanChance <- length(props[props > 0.2]) / length(props)
cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n")

Probability that success rate is better than pure chance =
> mean(props)
[1]

16 Differences across actual cards
Maybe you suspect that some cards are more easily guessed than other cards

model3 = brm(Score|trials(1) ~ 0 + ActualCard, data = ZCdata, family = "binomial")

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + ActualCard
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
                 Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
ActualCardCircle
ActualCardCross
ActualCardSquare
ActualCardStar
ActualCardWavy

17 Differences across actual cards
Plot of marginal means converts logits to proportions

dev.new()
plot(marginal_effects(model3))

18 Differences across actual cards
We might want to compare the predicted performance of a model that considers different guess rates for different cards against a model that treats all cards equally
A model that uses different probabilities for different cards is expected to do better than a model that ignores card type
Does this suggest that people have some kind of predictive power that works for some cards and not for others?

print(model_weights(model1, model3, weights = "waic"))

  model1   model3

19 Differences in guessed cards
It might be better to see if participants' guesses improve model fit

model4 = brm(Score|trials(1) ~ 0 + GuessedCard, data = ZCdata, family = "binomial")
print(summary(model4))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + GuessedCard
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
                  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GuessedCardCircle
GuessedCardCross
GuessedCardSquare
GuessedCardStar
GuessedCardWavy

20 Differences across guessed cards
Plot of marginal means converts logits to proportions

dev.new()
plot(marginal_effects(model4))

21 Differences across cards
We might want to compare the predicted performance of models that:
- consider different success rates for different actual cards
- treat all cards equally
- consider different success rates for different guesses
A model that uses different probabilities for different actual cards is expected to do better than a model that ignores card type or a model based on guessed cards

print(model_weights(model1, model3, model4, weights = "waic"))

  model1   model3   model4

22 Interaction of cards and guesses
Maybe subjects are more successful for some combinations of cards and guesses? Duh! Actual card Circle, guess Circle will be 100% correct

model5 = brm(Score|trials(1) ~ 0 + GuessedCard*ActualCard, data = ZCdata, family = "binomial")
print(summary(model5))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + GuessedCard * ActualCard
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GuessedCardCircle
GuessedCardCross
GuessedCardSquare
GuessedCardStar
GuessedCardWavy
ActualCardCross
ActualCardSquare
ActualCardStar
ActualCardWavy
GuessedCardCross:ActualCardCross
GuessedCardSquare:ActualCardCross
GuessedCardStar:ActualCardCross
GuessedCardWavy:ActualCardCross
GuessedCardCross:ActualCardSquare
GuessedCardSquare:ActualCardSquare
GuessedCardStar:ActualCardSquare
GuessedCardWavy:ActualCardSquare
GuessedCardCross:ActualCardStar
GuessedCardSquare:ActualCardStar
GuessedCardStar:ActualCardStar
GuessedCardWavy:ActualCardStar
GuessedCardCross:ActualCardWavy
GuessedCardSquare:ActualCardWavy
GuessedCardStar:ActualCardWavy
GuessedCardWavy:ActualCardWavy

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
Warning messages:
1: The model has not converged (some Rhats are > 1.1). Do not analyse the results! We recommend running more iterations and/or setting stronger priors.
2: There were 3875 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See

23 Interaction If we ignore the warning about model convergence
And don't notice that the interaction model is stupid
It looks like the interaction model does great! Indeed, it has to do great, because it has all the answers
Using both guessed and actual cards does not make sense for a model

> print(model_weights(model1, model2, model3, model4, model5, weights = "waic"))
  model1   model2   model3   model4   model5

24 Effect of trial Success could be related to trial (information available for later trials)
Success rate could be higher for later trials

model6 = brm(Score|trials(1) ~ 0 + Trial, data = ZCdata, family = "binomial")
print(summary(model6))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + Trial
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Trial

25 Effect of trial Plot of marginal means converts logits to proportions
Opposite of information availability!

dev.new()
plot(marginal_effects(model6))

26 Effect of trial Compare to previous models (except for stupid model5)
Trial does not help the model at all
The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate)
The model that considers variation across actual cards is almost as good
Trading off flexibility for fit

> print(model_weights(model1, model2, model3, model4, model6, weights = "waic"))
  model1   model2   model3   model4   model6
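For intuition about where such weights come from, the classic Akaike-weight formula (exp(-delta/2), renormalized) can be computed by hand. The WAIC values below are made up, and brms's model_weights() uses a related but more involved computation; this is only a sketch of the flexibility-versus-fit trade-off:

```r
# Hypothetical WAIC values (smaller is better)
waic <- c(model1 = 1101, model2 = 1100, model3 = 1103, model6 = 1190)
delta <- waic - min(waic)   # differences from the best model
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))
# The best (smallest-WAIC) model gets the largest weight;
# models far behind get essentially zero weight
```

A model that fits only slightly worse (delta near 0) keeps meaningful weight, while a model 90 points behind is effectively ruled out.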

27 Effect of subject Could be that subjects have different success rates
# By default, participant numbers are treated as _numbers_. Need to correct that.
ZCdata$ParticipantFactor = factor(ZCdata$Participant)
model7 = brm(Score|trials(1) ~ 0 + ParticipantFactor, data = ZCdata, family = "binomial")
print(summary(model7))
dev.new()
plot(marginal_effects(model7))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + ParticipantFactor
   Data: ZCdata (Number of observations: 1100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
ParticipantFactor (22 coefficients, one per participant)

28 Effect of subject Compare to previous models (except for stupid model5)
Subject-specific success rates do not help the model at all
The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate)
The model that considers variation across actual cards is almost as good
Trading off flexibility for fit

> print(model_weights(model1, model2, model3, model4, model6, model7, weights = "waic"))
  model1   model2   model3   model4   model6   model7

29 Your turn Modify the code to make a "random" effect of subjects
Should shrink performance estimates to indicate less difference between subjects
Modify the code to shrink population scores for each actual card
Set a prior to shrink toward a success rate of 0.2
Priors have to be set up in terms of logit scores!
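One possible starting point, as an assumption about the intended solution (formulas only; fitting them requires brms and the Zenner data):

```r
# Random intercept per subject: partial pooling shrinks each
# subject's estimate toward the population rate
f_random <- Score | trials(1) ~ 1 + (1 | ParticipantFactor)
# Card effects with a shrinkage prior: priors act on the logit
# scale, so a prior centered at qlogis(0.2) -- e.g. (hypothetical)
#   prior(normal(lgSkeptical, 0.5), class = "b")
# pulls each card's rate toward chance
f_cards <- Score | trials(1) ~ 0 + ActualCard
```

Both are plain R formula objects, so the model structure can be written down and inspected before any sampling happens.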

30 Driving data set Questionnaire about driving in inclement weather
CHO2DRI: How often he or she chooses to drive in inclement weather: 1 = always, 3 = sometimes, 5 = never

# load full data file
DRdata <- read.csv(file = "Driving.csv", header = TRUE, stringsAsFactors = FALSE)
# Score = 1 marks respondents who never choose to drive in inclement weather
DRdata$Score <- DRdata$CHO2DRI
DRdata$Score[DRdata$Score != 5] = 0
DRdata$Score[DRdata$Score == 5] = 1
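The recode can be checked on toy responses (the vector below is made up): Score should be 1 only for respondents who answered 5 (never).

```r
# Toy CHO2DRI responses: 1 = always ... 5 = never drives
toy <- c(1, 3, 5, 5, 2)
# Same logic as the two assignments above, in a single step
Score <- ifelse(toy == 5, 1, 0)
# Score is now c(0, 0, 1, 1, 0)
```

Note the order of the two assignments on the slide matters only superficially: setting the non-5 values to 0 first cannot collide with the ==5 test, because 0 is never equal to 5.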

31 Hypothesis test Compare males and females Z-test
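A sketch of that two-proportion z-test with made-up counts (x successes out of n per group; not the actual survey numbers):

```r
x <- c(9, 5)    # "never drive" counts for females, males (hypothetical)
n <- c(31, 30)  # group sizes (hypothetical)
p_pool <- sum(x) / sum(n)   # pooled proportion under the null hypothesis
z <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p_pool * (1 - p_pool) * (1 / n[1] + 1 / n[2]))
p_value <- 2 * pnorm(-abs(z))   # two-sided p value
# prop.test(x, n, correct = FALSE) runs the equivalent
# chi-squared test: its statistic equals z^2
```

The built-in prop.test() is the usual way to run this in practice; the hand computation above just makes the pooled-variance z statistic explicit.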

32 Binomial regression Model with no effect of sex
modelf1null = glm(formula = Score ~ 1, family = "binomial", data = DRdata)

Call:
glm(formula = Score ~ 1, family = "binomial", data = DRdata)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                  ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance:  on 60 degrees of freedom
Residual deviance:  on 60 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 4

33 Frequentist binomial regression
Different rates for females and males

modelf1 <- glm(Score ~ GENDER, data = DRdata, family = "binomial")
print(summary(modelf1))

Call:
glm(formula = Score ~ GENDER, family = "binomial", data = DRdata)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                    *
GENDERMale
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance:  on 60 degrees of freedom
Residual deviance:  on 59 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 4

34 Frequentist binomial regression
Convert to proportions

> plogis( )
[1]
> plogis( )
[1]

Call:
glm(formula = Score ~ GENDER, family = "binomial", data = DRdata)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                    *
GENDERMale
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance:  on 60 degrees of freedom
Residual deviance:  on 59 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 4

35 Frequentist model comparison
Compare the null model to the gender model
The null model is expected to better predict future data
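The frequentist version of that comparison uses AIC; a self-contained sketch on simulated data where the null is true (the variable names mirror the slides, but the data are made up):

```r
set.seed(2)
toy <- data.frame(Score  = rbinom(60, 1, 0.3),
                  GENDER = rep(c("Female", "Male"), each = 30))
m_null <- glm(Score ~ 1,      data = toy, family = "binomial")
m_sex  <- glm(Score ~ GENDER, data = toy, family = "binomial")
# AIC penalizes the extra GENDER parameter; when there is no true
# effect the null model will usually (not always) have the lower AIC
c(AIC(m_null), AIC(m_sex))
```

The penalty is 2 per parameter, so the GENDER model must improve the deviance by more than 2 to win the comparison.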

36 Bayesian binomial regression
No effect of sex

model1null = brm(Score|trials(1) ~ 1, data = DRdata, family = "binomial")
print(summary(model1null))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 1
   Data: DRdata (Number of observations: 61)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept

37 Bayesian binomial regression
Different rates for females and males

model1 = brm(Score|trials(1) ~ 0 + GENDER, data = DRdata, family = "binomial")
print(summary(model1))

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + GENDER
   Data: DRdata (Number of observations: 61)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GENDERFemale
GENDERMale

38 Bayesian binomial regression
Convert to proportions

> plogis(-0.87)
[1]
> plogis(-1.24)
[1]

Family: binomial
  Links: mu = logit
Formula: Score | trials(1) ~ 0 + GENDER
   Data: DRdata (Number of observations: 61)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000
Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GENDERFemale
GENDERMale

40 Posteriors Convert to proportions
# compute means of posteriors
post <- posterior_samples(model1)
propsF <- plogis(post$b_GENDERFemale)
propsM <- plogis(post$b_GENDERMale)
dev.new()
xrange = c(min(c(propsF, propsM)), max(c(propsF, propsM)))
yrange = c(0, max(c(density(propsF)$y, density(propsM)$y)))
plot(density(propsF), xlim = xrange, ylim = yrange, col = "green", main = "Posterior Female (green) and Male (red)")
lines(density(propsM), col = "red", lwd = 3, lty = 2)

> mean(propsF)
[1]
> mean(propsM)
[1]

41 Posteriors Can also look at credible intervals
dev.new()
plot(marginal_effects(model1))

42 Bayesian model comparison
Compare null model to gender model
Null model is expected to better predict future data

> print(model_weights(model1null, model1, weights = "loo"))
model1null     model1

43 Your turn The driving data set includes the age of each respondent
Include AGE as a variable in the binomial regression
Compare to the other models, both frequentist and Bayesian

