PSY 626: Bayesian Statistics for Psychological Science

1/11/2018
More Stan examples
Greg Francis
PSY 626: Bayesian Statistics for Psychological Science
Fall 2016
Purdue University
PSY200 Cognitive Psychology

Facial feedback
Is your emotional state influenced by your facial muscles?
If I ask you to smile, you may report feeling happier
But this could just be because you guess I want you to report feeling happier
Or because intentional smiling is associated with feeling happier
You can ask people to use smiling muscles without them realizing it

Facial feedback
Within-subjects design (n=21)
Judge “happiness” in a piece of abstract art while:
  Holding a pen in your teeth (smiling)
  Holding a pen in your lips (frowning/pouting)
  No pen
11 trials for each condition
Different art on each trial

(Corrected) Data
The HappinessRating is a number from 0 (no happiness) to 100 (lots of happiness)
The facial feedback hypothesis suggests that the mean HappinessRating values should be larger when the pen is held in the teeth and lower when the pen is held in the lips
File FacialFeedback.csv contains all the data
Originally it misspelled “Participant”
The updated file on the class web site fixes this error

Models
# Null model: one mean for all conditions
FFmodel <- map(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- a,
    a ~ dnorm(50, 50),
    sigma ~ dunif(0, 100)
  ),
  data = FFdata
)

Models
# Alternative model 1: different mean for each condition
FFmodel2 <- map(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- a[ConditionIndex],
    a[ConditionIndex] ~ dnorm(50, 50),
    sigma ~ dunif(0, 100)
  ),
  data = FFdata
)

Models
# Alternative model 2: different means and sds for each condition
FFmodel3 <- map(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- a[ConditionIndex],
    a[ConditionIndex] ~ dnorm(50, 50),
    sigma[ConditionIndex] ~ dunif(0, 100)
  ),
  data = FFdata
)

Limitations
map( ) does several things:
  Applies the Bayesian calculations to convert priors and data into posteriors
  Searches the posterior distributions to find maximum a posteriori (MAP) estimates of the parameter values (this is more difficult than you might think)
map( ) has several limitations:
  It assumes the posterior distributions are Gaussian (this makes the search for MAP values easier)
  We cannot model within-subject designs (no place for the correlation)
  We cannot model multi-level designs
Fixing these limitations requires a different type of algorithm:
  Stan (Markov Chain Monte Carlo sampling)
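To make the "search for the MAP and assume a Gaussian posterior" idea concrete, here is a minimal base-R sketch of a quadratic approximation for the null model. It is only an illustration of the general technique, not the internals of map( ); the data are simulated, and the names y, neg_log_post, map_est and post_sd are made up for this example.

```r
# Simulated ratings (NOT the class data): 200 values on a 0-100-like scale
set.seed(1)
y <- rnorm(200, mean = 55, sd = 15)

# Negative log posterior for the null model:
#   y ~ Normal(a, sigma), a ~ Normal(50, 50), sigma ~ Uniform(0, 100)
neg_log_post <- function(par) {
  a <- par[1]
  sigma <- par[2]
  if (sigma <= 0 || sigma >= 100) return(1e10)  # outside the uniform prior
  -(sum(dnorm(y, a, sigma, log = TRUE)) +
      dnorm(a, 50, 50, log = TRUE) +
      dunif(sigma, 0, 100, log = TRUE))
}

# Search for the posterior mode and measure the curvature at the peak
fit <- optim(c(50, 10), neg_log_post, hessian = TRUE)
map_est <- fit$par                 # MAP estimates of a and sigma
post_cov <- solve(fit$hessian)     # Gaussian approximation: inverse Hessian
post_sd <- sqrt(diag(post_cov))    # approximate posterior standard deviations

print(rbind(MAP = map_est, SD = post_sd))
```

The Gaussian assumption enters only in the last step, where the curvature at the mode stands in for the whole posterior shape. That shortcut is what breaks down for the correlated, multilevel models that follow.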

Comparing models
print(compare(FFmodel, FFmodel2, FFmodel3))
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows for FFmodel, FFmodel2 and FFmodel3; the numeric values were not preserved in this transcript]
Null model (FFmodel) and different means with same sd (FFmodel2) are nearly the same
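As a rough guide to reading the dWAIC and weight columns, the sketch below computes Akaike-style weights from a set of WAIC values. The WAIC numbers and model labels are hypothetical placeholders, not the values from the slides.

```r
# Hypothetical WAIC values (NOT the results from the class data)
waic <- c(modelA = 6100.0, modelB = 6101.5, modelC = 6110.0)

# dWAIC: difference from the best (smallest) WAIC
dWAIC <- waic - min(waic)

# Akaike-style weights: relative support, normalized to sum to 1
weight <- exp(-0.5 * dWAIC) / sum(exp(-0.5 * dWAIC))

print(round(cbind(WAIC = waic, dWAIC = dWAIC, weight = weight), 3))
```

Two models whose WAIC values differ by only a point or two end up with similar weights, which is the sense in which two models can be "nearly the same".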

Multilevel model
Set up some dummy variables and a “clean” data set
FFdata$PenInTeeth <- ifelse(FFdata$Condition == "PenInTeeth", 1, 0)
FFdata$NoPen <- ifelse(FFdata$Condition == "NoPen", 1, 0)
FFdata$PenInLips <- ifelse(FFdata$Condition == "PenInLips", 1, 0)
FFdataClean <- data.frame(
  HappinessRating = FFdata$HappinessRating,
  Participant = FFdata$Participant,
  PenInTeeth = FFdata$PenInTeeth,
  NoPen = FFdata$NoPen,
  PenInLips = FFdata$PenInLips,
  ConditionIndex = FFdata$ConditionIndex
)
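The dummy coding can be checked on a toy data frame before using it in a model. The sketch below assumes the three condition labels used above; the toy rows and the ordering used for ConditionIndex are assumptions made for illustration.

```r
# Toy data (NOT the class data): four trials with condition labels
toy <- data.frame(
  Condition = c("PenInTeeth", "NoPen", "PenInLips", "NoPen"),
  stringsAsFactors = FALSE
)
conditions <- c("PenInTeeth", "NoPen", "PenInLips")

# One 0/1 dummy column per condition
for (cond in conditions) {
  toy[[cond]] <- ifelse(toy$Condition == cond, 1, 0)
}

# Exactly one dummy is 1 on every row, so a sum such as
# a1*PenInTeeth + a2*NoPen + a3*PenInLips picks out one term per trial
row_totals <- rowSums(toy[, conditions])

# An integer index in the same (assumed) order as `conditions`
toy$ConditionIndex <- match(toy$Condition, conditions)
print(toy)
```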

Multilevel model
# Multilevel: different means and sds for each condition
FFMultilevel <- map2stan(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- a1[Participant]*PenInTeeth + a2[Participant]*NoPen + a3[Participant]*PenInLips,
    c(a1, a2, a3)[Participant] ~ dmvnorm2(c(mean_teeth, mean_none, mean_lips), sigma_part, Rho),
    mean_teeth ~ dnorm(50, 50),
    mean_none ~ dnorm(50, 50),
    mean_lips ~ dnorm(50, 50),
    sigma_part ~ dunif(0, 100),
    Rho ~ dlkjcorr(2),
    sigma <- c1*PenInTeeth + c2*NoPen + c3*PenInLips,
    c1 ~ dunif(0, 100),
    c2 ~ dunif(0, 100),
    c3 ~ dunif(0, 100)
  ),
  data = FFdataClean
)

SAMPLING FOR MODEL 'HappinessRating ~ dnorm(mu, sigma)' NOW (CHAIN 1).
Chain 1, Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 1, Iteration: 200 / 2000 [ 10%] (Warmup)
Chain 1, Iteration: 400 / 2000 [ 20%] (Warmup)
Chain 1, Iteration: 600 / 2000 [ 30%] (Warmup)
Chain 1, Iteration: 800 / 2000 [ 40%] (Warmup)
Chain 1, Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 1, Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 1, Iteration: 1200 / 2000 [ 60%] (Sampling)
Chain 1, Iteration: 1400 / 2000 [ 70%] (Sampling)
Chain 1, Iteration: 1600 / 2000 [ 80%] (Sampling)
Chain 1, Iteration: 1800 / 2000 [ 90%] (Sampling)
Chain 1, Iteration: 2000 / 2000 [100%] (Sampling)
Elapsed Time: seconds (Warm-up) seconds (Sampling) seconds (Total)
WARNING: No variance estimation is performed for num_warmup < 20
Chain 1, Iteration: 1 / 1 [100%] (Sampling)
Elapsed Time: 8e-06 seconds (Warm-up) seconds (Sampling) seconds (Total)
Computing WAIC
Constructing posterior predictions
[ 1000 / 1000 ]

Lots of warnings
DIAGNOSTIC(S) FROM PARSER:
Warning (non-fatal): Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable. If so, you need to call increment_log_prob() with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement: v_a1a2a3 ~ multi_normal(...)
In file included from file133563cabe3a.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math.hpp:4:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/mat.hpp:4:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core.hpp:42:
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints.hpp:14:17: warning: unused function 'set_zero_all_adjoints' [-Wunused-function]
  static void set_zero_all_adjoints() {
  ^
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core.hpp:43:
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints_nested.hpp:17:17: warning: 'static' function 'set_zero_all_adjoints_nested' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
  static void set_zero_all_adjoints_nested() {
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/rev/mat.hpp:9:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/prim/mat.hpp:54:
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/autocorrelation.hpp:17:14: warning: function 'fft_next_good_size' is not needed and will not be emitted [-Wunneeded-internal-declaration]
  size_t fft_next_good_size(size_t N) {
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/prim/mat.hpp:235:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/prim/arr.hpp:36:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/StanHeaders/include/stan/math/prim/arr/functor/integrate_ode_rk45.hpp:13:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/numeric/odeint.hpp:61:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/numeric/odeint/util/multi_array_adaption.hpp:29:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array.hpp:21:
In file included from /Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array/base.hpp:28:
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array/concept_checks.hpp:42:43: warning: unused typedef 'index_range' [-Wunused-local-typedef]
  typedef typename Array::index_range index_range;
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array/concept_checks.hpp:43:37: warning: unused typedef 'index' [-Wunused-local-typedef]
  typedef typename Array::index index;
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array/concept_checks.hpp:53:43: warning: unused typedef 'index_range' [-Wunused-local-typedef]
/Library/Frameworks/R.framework/Versions/3.3/Resources/library/BH/include/boost/multi_array/concept_checks.hpp:54:37: warning: unused typedef 'index' [-Wunused-local-typedef]
7 warnings generated.

Results
> precis(FFMultilevel, depth=2)
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: a3[1] to a3[21], a2[1] to a2[21], a1[1] to a1[21], mean_teeth, mean_none, mean_lips, sigma_part[1] to sigma_part[3], Rho[1,1] to Rho[3,3], c1, c2, c3. The numeric values were not preserved in this transcript, apart from a NaN in the Rho[1,1] row]

> FFMultilevel
map2stan model fit
1000 samples from 1 chain
Formula:
HappinessRating ~ dnorm(mu, sigma)
mu <- a1[Participant] * PenInTeeth + a2[Participant] * NoPen + a3[Participant] * PenInLips
c(a1, a2, a3)[Participant] ~ dmvnorm2(c(mean_teeth, mean_none, mean_lips), sigma_part, Rho)
mean_teeth ~ dnorm(50, 50)
mean_none ~ dnorm(50, 50)
mean_lips ~ dnorm(50, 50)
sigma_part ~ dunif(0, 100)
Rho ~ dlkjcorr(2)
sigma <- c1 * PenInTeeth + c2 * NoPen + c3 * PenInLips
c1 ~ dunif(0, 100)
c2 ~ dunif(0, 100)
c3 ~ dunif(0, 100)
Log-likelihood at expected values:
Deviance:
DIC:
Effective number of parameters (pD): 49.56
WAIC (SE): (35.2)
pWAIC: 47.75

> precis(FFMultilevel)
75 vector or matrix parameters omitted in display. Use depth=2 to show them.
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: mean_teeth, mean_none, mean_lips, c1, c2, c3; numeric values not preserved]

Model comparison
> compare(FFmodel, FFmodel2, FFmodel3, FFMultilevel)
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows with FFMultilevel listed first, followed by the three map( ) models; numeric values not preserved]
Warning message:
In compare(FFmodel, FFmodel2, FFmodel3, FFMultilevel) :
  Not all model fits of same class. This is usually a bad idea, because it implies they were fit by different algorithms. Check yourself, before you wreck yourself.
Multilevel model has 75 parameters, but they are not really free

Model comparison
What does this mean?
If you wanted to predict future data, the multilevel model is your best choice among these models
That’s largely because there seem to be differences between participants, and only this model considers them
Maybe we should consider other models
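"Predicting future data" is what WAIC tries to estimate. The sketch below shows the standard WAIC calculation from a matrix of pointwise log-likelihoods, using simulated data and stand-in posterior draws rather than samples from any of the models above; the variable names are illustrative only.

```r
set.seed(2)
y <- rnorm(50, mean = 55, sd = 15)   # simulated ratings (NOT the class data)

# Stand-in "posterior draws" for a one-mean model (an assumption for the demo;
# a real analysis would take these from the fitted model)
S <- 1000
mu_s <- rnorm(S, mean(y), sd(y) / sqrt(length(y)))
sigma_s <- rep(sd(y), S)

# S x N matrix: log-likelihood of each observation under each draw
ll <- sapply(y, function(yi) dnorm(yi, mu_s, sigma_s, log = TRUE))

# Log pointwise predictive density: average the likelihood over draws
lppd <- sum(log(colMeans(exp(ll))))

# Effective number of parameters: variance of the log-likelihood over draws
pWAIC <- sum(apply(ll, 2, var))

WAIC <- -2 * (lppd - pWAIC)
print(c(lppd = lppd, pWAIC = pWAIC, WAIC = WAIC))
```

Because pWAIC measures how much the predictions vary across posterior draws, a hierarchical model with many participant-level parameters can have a pWAIC well below its raw parameter count.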

Null multilevel model
# Stan data for null model
FFdataClean2 <- data.frame(
  HappinessRating = FFdata$HappinessRating,
  Participant = FFdata$Participant
)
# Multilevel: equal means and sds for each condition
FFMultilevelNull <- map2stan(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- a[Participant],
    a[Participant] ~ dnorm(grandmean, sigma_part),
    grandmean ~ dnorm(50, 50),
    sigma_part ~ dunif(0, 100),
    sigma ~ dunif(0, 100)
  ),
  data = FFdataClean2
)

> precis(FFMultilevelNull, depth=2)
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: a[1] to a[21], grandmean, sigma_part, sigma; numeric values not preserved]

Comparing models
> compare(FFmodel, FFmodel2, FFmodel3, FFMultilevel, FFMultilevelNull)
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows with FFMultilevel first and FFMultilevelNull second, followed by the three map( ) models; numeric values not preserved]
A model that has different means for the conditions is expected to predict future data better than a model that has the same mean for each condition
Support for the facial feedback hypothesis?

Evidential support
What does the facial feedback hypothesis actually predict?
Is your emotional state influenced by your facial muscles?
-> Differences in happiness ratings across the conditions
> precis(FFMultilevel)
75 vector or matrix parameters omitted in display. Use depth=2 to show them.
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: mean_teeth, mean_none, mean_lips, c1, c2, c3; numeric values not preserved]

Evidential support
What does the facial feedback hypothesis actually predict?
Use of “smiling” facial muscles should lead to higher happiness ratings than “pouting” facial muscles
-> Higher ratings for teeth than for lips conditions
> precis(FFMultilevel)
75 vector or matrix parameters omitted in display. Use depth=2 to show them.
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: mean_teeth, mean_none, mean_lips, c1, c2, c3; numeric values not preserved]

Evidential support
What does the facial feedback hypothesis actually predict?
“Smiling” facial muscles lead to more happiness, “pouting” facial muscles lead to less happiness
-> Order of means: teeth > none > lips
> precis(FFMultilevel)
75 vector or matrix parameters omitted in display. Use depth=2 to show them.
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: mean_teeth, mean_none, mean_lips, c1, c2, c3; numeric values not preserved]

Ordered means
You should build the model that best represents the theoretical claim
Consider a model for ordered effects: teeth > none > lips
# Multilevel: teeth and lips deviate from none and sds for each condition
FFMultilevelOrder <- map2stan(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- (a2[Participant]+a1[Participant])*PenInTeeth + a2[Participant]*NoPen + (a2[Participant]-a3[Participant])*PenInLips,
    c(a1, a2, a3)[Participant] ~ dmvnorm2(c(deviation_teeth, mean_none, deviation_lips), sigma_part, Rho),
    deviation_teeth ~ dunif(0, 10),
    mean_none ~ dnorm(50, 50),
    deviation_lips ~ dunif(0, 10),
    sigma_part ~ dunif(0, 100),
    Rho ~ dlkjcorr(2),
    sigma <- c1*PenInTeeth + c2*NoPen + c3*PenInLips,
    c1 ~ dunif(0, 100),
    c2 ~ dunif(0, 100),
    c3 ~ dunif(0, 100)
  ),
  data = FFdataClean
)

> precis(FFMultilevelOrder)
75 vector or matrix parameters omitted in display. Use depth=2 to show them.
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: deviation_teeth, mean_none, deviation_lips, c1, c2, c3; numeric values not preserved]

> precis(FFMultilevelOrder, depth=2)
Columns: Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
[Rows: a3[1] to a3[21], a2[1] to a2[21], a1[1] to a1[21], deviation_teeth, mean_none, deviation_lips, sigma_part[1] to sigma_part[3], Rho[1,1] to Rho[3,3], c1, c2, c3. The numeric values were not preserved, apart from a NaN in the Rho[1,1] row]
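The ordering in this model comes entirely from the parameterization: the teeth and lips hyper-means are written as non-negative deviations from the none hyper-mean. A minimal prior simulation in base R, using the same prior ranges as the model but otherwise illustrative names, shows that the order teeth >= none >= lips holds for every draw.

```r
set.seed(3)
n <- 5000

# Draws from the priors on the hyper-parameters
mean_none <- rnorm(n, 50, 50)
deviation_teeth <- runif(n, 0, 10)   # non-negative by construction
deviation_lips <- runif(n, 0, 10)    # non-negative by construction

# Implied hyper-means for the other two conditions
mean_teeth <- mean_none + deviation_teeth
mean_lips <- mean_none - deviation_lips

# The theoretical ordering holds in every prior draw
order_holds <- all(mean_teeth >= mean_none & mean_none >= mean_lips)
print(order_holds)
print(summary(mean_teeth - mean_lips))   # gap is between 0 and 20
```

The constraint applies only to the hyper-means. The participant-level a1 and a3 values are drawn from a multivariate normal around those hyper-means, so an individual participant's deviation can still come out negative.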

Model comparison
> compare(FFMultilevel, FFMultilevelNull, FFMultilevelOrder)
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows in the order FFMultilevel, FFMultilevelOrder, FFMultilevelNull; numeric values not preserved]
The ordered model does not do the best
Neither does it seem hopeless!
But what if you interpret the facial feedback theory as being agnostic about the “None” condition?

Order 2
We do not ignore the “none” condition
It is correlated with other variables, so it has information to be gleaned
We constrain the teeth ratings to be larger than the lips ratings (for the hyper-means)
FFMultilevelOrder2 <- map2stan(
  alist(
    HappinessRating ~ dnorm(mu, sigma),
    mu <- (a3[Participant]+a1[Participant])*PenInTeeth + a2[Participant]*NoPen + a3[Participant]*PenInLips,
    c(a1, a2, a3)[Participant] ~ dmvnorm2(c(deviation_teeth, mean_none, mean_lips), sigma_part, Rho),
    deviation_teeth ~ dunif(0, 10),
    mean_none ~ dnorm(50, 50),
    mean_lips ~ dnorm(50, 50),
    sigma_part ~ dunif(0, 100),
    Rho ~ dlkjcorr(2),
    sigma <- c1*PenInTeeth + c2*NoPen + c3*PenInLips,
    c1 ~ dunif(0, 100),
    c2 ~ dunif(0, 100),
    c3 ~ dunif(0, 100)
  ),
  data = FFdataClean
)

Compare models
> compare(FFMultilevel, FFMultilevelNull, FFMultilevelOrder, FFMultilevelOrder2)
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows for FFMultilevel, the two ordered models and FFMultilevelNull; numeric values not preserved]
Note that support for the unordered means model grows weaker when considering another viable (order 2) model
Still, this analysis does not suggest strong support for either interpretation of the facial feedback hypothesis
Neither does it convincingly rule out those interpretations
Get more data (or give up)

Model construction
The hardest part for these kinds of analyses is describing the model
  Likelihood functions (dnorm(), dmvnorm2())
  Priors (dunif(), dlkjcorr(2))
You will likely find that there are some situations where you cannot describe the model that you want to consider
  Move beyond the rethinking package
  May have to code directly in Stan (there will still be models that you cannot construct!)

Zener cards
Guess which card appears next:

Data
Score indicates whether you predicted correctly (1) or not (0)
File ZennerCards.csv contains the data for 22 participants

Loading data
# load full data file
ZCdata <- read.csv(file="ZennerCards.csv", header=TRUE, stringsAsFactors=FALSE)
# load the rethinking library
library(rethinking)
# Make data frames that will work for Stan
ZCdataClean <- data.frame(Score = ZCdata$Score, Participant = ZCdata$Participant)
ZCdataClean2 <- data.frame(Score = ZCdata$Score)

Null model
Prior very tight around correct probability of 0.2 (all in logit values)
Treat each score as homogeneous (regardless of participant)
0 parameters
ZCmodel0 <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(logit(0.2), 0.001)
  ),
  data = ZCdataClean2
)
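To see why this prior leaves effectively zero free parameters, it helps to convert between probability and logit scales. The sketch below uses hand-written helper functions (my_logit and my_inv_logit are names made up for this example, chosen so they do not mask anything in rethinking) and checks how far p can move within three prior standard deviations.

```r
# Logit and inverse-logit written out explicitly
my_logit <- function(p) log(p / (1 - p))
my_inv_logit <- function(x) 1 / (1 + exp(-x))

# Chance performance with five card types is p = 0.2
center <- my_logit(0.2)
print(center)

# Prior on a is Normal(center, 0.001): range of p within +/- 3 prior sds
p_range <- my_inv_logit(center + c(-3, 3) * 0.001)
print(p_range)

# Compare with the weak prior Normal(0, 10): +/- 1 sd spans almost all of (0, 1)
p_weak <- my_inv_logit(c(-10, 10))
print(p_weak)
```

With a prior standard deviation of 0.001 on the logit scale, p stays within about 0.0005 of 0.2, so the data cannot move it. Base R also provides qlogis() and plogis() for the same two transformations.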

Equivalent subjects model
Weak prior for probability logit
Treat each score as homogeneous (regardless of participant)
1 parameter
ZCmodel1 <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data = ZCdataClean2
)

Treat participant differences
Weak prior for probability logit
Logit probabilities may differ across participants
22 parameters
ZCmodel3 <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a[Participant],
    a[Participant] ~ dnorm(0, 10)
  ),
  data = ZCdataClean
)

Multilevel model
Logit probabilities across participants are related to each other
Weak prior for probability logit hyper-parameters (mean and standard deviation)
Logit probabilities may differ across participants
24 (constrained) parameters
ZCMultilevel <- map2stan(
  alist(
    Score ~ dbinom(1, p),
    logit(p) <- a[Participant],
    a[Participant] ~ dnorm(grand, sigma_part),
    grand ~ dnorm(0, 10),
    sigma_part ~ dunif(0, 100)
  ),
  data = ZCdataClean, iter=1e4, warmup=2000
)

Shrinkage
Lots of shrinkage

Model comparison
> compare(ZCmodel0, ZCmodel1, ZCmodel3, ZCMultilevel)
Columns: WAIC pWAIC dWAIC weight SE dSE
[Table rows for the three ZCmodel fits and ZCMultilevel, with ZCMultilevel listed third; numeric values not preserved]
Favors the null model with 0 parameters, but not overwhelmingly so
The 1-parameter model (which estimates the probability) is viable
So is the 24-parameter multilevel model

Saving a model
On the iMac in my office, Stan often froze when I tried to run more than one model
To work around this problem, I would save a model:
save(ZCmodel3, file="ZCParticipants.Rpd")
The next time I ran the code, I would block out the call to that model and load the saved file instead:
if(1==0){
  # Different probabilities for different participants
  ZCmodel3 <- map2stan(
    alist(
      Score ~ dbinom(1, p),
      logit(p) <- a[Participant],
      a[Participant] ~ dnorm(0, 10)
    ),
    data = ZCdataClean
  )
}else{
  load("ZCParticipants.Rpd")
}
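The same save-then-reload workaround can be wrapped in a small helper so the code does not need to be edited between runs. This is a sketch of one possible approach, not the method used in the slides: it swaps save()/load() for saveRDS()/readRDS(), the helper name fit_or_load and the cache file name are hypothetical, and a quick lm() fit stands in for a slow map2stan() call.

```r
# Fit the model only if no cached copy exists; otherwise read the cached fit
fit_or_load <- function(path, fit_fun) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  fit <- fit_fun()
  saveRDS(fit, path)
  fit
}

# Demo with a cheap stand-in model (lm on the built-in cars data)
cache_file <- file.path(tempdir(), "demo_fit.rds")
if (file.exists(cache_file)) file.remove(cache_file)  # start the demo clean

slow_fit <- function() lm(dist ~ speed, data = cars)

first <- fit_or_load(cache_file, slow_fit)    # fits and writes the cache
second <- fit_or_load(cache_file, slow_fit)   # reads the cache, no refit
print(all.equal(coef(first), coef(second)))
```

Delete the cache file whenever the model or the data change, otherwise the stale fit is silently reused.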

Conclusions
You want your model to reflect theory
Not always easy to do
