Workshop in R and GLMs: #4 Diane Srivastava University of British Columbia
Exercise 1.Fit the binomial glm survival = size*treat 2. Fit the bionomial glm parasitism = size*treat 3. Predict what size has 50% parasitism in treatment “0”
Predicting size for p=0.5, treat=0 Output from logistic regression with logit link: predicted log e (p/1-p) = a+bx So when p=0.5, solve log(1)=a+bx
Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) <2e-16 *** size <2e-16 *** treat size:treat What is equation for treat 0? treat 1?
Rlecture.csv 3.12
Model simplification 1.Parsimonious/ Logical sequence (e.g. highest order interactions first) 2. Stepwise sequence 3. Bayesian comparison of candidate models (not covered)
Plant size Logit parasitism Plant size Logit parasitism ANCOVA: Difference between categories…. Constant, doesn’t depend on size Depends on size size*treat ns size*treat sig
Deletion tests How to change your model quickly: model2<-update(model1,~.-size:treat) How to do a deletion test: anova(reduced model, full model, test="Chi") 1.Test for interaction in logit parasitism ANCOVA If not sig, remove and continue. If sig, STOP! 2. Test covariate If not sig, remove and continue. If sig, put back and continue 3. Test main effect
Code for “parasitism” analysis > ds<-read.table(file.choose(), sep=",", header=TRUE); ds > attach(ds) > par<-cbind(parasitism, 100-parasitism); par > m1<-glm(par~size*treat, data=ds, family=binomial) > summary(m1) > m2<-update(m1, ~.-size:treat) > summary(m2) > anova(m2,m1, test="Chi") > m3<-update(m2, ~.-size) > anova(m3,m2, test="Chi") > m3<-update(m2, ~.-treat) > anova(m3,m2, test="Chi")
Context (often) matters! What is the p-value for treat in: size+treat? treat? Stepwise regression: step(model)
Jump height (how high ball can be raised off the ground) Feet off ground Total SS = 11.11
X variableparameterSSF 1,13 p Height < of player
X variableparameterSSp Weight < of player F 1,13
Why do you think weight is + correlated with jump height?
An idea Perhaps if we took two people of identical height, the lighter one might actually jump higher? Excess weight may reduce ability to jump high…
lighter heavier X variableparameterSSF p Height < Weight <0.0001
Heavy people often tall (tall people often heavy) Tall people can jump higher People light for their height can jump a bit more Weight Height Jump + + -
Species.txt Rothamsted Park Grass experiment started in 1856
Exercise (species.txt) diane<-read.table(file.choose(), header=T); diane; attach(diane) Univariate trends: plot(Species~Biomass) plot(Species~pH) Combined trends: plot(Species~Biomass, type="n"); points(Species[pH=="high"]~Biomass[pH=="high"]); points(Species[pH=="mid"]~Biomass[pH=="mid"], pch=16); points(Species[pH=="low"]~Biomass[pH=="low"], pch=0)
Exercise (species.txt) 1. With a normal distribution, fit pH*Biomass check model dignostics test interaction for significance 2. With a poisson distribution, fit pH *Biomass check model dignostics test interaction for significance
Moral of the story: Make sure you KNOW what you are modelling!
Exercise (species.txt) 1. Fit glm: Species~pH, family=gaussian 2. Test if low and mid pH have the same effect this is a planned comparison
Further reading Statistics: An Introduction using R (M.J. Crawley, Wiley publishers) Extending the linear model with R (JJ Faraway, Chapman & Hall/CRC)
Code for “Species” analysis > m1<-glm(Species~pH*Biomass, family=gaussian, data=diane) > summary(m1) > m2<-update(m1, ~.-pH:Biomass) > anova(m2,m1, test="Chi") > par(mfrow=c(2,2)); plot(m1) > m3<-glm(Species~pH*Biomass, family=poisson, data=diane) > m4<-update(m3, ~.-pH:Biomass) > anova(m4,m3, test="Chi") > par(mfrow=c(2,2)); plot(m3) >PH<-(pH!="high")+0 > m5<-glm(Species~pH, family=gaussian, data=diane) > m6<-update(m5, ~.-pH+PH) > anova(m6,m5, test="Chi")