
Data: Crab mating patterns (Poisson regression, ZIP model, negative binomial)
Data: Typists (Poisson with random effects)
Data: Challenger (binomial with random effects)
Data: Flu (binomial with random effects)

D.A.D. Mixed (not generalized) Models: Fixed Effects and Random Effects

SAS Global Forum 2010 D.A.D.
“Generalized” → non-normal distribution
Binary for probabilities: Y = 0 or 1, mean $p$
  $\Pr\{Y=j\} = p^{j}(1-p)^{1-j}$
  Link: $L = \ln(p/(1-p))$ = “logit”
  Range (over all L): $0 < p < 1$
Poisson for counts: Y in {0, 1, 2, 3, 4, …}, mean count $\lambda$
  $\Pr\{Y=j\} = e^{-\lambda}\lambda^{j}/j!$
  Link: $L = \log(\lambda)$
  Range (over all L): $\lambda > 0$

Generalized (not mixed) linear models
Use a link $L = g(E\{Y\})$, e.g. $\ln(p/(1-p)) = \ln(E\{Y\}/(1-E\{Y\}))$.
Assume L is a linear model in the inputs with fixed effects.
Estimate the model for L, e.g. $L = g(E\{Y\}) = \beta_0 + \beta_1 X$, by maximum likelihood.
Example: L is linear in dose. At dose = 10, L = 0.8, and the inverse link gives $p = e^{0.8}/(1+e^{0.8}) \approx 0.69$.
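As a worked check of the inverse link, the logit can be solved for $p$ and evaluated at the slide's value $L = 0.8$ (decimal values are rounded):

```latex
\[
L = \ln\frac{p}{1-p}
\;\Longrightarrow\;
p = \frac{e^{L}}{1+e^{L}},
\qquad
L = 0.8:\quad
p = \frac{e^{0.8}}{1+e^{0.8}} \approx \frac{2.2255}{3.2255} \approx 0.690 .
\]
```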

Challenger was mission 24. From the 23 previous launches we have:
6 O-rings per mission
Y = 0 no damage, Y = 1 erosion or blowby
p = Pr{Y=1} = f(mission, launch temperature)
Features: random mission effects (→ mixed), logistic link for p (→ generalized)

  proc glimmix data=O_ring;
    class mission;
    model fail = temp / dist=binomial s;
    random mission;
  run;
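Written out, the model this GLIMMIX step appears to specify is a random-intercept logistic regression. The symbols $\beta_0$, $\beta_1$, $u_i$ and $\sigma_u^2$ are notation assumed here for illustration, with $i$ indexing missions and $j$ indexing O-rings within a mission:

```latex
\[
\ln\frac{p_{ij}}{1-p_{ij}} = \beta_0 + \beta_1\,\mathrm{temp}_i + u_i,
\qquad
u_i \sim N(0,\sigma_u^2),
\qquad
i = 1,\dots,23,\; j = 1,\dots,6 .
\]
```

If $\sigma_u^2$ is estimated at zero, the $u_i$ drop out and the model reduces to ordinary logistic regression on temperature.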

DemoO_rings.sas

Estimated G matrix is not positive definite.

Covariance Parameter Estimates
Cov Parm    Estimate    Standard Error
mission     2.25E-18    .

Solutions for Fixed Effects
Effect       Estimate    Standard Error    DF    t Value    Pr > |t|
Intercept
temp

Just logistic regression – no mission variance component
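A minimal sketch of the fixed-effects-only fit this remark points to. It assumes `O_ring` holds one row per O-ring with `fail` coded 0/1, which the slides do not state explicitly:

```sas
/* Sketch: ordinary logistic regression, no mission variance component. */
/* Assumption: fail is 0/1 per O-ring, temp is launch temperature.      */
proc logistic data=O_ring;
  model fail(event='1') = temp;   /* model Pr{fail = 1} */
run;
```

Because the mission variance estimate is essentially zero, the temperature effect from this fit should be close to the GLIMMIX one.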

Flu Data: CDC active flu virus weekly data, % positive

  data FLU;
    input fluseasn year t week pos specimens;
    pct_pos = 100*pos/specimens;
    /* logit = log(p/(1-p)) with p = pct_pos/100 */
    logit = log((pct_pos/100)/(1-(pct_pos/100)));
    label pos = "# positive specimens";
    label pct_pos = "% positive specimens";
    label t = "Week into flu season (first = week 40)";
    label week = "Actual week of year";
    label fluseasn = "Year flu season started";

[Plots: logit; pct. pos.]

DemoGet_Flu.sas

“Sinusoids”: $S(j) = \sin(2\pi j t/52)$, $C(j) = \cos(2\pi j t/52)$

(1) GLM, all effects fixed (harmonic main effects insignificant)

  PROC GLM DATA=FLU;
    class fluseasn;
    model logit = s1 c1 fluseasn*s1 fluseasn*c1 fluseasn*s2 fluseasn*c2
                  fluseasn*s3 fluseasn*c3 fluseasn*s4 fluseasn*c4;
    output out=out1 p=p;
  run;

  data out1;
    set out1;
    P_hat = exp(p)/(1+exp(p));   /* back-transform the predicted logit */
    label P_hat = "Pr{pos. sample} (est.)";
  run;

[Plot: logit scale]
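The slides do not show how `s1`-`s4` and `c1`-`c4` are built. One way to construct them from the definitions of S(j) and C(j), assuming `t` is the week-into-season variable from the FLU data step and that the harmonics are simply added to `FLU` in place:

```sas
/* Sketch: add the four sine/cosine harmonic pairs to the FLU data set. */
data FLU;
  set FLU;
  array s{4} s1-s4;
  array c{4} c1-c4;
  pi = constant('pi');
  do j = 1 to 4;
    s{j} = sin(2*pi*j*t/52);   /* S(j) */
    c{j} = cos(2*pi*j*t/52);   /* C(j) */
  end;
  drop j pi;
run;
```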

DemoFlu_GLM.sas

(2) MIXED analysis on logits. Random harmonics. Normality assumed.

  PROC MIXED DATA=FLU method=ml;
    ** reduced model;
    class fluseasn;
    model logit = s1 c1 / outp=outp outpm=outpm ddfm=kr;
    random intercept / subject=fluseasn;
    random s1 c1 / subject=fluseasn type=toep(1);
    random s2 c2 / subject=fluseasn type=toep(1);
    random s3 c3 / subject=fluseasn type=toep(1);
    random s4 c4 / subject=fluseasn type=toep(1);
  run;

[Plots: logit scale; probability scale]

DemoFlu_MIXED.sas

(3) GLIMMIX analysis. Random harmonics. Binomial assumed (overdispersed: lab effects?)

  PROC GLIMMIX DATA=FLU;
    title2 "GLIMMIX Analysis";
    class fluseasn;
    model pos/specimens = s1 c1 ; * s2 c2 s3 c3 s4 c4;
    random intercept / subject=fluseasn;
    random s1 c1 / subject=fluseasn type=toep(1);
    random s2 c2 / subject=fluseasn; ** Toep(1) - no converge;
    random s3 c3 / subject=fluseasn type=toep(1);
    random s4 c4 / subject=fluseasn type=toep(1);
    random _residual_;
    output out=out2 pred(ilink blup)=pblup pred(ilink noblup)=overall
           pearson=p_resid;
  run;

[Plot: mean, no BLUPs]
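For reference, a sketch of what `random _residual_` adds to a binomial model: a multiplicative overdispersion parameter in the conditional variance. The symbols $\phi$, $n_i$, $\pi_i$ and $u$ are notation assumed here, not taken from the slides:

```latex
\[
\operatorname{Var}(Y_i \mid u) = \phi\, n_i\, \pi_i (1-\pi_i),
\qquad
\phi = 1 \text{ for a pure binomial},\quad
\phi > 1 \text{ under overdispersion}.
\]
```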

DemoFlu_GLIMMIX.sas

Flu data
Binomial: random _residual_ does not affect the fit (just the standard errors).
Could try a Beta distribution instead:

  PROC GLIMMIX DATA=FLU;
    title2 "GLIMMIX Analysis";
    class fluseasn;
    model f = s1 c1 / dist=beta link=logit s;
    random intercept / subject=fluseasn;
    random s1 c1 / subject=fluseasn type=toep(1);
    random s2 c2 / subject=fluseasn type=toep(1);
    random s3 c3 / subject=fluseasn type=toep(1);
    random s4 c4 / subject=fluseasn type=toep(1);
    output out=out3 pred(ilink blup)=pblup pred(ilink noblup)=overall
           pearson=p_residbeta;
  run;

Horseshoe Crab study (reference: SAS GLIMMIX course notes)
Female nests have “satellite” males.
Count data: Poisson? (→ generalized linear)
Features (predictors): carapace width, weight, color, spine condition
Random effect: site (→ mixed model)

DemoGet_Horseshoe.sas

  proc glimmix data=crab;
    class site;
    model satellites = weight width / dist=poi solution ddfm=kr;
    random int / subject=site;
    output out=overdisp pearson=pearson;
  run;

  proc means data=overdisp n mean var;
    var pearson;
  run;

  proc univariate data=crab normal plot;
    var satellites;
  run;

[PROC UNIVARIATE histogram and boxplot of satellites: bin midpoints run from 0.5 to 15.5, with by far the tallest bar at 0.5; * may represent up to 2 counts]

Fit Statistics: Gener. Chi-Square / DF = 2.77
Also shown: PROC MEANS output (N, Mean, Variance) for the Pearson residuals, the covariance parameter estimate for Intercept (subject = site), and fixed-effect estimates with Pr > |t| for Intercept, weight and width.

Zero inflated?
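As a rough guide to reading the Gener. Chi-Square / DF statistic: for a Poisson model it is built from Pearson residuals, and a ratio near 1 is what the Poisson variance assumption predicts. This is a simplified sketch (it ignores the pseudo-likelihood details of GLIMMIX); $\hat\mu_i$ denotes the fitted mean for observation $i$:

```latex
\[
r_i = \frac{y_i - \hat\mu_i}{\sqrt{\hat\mu_i}},
\qquad
\frac{X^2}{\mathrm{DF}} = \frac{1}{\mathrm{DF}}\sum_i r_i^2 \approx 1
\quad\text{if } \operatorname{Var}(Y_i) = \mu_i .
\]
```

A value of 2.77 is well above 1, which suggests more variation than a Poisson model allows; excess zeros are one possible source.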

DemoCrabs_OVERDISP.sas

Zero Inflated Poisson (ZIP)

Zero Inflated Poisson (ZIP)

  proc nlmixed data=crab;
    parms b0=0 bwidth=0 bweight=0 c0=-2 c1=0 s2u1=1 s2u2=1;
    /* zero-inflation probability on the logit scale */
    x = c0 + c1*width + u1;
    p0 = exp(x)/(1+exp(x));
    /* Poisson mean on the log scale */
    eta = b0 + bwidth*width + bweight*weight + u2;
    lambda = exp(eta);
    if satellites=0 then loglike = log(p0 + (1-p0)*exp(-lambda));
    else loglike = log(1-p0) + satellites*log(lambda) - lambda
                   - lgamma(satellites+1);
    expected = (1-p0)*lambda;
    id p0 expected lambda;
    model satellites ~ general(loglike);
    random U1 U2 ~ N([0,0],[s2u1,0,s2u2]) subject=site;
    predict p0+(1-p0)*exp(-lambda) out=out1;
  run;
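The log-likelihood coded in this NLMIXED step corresponds to the following zero-inflated Poisson model, written with the same names as the code ($p_0$, $\lambda$, $c_0$, $c_1$, $b_0$, $u_1$, $u_2$); the subscripted $b$ symbols stand for `bwidth` and `bweight`:

```latex
\[
\Pr\{Y=0\} = p_0 + (1-p_0)\,e^{-\lambda},
\qquad
\Pr\{Y=j\} = (1-p_0)\,\frac{e^{-\lambda}\lambda^{j}}{j!},\quad j = 1,2,\dots
\]
\[
\ln\frac{p_0}{1-p_0} = c_0 + c_1\,\mathrm{width} + u_1,
\qquad
\ln\lambda = b_0 + b_{\mathrm{width}}\,\mathrm{width} + b_{\mathrm{weight}}\,\mathrm{weight} + u_2,
\qquad
E\{Y\} = (1-p_0)\,\lambda .
\]
```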

Zero Inflated Poisson (ZIP)
Parameter Estimates
Parameter    Estimate    t    Pr>|t|    Lower    Upper
b0
bwidth
bweight
c0
c1
s2u1
s2u2

DemoCrabs_ZIP.sas

From fixed part of model, compute Pr{count=j} and plot (3D) versus Weight, Carapace width
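A sketch of how such a grid of probabilities could be computed with the random effects set to 0. Every numeric value below is a placeholder chosen for illustration, not a fitted estimate from the talk, and the data set name `zip_probs` and the grid ranges are likewise assumptions. The plotting step assumes SAS/GRAPH is available.

```sas
/* Sketch only: placeholder parameter values, NOT the NLMIXED estimates. */
data zip_probs;
  b0 = 0; bwidth = 0.05; bweight = 0.2;   /* hypothetical Poisson part   */
  c0 = -2; c1 = 0.05;                     /* hypothetical inflation part */
  do width = 20 to 34 by 1;               /* assumed plotting range */
    do weight = 1 to 5 by 0.5;            /* assumed plotting range */
      p0     = 1/(1 + exp(-(c0 + c1*width)));            /* u1 set to 0 */
      lambda = exp(b0 + bwidth*width + bweight*weight);  /* u2 set to 0 */
      do j = 0 to 5;
        /* ZIP probability: extra zeros plus the Poisson part */
        pr = (j = 0)*p0 + (1 - p0)*pdf('POISSON', j, lambda);
        output;
      end;
    end;
  end;
run;

/* One surface per count; here Pr{count = 0} */
proc g3d data=zip_probs(where=(j=0));
  plot width*weight=pr;
run;
```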

Another possibility: Negative binomial
Number of failures until the k-th success (p = Prob{success})
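For reference, the standard probability function for this definition, with $Y$ the number of failures before the $k$-th success:

```latex
\[
\Pr\{Y=y\} = \binom{y+k-1}{y}\,p^{k}(1-p)^{y},\quad y = 0,1,2,\dots
\qquad
E\{Y\} = \frac{k(1-p)}{p},
\qquad
\operatorname{Var}(Y) = \frac{k(1-p)}{p^{2}} .
\]
```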

[Figure: negative binomial illustration (3 trials before first success); negative binomial for satellites]

Negative binomial: in SAS, k is our 1/k

  proc glimmix data=crab;
    class site;
    model satellites = weight width / dist=nb solution ddfm=kr;
    random int / subject=site;
  run;

Fit Statistics: Gener. Chi-Square / DF = 1.03
Also shown: -2 Res Log Pseudo-Likelihood and Generalized Chi-Square; covariance parameter estimates with standard errors for Intercept (subject = site) and Scale; fixed-effect estimates (Estimate, Standard Error, DF, t Value, Pr > |t|) for Intercept, weight and width; and F tests (Num DF, Den DF, F Value, Pr > F) for weight and width.
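The remark about k can be checked from the mean and variance of the failures-until-the-$k$-th-success distribution. With $\mu = k(1-p)/p$, and writing $k_{\mathrm{SAS}}$ (a label assumed here) for the scale parameter in the GLIMMIX variance function $\mu + k_{\mathrm{SAS}}\mu^{2}$:

```latex
\[
\operatorname{Var}(Y) = \frac{k(1-p)}{p^{2}} = \frac{\mu}{p}
= \mu\left(1 + \frac{\mu}{k}\right) = \mu + \frac{1}{k}\,\mu^{2}
\quad\Longrightarrow\quad
k_{\mathrm{SAS}} = \frac{1}{k},
\]
```

using $1/p = 1 + \mu/k$. As $k_{\mathrm{SAS}} \to 0$ the variance approaches the Poisson variance $\mu$.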

DemoCrabs_NEGBIN.sas

Population average model vs. individual specific model
8 typists, Y = error counts (Poisson distributed)
$\ln(\lambda_i) = \ln(\text{mean of Poisson}) = \mu + U_i$ for typist i, conditionally (individual specific)
Distributions for Y, with $U \sim N(0,1)$ and $\mu = 1$
$\lambda = e^{\mu} = e^{1} = 2.7183$ = mean for “typical” typist

Population average model
Expectation of the individual distributions, averaged across the population of all typists.
Run the same simulation for 8000 typists and compute the mean of the conditional population means, $\exp(\mu + U)$.
[PROC MEANS output for lambda: N, Mean, Std Dev, Std Error]
The Z statistic shows (!!) that the population mean is not $e^{\mu}$.
The conditional means $\lambda = \exp(\mu + U)$ are lognormal: $\log(\lambda) \sim N(1,1)$ → $E\{\lambda\} = \exp(\mu + 0.5\sigma^{2}) = e^{1.5} = 4.4817$
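A minimal sketch of the kind of simulation described here, not the author's demo program. The seed, the data set name `typists`, and the variable names are assumptions; $\mu = 1$ and $U \sim N(0,1)$ follow the slides.

```sas
/* Sketch: simulate 8000 typists with log(lambda) = mu + U, U ~ N(0,1). */
data typists;
  call streaminit(2010);            /* arbitrary seed (assumption) */
  mu = 1;
  do typist = 1 to 8000;
    U = rand('NORMAL', 0, 1);       /* typist-specific random effect   */
    lambda = exp(mu + U);           /* conditional mean for the typist */
    Y = rand('POISSON', lambda);    /* simulated error count           */
    output;
  end;
run;

/* Mean of the conditional means; compare with exp(1) and exp(1.5) */
proc means data=typists n mean std stderr;
  var lambda;
run;
```

The sample mean of `lambda` should land near $e^{1.5}$ rather than $e^{1}$, which is the point of the slide.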

DemoTypists.sas

Main points:
1. Generalized linear models with random effects are subject specific models.
2. Subject specific models have fixed effects that represent an individual with random effects 0 (an individual at the mean of the random effect distribution).
3. Subject specific models, when averaged over the subjects, do not give the model fixed effects.
4. Models with only fixed effects do give the fixed effect part of the model when averaged over subjects, and are thus called population average models.
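Point 3 can be verified directly for the typist example. Averaging the conditional Poisson mean over $U \sim N(0,\sigma^{2})$ uses the normal moment generating function; the numbers are for $\mu = 1$, $\sigma^{2} = 1$:

```latex
\[
E\{Y\} = E\{\,E\{Y \mid U\}\,\} = E\{e^{\mu+U}\}
= e^{\mu}\,E\{e^{U}\} = e^{\mu+\sigma^{2}/2},
\]
\[
\mu = 1,\ \sigma^{2} = 1:\qquad
E\{Y\} = e^{1.5} \approx 4.4817 \;\neq\; e^{1} \approx 2.7183 .
\]
```

So the population average mean exceeds the mean for the typical typist (the one with $U = 0$) by the factor $e^{\sigma^{2}/2}$.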