Analysis of Ordinal Repeated Categorical Response Data Using a Marginal Model (Maximum Likelihood Approach)
by Abdul Salam
Instructor: K.C. Carriere
Stat 562

Contents:
Introduction
Background of the data
Objectives of the study
Basic theory
Marginal model
Model fitting using ML
SAS code
Results
Conclusion

Introduction
Definition: categorical data; repeated categorical data
Advantages and disadvantages of repeated measurements designs

Definition: Categorical data
Categorical data fall into a small number of discrete categories (as opposed to continuous data). Categorical data are either unordered (nominal), such as gender or city, or ordered (ordinal), such as high, medium, or low temperature.

Definition (cont.): Repeated categorical data
The term "repeated measurements" refers broadly to data in which the response of each experimental unit or subject is observed on multiple occasions or under multiple conditions. When the response is categorical, the data are called repeated categorical data.

Definition (cont.): Applications of repeated categorical data
Repeated categorical response data occur commonly in health-related applications, especially in longitudinal studies. For example, a physician might evaluate patients at weekly intervals regarding whether a new drug treatment is successful. In some cases the explanatory variables also vary over time.

Advantages of Repeated Measurements Designs
They reveal individual patterns of change.
They provide more efficient estimates of the relevant parameters than cross-sectional designs with the same number and pattern of measurements.
Between-subjects sources of variability can be excluded from the experimental error.

Disadvantages of Repeated Measurements Designs
The analysis of repeated data is complicated by the dependence among repeated observations made on the same experimental unit.
Often the investigator cannot control the circumstances under which measurements are obtained, so the data may be unbalanced or partially incomplete.

Background of the Insomnia Data
A randomized, double-blind clinical trial was performed to compare an active hypnotic drug with a placebo in patients with insomnia. The outcome variable is the patient's response to the question "How quickly did you fall asleep after going to bed?", measured in four categories (<20 minutes, 20-30 minutes, 30-60 minutes, and >60 minutes). Patients were asked this question before and after a two-week treatment period.

Background of the Insomnia Data (cont.)
Patients were randomly assigned to one of two treatments, active or placebo. The two treatments form a binary explanatory variable, and the patients receiving the two treatments are independent samples.

Table 1: Time to Falling Asleep, by Treatment and Occasion (n = 239)

                              Follow-up
Treatment   Initial     <20   20-30   30-60   >60
Active      <20           7       4       1     0
            20-30        11       5       2     2
            30-60        13      23       3     1
            >60           9      17      13     8
Placebo     <20           7       4       2     1
            20-30        14       5       1     0
            30-60         6       9      18     2
            >60           4      11      14    22

Objectives
To study the effect of time on the response.
To study the effect of treatment on the response: is time to fall asleep shorter under the active treatment than under placebo?
To study whether there is an interaction between treatment and time: how does the effect of treatment on time to fall asleep change over time?

Pharmaceutical Company Interest
The company hopes that patients on the active treatment show a significantly higher rate of improvement than patients on placebo.

Generalized linear models for the analysis of repeated measurements designs:
Marginal models;
Random effects models;
Transition models.

Basic Theory

GLMs for an ordinal response
Generalized linear models (GLMs) were first introduced by Nelder and Wedderburn (1972). They extend classical linear models for independent, normally distributed random variables with constant variance, and can also handle models for rates and proportions, binary, ordinal, multinomial, and count response variables. Extensions of GLM methodology for the analysis of repeated measurements accommodate discrete or continuous, time-independent or time-dependent covariates.
A GLM has three components:
a random component, which identifies the response variable Y and its probability distribution;
a systematic component, which specifies the explanatory variables used in a linear predictor function;
a link function, which specifies the functional relationship between the systematic component and E(Y).

Random Component
The random component of a GLM consists of a response variable Y with independent observations (y1, y2, …, yN) from a distribution in the natural exponential family; several important distributions are special cases, including the Poisson and the binomial. With repeated measurements on the same individual, however, the responses are not independent: the within-subject covariances tend to be nonzero.
Since the response is ordinal, it is often advantageous to construct logits that account for the category ordering and are less affected by the number and choice of response categories. These are based on the cumulative response probabilities. For an ordinal response with c + 1 ordered categories labeled 0, 1, 2, …, c for each individual or experimental unit, the cumulative response probabilities are
P(Y ≤ j) = π0 + π1 + … + πj, j = 0, 1, …, c,
from which the cumulative logits are defined.
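For the insomnia response (c + 1 = 4 categories), the cumulative probabilities written out are:
\[
P(Y \le j) = \pi_0 + \cdots + \pi_j, \qquad j = 0, 1, 2, 3,
\]
\[
P(Y \le 0) \le P(Y \le 1) \le P(Y \le 2) \le P(Y \le 3) = 1.
\]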

Systematic Component
The systematic component of the generalized linear model specifies the explanatory variables. The linear combination of these explanatory variables is called the linear predictor, denoted by η = β'x. The vector β characterizes how the cross-sectional response distribution depends on the explanatory variables.

Link Function
The link function describes the relationship between the random and systematic components, i.e., how E(Y) relates to the explanatory variables in the linear predictor. For an ordinal response with c + 1 categories, one might use the cumulative logits
logit_j = logit[P(Y ≤ j)], j = 0, 1, …, c − 1.

Link Function (cont.)
For the ordinal response, if the cumulative logit link is used, then for j = 0, …, c − 1,
logit[P(Y ≤ j)] = αj + βj'x.
Each cumulative logit uses all c + 1 response categories. A model for logit[P(Y ≤ j)] is similar to an ordinary logit model for a binary response in which categories 0 to j form the first category and categories j + 1 to c form the second.
If the model simplifies so that βj = β for every j, the same effect holds for each logit; this is the proportional odds model:
logit[P(Y ≤ j)] = αj + β'x, j = 0, 1, …, c − 1.
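To make the cumulative logit link concrete, here is a minimal, hypothetical SAS sketch (the dataset demo and its values are invented for illustration and are unrelated to the insomnia study). PROC LOGISTIC fits cumulative logits with the proportional odds assumption by default when the response has more than two ordered levels.

/* Hypothetical illustration of a proportional odds fit:
   y = ordinal response (0-3), x = binary covariate */
data demo;
  input y x @@;
  datalines;
0 0  1 0  1 0  2 0  3 0
0 1  0 1  1 1  2 1  3 1
;
run;

proc logistic data=demo;
  /* fits logit P(Y <= j) = alpha_j + beta*x, one alpha per cutpoint,
     a single beta shared across cutpoints (proportional odds) */
  model y = x;
run;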

Link Function (cont.)
Under the proportional odds model, for individuals with covariate vectors x* and x, the odds ratio for response at or below category j is
[P(Y ≤ j | x*) / P(Y > j | x*)] / [P(Y ≤ j | x) / P(Y > j | x)] = exp{β'(x* − x)}.
The odds ratio does not depend on the response category j. Taking logs, a regression coefficient gives the difference in the logit (log odds) of the response per unit change in x.
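For example, with a single binary covariate (x* = 1 vs. x = 0), the cumulative odds ratio reduces to exp(β), the same value at every cutpoint j:
\[
\frac{P(Y \le j \mid x^* = 1)\,/\,P(Y > j \mid x^* = 1)}
     {P(Y \le j \mid x = 0)\,/\,P(Y > j \mid x = 0)}
 = \exp\{\beta(1-0)\} = e^{\beta}.
\]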

Maximum Likelihood Method (ML)
The standard approach to maximum likelihood (ML) fitting of marginal models involves solving the score equations using the Newton-Raphson method, Fisher scoring, or some other iteratively reweighted least squares algorithm. ML fitting of marginal logit models is awkward: for T observations on an I-category response, at each setting of the predictors the likelihood refers to the I^T multinomial joint probabilities, whereas the model applies to the T sets of marginal multinomial parameters, which are functions of those joint probabilities rather than natural parameters of the likelihood.

ML: Model Specification
Consider T categorical responses, where the t-th variable has I_t categories; the responses are ordinal and are observed for P covariate patterns defined by a set of explanatory variables. Let r = I_1 × I_2 × … × I_T denote the number of response profiles for each covariate pattern. The vector of counts for covariate pattern p is denoted by Y_p. The Y_p are assumed to be independent multinomial random vectors,
Y_p ~ multinomial(n_p, π_p), p = 1, …, P.

ML: Model Specification (cont.)
Here π_p is a vector of positive probabilities satisfying 1_r' π_p = 1, where 1_r is an r-dimensional vector of 1's. Since the model applies to the T sets of marginal multinomial parameters, the marginal model can be written as a generalized linear model with the link function
C log(Aπ) = Xβ (Lang and Agresti 1994),
where π denotes the complete set of multinomial joint probabilities over all settings of the predictors and β is a vector of p0 parameters. The matrix A, whose elements are nonnegative, applied to π forms the T marginal probabilities and their complements at each setting of the predictors. The matrix C applied to the log marginal probabilities forms the T marginal logits for each setting; each row of C has a 1 in the position multiplied by the log numerator probability for a given marginal logit, a −1 in the position multiplied by the log denominator probability, and 0's elsewhere.
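To make A and C concrete, consider the simplest case of T = 2 binary responses at a single predictor setting, with joint probabilities π = (π11, π12, π21, π22)'. An illustrative construction consistent with the description above (not the only valid one) is:
\[
A = \begin{pmatrix}
1 & 1 & 0 & 0\\   % P(Y_1 = 1)
0 & 0 & 1 & 1\\   % P(Y_1 = 2)
1 & 0 & 1 & 0\\   % P(Y_2 = 1)
0 & 1 & 0 & 1     % P(Y_2 = 2)
\end{pmatrix},
\qquad
C = \begin{pmatrix}
1 & -1 & 0 & 0\\  % logit P(Y_1 = 1)
0 & 0 & 1 & -1    % logit P(Y_2 = 1)
\end{pmatrix},
\]
\[
C \log(A\pi) =
\begin{pmatrix}
\operatorname{logit} P(Y_1 = 1)\\
\operatorname{logit} P(Y_2 = 1)
\end{pmatrix}.
\]
Here A stacks each marginal probability with its complement, and each row of C takes log numerator minus log denominator to form one marginal logit.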

ML Fitting of Marginal Models
Lang and Agresti (1994) considered the likelihood as a function of π rather than β. The likelihood function for a marginal logit model is the product of the multinomial mass functions from the various predictor settings. One approach to ML fitting views the model as a set of constraints and uses methods for maximizing a function subject to constraints. Let U denote a full-column-rank matrix whose columns are orthogonal to the columns of X; then the model has the equivalent constraint form
U'C log(Aπ) = 0.

ML Fitting of Marginal Models (cont.)
The method of maximizing the likelihood incorporates these model constraints as well as identifiability constraints, which force the response probabilities at each predictor setting to sum to 1. The method introduces Lagrange multipliers corresponding to these constraints and solves the Lagrangian likelihood equations using a Newton-Raphson algorithm (Aitchison and Silvey 1958; Haber 1985). Let ψ be the vector whose elements are π and the Lagrange multipliers λ. The Lagrangian likelihood equations have the form g(ψ) = 0, where g collects the log-likelihood derivatives together with terms involving the constraints on the marginal logits that the model specifies. The Newton-Raphson iterative scheme is
ψ^(s+1) = ψ^(s) − [H(ψ^(s))]^{−1} g(ψ^(s)),
where H is the derivative matrix of g.

ML Fitting of Marginal Models (cont.)
After the algorithm converges and the fitted values are obtained, the model parameter estimates are calculated using
β̂ = (X'X)^{−1} X'C log(Aπ̂).
A drawback of this algorithm is that the derivative matrix is typically very large (its numbers of rows and columns exceed the number of cells in the contingency table) and has no simple form, making inversion difficult. Lang and Agresti (1994) use an asymptotic approximation to a reparameterized derivative matrix that has a much simpler form, requiring the inversion of only a diagonal matrix and a symmetric positive definite matrix. This maximum likelihood fitting method makes no assumption about the model that describes the joint distribution; thus, when the marginal model holds, the ML estimates are consistent regardless of the dependence structure of that distribution.

Inference
Hypothesis testing for parameters: after obtaining the model parameter estimates and their estimated covariance matrix, one can apply standard methods of inference, for instance a Wald chi-squared test of marginal homogeneity.
Goodness-of-fit testing: to assess model goodness of fit, one compares observed and fitted cell counts using the likelihood-ratio statistic G2 or the Pearson chi-squared statistic X2. For nonsparse tables, assuming the model holds, these statistics have approximate chi-squared distributions with degrees of freedom equal to the number of constraints implied by the model.
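Written out, with n_i the observed and μ̂_i the fitted count in cell i, these statistics are:
\[
G^2 = 2 \sum_i n_i \log\!\frac{n_i}{\hat{\mu}_i},
\qquad
X^2 = \sum_i \frac{(n_i - \hat{\mu}_i)^2}{\hat{\mu}_i}.
\]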

Limitations of ML
The number of multinomial probabilities increases dramatically as the number of predictors increases; ML approaches are not practical when T is large or there are many predictors, especially when some are continuous.
The method makes no assumption about the model describing the joint distribution, so it provides no description of that joint dependence structure.

Results
Table 2: Sample Marginal Proportions for the Insomnia Data

                            Time to Falling Asleep
Treatment   Occasion      <20    20-30   30-60    >60
Active      Initial      0.101   0.168   0.336   0.395
            Follow-up    0.336   0.412   0.160   0.092
Placebo     Initial      0.117   0.167   0.292   0.425
            Follow-up    0.258   0.242   0.292   0.208

Figure 1: Sample marginal proportions for the insomnia data.

Marginal Proportions
The sample proportion of subjects falling asleep in <20 minutes among those who received the active treatment at the initial occasion is
(7 + 4 + 1 + 0) / (7 + 4 + 1 + 0 + 11 + … + 13 + 8) = 12/119 = 0.1008.
Similarly, the sample proportion falling asleep in >60 minutes among those who received placebo at follow-up is
(1 + 0 + 2 + 22) / (7 + 4 + 2 + 1 + … + 14 + 22) = 25/120 = 0.2083.
And so on.
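As a cross-check, these proportions can be reproduced in SAS. A minimal sketch, assuming the isomnia dataset created on the SAS code slide below; the row percentages within each treatment reproduce Table 2 (the 1E-8 replacement for zero counts is negligible here):

proc freq data=isomnia order=data;
  weight count;
  /* row percentages within treatment = marginal proportions
     at the initial and follow-up occasions */
  tables treatment*initial treatment*follow / nocol nopercent;
run;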

What do the marginal proportions show?
From the initial to the follow-up occasion, time to falling asleep appears to shift downward under both treatments. The shift appears greater for the active treatment than for placebo, suggesting a possible interaction: the effect of treatment on the response differs across occasions.

Fitted Marginal Model
Let x represent the treatment, with x = 1 for the active treatment and x = 0 for placebo. Let t denote the occasion of measurement, with t = 0 for initial and t = 1 for follow-up. Let Y_t represent the outcome variable, the patient's response at time t to the question "How quickly did you fall asleep after going to bed?", with j = 0 for <20 minutes, j = 1 for 20-30 minutes, j = 2 for 30-60 minutes, and j = 3 for >60 minutes. The marginal model with the cumulative logit link can then be written for our data set as
logit[P(Y_t ≤ j)] = αj + β1 t + β2 x + β3 (t × x), j = 0, 1, 2,
where the αj are the intercepts (cutpoints) and β1, β2, and β3 are the coefficients of occasion, treatment, and their interaction, respectively.

SAS code
data isomnia;
  input treatment $ initial $ follow $ count @@;
  /* PROC CATMOD drops zero-count cells, so keep them in the
     analysis by replacing zeros with a tiny positive count */
  if count = 0 then count = 1E-8;
datalines;
active <20   <20 7    active <20   20-30 4    active <20   30-60 1    active <20   >60 0
active 20-30 <20 11   active 20-30 20-30 5    active 20-30 30-60 2    active 20-30 >60 2
active 30-60 <20 13   active 30-60 20-30 23   active 30-60 30-60 3    active 30-60 >60 1
active >60   <20 9    active >60   20-30 17   active >60   30-60 13   active >60   >60 8
placbo <20   <20 7    placbo <20   20-30 4    placbo <20   30-60 2    placbo <20   >60 1
placbo 20-30 <20 14   placbo 20-30 20-30 5    placbo 20-30 30-60 1    placbo 20-30 >60 0
placbo 30-60 <20 6    placbo 30-60 20-30 9    placbo 30-60 30-60 18   placbo 30-60 >60 2
placbo >60   <20 4    placbo >60   20-30 11   placbo >60   30-60 14   placbo >60   >60 22
;

SAS code (cont.)
proc catmod order=data data=isomnia;
  weight count;
  population treatment;
  response clogit;
  model initial*follow =
    ( 1 0 0 1 1 1,   /* active,  follow-up: cutpoint 1 + treatment + time + interaction */
      0 1 0 1 1 1,   /* active,  follow-up: cutpoint 2 + treatment + time + interaction */
      0 0 1 1 1 1,   /* active,  follow-up: cutpoint 3 + treatment + time + interaction */
      1 0 0 1 0 0,   /* active,  initial:   cutpoint 1 + treatment */
      0 1 0 1 0 0,   /* active,  initial:   cutpoint 2 + treatment */
      0 0 1 1 0 0,   /* active,  initial:   cutpoint 3 + treatment */
      1 0 0 0 1 0,   /* placebo, follow-up: cutpoint 1 + time */
      0 1 0 0 1 0,   /* placebo, follow-up: cutpoint 2 + time */
      0 0 1 0 1 0,   /* placebo, follow-up: cutpoint 3 + time */
      1 0 0 0 0 0,   /* placebo, initial:   cutpoint 1 */
      0 1 0 0 0 0,   /* placebo, initial:   cutpoint 2 */
      0 0 1 0 0 0 )  /* placebo, initial:   cutpoint 3 */
    (1 2 3='Cutpoint', 4='Treatment', 5='Time effect', 6='Time*Treatment effect')
    / freq;
quit;

Fitted Marginal Model (cont.)
Fitting the marginal model to the above marginal distributions by maximum likelihood gives
logit[P̂(Y_t ≤ j)] = α̂j + 1.074 t + 0.046 x + 0.662 (t × x),
with estimated cutpoints α̂1 = −1.16, α̂2 = 0.10, and α̂3 = 1.37.

Hypothesis Tests for the Estimates
Occasion: β̂1 = 1.074, SE(β̂1) = 0.162, p-value < 0.0001
Treatment: β̂2 = 0.046, SE(β̂2) = 0.236, p-value = 0.84
Interaction (Occasion × Treatment): β̂3 = 0.662, SE(β̂3) = 0.244, p-value = 0.0067
The p-value for occasion indicates that time has a significant effect on the response. The p-value for treatment indicates that the treatment effect is not significant at the initial occasion. The p-value for the interaction indicates that the effect of treatment on the response differs significantly over time.
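These p-values correspond to Wald statistics z = β̂ / SE(β̂) referred to the standard normal distribution; for example:
\[
z_{\text{occasion}} = \frac{1.074}{0.162} \approx 6.63 \;(p < 0.0001),
\qquad
z_{\text{interaction}} = \frac{0.662}{0.244} \approx 2.71 \;(p \approx 0.0067).
\]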

Model Goodness-of-Fit Test
The likelihood-ratio statistic G2 was used to test goodness of fit. Comparing the observed to the fitted cell counts for the ML fit, which models the 12 marginal logits using the six parameters above, gives G2 = 8.0 on df = 6 with p-value 0.238, indicating that the model fits the data adequately.
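The quoted p-value can be verified with SAS's chi-squared distribution function:

data _null_;
  /* upper-tail probability of a chi-squared(6) variate at 8.0 */
  p = 1 - probchi(8.0, 6);
  put p=;  /* prints roughly 0.238 */
run;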

Interpretation of Parameters
Effect of treatment (active vs. placebo):
1. At the initial observation, the estimated odds that time to falling asleep under the active treatment is below any fixed level equal exp{0.046} = 1.05 times the estimated odds under placebo.
2. At the follow-up observation, the estimated odds that time to falling asleep under the active treatment is below any fixed level equal exp{0.046 + 0.662} = 2.03 times the estimated odds under placebo.
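The two odds ratios are obtained by exponentiating the relevant coefficient sums:
\[
e^{0.046} \approx 1.05,
\qquad
e^{0.046 + 0.662} = e^{0.708} \approx 2.03 .
\]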

Interpretation of Parameters (cont.)
The occasion slope for the active treatment is β̂3 = 0.662 (SE = 0.244) higher than for the placebo, giving strong evidence of faster improvement. In other words, the two treatments had similar effects initially, but at follow-up patients on the active treatment tended to fall asleep more quickly.

Conclusion
Using the maximum likelihood method for the marginal model of the insomnia data, we have sufficient evidence to conclude that time to falling asleep decreases over the treatment period and that the improvement is substantially greater under the active treatment than under placebo (a significant time-by-treatment interaction).

Thank You For Your Attention