
1 Instructor: K.C. Carriere
Analysis of Ordinal Repeated Categorical Response Data Using a Marginal Model (Maximum Likelihood Approach)
by Abdul Salam
Instructor: K.C. Carriere
Stat 562

2 Contents: Introduction Background of data Objective of the study
Basic theory Marginal model Model fitting using ML SAS Codes Results Conclusion

3 Introduction Definition Categorical data Repeated categorical data
Advantages and Disadvantages of repeated Measurements Designs

4 Definition Categorical data
Categorical data fits into a small number of discrete categories (as opposed to continuous). Categorical data is either non-ordered (nominal) such as gender or city, or ordered (ordinal) such as high, medium, or low temperatures.

5 Definition (cont-) Repeated categorical data
The term “repeated measurements” refers broadly to data in which the response of each experimental unit or subject is observed on multiple occasions or under multiple conditions. When the response is categorical then it is called repeated categorical data.

6 Definition (cont-) Application of Repeated categorical data
Repeated categorical response data occur commonly in health-related applications, especially in longitudinal studies. For example, a physician might evaluate patients at weekly intervals regarding whether a new drug treatment is successful. In some cases explanatory variables also vary over time.

7 Advantages of Repeated Measurements Designs
Repeated measurements capture individual patterns of change. They provide more efficient estimates of the relevant parameters than cross-sectional designs with the same number and pattern of measurements. Between-subject sources of variability can be excluded from the experimental error.

8 Disadvantages of Repeated Measurements Designs
Analysis of repeated data is complicated by the dependence among the repeated observations made on the same experimental unit. Often the investigator cannot control the circumstances under which measurements are obtained, so the data may be unbalanced or partially incomplete.

9 Background of Insomnia data
A randomized, double-blind clinical trial was performed comparing an active hypnotic drug with a placebo in patients who have insomnia. The outcome variable is the patient's response to the question, "How quickly did you fall asleep after going to bed?", measured using the categories <20 minutes, 20–30 minutes, 30–60 minutes, and >60 minutes. Patients were asked this question before and following a two-week treatment period.

10 Background of Insomnia data
Patients were randomly assigned to one of the two treatments active and placebo. The two treatments, active and placebo, form a binary explanatory variable. Patients receiving the two treatments were independent samples.

11 Table#1: Time to Falling Asleep, by Treatment and Occasion (n = 239).

                         Follow-up
Treatment  Initial      <20 min  20-30 min  30-60 min  >60 min
Active     <20 min         7         4          1         0
           20-30 min      11         5          2         2
           30-60 min      13        23          3         1
           >60 min         9        17         13         8
Placebo    <20 min         7         4          2         1
           20-30 min      14         5          1         0
           30-60 min       6         9         18         2
           >60 min         4        11         14        22

12 Objectives To study the effect of time on the response.
To study the effect of treatment on the response: is the time to fall asleep quicker for the active treatment than for the placebo? To study whether there is any interaction between treatment and time: how does the treatment affect the time to fall asleep over time?

13 Pharmaceutical Company Interest
The company hopes that patients receiving the active treatment show a significantly higher rate of improvement than patients receiving the placebo.

14 Generalized linear model approaches to the analysis of repeated measurements designs
Marginal models; random-effects models; transition models.

15 Basic Theory

16 GLMs for ordinal response.
Generalized linear models (GLMs) were first introduced by Nelder and Wedderburn (1972). They extend classical linear models for independent, normally distributed random variables with constant variance, and can also handle models for rates and proportions, binary, ordinal, multinomial, and count response variables. Extensions of GLM methodology for the analysis of repeated measurements accommodate discrete or continuous, time-independent or time-dependent covariates. GLMs have three components: a random component, which identifies the response variable Y and its probability distribution; a systematic component, which specifies the explanatory variables used in a linear predictor function; and a link function, which specifies the functional relationship between the systematic component and E(Y).

17 Random Component.
The random component of a GLM consists of a response variable Y with independent observations (y1, y2, …, yN) from a distribution in the natural exponential family. Several important distributions are special cases, including the Poisson and the binomial. Here, however, the repeated observations on the same individual are not independent: the within-subject covariance tends to be non-zero. Since the response is ordinal, it is often advantageous to construct logits that account for the category ordering and are less affected by the number or choice of response categories. These are built from the cumulative response probabilities, from which the cumulative logits are defined. For an ordinal response with c + 1 ordered categories labeled 0, 1, 2, …, c for each individual or experimental unit, the cumulative response probabilities are

F_j = P(Y ≤ j) = π_0 + π_1 + … + π_j,   j = 0, 1, …, c.
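As a quick sketch (the category probabilities below are made up for illustration, not taken from the insomnia data), the cumulative probabilities are just running sums of the category probabilities:

```python
from itertools import accumulate

# Illustrative category probabilities pi_0..pi_3 for an ordinal response with
# c + 1 = 4 categories; these numbers are made up, not from the insomnia data.
pi = [0.10, 0.17, 0.34, 0.39]

# Cumulative response probabilities F_j = P(Y <= j) = pi_0 + ... + pi_j
F = list(accumulate(pi))
print([round(f, 2) for f in F])  # the last cumulative probability is 1
```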

18 Systematic component. The systematic component of the generalized linear model specifies the explanatory variables. The linear combination of these explanatory variables is called the linear predictor, denoted by η = β1 x1 + β2 x2 + … + βp xp = x′β. The vector β characterizes how the cross-sectional response distribution depends on the explanatory variables.

19 Link Function. The link function specifies the relationship between the random and systematic components, that is, how E(Y) relates to the explanatory variables in the linear predictor. For an ordinal response having c + 1 categories, one might use the cumulative logits: logit_j = logit[P(Y ≤ j)], j = 1, …, c.

20 Link Function. Each cumulative logit uses all c + 1 response categories. A model for logit[P(Y ≤ j)] is similar to an ordinary logit model for a binary response, where categories 0 to j − 1 form the first category and categories j to c form the second. For the ordinal response, if the cumulative link function is used, then for j = 1, …, c,

logit[P(Y ≤ j)] = α_j + β_j′x.

When β_j simplifies to β, indicating the same effect for each logit, the GLM reduces to the proportional odds model:

logit[P(Y ≤ j)] = α_j + β′x.

21 Link Function. For individuals with covariate vectors x* and x, the odds ratio for the response below category j is

[P(Y ≤ j | x*)/P(Y > j | x*)] / [P(Y ≤ j | x)/P(Y > j | x)] = exp{β′(x* − x)}.

The odds ratio does not depend on the response category j. Taking logs gives the regression coefficient, which indicates the difference in the logit (log odds) of the response per unit change in x.

22 Maximum Likelihood Method (ML).
The standard approach to maximum likelihood (ML) fitting of marginal models involves solving the score equations using the Newton-Raphson method, Fisher scoring, or some other iteratively reweighted least squares algorithm. ML fitting of marginal logit models is awkward: for T observations on an I-category response, at each setting of the predictors the likelihood refers to I^T multinomial joint probabilities, while the model applies to T sets of marginal multinomial parameters, which are not independent.

23 ML: Model Specification.
Consider T categorical responses, where the t-th variable has I_t categories. The ordinal responses are observed for P covariate patterns, defined by a set of explanatory variables. Let r = I_1 × I_2 × … × I_T denote the number of response profiles for each covariate pattern. The vector of counts for covariate pattern p is denoted by Y_p. The Y_p are assumed to be independent multinomial random vectors.

24 ML: Model Specification.
Here π_p is a vector of positive probabilities summing to 1, and 1_r is an r-dimensional vector of 1's. Since the model applies to T sets of marginal multinomial parameters, the marginal model can be written as a generalized linear model with link function

C log(Aπ) = Xβ   (Lang and Agresti 1994).

Here π denotes the complete set of multinomial joint probabilities for all settings of the predictors. The matrix A applied to π forms the T marginal probabilities and their complements at each setting of the predictors; the elements of A are nonnegative. β is a vector of p_0 parameters. The matrix C applied to the log marginal probabilities forms the T marginal logits for each setting: each row of C has a 1 in the position multiplied by the log numerator probability for a given marginal logit, a −1 in the position multiplied by the log denominator probability, and 0's elsewhere.
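The C log(Aπ) construction can be made concrete in a tiny example with T = 2 binary responses (a 2×2 joint table); the joint probabilities below are made up for illustration:

```python
import math

# Joint probabilities for (Y1, Y2) in the order 00, 01, 10, 11 (illustrative).
pi = [0.3, 0.2, 0.1, 0.4]

# A forms the marginal probabilities and their complements at each occasion.
A = [[1, 1, 0, 0],   # P(Y1 = 0)
     [0, 0, 1, 1],   # P(Y1 = 1)
     [1, 0, 1, 0],   # P(Y2 = 0)
     [0, 1, 0, 1]]   # P(Y2 = 1)
m = [sum(a * p for a, p in zip(row, pi)) for row in A]

# Each row of C has +1 on the log numerator probability and -1 on the
# log denominator probability of one marginal logit.
C = [[-1, 1, 0, 0],  # logit P(Y1 = 1)
     [0, 0, -1, 1]]  # logit P(Y2 = 1)
logits = [sum(c * math.log(v) for c, v in zip(row, m)) for row in C]
```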

25 ML Fitting of marginal Models:
Lang and Agresti (1994) considered the likelihood as a function of π rather than β. The likelihood function for a marginal logit model is the product of the multinomial mass functions from the various predictor settings. One approach to ML fitting views the model as a set of constraints and uses methods for maximizing a function subject to constraints. Let U denote a full-column-rank matrix whose columns are orthogonal to the columns of X, so that U′X = 0; then the model has the equivalent constraint form

U′C log(Aπ) = 0.

26 ML Fitting of marginal Models:
The method of maximizing the likelihood incorporates these model constraints as well as identifiability constraints, which force the response probabilities at each predictor setting to sum to 1. The method introduces Lagrange multipliers corresponding to these constraints and solves the Lagrangian likelihood equations using a Newton-Raphson algorithm (Aitchison and Silvey 1958; Haber 1985). Let ζ be a vector having elements π and the Lagrange multipliers, and write the Lagrangian likelihood equations as h(ζ) = 0, where h is a vector with terms involving the constraints on the marginal logits that the model specifies as well as the log-likelihood derivatives. The Newton-Raphson iterative scheme is

ζ^(s+1) = ζ^(s) − [∂h(ζ^(s))/∂ζ′]^(−1) h(ζ^(s)).
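The flavor of the Newton-Raphson iteration can be shown on a much simpler likelihood equation; the sketch below finds the MLE of a binomial log-odds and is only a generic illustration, not the Lang-Agresti constrained algorithm:

```python
import math

def newton_raphson_logodds(y, n, theta=0.0, tol=1e-10, max_iter=50):
    """Solve the score equation y - n*p(theta) = 0 for the log-odds theta."""
    for _ in range(max_iter):
        p = 1 / (1 + math.exp(-theta))
        score = y - n * p          # first derivative of the log-likelihood
        info = n * p * (1 - p)     # information (negative second derivative)
        step = score / info
        theta += step              # Newton-Raphson update
        if abs(step) < tol:
            break
    return theta

# The closed-form answer is log(y / (n - y)); e.g. y = 30, n = 50 gives log(1.5).
theta_hat = newton_raphson_logodds(30, 50)
```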

27 ML Fitting of marginal Models:
After obtaining the fitted values on convergence of the algorithm, the model parameter estimates are calculated from the fitted marginal logits. A drawback of this algorithm is that the derivative matrix is typically very large (with the numbers of rows and columns exceeding the number of cells in the contingency table) and does not have a simple form, making inversion difficult. Lang and Agresti (1994) used an asymptotic approximation to a reparameterized derivative matrix that has a much simpler form, requiring the inversion of only a diagonal matrix and a symmetric positive definite matrix. This maximum likelihood fitting method makes no assumption about the model that describes the joint distribution; thus, when the marginal model holds, the ML estimates are consistent regardless of the dependence structure of that distribution.

28 Inference: Hypothesis testing for parameters:
After obtaining the model parameter estimates and the estimated covariance matrix, one can apply standard methods of inference, for instance Wald chi-squared tests (e.g., a test of marginal homogeneity). Goodness-of-fit test: to assess model goodness of fit, one can compare observed and fitted cell counts using the likelihood-ratio statistic G2 or the Pearson chi-squared statistic. For nonsparse tables, assuming that the model holds, these statistics have approximate chi-squared distributions with degrees of freedom equal to the number of constraints implied by the model.
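For the goodness-of-fit statistics, a short Python sketch with made-up observed and fitted counts (not the insomnia fit):

```python
import math

# Illustrative observed and ML-fitted cell counts.
obs = [25, 40, 20, 15]
fit = [22.0, 43.0, 21.0, 14.0]

# Likelihood-ratio statistic: G^2 = 2 * sum obs * log(obs / fitted)
G2 = 2 * sum(o * math.log(o / f) for o, f in zip(obs, fit))
# Pearson statistic: X^2 = sum (obs - fitted)^2 / fitted
X2 = sum((o - f) ** 2 / f for o, f in zip(obs, fit))
```

For a well-fitting model the two statistics are close, as here.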

29 Limitations of ML: The number of multinomial joint probabilities increases dramatically as the number of predictors increases. ML approaches are not practical when T is large or there are many predictors, especially when some are continuous. It makes no assumption about the model that describes the joint distribution.

30 Results: Table#2: Sample Marginal Proportions for Insomnia Data.

                        Time to falling asleep
Treatment  Occasion    <20 min  20-30 min  30-60 min  >60 min
Active     Initial      0.101     0.168      0.336     0.395
           Follow-up    0.336     0.412      0.160     0.092
Placebo    Initial      0.117     0.167      0.292     0.425
           Follow-up    0.258     0.242      0.292     0.208

31 Figure# 1: Sample Marginal Proportions Insomnia data.

32 Marginal Proportion The sample proportion of time to falling asleep in <20 minutes for subjects who received the active treatment at the initial occasion is (7 + 4 + 1 + 0) / (7 + 4 + … + 13 + 8) = 12/119 = 0.101. Similarly, the sample proportion of time to falling asleep in >60 minutes for subjects who received the placebo at follow-up is (1 + 0 + 2 + 22) / 120 = 25/120 = 0.208. And so on.
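This arithmetic can be checked with a few lines of Python, using the Table 1 counts (n = 119 active, n = 120 placebo):

```python
# Follow-up counts for active patients whose initial time was < 20 minutes
active_initial_lt20 = [7, 4, 1, 0]
prop_initial = sum(active_initial_lt20) / 119   # 12/119

# Initial-category counts for placebo patients with follow-up time > 60 minutes
placebo_follow_gt60 = [1, 0, 2, 22]
prop_follow = sum(placebo_follow_gt60) / 120    # 25/120
```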

33 What did you get from Marginal Proportion table?
From the initial to the follow-up occasion, the time to falling asleep seems to shift downward for both treatments. The degree of shift seems greater for the active treatment than for the placebo, indicating a possible interaction; equivalently, the effect of treatment on the response differs between occasions.

34 Fitted Marginal Model
Let x represent the treatment, with x = 1 for the active treatment and x = 0 for the placebo. Let t denote the occasion of measurement, with t = 0 for initial and t = 1 for follow-up. Let Y_t represent the outcome variable, the patient's response at time t to the question, "How quickly did you fall asleep after going to bed?", with j = 0 for <20 minutes, j = 1 for 20–30 minutes, j = 2 for 30–60 minutes, and j = 3 for >60 minutes. The marginal model with cumulative link can be written for our data set as

logit[P(Y_t ≤ j)] = α_j + β1 t + β2 x + β3 (t × x),

where the α_j's are the intercepts and β1, β2, and β3 are the coefficients of the occasion, treatment, and interaction terms respectively.

35 SAS code

data isomnia;
input treatment $ initial $ follow $ count @@;
if count=0 then count=1E-8;
datalines;
active  <20    <20    7  active  <20    20-30  4  active  <20    30-60  1  active  <20    >60    0
active  20-30  <20   11  active  20-30  20-30  5  active  20-30  30-60  2  active  20-30  >60    2
active  30-60  <20   13  active  30-60  20-30 23  active  30-60  30-60  3  active  30-60  >60    1
active  >60    <20    9  active  >60    20-30 17  active  >60    30-60 13  active  >60    >60    8
placebo <20    <20    7  placebo <20    20-30  4  placebo <20    30-60  2  placebo <20    >60    1
placebo 20-30  <20   14  placebo 20-30  20-30  5  placebo 20-30  30-60  1  placebo 20-30  >60    0
placebo 30-60  <20    6  placebo 30-60  20-30  9  placebo 30-60  30-60 18  placebo 30-60  >60    2
placebo >60    <20    4  placebo >60    20-30 11  placebo >60    30-60 14  placebo >60    >60   22
;
run;

36 SAS code (cont.)

proc catmod order=data data=isomnia;
weight count;
population treatment;
response clogits;
model initial*follow = (1 0 0 1 1 1,   /* active,  follow-up, j=1: a1+b1+b2+b3 */
                        0 1 0 1 1 1,   /* active,  follow-up, j=2 */
                        0 0 1 1 1 1,   /* active,  follow-up, j=3 */
                        1 0 0 1 0 0,   /* active,  initial,   j=1: a1+b1 */
                        0 1 0 1 0 0,   /* active,  initial,   j=2 */
                        0 0 1 1 0 0,   /* active,  initial,   j=3 */
                        1 0 0 0 1 0,   /* placebo, follow-up, j=1: a1+b2 */
                        0 1 0 0 1 0,   /* placebo, follow-up, j=2 */
                        0 0 1 0 1 0,   /* placebo, follow-up, j=3 */
                        1 0 0 0 0 0,   /* placebo, initial,   j=1: a1 */
                        0 1 0 0 0 0,   /* placebo, initial,   j=2 */
                        0 0 1 0 0 0)   /* placebo, initial,   j=3 */
      (1 2 3='Cutpoint', 4='Treatment', 5='Time effect', 6='Time*Treatment effect') / freq;
quit;

37 Fitted Marginal Model After fitting the marginal model by maximum likelihood to the above marginal distributions, we obtain the following results:

logit[P(Y_t ≤ j)] = α_j + 1.074 (Occasion) + 0.046 (Treatment) + 0.662 (Occasion × Treatment)

38 Hypothesis testing for estimators:
For Occasion: β1 = 1.074, SE(β1) = 0.162, p-value < 0.0001. For Treatment: β2 = 0.046, SE(β2) = 0.236, p-value = 0.84. For interaction (Occasion × Treatment): β3 = 0.662, SE(β3) = 0.244, p-value = 0.007. The p-value for occasion indicates that the effect of time on the response is significant for each treatment; the p-value for treatment indicates that the treatment effect is not significant at the initial occasion; and the p-value for the interaction indicates that the effect of treatment on the response differs significantly over time.

39 Model Goodness of fit test
The likelihood-ratio statistic G2 has been used to test goodness of fit. Comparing the observed to the ML-fitted cell counts in modeling the 12 marginal logits with the six model parameters gives G2 = 8.0 with df = 6 and p-value = 0.238, indicating that the model fits the given data set well.

40 Interpretation of Parameters
Effect of Treatment (Active vs Placebo): 1. At the initial observation: the estimated odds that the time to falling asleep for the active treatment is below any fixed level equal exp{0.046} = 1.05 times the estimated odds for the placebo treatment. 2. At the follow-up observation: the estimated odds that the time to falling asleep for the active treatment is below any fixed level equal exp{0.046 + 0.662} = exp{0.708} = 2.03 times the estimated odds for the placebo treatment.
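These odds ratios can be recomputed directly from the fitted coefficients (0.046 for treatment, 0.662 for the interaction):

```python
import math

# Treatment effect on the cumulative log-odds scale at each occasion
or_initial = math.exp(0.046)            # initial occasion: beta2 only
or_followup = math.exp(0.046 + 0.662)   # follow-up occasion: beta2 + beta3
```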

41 Interpretation of Parameters (cont.)
For the active treatment the slope is β3 = 0.662 (SE = 0.244) higher than for the placebo, giving strong evidence of faster improvement. In other words, the two treatments initially had similar effects, but at follow-up the patients on the active treatment tended to fall asleep more quickly.

42 Conclusion Using maximum likelihood methods for the marginal distributions of the given insomnia data set, we have sufficient evidence to conclude that time and the treatment-by-time interaction have substantial effects on the response (time to fall asleep): patients on both treatments fell asleep faster at follow-up, with greater improvement under the active treatment.

43 Thank You For Your Attention

