REGRESSION MODEL FITTING & IDENTIFICATION OF PROGNOSTIC FACTORS BISMA FAROOQI.


1 REGRESSION MODEL FITTING & IDENTIFICATION OF PROGNOSTIC FACTORS BISMA FAROOQI

2 Objective
- Focus on parametric regression models for survival data
- Build an understanding of the commonly used parametric regression methods
- Link the survival time of an individual to covariates using a specified probability distribution within the regression setting

3 What is Prognosis? Prognosis is the prediction of the future of an individual patient with respect to the duration, course, and outcome of a disease. It plays an important role in medical practice, but it is often difficult to sort out which characteristics of a patient (also called explanatory variables) are most closely related to it. A statistical analysis is therefore needed to prepare a compact summary of the data that reveals how these characteristics relate to the outcome.

4 PRELIMINARY EXAMINATION OF DATA The Categories of Dependent Variables: The response variable in prognostic studies or clinical trials can be dichotomous, polychotomous, or continuous.
- Dichotomous dependent variables: response or nonresponse, life or death, and presence or absence of a given disease.
- Polychotomous dependent variables: different grades of symptoms (e.g., no evidence of disease, minor symptom, major symptom) and scores of psychiatric reactions (e.g., feeling well, tolerable, depressed, or very depressed).
- Continuous dependent variables: length of survival from start of treatment or length of remission, both measured on a numerical scale by a continuous range of values.

5 The Categories of Independent Variables: A prognostic variable (or independent variable) may be either numerical or nonnumeric.
Numerical prognostic variables:
- discrete, such as the number of previous strokes
- continuous, such as age
Continuous variables can be made discrete by grouping patients into subcategories (e.g., four age subgroups: <20, 20-39, 40-59, and ≥60).
Nonnumeric prognostic variables:
- unordered (e.g., race or diagnosis)
- ordered (e.g., severity of disease may be primary, local, or metastatic)
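The grouping of a continuous variable into subcategories can be sketched in a few lines. This is a minimal illustration, assuming the four age subgroups are under 20, 20 to 39, 40 to 59, and 60 or over; the function name `age_group` and the sample ages are hypothetical.

```python
def age_group(age):
    """Map a continuous age to one of four ordered subgroups (assumed cut points)."""
    if age < 20:
        return "<20"
    if age < 40:
        return "20-39"
    if age < 60:
        return "40-59"
    return ">=60"


# Hypothetical ages, for illustration only.
ages = [17, 20, 39.5, 59, 60, 83]
groups = [age_group(a) for a in ages]
```

The resulting labels form an ordered nonnumeric variable, which can then be coded for regression like any other categorical factor.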

6 Steps in Data Examination: Before any statistical computation, the data need to be examined carefully. The preliminary examination usually includes the following steps:
- Obtain correlation coefficients between variables to detect significantly correlated variables. A highly correlated variable that has shown prognostic value in other studies should not be deleted.
- Use dummy variables for qualitative prognostic variables. For example, with cell types A, B, and O, let the dummy variable x1 take the value 1 for cell type A and 0 otherwise, and let x2 take the value 1 for cell type B and 0 otherwise. For two categories (e.g., sex), only one dummy variable is needed: x is 1 for a male and 0 for a female.
- Apply a transformation (such as the logarithm) to prognostic variables where it gives a better description of the data.
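The dummy coding and correlation check described above can be sketched as follows. This is an illustration only: the helper names `dummy_code` and `pearson` and the sample cell types are hypothetical, and cell type O is assumed to be the reference level, as in the example.

```python
from math import sqrt


def dummy_code(values, reference):
    """Return one 0/1 indicator list per non-reference level, keyed by level."""
    levels = sorted(set(values) - {reference})
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical cell types for six patients; O is the reference category.
cell_type = ["A", "B", "O", "A", "O", "B"]
dummies = dummy_code(cell_type, reference="O")
x1, x2 = dummies["A"], dummies["B"]
r = pearson(x1, x2)  # correlation between the two indicator variables
```

A variable with k categories yields k - 1 dummy variables; a patient in the reference category has all of them equal to 0.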

7
- Remove from the multivariate analysis any prognostic factors that have little or no effect on the dependent variable.
- Deal with missing data. The approach depends on the proportion of data that is missing:
  - Observations with missing data may be dropped if they make up a relatively small proportion of the sample.
  - For a nominal or categorical independent variable, individuals with missing information can be treated as a separate group.
  - For a quantitatively measured variable (e.g., age), the mean of the available values can be used for a missing value. This principle can also be applied to nominal data.
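Two of the missing-data strategies above can be sketched briefly. This is a minimal illustration that assumes missing values are represented as `None`; the function names and the sample data are hypothetical.

```python
def impute_mean(values):
    """Replace missing (None) entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def missing_as_category(values, label="missing"):
    """Treat individuals with missing categorical information as their own group."""
    return [label if v is None else v for v in values]


# Hypothetical data for five patients.
age = [50, None, 60, 40, None]
diagnosis = ["a", None, "b", "a", "b"]

age_filled = impute_mean(age)
diagnosis_filled = missing_as_category(diagnosis)
```

Mean imputation is convenient for analysis, but it shrinks the variability of the imputed variable, which is one reason the proportion of missing data matters.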

8 GENERAL STRUCTURE OF PARAMETRIC REGRESSION MODELS

9 Commonly Used Parametric Models: The most commonly used parametric models are:
- Exponential
- Weibull
- Lognormal
- Log-logistic
- Gamma
- Gompertz
The first two are covered in this discussion. These distributions generally involve two parameters: a scale parameter (λ) and a shape parameter (γ). The shape parameter is assumed constant across individuals. The parameters are estimated by maximum likelihood; when the likelihood equations have no closed-form solution, the Newton-Raphson iterative procedure is used.
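As an illustration of maximum likelihood with Newton-Raphson, the sketch below fits a Weibull model without covariates to right-censored data. It assumes the parameterization h(t) = λγt^(γ-1), so that S(t) = exp(-λt^γ); the slides do not state which parameterization they use. For a fixed γ the MLE of λ is d / Σ t_i^γ, where d is the number of events, so Newton-Raphson is only needed for the shape parameter. The function names and data are hypothetical.

```python
from math import log


def shape_score_and_slope(gamma, times, events):
    """Profile score for the Weibull shape and its derivative with respect to gamma."""
    d = sum(events)
    sum_event_logs = sum(e * log(t) for t, e in zip(times, events))
    s0 = sum(t ** gamma for t in times)
    s1 = sum(t ** gamma * log(t) for t in times)
    s2 = sum(t ** gamma * log(t) ** 2 for t in times)
    score = d / gamma + sum_event_logs - d * s1 / s0
    slope = -d / gamma ** 2 - d * (s2 * s0 - s1 ** 2) / s0 ** 2
    return score, slope


def fit_weibull(times, events, gamma=1.0, tol=1e-10, max_iter=50):
    """Newton-Raphson on the shape; the scale then follows in closed form."""
    for _ in range(max_iter):
        score, slope = shape_score_and_slope(gamma, times, events)
        step = score / slope
        while gamma - step <= 0:  # halve the step to keep the shape positive
            step /= 2
        gamma -= step
        if abs(step) < tol:
            break
    lam = sum(events) / sum(t ** gamma for t in times)
    return lam, gamma


# Hypothetical survival times; events[i] is 1 for a death, 0 for a censored time.
times = [2.0, 3.5, 5.0, 7.5, 9.0, 12.0, 15.0, 20.0]
events = [1, 1, 0, 1, 1, 0, 1, 1]
lam_hat, gamma_hat = fit_weibull(times, events)
```

Starting the iteration at γ = 1 corresponds to starting from the exponential model, which is a natural initial guess.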

10 Likelihood Inference of Regression Models

11 Hypothesis Testing
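One common way to test whether a covariate matters in a parametric survival model is the likelihood ratio test. The sketch below is an illustration under stated assumptions, not necessarily the procedure on the slide: it compares an exponential model with a single hazard rate against one with a separate rate for each of two groups, using the closed-form exponential MLE (events divided by total follow-up time). The function names and data are hypothetical, and each group is assumed to contain at least one event.

```python
from math import erfc, log, sqrt


def exp_max_loglik(times, events):
    """Maximized exponential log-likelihood for right-censored data."""
    d, total = sum(events), sum(times)
    lam = d / total  # closed-form MLE of the hazard rate
    return d * log(lam) - lam * total


def likelihood_ratio_test(times, events, group):
    """Test equal hazard rates in two groups (one degree of freedom)."""
    ll_null = exp_max_loglik(times, events)
    ll_alt = 0.0
    for g in set(group):
        t_g = [t for t, k in zip(times, group) if k == g]
        e_g = [e for e, k in zip(events, group) if k == g]
        ll_alt += exp_max_loglik(t_g, e_g)
    stat = max(2 * (ll_alt - ll_null), 0.0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical data: two treatment groups of four patients each.
times = [2, 4, 6, 8, 10, 15, 20, 25]
events = [1, 1, 0, 1, 1, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
stat, p_value = likelihood_ratio_test(times, events, group)
```

A small p-value suggests that the group indicator carries prognostic information under the exponential assumption.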

12 Exponential Model
- The exponential distribution is a useful form of the survival distribution when the hazard function (the instantaneous failure rate) is constant and does not depend on time. In that case, the points on a log-log plot of cumulative hazard against time fall approximately on a straight line with slope 1.
- In the biomedical field, a constant hazard is usually unrealistic, so this model often does not apply.
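For the exponential model the maximum likelihood estimate has a closed form: the number of events divided by the total observed time. The sketch below illustrates this, along with the slope-1 property, assuming the cumulative hazard H(t) = λt is plotted against t on log-log axes. The function names and data are hypothetical.

```python
from math import exp, log


def exponential_mle(times, events):
    """MLE of the constant hazard: number of events over total observed time."""
    return sum(events) / sum(times)


def exponential_survival(t, lam):
    """Survival function S(t) = exp(-lam * t)."""
    return exp(-lam * t)


# Hypothetical data: events[i] is 1 for a death, 0 for a censored time.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
lam_hat = exponential_mle(times, events)

# Slope of log H(t) against log t between two time points, with H(t) = lam * t.
t1, t2 = 1.0, 10.0
slope = (log(lam_hat * t2) - log(lam_hat * t1)) / (log(t2) - log(t1))
```

Censored observations contribute their follow-up time to the denominator but do not count as events in the numerator.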

13

14 Practical Approach

15

16

17 Weibull Model
- The hazard function changes with time. The log-log plot of cumulative hazard against time is approximately a straight line, but its slope is not 1.
- The hazard function increases with time when the shape parameter γ > 1.
- The hazard function decreases with time when γ < 1.
- The model reduces to the exponential regression model when γ = 1.
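The three cases above can be checked numerically. The sketch assumes the parameterization h(t) = λγt^(γ-1), which the slides do not state explicitly; the function name, the value of λ, and the time grid are hypothetical.

```python
def weibull_hazard(t, lam, gamma):
    """Weibull hazard under the assumed form h(t) = lam * gamma * t**(gamma - 1)."""
    return lam * gamma * t ** (gamma - 1)


lam = 0.1
grid = [0.5, 1.0, 2.0, 4.0, 8.0]

increasing = [weibull_hazard(t, lam, 2.0) for t in grid]  # gamma > 1
decreasing = [weibull_hazard(t, lam, 0.5) for t in grid]  # gamma < 1
constant = [weibull_hazard(t, lam, 1.0) for t in grid]    # gamma = 1, exponential
```

With γ = 1 the time-dependent factor t^(γ-1) equals 1 for every t, so the hazard is the constant λ of the exponential model.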

18

19

20 The exponential hazard function is constant, whereas the Weibull hazard function is monotonic: decreasing when γ < 1 and increasing when γ > 1.

21 THANK YOU

