
1 EPI 5344: Survival Analysis in Epidemiology
Interpretation of Models
March 17, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
01/2015

2 Objectives
Epidemiology often tries to determine the risk associated with ‘exposures’ or similar events
Various types of exposures
–nominal
–ordinal
–count
–continuous
Session looks at how to use these different types of exposures in Cox models
And how to interpret the results

3 Nominal variables (1)
Categories
–Male/female
–Occupation
–City of residence

4 Nominal variables (2)
Cannot be rank ordered
–If you ‘impose’ a ranking, then you are defining a new variable.
–No intrinsic numerical value
   Even if they are commonly coded as 1/2, they are NOT NUMERIC
   Use of 1/2/3 coding dates back to the early days of computers
   Can assign letters or words to categories (e.g. Male/Female)
–Usually analyzed using dummy variables
   ‘Indicator variables’ is a better term.

5 Nominal variables (3)
SAS must be told that these variables are categories, not numbers
–CLASS statement
Indicator variables
–‘n’ response levels → ‘n-1’ indicator variables
–Why? The MLE equations are indeterminate if you include ‘n’ indicator variables.
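
One way to see the indeterminacy, as a minimal sketch: in a Cox model, a constant added to every subject's linear predictor is absorbed by the baseline hazard. With all ‘n’ indicators in the model, adding the same constant to every coefficient shifts every subject by that constant, so many different coefficient vectors fit equally well. The levels and coefficient values below are made up for illustration.

```python
import math

def linear_predictors(indicator_rows, betas):
    """Sum of beta * x for each subject."""
    return [sum(x * b for x, b in zip(row, betas)) for row in indicator_rows]

# Hypothetical 3-level nominal variable for 6 subjects (levels 0, 1, 2).
levels = [0, 1, 2, 0, 1, 2]

# Full set of n = 3 indicator columns, one per level.
full_rows = [[1 if lvl == j else 0 for j in range(3)] for lvl in levels]

beta = [0.4, -0.2, 0.1]              # arbitrary illustrative coefficients
shifted = [b + 5.0 for b in beta]    # same constant added to every coefficient

lp_a = linear_predictors(full_rows, beta)
lp_b = linear_predictors(full_rows, shifted)

# Every subject's linear predictor moves by the same amount ...
shifts = [b - a for a, b in zip(lp_a, lp_b)]

# ... so the hazard ratio between any two subjects is unchanged, and the
# data cannot distinguish `beta` from `shifted`.
hr_a = math.exp(lp_a[0] - lp_a[1])
hr_b = math.exp(lp_b[0] - lp_b[1])
```

Dropping one indicator (leaving ‘n-1’) removes this redundancy, because the remaining columns no longer add up to a constant.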

6 Nominal variables (4)
Consider a two-level nominal variable
–e.g. M/F
Some possible indicator variable (IV) codings:

Level    #1   #2   #3   #4
1 (M)     1    1    2   10
2 (F)     0   -1    1    8

All (and many more) are possible. #1 is the most commonly used.

7 Nominal variables (5)
Suppose we have 3 levels (blue/green/red)
Some possible indicator variable codings:

             #1 Reference    #2 Effect     #3 Orthogonal polynomial
Level         X1    X2       X1    X2        X1        X2
1 (blue)       1     0        1     0      -1.225     0.707
2 (green)      0     1        0     1       0        -1.414
3 (red)        0     0       -1    -1       1.225     0.707
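
The three schemes can be generated programmatically. The sketch below is an illustration, not SAS's implementation: the function names are hypothetical, the last level is assumed to be the reference, effect coding is assumed to use -1 in every column for that level, and the orthogonal polynomial columns are assumed to be scaled so that each column's sum of squares equals the number of levels (a scaling that reproduces the values 1.225, 0.707 and 1.414 for three levels).

```python
import math

def reference_coding(n_levels):
    """Indicator columns with the last level as reference (all zeros)."""
    return [[1 if j == i else 0 for j in range(n_levels - 1)]
            for i in range(n_levels)]

def effect_coding(n_levels):
    """Same as reference coding, except the last level is -1 in every column."""
    rows = reference_coding(n_levels)
    rows[-1] = [-1] * (n_levels - 1)
    return rows

def orthogonal_polynomial_coding(n_levels):
    """Gram-Schmidt on 1, x, x^2, ... over equally spaced levels 1..n.

    Each returned column is orthogonal to the constant and to the other
    columns, and is scaled so its sum of squares equals n_levels.
    """
    xs = range(1, n_levels + 1)
    basis = [[1.0] * n_levels]          # constant column, dropped at the end
    for degree in range(1, n_levels):
        col = [float(x ** degree) for x in xs]
        for b in basis:                 # remove the part explained by lower orders
            proj = sum(c * v for c, v in zip(col, b)) / sum(v * v for v in b)
            col = [c - proj * v for c, v in zip(col, b)]
        scale = math.sqrt(n_levels / sum(c * c for c in col))
        basis.append([c * scale for c in col])
    # One row per level, one column per polynomial order (linear, quadratic, ...).
    return [[basis[d][i] for d in range(1, n_levels)] for i in range(n_levels)]

ref = reference_coding(3)
eff = effect_coding(3)
poly = orthogonal_polynomial_coding(3)
```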

8 Ordinal variables
Still categorical (name) type data.
But there is an implicit ordering of the levels.
–Disease severity
   mild, moderate, severe
–Rating scale
   Agree/don’t care/disagree
Interval data that have been categorized are really ‘ordinal’, not interval.
Commonly treated as nominal variables, ignoring the rank order
Can use ordinal variables for trend or dose-response analyses
–Key issue: determining the spacing/numerical values to use

9 Interval variables
Have a meaningful ‘0’
Can assume any value (perhaps within a range)
–Infinite precision is possible.
Examples
–Blood pressure
–Cholesterol
–Income
Usually treated as a continuous number
Can categorize and then use ordinal methods
–Can reduce measurement errors
–Useful for exploratory data analysis (EDA) to look for non-linear dose-response.

10 Count variables
The number of times an event has happened
–# pregnancies
–# years of education
Very common type of data in medical research
Often regarded as continuous, but isn’t really
Can affect the choice of model
–Poisson distribution

11 Parameter interpretation (1)
Cox models estimate hazard ratios, not the hazard
–Cannot provide direct estimates of S(t) or h(t)
–That needs additional assumptions or methods
–Next week
Cox model:
–Contains no explicit intercept term
–That role is played by h0(t)

12 Parameter interpretation (2)
To explore Betas, we will use this form of the Cox model:
Start with a 2-level nominal variable
Consider 3 different coding approaches
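
As a sketch of the form being referred to, assuming the standard proportional hazards notation with p covariates (the slide's own notation may differ), the model and its log hazard ratio version are:

```latex
\begin{align*}
h(t \mid x_1,\dots,x_p) &= h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p) \\
\ln\!\left[\frac{h(t \mid x_1,\dots,x_p)}{h_0(t)}\right] &= \beta_1 x_1 + \dots + \beta_p x_p
\end{align*}
```

With a single covariate, the hazard ratio comparing a subject with value x_a to one with value x_b is then exp(β1 (x_a - x_b)); the baseline hazard h0(t) cancels.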

13 Parameter interpretation (3)
We will have only one parameter.

14 x_i = 1 if exposed
       = 0 otherwise
H0: β1 = 0 → Wald, Score, Likelihood ratio
95% CI → … or …

15 x_i = 1 if exposed
       = -1 otherwise
Changing the coding has changed the interpretation of the Beta value
‘Effect’ coding compares each level to the mean effect
–‘Sort of’, since it depends on the number of subjects in each level.

16 x_i = 2 if exposed
       = 1 otherwise
Changing the coding doesn’t always change the interpretation of the Beta value
Need to be aware of the coding used in order to interpret the model correctly
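
The three codings can be compared numerically. In this sketch the hazard ratio between exposed and unexposed is exp(β1 × (difference in coded values)), which follows from the Cox model; the ‘true’ hazard ratio of 2 is a made-up value, and the function name is hypothetical.

```python
import math

def hazard_ratio(beta, x_exposed, x_unexposed):
    """HR for exposed vs. unexposed: exp(beta * difference in coded values)."""
    return math.exp(beta * (x_exposed - x_unexposed))

# Hypothetical truth: exposed subjects have twice the hazard.
true_log_hr = math.log(2.0)

# (value if exposed, value otherwise) for the three codings considered.
codings = {"1/0": (1, 0), "1/-1": (1, -1), "2/1": (2, 1)}

# Coefficient each coding needs in order to describe the same hazard ratio.
betas = {name: true_log_hr / (xe - xu) for name, (xe, xu) in codings.items()}

# Converting back through the coding always recovers the same HR.
hrs = {name: hazard_ratio(betas[name], *codings[name]) for name in codings}
```

Under 1/0 and 2/1 coding the coded values differ by 1, so β1 is the log hazard ratio in both cases. Under 1/-1 coding they differ by 2, so β1 is half the log hazard ratio and the HR is exp(2β1).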

17 x_i is continuous (interval)
Let’s consider the effect of changing the exposure by ‘1’ unit
So, β1 relates to the HR for a 1-unit change in x1.
What about an ‘s’-unit change?

18 x_i is continuous (interval)
Let’s consider the effect of changing the exposure by ‘s’ units
SAS has an option to do this automatically.
Important to consider what size of change is ‘meaningful’
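
For a continuous covariate, the Cox model gives HR = exp(β1) for a 1-unit change and HR = exp(s × β1) for an ‘s’-unit change. A minimal sketch, with a made-up coefficient for blood pressure (the value 0.02 per mmHg is an assumption for illustration only):

```python
import math

def hr_for_change(beta, s=1.0):
    """Hazard ratio for an s-unit increase in a continuous covariate."""
    return math.exp(s * beta)

beta_per_mmhg = 0.02                          # hypothetical log HR per 1 mmHg

hr_1 = hr_for_change(beta_per_mmhg)           # per 1 mmHg
hr_10 = hr_for_change(beta_per_mmhg, s=10)    # per 10 mmHg, often more meaningful
```

The 10-unit hazard ratio equals the 1-unit hazard ratio raised to the 10th power, not 10 times it.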

19
             #1              #2            #3
Level         X1    X2       X1    X2        X1        X2
1 (blue)       1     0        1     0      -1.225     0.707
2 (green)      0     1        0     1       0        -1.414
3 (red)        0     0       -1    -1       1.225     0.707

20 ‘reference’ coding; level 3 = ref

21 ‘effect’ coding

22 Effect coding
–β_i does NOT let you compare two levels directly
–Use ‘Reference Coding’ to do that
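
A level-to-level comparison can still be recovered from effect-coded coefficients by working through the coding. The sketch below assumes the usual effect-coding convention of -1 in every column for the last level (red), uses made-up coefficient values, and uses hypothetical function names.

```python
import math

# Coded values (X1, X2) for each level under the two schemes.
REFERENCE = {"blue": (1, 0), "green": (0, 1), "red": (0, 0)}
EFFECT = {"blue": (1, 0), "green": (0, 1), "red": (-1, -1)}

def log_hazard_offset(coding, level, betas):
    """beta1 * X1 + beta2 * X2 for the given level."""
    return sum(x * b for x, b in zip(coding[level], betas))

def hazard_ratio(coding, level_a, level_b, betas):
    """HR of level_a vs. level_b: exp of the difference in offsets."""
    return math.exp(log_hazard_offset(coding, level_a, betas)
                    - log_hazard_offset(coding, level_b, betas))

effect_betas = (0.3, -0.1)   # hypothetical effect-coded estimates

# Blue vs. red under effect coding is exp(2*beta1 + beta2), not exp(beta1).
hr_blue_vs_red = hazard_ratio(EFFECT, "blue", "red", effect_betas)

# The reference-coded coefficients describing the same model are the
# log hazard ratios of each level against red.
ref_betas = tuple(math.log(hazard_ratio(EFFECT, lvl, "red", effect_betas))
                  for lvl in ("blue", "green"))
hr_blue_vs_green_ref = hazard_ratio(REFERENCE, "blue", "green", ref_betas)
hr_blue_vs_green_eff = hazard_ratio(EFFECT, "blue", "green", effect_betas)
```

Both codings describe the same fitted hazards; only the meaning of the individual coefficients differs.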

23 ‘orthogonal polynomial’ coding

24 Doesn’t look that useful or interesting
This coding can be used to test for linear and quadratic effects in the dose-response relationship
–Test β1 to look for a linear effect
–Test β2 to look for a quadratic effect
Can be extended to higher orders if there are more levels.
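
To see why β1 and β2 pick out the linear and quadratic components, write η_k for the log hazard (relative to baseline) at level k, using the coded values from the three-level table. This is a worked sketch; the symbol η_k is introduced here for illustration.

```latex
\begin{align*}
\eta_1 &= -1.225\,\beta_1 + 0.707\,\beta_2, \qquad
\eta_2 = -1.414\,\beta_2, \qquad
\eta_3 = 1.225\,\beta_1 + 0.707\,\beta_2 \\[4pt]
\eta_3 - \eta_1 &= 2.449\,\beta_1
  \quad\Rightarrow\quad \beta_1 = \frac{\eta_3 - \eta_1}{2.449} \\[4pt]
\eta_1 - 2\eta_2 + \eta_3 &= 4.243\,\beta_2
  \quad\Rightarrow\quad \beta_2 = \frac{\eta_1 - 2\eta_2 + \eta_3}{4.243}
\end{align*}
```

So β1 is proportional to the straight-line change from level 1 to level 3, and β2 is proportional to how far level 2 departs from the midpoint of levels 1 and 3, that is, the curvature.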

25 Hypothesis testing (1)
Does adding new variable(s) to a model lead to a better model?
–Test the statistical significance of the additional variables.
Let’s start with 1 variable
The likelihood is a function of that variable’s coefficient
–L = f(β1)
To test β1 = 0, compare the likelihood of the model at the MLE to that of the ‘β1 = 0’ model
–f(0) vs. f(β_MLE)

26 Hypothesis testing (2)
As is common in statistics, this gives the best results if you:
–Take logarithms
–Multiply by -2
Likelihood ratio test of H0: β1 = 0
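
Applying those two steps to the ratio f(0)/f(β_MLE) gives the usual likelihood ratio statistic. The form below is the standard textbook one, written with L for the likelihood and a hat for the MLE; the slide's own notation may differ.

```latex
\begin{align*}
LR &= -2\,\ln\!\left[\frac{L(0)}{L(\hat{\beta}_1)}\right]
    = -2\left[\ln L(0) - \ln L(\hat{\beta}_1)\right] \\
LR &\sim \chi^2_{1} \quad \text{approximately, under } H_0:\ \beta_1 = 0
\end{align*}
```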

27 Hypothesis testing (3)
Can extend to test the null hypothesis that ‘k’ parameters are simultaneously ‘0’.
Likelihood ratio test of H0: β1 = β2 = … = βk = 0
Can be used to compare any two models
–BUT, the models must be nested or hierarchical
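
A minimal sketch of the k-parameter likelihood ratio test, using only the standard library. The statistic is -2 times the difference in log-likelihoods between the smaller and larger model, referred to a chi-square distribution with k degrees of freedom. The log-likelihood values are made up, and the helper `chi2_sf` is a hypothetical stand-in (closed-form series for integer degrees of freedom) for a statistics library's chi-square tail function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (LR statistic, p-value) for two nested models.

    df is the number of extra parameters in the full model.
    """
    stat = -2.0 * (loglik_reduced - loglik_full)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods: the full model adds k = 2 parameters.
stat, p_value = likelihood_ratio_test(-520.0, -515.0, df=2)
```

In practice the log-likelihoods come from fitting both models to the same analysis sample.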

28 Hypothesis testing (4)
Nested
–All of the variables in one model must be contained in the other model
   #1: x1, x2, x3, x4
   #2: x1, x2
–AND: Analysis samples must be identical
   Watch for issues with missing data in the extra variables
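
Both conditions can be checked before comparing models. The sketch below uses made-up records and hypothetical helper names: it checks that model #2's variables are a subset of model #1's, then restricts the data to subjects with complete data on every variable in the larger model, so that both models are fitted to the same sample.

```python
def is_nested(small_vars, large_vars):
    """True if every variable in the smaller model is also in the larger one."""
    return set(small_vars) <= set(large_vars)

def common_analysis_sample(rows, variables):
    """Keep only subjects with no missing value (None) in any listed variable."""
    return [row for row in rows
            if all(row.get(v) is not None for v in variables)]

# Made-up records; None marks a missing value.
rows = [
    {"id": 1, "x1": 0.5, "x2": 1, "x3": 2.0, "x4": 0},
    {"id": 2, "x1": 1.2, "x2": 0, "x3": None, "x4": 1},
    {"id": 3, "x1": 0.7, "x2": 1, "x3": 1.5, "x4": None},
    {"id": 4, "x1": 0.9, "x2": 0, "x3": 0.3, "x4": 1},
]

model_1 = ["x1", "x2", "x3", "x4"]
model_2 = ["x1", "x2"]

nested = is_nested(model_2, model_1)

# Fit BOTH models on this sample, defined by the larger model's variables.
sample = common_analysis_sample(rows, model_1)
sample_ids = [row["id"] for row in sample]
```

Fitting model #2 on all four subjects and model #1 on only two would make the likelihoods incomparable.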

29 Hypothesis testing (5)
You can also do many of these tests using the Wald and Score approximations
–Hopefully, you get the same answer
   Usually do (at least, in the same ballpark)

30 Confidence Intervals (1)
Always based on the Wald approximation
First, determine 95% CIs for the Betas
Second, convert to 95% CIs for the HRs
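
The two steps can be sketched as follows, assuming the usual Wald form (estimate ± 1.96 standard errors on the Beta scale, then exponentiate the limits). The estimate and standard error are made-up values and the function name is hypothetical.

```python
import math

def wald_ci(beta, se, z=1.959964):
    """Return ((beta_lower, beta_upper), (hr_lower, hr_upper)).

    z is the standard normal quantile; 1.959964 gives a 95% interval.
    """
    lower = beta - z * se
    upper = beta + z * se
    return (lower, upper), (math.exp(lower), math.exp(upper))

# Hypothetical estimate and standard error.
beta_hat, se_hat = 0.5, 0.2

(beta_lo, beta_hi), (hr_lo, hr_hi) = wald_ci(beta_hat, se_hat)
hr_point = math.exp(beta_hat)
```

The interval is symmetric around the estimate on the Beta scale but not on the HR scale, which is why the limits are computed for Beta first and converted afterwards.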


