EPI 5344: Survival Analysis in Epidemiology
Interpretation of Models
March 17, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
01/2015

Objectives
Epidemiology often tries to determine the risk associated with ‘exposures’ or like events
Various types of exposures
  – nominal
  – ordinal
  – count
  – continuous
Session looks at how to use these different types of exposures in Cox models
And, how to interpret the results

Nominal variables (1)
Categories
  – Male/female
  – Occupation
  – City of residence

Nominal variables (2)
Cannot be rank ordered
  – If you ‘impose’ a ranking, then you are defining a new variable.
  – No intrinsic numerical value
      Even if they are commonly coded as 1/2, they are NOT NUMERIC
      Use of 1/2/3 coding dates back to the early days of computers
      Can assign letters or words to categories (e.g. Male/Female)
  – Usually analyzed using dummy variables
      ‘Indicator variables’ is a better term.

Nominal variables (3)
SAS must be aware that these variables are categories, not numbers
  – CLASS statement
Indicator variables
  – ‘n’ response levels → ‘n-1’ indicator variables
  – Why?
      The MLE equations are indeterminate if you include ‘n’ indicator variables.

Nominal variables (4)
Consider a two-level nominal variable
  – e.g. M/F
Some possible indicator variable (IV) codings:

Level    #1   #2   #3   #4
1 (M)     1    1    2    …
2 (F)     0   -1    1    8

All (and many more) are possible. #1 is the most commonly used.

Nominal variables (5)
Suppose we have 3 levels (blue/green/red)
Some possible indicator variable codings:

             #1 Reference    #2 Effect    #3 Orthogonal polynomial
Level        X1   X2         X1   X2      X1   X2
1 (blue)      1    0          1    0       …    …
2 (green)     0    1          0    1       …    …
3 (red)       0    0         -1   -1       …    …
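As a rough illustration of the reference and effect codings, the following Python sketch builds the n-1 = 2 indicator variables for a three-level variable, treating the last level (red) as the reference. The function names and the list of levels are assumptions made for this example; this is not SAS output.

```python
# Hypothetical helpers: n-1 indicator variables for a 3-level nominal variable.
LEVELS = ["blue", "green", "red"]  # the last level is treated as the reference


def reference_coding(level, levels=LEVELS):
    """Return (X1, X2): 1 for the matching non-reference level, 0 otherwise."""
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return tuple(1 if level == lv else 0 for lv in levels[:-1])


def effect_coding(level, levels=LEVELS):
    """Like reference coding, but the reference level is -1 on every indicator."""
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    if level == levels[-1]:
        return tuple(-1 for _ in levels[:-1])
    return reference_coding(level, levels)


for lv in LEVELS:
    print(lv, reference_coding(lv), effect_coding(lv))
```

Only two indicators are produced for three levels, which mirrors the 'n-1' rule: a third indicator would be an exact linear combination of the others and the baseline.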

Ordinal variables
Still categorical (name) type data. But there is an implicit ordering of the levels.
  – Disease severity
      mild, moderate, severe
  – Rating scale
      Agree/don’t care/disagree
Interval data that have been categorized are really ‘ordinal’, not interval.
Commonly treated as nominal variables, ignoring the rank order
Can use ordinal variables for trend or dose-response analyses
  – Key issue: determining the spacing/numerical values to use

Interval variables
Have a meaningful ‘0’
Can assume any value (perhaps within a range)
  – Infinite precision is possible.
Examples
  – Blood pressure
  – Cholesterol
  – Income
Usually treated as a continuous number
Can categorize and then use ordinal methods
  – Can reduce measurement errors
  – Useful for EDA to look for non-linear dose-response.

Count variables
The number of times an event has happened
  – # pregnancies
  – # years of education
Very common type of data in medical research
Often regarded as continuous, but isn’t really
Can affect the choice of model
  – Poisson distribution

Parameter interpretation (1)
Cox models estimate hazard ratios, not the hazard
  – Cannot provide direct estimates of S(t) or h(t)
  – That needs additional assumptions or methods
  – Next week
Cox model: contains no explicit intercept term
  – That role is played by h0(t)

Parameter interpretation (2)
To explore the Betas, we will use this form of the Cox model:
Start with a 2-level nominal variable
Consider 3 different coding approaches
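A standard way of writing the Cox model is given here as an assumed form, since the transcript does not reproduce the slide's own equation. It expresses the log of the hazard relative to baseline as a linear function of the covariates:

```latex
\[
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p)
\quad\Longleftrightarrow\quad
\ln\!\left[\frac{h(t \mid x)}{h_0(t)}\right] = \beta_1 x_1 + \dots + \beta_p x_p
\]
```

With a single covariate, the right-hand side reduces to \(\beta_1 x_1\), so the interpretation of \(\beta_1\) depends entirely on how \(x_1\) is coded.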

Parameter interpretation (3)
We will have only one parameter.

xi = 1 if exposed
   = 0 otherwise
H0: β1 = 0 → Wald, Score, Likelihood ratio tests
95% CI → … or …
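For this 1/0 coding, the hazard ratio follows directly from the usual one-covariate Cox model \(h(t \mid x) = h_0(t)\exp(\beta_1 x)\). That model form is assumed here, and the derivation is a sketch rather than a reproduction of the slide:

```latex
\[
\mathrm{HR}_{\text{exposed vs. unexposed}}
= \frac{h_0(t)\,\exp(\beta_1 \cdot 1)}{h_0(t)\,\exp(\beta_1 \cdot 0)}
= \exp(\beta_1)
\]
```

The baseline hazard cancels, so \(\beta_1\) is the log hazard ratio for exposed relative to unexposed subjects.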

xi = 1 if exposed
   = -1 otherwise
Changing the coding has changed the interpretation of the Beta value
‘Effect’ coding compares each level to the mean effect
  – ‘Sort of’, since it depends on the number of subjects in each level.
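Under the 1/-1 coding, the same calculation gives a different result. The sketch below again assumes the one-covariate model \(h(t \mid x) = h_0(t)\exp(\beta_1 x)\):

```latex
\[
\mathrm{HR}_{\text{exposed vs. unexposed}}
= \frac{h_0(t)\,\exp(\beta_1 \cdot 1)}{h_0(t)\,\exp(\beta_1 \cdot (-1))}
= \exp(2\beta_1)
\]
```

Here \(\beta_1\) is half of the exposed-versus-unexposed log hazard ratio, so \(\exp(\beta_1)\) on its own compares the exposed group with the midpoint of the two groups on the log-hazard scale.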

xi = 2 if exposed
   = 1 otherwise
Changing the coding doesn’t always change the interpretation of the Beta value
Need to be aware of the coding used in order to interpret the model correctly
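What matters for the hazard ratio is the difference between the two codes multiplied by Beta. The following Python sketch checks this numerically for the 1/0, 1/-1 and 2/1 codings. The function name and the log hazard ratio of ln(2) are made up for illustration and do not come from any fitted model.

```python
import math


def hazard_ratio(beta, x_a, x_b):
    """HR comparing covariate value x_a to x_b in a one-covariate Cox model."""
    return math.exp(beta * (x_a - x_b))


# Illustrative (made-up) value: suppose the exposed/unexposed log HR is ln(2).
log_hr = math.log(2.0)

# Each entry: (code for exposed, code for unexposed, beta implied by that coding).
codings = {
    "1/0": (1, 0, log_hr),
    "1/-1": (1, -1, log_hr / 2.0),  # codes differ by 2, so beta is halved
    "2/1": (2, 1, log_hr),          # codes differ by 1, same beta as 1/0
}

for name, (x_exp, x_unexp, beta) in codings.items():
    print(name, round(beta, 4), round(hazard_ratio(beta, x_exp, x_unexp), 4))
```

The 2/1 coding leaves Beta unchanged because the two codes still differ by one unit; only the baseline hazard absorbs the shift.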

Xi is continuous (interval)
Let’s consider the effect of changing the exposure by ‘1’ unit
So, β1 relates to the HR for a 1-unit change in x1.
What about an ‘s’-unit change?
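The one-unit result can be sketched as follows, assuming the usual Cox form \(h(t \mid x_1) = h_0(t)\exp(\beta_1 x_1)\):

```latex
\[
\frac{h(t \mid x_1 + 1)}{h(t \mid x_1)}
= \frac{h_0(t)\,\exp\bigl(\beta_1 (x_1 + 1)\bigr)}{h_0(t)\,\exp(\beta_1 x_1)}
= \exp(\beta_1)
\]
```

The ratio does not depend on the starting value \(x_1\), which is the log-linearity assumption built into the model.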

Xi is continuous (interval)
Let’s consider the effect of changing the exposure by ‘s’ units
SAS has an option to do this automatically.
Important to consider the level for a ‘meaningful’ change
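For an s-unit change the same algebra gives HR = exp(s × β1), which equals the one-unit HR raised to the power s. A minimal Python sketch, with a made-up coefficient and a hypothetical function name:

```python
import math


def hr_for_change(beta, s):
    """HR for an s-unit increase in a continuous covariate with coefficient beta."""
    return math.exp(s * beta)


beta_per_unit = 0.02  # made-up log HR per 1 unit (e.g. per 1 mmHg)

print(hr_for_change(beta_per_unit, 1))   # HR per 1 unit
print(hr_for_change(beta_per_unit, 10))  # HR per 10 units
```

Reporting the HR per 10 units rather than per 1 unit is often easier to read when a single unit is too small to be clinically meaningful.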

Indicator variable codings for the 3-level variable (blue/green/red): #1 reference, #2 effect, #3 orthogonal polynomial, each with X1 and X2

‘reference’ coding; level 3=ref
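With reference coding and level 3 (red) as the reference, the hazard ratios can be sketched as below. The model \(h(t \mid x) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2)\) with blue coded (1, 0), green (0, 1) and red (0, 0) is assumed:

```latex
\[
\mathrm{HR}_{\text{blue vs. red}} = \exp(\beta_1), \qquad
\mathrm{HR}_{\text{green vs. red}} = \exp(\beta_2), \qquad
\mathrm{HR}_{\text{blue vs. green}} = \exp(\beta_1 - \beta_2)
\]
```

Each Beta is a direct comparison with the reference level, and any other pair of levels is compared through a difference of Betas.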

‘effect’ coding

Effect coding
  – βi does NOT let you compare two levels directly
  – Use ‘Reference Coding’ to do that
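To see why a single Beta does not compare two levels under effect coding, consider the linear predictors for each level. This sketch assumes blue is coded (1, 0), green (0, 1) and red (-1, -1) in the model \(h(t \mid x) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2)\):

```latex
\[
\begin{aligned}
\text{blue: } & \beta_1, \qquad
\text{green: } \beta_2, \qquad
\text{red: } -(\beta_1 + \beta_2) \\[4pt]
\mathrm{HR}_{\text{blue vs. green}} &= \exp(\beta_1 - \beta_2), \qquad
\mathrm{HR}_{\text{blue vs. red}} = \exp(2\beta_1 + \beta_2)
\end{aligned}
\]
```

The three linear predictors sum to zero, so each Beta measures a departure from the unweighted average log hazard across levels, and every pairwise comparison involves more than one Beta.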

‘orthogonal polynomial’ coding

Doesn’t look that useful or interesting
This coding can be used to test for linear and quadratic effects in the dose-response relationship
  – Test β1 to look for a linear effect
  – Test β2 to look for a quadratic effect
Can be extended to higher orders if there are more levels.
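The idea behind the linear and quadratic tests can be sketched in Python using the textbook unnormalized contrasts for three equally spaced levels. These integer contrasts are an assumption for illustration; the scaling SAS applies, and the values on the original slide, may differ. The level-specific log hazards below are made up.

```python
# Assumed (unnormalized) orthogonal polynomial contrasts for 3 equally spaced levels.
LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)


def dot(u, v):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def trend_components(log_hazards):
    """Split three level-specific log hazards into linear and quadratic parts."""
    b_lin = dot(LINEAR, log_hazards) / dot(LINEAR, LINEAR)
    b_quad = dot(QUADRATIC, log_hazards) / dot(QUADRATIC, QUADRATIC)
    return b_lin, b_quad


print(dot(LINEAR, QUADRATIC))             # contrasts are orthogonal
print(trend_components([0.0, 0.5, 1.0]))  # purely linear pattern
print(trend_components([0.0, 1.0, 0.0]))  # purely curved pattern
```

Because the two contrasts are orthogonal, a straight-line dose-response loads only on the first coefficient and a symmetric bump loads only on the second.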

Hypothesis testing (1)
Does adding new variable(s) to a model lead to a better model?
  – Test the statistical significance of the additional variables.
Let’s start with 1 variable
The likelihood is a function of the parameter
  – L = f(β1)
To test β1 = 0, compare the likelihood of the model at the MLE to that of the ‘β1 = 0’ model
  – f(0) vs. f(βMLE)

Hypothesis testing (2)
As is common in statistics, this gives the best results if you:
  – Take logarithms
  – Multiply by -2
Likelihood ratio test of H0: β1 = 0
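Taking logs and multiplying by -2 gives the statistic -2[ln L(0) - ln L(βMLE)], which is compared with a chi-square distribution on 1 degree of freedom. A minimal Python sketch follows; the function name and the log-likelihood values are made up, and the p-value uses the identity P(χ² > x) = erfc(√(x/2)), which holds for 1 degree of freedom only.

```python
import math


def lr_test_1df(loglik_null, loglik_mle):
    """LR statistic and p-value for H0: beta1 = 0 (1 degree of freedom)."""
    stat = -2.0 * (loglik_null - loglik_mle)
    stat = max(stat, 0.0)  # guard against a tiny negative value from rounding
    # For 1 df only: P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up log-likelihoods for the 'beta1 = 0' model and the fitted model.
stat, p = lr_test_1df(loglik_null=-250.0, loglik_mle=-247.0)
print(stat, p)
```

For a Cox model the inputs would be the partial log-likelihoods reported by the fitting software for the two models.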

Hypothesis testing (3)
Can extend to test the null hypothesis that ‘k’ parameters are simultaneously ‘0’.
Likelihood ratio test of H0: β1 = β2 = … = βk = 0
Can be used to compare any two models
  – BUT, models must be nested or hierarchical
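In its general form the likelihood ratio statistic compares the maximized log-likelihoods of the smaller and larger models. The notation below is assumed for this sketch, with \(k\) the number of parameters set to zero under the null hypothesis:

```latex
\[
G = -2\left[\ln L(\hat\beta_{\text{reduced}}) - \ln L(\hat\beta_{\text{full}})\right]
\;\overset{H_0}{\sim}\; \chi^2_k \quad \text{(asymptotically)}
\]
```

The degrees of freedom equal the difference in the number of parameters between the two models, which is why the comparison only makes sense when one model is nested in the other.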

Hypothesis testing (4)
Nested
  – All of the variables in one model must be contained in the other model
      #1: x1, x2, x3, x4
      #2: x1, x2
  – AND: analysis samples must be identical
      Watch for issues with missing data in extra variables
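One way to keep the analysis samples identical is to restrict the data to subjects with complete values on every variable in the larger model, and then fit both models to that subset. A small Python sketch with made-up records and a hypothetical helper name:

```python
def complete_cases(records, variables):
    """Return the records with no missing (None) value in any listed variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


# Made-up data: x3 is missing for subject 2.
records = [
    {"id": 1, "x1": 0, "x2": 5.1, "x3": 1, "x4": 0},
    {"id": 2, "x1": 1, "x2": 4.7, "x3": None, "x4": 1},
    {"id": 3, "x1": 1, "x2": 6.0, "x3": 0, "x4": 1},
]
full_vars = ["x1", "x2", "x3", "x4"]  # model #1
reduced_vars = ["x1", "x2"]           # model #2

# Restrict on the larger model's variables, then fit BOTH models on this sample.
sample = complete_cases(records, full_vars)
print(len(complete_cases(records, reduced_vars)), len(sample))
```

Fitting model #2 on all three subjects and model #1 on only two would make the two likelihoods non-comparable, even though the variable sets are nested.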

Hypothesis testing (5)
You can also do many of these tests using the Wald and Score approximations
  – Hopefully, you get the same answer
      Usually do (at least, in the same ballpark)

Confidence Intervals (1)
Always based on the Wald approximation
First, determine 95% CIs for the Betas
Second, convert to 95% CIs for the HRs
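The two steps can be sketched in Python: compute the Wald interval β ± z × SE on the log scale, then exponentiate the limits. The function name, the coefficient and the standard error are made up for illustration.

```python
import math


def wald_ci_hr(beta, se, z=1.959964):
    """Return (HR, lower, upper): Wald CI for beta, exponentiated to the HR scale."""
    lower_beta = beta - z * se  # step 1: CI for the Beta
    upper_beta = beta + z * se
    # Step 2: convert the Beta limits to HR limits.
    return math.exp(beta), math.exp(lower_beta), math.exp(upper_beta)


hr, lower, upper = wald_ci_hr(beta=0.405, se=0.150)  # made-up estimate and SE
print(round(hr, 3), round(lower, 3), round(upper, 3))
```

The interval is symmetric around Beta on the log scale but asymmetric around the HR after exponentiation.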
