Flexible modeling of dose-risk relationships with fractional polynomials. Willi Sauerbrei, Institute of Medical Biometry and Informatics, University Medical Center Freiburg, Germany.


Flexible modeling of dose-risk relationships with fractional polynomials
Willi Sauerbrei, Institute of Medical Biometry and Informatics, University Medical Center Freiburg, Germany
Patrick Royston, MRC Clinical Trials Unit, London, UK

Modelling in (pharmaco-)epidemiology
- Cohort study, case-control study, …
- Several predictors, mix of continuous and categorical variables
- The focus is on one risk factor; the rest are potential confounders
- Wish to estimate the association of the risk factor with the outcome (adjusting for confounders)
- If the risk factor is continuous, the ‘dose’-risk function is of interest
- The issues are very similar in different types of regression models (linear regression, logistic regression, GLM, survival models, ...)

Example – AMI and NSAID use (Hammad et al, PaDS 17:315, April 2008)
- An analysis using length of follow-up as a continuous variable could be informative!

Continuous risk variables – the problem
- “Quantifying epidemiologic risk factors using non-parametric regression: model selection remains the greatest challenge” (Rosenberg PS et al, Statistics in Medicine 2003; 22:)
- Discussion of issues in modelling a single risk variable, mainly using cubic splines
- Trivial nowadays to fit almost any model
- To choose a good model is much harder

Alcohol consumption as risk factor for oral cancer
[Figure: odds relative to non-drinkers]

Continuous risk factors – which functional form?
Traditional approaches:
a) Linear function
   - May be an inadequate description of reality
   - Misspecification of functional form may lead to wrong conclusions
b) ‘Best’ standard transformation (log, square root, etc.)
c) Step function (categorical data)
   - Loss of information
   - How many cutpoints?
   - Which cutpoints?
   - Bias introduced by outcome-dependent choice

Stat in Med 2006, 25: (65 citations so far as of July 2008)

Dichotomisation – the ‘optimal’ cutpoint method
- The ‘optimal’ cutpoint method is quite often used in clinical research
- Searches for the cutpoint on a continuous variable that minimises the P-value comparing 2 groups
- But …
  - Multiple testing means the P-value is not honest, e.g. P < 0.002 is really P < 0.05 after adjusting
  - ‘Optimal’ cutpoint is clinically meaningless
  - Unstable: not reproducible between studies
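The inflation of the type 1 error described on this slide can be illustrated with a small simulation. The sketch below is not from the presentation. It assumes a binary outcome that is independent of the continuous variable (so the null hypothesis is true), a Pearson chi-square test for the two groups, and candidate cutpoints between the 10th and 90th percentiles. The names `chi2_p_2x2` and `simulate` are hypothetical.

```python
import math
import random


def chi2_p_2x2(a, b, c, d):
    """P-value of the Pearson chi-square test (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 1.0  # degenerate table: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function, 1 df


def simulate(n_sim=300, n=200, seed=1):
    """Return (rejection rate with 'optimal' cutpoint, rejection rate with median cutpoint)."""
    rng = random.Random(seed)
    lo, hi = int(0.1 * n), int(0.9 * n)  # candidate cutpoints: 10th to 90th percentile
    hits_min = hits_median = 0
    for _ in range(n_sim):
        # x and y are generated independently, so any 'effect' is a false positive
        data = sorted((rng.random(), rng.random() < 0.5) for _ in range(n))
        events = [y for _, y in data]
        total_events = sum(events)
        low_events = sum(events[:lo])  # events among the lo smallest x values
        p_values = {}
        for k in range(lo, hi + 1):
            # low group = k smallest x values, high group = the rest
            a, b = low_events, k - low_events
            c = total_events - a
            d = (n - k) - c
            p_values[k] = chi2_p_2x2(a, b, c, d)
            low_events += events[k]
        hits_min += min(p_values.values()) < 0.05   # cutpoint chosen to minimise P
        hits_median += p_values[n // 2] < 0.05      # cutpoint fixed in advance
    return hits_min / n_sim, hits_median / n_sim


rate_min, rate_median = simulate()
print(f"'optimal' cutpoint: {rate_min:.2f}, prespecified median cutpoint: {rate_median:.2f}")
```

Under these assumptions the minimum-P cutpoint should reject the true null hypothesis far more often than the nominal 5%, while the prespecified cutpoint should stay close to it.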

Example – S-phase fraction in node-positive breast cancer
- ‘Optimal’: P =
- Corrected: P =

Continuous risk factors – some newer approaches
‘Non-parametric’ models
- Local smoothers (e.g. running line, lowess, etc.)
- Linear, quadratic or cubic regression splines
- Cubic smoothing splines
Parametric models
- Polynomials (quadratic, cubic, etc.)
- Non-linear curves
- Fractional polynomials

Fractional polynomial (FP) models
- Continuous risk variable, $X$
- A fractional polynomial of degree $m$ for $X$ with powers $p_1, p_2, \ldots, p_m$ is given by
  $FP_m(X) = \beta_1 X^{p_1} + \cdots + \beta_m X^{p_m}$
- Powers $p_1, \ldots, p_m$ are taken from a special set $\{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}$ (0 means log)
- Usually $m = 1$ or $m = 2$ is sufficient for a good fit
- Repeated powers ($p_1 = p_2$): $\beta_1 X^{p_1} + \beta_2 X^{p_1} \log X$
- 8 FP1 models, 36 FP2 models
- Systematically search for best fit among these models
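The model counts on this slide follow directly from the power set: 8 single powers for FP1, and 28 distinct pairs plus 8 repeated pairs for FP2. The sketch below enumerates them and builds the FP terms for a positive value of X. It is an illustration only: `POWERS`, `fp_basis` and `fp_value` are hypothetical names, and the coefficients in the usage line are made up.

```python
import math
from itertools import combinations_with_replacement

# The special set of powers; 0 stands for log(X)
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_basis(x, powers):
    """FP terms for one positive x; a repeated power gets an extra factor log(x)."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    terms = []
    previous = None
    for p in sorted(powers):
        if p == previous:
            term = terms[-1] * math.log(x)      # e.g. X^p, then X^p * log X
        else:
            term = math.log(x) if p == 0 else x ** p
        terms.append(term)
        previous = p
    return terms


def fp_value(x, powers, betas):
    """Evaluate beta_1 * X^p1 + ... + beta_m * X^pm (without intercept)."""
    return sum(b * t for b, t in zip(betas, fp_basis(x, powers)))


fp1_models = [(p,) for p in POWERS]
fp2_models = list(combinations_with_replacement(POWERS, 2))
print(len(fp1_models), len(fp2_models))  # 8 36

# Usage with made-up coefficients: FP2 with powers (-2, 0.5)
print(fp_value(2.0, (-2, 0.5), (1.0, 3.0)))
```

A systematic search then fits each of these candidate models and keeps the one with the best fit (lowest deviance) for each degree.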

Examples of FP2 curves - varying powers

Selecting FP functions with real data
- Prefer the simplest (linear) model, if it fits well
- Use a more complex (non-linear) FP1 or FP2 model only if indicated by the data
- Apply a carefully designed function selection procedure to
  - Control the type 1 error rate
  - Reduce over-fitting
- The function selection procedure:
  - Starts with the most complex model (FP2)
  - Applies a sequence of tests to reduce complexity if not supported by data
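The sequence of tests can be sketched as a closed test procedure on model deviances. The slide does not give the degrees of freedom; the sketch assumes the values commonly used in the FP literature (4 for best FP2 versus null, 3 for best FP2 versus linear, 2 for best FP2 versus best FP1). The function names and the deviances in the usage lines are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of the chi-square distribution for df = 1..4 (closed forms)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df == 1:
        return math.erfc(math.sqrt(h))
    if df == 2:
        return math.exp(-h)
    if df == 3:
        return math.erfc(math.sqrt(h)) + math.sqrt(2.0 * x / math.pi) * math.exp(-h)
    if df == 4:
        return math.exp(-h) * (1.0 + h)
    raise ValueError("only df = 1, 2, 3, 4 are supported in this sketch")


def select_function(dev_null, dev_linear, dev_fp1, dev_fp2, alpha=0.05):
    """Start from the best FP2 model and simplify while the data do not object.

    Arguments are deviances (-2 log likelihood) of the null model, the linear
    model, the best FP1 model and the best FP2 model.
    """
    # Step 1: any effect? Best FP2 versus null (assumed 4 df)
    if chi2_sf(dev_null - dev_fp2, 4) >= alpha:
        return "null"
    # Step 2: is a linear function suitable? Best FP2 versus linear (assumed 3 df)
    if chi2_sf(dev_linear - dev_fp2, 3) >= alpha:
        return "linear"
    # Step 3: is FP1 sufficient? Best FP2 versus best FP1 (assumed 2 df)
    if chi2_sf(dev_fp1 - dev_fp2, 2) >= alpha:
        return "FP1"
    return "FP2"


# Usage with made-up deviances
print(select_function(1000.0, 960.0, 959.0, 958.0))  # linear
print(select_function(1000.0, 960.0, 950.0, 930.0))  # FP2
```

Because each simpler model is accepted as soon as the more complex one is not significantly better, the default outcome is the simplest function the data support.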

Example – Whitehall 1
- Prospective cohort study of 18,403 male British Civil Servants initially aged
- Complete 10-year follow up (n = 17,260)
- Identified causes of death: all-cause, stroke, cancer, coronary heart disease
- Aimed to examine socio-economic features as risk factors
- We consider all-cause mortality (1,670 deaths) and systolic blood pressure, using logistic regression

Function selection procedure for systolic blood pressure

Question                     Comparison                χ²-difference   df   p-value
Any effect?                  Best FP2 versus null                           <
Linear function suitable?    Best FP2 versus linear                         <
FP1 sufficient?              Best FP2 vs. best FP1                          <

Whitehall 1 – Mortality and systolic blood pressure

Whitehall 1 example – remarks
- Categorical models with 2 or 5 categories seriously ‘shrink’ the range of risk estimates
- Linear model looks badly biased for low blood pressures; the shape of the function is wrong
- FP2 model fits well and appears plausible
- Results qualitatively similar if adjusted for age and other factors

Multivariable models
- Can extend the FP method to multivariable modelling when there are several continuous risk factors or confounders
- This is known as MFP (multivariable fractional polynomials)
- Royston & Sauerbrei (2008) explore MFP in detail
- Our book is on the Wiley conference stand!
- If desired, can select variables using a stepwise method (backward elimination)

Example: MFP model, Whitehall 1
- See Royston P & Sauerbrei W, Meth Inf Med 44: (2005)

Advantages of MFP
- Avoids cut-points for continuous variables
- Systematic selection of variables and FP functions
- Informative about shape of risk relationship for any variable in the model, not just the one of main interest

Concluding remarks
- Pharmaco-epidemiology appears to have plenty of continuous risk variables and plenty of continuous confounders
- (M)FP analysis may be very helpful in building parsimonious yet informative models with continuous risk variables
- We will be more than happy to discuss applications of the methodology with individuals