1
Prognostic modelling: General modelling strategy issues Ewout Steyerberg Professor of Medical Decision Making Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands Dijon, Nov 12, 2009
2
Erasmus MC – University Medical Center Rotterdam
3
Dijon
4
Workshop « Prognostic Modelling »
Important: prognostic modelling holds the promise to improve medical practice
Timely: prognostic modelling is an active area of research
many papers every month on prognostic factors and prognostic models
recent books
6
http://www.clinicalpredictionmodels.org http://www.springer.com/978-0-387-77243-1
8
Prognosis and prediction models
Physicians do not master the art of prognostication (Baatenburg de Jong)
Solution?
Databases
New prognostic factors
Quantitative analyses
Individual predictions instead of average predictions
9
Individualized decision making
Individualization requires predictions
Diagnosis: probability of disease
Therapy: probability of outcomes (prognosis)
Prognostic factors (natural course)
Predictive factors (response to therapy)
Predictions from multivariable models
Pragmatic: combination of prognostic factors
Knowledge: incremental ('independent') value of a prognostic factor
10
Prognostic models
Combine multiple patient- or disease-related characteristics to predict an outcome ('prognostic factors' / 'predictors')
Useful for medical practice
Inform patients and relatives; realistic expectations
Decision making: physicians + patients
Useful for research
Risk adjustment in observational data (comparison of series, hospitals)
Design and analysis of RCTs (selection, adjustment of treatment effects)
11
Prognosis in oncology
12
Applications in oncology: examples
TNM classification
Prognostic classifications: IPI for lymphoma
Individual predictions: nomograms / score charts (prostate cancer)
Electronic: spreadsheet / internet (Lynch syndrome)
13
Example of an advanced prediction model
Mismatch repair (MMR) mutations cause Lynch syndrome:
CRC; endometrial cancer; various other cancers
Diagnostic work-up:
Simple: Amsterdam / Bethesda criteria
Advanced: prediction model, e.g. PREMM
14
Prediction of MLH1 and MSH2 mutations in Lynch syndrome
PREMM(1,2) model equation:
Log odds (Pr/(1−Pr)) = −3.87 + 1.33·V1 + 2.78·V2 + 1.44·V3 + 0.59·V4 + 0.41·V5 + 0.951·V6 + 1.27·V7 + 0.964·V8 + 2.48·V9 + 0.404·V10 − 0.358·V11/10 − 0.293·V12/10
where:
V1 = CRC in the proband
V2 = 2 or more CRCs in the proband
V3 = endometrial cancer in the proband
V4 = other HNPCC-associated cancer in the proband
V5 = adenoma in the proband
V6 = 1 for CRC in a 1st-degree relative + 0.5 for CRC in a 2nd-degree relative
V7 = ≥2 1st-degree relatives with CRC
V8 = 1 for endometrial cancer in a 1st-degree relative + 0.5 for endometrial cancer in a 2nd-degree relative
V9 = ≥2 1st-degree relatives with endometrial cancer
V10 = presence of a 1st- or 2nd-degree relative with other HNPCC-associated cancer
V11 = sum of ages at diagnosis of CRC/adenoma
V12 = sum of ages at diagnosis of endometrial cancer
Balmana et al, JAMA 2006;296:1469-1478
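The equation above maps directly to a probability. A minimal sketch, with the coefficients transcribed from the slide (the garbled V10 term is read as +0.404·V10):

```python
import math

def premm_probability(v):
    """p(MLH1/MSH2 mutation) from the PREMM(1,2) log-odds equation.
    `v` maps indices 1..12 to the V1..V12 values defined on the slide."""
    log_odds = (-3.87 + 1.33 * v[1] + 2.78 * v[2] + 1.44 * v[3]
                + 0.59 * v[4] + 0.41 * v[5] + 0.951 * v[6] + 1.27 * v[7]
                + 0.964 * v[8] + 2.48 * v[9] + 0.404 * v[10]
                - 0.358 * v[11] / 10 - 0.293 * v[12] / 10)
    return 1 / (1 + math.exp(-log_odds))

# A proband with no risk features scores the intercept only: p of about 2%.
baseline = premm_probability({i: 0 for i in range(1, 13)})
```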
18
PREMM example pedigree (figure): predicted p(mutation) = 61%
Affected family members: CRC dx 45; CRC dx 31; endometrial cancer dx 47; CRC dx 46; urinary tract cancer dx 58; endometrial cancer dx 40
19
Conclusions I Workshop « Prognostic Modelling » is timely and important
20
Modelling strategy
What is the aim: knowledge on predictors, or providing predictions?
Predictors (prognostic factors)
Traditional (demographics, patient- and disease-related)
Modern ('omics', biomarkers, imaging)
Challenges:
Testing: independent effect, adjusted for confounders
Estimation: correct functional form
Predictions
Pragmatic combination of predictors
Recent book: general considerations and 7 modelling steps
Many challenges: biostatistical; epidemiological / decision-analytic
21
Prognostic modelling checklist
22
Prognostic modeling checklist: general considerations
23
Prognostic modeling checklist: 7 steps
24
Prognostic modeling checklist: validity
25
Example: prediction of myocardial infarction outcome
26
Aim: predictors or predictions? Title vs text; an additional publication focuses on clinicians, using only the 5 strongest predictors
27
Predictors
28
General considerations in GUSTO-I model
29
1. Data inspection, specifically: missing values
"Among the array of clinical characteristics considered potential predictor variables in the modeling analyses were occasional patients with missing values. Although a full set of analyses was performed in patients with complete data for all the important predictor variables (92% of the study patients), the subset of patients with one or more missing predictor variables had a higher mortality rate than the other patients, and excluding those patients could lead to biased estimates of risk. To circumvent this, a method for simultaneous imputation and transformation of predictor variables based on the concepts of maximum generalized variance and canonical variables was used to estimate missing predictor variables and allow analysis of all patients.[33,34] The iterative imputation technique conceptually involved estimating a given predictor variable on the basis of multiple regression on (possibly) transformed values of all the other predictor variables. End-point data were not explicitly used in the imputation process. The computations for these analyses were performed with S-PLUS statistical software (version 3.2 for UNIX),[32] using a modification of an existing algorithm.[33,34] The imputation software is available electronically in the public domain.[33]"
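The regression-based imputation described above has modern library analogues. A sketch using scikit-learn's IterativeImputer, which likewise regresses each incomplete predictor on the other predictors and excludes the end point; this is an illustration with toy data, not the original GUSTO-I software:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy predictor matrix (e.g. age, SBP, heart rate); end-point data excluded.
X = np.array([[70.0, 120.0, np.nan],
              [55.0, np.nan, 80.0],
              [62.0, 135.0, 72.0],
              [48.0, 110.0, 90.0],
              [66.0, 128.0, 75.0]])

# Iteratively estimate each predictor with missing values
# by regression on all the other predictors.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_complete = imputer.fit_transform(X)
```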
30
2. Coding of predictors
Continuous predictors:
linear and restricted cubic spline functions
truncation of values (for example for systolic blood pressure)
Categorical variables:
detailed categorization for location of infarction: anterior (39%), inferior (58%), or other (3%)
ordinality ignored for Killip class (I-IV): class III and class IV each contained only 1% of the patients
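Two of the coding devices mentioned, truncation and unordered dummy coding, can be sketched as follows (illustrative values, not the GUSTO-I data):

```python
import numpy as np

# Truncation: SBP values above 120 mm Hg are set to 120, as in GUSTO-I.
sbp = np.array([90, 120, 150, 200])
sbp_trunc = np.clip(sbp, None, 120)   # -> [90, 120, 120, 120]

# Killip class coded as unordered dummy variables (ordinality ignored),
# with class I as the reference category.
killip = np.array(["I", "II", "III", "IV"])
dummies = (killip[:, None] == np.array(["II", "III", "IV"])).astype(int)
# Each row has at most one 1; the class-I row is all zeros.
```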
32
3. Model specification
Main effects: "... which variables were most strongly related to short-term mortality": hypothesis testing rather than a prediction question; specific technique not explicitly stated, but likely p<0.05
Interactions: many tested, one included: Age×Killip
Linearity of predictors: transformations chosen in univariate analysis were also used in the multivariable analysis
33
4. Model estimation
Standard maximum likelihood (ML)
No shrinkage / penalization
No external information
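The contrast between standard ML and the shrinkage that GUSTO-I did not use can be sketched with scikit-learn on simulated data (C=1e6 approximates an unpenalized fit):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

# Near-unpenalized maximum likelihood (very weak L2 penalty).
ml_fit = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)

# Ridge penalization shrinks coefficients toward zero.
ridge_fit = LogisticRegression(C=0.1, max_iter=1000).fit(X, y)
shrinkage = float(np.abs(ridge_fit.coef_).sum() / np.abs(ml_fit.coef_).sum())
```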
34
5. Model performance
Discrimination: area under the receiver operating characteristic curve (AUC, equivalent to the c statistic)
Calibration: observed vs predicted
graphically, including deciles (similar to the Hosmer-Lemeshow goodness-of-fit test)
specific subgroups of patients
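Both measures are standard library calls today. A minimal sketch with toy predictions:

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.3, 0.7, 0.8, 0.2, 0.9, 0.4, 0.6])

# Discrimination: AUC / c statistic.
auc = roc_auc_score(y_true, y_prob)

# Calibration: observed event rate vs mean predicted risk per bin
# (GUSTO-I plotted deciles; two bins suffice for this toy example).
obs, pred = calibration_curve(y_true, y_prob, n_bins=2)
```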
35
Calibration
37
6. Model validation “First, 10-fold cross validation was performed: the model was fitted on a randomly selected subset of 90% of the study patients, and the resulting fit was tested on the remaining 10%. This process was repeated 10 times to estimate the extent to which the predictive accuracy of the model (based on the entire sample) was overoptimistic. Second, for each of 100 bootstrap samples (samples of the same size as the original population but with patients drawn randomly, with replacement, from the full study population), the model was refitted and then tested on the original sample, again to estimate the degree to which the predictive accuracy of the model would be expected to deteriorate when applied to an independent sample of patients.”
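The two schemes quoted above can be sketched on simulated data (scikit-learn; 50 bootstrap draws instead of 100 to keep the example fast):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] + rng.normal(size=300) > 0).astype(int)
model = LogisticRegression(max_iter=1000)

# 10-fold cross-validation: fit on 90% of patients, test on the held-out 10%.
cv_auc = float(cross_val_score(model, X, y, cv=10, scoring="roc_auc").mean())

# Bootstrap: refit on a resample of the same size (drawn with replacement),
# then test on the original sample.
boot_aucs = []
for _ in range(50):
    idx = rng.integers(0, len(y), len(y))
    refit = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    boot_aucs.append(roc_auc_score(y, refit.predict_proba(X)[:, 1]))
boot_auc = float(np.mean(boot_aucs))
```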
38
7. Model presentation
Predictor effects:
relative importance: chi-square statistics
relative effects: odds ratios, graphically
Predictions: formula
40
Risk Model for 30-Day Mortality
Probability of death within 30 days = 1/[1 + exp(−L)], where
L = 3.812 + 0.07624·age − 0.03976·min(SBP, 120) + 2.0796·[Killip class II] + 3.6232·[Killip class III] + 4.0392·[Killip class IV] − 0.02113·heart rate + 0.03936·(heart rate − 50)+ − 0.5355·[inferior MI] − 0.2598·[other MI location] + 0.4115·[previous MI] − 0.03972·height + 0.0001835·(height − 154.9)+^3 − 0.0008975·(height − 165.1)+^3 + 0.001587·(height − 172.0)+^3 − 0.001068·(height − 177.3)+^3 + 0.0001943·(height − 185.4)+^3 + 0.09299·time to treatment − 0.2190·[current smoker] − 0.2129·[former smoker] + 0.2497·[diabetes] − 0.007379·weight + 0.3524·[previous CABG] + 0.2142·[treatment with SK and intravenous heparin] + 0.1968·[treatment with SK and subcutaneous heparin] + 0.1399·[treatment with combination TPA and SK plus intravenous heparin] + 0.1645·[hx of hypertension] + 0.3412·[hx of cerebrovascular disease] − 0.02124·age·[Killip class II] − 0.03494·age·[Killip class III] − 0.03216·age·[Killip class IV]
Explanatory notes:
1. [c] = 1 if the patient falls into category c, [c] = 0 otherwise.
2. (x)+ = x if x > 0, (x)+ = 0 otherwise.
3. For systolic blood pressure (SBP), values >120 mm Hg are truncated at 120.
4. For time to treatment, values <2 hours are truncated at 2.
5. Measurement units: age in years; blood pressure in mm Hg; heart rate in beats per minute; height in centimeters; time to treatment in hours; weight in kilograms.
6. "Other" MI location refers to posterior, lateral, or apical, but not anterior or inferior.
7. CABG indicates coronary artery bypass grafting; SK, streptokinase; hx, history.
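The (x)+ notation (explanatory note 2) and its cubed variant describe the truncated power terms of a restricted cubic spline. A sketch of the height contribution to L, with knots and coefficients transcribed from the formula above:

```python
def pos(x):
    """(x)+ = x if x > 0, else 0 (explanatory note 2)."""
    return max(x, 0.0)

def height_term(height_cm):
    """Contribution of height (cm) to the linear predictor L:
    a linear term plus restricted-cubic-spline terms with knots at
    154.9, 165.1, 172.0, 177.3 and 185.4 cm."""
    return (-0.03972 * height_cm
            + 0.0001835 * pos(height_cm - 154.9) ** 3
            - 0.0008975 * pos(height_cm - 165.1) ** 3
            + 0.001587 * pos(height_cm - 172.0) ** 3
            - 0.001068 * pos(height_cm - 177.3) ** 3
            + 0.0001943 * pos(height_cm - 185.4) ** 3)

# Below the first knot only the linear term is active.
low = height_term(150.0)   # = -0.03972 * 150
```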
42
Conclusion II
GUSTO-I makes for an interesting case study on:
General modelling considerations
Illustration of the 7 modelling steps
Internal vs external validity (early 1990s → 2009?)
Debate is possible on some choices:
1. Missing values: multiple imputation, including the outcome
2. Coding: fractional polynomials? Lump categories?
3. Selection: stepwise works because of the large N
4. Estimation: standard ML works because of the large N; penalization?
5. Performance: nothing on usefulness
6. Validation: CV and bootstrap; not necessary because of the large N?
7. Presentation: predictor effects: nice! Predictions: score chart / nomogram
43
Challenges in developing a valid prognostic model
Theoretical: biostatistical research
New analysis techniques, e.g. neural networks / support vector machines / ...
Fractional polynomials / splines for continuous predictors
Performance measures
Simulations: what makes sense as a strategy?
Applications: epidemiological and decision-analytic research
Subject-matter knowledge
Clinical experts
Literature: review / meta-analysis
Balance research questions vs effective sample size
Incremental value of new markers
Transportability and external validity
Clinical impact of using a model
44
Which performance measure when?
1. Discrimination: if poor, usefulness is unlikely (but usefulness is never negative, i.e. ≥ 0)
2. Calibration: if poor in a new setting, the prediction model may harm rather than improve decision making
45
Phases of model development (Reilly Ann Intern Med 2006;144(3):201-9)