D:/rg/folien/ms/ms-USA-040511.ppt F 1 Assessment of prediction error of risk prediction models Thomas Gerds and Martin Schumacher Institute of Medical.


Slide 1: Assessment of prediction error of risk prediction models
Thomas Gerds and Martin Schumacher
Institute of Medical Biometry and Medical Informatics, University Hospital Freiburg, Germany

Slide 2: Outline
– Situation
– Measures of prediction error
– Application to prediction of breast cancer survival
– General conclusion
– Considerations for breast cancer risk prediction

Slide 3: Situation (1)
– Goal: Assessment of predictions π(t|X_i) based on a comparison with the actually observed outcomes T_i in a sample of n individuals (i = 1,…,n)
– Prediction: π(t|X) is the predicted probability that an individual will be event-free up to t units of time, based on covariate information X available at t = 0
– Outcome: T denotes the time to the event of interest

Slide 4: Situation (2)
– Prediction: π(t|X)
– can be defined for a fixed time t or for a time range
– should have the properties of a survival probability function
– is ideally externally derived, but otherwise can be anything: produced by statistical model building, by machine learning techniques, or by expert guesses

Slide 5: Measures of prediction error (1)
– General loss function approach: E(L(T, X, π))
– Common choices:
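Two loss functions that appear later in these slides are the quadratic (Brier) score and the logarithmic score. The sketch below shows both for a single individual at a fixed time t. The function names and the clipping constant `eps` are illustrative assumptions, not taken from the slides.

```python
import math


def quadratic_loss(event_free: bool, pred: float) -> float:
    """Quadratic (Brier) loss: squared distance between the 0/1 status and the prediction."""
    y = 1.0 if event_free else 0.0
    return (y - pred) ** 2


def logarithmic_loss(event_free: bool, pred: float, eps: float = 1e-12) -> float:
    """Logarithmic loss: negative log of the probability assigned to what was observed."""
    # Clip the prediction so that log(0) cannot occur (eps is an assumed safeguard).
    p = min(max(pred, eps), 1.0 - eps)
    return -math.log(p) if event_free else -math.log(1.0 - p)


# Hypothetical individual who is still event-free at time t,
# with a predicted event-free probability of 0.8.
example_quadratic = quadratic_loss(True, 0.8)
example_logarithmic = logarithmic_loss(True, 0.8)
```

Both losses are zero for a perfect prediction and grow as the predicted probability moves away from the observed status. The logarithmic loss penalizes confident wrong predictions much more heavily.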

Slide 6: Measures of prediction error (2)
– Expected quadratic or Brier score: "Mean Squared Error of Prediction (MSEP)"

Slide 7: Measures of prediction error (2)
– Expected quadratic or Brier score: "Mean Squared Error of Prediction (MSEP)"
– Decomposition: S(t|X) denotes the "true" probability that an individual with covariate X will be event-free up to t
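The expected Brier score and its decomposition are commonly written as below. This is a sketch in assumed notation: Y(t) = I(T > t) for the event-free status at time t and π(t|X) for the prediction; S(t|X) is the "true" event-free probability named on the slide.

```latex
\begin{align*}
\mathrm{MSEP}(t) &= E\bigl[(Y(t) - \pi(t|X))^2\bigr], \qquad Y(t) = I(T > t) \\
&= E\bigl[(Y(t) - S(t|X))^2\bigr] + E\bigl[(S(t|X) - \pi(t|X))^2\bigr] \\
&= E\bigl[S(t|X)\,(1 - S(t|X))\bigr] + E\bigl[(S(t|X) - \pi(t|X))^2\bigr]
\end{align*}
```

The cross term vanishes because E[Y(t) | X] = S(t|X). The first term is the irreducible variability of the outcome; the second measures how far the prediction is from the true probability.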

Slide 8: Measures of prediction error (3)
– Empirical quadratic or Brier score: "Residual Sum of Squares (RSS)"
– MSEP and RSS are time-dependent in survival problems
– Graphical tool: plotting RSS over time
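A minimal sketch of the empirical Brier score evaluated on a grid of time points, assuming fully observed (uncensored) event times. The function name, the `predict(t, x)` callback and the toy data are hypothetical.

```python
def empirical_brier(times, covariates, predict, t):
    """RSS(t) = (1/n) * sum_i (I(T_i > t) - predict(t, X_i))^2 for uncensored data."""
    n = len(times)
    total = 0.0
    for event_time, x in zip(times, covariates):
        status = 1.0 if event_time > t else 0.0  # 1 = still event-free at t
        total += (status - predict(t, x)) ** 2
    return total / n


# Hypothetical event times (years); covariates are unused by this toy predictor.
times = [2.0, 5.0, 7.0, 10.0]
covariates = [None] * len(times)


def constant_half(t, x):
    """Toy prediction rule: event-free probability 0.5 for everyone at every t."""
    return 0.5


# Pairs (t, RSS(t)) that could be handed to any plotting tool.
grid = [1.0, 3.0, 6.0, 8.0]
curve = [(t, empirical_brier(times, covariates, constant_half, t)) for t in grid]
```

Plotting the second component of `curve` against the first gives the prediction error curve over time.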

Slide 10: Measures of prediction error (3)
– Empirical quadratic or Brier score: "Residual Sum of Squares (RSS)"
– Incorporation of censored observations: "Weighted Residual Sum of Squares (WRSS)"
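One common way to weight the residuals under right censoring is inverse probability of censoring weighting, with the censoring distribution estimated by Kaplan-Meier. The sketch below assumes that scheme; the slides only state that a weighted residual sum of squares is used. Function names and data are hypothetical, and t should lie below the largest observed time so that the weights stay finite.

```python
def censoring_survival(times, status):
    """Kaplan-Meier estimate of G(u) = P(C > u); censorings (status 0) act as the 'events'."""
    data = sorted(zip(times, status))
    n = len(data)
    steps = []  # (time, value of G just after that time)
    g = 1.0
    i = 0
    while i < n:
        u = data[i][0]
        tied = [s for (tt, s) in data[i:] if tt == u]
        censored = sum(1 for s in tied if s == 0)
        events = len(tied) - censored
        if censored:
            # Events tied with censorings are taken to happen first.
            at_risk = n - i - events
            g *= 1.0 - censored / at_risk
            steps.append((u, g))
        i += len(tied)

    def G(u, left_limit=False):
        value = 1.0
        for time, g_after in steps:
            if time < u or (time == u and not left_limit):
                value = g_after
            else:
                break
        return value

    return G


def weighted_brier(times, status, preds, t):
    """WRSS(t): squared residuals reweighted by the inverse censoring survival."""
    G = censoring_survival(times, status)
    total = 0.0
    for obs_time, d, p in zip(times, status, preds):
        if obs_time <= t and d == 1:      # event observed by t: status is 0
            total += (0.0 - p) ** 2 / G(obs_time, left_limit=True)
        elif obs_time > t:                # known to be event-free at t: status is 1
            total += (1.0 - p) ** 2 / G(t)
        # censored at or before t: status at t unknown, weight 0
    return total / len(times)


# Hypothetical data: status 1 = event, 0 = censored; preds are event-free probabilities at t.
obs_times = [1.0, 2.0, 3.0, 4.0]
obs_status = [1, 0, 1, 1]
wrss = weighted_brier(obs_times, obs_status, [0.5] * 4, t=2.5)
```

Without any censoring all weights equal 1 and the result reduces to the unweighted RSS.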

Slide 11: Application to prediction of breast cancer survival
GBSG-2 study (German Breast Cancer Study Group)
– 686 patients with complete information on prognostic factors
– Two thirds are randomized, otherwise standardized treatment
– Median follow-up 5 years, 299 events for event-free survival
– Prognostic factors considered: age, tumor size, tumor grade, number of positive lymph nodes, progesterone receptor, estrogen receptor
– Predictions for individual patients are derived in terms of conditional event-free probabilities given the covariate combination, by means of the Nottingham Prognostic Index and a Cox regression model with all six prognostic factors

Slide 13: Which benchmark value?
– "Naive" prediction π(t|X) = 0.5 for all t and X gives a Brier score value of 0.25
– Common prediction π(t) for all individuals, ignoring the available covariate information ("pooled Kaplan-Meier")
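Both benchmark values follow from short calculations. The sketch below uses assumed notation: Y(t) = I(T > t) for the event-free status and S(t) = P(T > t) for the marginal event-free probability that the pooled Kaplan-Meier estimate targets.

```latex
\begin{align*}
\pi(t|X) \equiv 0.5:&\quad (Y(t) - 0.5)^2 = 0.25 \ \text{ for } Y(t) \in \{0, 1\} \\
\pi(t|X) \equiv S(t):&\quad E\bigl[(Y(t) - S(t))^2\bigr] = S(t)\,(1 - S(t)) \le 0.25
\end{align*}
```

The covariate-free prediction therefore never does worse in expectation than the naive value 0.5, which makes it the sharper benchmark.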

Slide 15: Which benchmark value?
– "Naive" prediction π(t|X) = 0.5 for all t and X gives a Brier score of 0.25
– Common prediction π(t) for all individuals, ignoring the available covariate information ("pooled Kaplan-Meier")
– Calculation of R² measures for checking various aspects of prediction models
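An R² measure of this kind is often defined as the relative reduction in Brier score against the covariate-free benchmark. The sketch below assumes that definition, uses uncensored 0/1 outcomes at a fixed t, and works on made-up numbers; none of the values come from the study.

```python
def brier(outcomes, preds):
    """Mean squared difference between 0/1 event-free status and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(outcomes, preds)) / len(outcomes)


def r_squared(score_model, score_benchmark):
    """Assumed definition: 1 - (Brier score of model) / (Brier score of benchmark)."""
    return 1.0 - score_model / score_benchmark


# Hypothetical event-free status at a fixed t (1 = event-free) and model predictions.
outcomes = [1, 1, 0, 1, 0, 1, 1, 0]
model_preds = [0.9, 0.8, 0.2, 0.7, 0.3, 0.9, 0.6, 0.4]

# Covariate-free benchmark: the same pooled event-free proportion for everyone.
pooled = sum(outcomes) / len(outcomes)
score_pooled = brier(outcomes, [pooled] * len(outcomes))
score_model = brier(outcomes, model_preds)
r2 = r_squared(score_model, score_pooled)
```

A value near 0 means the model barely improves on ignoring the covariates. Negative values are possible and indicate a model that predicts worse than the benchmark.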

Slide 18: General conclusion
The quadratic or Brier score
– is the mean squared error of prediction (MSEP) when predictions are made in terms of event(-free) probabilities
– allows the assessment of any kind of predictions based on individual covariate values
– can be estimated even in the presence of right censoring by a weighted residual sum of squares in a nonparametric way
– is a valuable tool to detect overfitting
– allows the calculation of R² measures
– can be adapted to the situation of competing risks and dynamic updating of predictions

Slide 19: Considerations for breast cancer risk prediction
– Intention: Assessment of predictions for t = 5y based on aggregated data published by Costantino et al., JNCI 1999; a constant prediction ignoring all covariate information is used as benchmark value
– Prediction: π(t|X) is the predicted probability that a woman will develop breast cancer up to time t, based on covariate information (including age) available at t = 0 (entry into program or study; time when the prediction is performed)
– Outcome: T denotes the time from entry into the program to development of breast cancer; the outcome is the indicator of development of breast cancer before t (1 if it occurs, 0 otherwise)

Slide 20: Costantino et al., Journal of the National Cancer Institute, Vol. 91, No. 18, September 15, 1999

Slide 21: Estimated prediction error based on aggregated data
Table columns: Age group (y) | Brier score: model 1, const. pred. | Logarithmic score: model 1, const. pred.
Table rows: one per age group, plus "All ages"
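With aggregated data, both scores can be computed from cell counts alone if every woman in a cell is assumed to share one predicted risk. The sketch below makes that assumption; the cell layout and all numbers are hypothetical and are not the published figures.

```python
import math


def aggregated_scores(cells):
    """cells: list of (n_women, n_cases, predicted_risk). Returns (Brier, logarithmic) score."""
    n_total = sum(n for n, _, _ in cells)
    # Cases contribute (1 - p)^2 each, non-cases contribute p^2 each.
    brier = sum(d * (1 - p) ** 2 + (n - d) * p ** 2 for n, d, p in cells) / n_total
    # Negative mean log-probability assigned to the observed outcomes.
    log_score = -sum(d * math.log(p) + (n - d) * math.log(1 - p)
                     for n, d, p in cells) / n_total
    return brier, log_score


# Hypothetical cells: (number of women, observed cases, predicted 5-year risk).
cells = [(1000, 10, 0.01), (1000, 30, 0.03)]
model_brier, model_log = aggregated_scores(cells)

# Constant prediction: the overall observed proportion for every woman.
overall = sum(d for _, d, _ in cells) / sum(n for n, _, _ in cells)
const_brier, const_log = aggregated_scores([(n, d, overall) for n, d, _ in cells])
```

In this toy example the model improves on the constant prediction only in the fourth decimal of the Brier score. Small differences of this size are to be expected when the event is rare.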

Slide 22: Estimated relative risk (RR) for predicted risk quintiles (model 1, all ages)
Table rows: Predicted 5-year risk (%), by quintile | No. of women | Observed breast cancer | RR
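A relative risk per risk group can be sketched as the observed proportion in each group divided by the proportion in a reference group. Taking the lowest-risk group as reference is an assumption here, and the counts are made up.

```python
def relative_risks(groups, reference=0):
    """groups: list of (n_women, n_cases). RR of each group versus the reference group."""
    ref_n, ref_cases = groups[reference]
    ref_rate = ref_cases / ref_n
    return [(cases / n) / ref_rate for n, cases in groups]


# Hypothetical counts for three predicted-risk groups, lowest risk first.
groups = [(1000, 5), (1000, 10), (1000, 20)]
rr = relative_risks(groups)
```

The reference group always gets RR = 1. A model that ranks women well should show RR increasing across the groups.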

Slide 23: "Diagnostic" properties of predicted risk quintiles (model 1, all ages)
Table columns: Cutpoint (pred. 5-year risk, %) | Sensitivity | Specificity | Pos. pred. value | Neg. pred. value
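Treating "predicted risk at or above a cutpoint" as a positive test gives the four diagnostic measures. The sketch below uses individual-level data for clarity; the function name, the ">= cutpoint" convention and the data are assumptions.

```python
def diagnostic_properties(risks, outcomes, cutpoint):
    """Sensitivity, specificity, PPV and NPV when 'risk >= cutpoint' counts as test-positive."""
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= cutpoint and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= cutpoint and y == 0)
    fn = sum(1 for r, y in zip(risks, outcomes) if r < cutpoint and y == 1)
    tn = sum(1 for r, y in zip(risks, outcomes) if r < cutpoint and y == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical predicted 5-year risks (%) and outcomes (1 = developed breast cancer).
risks = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
outcomes = [0, 0, 1, 0, 1, 1]
props = diagnostic_properties(risks, outcomes, cutpoint=1.5)
```

Raising the cutpoint trades sensitivity for specificity. When the event is rare, the positive predictive value stays low even for high cutpoints.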
