A SAS Macro to Compute Added Predictive Ability of New Markers in Logistic Regression Kevin Kennedy, MS Saint Luke’s Hospital, Kansas City, MO Kansas City Area SAS User Group Meeting September 3, 2009

Acknowledgment A special thanks to Michael Pencina (PhD, Biostatistics Dept, Boston University, Harvard Clinical Research Institute) for his valuable input, ideas, and additional formulas to strengthen output.

Motivation Predicting dichotomous outcomes is important: Will a patient develop a disease? Will an applicant default on a loan? Will KSU win a game in the NCAA tournament? Improving an already usable model should be a continuing goal.

Motivation Many published models exist. Example: Risk of Bleeding after PCI predicts a patient's risk of bleeding after a PCI (percutaneous coronary intervention) based on 10 patient characteristics: age, gender, shock, prior intervention, kidney function, etc. A 78-year-old female with prior shock and congestive heart failure has a ~8% chance of bleeding after the procedure (national average = 2-3%). Why is this important? We can treat those at high risk appropriately. Mehta et al. Circ Intervention, June 2009.

However… these models aren't etched in stone. New markers (variables) should be investigated to improve model performance. Important question: how do we determine whether we should add these new markers to the model?

Project Goal Compare model 1 (without the new variable) and model 2 (with it): how much does the new variable add to model performance? Output of interest: the predicted probability of the event computed from each logistic regression model, so we talk about one probability per model for each subject.

Outline Traditional comparisons: the Receiver Operating Characteristic curve. New measures: IDI, NRI, and the Vickers decision curve. A SAS macro to obtain the output.

Traditional Approach - AUCs It is common to plot the Receiver Operating Characteristic (ROC) curve and report the area underneath it (AUC), also called the c-statistic. It measures model discrimination and is equivalent to the probability that the predicted risk is higher for an event than for a non-event [4].

AUC/ROC Background The ROC curve depicts the trade-off between benefit (true positives) and cost (false positives) obtained by choosing a "cut off" on the predicted probability to classify individuals as positive or negative. [Figure: overlapping distributions of predicted probabilities for events and non-events, with a cut off separating true/false positives and negatives; at the illustrated cut off the true positive rate (sensitivity) is .9 and the false positive rate (1 − specificity) is .4.]
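To make the cut-off idea concrete, here is a minimal sketch (not part of the macro) that computes sensitivity and specificity at one arbitrary cut-off of 0.20, assuming a dataset probs with a 0/1 outcome y and a predicted probability p (both names are hypothetical):

proc sql;
  /* sensitivity: proportion of events with predicted probability at or above the cut-off */
  /* specificity: proportion of non-events with predicted probability below the cut-off   */
  select sum(p >= 0.20 and y = 1) / sum(y = 1) as sensitivity,
         sum(p <  0.20 and y = 0) / sum(y = 0) as specificity
  from probs;
quit;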

[Figure: ROC curve, plotting the true positive rate (sensitivity) on the y-axis against the false positive rate (1 − specificity) on the x-axis, each running from 0% to 100%.]

AUC Computation All possible pairs are formed between events and non-events. A dataset with 100 events and 1,000 non-events would have 100*1000 = 100,000 pairs. If the subject who actually experienced the event has the higher predicted probability, the pair scores a '1' (concordant); otherwise it scores a '0' (discordant), with ties scoring .5. The c-statistic is the average of these scores. The method of DeLong is then used to compare the AUCs of model 1 and model 2.
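A minimal sketch of this pairwise computation (for illustration only; the macro itself gets the AUCs from PROC LOGISTIC), again assuming a dataset probs with outcome y and predicted probability p:

proc sql;
  /* cross join every event with every non-event and average the concordance scores */
  select mean(case when e.p > n.p then 1
                   when e.p = n.p then 0.5
                   else 0 end) as c_statistic
  from probs(where=(y=1)) as e,
       probs(where=(y=0)) as n;
quit;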

Advantages It is used all the time. Recommended guidelines exist for excellent/good/poor discrimination. It is default output in proc logistic, with new extensions in version 9.2 (the roc and roccontrast statements), and is available in other computing packages.

Disadvantages It is rank based: a comparison of .51 to .50 is treated the same as .7 to .1. A high value doesn't imply a useful model. Example: if all events have probability = .51 and all non-events have probability = .5, discrimination is perfect (c = 1) but the model is not useful. It is also extremely hard to find markers that result in a high AUC; Pepe [5] claims that even an odds ratio of 3 doesn't yield good discrimination.

Alternatives Pencina and D'Agostino (Statistics in Medicine, 2008) suggest two additional statistics: the IDI (Integrated Discrimination Improvement) and the NRI (Net Reclassification Improvement). Vickers [2,3] developed graphical techniques for comparing models.

IDI A measure of improved sensitivity without sacrificing specificity. The formula measures how much the predicted probability 'p' increases for events and how much it decreases for non-events (see below).
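The formula itself did not survive the transcript; as defined by Pencina et al. [1] it is

IDI = (mean p for events, model 2 − mean p for events, model 1) + (mean p for non-events, model 1 − mean p for non-events, model 2)

and the relative IDI divides this by the discrimination slope of model 1 (mean p for events minus mean p for non-events), which is consistent with the worked example on the next slide.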

IDI Example: the mean 'p' for events increases by 5% (from .20 under model 1 to .25 under model 2) and the mean 'p' for non-events decreases by 3% (from .15 to .12). Absolute IDI = (.25 − .20) + (.15 − .12) = .08. Relative IDI = .08 / (.20 − .15) = 160% relative improvement.

NRI Measures how well the new model reclassifies events and non-events. It is dependent on how we decide to classify observations. Example risk groups: 0-10% low, 10-20% moderate, >20% high. Questions: do the patients experiencing an event go up in risk (e.g., moderate to high between model 1 and model 2)? Do patients not experiencing an event go down in risk (e.g., moderate to low)?

NRI Formula (Pencina et al. [1]): NRI = [P(up | event) − P(down | event)] + [P(down | non-event) − P(up | non-event)], where 'up' and 'down' denote movement to a higher or lower risk category when going from model 1 to model 2.

NRI Computation Example Three groups are defined: <10% (low), 10-20% (moderate), >20% (high). Each individual is assigned a group under model 1 and under model 2, giving two cross-tabulation tables (one for events, one for non-events).

NRI Computation Example Crosstab 1: events (100 events). [Table: 3x3 cross-tabulation of risk group under model 1 (rows) against model 2 (columns); the individual cell counts did not survive the transcript.] Summary: 20 events move up, 10 events move down, and 70 events do not move, a net of 10/100 (10%) of events reclassified correctly.

NRI Computation Example Crosstab 2: non-events (200 non-events). [Table: 3x3 cross-tabulation of risk group under model 1 (rows) against model 2 (columns); the individual cell counts did not survive the transcript.] Summary: 15 non-events move up, 35 move down, and 150 do not move, a net of 20/200 (10%) of non-events reclassified correctly. Combining the two tables gives NRI = 0.10 + 0.10 = 0.20.

NRI Caveats The NRI is dependent on the groups chosen: would we reach similar conclusions with groups <15, 15-30, >30? There are also alternative ways to define movement: any up/down movement (so a change in p from .15 to .151 counts as 'up'), or a threshold (e.g., a change of 3% constitutes an up/down move, so .33 to .37 is 'up' but .33 to .34 is no movement). Good news: the macro handles all these cases, and you can request them all at once.

Vickers Decision Curve A graphical comparison of models 1 and 2 based on 'net' benefit (a concept first attributed to Peirce, 1884). It is useful when a treatment threshold matters. Example: if a person's predicted probability of the outcome is greater than 10%, we treat with strategy A; here we would want to compare the models at that threshold.
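For reference (this formula is not shown on the slide), Vickers and Elkin [2] define the net benefit at a threshold probability pt as

net benefit = TP/n − (FP/n) × pt/(1 − pt)

where TP and FP are the counts of true and false positives when everyone above the threshold is treated, and n is the total sample size.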

Vickers Decision Curve Example: N = 1000, dichotomous event, 10% threshold.

Predicted probability | Outcome: Yes | Outcome: No
>=10% | 80 (true positives) | 200 (false positives)
<10% | 20 | 700

Net benefit = 80/1000 − (200/1000) × (0.10/0.90) = 0.0578, i.e., 5.78 net true positive results per 100 compared with treating all as negative.
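A minimal data-step sketch (illustrative only, not part of the macro) that reproduces this arithmetic:

data _null_;
  n = 1000; tp = 80; fp = 200; pt = 0.10;   /* counts and threshold from the example above */
  nb = tp/n - (fp/n) * (pt/(1 - pt));       /* net benefit versus treating all as negative */
  put "Net benefit at the 10% threshold: " nb 6.4;
run;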

[Figure: decision curve plotting net benefit (y-axis, roughly −0.01 to 0.06) against threshold probability in % (x-axis, 10 to 50) for three strategies: treat all, model 1, and model 2. At low thresholds there is no difference between models 1, 2, and treat-all; at higher thresholds model 2 seems to outperform model 1.]

SAS Macro to Obtain Output

%added_pred(data= , id= , y= , model1cov= , model2cov= , nripoints=ALL, nriallcut=%str(), vickersplot=FALSE, vickerpoints=%str());

Model1cov = the initial model's covariates
Model2cov = the new model's covariates
Nripoints = NRI risk-group levels (e.g., for <10, 10-20, >20 insert nripoints=.1 .2)
Nriallcut = use if you want to test an amount of increase or decrease instead of levels; e.g., to ask whether a person's probability increases/decreases by 5%, insert nriallcut=.05
Vickerpoints = decision-curve thresholds to test (e.g., 10%)

An illustrative call with hypothetical names follows.
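For illustration only, a call with hypothetical dataset and variable names (pci, patid, bleed, baseline covariates age, sex, shock, and a candidate marker egfr) might look like:

%added_pred(data=pci, id=patid, y=bleed,
            model1cov=age sex shock,
            model2cov=age sex shock egfr,
            nripoints=.1 .2,
            nriallcut=.05,
            vickersplot=TRUE,
            vickerpoints=.1);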

AUC Section of Macro AUC comparisons are easy with SAS 9.2:

PROC LOGISTIC DATA=&DATA;
  MODEL &y=&model1cov &model2cov;
  ROC 'FIRST' &model1cov;
  ROC 'SECOND' &model2cov;
  ROCCONTRAST REFERENCE='FIRST';
  ODS OUTPUT ROCASSOCIATION=ROCASS ROCCONTRASTESTIMATE=ROCDIFF;
RUN;

If working from an earlier version, the %roc macro is called from the SAS website [6].

AUC Section of Macro Sample output from

%added_pred(data=data, id=id, y=event, model1cov=x1 x2 x3 x4, model2cov=x1 x2 x3 x4 x5, …….);

Model 1 AUC: 0.77907
Model 2 AUC: 0.79375
Difference in AUC: 0.0147
Std error for difference: .0042
P-value for difference: 0.0005
95% CI for difference: (0.00646, 0.0229)

IDI Section of Macro Use the proc logistic output datasets for model 1 and model 2:

PROC LOGISTIC DATA=&DATA;
  MODEL &y=&model1cov;
  OUTPUT OUT=OLD PRED=P_OLD;
RUN;
PROC LOGISTIC DATA=&DATA;
  MODEL &y=&model2cov;
  OUTPUT OUT=NEW PRED=P_NEW;
RUN;
proc sql noprint;
  create table probs as
  select *, (p_new-p_old) as pdiff
  from old(keep=&id &y &model1cov &model2cov p_old) as a
       join new(keep=&id &y &model1cov &model2cov p_new) as b
       on a.&id=b.&id
  order by &y;
quit;

Now use proc means or sql to obtain: p_event_new, p_event_old, p_nonevent_new, p_nonevent_old (see the sketch below).
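A hedged sketch of that step (the macro's actual code may differ): obtain the four means from probs and combine them into the absolute IDI.

proc sql noprint;
  /* mean predicted probabilities under each model, separately for events and non-events */
  select mean(p_new), mean(p_old) into :p_event_new, :p_event_old
    from probs where &y=1;
  select mean(p_new), mean(p_old) into :p_nonevent_new, :p_nonevent_old
    from probs where &y=0;
quit;
%let idi=%sysevalf((&p_event_new - &p_event_old) - (&p_nonevent_new - &p_nonevent_old));
%put Absolute IDI = &idi;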

IDI Section of Macro Sample output:

IDI: .0207
IDI std err: .0064
Z-value: 6.3
P-value: <.0001
95% CI: (0.0143, 0.0272)
Probability change for events: .0186
Probability change for non-events: -.002125
Relative IDI: .167

NRI Section of Macro The NRI is computed in 3 parts: all groups (any up/down movement), user-defined groups (e.g., <10, 10-20, >20), and a threshold (e.g., a change of 3%). The coding is more involved here, containing counts for the number of groups and do-loops over the various thresholds and user groups.

NRI Section of Macro User-defined groups (e.g., <10, 10-20, >20; nripoints=.1 .2):

%if &nripoints^=ALL %then %do;
%let numgroups=%eval(%words(&nripoints)+1); /*figure out how many ordinal groups*/
data nriprobs;
  set probs;
  /*define first ordinal group for pre and post*/
  if 0<=p_old<=%scan(&nripoints,1,' ') then group_pre=1;
  if 0<=p_new<=%scan(&nripoints,1,' ') then group_post=1;
  %let i=1;
  %do %until(&i>%eval(&numgroups-1));
    if %scan(&nripoints,&i,' ')<p_old then do; group_pre=&i+1; end;
    if %scan(&nripoints,&i,' ')<p_new then do; group_post=&i+1; end;
    %let i=%eval(&i+1);
  %end;
  if &y=0 then do;
    if group_post>group_pre then up_nonevent=1;
    if group_post<group_pre then down_nonevent=1;
  end;
  if &y=1 then do;
    if group_post>group_pre then up_event=1;
    if group_post<group_pre then down_event=1;
  end;
  if up_nonevent=. then up_nonevent=0;
  if down_nonevent=. then down_nonevent=0;
  if up_event=. then up_event=0;
  if down_event=. then down_event=0;
run;
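A hedged sketch (not the macro's exact code) of how the up/down flags could then be turned into the NRI estimate:

proc sql noprint;
  /* proportions moving up/down among events and among non-events */
  select sum(up_event)/sum(&y=1), sum(down_event)/sum(&y=1),
         sum(up_nonevent)/sum(&y=0), sum(down_nonevent)/sum(&y=0)
    into :p_up_ev, :p_down_ev, :p_up_ne, :p_down_ne
    from nriprobs;
quit;
%let nri=%sysevalf((&p_up_ev - &p_down_ev) + (&p_down_ne - &p_up_ne));
%put NRI = &nri;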

NRI Section of Macro Sample output from

%added_pred(data=,…..,nripoints=.1 .2, nriallcut=.03,…..);

Group | NRI | Std err | Z-value | P-value | 95% CI | % of events correctly reclassified | % of non-events correctly reclassified
ALL | .454 | .09 | 9.7 | <.0001 | (0.3649, 0.5447) | (10%) | 56%
CUT_.03 | .101 | .08 | 2.46 | .014 | (0.0205, 0.1817) | 4% | 6%
USER | .127 | .05 | 4.95 | | (0.0769, 0.177) | 5% | 8%

Vickers' Section of Macro The default is no decision-curve analysis. The macro can test multiple thresholds and uses bootstrapping to create 95% CIs. If thresholds are tested, run time will increase because of the bootstrapping.

Vickers Section of Macro The code makes 101 datasets, vicker&i, testing every integer threshold (0-100) to make the graph:

%if &vickersplot=TRUE %then %do;
%do i=0 %to 100;
data vicker&i;
  set probs(keep=&id p_new p_old &y);
  if p_new>&i/100 then predmodel=1; else predmodel=0;
  if p_old>&i/100 then predmodel_old=1; else predmodel_old=0;
run;
/* now use proc freq on vicker&i to obtain frequency counts (see the sketch below) */
%end;

Then set all the datasets together and delete the individual parts:

data vicker;
  set %do i=0 %to 100; vicker&i %end;;
run;
proc datasets library=work nolist;
  delete %do i=0 %to 100; vicker&i %end;;
quit;
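A hedged sketch of that per-threshold proc freq step, for one threshold &i inside the %do loop above (the macro's actual code and dataset names may differ):

proc freq data=vicker&i noprint;
  tables predmodel*&y     / out=counts_new;  /* new model: true/false positive counts at this threshold */
  tables predmodel_old*&y / out=counts_old;  /* old model: true/false positive counts at this threshold */
run;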

Vickers Section of Macro %added_pred(data=,…..,vickersplot=TRUE,…..)

Vickers Section - Threshold Analysis The macro can test multiple thresholds (e.g., 15% and 20%) using bootstrapping methods. If you know a priori that an individual whose risk is greater than some threshold (t) would be treated differently, you would want to compare the models at that point.

Vickers Section - Threshold Analysis Sample output from

%added_pred(data=, ….., vickersplot=TRUE, vickerpoints=.15);

Cut-point: 15
Mean (CI) for diff in net benefit (new − old): 0.0013 (-0.0026, 0.005)
Mean (CI) for net benefit of old model: 0.0295 (0.0229, 0.362)
Mean (CI) for diff in net benefit (old − treat all): 0.0852 (0.0783, 0.0922)
Mean (CI) for net benefit of new model: 0.0308 (0.0244, 0.0374)
Mean (CI) for diff in net benefit (new − treat all): 0.0866 (0.0799, 0.093)

Conclusions Don't rely only on the AUC and statistical significance to assess a marker's added predictive ability; use a combination of methods. Future work: extend the macro to time-to-event analysis.

Q’s or Comments If you want to use the macro or obtain literature contact me at: Email: kfk3388@gmail.com or kfkennedy@saint-lukes.org

References
1) Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008;27:157-72.
2) Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Medical Decision Making 2006;26(6):565-74.
3) Vickers AJ, Cronin AM, Elkin EB, Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Medical Informatics and Decision Making 2008;8(1):53.
4) Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation 2007;115:928-935.
5) Pepe MS, Janes H, Longton G, Leisenring W, Newcomb P. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. Am J Epidemiol 2004;159:882-890.
6) http://support.sas.com/kb/25/017.html

%words macro

%macro words(list);
  %local count;
  %let count=0;
  %do %while(%qscan(&list,&count+1,%str( )) ne %str());
    %let count=%eval(&count+1);
  %end;
  &count
%mend words;
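Illustrative usage (assumed, not shown in the original): %put %words(.1 .2); writes 2 to the log, which is how the NRI code above derives the number of user-defined cut points (and hence numgroups).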