Sarah N. Mattson, Ph.D. Patrick Goh, B.S. Melody Sadler, Ph.D.

Presentation transcript:

Validation of ND-PAE Diagnostic Criteria and Suggestions for Improvement
Sarah N. Mattson, Ph.D.; Patrick Goh, B.S.; Melody Sadler, Ph.D.; Edward Riley, Ph.D.
Supported by NIAAA grant U01 AA014834 and a supplement to U01 AA014830

Aims of Current Study
- Aim 1: Examine validity of current ND-PAE criteria in comparison to typical controls (AE v. CON)
  - Test 1.0 SD vs. 1.5 SD cut-offs for impairment
- Aim 2: Examine validity of current ND-PAE criteria in comparison to an expanded control group (AE v. expCON)
  - Compare children with prenatal alcohol exposure to a heterogeneous “expanded” control group that includes typically developing children, children with ADHD, and children with lower IQ scores
- Aim 3: Using findings of Aims 1 & 2, propose potential areas of improvement in ND-PAE criteria

Subject Groups
- AE Group: prenatal alcohol exposure (PAE) with or without FAS
- CON Group: non-exposed, typically developing controls (IQ > 88, no ADHD)
- expCON Group: non-exposed, including:
  - CON Group (IQ > 88, no ADHD)
  - ADHD Contrast Group (with ADHD)
  - Low IQ Contrast Group (IQ scores 54-88)

Subjects

| Variable | AE | CON | expCON |
|---|---|---|---|
| N | 151 | 143 | 268 |
| Age [M (SD)] | 12.6 (2.5) | 12.5 (2.6) | 12.1 (2.6) |
| IQ [M (SD)] | 83.6 (16.7) | 109.4 (11.6) | 99.5 (17.9) |
| Sex (Female) | 64 (42.4%) | 64 (44.8%) | 101 (37.7%) |
| Race (White) | 80 (53.0%) | 99 (69.2%) | 173 (64.6%) |
| FAS [n (%)] | 41 (27.2%) | -- | -- |
| Ethnicity (Hispanic/Latino) | 20 (13.2%) | 33 (23.1%) | 60 (22.4%) |
| ADHD [n (%)] | 97 (64.2%) | 0 (0.0%) | 95 (35.4%) |
| IQ < 85 [n (%)] | 78 (51.7%) | 0 (0%) | |
| Site: San Diego | 58 (38.4%) | 60 (42%) | 109 (40.7%) |
| Site: Atlanta | 31 (20.5%) | 19 (13.3%) | 53 (19.8%) |
| Site: Los Angeles | 28 (18.5%) | 18 (12.6%) | 21 (7.8%) |
| Site: Northern Plains | 22 (14.6%) | 20 (14.0%) | 38 (14.2%) |
| Site: New Mexico | 12 (7.9%) | 26 (18.2%) | 47 (17.5%) |

Add: N/% ADHD. Is CON the squeaky clean CON or CON + low IQ? CON rate of IQ < 85 = 29 (20.3%).

Method
- Data were selected from over 1200 variables to best represent the three domains of the ND-PAE criteria:
  - Neurocognitive (NI)
  - Self-Regulation (SR)
  - Adaptive Functioning (AF)
- Symptoms/variables within each domain were characterized in binary format:
  - Endorsed (present)
  - Not endorsed (absent)
- Two cut-offs were examined: 1.0 SD and 1.5 SD
- Cut-off scores were used for the binary symptom indicators
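As a rough illustration of the binary coding described in the Method slide, the Python sketch below marks a score as endorsed when it falls a given number of standard deviations on the impaired side of the normative mean. The function name `is_impaired`, the `higher_is_worse` flag, the inclusive boundary, and the example scores are all assumptions for illustration; the presentation does not specify how each instrument was scored.

```python
def is_impaired(score, mean, sd, cutoff_sd, higher_is_worse=False):
    """Return True if a score counts as 'endorsed' at a cut-off given in SD units.

    Assumption: a score exactly at the cut-off counts as impaired.
    """
    if higher_is_worse:
        # Scales on which elevated scores indicate more problems
        return score >= mean + cutoff_sd * sd
    # Scales on which low scores indicate impairment
    return score <= mean - cutoff_sd * sd


# Hypothetical score on a scale with mean 100 and SD 15 (lower = worse)
print(is_impaired(84, 100, 15, 1.0))  # True: 84 is at or below 85
print(is_impaired(84, 100, 15, 1.5))  # False: 84 is above 77.5

# Hypothetical score on a scale with mean 50 and SD 10 (higher = worse)
print(is_impaired(62, 50, 10, 1.0, higher_is_worse=True))  # True: 62 >= 60
print(is_impaired(62, 50, 10, 1.5, higher_is_worse=True))  # False: 62 < 65
```

The same score can be endorsed at 1.0 SD and not at 1.5 SD, which is why the stricter cut-off yields lower rates of impairment.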

ND-PAE
- Neurocognitive Impairment
  - Global Intellectual Impairment (WISC)
  - Executive Functioning Impairment (D-KEFS; BRIEF)
  - Impairment in Learning (CANTAB)
  - Impairment in Memory (CANTAB; WISC)
  - Impairment in Visual Spatial Reasoning (WISC)
- Self-Regulation
  - Impairment in Behavioral Regulation (CBCL; BRIEF; VABS)
  - Attention Deficit (CBCL)
  - Impairment in Impulse Control (CBCL; DBD)
- Adaptive Function
  - Communication Deficit (VABS)
  - Social Impairment (VABS; CBCL)
  - Daily Living Impairment (VABS; CBCL)
  - Motor Impairment (VABS)

Rates of Impairment (AE Only)

| Domain | 1.0 SD | 1.5 SD |
|---|---|---|
| Neurocognitive Impairment | 89% | 79% |
| Self-Regulation | 92% | 85% |
| Adaptive Function | 68% | 47% |
| ND-PAE | 62% | 42% |

Rev. Adaptive Function: 75%
Rev. ND-PAE:
To confirm Julie’s findings.

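Rates of Impairment

| Domain | AE (1.0 SD) | CON (1.0 SD) | expCON (1.0 SD) | AE (1.5 SD) | CON (1.5 SD) | expCON (1.5 SD) |
|---|---|---|---|---|---|---|
| NI | 89% | 43% | 67% | 79% | 27% | 51% |
| SR | 92% | 25% | 52% | 85% | 9% | 39% |
| AF | 68% | 8% | 28% | 47% | 1% | 13% |
| ND-PAE | 62% | 4% | 24% | 42% | 0% | 11% |

rAF: 36%, 58%, 75%, 16%, 38%
rND-PAE: 10%, 37%, 20%
With Pat’s edits – squeaky clean controls.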

Aim 1
- Examine validity of current ND-PAE criteria in comparison to typical controls (AE v. CON)
- Compare 1 v. 1.5 standard deviation cut-offs
- Sensitivity, Specificity, PPV, NPV
- Receiver Operating Characteristic (ROC) curves

1 v. 1.5 SD – rationale? (Appendix 1)
ROC curves will be used to increase the validity of the ND-PAE diagnosis using the proposed contrast groups. An ROC curve is a graphical plot of sensitivity against 1 − specificity for a binary classification of the ND-PAE diagnosis, relative to chance. Each domain and the overall ND-PAE diagnosis were assessed.

Measures of Accuracy

Sensitivity = True Positive / (True Positive + False Negative)
Specificity = True Negative / (True Negative + False Positive)
PPV = True Positive / (True Positive + False Positive)
NPV = True Negative / (True Negative + False Negative)

| | Condition Positive | Condition Negative | |
|---|---|---|---|
| Test Outcome Positive | True Positive | False Positive | PPV |
| Test Outcome Negative | False Negative | True Negative | NPV |
| | Sensitivity | Specificity | |

- Sensitivity: probability that subjects with alcohol exposure are identified as alcohol-exposed (yes = yes)
- Specificity: probability that non-exposed subjects are not identified as alcohol-exposed (no = no)
- Positive Predictive Value (PPV): probability that subjects given a positive diagnosis are truly alcohol-exposed
- Negative Predictive Value (NPV): probability that subjects given a negative diagnosis are truly non-exposed
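The four measures can be computed directly from the cells of the 2×2 table. The following Python sketch is an illustration, not the authors' analysis code; the function name `accuracy_measures` is hypothetical, and the cell counts are approximations back-calculated from figures reported elsewhere in this presentation (151 AE and 143 CON subjects, with ND-PAE criteria met by 62% of AE and 4% of CON at 1.0 SD), rounded to whole subjects.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV, and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Approximate counts (assumption): rounded from the reported percentages
n_ae, n_con = 151, 143
tp = round(0.62 * n_ae)   # AE subjects meeting ND-PAE criteria
fp = round(0.04 * n_con)  # CON subjects meeting ND-PAE criteria
fn = n_ae - tp            # AE subjects not meeting criteria
tn = n_con - fp           # CON subjects not meeting criteria

measures = accuracy_measures(tp, fp, fn, tn)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Because the counts are rounded reconstructions, the output should be read as approximate; small differences from reported values are expected when only percentages are available.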

Accuracy of Diagnosis: AE vs. CON
↑ false (-)
1.5 SD too conservative; negatively impacted sensitivity across all comparisons.
All following presented findings were also examined at 1.5 SD; they followed the same trend, but sensitivity was lower across all analyses.
Edited from SM’s excel.

Receiver Operating Characteristic (ROC) Curves
- Illustrates ability of a binary measure to differentiate groups
- Area Under Curve (AUC): [0-1]
  - .9-1 = Excellent
  - .8-.9 = Good
  - .7-.8 = Fair
  - <.7 = Poor
- Compare with “line of no discrimination” (chance)
- Axes: x = 1 − specificity, y = sensitivity

Receiver operating characteristic curves give a visual representation of the ability of a binary measure (Yes/No) to differentiate groups. There are several measures of accuracy that ROC curves provide, but an important one is the “Area Under the Curve” (laser point the area), which gives a numerical value to represent the discriminatory power of a measure. This value ranges between 0 and 1, with higher numbers indicating better accuracy. These curves can additionally be statistically compared with the “line of no discrimination,” which identifies groups by chance. So ideally you want a curve’s “apex” to be as close to (0,1) as possible to maximize accuracy compared to chance (maximize the difference between your curve and chance).
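For a binary (yes/no) measure, the ROC curve has a single operating point between (0, 0) and (1, 1), so the area under it can be worked out with the trapezoid rule. The Python sketch below illustrates that idea under this single-point assumption; it is not necessarily how the AUC values in this presentation were produced. The names `binary_auc` and `auc_label` are hypothetical, and the handling of values that fall exactly on a band boundary (for example .80) is an assumption.

```python
def binary_auc(sensitivity, specificity):
    """Trapezoidal area under the ROC points (0,0), (1 - spec, sens), (1,1)."""
    fpr = 1.0 - specificity
    left = 0.5 * fpr * sensitivity                    # triangle from (0,0) to the point
    right = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)   # trapezoid from the point to (1,1)
    return left + right  # algebraically equal to (sensitivity + specificity) / 2


def auc_label(auc):
    """Map an AUC value to the descriptive bands listed on the slide."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "poor"


# Sensitivity .62 and specificity .96 are the AE v. CON values at 1.0 SD
# reported in the summary slide of this presentation.
auc = binary_auc(0.62, 0.96)
print(f"{auc:.3f}", auc_label(auc))

# A measure on the line of no discrimination has an area of one half.
print(f"{binary_auc(0.5, 0.5):.3f}", auc_label(binary_auc(0.5, 0.5)))
```

The closer the single point sits to (0, 1), the larger both the triangle and the trapezoid become, which is the geometric reason the “apex” should be high and to the left.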

Accuracy of ND-PAE (AE v. CON @ 1 SD)
ND-PAE ‘AUC’ = .790 (fair)
All curves performed significantly better than chance (p < .001).
NI = .730, SR = .834, AF = .799, ND-PAE = .790 (fair; could be better)
Note the overlap between ND-PAE and AF, as Julie indicated.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

Accuracy of ND-PAE (AE v. CON @ 1 SD)
ND-PAE ‘AUC’ = .790 (fair)
Notice: the AF curve seems very similar to the ND-PAE curve and has a lower sensitivity than the other curves; we will come back to this later.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

Accuracy of ND-PAE (AE v. CON @ 1.5 SD)
ND-PAE ‘AUC’ = .712 (fair)
All curves performed significantly better than chance (p < .001).
NI = .761, SR = .882, AF = .728, ND-PAE = .712 (fair; could be better)
Note: NI and SR are better than ND-PAE/AF.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

ND-PAE: 1 SD v. 1.5 SD (AE v. CON)
1 SD ND-PAE ‘AUC’ = .790
Comparison of 1 vs. 1.5 SD, ND-PAE original criteria: both are “fair,” although 1 SD is somewhat better at discriminating. Likely not significantly different when comparing AE to typical controls.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

Aim 2
- Examine validity of current ND-PAE criteria in comparison to an expanded control group (AE v. expCON)
- 1 v. 1.5 standard deviation cut-offs
- Sensitivity, Specificity, PPV, NPV
- Receiver Operating Characteristic (ROC) curves

Now we want to do the same thing examining 1.5 SD cutoffs.
1 v. 1.5 SD – rationale? (Appendix 1)
ROC curves will be used to increase the validity of the ND-PAE diagnosis using the proposed contrast groups. An ROC curve is a graphical plot of sensitivity against 1 − specificity for a binary classification of the ND-PAE diagnosis, relative to chance. Each domain and the overall ND-PAE diagnosis were assessed.

ND-PAE (1 SD v. 1.5 SD): AE v. expCON comparison
↑ false (-)
1.5 SD too conservative; negatively impacted sensitivity across all comparisons.
All following presented findings were also examined at 1.5 SD; they followed the same trend, but sensitivity was lower across all analyses.
Edited with SM’s excel.

Accuracy of ND-PAE (AE v. expCON @ 1 SD)
ND-PAE ‘AUC’ = .694 (poor)
All curves performed significantly better than chance (p < .001), but self-regulation is the best discriminator here.
But note the areas under the curve (.7-.8 = fair, .8-.9 = good, .9-1 = excellent discriminatory ability); by this logic, you want the apex to be left and high.
NI = .613, SR = .701, AF = .696, ND-PAE = .694 (poor; could be better)
**Remember to note that the x-axis is 1 − specificity.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

Accuracy of ND-PAE (AE v. expCON @ 1.5 SD)
ND-PAE ‘AUC’ = .658 (poor)
ROC curves plot the true positive rate against the false positive rate at various threshold settings.
- Illustrates ability of a binary classifier system
- Compares the space under the curve to the space provided by a “line of no discrimination,” which would provide sensitivity and specificity based on chance (.5). Therefore, the ideal curve would have an “apex” at (0,1).
All curves performed significantly better than chance (p < .001).
But note the areas under the curve (.7-.8 = fair, .8-.9 = good, .9-1 = excellent discriminatory ability); by this logic, you want the apex to be left and high.
NI = .681, SR = .827, AF = .783, ND-PAE = .782 (poor; could be better) ????
**Remember to note that the x-axis is 1 − specificity.
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

ND-PAE: 1 SD v. 1.5 SD (AE v. expCON)
1 SD ND-PAE ‘AUC’ = .694
1.5 SD ND-PAE ‘AUC’ = .658
Comparison of 1 vs. 1.5 SD, ND-PAE original criteria: both 1.0 and 1.5 are “poor.”
**AF Criteria: 2 of 4 symptoms, with 1 being either communication OR social impairments

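Aim 1 & 2 Summary

| Measure | AE vs. CON, 1.0 SD | AE vs. CON, 1.5 SD | AE vs. expCON, 1.0 SD | AE vs. expCON, 1.5 SD |
|---|---|---|---|---|
| Sensitivity | .62 | .47 | .62 | .47 |
| Specificity | .96 | .99 | .76 | .87 |
| Positive Predictive Value | .94 | .97 | .60 | .68 |
| Negative Predictive Value | .71 | .64 | .78 | .75 |
| AUC (ND-PAE) | .790 | .712 | .694 | .658 |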

Aim 1 & 2 Conclusions
- Examined validity of current ND-PAE criteria via comparisons with typical controls and an expanded control group
- Both SD cutoffs yielded moderate accuracy rates
- ROC curves indicated some benefit at 1.0 SD, although less so when compared to expCON
- There is definitely room for improvement
- ND-PAE dx appears to be driven by the Adaptive Function domain, which is more conservative than the other domains
  - Requires more criteria, has lower sensitivity

Based on the findings of Aim 1, we decided to try relaxing the AF criteria in an effort to improve their sensitivity. In addition, we hoped that the higher sensitivity would “move” the ROC curve of AF “closer” to the other domains, allowing them to have a larger influence on the ND-PAE diagnosis.

Aim 3
- Using findings of Aim 1 & 2, propose potential areas of improvement in ND-PAE criteria
- AE v. expCON
- Receiver Operating Characteristic (ROC) curves
- Sensitivity, Specificity, PPV, NPV
- Potential of relaxing AF criteria?
  - 2/4 AF criteria (original) vs. 1 AF criterion (revised)

Based on
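The difference between the original and revised Adaptive Function rules can be written as two small predicates. This Python sketch follows the wording used in this presentation (original: 2 of 4 symptoms, with 1 being either communication or social impairment; revised: 1 of 4 symptoms). The function names and the example child are hypothetical, and the sketch covers only the AF domain, not the full ND-PAE decision.

```python
def af_original(communication, social, daily_living, motor):
    """Original rule: at least 2 of 4 symptoms, one being communication or social."""
    endorsed = sum([communication, social, daily_living, motor])
    return bool(endorsed >= 2 and (communication or social))


def af_revised(communication, social, daily_living, motor):
    """Revised (relaxed) rule: at least 1 of 4 symptoms."""
    return sum([communication, social, daily_living, motor]) >= 1


# Hypothetical child with daily living and motor impairment only
child = dict(communication=False, social=False, daily_living=True, motor=True)
print(af_original(**child))  # False: two symptoms, but neither communication nor social
print(af_revised(**child))   # True: at least one symptom is endorsed
```

Every child who meets the original rule also meets the revised one, so relaxing the rule can only add positive classifications; that is the mechanism by which sensitivity rises and specificity falls.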

Dx-Domain Correlations (AE v. expCON)

| Domain (1.0 SD) | AF Original | AF Revised |
|---|---|---|
| Neurocognitive Imp | .44 | .68 |
| Self Regulation | .38 | .58 |
| Adaptive Function | .89 | .70 |

| Domain (1.5 SD) | AF Original | AF Revised |
|---|---|---|
| Neurocognitive Imp | .44 | .67 |
| Self Regulation | .35 | .53 |
| Adaptive Function | .91 | .75 |

The correlation between ND-PAE and AF is very high, so the ND-PAE diagnosis is being driven by the lower sensitivity of AF, while the higher sensitivities of NI and SR are being overpowered. More restrictive criteria lead to lower sensitivity, so relaxing the AF criteria was a logical step. Allowing criteria to be more lax raises sensitivity, as it allows more children to qualify for a diagnosis (at a cost of lower specificity), so perhaps by relaxing the AF criteria we would not only raise ND-PAE sensitivity but also allow the other domains to influence the ND-PAE criteria at a higher level.

Adaptive Function Revised (1 SD) (AE v. expCON)
↓ false (-), ↑ false (+)
1.5 SD too conservative; negatively impacted sensitivity across all comparisons.
All following presented findings were also examined at 1.5 SD; they followed the same trend, but sensitivity was lower across all analyses.
Checked against SM’s excel.

Adaptive Function Revised (1 SD) (AE v. expCON)
1 of 4 symptoms (less restrictive)
ND-PAE ‘AUC’ = .831 (good)
- Relaxed AF criteria line has higher sensitivity, fits better with other domains
- Doesn’t fall exactly in line with ND-PAE diagnosis; lower correlation allows other domains to have influence on diagnosis
- AUC is higher; ND-PAE is a better measure overall of PAE
- Compared to line of no discrimination, p < .001
Areas under the curve (.7-.8 = fair, .8-.9 = good, .9-1 = excellent discriminatory ability); by this logic, you want the apex to be left and high.
NI = .681, SR = .827, AF = .736, ND-PAE = .831 (good)

Adaptive Function Revised (1.5 SD) (AE v. expCON)
1.5 SD too conservative; negatively impacted sensitivity across all comparisons.
All following presented findings were also examined at 1.5 SD; they followed the same trend, but sensitivity was lower across all analyses.
Corrected using SM’s excel.

Adaptive Function Revised (1.5 SD) (AE v. expCON)
1 of 4 symptoms (less restrictive)
ND-PAE ‘AUC’ = .711 (fair)
- Relaxed AF criteria line has higher sensitivity, fits better with other domains
- Doesn’t fall exactly in line with ND-PAE diagnosis; lower correlation allows other domains to have influence on diagnosis
- AUC is higher; ND-PAE is a better measure overall of PAE
- Compared to line of no discrimination, p < .001
Areas under the curve (.7-.8 = fair, .8-.9 = good, .9-1 = excellent discriminatory ability); by this logic, you want the apex to be left and high.
NI = .640, SR = .731, AF = .682, ND-PAE = .711 (fair)

Original vs. Revised ND-PAE Criteria (1 SD) (AE v. expCON)
Original ND-PAE ‘AUC’ = .782 (fair)
Revised ND-PAE ‘AUC’ = .831 (good)
Comparison of Original vs. Revised ND-PAE Criteria

Original vs. Revised ND-PAE Criteria (1.5 SD) (AE v. expCON)
Original ND-PAE ‘AUC’ = .658 (poor)
Revised ND-PAE ‘AUC’ = .711 (fair)
Comparison of Original vs. Revised ND-PAE Criteria
Numbers changed.

Summary of Proposed AF Revision (AE v. expCON)

| Measure (1.0 SD) | AF Original | AF Revised |
|---|---|---|
| Sensitivity | .62 | .79 |
| Specificity | .76 | .63 |
| Positive Predictive Value | .60 | .55 |
| Negative Predictive Value | .78 | .84 |
| AUC (ND-PAE) | .782 | .831 |

| Measure (1.5 SD) | AF Original | AF Revised |
|---|---|---|
| Sensitivity | .47 | .62 |
| Specificity | .87 | .80 |
| Positive Predictive Value | .68 | .64 |
| Negative Predictive Value | .75 | .79 |
| AUC (ND-PAE) | .658 | .711 |

1.5 SD followed the same trend but with less overall accuracy.
With the revised criteria at 1.0 SD, the share of positive diagnoses that are false positives (1 − PPV) goes up from 40% to 45%, but the share of negative diagnoses that are false negatives (1 − NPV) decreases from 22% to 16%.
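The false positive and false negative percentages quoted for the 1.0 SD cutoff match the complements of the PPV and NPV values listed for that cutoff. Reading them this way is an inference from the numbers rather than something the slide spells out, and the helper name `error_share` is hypothetical. A short Python check:

```python
def error_share(predictive_value):
    """Percentage of diagnoses of one kind that are wrong: 100 * (1 - PPV or NPV)."""
    return round((1 - predictive_value) * 100)


# PPV and NPV at 1.0 SD as listed for the original and revised AF criteria
ppv = {"original": 0.60, "revised": 0.55}
npv = {"original": 0.78, "revised": 0.84}

for version in ("original", "revised"):
    print(version,
          "false positives:", error_share(ppv[version]), "%",
          "false negatives:", error_share(npv[version]), "%")
```

These are shares of positive and negative diagnoses, which is a different quantity from 1 − specificity (the x-axis of the ROC curve) or 1 − sensitivity.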

Aim 3 Conclusions
- Use findings of Aim 1 & 2 to improve ND-PAE accuracy
- Revised AF criteria are less correlated with ND-PAE diagnosis, allowing for influence of other domains
- Relaxing AF criteria improves sensitivity and lowers specificity, but ROC curves indicate strength of 1.0 SD (good) over 1.5 SD (fair)

General Conclusions
- Results indicate that the ND-PAE criteria are useful in identifying affected children
  - Results are limited to children with known exposure history at this point
- Criteria had good internal validity, especially after modifying AF criteria
- Using a 1.0 SD cutoff and the revised AF criteria yielded the strongest model (AUC = .831), although specificity was higher when using the 1.5 SD cutoff
- Additional modifications to the criteria (e.g., changing general cognitive cutoff to 1.5 SD?) should be tested

Future Directions
- Assess the validity of ND-PAE criteria using other comparisons and other samples with known exposure histories and other clinical conditions
- Consider revisions in other areas of ND-PAE to potentially increase accuracy
- Integration with other assessment domains may be even more successful
- Goal: Test the model in samples with unknown exposure histories

Rates of Impairment (AE Only)

| Domain | 1.0 SD | 1.5 SD | 2.0 SD |
|---|---|---|---|
| Neurocognitive Impairment | 89% | 79% | 72% |
| Self-Regulation | 92% | 85% | 80% |
| Adaptive Function | 68% | 47% | 16% |
| ND-PAE | 62% | 42% | 15% |

Rev. Adaptive Function: 75%
Rev. ND-PAE: 48%
To confirm Julie’s findings.