The Receiver Operating Characteristic (ROC) Curve. EPP 245 Statistical Analysis of Laboratory Data.


November 30, 2006. EPP 245 Statistical Analysis of Laboratory Data.

Binary Classification

Suppose we have two groups, each case is a member of exactly one of them, and we know the correct classification (the “truth”). Suppose also that we have a prediction method that produces a single numerical value, where small values suggest membership in group 1 and large values suggest membership in group 2.

If we pick a cutpoint t, we can assign any case with a predicted value ≤ t to group 1 and the others to group 2. For that value of t, we can count the cases correctly assigned to group 2 (true positives) and the cases incorrectly assigned to group 2 (false positives). For t small enough, all cases will be assigned to group 2, and for t large enough, all will be assigned to group 1. The ROC curve plots the true positive rate against the false positive rate as t varies over its range.
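The cutpoint rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the lecture's own code: the function name `roc_points` and the toy scores are hypothetical, and group 2 is treated as the "positive" class.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per cutpoint.

    scores: predicted values; larger values suggest group 2.
    labels: true group membership, 1 or 2 (group 2 is the "positive" class).
    """
    n_pos = sum(1 for g in labels if g == 2)
    n_neg = len(labels) - n_pos
    # Start below every score (all cases assigned to group 2),
    # then use each distinct score as a cutpoint.
    cutpoints = [float("-inf")] + sorted(set(scores))
    points = []
    for t in cutpoints:
        # Cases with score <= t go to group 1; the rest go to group 2.
        tp = sum(1 for s, g in zip(scores, labels) if s > t and g == 2)
        fp = sum(1 for s, g in zip(scores, labels) if s > t and g == 1)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical toy data, not taken from the Juul data set.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [1, 1, 2, 2]
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The first point corresponds to a cutpoint below every score (everything assigned to group 2) and the last to a cutpoint at the largest score (everything assigned to group 1), matching the two extremes described above.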

Juul's IGF data

Description: The 'juul' data frame has 1339 rows and 6 columns. It contains a reference sample of the distribution of insulin-like growth factor (IGF-I), with one observation per subject at various ages; the bulk of the data was collected in connection with school physical examinations.

Variables:
age: a numeric vector (years).
menarche: a numeric vector. Has menarche occurred (code 1: no, 2: yes)?
sex: a numeric vector (1: boy, 2: girl).
igf1: a numeric vector. Insulin-like growth factor (µg/l).
tanner: a numeric vector. Codes 1-5: stages of puberty according to Tanner.
testvol: a numeric vector. Testicular volume (ml).

Predicting Menarche

Subset the Juul data to only females between 8 and 20 years old.
Predict menarche from age as a quantitative variable and Tanner score as a qualitative variable, using dummy variables.
Menarche is re-coded to be 0/1.
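The data preparation described here could look roughly like the following Python sketch. It is not the lecture's code: the function name `prepare`, the toy rows, the inclusive age bounds, and the choice to drop rows with missing values are all assumptions made for illustration. The output names `men1` and `tan2` to `tan5` follow the variable names used in the Stata model.

```python
def prepare(rows):
    """Subset to girls aged 8 to 20 and build the model variables."""
    out = []
    for r in rows:
        # Assumption: rows with a missing value in any needed field are dropped.
        if None in (r["age"], r["sex"], r["menarche"], r["tanner"]):
            continue
        # sex == 2 codes girls; the age bounds are assumed to be inclusive.
        if r["sex"] != 2 or not (8 <= r["age"] <= 20):
            continue
        rec = {"age": r["age"], "men1": r["menarche"] - 1}  # codes 1/2 -> 0/1
        # Dummy variables tan2..tan5; Tanner stage 1 is the reference level.
        for k in range(2, 6):
            rec[f"tan{k}"] = 1 if r["tanner"] == k else 0
        out.append(rec)
    return out


# Hypothetical rows, not taken from the Juul data set.
rows = [
    {"age": 12.5, "sex": 2, "menarche": 1, "tanner": 3},
    {"age": 15.0, "sex": 2, "menarche": 2, "tanner": 5},
    {"age": 14.0, "sex": 1, "menarche": None, "tanner": 4},  # boy: excluded
    {"age": 25.0, "sex": 2, "menarche": 2, "tanner": 5},     # outside age range
]
for rec in prepare(rows):
    print(rec)
```

Leaving out a dummy for Tanner stage 1 makes it the baseline against which the other stages are compared.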

. logistic men1 age tan2 tan3 tan4 tan5

Logistic regression
Number of obs = 519
LR chi2(5) =
Prob > chi2 =
Log likelihood =
Pseudo R2 =

men1 | Odds Ratio  Std. Err.  z  P>|z|  [95% Conf. Interval]
age |
tan2 |
tan3 |
tan4 |
tan5 |

. predict pmen
(option p assumed; Pr(men1))
. predict pmen1, xb
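The two `predict` calls store the same fit on two scales: `pmen` is the predicted probability and `pmen1` is the linear predictor (`xb`). In a logistic model the two are linked by the inverse logit, p = 1 / (1 + exp(-xb)). A minimal Python sketch of that relationship, using hypothetical linear predictor values rather than the fitted ones:

```python
import math


def inv_logit(xb):
    """Map a linear predictor (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))


# Hypothetical linear predictor values, not the fitted values from the model.
linear_predictors = [-3.0, -0.5, 0.0, 2.0]
probabilities = [inv_logit(xb) for xb in linear_predictors]

for xb, p in zip(linear_predictors, probabilities):
    print(f"xb={xb:+.1f}  p={p:.3f}")
```

Because the inverse logit is strictly increasing, both scales rank the cases identically, so a ROC curve built from either one is the same.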

. histogram pmen
. graph export pmenhist.wmf
. histogram pmen if men1==0, title("Pre-Menarche")
. graph export pmenhist0.wmf
. histogram pmen if men1==1, title("Post-Menarche")
. graph export pmenhist1.wmf
. histogram pmen1
. graph export pmen1hist.wmf
. hist pmen1 if men1==0, title("Pre-Menarche")
. graph export pmen1hist0.wmf
. hist pmen1 if men1==1, title("Post-Menarche")
. graph export pmen1hist1.wmf
. lroc

Logistic model for men1
number of observations = 519
area under ROC curve =

. graph export pmenroc.wmf
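The area under the ROC curve reported by `lroc` has a direct interpretation: it is the probability that a randomly chosen positive case receives a higher predicted value than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it that way. The function name `auc` and the toy data are hypothetical, and this pairwise count is an equivalent formulation chosen for clarity, not a description of how Stata computes the area internally.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted values (probabilities or linear predictors).
    outcomes: 0/1 indicators, coded like men1.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the Juul data set.
scores = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]
print(auc(scores, outcomes))
```

An area of 0.5 corresponds to a predictor no better than chance, and 1.0 to perfect separation of the two groups.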
