The dangers of an immediate use of model based methods The chronic bronchitis study: bronc: 0= no 1=yes poll: pollution level cig: cigarettes smokes per.

Slides:



Advertisements
Similar presentations
Dummy Variables and Interactions. Dummy Variables What is the the relationship between the % of non-Swiss residents (IV) and discretionary social spending.
Advertisements

Brief introduction on Logistic Regression
EC220 - Introduction to econometrics (chapter 10)
Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004
What is Interaction for A Binary Outcome? Chun Li Department of Biostatistics Center for Human Genetics Research September 19, 2007.
Lecture 16: Logistic Regression: Goodness of Fit Information Criteria ROC analysis BMTRY 701 Biostatistical Methods II.
SC968: Panel Data Methods for Sociologists Random coefficients models.
Introduction to Logistic Regression In Stata Maria T. Kaylen, Ph.D. Indiana Statistical Consulting Center WIM Spring 2014 April 11, 2014, 3:00-4:30pm.
Matched designs Need Matched analysis. Incorrect unmatched analysis. cc cc exp,exact Proportion | Exposed Unexposed | Total Exposed
1 BINARY CHOICE MODELS: LOGIT ANALYSIS The linear probability model may make the nonsense predictions that an event will occur with probability greater.
1 Nonlinear Regression Functions (SW Chapter 8). 2 The TestScore – STR relation looks linear (maybe)…
1 BINARY CHOICE MODELS: PROBIT ANALYSIS In the case of probit analysis, the sigmoid function F(Z) giving the probability is the cumulative standardized.
Multinomial Logit Sociology 8811 Lecture 11 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission.
Lecture 17: Regression for Case-control Studies BMTRY 701 Biostatistical Methods II.
University of North Carolina at Chapel Hill
1 Logistic Regression EPP 245 Statistical Analysis of Laboratory Data.
In previous lecture, we highlighted 3 shortcomings of the LPM. The most serious one is the unboundedness problem, i.e., the LPM may make the nonsense predictions.
Modelling risk ratios and risk differences …this is *new* methodology…
Sociology 601 Class 28: December 8, 2009 Homework 10 Review –polynomials –interaction effects Logistic regressions –log odds as outcome –compared to linear.
BIOST 536 Lecture 3 1 Lecture 3 – Overview of study designs Prospective/retrospective  Prospective cohort study: Subjects followed; data collection in.
Introduction to Regression Analysis Straight lines, fitted values, residual values, sums of squares, relation to the analysis of variance.
1 Michigan.do. 2. * construct new variables;. gen mi=state==26;. * michigan dummy;. gen hike=month>=33;. * treatment period dummy;. gen treatment=hike*mi;
In previous lecture, we dealt with the unboundedness problem of LPM using the logit model. In this lecture, we will consider another alternative, i.e.
Interpreting Bi-variate OLS Regression
Sociology 601 Class 26: December 1, 2009 (partial) Review –curvilinear regression results –cubic polynomial Interaction effects –example: earnings on married.
1 Multivariate Analysis and Discrimination EPP 245 Statistical Analysis of Laboratory Data.
BINARY CHOICE MODELS: LOGIT ANALYSIS
Logistic Regression In logistic regression the outcome variable is binary, and the purpose of the analysis is to assess the effects of multiple explanatory.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 10) Slideshow: binary choice logit models Original citation: Dougherty, C. (2012) EC220.
Logistic Regression Sociology 229: Advanced Regression
Methods Workshop (3/10/07) Topic: Event Count Models.
1 The Receiver Operating Characteristic (ROC) Curve EPP 245 Statistical Analysis of Laboratory Data.
1 BINARY CHOICE MODELS: PROBIT ANALYSIS In the case of probit analysis, the sigmoid function is the cumulative standardized normal distribution.
Logistic Regression 2 Sociology 8811 Lecture 7 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission.
Basic epidemiologic analysis with Stata Biostatistics 212 Lecture 5.
How do Lawyers Set fees?. Learning Objectives 1.Model i.e. “Story” or question 2.Multiple regression review 3.Omitted variables (our first failure of.
Multinomial Logit Sociology 8811 Lecture 10
Christopher Dougherty EC220 - Introduction to econometrics (chapter 1) Slideshow: exercise 1.5 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Basic Biostatistics Prof Paul Rheeder Division of Clinical Epidemiology.
Bandit Thinkhamrop, PhD. (Statistics) Department of Biostatistics and Demography Faculty of Public Health Khon Kaen University, THAILAND.
Count Models 1 Sociology 8811 Lecture 12
AN INTRODUCTION TO LOGISTIC REGRESSION ENI SUMARMININGSIH, SSI, MM PROGRAM STUDI STATISTIKA JURUSAN MATEMATIKA UNIVERSITAS BRAWIJAYA.
1 היחידה לייעוץ סטטיסטי אוניברסיטת חיפה פרופ’ בנימין רייזר פרופ’ דוד פרג’י גב’ אפרת ישכיל.
Linear vs. Logistic Regression Log has a slightly better ability to represent the data Dichotomous Prefer Don’t Prefer Linear vs. Logistic Regression.
Lecture 3 Linear random intercept models. Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The.
Lecture 18 Ordinal and Polytomous Logistic Regression BMTRY 701 Biostatistical Methods II.
Logistic Regression. Linear Regression Purchases vs. Income.
LOGISTIC REGRESSION Binary dependent variable (pass-fail) Odds ratio: p/(1-p) eg. 1/9 means 1 time in 10 pass, 9 times fail Log-odds ratio: y = ln[p/(1-p)]
STAT E100 Section Week 12- Regression. Course Review - Project due Dec 17 th, your TA. - Exam 2 make-up is Dec 5 th, practice tests have been updated.
Analysis of Experimental Data IV Christoph Engel.
Dates Presentations Wed / Fri Ex. 4, logistic regression, Monday Dec 7 th Final Tues. Dec 8 th, 3:30.
Conditional Logistic Regression Epidemiology/Biostats VHM812/802 Winter 2016, Atlantic Veterinary College, PEI Raju Gautam.
Exact Logistic Regression
Logistic Regression 2 Sociology 8811 Lecture 7 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission.
1 Ordinal Models. 2 Estimating gender-specific LLCA with repeated ordinal data Examining the effect of time invariant covariates on class membership The.
Birthweight (gms) BPDNProp Total BPD (Bronchopulmonary Dysplasia) by birth weight Proportion.
1 BINARY CHOICE MODELS: LOGIT ANALYSIS The linear probability model may make the nonsense predictions that an event will occur with probability greater.
1 COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS When alternative specifications of a regression model have the same dependent variable, R 2 can be used.
Bandit Thinkhamrop, PhD. (Statistics) Department of Biostatistics and Demography Faculty of Public Health Khon Kaen University, THAILAND.
Logistic Regression Jeff Witmer 30 March Categorical Response Variables Examples: Whether or not a person smokes Success of a medical treatment.
Discussion: Week 4 Phillip Keung.
Lecture 18 Matched Case Control Studies
Introduction to Logistic Regression
Problems with infinite solutions in logistic regression
Logistic Regression 4 Sociology 8811 Lecture 9
CMGPD-LN Methodological Lecture Day 4
Count Models 2 Sociology 8811 Lecture 13
Eva Ørnbøl + Morten Frydenberg
Common Statistical Analyses Theory behind them
Presentation transcript:

The dangers of an immediate use of model based methods The chronic bronchitis study: bronc: 0= no 1=yes poll: pollution level cig: cigarettes smokes per day

The original analysis:. logistic bronc cig poll Logistic regression Number of obs = 212 LR chi2(2) = Prob > chi2 = Log likelihood = Pseudo R2 = bronc | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] cig | poll | Looks good? They focussed on the p-values but… What of the units of poll and cig? Interpretation? Other models considered? Goodness of fit… better look at the residuals

Any sign of trouble?. predict res,rstandard. list cig poll if res>=2 | res<= | cig poll | | | 42. | | 53. | | 55. | | 72. | | 98. | | | | 124. | | 129. | | 146. | | 163. | | 169. | | | | 175. | | 192. | |

…make the model more complicated?. gen cig2=cig*cig. logit bronc cig cig2 poll Logit estimates Number of obs = 212 LR chi2(3) = Prob > chi2 = Log likelihood = Pseudo R2 = bronc | Coef. Std. Err. z P>|z| [95% Conf. Interval] cig | cig2 | poll | _cons | Where is the maximum log of odds of bronc?. display /(2* )

…ack! Try cubic? Interactions? ….or better look at this data. table fcig fpol bronc | Bronchitis and Pollution Group Smoking | no Group | zero | | | | | >8 | | Bronchitis and Pollution Group Smoking | yes Group | zero | | 1-3 | 3-5 | | >8 |

Now what? Completely rethink this mess… one try?. xi: logistic bronc i.fcig poll i.fcig _Ifcig_1-6 (naturally coded; _Ifcig_1 omitted) note: _Ifcig_2 != 0 predicts failure perfectly _Ifcig_2 dropped and 43 obs not used note: _Ifcig_3 != 0 predicts failure perfectly _Ifcig_3 dropped and 35 obs not used Logistic regression Number of obs = 134 LR chi2(4) = Prob > chi2 = Log likelihood = Pseudo R2 = bronc | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] _Ifcig_4 | _Ifcig_5 | _Ifcig_6 | poll |

…but look at the *note*. lstat Logistic model for bronc True Classified | D ~D | Total | | 40 - | | Total | | 134 Classified + if predicted Pr(D) >=.5 True D defined as bronc != Sensitivity Pr( +| D) 47.83% Specificity Pr( -|~D) 79.55% Positive predictive value Pr( D| +) 55.00% Negative predictive value Pr(~D| -) 74.47% False + rate for true ~D Pr( +|~D) 20.45% False - rate for true D Pr( -| D) 52.17% False + rate for classified + Pr(~D| +) 45.00% False - rate for classified - Pr( D| -) 25.53% Correctly classified 68.66%

…next step? ….maybe a classical analysis is in order… Are there other variables measured? reassess the entire project?