EHS 655 Lecture 14: Linear and logistic regression, task-based assessment

Presentation transcript:

EHS 655 Lecture 14: Linear and logistic regression, task-based assessment

What we’ll cover today
- Robust linear regression with R²
- Prediction with linear regression
- Logistic regression modeling

Linear regression: robust regression with R²
- Download rregfit: http://stats.idre.ucla.edu/stata/faq/how-can-i-get-an-r2-with-robust-regression-rreg/
- Stata:
    rreg depvar indepvar
    rregfit

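Regression - predict
- Runs a regression model and saves the predicted value for each individual
- I.e., can use regression to predict TWA based on job title, gender, perceived noise, etc.
- Linear regression
  Stata:
    rreg depvar indepvar1
    predict fittedvar
  Note: rreg does not accept a cluster() option; for cluster-robust standard errors use regress depvar indepvar1, vce(cluster idvar)
- Logistic regression
  Stata:
    logistic depvar indepvar1, cluster(idvar)
    predict fittedvar
  After logistic, predict saves the predicted probability by default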
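
As a rough illustration of what predict does after a linear fit, the sketch below fits an ordinary least-squares line in Python and stores one fitted value per individual. The data (perceived noise scores and TWA values) are invented for the example, and plain least squares stands in for Stata's rreg, which additionally down-weights outliers.

```python
# Hypothetical data: perceived noise score (1-5) and measured TWA (dBA).
perceived_noise = [1, 2, 3, 4, 5]
twa = [78.0, 82.0, 83.0, 88.0, 89.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(perceived_noise, twa)

# Analogue of "predict fittedvar": one fitted value per individual.
fitted = [intercept + slope * xi for xi in perceived_noise]
```

The fitted list lines up row by row with the input data, which is how the saved variable behaves in Stata as well.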

EXPOSURE MODELING
- Logistic regression: will use this in our analyses
- Task-based estimation: may use this in our analysis

Binary logistic regression
- Consider a binary categorical (not continuous) outcome
- Want to know the probability of the outcome
- Described by odds ratio
  - Odds of event occurring: p / (1 − p), where p is the probability
  - Note that Stata will give you log odds or odds ratio (OR)
- Assumptions similar to linear regression, e.g., independent observations, equal variance
- Example: odds of TWA exposure >85 dBA
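
The relationship between probability, odds, and log odds can be checked with a few lines of Python. This is a minimal sketch; the 20% figure is a made-up value, not a result from the course dataset.

```python
import math


def odds(p):
    """Odds of an event with probability p, where 0 <= p < 1."""
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds, defined for 0 < p < 1."""
    return math.log(odds(p))


def probability(log_odds_value):
    """Invert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds_value))


# Hypothetical example: 20% of workers have a TWA above 85 dBA.
p_over_85 = 0.20
print("odds:", odds(p_over_85))
print("log odds:", log_odds(p_over_85))
```

A probability of 0.5 corresponds to odds of 1 and log odds of 0, which is a handy reference point when reading model output.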

Binary logistic regression
- Assumes outcome is categorical and binary (i.e., 0 or 1)
  - 0 = negative outcome, 1 = positive outcome
- Predictor variables
  - Cannot be nominal variables unless the “i.” factor-variable prefix is used
  - Continuous variables okay
  - Ordinal variables okay
- “vce(robust)” accounts for mild assumption violations
- Stata: logistic depvar indepvar, vce(robust)
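
Conceptually, entering a nominal predictor with “i.” expands it into 0/1 indicator variables with one level held out as the reference. The Python sketch below shows that expansion; the job titles are hypothetical, and the function name and its choice of the first sorted level as the reference are assumptions made for the illustration.

```python
def indicator_columns(values, base=None):
    """Expand a nominal variable into 0/1 indicators, omitting the base level."""
    levels = sorted(set(values))
    if base is None:
        base = levels[0]  # assumed reference level: first level in sorted order
    columns = {}
    for level in levels:
        if level == base:
            continue  # the reference level gets no column of its own
        columns[level] = [1 if v == level else 0 for v in values]
    return columns


# Hypothetical nominal predictor.
job_title = ["welder", "carpenter", "electrician", "welder"]
dummies = indicator_columns(job_title)
```

Each remaining column then gets its own coefficient, interpreted relative to the omitted reference level.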

Binary logistic regression
- We model log odds: ln[p / (1 − p)] = β0 + β1Xi, where Xi is a continuous variable
  - Xi = 0: log odds = β0
  - Xi = x: log odds = β0 + β1x
  - Xi = x + 1: log odds = β0 + β1x + β1

Binary logistic regression: interpreting odds
- Xi = 0: odds = e^β0
- Xi = x: odds = e^(β0 + β1x)
- Xi = x + 1: odds = e^(β0 + β1x + β1) = e^(β0 + β1x) × e^β1
- For groups differing by 1 unit, odds ratio = e^β1
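
A quick numerical check of the one-unit odds ratio, as a minimal sketch. The coefficient values are hypothetical and chosen only to make the arithmetic visible.

```python
import math

beta0 = -2.0  # hypothetical intercept: log odds when X = 0
beta1 = 0.5   # hypothetical slope: change in log odds per 1-unit increase in X


def odds_at(x):
    """Odds implied by the model at predictor value x."""
    return math.exp(beta0 + beta1 * x)


x = 3.0
odds_ratio = odds_at(x + 1) / odds_at(x)

# The ratio does not depend on x: it always equals exp(beta1).
print(odds_ratio, math.exp(beta1))
```

Because the ratio is the same at every starting value of x, a single odds ratio summarizes the effect of a continuous predictor.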

Interpreting logistic regression results
- Odds when predictor = 0
  - Exponentiate intercept if using log odds: e^β0
  - (Otherwise have Stata output odds ratio)
- Odds ratio between groups whose predictor values differ by one unit
  - Exponentiate slope from regression: e^β1
- Compare nested models with log likelihood score
  - E.g., model with variables 1 and 2 vs variable 1 alone; cannot compare variable 1 vs variable 2
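
One standard way to compare nested models by log likelihood is a likelihood-ratio test. The sketch below handles the simplest case, where the larger model adds exactly one parameter, so the chi-square tail probability can be computed with the standard library alone. The log likelihood values are invented, and the function name is an assumption for illustration.

```python
import math


def lr_test_one_df(ll_reduced, ll_full):
    """Likelihood-ratio test for nested models that differ by one parameter."""
    lr_stat = max(2.0 * (ll_full - ll_reduced), 0.0)
    # With 1 degree of freedom, P(chi-square > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(lr_stat / 2.0))
    return lr_stat, p_value


# Hypothetical log likelihoods: variable 1 alone vs variables 1 and 2.
ll_var1 = -120.5
ll_var1_var2 = -117.3
lr_stat, p_value = lr_test_one_df(ll_var1, ll_var1_var2)
print(lr_stat, p_value)
```

The test only makes sense when the smaller model's predictors are a subset of the larger model's, which is why variable 1 alone cannot be compared against variable 2 alone this way.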

Binary logistic regression: interpreting model output
- http://stats.idre.ucla.edu/stata/seminars/stata-logistic/
- p-value for chi-square (aka likelihood ratio)
  - Test that all estimated coefficients equal zero
  - Test of whole model
- Pseudo R² (aka likelihood ratio index)
  - 1 − (log likelihood full model / log likelihood intercept-only model)
- Log pseudolikelihood
  - Closer to zero (smaller absolute value) = better model fit
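
The likelihood ratio index (McFadden's pseudo R²) is conventionally one minus the ratio of the two log likelihoods. A minimal sketch with invented log likelihood values:

```python
def mcfadden_pseudo_r2(ll_full, ll_null):
    """McFadden's pseudo R-squared: 1 - (LL full model / LL intercept-only model)."""
    return 1.0 - ll_full / ll_null


# Hypothetical log likelihoods.
ll_null = -120.5   # intercept-only model
ll_full = -100.0   # model with predictors
print(mcfadden_pseudo_r2(ll_full, ll_null))
```

A model that does no better than the intercept-only model scores 0, and the index rises toward 1 as the full model's log likelihood approaches zero.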

Binary logistic regression: logistic regression with repeated measures
- Stata: logistic depvar indepvar, cluster(idvar) or

Multiple logistic regression
- Like simple logistic regression, but with 2 or more predictors
- Stata: logistic depvar indepvar1 indepvar2, vce(robust)

Repeated measures logistic regression
- Use if you have
  - Binary outcome
  - Predictor measured repeatedly for each subject
- And want to
  - Run logistic regression that accounts for multiple (i.e., correlated) measures from single subjects
- Stata: logistic depvar indepvar, cluster(idvar)

Multinomial logistic regression
- Use if dependent variable has more than 2 categories, but is not continuous
  - E.g., low, medium, high exposure
  - E.g., no, low, medium, severe health impact
- Not expecting this level of sophistication in your analyses…but http://stats.idre.ucla.edu/stata/dae/multinomiallogistic-regression/
- Example: create 3-level ordinal exposure variable from TWA
- Stata: mlogit depvar indepvar, or vce(robust)
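
Creating a 3-level exposure variable from a continuous TWA amounts to applying two cut points. In the sketch below the upper cut point follows the >85 dBA example used earlier in the lecture; the lower cut point of 80 dBA and the 0/1/2 coding are assumptions made only for illustration.

```python
LOW_MAX = 80.0   # assumed cut point between low and medium (not from the lecture)
HIGH_MIN = 85.0  # above this is "high", following the >85 dBA example


def exposure_level(twa_dba):
    """Return 0 (low), 1 (medium), or 2 (high) for a TWA in dBA."""
    if twa_dba > HIGH_MIN:
        return 2
    if twa_dba >= LOW_MAX:
        return 1
    return 0


# Hypothetical TWA values.
twa_values = [72.5, 80.0, 85.0, 91.3]
levels = [exposure_level(t) for t in twa_values]
```

Whatever cut points are used, it is worth deciding up front which category the boundary values fall into; here exactly 85.0 counts as medium.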

Task-based exposure estimation
- Develop exposure model based not on personal characteristics or grouping strategy, but on tasks
- Assume task directly or indirectly determines exposure
- Create estimates of personal exposure
- Example: industry where tasks vary widely within groups
- Equation terms: T = total duration, t = duration of task n, f = frequency of task n, C = intensity of task n
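
The slide's equation is not reproduced in this transcript, so the sketch below is only one plausible reading of the listed terms. It assumes noise levels in dBA combined by energy averaging (a 3 dB exchange rate), with each task weighted by its duration t times its frequency f, and T taken as the sum of those weighted durations. The task values are invented.

```python
import math


def task_based_level(tasks):
    """Energy-averaged level in dBA from (level_dba, hours, frequency) tuples.

    Assumed form: 10 * log10( (1/T) * sum(f * t * 10**(C/10)) ), T = sum(f * t).
    """
    total_time = sum(t * f for _, t, f in tasks)
    mean_energy = sum(f * t * 10 ** (c / 10.0) for c, t, f in tasks) / total_time
    return 10.0 * math.log10(mean_energy)


# Hypothetical shift: 4 hours at 90 dBA and 4 hours at 80 dBA, each done once.
shift = [(90.0, 4.0, 1), (80.0, 4.0, 1)]
print(round(task_based_level(shift), 1))
```

Under this assumed form the louder task dominates the result, so the combined level sits much closer to 90 than to the arithmetic midpoint of 85.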

Illustration of task-based approach

Task-based assessment Can work better than other approaches Neitzel et al, 2011

Task-based assessment for noise: example of implementation Ignacio and Bullock, AIHA, 2006

Task-based assessment
- Complicated
  - First, evaluate task-specific exposure levels
  - Then apply exposure levels to individual task durations and frequencies
  - Combination yields personal exposure estimate
- We don’t always have information about all tasks a person performs; may instead focus on
  - Longest task
  - Most frequent task
  - Task with highest exposure
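
When only one task per person can be used, the three fallback choices above are simple selections over that person's task list. A minimal sketch with an invented task list; the field names are hypothetical.

```python
# Hypothetical tasks for one worker; "hours" is the duration of one occurrence.
tasks = [
    {"task": "grinding", "level_dba": 95.0, "hours": 1.0, "times_per_shift": 2},
    {"task": "welding", "level_dba": 88.0, "hours": 4.0, "times_per_shift": 1},
    {"task": "cleanup", "level_dba": 78.0, "hours": 0.5, "times_per_shift": 6},
]

longest = max(tasks, key=lambda t: t["hours"])
most_frequent = max(tasks, key=lambda t: t["times_per_shift"])
highest_exposure = max(tasks, key=lambda t: t["level_dba"])

print(longest["task"], most_frequent["task"], highest_exposure["task"])
```

In this made-up case the three rules pick three different tasks, which shows how much the choice of rule can change the exposure assigned to a person.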

(Pseudo) task-based estimation
- What about our dataset?
  - To do task-based estimation right, we’d need info on all tasks performed, levels, and durations
  - We just have primary task performed
- Stata:
    rreg depvar indepvar1 indepvar2
    predict fitted
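
With primary task as the only task information, a regression on that one categorical predictor assigns each person a task-level estimate. Under ordinary least squares that estimate is simply the mean TWA of everyone sharing the same primary task, which the sketch below computes. The records are invented, and rreg would give somewhat different values because it down-weights outliers.

```python
from collections import defaultdict


def mean_by_task(records):
    """Mean TWA per primary task from (task, twa_dba) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for task, twa in records:
        totals[task][0] += twa
        totals[task][1] += 1
    return {task: total / count for task, (total, count) in totals.items()}


# Hypothetical records: (primary task, measured TWA in dBA).
records = [
    ("carpentry", 86.0), ("carpentry", 90.0),
    ("welding", 92.0), ("welding", 88.0), ("welding", 93.0),
    ("office", 65.0),
]
task_means = mean_by_task(records)

# Analogue of "predict fitted": each person gets their task group's mean.
fitted = [task_means[task] for task, _ in records]
```

Adding a second predictor, as in the Stata command above, lets the estimate vary within a task group instead of being identical for everyone in it.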