Computing for Research I Spring 2013 Primary Instructor: Elizabeth Garrett-Mayer Regression Using Stata February 19.



First, a few odds and ends
Dealing with non-stringy strings:
– gen xn = real(x)
encode and decode:
– String variable to numeric variable: encode varname, gen(newvar)
– Numeric variable to string variable: decode varname, gen(newvar)
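A minimal sketch of the three conversions above on a toy dataset. The variable names (x, grade, graden, grade2) and the data values are made up for illustration and are not from the course files.

```stata
* Toy data (hypothetical): x holds numbers typed as text, grade holds categories
clear
input str5 x str6 grade
"1.5" "low"
"2"   "high"
"10"  "medium"
end

* Numbers stored as strings: real() returns the numeric value
gen xn = real(x)

* String categories to a labeled numeric variable
* (encode assigns codes 1, 2, 3, ... in alphabetical order of the strings)
encode grade, gen(graden)

* Labeled numeric variable back to a string variable
decode graden, gen(grade2)

list x xn grade graden grade2
list graden, nolabel
```

The last line shows the numeric codes behind the value labels, which is worth checking before graden is used as a covariate.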

Stata for regression
Focus on linear regression
Good news: the syntax is (almost) identical for other types of regression! More on that later
Personal experience:
– I use Stata for most regression problems
– why?
  tons of options
  easy to handle complex correlation structures
  simple to deal with interactions and polynomial terms
  nice way to deal with linear combinations

Linear regression example
How long do animals sleep?
Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D. (1976), Science, November 12, vol. 194, pp
Includes brain and body weight, life span, gestation time, time sleeping, predation and danger indices

Variables in the dataset
– body weight in kg
– brain weight in g
– slow wave ("nondreaming") sleep (hrs/day)
– paradoxical ("dreaming") sleep (hrs/day)
– total sleep (hrs/day) (sum of slow wave and paradoxical sleep)
– maximum life span (years)
– gestation time (days)
– predation index (1-5): 1 = minimum (least likely to be preyed upon), 5 = maximum (most likely to be preyed upon)
– sleep exposure index (1-5): 1 = least exposed (e.g. animal sleeps in a well-protected den), 5 = most exposed
– overall danger index (1-5), based on the above two indices and other information: 1 = least danger (from other animals), 5 = most danger (from other animals)

Basic steps
Explore your data
– outcome variable
– potential covariates
– collinearity!
Regression syntax
– regress y x1 x2 x3 …
– that's about it!
– not many options
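The steps above can be sketched as follows. This is an illustration only: it assumes Stata's bundled auto dataset as a stand-in for the sleep data, so the variable names (price, mpg, weight, length) are not the ones used in class.

```stata
* Stand-in data: the auto dataset that ships with Stata
sysuse auto, clear

* Explore the outcome and the potential covariates
summarize price mpg weight length
graph matrix price mpg weight length

* Check collinearity among the covariates
correlate mpg weight length

* Fit the linear regression: outcome first, then covariates
regress price mpg weight length

* Variance inflation factors as a further collinearity check
estat vif
```

In this stand-in example weight and length are strongly correlated, which is the kind of pattern the collinearity check is meant to catch before interpreting coefficients.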

Interactions
"Interaction expansion" prefix xi: goes before a command
Treats a variable in varlist with i. before it as a categorical (or "factor") variable
Example in breast cancer dataset:
regress logsize graden
vs.
xi: regress logsize i.graden

New twist
You don't have to include xi:! (for making dummy variables)
What is the difference?
– xi prefix: new 'dummy' variables are created in your variable list. Their names begin with '_I', then the variable name, and end with a numeral indicating the category (e.g. _Igraden_2)
– no xi prefix: new variables are not created, just included temporarily in the command. Referring to them in post-estimation commands uses the syntax #.varname, where # is the category of interest (e.g. 2.graden)

Example
With the xi prefix:
xi: regress logsize i.graden ern
test _Igraden_2=_Igraden_3=_Igraden_4=0
Without the xi prefix:
regress logsize i.graden ern
test 2.graden=3.graden=4.graden=0
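A runnable sketch of the same comparison, assuming the bundled auto dataset in place of the breast cancer data. Here rep78 (a 1-5 repair rating) plays the role of graden and weight plays the role of ern; both substitutions are for illustration only.

```stata
sysuse auto, clear

* Without xi: factor-variable notation, no new variables are created
regress price i.rep78 weight
* Joint test that all rep78 category coefficients are zero
test 2.rep78 = 3.rep78 = 4.rep78 = 5.rep78 = 0
* testparm is a shorthand for the same joint test
testparm i.rep78

* With xi: dummy variables _Irep78_2 ... _Irep78_5 are added to the dataset
xi: regress price i.rep78 weight
test _Irep78_2 = _Irep78_3 = _Irep78_4 = _Irep78_5 = 0
describe _Irep78_*
```

Both fits give the same coefficients; the difference is only whether the dummy variables persist in the dataset after the command.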

But that is not an interaction(?)
It facilitates interactions with categorical variables
xi: regress logsize i.black*nodeyn
– fits a regression with the following:
  main effect of black
  main effect of node
  interaction between black and node
– be careful with continuous variables!
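A sketch of the interaction syntax in both styles, again assuming the bundled auto dataset. The binary variable heavy is created here purely for illustration; foreign stands in for black and heavy for nodeyn.

```stata
sysuse auto, clear

* Hypothetical binary covariate: 1 if the car weighs more than 2500 lbs
gen heavy = weight > 2500

* xi style: i.a*i.b expands to both main effects plus their interaction
xi: regress price i.foreign*i.heavy

* Factor-variable style: ## gives main effects plus interaction
regress price i.foreign##i.heavy

* Continuous covariate: mark it with c. so it is not treated as categorical
regress price i.foreign##c.weight
```

The c. prefix in the last command is the factor-variable way of being careful with continuous variables: without it, a variable inside an interaction is treated as categorical.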

Linear Combinations

What is the expected difference in log tumor size comparing…
– two white women, one with node positive vs. one with node negative disease?
– two black women, one with node positive vs. one with node negative disease?
– a black woman with node negative disease vs. a white woman with node positive disease?
(see do file for syntax)
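The course do file is not reproduced here. As a hedged sketch of how such comparisons can be computed with lincom, the example below uses the bundled auto dataset, with foreign standing in for black and a made-up binary variable heavy standing in for node status.

```stata
sysuse auto, clear
gen heavy = weight > 2500          // hypothetical stand-in for node positive

* Model: price = b0 + b1*foreign + b2*heavy + b3*foreign*heavy
regress price i.foreign##i.heavy

* heavy vs. not heavy, both domestic (analog of two white women): b2
lincom 1.heavy

* heavy vs. not heavy, both foreign (analog of two black women): b2 + b3
lincom 1.heavy + 1.foreign#1.heavy

* foreign and not heavy vs. domestic and heavy: b1 - b2
lincom 1.foreign - 1.heavy
```

Each lincom call reports the estimate of the linear combination with its standard error and confidence interval, so no hand calculation from the covariance matrix is needed.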

Other types of regression
logit y x1 x2 x3… or logistic y x1 x2 x3…
– logit: log odds ratios (coefficients)
– logistic: odds ratios (exponentiated coefficients)
poisson y x1 x2 x3, offset(n)
– offset() expects n on the log scale; use exposure(n) if n is the raw person-time or population count
Cox regression
– first declare outcome: stset ttd, fail(death)
– then fit Cox regression: stcox x1 x2
xtlogit or xtreg
– random effects logistic and linear regression
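A short sketch of how closely the syntax carries over, assuming two datasets bundled with Stata (auto and cancer) rather than the course data. The outcome and covariate choices are for illustration only; xtlogit and xtreg are omitted because they need panel data.

```stata
* Logistic regression: same model, two ways of reporting it
sysuse auto, clear
logit foreign mpg weight           // coefficients are log odds ratios
logistic foreign mpg weight        // same fit, reported as odds ratios

* Poisson regression with a person-time denominator
sysuse cancer, clear
poisson died i.drug age, exposure(studytime)

* Cox regression: declare the survival outcome, then fit
stset studytime, failure(died)
stcox i.drug age
```

The pattern in every case is command, outcome, covariates, which is why moving from regress to another model is mostly a matter of changing the command name.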

Other nifty post-regression options
AUC curves after logistic
– estat classification: reports various summary statistics, including the classification table
– estat gof: Pearson or Hosmer-Lemeshow goodness-of-fit test
– lroc: graphs the ROC curve and calculates the area under the curve
– lsens: graphs sensitivity and specificity versus probability cutoff
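A minimal sketch of these commands in sequence, assuming the bundled auto dataset with foreign as a stand-in binary outcome.

```stata
sysuse auto, clear
logistic foreign mpg weight

* Classification table, sensitivity, specificity at the default 0.5 cutoff
estat classification

* Pearson goodness-of-fit test, then Hosmer-Lemeshow with 10 groups
estat gof
estat gof, group(10)

* ROC curve with the area under the curve
lroc

* Sensitivity and specificity plotted against the probability cutoff
lsens
```

All of these are run after the model is fit and use the estimates left in memory, so they need no variable list of their own.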

Other nifty post-regression options
Post Cox regression options
– estat concordance: Calculate Harrell's C
– estat phtest: Test Cox proportional-hazards assumption
– stphplot: Graphically assess the Cox proportional-hazards assumption
– stcoxkm: Graphically assess the Cox proportional-hazards assumption
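A minimal sketch of the post-Cox commands, assuming the bundled cancer dataset (studytime, died, drug, age) in place of the course data.

```stata
sysuse cancer, clear
stset studytime, failure(died)
stcox i.drug age

* Harrell's C for the fitted model
estat concordance

* Test of proportional hazards based on Schoenfeld residuals
estat phtest, detail

* Log-log survival plot by group: roughly parallel curves support PH
stphplot, by(drug)

* Kaplan-Meier observed vs. Cox predicted survival curves by group
stcoxkm, by(drug)
```

stphplot and stcoxkm both require a by() variable (stphplot also accepts strata()), so they assess the assumption for one categorical covariate at a time.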