Analysis of Complex Survey Data Day 3: Regression.

Today’s schedule

Part I: Basic review of common regressions and when to use them

Part II: Introduction to
– PROC REGRESS
– PROC RLOGIST
– PROC LOGLINK
– PROC MULTILOG

Regression

Typically in epidemiologic research, our outcomes fall into five major types:
– Continuous
   – Normally distributed
   – Skewed
– Counts
– Binary
– Ordinal
– Nominal

Continuous outcome, normally distributed: linear regression

Continuous outcome, right skewed: Poisson regression

Counts: Poisson regression

Binary outcome: logistic regression
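
For reference, the three regressions named so far differ mainly in the link between the mean of the outcome and the linear predictor. The block below writes each in its standard textbook form for a single predictor; the symbols Y, X, \beta_0 and \beta_1 are notation assumed here for illustration and do not appear on the slides.

```latex
\begin{align*}
\text{Linear:}   &\quad E[Y \mid X] = \beta_0 + \beta_1 X \\
\text{Poisson:}  &\quad \log E[Y \mid X] = \beta_0 + \beta_1 X \\
\text{Logistic:} &\quad \log\frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)} = \beta_0 + \beta_1 X
\end{align*}
```

In the Poisson form, \exp(\beta_1) is a ratio of means per unit of X; in the logistic form it is an odds ratio.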

Ordinal: polytomous regression, cumulative logit link function
– Likert scales
– Ordered categorical scales (age, income)
– The cumulative logit link function assumes that the effect of going from 1 to 2 is the same as the effect of going from 2 to 3
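
The "same effect" assumption can be seen in the usual way of writing the cumulative logit model. This is a sketch in assumed notation (an outcome Y with ordered levels 1, ..., J and a single predictor X); sign conventions for \beta differ between software packages.

```latex
\[
\log\frac{P(Y \le j \mid X)}{P(Y > j \mid X)} = \alpha_j + \beta X,
\qquad j = 1, \dots, J - 1
\]
```

Each cutpoint j has its own intercept \alpha_j, but all cutpoints share one slope \beta. That shared slope is the proportional odds assumption described above.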

Nominal: polytomous regression, general logit link function
– Race
– Diagnosis (depression versus anxiety versus substance use disorder)
– The general logit link function gives a different estimate for the effect of going from 1 to 2 and the effect of going from 2 to 3
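
By contrast, the general (generalized) logit model compares each category with a reference category and lets the slope vary by category. Again a sketch in assumed notation, taking level J as the reference:

```latex
\[
\log\frac{P(Y = j \mid X)}{P(Y = J \mid X)} = \alpha_j + \beta_j X,
\qquad j = 1, \dots, J - 1
\]
```

Because every non-reference category has its own \beta_j, no ordering of the categories is assumed.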

Categorizing your exposure

Check assumptions regarding the functional form of the relationship between the exposure and the outcome
– E.g., relationship between age and alcohol use disorders. We would not want to enter age as a continuous variable because we do not think age is linearly related to risk of alcohol use disorders

If you decide to categorize a continuous variable, the decision on cutpoints is best made when there is literature precedent
– Relying on data-driven cutpoints will make your work hard to compare with other work in the literature

If there is no precedent:
– Use quartiles, or
– Break up the exposure into small categories, and examine the relationship with the outcome in a regression model with no other predictors (on the log scale if using logistic regression).
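
The quartile approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SUDAAN workflow from the lab: the ages and outcomes are made-up values, the function names are hypothetical, and survey weights and design variables are ignored. With category indicators as the only predictors, the fitted log odds from a logistic regression equal the empirical log odds in each category, so the code computes those directly.

```python
import math
import statistics


def assign_category(value, cutpoints):
    """Return a 0-based category index: the number of cutpoints below value."""
    return sum(value > c for c in cutpoints)


def log_odds_by_category(exposure, outcome, cutpoints):
    """Empirical log odds of a 0/1 outcome within each exposure category.

    Categories with no events or no non-events return None.
    """
    n_cat = len(cutpoints) + 1
    events = [0] * n_cat
    totals = [0] * n_cat
    for x, y in zip(exposure, outcome):
        k = assign_category(x, cutpoints)
        events[k] += y
        totals[k] += 1
    result = []
    for e, t in zip(events, totals):
        non_events = t - e
        if e > 0 and non_events > 0:
            result.append(math.log(e / non_events))
        else:
            result.append(None)
    return result


# Made-up illustration data (NOT from any survey): age and a binary outcome.
ages = [18, 20, 22, 24, 26, 28, 30, 32, 40, 42, 44, 46, 60, 62, 64, 66]
outcome = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]

# Three cutpoints that split the exposure into quartiles.
cutpoints = statistics.quantiles(ages, n=4)
log_odds = log_odds_by_category(ages, outcome, cutpoints)

for k, lo in enumerate(log_odds):
    print(f"age quartile {k + 1}: log odds = {lo}")
```

If the log odds rise and then fall across categories, as they do in this toy data, a single linear term for age would misrepresent the relationship.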

Choosing covariates

Most important: DO NOT SKIP THE GROUNDWORK!
– Check associations with exposure and outcome
– Check associations among covariates
– Categorize the covariates appropriately

When should something be evaluated as a moderator, and when should it be a confounder/covariate?
– Most of the time, it is clear: do you think that the relationship between exposure and outcome will be the same across levels of the third variable, or do you think it will be different?
– If you do not have an a priori hypothesis and are just trying to build a solid statistical model, try it as a moderator first. If significant, leave it in as a moderator.
– Because interaction terms are sometimes difficult to interpret on their own, think about just creating subset statistical models.
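
The idea of subset models can be illustrated with stratum-specific odds ratios. The sketch below is a simplified, unweighted example: the counts are hypothetical, the names are invented for illustration, and no formal test of interaction is performed. For a single binary exposure with no covariates, the odds ratio from a logistic regression fitted within a subset equals the 2x2 odds ratio computed here.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a * d) / (b * c)


# Hypothetical counts (a, b, c, d) for two levels of a third variable.
tables = {
    "stratum_1": (40, 60, 20, 80),
    "stratum_2": (30, 70, 28, 72),
}

# One estimate per subset, as if a separate model were fitted in each.
stratum_or = {name: odds_ratio(*cells) for name, cells in tables.items()}

# Crude estimate from the pooled table, ignoring the third variable.
pooled = tuple(sum(cells) for cells in zip(*tables.values()))
crude_or = odds_ratio(*pooled)

for name, value in stratum_or.items():
    print(f"{name}: OR = {value:.2f}")
print(f"pooled: OR = {crude_or:.2f}")
```

Clearly different estimates across subsets suggest the third variable is acting as a moderator, and each subset estimate can be read on its own without interpreting an interaction term.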

LAB 3: Regression in SUDAAN