HSRP 734: Advanced Statistical Methods June 19, 2008.



Extensions of Logistic Regression
Outcomes with more than 2 categories
–Categories have order
–Unordered
Conditional logistic regression
–Analysis of matched data

Extensions of Logistic Regression
Exact methods for small samples
–Fisher’s exact
–Exact logistic regression
Correlated/Clustered data
–GEE method
–Mixed models

Extensions of Logistic Regression
Outcomes with more than 2 categories (polytomous or polychotomous)
–Cumulative logit model: proportional odds model for ordinal outcomes (ordered categories)
–Generalized logit model for nominal outcomes or non-proportional odds models (unordered categories)

Extensions of Logistic Regression
Cumulative logit model
–Fits a logistic regression model with g-1 intercepts for a g category outcome and one model coefficient for each predictor
–Models cumulative probability of being in a “lower” category
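The cumulative logit structure described above can be sketched in a few lines of Python. This is only an illustration, not the course's SAS code: the function name `cumulative_logit_probs` is hypothetical, and the sign convention logit P(Y <= k) = alpha_k + x'beta is an assumption (some software uses the opposite sign for beta).

```python
import math

def cumulative_logit_probs(intercepts, coefs, x):
    """Category probabilities under a cumulative logit model.

    intercepts: g-1 increasing intercepts, one per cumulative split
    coefs: one coefficient per predictor, shared by every split
    x: predictor values for a single subject
    Assumed convention: logit P(Y <= k) = intercepts[k] + sum(coefs * x).
    """
    eta = sum(b * v for b, v in zip(coefs, x))
    # Cumulative probabilities P(Y <= k) for the first g-1 categories
    cum = [1.0 / (1.0 + math.exp(-(a + eta))) for a in intercepts]
    cum.append(1.0)  # P(Y <= g) is always 1
    # Differences of adjacent cumulative probabilities give P(Y = k)
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical 3-category outcome with one predictor
probs = cumulative_logit_probs([-1.0, 1.0], [0.5], [0.0])
print(probs)
```

Because the single coefficient is shared by all splits, the odds ratio for a one-unit change in the predictor is the same at every cumulative split, which is the proportional odds property.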

Ordinal Logistic Regression
Odds ratios take on the interpretation “% increase/decrease in the odds of being in a lower/higher category”
Subject to the “Proportional Odds” assumption

Extensions of Logistic Regression
Generalized logit model
–Fits a logistic regression model with g-1 intercepts and g-1 model coefficients per predictor for a g category outcome
–Model captures the multinomial probability of being in a particular category using generalized logits
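As a rough sketch of the generalized logit model, the following Python function turns g-1 intercepts and g-1 coefficient sets into multinomial probabilities. The function name `generalized_logit_probs` and the choice of the last category as the reference are assumptions made for illustration.

```python
import math

def generalized_logit_probs(intercepts, coefs, x):
    """Multinomial probabilities from generalized logits.

    intercepts: g-1 intercepts, one per non-reference category
    coefs: g-1 lists of coefficients (one list per non-reference category)
    x: predictor values for a single subject
    The last category is treated as the reference (an assumption).
    """
    # log(P(Y = k) / P(Y = reference)) for each non-reference category
    etas = [a + sum(b * v for b, v in zip(bs, x))
            for a, bs in zip(intercepts, coefs)]
    etas.append(0.0)  # the reference category has linear predictor 0
    m = max(etas)     # subtract the maximum for numerical stability
    expo = [math.exp(e - m) for e in etas]
    total = sum(expo)
    return [e / total for e in expo]

# Hypothetical 3-category outcome with one predictor
print(generalized_logit_probs([0.2, -0.4], [[1.0], [0.5]], [2.0]))
```

Each non-reference category has its own coefficient for every predictor, so no proportional odds constraint is imposed.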

Nominal Logistic Regression
Odds ratios have the regular interpretation; just have to be careful with which comparisons are being made (reference category)
Does not assume “Proportional Odds”

SAS

Conditional logistic regression
Can use for matched data (e.g., case-control studies)
Provides unbiased estimates of odds ratios and CIs
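For the simplest matched design (1:1 matched case-control pairs with a single binary exposure), the conditional maximum likelihood estimate of the odds ratio reduces to the ratio of discordant pairs. The sketch below assumes that special case only; the function name `matched_pair_odds_ratio` and the large-sample Wald interval are illustrative choices, not the course's SAS procedure.

```python
import math

def matched_pair_odds_ratio(pairs, z=1.96):
    """Conditional odds ratio for 1:1 matched pairs, binary exposure.

    pairs: list of (case_exposed, control_exposed) tuples coded 0/1
    Returns (odds_ratio, (lower, upper)) with a Wald interval on the log scale.
    """
    # Only discordant pairs carry information about the odds ratio
    b = sum(1 for case, ctrl in pairs if case == 1 and ctrl == 0)
    c = sum(1 for case, ctrl in pairs if case == 0 and ctrl == 1)
    if b == 0 or c == 0:
        raise ValueError("need both kinds of discordant pairs")
    or_hat = b / c
    se_log = math.sqrt(1.0 / b + 1.0 / c)  # large-sample SE of log(OR)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, (lower, upper)

# Hypothetical data: 10 + 5 discordant pairs, 10 concordant pairs
pairs = [(1, 0)] * 10 + [(0, 1)] * 5 + [(1, 1)] * 7 + [(0, 0)] * 3
print(matched_pair_odds_ratio(pairs))
```

Concordant pairs drop out of the conditional likelihood, which is why they do not affect the estimate.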

SAS

Extensions to Logistic Regression
Exact Logistic Regression
–Small sample size
–Adequate sample size but rare event (sparse data)

Fisher’s exact test
Exact test for RxC table where Chi-square test assumptions are doubtful
Why not always use Fisher’s exact test and exact logistic regression?
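To make the idea of an exact test concrete, here is a minimal Python sketch of Fisher's exact test for the 2x2 case only (not general RxC tables). The function name `fisher_exact_2x2` is hypothetical, and the two-sided p-value uses the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def pmf(k):
        # Hypergeometric probability that the top-left cell equals k,
        # with all row and column totals held fixed
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = pmf(a)
    # Sum over every table at least as extreme (no more probable) as observed;
    # the small tolerance guards against floating-point ties
    return sum(pmf(k) for k in range(k_min, k_max + 1)
               if pmf(k) <= p_obs * (1 + 1e-9))

print(fisher_exact_2x2(3, 1, 1, 3))
```

Every table with the observed margins must be enumerated, so the cost grows quickly with sample size and table dimension.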

SAS

Extensions of Logistic Regression
Longitudinal data / repeated measures data / clustered data with binary outcomes
Multilevel models (nested data structures)
–GEE (Generalized Estimating Equations)
–GLMM (Generalized Linear Mixed Models)

Two methods for handling clustered outcomes
Mixed models
–Likelihood based
–Use random effects to model clustered observations
–Continuous outcome (but now extended for categorical)
Generalized Estimating Equation (GEE)
–Non-likelihood based
–Can handle large number of clusters
–Categorical outcome

GEE
GEE can be used in
–Longitudinal studies: repeated measures of the same individual form a cluster
–Community studies: subjects clustered by neighborhood
–Familial studies: subjects clustered by family
–Epidemiological studies: different forms of clusters, e.g., pedigree

GEE
In general GEE has 3 sets of parameters to estimate:
–Regression parameter (population-averaged effects)
–Correlation parameter (cluster parameter)
–Scale factor (not uncommon to assume =1)

Comparing SLR and GEE
SLR | GEE
No dispersion allowed for variance: Var(y) = mu(1-mu) | Dispersion allowed for variance: Var(y) = mu(1-mu)*scale_factor
No need to specify correlation matrix | Need to specify correlation structure
Both: has odds ratio interpretation of exp(coefficient)

GEE
In its simplest form, GEE can be considered an extension of logistic regression for clustered data
Clustered data are common
–Time: Longitudinal analysis with repeated measurements on an individual (e.g., BL, 1m, 2m, 6m follow-up)
–Individual: Cross-sectional analysis with multiple outcomes (e.g., left eye, right eye)
–Background: Subjects clustered because of common geographical or social background (e.g., clinic)

Correlation structure
–Often called the working correlation structure in GEE
–Specifies how the observations within a cluster are related
–Often assumes the correlation structure is uniform throughout clusters

Unstructured –All correlation coefficients free to take any value –E.g.,

Exchangeable
–Any two responses within the same cluster have the same correlation
–Simple (1 parameter to estimate)

Autoregressive AR(1)
Correlation between responses depends on the interval of time between responses
–Farther apart responses => weaker correlation
–Only 1 parameter to estimate!
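The three working correlation structures can be written out as small matrices. The Python sketch below is illustrative only: the function names are hypothetical, the AR(1) version assumes equally spaced measurement times, and the unstructured version takes the pairwise correlations as given rather than estimating them.

```python
def exchangeable(n, rho):
    """Same correlation rho between any two responses in a cluster."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def ar1(n, rho):
    """Correlation rho**|i - j|: decays as responses get farther apart.
    Assumes equally spaced measurement times."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def unstructured(n, corr):
    """Every pair has its own correlation.
    corr maps (i, j) with i < j to a correlation: n*(n-1)/2 free parameters."""
    return [[1.0 if i == j else corr[(min(i, j), max(i, j))]
             for j in range(n)] for i in range(n)]

# Hypothetical cluster of 3 repeated measurements
for row in ar1(3, 0.5):
    print(row)
```

Exchangeable and AR(1) each need one parameter regardless of cluster size, while the unstructured form needs one per pair of responses.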

Correlation matrix
Selection of a “working correlation structure” is at the discretion of the researcher!
How does the correlation structure affect the results?

Properties of GEE estimators
How about the estimated variance (standard errors) of the regression estimates if the “working” correlation matrix is not correctly specified?
–Model-based estimate => not consistent
–Empirical (robust) estimate => still consistent
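The contrast between model-based and empirical (robust) estimates can be illustrated in the simplest possible setting: an intercept-only logistic GEE with an independence working correlation and scale factor 1. Under those assumptions the estimating equation has a closed-form solution, so no iteration is needed. The function name `intercept_only_gee_se` is hypothetical and this is a sketch, not a general GEE fitter.

```python
import math

def intercept_only_gee_se(clusters):
    """Intercept-only logistic GEE, independence working correlation.

    clusters: list of lists of 0/1 outcomes, one inner list per cluster
    Returns (beta, se_model, se_robust) on the logit scale.
    Requires an overall proportion strictly between 0 and 1.
    """
    n = sum(len(c) for c in clusters)
    mu = sum(sum(c) for c in clusters) / n   # solves sum(y - mu) = 0
    beta = math.log(mu / (1.0 - mu))
    # "Bread": model-based information, treats all observations as independent
    bread = n * mu * (1.0 - mu)
    # "Meat": squared cluster-level score totals, keeps within-cluster correlation
    meat = sum((sum(c) - len(c) * mu) ** 2 for c in clusters)
    se_model = math.sqrt(1.0 / bread)
    se_robust = math.sqrt(meat) / bread      # sqrt(bread^-1 * meat * bread^-1)
    return beta, se_model, se_robust

# Hypothetical data: outcomes perfectly correlated within each cluster
clusters = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(intercept_only_gee_se(clusters))
```

In this hypothetical data set the independence assumption is badly wrong, and the model-based standard error comes out smaller than the robust one, which still reflects the clustering.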

Properties of GEE estimators
Even if the correlation structure is misspecified, the estimate for logistic regression is still consistent
–If the correlation is misspecified, the estimate is not as efficient (SE is larger)
–This property contributes to the popularity of GEE
GEE works well with larger numbers of clusters

SAS

Review