Relative risk estimation with clustered/longitudinal data: solving convergence issues in fitting the log binomial generalized estimating equations (GEE)

Presenter: Zhu Chao
Co-authors: David W Hosmer, Jim Stankovich, Karen Wills, Leigh Blizzard

Introduction: Log binomial generalized estimating equations (GEE)

It is possible to estimate risks and risk ratios in clustered/longitudinal data by fitting a log binomial GEE. However, the estimating equations may fail to converge for two reasons: (1) the iterations commence from inappropriate starting values, or (2) the fitted mean value of one or more observations is equal to unity at the solution. Starting values are inappropriate if the initial fitted mean value of one or more observations is greater than unity. The first problem can be rectified by providing improved starting values. Solving the second problem requires a specialised approach.
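The starting-value condition can be checked directly. The following is a minimal sketch, not the authors' code: the data, the helper names (`fitted_means`, `admissible`, `intercept_only_start`) and the intercept-only starting rule are all assumptions made for illustration. Under the log link the fitted mean is exp(x'beta), so setting the intercept to log(mean(y)) and every slope to zero gives each observation the same fitted mean, mean(y), which cannot exceed unity.

```python
import math

def fitted_means(X, beta):
    """Fitted means under the log link: mu_i = exp(x_i' beta)."""
    return [math.exp(sum(x * b for x, b in zip(row, beta))) for row in X]

def admissible(X, beta):
    """True when no fitted mean is greater than unity."""
    return all(mu <= 1.0 for mu in fitted_means(X, beta))

def intercept_only_start(y, n_covariates):
    """Hypothetical starting rule: log(mean(y)) for the intercept, 0 for slopes."""
    ybar = sum(y) / len(y)
    return [math.log(ybar)] + [0.0] * n_covariates

# Made-up data: first column is the intercept, second is age in years.
X = [[1.0, 25.0], [1.0, 60.0], [1.0, 85.0]]
y = [0, 0, 1]

bad_start = [-1.0, 0.02]   # row 2: exp(-1 + 0.02 * 60) = exp(0.2) > 1
good_start = intercept_only_start(y, n_covariates=1)

print(admissible(X, bad_start))    # False
print(admissible(X, good_start))   # True
```

A check of this kind only addresses the first source of failure; it does nothing for fitted means that equal unity at the solution.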

Method: Exact method

The exact method (Petersen and Deddens, 2010; Zhu et al, 2019) solves convergence difficulties in estimation of the log binomial model for independent data by re-parameterizing the covariates to eliminate the boundary vectors (those covariate vectors with fitted probabilities equal to unity). Because of similarities in functional form and estimation, we postulated that the exact method could be used to solve convergence difficulties encountered in fitting the log binomial GEE for clustered/longitudinal data.

Applying the exact method to the log binomial GEE raises two issues: identifying the boundary vector(s), and determining which optimisation criterion should be used to confirm the solutions.
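To make the notion of a boundary vector concrete, here is a small sketch that flags covariate vectors whose fitted probability equals unity at a given coefficient vector. It covers only the flagging step and is not the exact method itself; the function name `boundary_rows`, the tolerance `tol` and the toy data are assumptions introduced for this illustration.

```python
import math

def boundary_rows(X, beta, tol=1e-8):
    """Indices of covariate vectors whose fitted probability is 1 (within tol)."""
    hits = []
    for i, row in enumerate(X):
        eta = sum(x * b for x, b in zip(row, beta))  # linear predictor x'beta
        if math.exp(eta) >= 1.0 - tol:               # log link: mu = exp(eta)
            hits.append(i)
    return hits

# Toy design matrix (intercept, one covariate) and a hypothetical coefficient vector.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [-1.0, 0.5]   # linear predictors: -1.0, -0.5, 0.0

print(boundary_rows(X, beta))   # [2]
```

How the flagged vectors are then removed by re-parameterization, and how the solution is confirmed, is specific to the exact method and is not reproduced here.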

Example

The data are for a subset of 1000 subjects with a burn injury sampled from the National Burn Repository dataset by Hosmer, Lemeshow and Sturdivant (2013). The subjects were treated in 40 different burn facilities (FACILITY). Our goal was to estimate the probability of death (DEATH) taking account of the correlation within each of the 40 burn facilities. The covariates are: age (AGE), total burn surface area (TBSA), inhalation injury involved (INH_INJ: 0 = no, 1 = yes), race (RACE: 0 = non-white, 1 = white), flame involved in burn injury (FLAME: 0 = no, 1 = yes) and gender (GENDER: 0 = male, 1 = female).

A log binomial GEE applied to these data failed to converge. There are four boundary vectors, representing four subjects from different burn facilities, with fitted mean values of unity when evaluated at the convergent solution.

Table 1: Four boundary vectors involved in the model.
DEATH AGE GENDER TBSA INH_INJ FLAME RACEC FACILITY
1 81.7 34 47 60.7 97 4 86.1 83 19 54.6 86 13

Example

Re-parameterizing the data to eliminate these boundary vectors in accordance with the exact method produced a convergent solution. Also shown are the Poisson GEE estimates of the coefficients, which differed by 23% to 79%.

Table 2: Convergent log binomial GEE solution by the exact method (with coefficient estimates from the Poisson GEE shown for comparison).

Variable   Log binomial GEE (95% CI)       Poisson GEE (95% CI)            Coefficient error (%)†
AGE         0.0263 ( 0.0244,  0.0283)       0.0393 ( 0.0330,  0.0456)      49.43
TBSA        0.0159 ( 0.0148,  0.0171)       0.0284 ( 0.0254,  0.0314)      78.62
INH_INJ     0.4461 ( 0.4131,  0.4791)       0.6207 ( 0.3704,  0.8710)      39.01
RACE       -0.3359 (-0.3608, -0.3111)      -0.2183 (-0.4844,  0.0478)      35.01
FLAME       0.7316 ( 0.2504,  1.2127)       0.5076 ( 0.0260,  0.9893)      30.62
GENDER     -0.1144 (-0.1229, -0.1060)      -0.1838 (-0.3567, -0.0108)      60.66
Constant   -3.8715 (-4.2755, -3.4674)      -4.7488 (-5.2865, -4.2112)      22.66

† Coefficient errors of the Poisson GEE, calculated as the absolute percentage difference relative to the log binomial GEE estimates of the coefficient.
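The coefficient error defined in the footnote to Table 2 can be recomputed from the published estimates. This is a sketch only: the function name `pct_error` and the dictionary layout are assumptions, and because the inputs are the rounded four-decimal coefficients, the recomputed percentages may differ slightly from the tabulated ones.

```python
def pct_error(reference, estimate):
    """Absolute percentage difference of estimate relative to reference."""
    return abs(estimate - reference) / abs(reference) * 100.0

# (log binomial GEE, Poisson GEE) coefficient pairs copied from Table 2.
table2 = {
    "AGE": (0.0263, 0.0393),
    "TBSA": (0.0159, 0.0284),
    "INH_INJ": (0.4461, 0.6207),
    "RACE": (-0.3359, -0.2183),
    "FLAME": (0.7316, 0.5076),
    "GENDER": (-0.1144, -0.1838),
    "Constant": (-3.8715, -4.7488),
}

errors = {name: pct_error(lb, po) for name, (lb, po) in table2.items()}
for name, err in errors.items():
    print(f"{name:<9}{err:6.2f}")
```

The smallest and largest recomputed errors (Constant and TBSA) match the 23% to 79% range quoted with the table.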

Discussion and Conclusion

Inadmissible starting values are one source of failure of standard fitting algorithms in fitting the log binomial GEE. Boundary solutions are responsible for the remaining failures of standard fitting algorithms. The exact method can be used to overcome convergence difficulties caused by boundary vectors in the log binomial GEE. The coefficient estimates and standard errors from the commonly used Poisson GEE are biased, and the bias can be severe. We recommend estimation of the log binomial GEE using the exact method.