Relative risk estimation with clustered/longitudinal data: solving convergence issues in fitting the log binomial generalized estimating equations (GEE)


Presenter: Zhu Chao
Co-authors: David W Hosmer, Jim Stankovich, Karen Wills, Leigh Blizzard

Introduction: Log binomial generalized estimating equations (GEE)

It is possible to estimate risk and risk ratios in clustered/longitudinal data by fitting a log binomial GEE. However, the estimating equations may fail to converge for two reasons:
- the iterations commence from inappropriate starting values, or
- the fitted mean value of one or more observations is equal to unity at the solution.

Starting values are inappropriate if the initial fitted mean value of one or more observations is greater than unity. This first problem can be rectified by providing improved initial values. Solving the second problem requires a specialised approach.
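The condition on starting values can be checked directly before fitting. The following is a minimal sketch, not the presenters' code: the toy data, the function names, and the intercept-only starting rule (constant set to the log of the observed risk, all slopes zero) are assumptions chosen for illustration. Under a log link the fitted mean is exp(x'beta), so a start is admissible here when no fitted mean exceeds 1.

```python
import math

def fitted_means(rows, beta):
    """Fitted means exp(x'beta) under a log link; each row starts with 1 for the constant."""
    return [math.exp(sum(x * b for x, b in zip(row, beta))) for row in rows]

def is_admissible(rows, beta):
    """True if no fitted mean exceeds 1 at the proposed starting values."""
    return all(mu <= 1.0 for mu in fitted_means(rows, beta))

def intercept_only_start(y, n_covariates):
    """Constant = log(observed risk), slopes = 0, so every fitted mean equals the observed risk.
    Assumes at least one event, otherwise log(0) is undefined."""
    risk = sum(y) / len(y)
    return [math.log(risk)] + [0.0] * n_covariates

# Hypothetical toy data: constant, age in years, binary exposure.
rows = [[1, 30, 0], [1, 55, 1], [1, 80, 1], [1, 45, 0]]
y = [0, 1, 1, 0]

bad_start = [0.0, 0.01, 0.0]  # exp(0.01 * age) is above 1 for every subject
good_start = intercept_only_start(y, n_covariates=2)

print(is_admissible(rows, bad_start))
print(is_admissible(rows, good_start))
```

The intercept-only start is one simple way to guarantee admissibility; it says nothing about the second problem, where the solution itself lies on the boundary.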

Method: Exact method

The exact method (Petersen and Deddens, 2010; Zhu et al, 2019) solves convergence difficulties in estimation of the log binomial model for independent data by re-parameterizing the covariates to eliminate the boundary vectors (those covariate vectors with fitted probabilities equal to unity). Because of similarities in functional form and estimation, we postulated that the exact method could be used to solve convergence difficulties encountered in fitting the log binomial GEE for clustered/longitudinal data.

Issues in applying the exact method to the log binomial GEE:
- identifying the boundary vector(s);
- determining which optimisation criterion should be used to confirm the solutions.
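To make the idea of a boundary vector concrete, the sketch below flags covariate vectors whose fitted mean is numerically equal to 1 and then illustrates a re-parameterization for the simplest case of a single boundary vector. This is an illustration under stated assumptions, not the published algorithm: the toy data, coefficients, tolerance, and function names are hypothetical. The algebra it relies on is only that a fitted mean of 1 under a log link means x_b'beta = 0, so the constant can be written as minus the boundary covariates times the slopes, and the model can be expressed in covariates shifted by the boundary vector with no constant.

```python
import math

def linear_predictor(row, beta):
    return sum(x * b for x, b in zip(row, beta))

def boundary_vectors(rows, beta, tol=1e-6):
    """Indices of rows whose fitted mean exp(x'beta) is within tol of 1."""
    return [i for i, row in enumerate(rows)
            if abs(math.exp(linear_predictor(row, beta)) - 1.0) <= tol]

def shift_by_boundary(rows, boundary_row):
    """Drop the constant column and subtract the boundary covariates from each row."""
    return [[x - xb for x, xb in zip(row[1:], boundary_row[1:])] for row in rows]

# Hypothetical toy data: constant, age in years, binary exposure.
rows = [[1, 30, 0], [1, 55, 1], [1, 80, 1], [1, 45, 0]]
beta = [-2.0, 0.02, 0.4]  # hypothetical coefficients with one fitted mean equal to 1

flagged = boundary_vectors(rows, beta)
print(flagged)

# With x_b'beta = 0, the shifted covariates and the slopes alone reproduce
# the original linear predictor, so the boundary row maps to a zero vector.
shifted = shift_by_boundary(rows, rows[flagged[0]])
slopes = beta[1:]
for row, s in zip(rows, shifted):
    print(round(linear_predictor(row, beta), 6), round(linear_predictor(s, slopes), 6))
```

In practice the boundary vectors are not known in advance, which is why identifying them is listed as an issue; several boundary vectors would impose several linear constraints rather than the single one used here.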

Example

The data are for a subset of 1000 subjects with a burn injury, sampled from the National Burn Repository dataset by Hosmer, Lemeshow and Sturdivant (2013). The subjects were treated in 40 different burn facilities (FACILITY). Our goal was to estimate the probability of death (DEATH), taking account of the correlation within each of the 40 burn facilities.

The covariates are: age (AGE), total burn surface area (TBSA), inhalation injury involved (INH_INJ: 0 = no, 1 = yes), race (RACE: 0 = non-white, 1 = white), flame involved in burn injury (FLAME: 0 = no, 1 = yes) and gender (GENDER: 0 = male, 1 = female).

A log binomial GEE applied to these data failed to converge. There are four boundary vectors, representing four subjects from different burn facilities, with fitted mean values of unity when evaluated at the convergent solution.

Table 1: Four boundary vectors involved in the model.
DEATH AGE GENDER TBSA INH_INJ FLAME RACEC FACILITY
1 81.7 34 47 60.7 97 4 86.1 83 19 54.6 86 13

Re-parameterizing the data to eliminate these boundary vectors in accordance with the exact method produced a convergent solution.

Table 2: Convergent log binomial GEE solution by the exact method (with coefficient estimates from the Poisson GEE shown for comparison).

Variable   Log binomial GEE (95% CI)        Poisson GEE (95% CI)             Coefficient error (%)†
AGE         0.0263 ( 0.0244,  0.0283)        0.0393 ( 0.0330,  0.0456)       49.43
TBSA        0.0159 ( 0.0148,  0.0171)        0.0284 ( 0.0254,  0.0314)       78.62
INH_INJ     0.4461 ( 0.4131,  0.4791)        0.6207 ( 0.3704,  0.8710)       39.01
RACE       -0.3359 (-0.3608, -0.3111)       -0.2183 (-0.4844,  0.0478)       35.01
FLAME       0.7316 ( 0.2504,  1.2127)        0.5076 ( 0.0260,  0.9893)       30.62
GENDER     -0.1144 (-0.1229, -0.1060)       -0.1838 (-0.3567, -0.0108)       60.66
Constant   -3.8715 (-4.2755, -3.4674)       -4.7488 (-5.2865, -4.2112)       22.66

† Coefficient error of the Poisson GEE, calculated as the absolute percentage difference relative to the log binomial GEE estimate of the coefficient.

The Poisson GEE coefficient estimates differed from the log binomial GEE estimates by 23–79%.
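The coefficient error defined in the Table 2 footnote can be recomputed from the tabulated estimates. The sketch below uses the rounded coefficients as printed, so the last digits may differ slightly from the reported column; the dictionary and function names are illustrative choices, not part of the original analysis.

```python
# Point estimates copied from Table 2 (rounded to four decimals as printed).
log_binomial = {
    "AGE": 0.0263, "TBSA": 0.0159, "INH_INJ": 0.4461, "RACE": -0.3359,
    "FLAME": 0.7316, "GENDER": -0.1144, "Constant": -3.8715,
}
poisson = {
    "AGE": 0.0393, "TBSA": 0.0284, "INH_INJ": 0.6207, "RACE": -0.2183,
    "FLAME": 0.5076, "GENDER": -0.1838, "Constant": -4.7488,
}

def percent_error(estimate, reference):
    """Absolute percentage difference of estimate relative to reference."""
    return abs(estimate - reference) / abs(reference) * 100

errors = {name: percent_error(poisson[name], log_binomial[name])
          for name in log_binomial}

for name, err in errors.items():
    print(f"{name:8s} {err:6.2f}")

print(f"range: {min(errors.values()):.0f} to {max(errors.values()):.0f} percent")
```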

Discussion and Conclusion

- Inadmissible starting values are one source of failure of standard fitting algorithms in fitting the log binomial GEE.
- Boundary solutions are responsible for the remaining failures of standard fitting algorithms.
- The exact method can be used to overcome convergence difficulties caused by boundary vectors in the log binomial GEE.
- The coefficients and standard errors of the commonly used Poisson GEE method are biased, and the bias can be severe.
- We recommend estimation of the log binomial GEE using the exact method.