Use of Estimating Equations and Quadratic Inference Functions in Complex Surveys Leigh Ann Harrod and Virginia Lesser Department of Statistics Oregon State University

Presentation transcript:

Use of Estimating Equations and Quadratic Inference Functions in Complex Surveys Leigh Ann Harrod and Virginia Lesser Department of Statistics Oregon State University

The research described in this presentation has been funded by the U.S. Environmental Protection Agency through the STAR Cooperative Agreement CR National Research Program on Design-Based/Model-Assisted Survey Methodology for Aquatic Resources at Oregon State University. It has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

BACKGROUND
Generalized estimating equations (GEE) and quadratic inference functions (QIF) are used in longitudinal studies
The techniques are useful when observations are clustered or correlated
Can be generalized to any survey data collected over time or space

Work by Liang and Zeger (1986)
Question of interest:
– Pattern of change over time
– Dependence of response on covariates
Approach:
– Working GLM for the marginal distribution of the response
Advantages of estimating equations:
– Give consistent estimates of regression parameters and their variances
– Increase efficiency
– Methods reduce to maximum likelihood when responses are multivariate normal

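Work by Liang and Zeger (cont’d)
Let $R(\alpha)$ be an $n \times n$ correlation matrix
– The “working correlation matrix”
Let $\alpha$ be an $s \times 1$ vector that fully characterizes $R(\alpha)$
Define, as in Liang and Zeger (1986),
$$V_i = A_i^{1/2} R(\alpha) A_i^{1/2} / \phi,$$
where $A_i$ is the $n \times n$ diagonal matrix of marginal variance functions and $\phi$ is the scale parameter
– $V_i = \mathrm{cov}(Y_i)$ if $R(\alpha)$ is the true correlation matrix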

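GEE
Define GEE as:
$$\sum_{i=1}^{K} D_i^{T} V_i^{-1} S_i = 0,$$
where $S_i = Y_i - \mu_i$, $D_i = \partial \mu_i / \partial \beta$, and $V_i$ is the working covariance matrix built from $R(\alpha)$
Similar to quasi-likelihood approach
– Substitute estimators for $\alpha$ and $\phi$
– Solve for $\hat{\beta}$
Consistency of $\hat{\beta}$ depends on
– Correct specification of the mean, not of $R(\alpha)$
– MCAR data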
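The iteration behind the GEE can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an identity link with constant variance (so $A_i = I$), balanced clusters of size n, and an exchangeable working correlation. The function names `exchangeable` and `gee_identity` are hypothetical.

```python
import numpy as np

def exchangeable(n, alpha):
    """n x n working correlation: 1 on the diagonal, alpha elsewhere."""
    return (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))

def gee_identity(X, y, n_iter=25, tol=1e-10):
    """GEE sketch for an identity link (assumption) with exchangeable R(alpha).

    X: (K, n, p) covariates, y: (K, n) responses for K clusters of size n.
    Returns (beta_hat, alpha_hat, robust_cov).
    """
    K, n, p = X.shape
    # Start from the independence (ordinary least squares) fit.
    beta = np.linalg.lstsq(X.reshape(K * n, p), y.reshape(K * n), rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta                          # S_i = y_i - mu_i, shape (K, n)
        scale = np.sum(resid**2) / (K * n - p)        # estimate of 1/phi
        # Moment estimator of alpha: sum of within-cluster products r_ij * r_ik, j > k.
        pairs = 0.5 * np.sum(resid.sum(axis=1) ** 2 - (resid**2).sum(axis=1))
        alpha = pairs / (scale * (0.5 * K * n * (n - 1) - p))
        alpha = float(np.clip(alpha, -1.0 / (n - 1) + 1e-6, 1.0 - 1e-6))
        Rinv = np.linalg.inv(exchangeable(n, alpha))
        # Solve sum_i X_i^T R^{-1} (y_i - X_i beta) = 0; the scale cancels here.
        A = np.einsum('kip,ij,kjq->pq', X, Rinv, X)
        b = np.einsum('kip,ij,kj->p', X, Rinv, y)
        new_beta = np.linalg.solve(A, b)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    # Sandwich (robust) covariance: bread * meat * bread.
    resid = y - X @ beta
    U = np.einsum('kip,ij,kj->kp', X, Rinv, resid)    # per-cluster scores
    bread = np.linalg.inv(A)
    robust_cov = bread @ (U.T @ U) @ bread
    return beta, alpha, robust_cov
```

Setting alpha to zero throughout would reduce the update to ordinary least squares, which is the sense in which the mean model, not $R(\alpha)$, drives consistency.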

Zeger, Liang, & Albert (1988)
Two approaches:
– Subject-specific (SS) model
– Population-averaged (PA) model
When there is no heterogeneity between subjects, SS model = PA model
Applications
– Site-specific trend over time
– Population-averaged trend of many sites over time
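The SS/PA distinction can be illustrated numerically. The sketch below assumes a logistic model with a normal random intercept of standard deviation sigma (an assumption for illustration; the slides do not fix a model) and compares the exact population-averaged probability, computed by Gauss-Hermite quadrature, with the attenuation approximation given by Zeger, Liang, and Albert (1988). The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# c^2 in the logistic-normal attenuation approximation, c = 16*sqrt(3)/(15*pi).
C2 = (16.0 * np.sqrt(3.0) / (15.0 * np.pi)) ** 2

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def marginal_prob(eta, sigma, deg=60):
    """Population-averaged P(Y=1) = E[expit(eta + sigma*Z)], Z ~ N(0, 1)."""
    nodes, weights = hermegauss(deg)          # weight function exp(-x^2 / 2)
    return np.sum(weights * expit(eta + sigma * nodes)) / np.sqrt(2.0 * np.pi)

def attenuated_prob(eta, sigma):
    """Approximation: the SS linear predictor shrunk by sqrt(1 + c^2 sigma^2)."""
    return expit(eta / np.sqrt(1.0 + C2 * sigma**2))
```

With sigma = 0 (no heterogeneity between subjects) both functions return expit(eta), so the SS and PA models coincide; as sigma grows, the PA probability is pulled toward 0.5.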

Rao, Yung, & Hidiroglou (2002)
GEE used with poststratification to obtain GREG estimator
Use calibration weights = (design weight) × (poststratification adjustment factor)
Simple cases (mean, LS)
– Closed-form solution available
– Taylor linearization variance estimator
Complex cases (logistic)
– Newton-Raphson to obtain estimate
– Jackknife variance estimator
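The calibration-weight formula can be made concrete with a small sketch. It assumes the usual poststratification adjustment factor, namely the known poststratum population count divided by the design-weighted sample estimate of that count; the function name and the example numbers are hypothetical.

```python
import numpy as np

def poststratified_weights(design_w, stratum, pop_counts):
    """Calibration weight = design weight x poststratification adjustment factor.

    design_w:   design weights d_k, one per sampled unit
    stratum:    poststratum label of each sampled unit
    pop_counts: dict mapping poststratum label -> known population count N_c
    Units whose label is missing from pop_counts get NaN (assumption of this sketch).
    """
    design_w = np.asarray(design_w, dtype=float)
    stratum = np.asarray(stratum)
    w = np.full(design_w.shape, np.nan)
    for label, N_c in pop_counts.items():
        mask = stratum == label
        N_hat_c = design_w[mask].sum()        # design-based estimate of N_c
        w[mask] = design_w[mask] * (N_c / N_hat_c)
    return w

# Example: two poststrata with known sizes 30 and 50.
w = poststratified_weights([10, 10, 20, 20, 20],
                           ['a', 'a', 'b', 'b', 'b'],
                           {'a': 30, 'b': 50})
y = np.array([1.0, 3.0, 2.0, 2.0, 5.0])
poststratified_total = np.sum(w * y)          # weighted estimate of the y total
```

By construction the calibrated weights sum to the known count in each poststratum, which is what makes the resulting estimator a special case of GREG.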

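Qu, Lindsay, & Li (2000)
Drawbacks of the GEE approach:
– When R is misspecified:
  – Moment estimator of $\alpha$ doesn’t give the optimal $\hat{\beta}$
  – Moment estimator of $\alpha$ doesn’t exist in some cases
Goal: introduce a strategy for estimating the working correlation that corrects these problems
Quadratic inference functions (QIF)
– Form:
$$Q_N(\beta) = \bar{g}_N^{T} C_N^{-1} \bar{g}_N,$$
where $\bar{g}_N = \frac{1}{N}\sum_{i=1}^{N} g_i(\beta)$ is the average “extended score” over $N$ clusters and $C_N = \frac{1}{N^2}\sum_{i=1}^{N} g_i(\beta)\, g_i(\beta)^{T}$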

Developing QIF
Plays inferential role similar to negative of log-likelihood
Optimal linear combination of elements of the score vector reduces to QL equation
Combine parameter estimates optimally when dimension of parameters differs for different missing data patterns
Model the inverse of correlation matrix as a linear combination of known matrices
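The last point can be written out as a short derivation, following the construction in Qu, Lindsay, and Li (2000). The basis matrices $M_j$, their number $m$, and the coefficients $a_j$ are notation introduced here for illustration; $D_i = \partial \mu_i / \partial \beta$ and $A_i$ is the diagonal matrix of marginal variances. For an exchangeable structure one would take $M_1 = I$ and $M_2$ with 0 on the diagonal and 1 elsewhere.

```latex
% Inverse working correlation as a linear combination of known matrices
R^{-1}(\alpha) \approx \sum_{j=1}^{m} a_j M_j

% Substituting into the quasi-likelihood (GEE) equation
\sum_{i=1}^{N} D_i^{T} A_i^{-1/2}
  \left( a_1 M_1 + \cdots + a_m M_m \right)
  A_i^{-1/2} (y_i - \mu_i) = 0

% which is a linear combination of the elements of the extended score
\bar{g}_N(\beta) = \frac{1}{N} \sum_{i=1}^{N}
\begin{pmatrix}
  D_i^{T} A_i^{-1/2} M_1 A_i^{-1/2} (y_i - \mu_i) \\
  \vdots \\
  D_i^{T} A_i^{-1/2} M_m A_i^{-1/2} (y_i - \mu_i)
\end{pmatrix}
```

Because the extended score does not involve the coefficients $a_j$, the correlation parameters never have to be estimated explicitly.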

QIF
QL equation is a linear combination of the “extended score” g
The linear combination is efficient if the weights are the inverse of the variance
QIF analogous to Rao’s score test statistic
– May be used to test MCAR (ignorable missingness) data assumption for several missing data patterns
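A QIF fit can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: identity link with constant variance, N balanced clusters of size n, and an AR(1)-type basis (the identity matrix plus the matrix with ones on the first off-diagonals). The objective is the quadratic form of the average extended score with its sample covariance, scaled by 1/N². The update is a Gauss-Newton step that holds the covariance fixed within each iteration, which is a simplification. All function names are hypothetical.

```python
import numpy as np

def ar1_basis(n):
    """Basis matrices for an AR(1)-type structure: identity and first off-diagonals."""
    return [np.eye(n), np.eye(n, k=1) + np.eye(n, k=-1)]

def extended_scores(beta, X, y, basis):
    """Per-cluster extended score g_i(beta), shape (N, m*p), for an identity link."""
    resid = y - X @ beta                                   # (N, n)
    parts = [np.einsum('kip,ij,kj->kp', X, M, resid) for M in basis]
    return np.concatenate(parts, axis=1)

def qif(beta, X, y, basis):
    """Q_N(beta) = g_bar^T C_N^{-1} g_bar with C_N = (1/N^2) sum_i g_i g_i^T."""
    g = extended_scores(beta, X, y, basis)
    N = g.shape[0]
    g_bar = g.mean(axis=0)
    C = g.T @ g / N**2
    return float(g_bar @ np.linalg.solve(C, g_bar))

def fit_qif(X, y, basis, n_iter=50, tol=1e-10):
    """Minimise Q_N by Gauss-Newton steps, treating C_N as fixed within a step."""
    N, n, p = X.shape
    beta = np.linalg.lstsq(X.reshape(N * n, p), y.reshape(N * n), rcond=None)[0]
    # For an identity link the derivative of g_bar with respect to beta is -B.
    B = np.concatenate(
        [np.einsum('kip,ij,kjq->pq', X, M, X) for M in basis], axis=0) / N
    for _ in range(n_iter):
        g = extended_scores(beta, X, y, basis)
        g_bar = g.mean(axis=0)
        C = g.T @ g / N**2
        CinvB = np.linalg.solve(C, B)
        step = np.linalg.solve(B.T @ CinvB, CinvB.T @ g_bar)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta, qif(beta, X, y, basis)
```

The AR(1)-type basis is used here because, with an exchangeable basis, the extended-score components for any covariate that is constant within a cluster (such as an intercept) are exact multiples of one another, which makes the covariance matrix singular.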

Applications
Extend use of GEE in survey methodology to include QIF
Use QIF to
– Estimate trend in time
  – For one site
  – Over all sites
– Account for spatial correlation at a point in time
– Account for revisit sites within a year (e.g. ODFW habitat surveys)

Research directions
Weight QIF by within-cluster variance for cluster samples
Account for variable probability sampling
Conduct tests of trend using asymptotic distribution

Acknowledgements
Annie Qu, Oregon State University

References
Liang, K. and S.L. Zeger (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73.
Qu, A., B.G. Lindsay, and B. Li (2000). Improving generalised estimating equations using quadratic inference functions. Biometrika, 87.
Qu, A. and P.X.-K. Song (2002). Testing ignorable missingness in estimating equation approaches for longitudinal data. Biometrika, 89.
Rao, J.N.K., W. Yung, and M.A. Hidiroglou (2002). Estimating equations for the analysis of survey data using poststratification information. Sankhya, 64(A).
Zeger, S.L. and K. Liang (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42.
Zeger, S.L., K. Liang, and P.S. Albert (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics, 44.

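Work by Liang and Zeger (cont’d)
Marginal density of the response is:
$$f(y_{it}) = \exp\left[\{y_{it}\theta_{it} - a(\theta_{it}) + b(y_{it})\}\phi\right],$$
so that $E(y_{it}) = a'(\theta_{it})$ and $\mathrm{var}(y_{it}) = a''(\theta_{it})/\phi$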
