Linearization Variance Estimators for Survey Data: Some Recent Work

A. Demnati and J. N. K. Rao
Statistics Canada / Carleton University
A presentation at the Third International Conference on Establishment Surveys, June 18-21, 2007, Montréal, Québec, Canada
June 20, 2007

Situation
Looking for a method of variance estimation that
  is simple
  is widely applicable
  has good properties
  provides a unique choice
for estimators
  of nonlinear finite population parameters (SM, 2004)
  defined explicitly or implicitly (SM, 2004)
  using calibration weights (SM, 2004)
  under missing data (JSM, and JMS, 2002)
  using repeated surveys (FCSM, 2003)
  of model parameters (Symposium, 2005)
  of dual frames (JSM, 2007)

Demnati-Rao Approach
General formulation
  Finite population parameters
  Model parameters
  Estimator for both parameters
  Variance estimators associated with the two parameters are different

Demnati-Rao Approach (Survey Methodology, 2004)
Write the estimator of a finite population parameter as $\hat{\theta} = f(d)$, $d = (d_1, \ldots, d_N)^T$, with
  $d_k = 0$ if element k is not in sample s;
  $d_k = w_k$ (the design weight) if element k is in sample s.

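A linearization sampling variance estimator is given by $\vartheta_L(\hat{\theta}) = \vartheta(z)$ with $z_k = \partial f(b) / \partial b_k$ evaluated at $b = d$, the vector of design weights
  $\vartheta(y)$: variance estimator of the H-T estimator $\hat{Y} = \sum_k d_k y_k$ of the total $Y$
  $b$ is an (N×1) vector of arbitrary real numbers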
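The recipe on this slide can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes the estimator is written as a function f(b) of an N-vector of weights b, approximates z_k = ∂f/∂b_k at the design weights d by central differences, and plugs the sample values of z into the usual SRS variance formula for an H-T total. The function names, the synthetic population, and the choice of the ratio estimator as the test case are all assumptions made for illustration.

```python
import numpy as np

def ratio_estimator(b, y, x, X_total):
    """Ratio estimator of the total Y, written as a function of the weights b."""
    return X_total * np.sum(b * y) / np.sum(b * x)

def linearization_variable(f, d, eps=1e-6):
    """Central-difference approximation of z_k = df/db_k evaluated at b = d."""
    z = np.zeros(d.size)
    for k in range(d.size):
        step = np.zeros(d.size)
        step[k] = eps
        z[k] = (f(d + step) - f(d - step)) / (2 * eps)
    return z

def srs_ht_variance(z_sample, N):
    """SRS variance estimator of the H-T total of z: N^2 (1 - n/N) s_z^2 / n."""
    n = z_sample.size
    return N**2 * (1 - n / N) * np.var(z_sample, ddof=1) / n

# Synthetic population (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
N, n = 200, 40
x = rng.uniform(10, 100, size=N)
y = 2 * x + np.sqrt(x) * rng.standard_normal(N)

# Simple random sample: d_k = N/n inside the sample, 0 outside.
in_sample = np.zeros(N, dtype=bool)
in_sample[rng.choice(N, size=n, replace=False)] = True
d = np.where(in_sample, N / n, 0.0)

f = lambda b: ratio_estimator(b, y, x, x.sum())
z = linearization_variable(f, d)
v_lin = srs_ht_variance(z[in_sample], N)
```

Only the sampled components of z enter the variance formula, so in practice the derivative needs to be evaluated for sampled units only; the loop over all N units is kept here for simplicity.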

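Example: Ratio estimator of the total $Y$: $\hat{Y}_R = (\hat{Y}/\hat{X})X$
For SRS, the linearization variance estimator is $(X/\hat{X})^2 N^2 (1 - n/N) s_e^2 / n$, where $s_e^2$ is the sample variance of the residuals $e_k = y_k - \hat{R} x_k$ and $\hat{R} = \hat{Y}/\hat{X}$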

Example: Ratio estimator
The linearization variance estimator obtained here is a better choice than the customary one; see
  Royall and Cumberland (1981)
  Särndal et al. (1989)
  Valliant (1993)
  Binder (1996)
  Skinner (2004)
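For reference, a worked derivation of the linearization variable for the ratio estimator may help. It is a sketch under assumed notation (not taken from the slides): $b$ is a vector of arbitrary weights, $d$ the vector of design weights, $\hat{Y} = \sum_k d_k y_k$, $\hat{X} = \sum_k d_k x_k$, and $X$ the known population total of $x$.

```latex
\begin{align*}
f(b) &= X \,\frac{\sum_{k} b_k y_k}{\sum_{k} b_k x_k}, \\
\frac{\partial f(b)}{\partial b_k}
  &= X \,\frac{y_k \sum_{l} b_l x_l - x_k \sum_{l} b_l y_l}
              {\left(\sum_{l} b_l x_l\right)^2}, \\
z_k = \left.\frac{\partial f(b)}{\partial b_k}\right|_{b = d}
  &= \frac{X}{\hat{X}} \left(y_k - \hat{R}\, x_k\right),
  \qquad \hat{R} = \hat{Y}/\hat{X}.
\end{align*}
```

The factor $X/\hat{X}$ is the g-weight of the ratio estimator; the customary linearization uses $y_k - \hat{R} x_k$ without it.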

Demnati-Rao Approach
Also in Survey Methodology, 2004:
  Calibration Estimators:
    the GREG Estimator
    the “Optimal” Regression Estimator
    the Generalized Raking Estimator
  Two-Phase Sampling
New Extensions:
  Wilcoxon Rank-Sum Test
  Cox Proportional Hazards Model

Model parameters (Symposium, 2005)
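Finite population assumed to be generated from a superpopulation model
Inference on model parameter $\theta$
Total variance of $\hat{\theta}$: $V(\hat{\theta}) = E_m V_p(\hat{\theta}) + V_m E_p(\hat{\theta})$
  $E_m$, $V_m$: model expectation and variance
  $E_p$, $V_p$: design expectation and variance
i) if f ≈ 0 then $V(\hat{\theta}) \approx E_m V_p(\hat{\theta})$
ii) if f ≈ 1 then $V(\hat{\theta}) \approx V_m E_p(\hat{\theta})$
where f is the sampling fraction. For multistage sampling, the PSU sampling fraction plays the role of f.
In case i), the sampling variance estimator also estimates the total variance.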

Example: Ratio estimator when y is assumed to be random
for
Define
We have
where $A_d$ is a 2×N matrix of random variables with kth column:
We get
where $A_b$ is a 2×N matrix of arbitrary real numbers with kth column:
where is an estimator of the total variance of

Estimator of the total variance of
and when A variance estimator of is given by with where Note that is an estimator of model covariance when and when

Hence = model variance sampling variance where and Under SRS, where

Under the ratio model,
Note: remains valid under misspecification of
Hence,
Note: the g-weight appears automatically in and the finite population correction $1 - n/N$ is absent in

Simulation 1: Unconditional performance
We generated R=2,000 finite populations, each of size N=393, from the ratio model, where
  the error terms are independent observations generated from N(0,1)
  the x values are the “number of beds” for the Hospitals population studied in Valliant, Dorfman, and Royall (2000, p )
One simple random sample of specified size n is drawn from each generated population
Parameter of interest:
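A simulation of this kind can be sketched as follows. Several ingredients are assumptions, because the slide's formulas and data are not reproduced here: the ratio model is taken to be y_k = β x_k + sqrt(x_k) ε_k with β = 2, the "number of beds" variable is replaced by a synthetic gamma-distributed stand-in, the target is taken to be the finite population total of y, the number of replicates is reduced to keep the run short, and only the g-weighted sampling variance estimator is evaluated.

```python
import numpy as np

def simulate(n, R=500, N=393, beta=2.0, seed=1):
    """Return (simulated MSE of the ratio estimator, average variance estimate)."""
    rng = np.random.default_rng(seed)
    # Hypothetical stand-in for the "number of beds" auxiliary variable.
    x = rng.gamma(shape=2.0, scale=140.0, size=N) + 10.0
    X = x.sum()
    errors = np.empty(R)
    v_est = np.empty(R)
    for r in range(R):
        # New finite population from the assumed ratio model.
        y = beta * x + np.sqrt(x) * rng.standard_normal(N)
        # One simple random sample without replacement.
        s = rng.choice(N, size=n, replace=False)
        R_hat = y[s].sum() / x[s].sum()
        X_hat = N / n * x[s].sum()
        e = y[s] - R_hat * x[s]
        # g-weighted linearization variance estimator under SRS.
        v_est[r] = (X / X_hat) ** 2 * N**2 * (1 - n / N) * np.var(e, ddof=1) / n
        errors[r] = R_hat * X - y.sum()
    return np.mean(errors**2), np.mean(v_est)

mse, avg_v = simulate(n=50)
```

Calling `simulate` for several values of n gives the kind of comparison between average variance estimates and simulated MSE that the figure for this simulation reports.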

Ratio estimator: We calculated: Simulated and its components and

Figure 1: Averages of variance estimates for selected sample sizes compared to simulated MSE of the ratio estimator.

Simulation 2: Conditional performance
We generated R=20,000 finite populations, each of size N=393, from the ratio model, using the number of beds as x
One simple random sample of size n=100 is drawn from each generated population
Parameter of interest:
We arranged the 20,000 samples in ascending order of values and then grouped them into 20 groups, each of size 1,000
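The sort-and-group step can be sketched as below. The ordering variable is assumed here to be the sample mean of x (the slide's symbol was lost), the population total of y is assumed to be the target, the auxiliary variable and the model are synthetic stand-ins, and the number of replicates is reduced; `conditional_summary` is a hypothetical helper name.

```python
import numpy as np

def conditional_summary(order_key, estimates, targets, n_groups=20):
    """Sort replicates by order_key, split them into n_groups groups of
    (near-)equal size, and return the group means of the key together with
    the relative bias of the estimates within each group."""
    idx = np.argsort(order_key)
    groups = np.array_split(idx, n_groups)
    key_mean = np.array([order_key[g].mean() for g in groups])
    rel_bias = np.array([
        (estimates[g].mean() - targets[g].mean()) / targets[g].mean()
        for g in groups
    ])
    return key_mean, rel_bias

rng = np.random.default_rng(2)
N, n, R = 393, 100, 2000
x = rng.gamma(shape=2.0, scale=140.0, size=N) + 10.0  # hypothetical stand-in for beds
xbar_s = np.empty(R)
y_exp = np.empty(R)
y_rat = np.empty(R)
y_tot = np.empty(R)
for r in range(R):
    y = 2.0 * x + np.sqrt(x) * rng.standard_normal(N)  # assumed ratio model
    s = rng.choice(N, size=n, replace=False)
    xbar_s[r] = x[s].mean()
    y_exp[r] = N * y[s].mean()                      # expansion estimator
    y_rat[r] = y[s].sum() / x[s].sum() * x.sum()    # ratio estimator
    y_tot[r] = y.sum()

key, rb_exp = conditional_summary(xbar_s, y_exp, y_tot)
_, rb_rat = conditional_summary(xbar_s, y_rat, y_tot)
```

Plotting `rb_exp` and `rb_rat` against `key` gives a conditional relative bias curve for each estimator; the same helper can be applied to variance estimates with the conditional MSE as the target.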

Figure 2: Conditional relative bias of the expansion and ratio estimators of

Figure 3: Conditional relative bias of variance estimators

Figure 4: Conditional coverage rates of normal theory confidence intervals based on , and for nominal level of 95%

g-weighted estimating functions: model parameter
Generalized Linear Model
is the solution of weighted estimating equation:
is solution
Special case: (GREG)
  Linear Regression Model
  Logistic Regression Model
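For the logistic regression case, solving a weighted estimating equation can be sketched with Newton-Raphson. This is an illustration under assumptions: the equation is taken to be sum_k w_k x_k (y_k - mu_k(beta)) = 0 with mu_k the logistic mean, the weights w_k stand for whatever final weights are used (for example design weight times g-weight), and the function name and synthetic data are hypothetical.

```python
import numpy as np

def solve_weighted_logistic(X, y, w, n_iter=25, tol=1e-10):
    """Newton-Raphson solution of sum_k w_k x_k (y_k - mu_k(beta)) = 0,
    where mu_k(beta) = 1 / (1 + exp(-x_k' beta))."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - mu))                      # weighted estimating function
        info = (X * (w * mu * (1 - mu))[:, None]).T @ X   # minus its derivative
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic sample (hypothetical values, for illustration only).
rng = np.random.default_rng(4)
n, N = 200, 2620
x1 = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1])
p = 1.0 / (1.0 + np.exp(-(0.4 + 1.0 * x1)))
y = (rng.uniform(size=n) < p).astype(float)
w = np.full(n, N / n)  # equal weights, as under SRS without calibration

beta_hat = solve_weighted_logistic(X, y, w)
```

Replacing `w` by calibrated weights changes the solution only through the weighted sums, so the same solver serves both the uncalibrated and the calibrated cases.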

Simulation 3: Estimating equations
We generated R=10,000 finite populations, each of size N=393, from the model
Using the number of beds as x leads to an average of about 60% for z
One simple random sample of size n=30 is drawn from each generated population
Parameter of interest:
Population units are grouped into two classes, with 271 units k having x < 350 in class 1 and 122 units k having x ≥ 350 in class 2
Post-stratification: $X = (271, 122)^T$
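Post-stratification to the class counts (271, 122) can be sketched as follows. The helper name and the synthetic x values are assumptions; only the class sizes, the cut-off at 350, N = 393 and n = 30 come from the slide. Within each class the design weights are multiplied by the ratio of the known class count to its estimated count.

```python
import numpy as np

def poststratified_weights(d, stratum, pop_counts):
    """Multiply design weights d by g_h = N_h / N_hat_h within each class h."""
    w = np.array(d, dtype=float)
    for h, N_h in enumerate(pop_counts):
        in_h = stratum == h
        N_hat_h = d[in_h].sum()
        if N_hat_h == 0:
            raise ValueError(f"no sampled units in class {h}")
        w[in_h] = d[in_h] * N_h / N_hat_h
    return w

rng = np.random.default_rng(3)
N, n = 393, 30
# Hypothetical x values: 271 units below 350 and 122 units at or above 350.
x = np.concatenate([rng.uniform(10, 350, 271), rng.uniform(350, 1000, 122)])
s = rng.choice(N, size=n, replace=False)   # simple random sample
stratum = (x[s] >= 350).astype(int)        # class 1 -> 0, class 2 -> 1
d = np.full(n, N / n)                      # SRS design weights
w = poststratified_weights(d, stratum, pop_counts=[271, 122])
```

The calibrated weights reproduce the known class counts exactly, which is the calibration constraint implied by X = (271, 122)^T.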

Simulation 3: Estimating equations
Table 1: Monte Carlo Variances
Parameter   No Calibration   Post-stratification
0.0133 0.0139 0.0161 0.0167

Table 2: DR variance estimator
Parameter   No Calibration   Post-stratification
0.0122 0.0123 0.0148 0.0150

Table 3: DR naïve variance estimator
Parameter   No Calibration   Post-stratification
0.0120 0.0145

Multiple Weight Adjustments
Weight adjustments for
  unit (or complete) nonresponse
  calibration
Due to lack of time, this topic was not presented in the talk, but it is included in the proceedings paper

Concluding Remarks
We provided a method of variance estimation for estimators:
  of nonlinear model parameters using survey data
  defined explicitly or implicitly
  using multiple weight adjustments
  under missing data
The method
  is simple
  is widely applicable
  has good properties
  provides unique choice
Thank you very much
