Optimal Sampling Strategies for Multidomain, Multivariate Case with different amount of auxiliary information. Piero Demetrio Falorsi, Paolo Righi, Italian National Statistical Institute. Seminar UNECE, 12 June 2012
Outline: Aim of the talk; Statement of the problem; (The unified approach for) sampling design; (Mgreg) Estimator; Experimental results; Conclusions
Aim of the talk: An overall strategy
Statement of the problem
Statement of the problem: Challenging informative context Multiple sources of auxiliary information
Statement of the problem: Design
Statement of the problem: Estimation. The standard solution for estimation (calibration estimators) may allow calibration at domain level only for the register variables; it does not calibrate on the existing domain totals that derive from other auxiliary data sources. Main drawbacks: sample sizes too small for some domains; risk that estimates of variables that could be derived from administrative data sources differ significantly from the known totals; biased estimation for small domains; effect of nonresponse or measurement error.
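To make the notion of calibrating at domain level concrete, here is a minimal sketch of generic linear (chi-square distance) calibration in Python with NumPy. It is not the Mgreg estimator of this talk. The function name, the toy sample, and the "known" totals are all hypothetical values chosen for illustration. The calibrated weights have the form w_k = d_k (1 + x_k' lambda), with lambda chosen so that the weighted auxiliary totals match the known ones.

```python
import numpy as np

def linear_calibration_weights(d, X, totals):
    """Return weights w = d * (1 + X @ lam) satisfying X.T @ w == totals."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Horvitz-Thompson estimates of the auxiliary totals under the design weights.
    ht = X.T @ d
    # Weighted cross-product matrix: sum over units of d_k * x_k x_k'.
    M = X.T @ (d[:, None] * X)
    # Solve the calibration equations for the Lagrange multipliers.
    lam = np.linalg.solve(M, totals - ht)
    return d * (1.0 + X @ lam)

# Hypothetical toy sample: 6 units in 2 domains (assumed values, not from the talk).
d = np.array([10.0, 10.0, 10.0, 20.0, 20.0, 20.0])    # design weights
domain = np.array([0, 0, 0, 1, 1, 1])                 # domain membership
size = np.array([2.0, 5.0, 3.0, 8.0, 4.0, 6.0])       # register variable
y = np.array([20.0, 55.0, 28.0, 90.0, 41.0, 66.0])    # survey variable

# Auxiliary vector: domain indicators and the register variable within each domain.
X = np.column_stack([
    domain == 0,
    domain == 1,
    size * (domain == 0),
    size * (domain == 1),
]).astype(float)

# Hypothetical known domain counts and domain register totals.
totals = np.array([32.0, 58.0, 110.0, 350.0])

w = linear_calibration_weights(d, X, totals)
calibrated_totals = X.T @ w
domain_estimates = np.array(
    [np.sum(w[domain == g] * y[domain == g]) for g in (0, 1)]
)
print(calibrated_totals)
print(domain_estimates)
```

In this sketch the calibration constraints cover only the register variable, which mirrors the limitation described above: totals known from other auxiliary sources would need additional columns in `X`, and with small domain sample sizes the matrix `M` can become ill-conditioned or singular.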
Sampling Design: Multiple sources of auxiliary information
Estimation: Multiple sources of auxiliary information
Estimation: The Working model
Estimation: The Mgreg Estimator
Estimation: Properties
Estimation: Properties (auxiliary = interest)
Empirical Results: Population of simulation. Italian enterprises with 1 to 99 employees, Computer and related economic activities (2-digit NACE Rev.1). ITACOSM June 2011, Pisa, Italy. [Table not recovered: population size; number of cross-classified strata; cumulative (%) distribution by "More than" classes.] The domains of interest (44): (1) geographical region, with 20 marginal domains (DOM1); (2) economic activity group by size class (24 domains).
Empirical Results: Simulation: allocation comparison between the one-way and multi-way design. Prediction models: M1, M2. [Table not recovered: Value added and Labour cost (%) by model.]
Empirical Results: multiple sources of auxiliary information: example – efficiency of the proposed strategy. [Figure not recovered: sampling distributions over the partition with different auxiliary information.]
Conclusions
The latest result (the unified approach) of research that has lasted almost 6 years. Publications: Survey Methodology (2008); Statistics in Transition (2006); 2 books published by Franco Angeli, illustrating the main findings of a research project of strategic interest financed by the Ministry of University and Research. Presentations: NTTS (2011); Neuchatel (2011); invited talk at the next scientific conference of the Italian Society of Statistics; accepted talk for the ICES.
References
Bethel J. (1989) Sample Allocation in Multivariate Surveys, Survey Methodology, 15.
Chromy J. (1987) Design Optimization with Multiple Objectives, Proceedings of the Survey Research Methods Section, American Statistical Association.
Deville J.-C., Tillé Y. (2004) Efficient Balanced Sampling: the Cube Method, Biometrika, 91.
Deville J.-C., Tillé Y. (2005) Variance approximation under balanced sampling, Journal of Statistical Planning and Inference, 128.
Falorsi P. D., Righi P. (2008) A Balanced Sampling Approach for Multi-way Stratification Designs for Small Area Estimation, Survey Methodology, 34.
Falorsi P. D., Orsini D., Righi P. (2006) Balanced and Coordinated Sampling Designs for Small Domain Estimation, Statistics in Transition, 7.
Isaki C. T., Fuller W. A. (1982) Survey design under a regression superpopulation model, Journal of the American Statistical Association, 77, 89-96.