Optimal Allocation in the Multi-way Stratification Design for Business Surveys (*) Paolo Righi, Piero Demetrio Falorsi 


Optimal Allocation in the Multi-way Stratification Design for Business Surveys (*)
Paolo Righi, Piero Demetrio Falorsi
Italian National Statistical Institute
(*) Research of National Interest n. 2007RHFBB3 (PRIN) “Efficient use of auxiliary information at the design and at the estimation stage of complex surveys: methodological aspects and applications for producing official statistics”

Outline
- Statement of the problem
- Multi-way sampling design
- Multi-way optimal allocation algorithm
- Monte Carlo simulation

Statement of the problem
- Large-scale surveys in official statistics usually produce estimates for a set of parameters over a very large number of highly detailed estimation domains.
- These domains generally define non-nested partitions of the target population.
- When the domain indicator variables are available in the sampling frame, we can plan a sample that covers each domain.
- Fixing the sample sizes in the domains:
  - helps to control the sampling errors of the main estimates;
  - when direct estimators are not reliable (small area problem), having sampled units in every domain makes it possible to bound the bias of small area indirect estimators and to use models with specific small area effects.

Statement of the problem
- The standard solution for fixing the sample sizes stratifies the sample, with strata given by the cross-classification of the variables defining the different partitions (cross-classified or one-way stratified design).
- Main drawback: the stratification is too detailed, which brings:
  - a risk of sample size explosion;
  - inefficient sample allocation (constraint of at least 2 units per stratum);
  - a risk of excessive statistical burden (e.g. in repeated business surveys).
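The sample size explosion mentioned above can be illustrated with a small counting sketch. The class counts below are hypothetical, and the function `minimum_sample_sizes` is an illustrative helper, not part of the authors' method. It assumes every cross-classified cell is non-empty and compares the smallest sample allowed by the 2-units-per-stratum rule with a simple lower bound for a design that only controls the marginal domains of each partition.

```python
from math import prod


def minimum_sample_sizes(classes_per_partition, min_per_stratum=2):
    """Return (cross-classified minimum, lower bound when only marginal domains are controlled).

    Assumption: every cell of the cross-classification is non-empty.
    """
    # One stratum per cell of the cross-classification, each needing min_per_stratum units.
    cross_classified = min_per_stratum * prod(classes_per_partition)
    # Each partition covers the whole population, so the sample must hold at least
    # min_per_stratum units in every class of the largest partition.
    multi_way_bound = min_per_stratum * max(classes_per_partition)
    return cross_classified, multi_way_bound


# Hypothetical partitions: 20 regions, 15 activity classes, 4 size classes.
print(minimum_sample_sizes([20, 15, 4]))  # (2400, 40)
```

The gap between the two numbers grows multiplicatively with every additional stratification variable, which is the motivation for controlling only the marginal domain sizes.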

Statement of the problem
- Domain of interest
- Parameter of interest and estimator
- Multivariate (r = 1, …, R) and multidomain (d = 1, …, D) context
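The formulas on this slide did not survive in the transcript. As a hedged sketch only, a standard way to write a domain total and its Horvitz-Thompson estimator in this multivariate, multidomain context is given below. The symbols U (population), s (sample), y_{rk} (value of variable r on unit k), \gamma_{dk} (indicator that unit k belongs to domain d) and \pi_k (inclusion probability) are assumptions introduced here, not notation confirmed by the slides.

```latex
t_{(dr)} = \sum_{k \in U} y_{rk}\,\gamma_{dk},
\qquad
\hat{t}_{(dr)} = \sum_{k \in s} \frac{y_{rk}\,\gamma_{dk}}{\pi_k},
\qquad r = 1, \dots, R, \quad d = 1, \dots, D .
```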

Multi-way Sampling Design
- Main problem of the multi-way stratification design (MWD): defining a procedure for random selection.
- We propose to use the Cube method (Deville and Tillé, 2004), which can:
  - select a random sample for a multi-way stratified design;
  - handle a large population and many domains.
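The following sketch does not implement the Cube method itself. It only illustrates the condition that a balanced sample has to satisfy in a multi-way stratified design: in every planned domain, the number of sampled units equals the sum of the inclusion probabilities of the units in that domain. The population, the domain labels and the helper names are hypothetical.

```python
def planned_domain_sizes(pi, domains):
    """Planned sample size per domain: sum of the inclusion probabilities."""
    return {d: sum(pi[k] for k in units) for d, units in domains.items()}


def realised_domain_sizes(sample, domains):
    """Number of sampled units that fall in each domain."""
    chosen = set(sample)
    return {d: sum(1 for k in units if k in chosen) for d, units in domains.items()}


def satisfies_balancing(sample, pi, domains, tol=1e-9):
    """True if the sample matches the planned size in every domain."""
    planned = planned_domain_sizes(pi, domains)
    realised = realised_domain_sizes(sample, domains)
    return all(abs(planned[d] - realised[d]) <= tol for d in domains)


# Hypothetical population of 8 units with two non-nested partitions (A and B).
pi = [0.5] * 8
domains = {
    "A0": [0, 1, 2, 3], "A1": [4, 5, 6, 7],
    "B0": [0, 2, 4, 6], "B1": [1, 3, 5, 7],
}
print(satisfies_balancing([0, 3, 5, 6], pi, domains))  # True
print(satisfies_balancing([0, 2, 4, 6], pi, domains))  # False
```

The second sample has the right total size but puts all four units in domain B0, so it violates the marginal constraints that the design is meant to guarantee.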

Multi-way Sampling Design ITACOSM June 2011, Pisa, Italy - 6

Optimal allocation algorithm

Monte Carlo simulation
- Objectives of the simulation:
  - test the convergence of the optimization algorithm (optimization step);
  - compare the expected AV (anticipated variance) with the Monte Carlo empirical AV;
  - compare with the standard cross-classified stratified design.
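The second objective, comparing an expected variance with its Monte Carlo counterpart, can be sketched in a simplified setting. The example below uses simple random sampling without replacement instead of the balanced design of the presentation, because its design variance has a closed form. The population, the sample size and the function names are all hypothetical.

```python
import random
import statistics


def expected_variance_srs(y, n):
    """Design variance of the expansion estimator N * ybar under SRS without replacement."""
    N = len(y)
    return N * N * (1 - n / N) * statistics.variance(y) / n


def empirical_variance_srs(y, n, replications, seed=2011):
    """Monte Carlo variance of the same estimator over repeated samples."""
    rng = random.Random(seed)
    N = len(y)
    estimates = [N * statistics.fmean(rng.sample(y, n)) for _ in range(replications)]
    return statistics.pvariance(estimates)


# Hypothetical population: a 0/1 variable (such as "employed") for 200 units.
population_rng = random.Random(42)
y = [1 if population_rng.random() < 0.7 else 0 for _ in range(200)]

expected = expected_variance_srs(y, n=40)
empirical = empirical_variance_srs(y, n=40, replications=5000)
print(f"expected={expected:.1f} empirical={empirical:.1f} ratio={empirical / expected:.3f}")
```

A ratio close to 1 indicates that the variance used at the planning stage is a good description of what the sampling procedure actually delivers. A ratio clearly below 1 would mean that the planning variance is conservative.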

Monte Carlo simulation
- Data: a subpopulation of the Istat Italian Graduates’ Career Survey (3,427 units).
- Driving allocation variables:
  - employment status (yes/no);
  - actively seeking work (yes/no).
- We generate the values of the two variables by means of an additive logistic model (prediction model).
- Explanatory variables: degree mark, sex, age class and aggregated subject area of the degree.
- The model parameters are estimated from the data of the previous survey.

Monte Carlo simulation
- Survey target estimates: two partitions define the most disaggregated domains:
  - first partition: university by subject area of the degree (9 classes);
  - second partition: degree by sex.
- In the real survey: 448 + 94 domains; 2,981 strata (university, degree, sex).
- In the simulation: 20 + 15 domains; 91 strata.
- Error thresholds are fixed in terms of CV (%).
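A precision threshold expressed as a CV (%) can be checked as in the minimal sketch below. The estimate, the variance and the threshold values are hypothetical, and the helper names are not taken from the presentation.

```python
import math


def cv_percent(estimate, variance):
    """Coefficient of variation in percent: 100 * standard error / |estimate|."""
    return 100.0 * math.sqrt(variance) / abs(estimate)


def meets_threshold(estimate, variance, max_cv_percent):
    """True if the estimate satisfies the CV (%) precision threshold."""
    return cv_percent(estimate, variance) <= max_cv_percent


# Hypothetical domain estimate of 1,200 with variance 3,600 (standard error 60).
print(cv_percent(1200, 3600))           # 5.0
print(meets_threshold(1200, 3600, 10))  # True
print(meets_threshold(1200, 3600, 4))   # False
```

In an allocation problem, one such constraint would be imposed for every combination of domain and target variable.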

Monte Carlo simulation
- Results, assuming the values of the driving variables are known:
  - iterations (outer process): 6;
  - optimal sample size: 171 (after calibration: 182).
- Results, assuming predicted values:
  - iterations (outer process): 3;
  - optimal sample size: 699 (after calibration: 707).

Monte Carlo simulation
- Analysis of the allocation with the predicted values:
  - the sample allocation procedure uses an approximation of the AV;
  - the simulation confirms that the input AV is an upward approximation of the real AV, i.e. it overestimates it.

Monte Carlo simulation
- Comparison with the standard approach:
  - the implicit model (one-way stratification model) is similar to the model used in our approach;
  - the allocation differences depend on the constraint of a minimum number of units (2) in each stratum;
  - the sample size is 751 units (+7.4%);
  - considering only the domains with small population strata (fewer than 10 units per stratum on average), the standard approach requires a sample 14.4% larger.
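The +7.4% figure can be reproduced from the sample sizes reported in the slides, under the assumption that the baseline is the 699-unit multi-way allocation obtained with predicted values (before calibration). The helper name is illustrative.

```python
def relative_increase(baseline, alternative):
    """Percentage increase of `alternative` over `baseline`."""
    return 100.0 * (alternative - baseline) / baseline


# 751 units for the cross-classified design against 699 for the multi-way allocation.
print(round(relative_increase(699, 751), 1))  # 7.4
```

Using the calibrated size of 707 as the baseline would give a smaller increase, so the reported percentage appears to refer to the allocation before calibration.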

References
- Bethel J. (1989) Sample Allocation in Multivariate Surveys, Survey Methodology, 15.
- Chromy J. (1987) Design Optimization with Multiple Objectives, Proceedings of the Survey Research Methods Section, American Statistical Association.
- Deville J.-C., Tillé Y. (2004) Efficient Balanced Sampling: the Cube Method, Biometrika, 91.
- Deville J.-C., Tillé Y. (2005) Variance approximation under balanced sampling, Journal of Statistical Planning and Inference, 128.
- Falorsi P. D., Righi P. (2008) A Balanced Sampling Approach for Multi-way Stratification Designs for Small Area Estimation, Survey Methodology, 34.
- Falorsi P. D., Orsini D., Righi P. (2006) Balanced and Coordinated Sampling Designs for Small Domain Estimation, Statistics in Transition, 7.
- Isaki C. T., Fuller W. A. (1982) Survey design under a regression superpopulation model, Journal of the American Statistical Association, 77.