Eurostat Weighting and Estimation. Presented by Loredana Di Consiglio Istituto Nazionale di Statistica, ISTAT.


Outline
Weighting and estimation in the Handbook
– Weighting, use of auxiliary variables and calibration estimators
– Small area estimation
– Preliminary estimation
Choice of estimation method

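Weighting
Principle of weighting: each sample unit represents a number of population units.
– Basic weights: the design weights $d_k = 1/\pi_k$, the inverse of the inclusion probabilities.
– Horvitz-Thompson estimator of a total: $\hat{Y}_{HT} = \sum_{k \in s} d_k y_k$.
– Non-linear estimation: plug-in (or substitution) principle.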
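As a minimal sketch of the weighting principle, the Horvitz-Thompson total and a plug-in estimate of the mean can be computed as follows. The function name and the three-unit sample are hypothetical and serve only as an illustration.

```python
def horvitz_thompson_total(y, inclusion_probs):
    """Estimate a population total as the design-weighted sum of sample values."""
    if len(y) != len(inclusion_probs):
        raise ValueError("y and inclusion_probs must have the same length")
    # Design weight d_k = 1 / pi_k: the number of population units
    # that sample unit k represents.
    design_weights = [1.0 / p for p in inclusion_probs]
    return sum(d * yk for d, yk in zip(design_weights, y))


# Hypothetical sample: three units with unequal inclusion probabilities.
y_sample = [10.0, 20.0, 30.0]
pi_sample = [0.5, 0.25, 0.1]

total_hat = horvitz_thompson_total(y_sample, pi_sample)

# Plug-in principle: the mean is a non-linear function (a ratio) of two
# totals, so each total is replaced by its Horvitz-Thompson estimate.
size_hat = horvitz_thompson_total([1.0] * len(y_sample), pi_sample)
mean_hat = total_hat / size_hat
```

Here the estimated population size is simply the sum of the design weights.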

Weighting
The principle of weighting is also applied to account for unit non-response: design weights can be adjusted to reduce the possible non-response bias of the resulting estimates.
For example, the sample can be partitioned into sub-groups of units within which the response rate is assumed to be constant and non-respondents are assumed to behave similarly to respondents.
The underlying assumption is that non-response depends on the auxiliary variables defining the partition of the population but, conditionally on these variables, is independent of the target variable.
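The sub-group adjustment described above can be sketched as follows, assuming the adjustment factor is the inverse of the design-weighted response rate in each class (one common choice among several). The function name and the data are hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(design_weights, classes, responded):
    """Inflate respondents' weights by the inverse weighted response rate of their class."""
    sampled = defaultdict(float)   # sum of design weights of all sampled units
    answered = defaultdict(float)  # sum of design weights of respondents only
    for d, c, r in zip(design_weights, classes, responded):
        sampled[c] += d
        if r:
            answered[c] += d
    adjusted = []
    for d, c, r in zip(design_weights, classes, responded):
        # Non-respondents get weight 0; respondents also represent the
        # non-respondents of their own class.
        adjusted.append(d * sampled[c] / answered[c] if r else 0.0)
    return adjusted


# Hypothetical sample of six units in two response classes.
d = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0]
cls = ["A", "A", "A", "A", "B", "B"]
resp = [True, True, False, False, True, False]

w_adj = adjust_for_nonresponse(d, cls, resp)
```

Within each class the adjusted weights of the respondents add up to the design weights of the whole sampled class, so the overall sum of weights is preserved.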

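Use of auxiliary information
When auxiliary variables are available, they can be used to reduce bias and variance (although sometimes their use is imposed by external bounds).
Ratio estimator
– Auxiliary information: the known total $X$ of one numerical variable.
– Estimator: $\hat{Y}_R = X \, \hat{Y}_{HT} / \hat{X}_{HT}$.
– If applied to the variable $x$ itself, one gets a perfect estimate: the known total $X$ is reproduced exactly.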
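A short sketch of the ratio estimator, including the check that it reproduces the known total when applied to the auxiliary variable itself. The function name and the numbers are hypothetical.

```python
def ratio_estimator_total(y, x, design_weights, x_total_known):
    """Ratio estimate of the total of y, using the known total of x."""
    y_ht = sum(d * yk for d, yk in zip(design_weights, y))
    x_ht = sum(d * xk for d, xk in zip(design_weights, x))
    # Scale the Horvitz-Thompson estimate of y by how far the
    # Horvitz-Thompson estimate of x is from its known total.
    return x_total_known * y_ht / x_ht


# Hypothetical sample with equal design weights.
y = [12.0, 20.0, 33.0]
x = [10.0, 20.0, 30.0]
d = [5.0, 5.0, 5.0]
x_total = 320.0

y_ratio = ratio_estimator_total(y, x, d, x_total)
x_ratio = ratio_estimator_total(x, x, d, x_total)  # equals x_total
```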

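Use of auxiliary information
Poststratification
– Auxiliary information: the totals of a vector of post-stratum indicators, i.e. the post-stratum sizes $N_h$.
– The estimator is $\hat{Y}_{post} = \sum_h N_h \, \hat{Y}_h / \hat{N}_h$, where $\hat{Y}_h = \sum_{k \in s_h} d_k y_k$ and $\hat{N}_h = \sum_{k \in s_h} d_k$ are the Horvitz-Thompson estimates for post-stratum $h$.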
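The post-stratified estimator can be sketched as below. The function name, the post-stratum labels and the sizes are hypothetical, and the sketch assumes that every post-stratum contains at least one sample unit.

```python
from collections import defaultdict


def poststratified_total(y, design_weights, poststrata, known_sizes):
    """Post-stratified estimate of a total from known post-stratum sizes N_h."""
    n_hat = defaultdict(float)  # estimated post-stratum sizes
    y_hat = defaultdict(float)  # estimated post-stratum totals of y
    for yk, d, h in zip(y, design_weights, poststrata):
        n_hat[h] += d
        y_hat[h] += d * yk
    # Each estimated post-stratum mean is expanded by the known size N_h.
    return sum(known_sizes[h] * y_hat[h] / n_hat[h] for h in known_sizes)


# Hypothetical sample of four units in two post-strata.
y = [1.0, 2.0, 3.0, 4.0]
d = [10.0, 10.0, 10.0, 10.0]
strata = ["m", "m", "f", "f"]
sizes = {"m": 30.0, "f": 50.0}

y_post = poststratified_total(y, d, strata, sizes)
```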

Use of auxiliary information
Raking ratio
– Auxiliary information: known totals of different auxiliary variables (not cross-classified).
– The raking ratio method consists of post-stratifying on each auxiliary variable in turn and iterating until the weights reproduce all the known totals.
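The iteration can be sketched for two categorical auxiliary variables as follows. The function name, the stopping rule and the data are assumptions made for illustration; the two sets of known totals must add up to the same population size.

```python
def rake_weights(design_weights, rows, cols, row_totals, col_totals,
                 max_iter=100, tol=1e-10):
    """Raking ratio: post-stratify on each margin in turn until convergence."""
    w = list(design_weights)
    for _ in range(max_iter):
        max_change = 0.0
        for labels, totals in ((rows, row_totals), (cols, col_totals)):
            # Current weighted count in each category of this variable.
            current = {key: 0.0 for key in totals}
            for wk, lab in zip(w, labels):
                current[lab] += wk
            # One post-stratification step on this variable.
            for k, lab in enumerate(labels):
                factor = totals[lab] / current[lab]
                max_change = max(max_change, abs(factor - 1.0))
                w[k] *= factor
        if max_change < tol:
            break
    return w


# Hypothetical sample: one unit per cell of a 2 x 2 classification.
d = [10.0, 10.0, 10.0, 10.0]
sex = ["m", "m", "f", "f"]
age = ["young", "old", "young", "old"]
sex_totals = {"m": 48.0, "f": 52.0}
age_totals = {"young": 40.0, "old": 60.0}

w_raked = rake_weights(d, sex, age, sex_totals, age_totals)
```

Only the marginal totals are used; the cross-classified counts are never needed.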

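Use of auxiliary information
GREG
– GREG is «assisted» by a linear relationship between X and Y.
– Estimator: $\hat{Y}_{GREG} = \hat{Y}_{HT} + (X - \hat{X}_{HT})' \hat{B}$, where $\hat{B}$ is the design-weighted least-squares estimate of the regression coefficients of $y$ on $x$.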
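A minimal sketch of GREG in the simplest setting, assuming a single auxiliary variable and a working model without intercept and with constant variance. The function name and the data are hypothetical; the general estimator uses a vector of auxiliary variables.

```python
def greg_total(y, x, design_weights, x_total_known):
    """GREG estimate of the total of y, assisted by the working model y = B*x + error."""
    y_ht = sum(d * yk for d, yk in zip(design_weights, y))
    x_ht = sum(d * xk for d, xk in zip(design_weights, x))
    # Design-weighted least-squares slope (no intercept, constant variance).
    b_hat = (sum(d * xk * yk for d, xk, yk in zip(design_weights, x, y))
             / sum(d * xk * xk for d, xk in zip(design_weights, x)))
    # Horvitz-Thompson estimate plus a regression correction for the
    # discrepancy between the known and the estimated total of x.
    return y_ht + b_hat * (x_total_known - x_ht)


# Hypothetical sample in which y is exactly twice x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
d = [4.0, 4.0, 4.0]

y_greg = greg_total(y, x, d, x_total_known=30.0)
```

Because the linear relationship holds exactly in this toy sample, the estimate equals twice the known total of x even though the Horvitz-Thompson estimate of x misses it.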

Calibration
The estimate of the total of Y is obtained by means of a procedure which
– corrects the bias due to non-response;
– takes into account the knowledge of auxiliary variables, by requiring that the estimates of their totals equal the known totals.

Calibration
The weights are calculated as $w_k = d_k g_k$, where
– $d_k$ is the initial weight, equal to the inverse of the inclusion probability $\pi_k$;
– $g_k$ is the final correction factor, which makes the sample estimates of the auxiliary variables equal to their known totals; it is obtained by solving the calibration equations $\sum_{k \in s} d_k g_k x_k = X$.

Calibration
The final weights are chosen to minimise $\sum_{k \in s} d_k \, G(w_k / d_k)$, where $G$ is an appropriate distance function, subject to the constraints on the auxiliary variables, $\sum_{k \in s} w_k x_k = X$, and possibly to bounds on $w_k / d_k$.

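Calibration
Distance function $G$ (as a function of $w/d$):
– Linear: $\frac{1}{2}(w/d - 1)^2$
– Raking ratio: $(w/d) \log(w/d) - w/d + 1$
– Truncated linear: the linear distance with $w/d$ restricted to lie between a lower and an upper bound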

Calibration
The calibration estimator equals GREG when the linear (Euclidean) distance function is chosen.
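The equivalence can be checked numerically. The sketch below assumes NumPy is available and uses the closed-form solution of linear calibration without bounds on the weights; the function name and the data are hypothetical.

```python
import numpy as np


def linear_calibration_weights(design_weights, aux, aux_totals):
    """Calibrated weights w_k = d_k * g_k under the linear distance function."""
    d = np.asarray(design_weights, dtype=float)   # shape (n,)
    x = np.asarray(aux, dtype=float)              # shape (n, p)
    t = np.asarray(aux_totals, dtype=float)       # shape (p,)
    # Lagrange multipliers: (sum_k d_k x_k x_k') lam = t - sum_k d_k x_k
    gram = x.T @ (d[:, None] * x)
    lam = np.linalg.solve(gram, t - d @ x)
    g = 1.0 + x @ lam                             # correction factors g_k
    return d * g


# Hypothetical data: an intercept column (calibrating on the population
# size) and one numerical auxiliary variable.
d = np.array([5.0, 5.0, 5.0, 5.0])
x_aux = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([3.0, 5.0, 8.0, 9.0])
totals = np.array([22.0, 60.0])

w = linear_calibration_weights(d, x_aux, totals)
calibrated_total = w @ y

# GREG computed directly, for comparison.
b_hat = np.linalg.solve(x_aux.T @ (d[:, None] * x_aux), x_aux.T @ (d * y))
greg_total = d @ y + (totals - d @ x_aux) @ b_hat
```

The calibrated weights reproduce both known totals, and the two estimates of the total of y agree up to rounding error. The linear distance does not prevent negative weights, which is one motivation for the bounded alternatives.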

Calibration
– All calibration estimators are asymptotically equivalent to GREG.
– They are approximately unbiased and consistent.
– Their sampling variance converges to the GREG variance.

Calibration
Software
– CLAN (Statistics Sweden)
– BASCULA (The Netherlands)
– GES (StatCan)
– ReGenesees (ISTAT), an R package
– A second R package, called ReGenesees.GUI, implements the presentation layer of the system: less experienced R users can benefit from its user-friendly graphical interface. Downloadable from Joinup.

Weighting, use of auxiliary variables and calibration
Planned modules in the HB
– Main theme module
– Calibration estimators
– Already available: GREG, portal.eu/content/generalised-regression-estimator

Small area estimation
Most national surveys are planned to produce accurate estimates at national level. Analyses at a finer level of disaggregation may not have the desired precision because the sample size is small or even zero.
A small area is a domain where the sample size is not sufficient to meet a pre-specified level of precision.

Small area estimation
Indirect estimators make use of what has been observed in other domains (or at other times).
– Traditional estimators: synthetic estimators, composite estimators
– Model based estimators: area level models, unit level models
With this class of estimators, extra information is gained in the estimation process by using observations outside the domain of interest, through the implicit (synthetic estimators) or explicit (model based estimators) use of models.

Small area estimation
Estimators that use information at the local level with a common regression coefficient beta:
– Modified direct estimator

Small area estimation
Synthetic estimators
– Simple case: it is assumed that small areas have the same mean as larger domains (at least within classes).
– Synthetic estimators can be based on different models (relationships between the variable of interest and the auxiliary variables): linear model; linear mixed model at unit level; linear mixed model at area level.
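The simple case can be sketched as follows, assuming that the number of population units of the small area in each class is known. The function name, the class labels and the data are hypothetical.

```python
from collections import defaultdict


def synthetic_area_total(y, design_weights, classes, area_class_sizes):
    """Synthetic estimate of a small-area total from large-domain class means."""
    wsum = defaultdict(float)
    ysum = defaultdict(float)
    # Class means are estimated from the whole sample, not from the area.
    for yk, d, g in zip(y, design_weights, classes):
        wsum[g] += d
        ysum[g] += d * yk
    # Each class mean is applied to the known size of that class in the area.
    return sum(n_ag * ysum[g] / wsum[g] for g, n_ag in area_class_sizes.items())


# Hypothetical whole-sample data in two size classes.
y = [10.0, 14.0, 20.0, 24.0]
d = [50.0, 50.0, 50.0, 50.0]
cls = ["small", "small", "large", "large"]
# Known population counts of the small area in each class.
area_sizes = {"small": 30.0, "large": 5.0}

area_total = synthetic_area_total(y, d, cls, area_sizes)
```

The estimate has a small variance because it borrows the whole sample, but it is biased whenever the area differs from the larger domain within classes.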

Small area estimation
Model based estimators
– Based on an area level model
– Based on a unit level model
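For orientation, the two model types are commonly written as below. This is a sketch in the standard notation of the small area literature (a Fay-Herriot type area level model and a nested error unit level model); the symbols are assumptions and are not taken from the slides. Here $\hat{\theta}_d$ is the direct estimate for area $d$, $\psi_d$ its sampling variance (treated as known) and $u_d$ the area random effect.

```latex
% Area level model, for areas d = 1, ..., D
\hat{\theta}_d = x_d' \beta + u_d + e_d, \qquad
u_d \sim (0, \sigma_u^2), \quad e_d \sim (0, \psi_d)

% Resulting predictor: a weighted average of the direct and synthetic parts
\tilde{\theta}_d = \gamma_d \hat{\theta}_d + (1 - \gamma_d) x_d' \hat{\beta},
\qquad \gamma_d = \frac{\sigma_u^2}{\sigma_u^2 + \psi_d}

% Unit level model, for unit k in area d
y_{dk} = x_{dk}' \beta + u_d + e_{dk}, \qquad
u_d \sim (0, \sigma_u^2), \quad e_{dk} \sim (0, \sigma_e^2)
```

In practice the variance component $\sigma_u^2$ is estimated from the data, and the weight $\gamma_d$ moves towards the synthetic part when the direct estimate is imprecise.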

Small area estimation
References in the HB: series-data

Small area estimation
Guidelines can be found at: portal.eu/sites/default/files//WP6-Report.pdf
Quality assessment: portal.eu/content/final-report-quality-assessment-sae-wp3
In practice:
– software-tools-sae-sae-wp4
– R codes from the ESSnet SAE project: portal.eu/content/r-codes-documentations-sae-wp4

Preliminary estimation
The methods for treating unit non-response may be applied: late responses are treated as non-response. To avoid biased estimates, however, the self-selection mechanism of quick respondents should not be regarded as random.

Preliminary estimation
Rao et al. (1989) proposed composite estimators that may improve on the standard estimator. The basic composite estimator is a weighted average of the preliminary estimate at time t and the final estimate at time t-1 adjusted for the difference between the preliminary estimates at times t and t-1. The weight is chosen on the basis of variances and covariances.
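One reading of this verbal description can be sketched as below. It is an illustration rather than the exact formula of Rao et al. (1989): the function name, the weight value and the figures are hypothetical, and in practice the weight would be derived from variances and covariances rather than fixed by hand.

```python
def composite_preliminary_estimate(prelim_t, prelim_prev, final_prev, weight):
    """Weighted average of the preliminary estimate and an adjusted previous final estimate."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    # Final estimate at t-1, carried forward by the change observed
    # between the two preliminary estimates.
    adjusted_final = final_prev + (prelim_t - prelim_prev)
    return weight * prelim_t + (1.0 - weight) * adjusted_final


# Hypothetical figures: the previous preliminary estimate was revised
# upwards from 100 to 103 once all responses were in.
composite = composite_preliminary_estimate(
    prelim_t=105.0, prelim_prev=100.0, final_prev=103.0, weight=0.6)
```

The second component corrects the current preliminary estimate by the revision observed in the previous period.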

Preliminary estimation
To reduce the revision error of the preliminary estimates, model based estimators can be considered. Rao, Srinath and Quenneville (1989) adopt a time series approach to preliminary estimation, formulated in terms of the preliminary estimate at time t, the final estimate and the measurement error of the preliminary estimate at time t.

Preliminary estimation
Under further model assumptions an estimator is derived; variants of it exist for the case in which auxiliary variables are available and for taking seasonality into account.

Preliminary estimation
– Design based
– Model based
– Sub-sampling

Choice of estimation methods
Quality indicators:
– Accuracy: the degree of closeness of estimates to the true values (bias, precision).
– Timeliness: the length of time between the event or phenomenon the statistics describe and their availability.
– Revision errors
– Coherence and comparability: coherence with other statistics.
Ref.: ESS Handbook for Quality Reports, Methodologies and Working Papers, 2009.

Choice of estimation methods
– Close relationship with the sampling design (e.g. weights)
– Choice of sampling strategy
Non-probabilistic sample design? E.g. cut-off sampling calls for model based estimators.
– The model simply assumes that the units cut off behave similarly to those in the sampled portion.
– Model assumptions should be analysed as far as possible.
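As an illustration of the cut-off case, a simple ratio model can be sketched as follows. It assumes that an auxiliary variable (for example a register value) is known for the whole population and that the ratio of y to x observed above the threshold also holds for the units cut off. The function name and the figures are hypothetical.

```python
def cutoff_ratio_total(y_observed, x_observed, x_total_population):
    """Model-based total under cut-off sampling, using a common y/x ratio."""
    # Ratio estimated only from the units above the cut-off threshold.
    ratio = sum(y_observed) / sum(x_observed)
    # Model assumption: the same ratio applies to the units cut off.
    return ratio * x_total_population


# Hypothetical data: three observed large units; x is known for everyone.
y_obs = [120.0, 200.0, 310.0]
x_obs = [100.0, 210.0, 290.0]
x_pop_total = 750.0

y_cutoff = cutoff_ratio_total(y_obs, x_obs, x_pop_total)
```

The estimate is only as good as the assumption about the unobserved units, which cannot be verified from the sample itself.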

Thank you for your attention