Variance Estimation in EU-SILC Survey

Slides:



Advertisements
Similar presentations
Calculation of Sampling Errors MICS3 Data Analysis and Report Writing Workshop.
Advertisements

Multiple Indicator Cluster Surveys Survey Design Workshop
Multiple Indicator Cluster Surveys Survey Design Workshop
1 Session 10 Sampling Weights: an appreciation. 2 To provide you with an overview of the role of sampling weights in estimating population parameters.
SADC Course in Statistics Sampling weights: an appreciation (Sessions 19)
Introduction Simple Random Sampling Stratified Random Sampling
Faculty of Allied Medical Science Biostatistics MLST-201
Estimates and sampling errors for Establishment Surveys International Workshop on Industrial Statistics Beijing, China, 8-10 July 2013.
Economics 105: Statistics Review #1 due next Tuesday in class Go over GH 8 No GH’s due until next Thur! GH 9 and 10 due next Thur. Do go to lab this week.
Evaluating Methods of Standard Error Estimation for Use with the Current Population Survey’s Public Use Data The Hawaii Coverage For All Technical Workshop.
Chapter 10: Sampling and Sampling Distributions
Multiple Indicator Cluster Surveys Survey Design Workshop
11 ACS Public Use Microdata Samples of 2005 and 2006 – How to Use the Replicate Weights B. Dale Garrett and Michael Starsinic U.S. Census Bureau AAPOR.
Complex Surveys Sunday, April 16, 2017.
Why sample? Diversity in populations Practicality and cost.
A new sampling method: stratified sampling
Section 5.1. Observational Study vs. Experiment  In an observational study, we observe individuals and measure variables of interest but do not attempt.
Chapter 14 Inequality in Earnings. Copyright © 2003 by Pearson Education, Inc.14-2 Figure 14.1 Earnings Distribution with Perfect Equality.
Formalizing the Concepts: Simple Random Sampling.
Thomas Songer, PhD with acknowledgment to several slides provided by M Rahbar and Moataza Mahmoud Abdel Wahab Introduction to Research Methods In the Internet.
PowerPoint Presentation Package to Accompany:
United Nations Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Amman, Jordan,
Complexities of Complex Survey Design Analysis. Why worry about this? Many government studies use these designs – CDC National Health Interview Survey.
Copyright 2010, The World Bank Group. All Rights Reserved. Agricultural Census Sampling Frames and Sampling Section A 1.
The Improvement of HBS in the Republic of Moldova European Conference on Quality in Official Statistics, Rome, Italy Lilian Galer,
Sampling Chapter 5.
1 Sampling for EHES Principles and Guidelines Johan Heldal & Susie Cooper Statistics Norway.
Definitions Observation unit Target population Sample Sampled population Sampling unit Sampling frame.
Multiple Indicator Cluster Surveys Survey Design Workshop Sampling: Overview MICS Survey Design Workshop.
Design Effects: What are they and how do they affect your analysis? David R. Johnson Population Research Institute & Department of Sociology The Pennsylvania.
Secondary Data Analysis Linda K. Owens, PhD Assistant Director for Sampling and Analysis Survey Research Laboratory University of Illinois.
1 Ratio estimation under SRS Assume Absence of nonsampling error SRS of size n from a pop of size N Ratio estimation is alternative to under SRS, uses.
Scot Exec Course Nov/Dec 04 Survey design overview Gillian Raab Professor of Applied Statistics Napier University.
1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.
Random Group Variance Adjustments When Hot Deck Imputation Is Used to Compensate for Nonresponse Richard A. Moore Company Statistics Division US Census.
ECONOMICS What Does It Mean To Me? Part VII: Issues and Policies in Microeconomics.
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Performance of Resampling Variance Estimation Techniques with Imputed Survey data.
Sampling Design and Analysis MTH 494 Lecture-30 Ossam Chohan Assistant Professor CIIT Abbottabad.
United Nations Regional Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Bangkok,
BUS216 Spring  Simple Random Sample  Systematic Random Sampling  Stratified Random Sampling  Cluster Sampling.
Poverty Estimation in Small Areas Agne Bikauskaite European Conference on Quality in Official Statistics (Q2014) Vienna, 3-5 June 2014.
A Comparison of Variance Estimates for Schools and Students Using Taylor Series and Replicate Weighting Ellen Scheib, Peter H. Siegel, and James R. Chromy.
ICCS 2009 IDB Workshop, 18 th February 2010, Madrid 1 Training Workshop on the ICCS 2009 database Weighting and Variance Estimation picture.
Introduction to Secondary Data Analysis Young Ik Cho, PhD Research Associate Professor Survey Research Laboratory University of Illinois at Chicago Fall,
Notes 1.3 (Part 1) An Overview of Statistics. What you will learn 1. How to design a statistical study 2. How to collect data by taking a census, using.
Chapter 6: 1 Sampling. Introduction Sampling - the process of selecting observations Often not possible to collect information from all persons or other.
Targeting of Public Spending Menno Pradhan Senior Poverty Economist The World Bank office, Jakarta.
1 Sampling Theory Dr. T. T. Kachwala. 2 Population Characteristics There are two ways in which reliable data or information for Population Characteristics.
Statistics Canada Citizenship and Immigration Canada Methodological issues.
Quality Analysis in a Survey on Transportation of Goods by Road Juris Breidaks University of Latvia Central Statistical Bureau of Latvia.
ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany Training Workshop on the ICCS 2009 database Weights and Variance Estimation picture.
CASE STUDY: NATIONAL SURVEY OF FAMILY GROWTH Karen E. Davis National Center for Health Statistics Coordinating Center for Health Information and Service.
Sampling technique  It is a procedure where we select a group of subjects (a sample) for study from a larger group (a population)
1 STAT 500 – Statistics for Managers STAT 500 Statistics for Managers.
Arun Srivastava. Variance Estimation in Complex Surveys Linearization (Taylor’s series) Random Group Methods Balanced Repeated Replication (BRR) Re-sampling.
Replication methods for analysis of complex survey data in Stata Nicholas Winter Cornell University
Chapter Eleven Sample Size Determination Chapter Eleven.
Sample Design of the National Health Interview Survey (NHIS) Linda Tompkins Data Users Conference July 12, 2006 Centers for Disease Control and Prevention.
1. 2 DRAWING SIMPLE RANDOM SAMPLING 1.Use random # table 2.Assign each element a # 3.Use random # table to select elements in a sample.
1 General Recommendations of the DIME Task Force on Accuracy WG on HBS, Luxembourg, 13 May 2011.
United Nations Regional Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Addis.
Adjustments to the survey design: Sampling
Sampling Why use sampling? Terms and definitions
Sampling: Stratified vs Cluster
Introduction to Survey Data Analysis
An Online CNN Poll.
Random sampling Carlo Azzarri IFPRI Datathon APSU, Dhaka
SASU manual: sampling issues
Marie Reijo, Population and Social Statistics
Presentation transcript:

Variance Estimation in EU-SILC Survey Mārtiņš Liberts Central Statistical Bureau of Latvia

Task To estimate sampling error for Gini coefficient estimated from social sample surveys (EU-SILC) Estimation of sampling errors for totals and ratios was also analised

Gini coefficient The Gini coefficient is a measure of inequality of a distribution, defined as the ratio of area between the Lorenz curve of the distribution and the curve of the uniform distribution, to the area under the uniform distribution It is often used to measure income inequality It is a number between 0 and 1, where 0 corresponds to perfect equality (e.g. everyone has the same income) and 1 corresponds to perfect inequality (e.g. one person has all the income, and everyone else has zero income)

Graphical representation of the Gini coefficient

Estimates of Gini Index Income inequality by Gini index Estimated from HBS

World map of the Gini coefficient

EU-SILC Target population – population of Latvia living in private households Main variables of interest – income at household and individual level Yearly survey – organised once in a year Started in 2005 in Latvia

Sampling Design Two-stage sampling design The first stage – stratified systematic pps sampling of census counting areas The second stage – simple random sampling of households (addresses) All individuals belonging to selected household are surveyed

First stage sampling The list of census counting areas was created for the last population census in 2000 (4279 areas) Census counting area is relatively small geographical area The size of area is defined by number of households in area Areas are stratified in four strata by urbanisation degree (Riga, 6 other cities, towns and rural areas)

Estimation of sampling errors Re-sampling methods are used Dependent random group (DRG) method Jackknife method Methods are used at the PSU level Both methods use the same resampling mechanisms – by dividing the whole sample in non-overlapping sub-samples

DRG and Jackknife From each sub-sample the parameter  is estimated The estimate of variance The parameter  is estimated by deleting each sub-sample The estimate of variance (in case of stratified sampling)

Resampling Resampling can be done in two ways: Using the same sampling scheme Using randomisation

Randomisation PSUs can be grouped in sub-samples in random order In this case variance estimate will differ each time the variance estimator is applied

Linearization In case of complex statistics the variance estimator becomes also more complex compared to variance estimator of total The linearization technique can be applied for complex statistics to get approximate variance estimate The goal of linearization is to find zi for each unit in the sample so that variance of estimate of parameter  could be approximated by

Linearization In case of differentiable functions (for example Ratio of two totals) the expansion of estimator in Taylor series can be applied to linearize the estimator In case of non-differentiable functions the expanded theory (Deville, 1999) can be used

Linearization of Gini Index Gini Index can be linearized by

Program The program is written in SPSS mainly using the macro commands It is possible to estimate sampling error for arbitrary two stage sampling design using DRG un Jackknife methods at PSU level Sampling error is estimated for totals (SUM), ratios of two totals (RATIO) and Gini index (GINI)

Features of Program The nonresponse correction at user defined groups Poststratification of weights Linearization of RATIO and GINI Selection of number of subgroups PSU ordering in random or user-defined order PSU grouping Estimation of parameters for sampling units or sublevel units

Procedures in program !linrat – linearization of RATIO !lingini – linearization of GINI !estim – estimator !weight – weighting !e_tion – estimation !proc – basic procedure !proc_u – main procedure

Parameters File – sample data file Strata – stratification variable Psu – PSU variables Diz_sv – variable for design weights Meth – method of estimation of sampling errors E_tor – estimators Lin – linearization Div – numbers of subgroups Other parameters

Example !proc_u dir="C:\DRG\SILC\files" file="C:\DRG\SILC\SILC2005_data_ver02.sav" p_file="C:\DRG\SILC\dem_info.sav" strata=prl / psu=atk iecirk / hh_id=db030 / per_sk=per_sk diz_sv=diz_sv resp=resp resp_gr=atk iecirk / p_gr=prl / p_var=per_sk p_tot=iedz_sk meth=drg jack / rorder=0 / repeat=1 psu_gr=sel_nr / order=sel_nr / div=2 4 8 12 16 / e_tor=sum ratio gini / lin=0 / level=P / eqscale per_sk / var=hh07n hs13n / fast=1.

Variables SILC Total housing cost Lowest monthly income to make ends meet

Estimates of parameters

Results SILC (Gini)

Results SILC

Random ordering The stability (variance) of variance estimates was analysed Both methods works similarly and higher number of subgroups gives more stable results

Linearization

Conclusions The DRG and JACK methods gives similar results The variance estimates are dependent on ordering of PSUs Higher number of subgroups gives more stable results Linearization can be used to simplify variance estimation