Sampling Error Estimation – SORS practice
Rudi Seljak, Petra Blažič
Statistical Office of the Republic of Slovenia

Content of the presentation
– Introduction to the problem
– Application for sampling error estimation: basic principles
– Short description of the application
– Discussion

Introduction to the problem
In sample surveys, the standard error is still the most indicative “accuracy” indicator.
The producer of official statistics is obliged to provide at least some information on the level of accuracy together with the disseminated statistics.
Two main challenges:
– How to estimate the standard error correctly and in time for the full set of disseminated results
– How to present these errors to a wide range of users in a clear and understandable way

Standard error estimation at SORS
(Not so distant) past:
– Calculation of the sampling error was quite “survey dependent” → each survey had its own system
– Direct estimation only for the key statistics and the key domains → models for the other statistics and (sub)domains
– Results with a lower degree of precision were marked, and the coefficient of variation was the “exclusive” criterion used
Significant revision of the system a few years ago:
– General rules were set up for sampling error estimation
– New rules were set up for dissemination and presentation
– A special (SAS) application was built that incorporates all of the above rules

Application – general principles
The application calculates the standard error for seven types of statistics.
The application can be used for most of the statistics produced at SORS, with a few exceptions:
– EU-SILC (Laeken) indicators (separate SAS macro)
– Indices (separate SAS macro)
The application performs aggregation, standard error calculation and, if needed, flagging of results with special signs.

Application – general principles (cont’d)
The application “merges” aggregation, sampling error estimation and tabulation into one fully automated process.
It is designed as a metadata-driven (MDD) system → the parameters for a particular survey are provided outside the core program code.
The application uses the following software:
– The core (processing) part is built in the SAS environment, using the PROC SURVEYMEANS “facilities”
– The metadata are (for now) stored in an Access database
– Outputs are provided as Excel tables

Application – technical description
Hypothetical example: survey on internet usage in enterprises, stratified one-stage sample.
Input variables:
– Emp … Number of employees
– Turn … Turnover
– Wpage … Does the enterprise have a webpage (yes/no)
– Nace2 … NACE 2-digit group
– Nace3 … NACE 3-digit group
– SizeC … Size class
Output statistics:
– STAT01 … Proportion of enterprises with a webpage
– STAT02 … Total turnover in enterprises with a webpage
– STAT03 … Turnover per employee in enterprises with a webpage
Dissemination needed for the following domains:
– NACE 2-digit group
– NACE 2-digit group * Size class
Strata:
– NACE 3-digit group * Size class

Metadata tables – description of the statistics
Table   Stat_code  Stat_desc                                            Type  Dummy    Variable  Variable_en  Variable_den
Table1  STAT01     Proportion of enterprises with a webpage             02    Dummy01
Table1  STAT02     Total turnover in enterprises with a webpage         03             Var02
Table1  STAT03     Turnover per employee in enterprises with a webpage  05                       Var02        Var03
Type … Type of statistic: 02 - Proportion, 03 - Total, 05 - Ratio
Dummy … Name of the dummy variable (values 0, 1) needed for the calculation of the proportion
Variable … Name of the variable required for the calculation of the total
Variable_en … Name of the numerator variable required for the calculation of the ratio
Variable_den … Name of the denominator variable required for the calculation of the ratio
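
To make the three statistic types concrete, the following sketch shows the usual weighted estimators for a proportion, a total and a ratio. It is written in Python purely for illustration and is not the SAS code of the application; the function `estimate`, the `weight` field and the three sample units are assumptions made for this example.

```python
# Illustrative only: textbook weighted estimators, not the SORS SAS code.
# The "weight" field and the sample units below are hypothetical.

def estimate(stat_type, units, dummy=None, variable=None, var_en=None, var_den=None):
    """Return the weighted point estimate for one row of the statistics table."""
    def wsum(name):
        # Weighted sum of one variable over all sampled units.
        return sum(u["weight"] * u[name] for u in units)

    if stat_type == "02":   # proportion: weighted share of units with dummy = 1
        return wsum(dummy) / sum(u["weight"] for u in units)
    if stat_type == "03":   # total: weighted sum of the variable
        return wsum(variable)
    if stat_type == "05":   # ratio: weighted numerator over weighted denominator
        return wsum(var_en) / wsum(var_den)
    raise ValueError(f"unsupported statistic type: {stat_type}")

units = [
    {"weight": 2.0, "Dummy01": 1, "Var02": 100.0, "Var03": 10},
    {"weight": 2.0, "Dummy01": 0, "Var02": 0.0, "Var03": 0},
    {"weight": 4.0, "Dummy01": 1, "Var02": 50.0, "Var03": 5},
]

stat01 = estimate("02", units, dummy="Dummy01")
stat02 = estimate("03", units, variable="Var02")
stat03 = estimate("05", units, var_en="Var02", var_den="Var03")
print(stat01, stat02, stat03)
```

The keyword arguments mirror the metadata columns, so one metadata row is enough to select both the estimator and its input variables.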

Metadata tables – derived variables
Table   Var_name  Condition        Value
Table1  Dummy01   If Wpage='yes'   1
Table1  Dummy01   If Wpage='no'    0
Table1  Var02     If Wpage='yes'   Turn
Table1  Var02     If Wpage='no'    0
Table1  Var03     If Wpage='yes'   Emp
Table1  Var03     If Wpage='no'    0
Var_name … Name of the derived variable
Condition … Condition that determines the units to which the rule applies
Value … Value of the derived variable
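
A minimal sketch of how such rule rows could be applied to a single unit record. Python is used only for illustration (the application itself is built in SAS); the tuple layout of `RULES` and the function `derive` are hypothetical.

```python
# Hypothetical in-memory copy of the derived-variables metadata table.
# Each rule: (derived variable, condition variable, condition value, result).
# A string result names an input variable; a number is used literally.
RULES = [
    ("Dummy01", "Wpage", "yes", 1),
    ("Dummy01", "Wpage", "no", 0),
    ("Var02", "Wpage", "yes", "Turn"),
    ("Var02", "Wpage", "no", 0),
    ("Var03", "Wpage", "yes", "Emp"),
    ("Var03", "Wpage", "no", 0),
]

def derive(unit, rules=RULES):
    """Return a copy of `unit` extended with the derived variables."""
    out = dict(unit)
    for name, cond_var, cond_val, result in rules:
        if unit.get(cond_var) == cond_val:
            # Copy the referenced input variable, or store the literal value.
            out[name] = unit[result] if isinstance(result, str) else result
    return out

with_page = derive({"Emp": 12, "Turn": 340.0, "Wpage": "yes"})
without_page = derive({"Emp": 5, "Turn": 80.0, "Wpage": "no"})
print(with_page)
print(without_page)
```

Keeping the rules as data rather than code is what lets a new survey be configured without touching the processing program.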

Metadata tables – domains
Table   Domain_code  Dom_var1  Dom_var2  …  Dom_var10
Table1  Dom1         Nace2
Table1  Dom2         Nace2     SizeC
Dom_var1 … Dom_var10: list of the variables that define the dimensions of the domain.
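
The domain rows can be read as "group the units by these variables". The sketch below illustrates that reading in Python; the dictionary `DOMAINS`, the function `split_by_domain` and the four sample units are assumptions for this example, not part of the application.

```python
from collections import defaultdict

# Hypothetical in-memory copy of the domain metadata table.
DOMAINS = {"Dom1": ["Nace2"], "Dom2": ["Nace2", "SizeC"]}

def split_by_domain(units, dom_vars):
    """Group units into domain cells keyed by the values of `dom_vars`."""
    cells = defaultdict(list)
    for unit in units:
        key = tuple(unit[v] for v in dom_vars)
        cells[key].append(unit)
    return dict(cells)

# Made-up units; only the domain variables are shown.
units = [
    {"Nace2": "26", "SizeC": 1},
    {"Nace2": "26", "SizeC": 2},
    {"Nace2": "26", "SizeC": 2},
    {"Nace2": "33", "SizeC": 3},
]

# Number of sampled units in every cell of every domain.
counts = {
    code: {cell: len(members)
           for cell, members in split_by_domain(units, dom_vars).items()}
    for code, dom_vars in DOMAINS.items()
}
print(counts)
```

Each statistic would then be estimated once per cell, which is how a single metadata row expands into a whole column of the output table.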

Metadata tables – sample design information
Information on the sample design (strata, PSU):
Table   Strata  PSU
Table1  Nace3
Table2  SizeC
Information on the sampling rate by strata cells:
Table   Nace3  SizeC  _rate_
…
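
As an illustration of why the sampling rate is stored per stratum cell, the sketch below uses the textbook formula for a total under stratified simple random sampling without replacement: the rate gives both the expansion factor (1 / rate) and the finite population correction (1 - rate). This is a Python illustration under those assumptions, not necessarily the exact computation performed by the application; the function name, strata labels and values are made up.

```python
from statistics import variance

def stratified_total(sample, rates):
    """Estimate a total and its standard error.

    sample: {stratum: [observed y values]}
    rates:  {stratum: sampling rate f_h = n_h / N_h}
    """
    total, var = 0.0, 0.0
    for stratum, ys in sample.items():
        f = rates[stratum]
        n = len(ys)
        N = n / f                      # population size implied by the rate
        total += N * sum(ys) / n       # expanded stratum total
        if n > 1:                      # simplification: skip single-unit strata
            var += N ** 2 * (1 - f) * variance(ys) / n
    return total, var ** 0.5

# Hypothetical strata (Nace3, SizeC) with made-up turnover values.
sample = {("26.1", 1): [10, 20, 30], ("26.1", 3): [5, 7]}
rates = {("26.1", 1): 0.5, ("26.1", 3): 1.0}

total, se = stratified_total(sample, rates)
cv = se / total
print(total, se, cv)
```

The fully enumerated stratum (rate 1.0) contributes to the total but not to the variance, which is the behaviour one expects for take-all size classes.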

Metadata tables – other information
– Type of criterion used for flagging statistics with lower precision
– Limits for flagging statistics with lower precision
– Formats of the results in the final tables (decimals, percentages, …)
– Form and content of the output tables
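
A sketch of how a flagging rule based on the coefficient of variation might look. The limits 0.10 and 0.30 are placeholders, not limits stated in the presentation, and the reading of the signs (M appended to a less precise value, N replacing a value that is too imprecise) is an assumption based on the output examples. The function `disseminate` is hypothetical and written in Python for illustration only.

```python
def disseminate(value, se, limit_m=0.10, limit_n=0.30, decimals=1):
    """Return the text to publish for one estimate.

    limit_m and limit_n are hypothetical CV limits read from metadata.
    """
    # Coefficient of variation; treat a zero estimate as infinitely imprecise.
    cv = se / abs(value) if value else float("inf")
    text = f"{value:.{decimals}f}"
    if cv > limit_n:
        return "N"            # assumed: value is replaced by the sign
    if cv > limit_m:
        return text + " M"    # assumed: value is kept and marked
    return text

flagged = disseminate(45.5, 6.0)             # CV about 0.13
plain = disseminate(578, 20.0, decimals=0)   # CV about 0.03
suppressed = disseminate(10.0, 5.0)          # CV 0.5
print(flagged, plain, suppressed)
```

Because the limits and the number of decimals are parameters, they can be supplied per survey from the metadata instead of being fixed in the program.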

Output – “raw results”
Each row of the table gives the information on one aggregate.
Dom1   Dom_val1  Dom2   Dom_val2  …  Stat_code  Value  No. of units  SE  CV  Stat_diss
Nace2  26.2                          Stat…      …      …             …   …   …
Nace2  26.3                          Stat…      …      …             …   …   …_M
Nace2  33.3      SizeC  3            Stat…      …      …             …   …   N
Dom1 … Dom_val2: specification of domains
Stat_code: identification of statistics
Value, No. of units, SE, CV: information on estimated statistics
Stat_diss: value to be disseminated

Output – formatted tables
Columns: Proportion of enterprises with a webpage; Total turnover in enterprises with a webpage; Turnover per employee in enterprises with a webpage
Rows: NACE 2-digit groups
Cells hold the values to be disseminated, including entries marked M or N (e.g. 45.5, 578, …)

Conclusions
The application is an important contribution to the modernization of statistical processes.
It can be managed by subject-matter personnel alone → significant rationalization of survey execution.
Planned improvements:
– Development of user interfaces for metadata management
– Transfer of the metadata database into the ORACLE environment
– Extension of the application with the possibility to estimate the sampling error for indices