GENEralised software for Sampling Estimates and Errors in Surveys (GENESEES V. 3.0) Piero Demetrio Falorsi - Salvatore Filiberti Istat Structural Business.

Presentation transcript:

GENEralised software for Sampling Estimates and Errors in Surveys (GENESEES V. 3.0) Piero Demetrio Falorsi - Salvatore Filiberti Istat Structural Business Statistics

Slide 2: Summary of the presentation
- Objectives and evolution of the software
- Software installation pre-requisites
- Data needed for Genesees
- Input data sets (characteristics, controls)
- Output: tables, file formats, data sets
- Structural Business Statistics (SBS) surveys using Genesees
- Population of interest - Business Register
- Domains of interest
- SME Sampling strategy (current)
- Variables of interest
- Case study

Slide 3: Objectives and evolution of the software (1/2)
- Need to estimate variables of interest for social and economic statistics
- Guarantee coherence among estimates in time and space
- Improve the quality of the data produced (for example, in accordance with the SBS Council Regulation)
- Methodology: calibration estimators (Deville and Särndal, 1992)
- Implemented by P.D. Falorsi and S. Falorsi
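For reference, the calibration problem of Deville and Särndal (1992) can be written as below. The notation is the usual textbook one and does not come from the slides: s is the set of respondents, d_k the initial weight, w_k the final weight, x_k the vector of auxiliary variables, t_x the vector of known totals, and q_k an optional unit-level constant. Only the linear (chi-square distance) case is shown; which distance functions Genesees offers is not stated here.

```latex
\min_{\{w_k\}} \sum_{k \in s} G_k(w_k, d_k)
\quad \text{subject to} \quad
\sum_{k \in s} w_k \mathbf{x}_k = \mathbf{t}_x .

% Linear case: G_k(w_k, d_k) = \frac{(w_k - d_k)^2}{2\, d_k q_k}
w_k = d_k \left( 1 + q_k\, \mathbf{x}_k^{\top} \boldsymbol{\lambda} \right),
\qquad
\boldsymbol{\lambda} =
\Big( \sum_{k \in s} d_k q_k\, \mathbf{x}_k \mathbf{x}_k^{\top} \Big)^{-1}
\Big( \mathbf{t}_x - \sum_{k \in s} d_k \mathbf{x}_k \Big).
```

The factor multiplying d_k is the correction factor of the initial weight; by construction the final weights reproduce the known totals exactly.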

Slide 4: Objectives and evolution of the software (2/2)
- Genesees prototype for social statistics
- Genesees prototype for enterprise statistics (1992 as first reference year)
- Other Istat researchers have since contributed to the development of the software
- New releases are delivered regularly
- Genesees is currently used for estimation in almost all Istat surveys

Slide 5: Software installation pre-requisites
- SAS for Windows
- SAS Language, Macro, IML, Stat, Graph
- HD ≥ 4 MB; RAM ≥ 64 MB

How to download Genesees:
- [web address missing], then select: "Metodi e Software per le indagini statistiche" (Methods and software for statistical surveys)
- download and then unzip the file "Genesees3.zip" into the directory c:\Genesees
- write to [address missing] for the starting password
- [contact missing] will inform you about the new releases of the software

Slide 6: Data needed for Genesees
- Frame (example: Business Register) → provides the known totals of the auxiliary variables, used as the reference structure
- Survey respondent units → used to compute the correction factor of the initial sampling weight and then to assign the final sampling weight to each unit

Slide 7: Input data sets (characteristics)
Input SAS data sets: "Noti" and "Inp"
- "Noti" (variable names ≤ 8 characters):
  - Planned population = domain of interest (alphanumeric variable; ≤ 15 characters)
  - Totals of auxiliary variables (numeric variables; at least 1 variable)
- "Inp" (variable names ≤ 8 characters):
  - Id. Code (numeric variable)
  - Planned population (as in "Noti")
  - Auxiliary variables (numeric variables; must be supplied in the same order as in "Noti")
  - Coef = initial weight, adjusted for unit non-response (numeric variable)
  - Ck = "distance weight" (numeric variable; optional)
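The role of these inputs can be illustrated with a minimal Python sketch of linear calibration for a single planned population. It is not the Genesees code (which is written in SAS): the function names and the toy data are hypothetical, the linear distance is only one of the options described by Deville and Särndal (1992), and Ck is assumed to be 1 for every unit.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: auxiliary variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_calibration(coef, aux, known_totals):
    """Return (correction_factors, final_weights) for one domain.

    coef: initial weights (the "Coef" variable of "Inp")
    aux: one list of auxiliary values per respondent, ordered as in "Noti"
    known_totals: totals of the auxiliary variables taken from "Noti"
    Assumption: Ck = 1 for every unit.
    """
    p = len(known_totals)
    # Direct estimates of the auxiliary totals, using the initial weights
    direct = [sum(d * x[j] for d, x in zip(coef, aux)) for j in range(p)]
    # Weighted cross-product matrix of the auxiliary variables
    a = [[sum(d * x[i] * x[j] for d, x in zip(coef, aux)) for j in range(p)]
         for i in range(p)]
    lam = solve_linear_system(a, [t - e for t, e in zip(known_totals, direct)])
    factors = [1.0 + sum(l * xj for l, xj in zip(lam, x)) for x in aux]
    weights = [d * g for d, g in zip(coef, factors)]
    return factors, weights


# Toy data (hypothetical): auxiliary variables are a constant 1 (number of
# enterprises) and the number of persons employed.
coef = [10.0, 10.0, 20.0, 20.0]
aux = [[1, 5], [1, 8], [1, 12], [1, 20]]
known_totals = [65, 900]
factors, weights = linear_calibration(coef, aux, known_totals)
```

After the call, the final weights sum to the known number of enterprises and the weighted sum of persons employed equals its known total, which is the constraint the correction factors are designed to satisfy.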

Slide 8: Input data sets (controls)
- "Noti":
  - Planned population missing (.) → procedure stops → data set "Noti-miss"
  - Total of an auxiliary variable missing (.) → set to 0
- "Inp":
  - Id. Code missing (.) → procedure stops → data set "Missing"
  - Id. Code duplicated → data set "Codici-doppi"
  - Auxiliary variable missing (.) → set to 0
  - Coef missing (.) → set to 1 (no controls)
  - Ck missing (.) → set to 1
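The controls listed on this slide can be sketched in Python as follows. This is an illustration of the rules only, not the SAS implementation: the record layout (dictionaries with `id`, `domain`, `aux`, `coef`, `ck`, `totals` keys), the exception class and the function names are all hypothetical, and `None` stands in for the SAS missing value (.).

```python
class ProcedureStopped(Exception):
    """Raised where the slide says the procedure stops."""

    def __init__(self, dataset, records):
        super().__init__(f"procedure stopped; see data set '{dataset}'")
        self.dataset = dataset  # name of the diagnostic data set
        self.records = records  # offending records


def zero_if_missing(values):
    """Replace missing auxiliary values or totals with 0."""
    return [0 if v is None else v for v in values]


def check_noti(rows):
    """Apply the "Noti" controls and return the cleaned rows."""
    bad = [r for r in rows if r["domain"] is None]
    if bad:
        raise ProcedureStopped("Noti-miss", bad)
    return [{"domain": r["domain"], "totals": zero_if_missing(r["totals"])}
            for r in rows]


def check_inp(rows):
    """Apply the "Inp" controls; return (cleaned rows, duplicated-id rows)."""
    bad = [r for r in rows if r["id"] is None]
    if bad:
        raise ProcedureStopped("Missing", bad)
    seen, doubles = set(), []
    for r in rows:
        if r["id"] in seen:
            doubles.append(r)  # would be written to "Codici-doppi"
        seen.add(r["id"])
    cleaned = [
        {
            "id": r["id"],
            "domain": r["domain"],
            "aux": zero_if_missing(r["aux"]),
            "coef": 1 if r.get("coef") is None else r["coef"],
            "ck": 1 if r.get("ck") is None else r["ck"],
        }
        for r in rows
    ]
    return cleaned, doubles


# Hypothetical records: a missing auxiliary value, missing weights and a
# duplicated identification code.
inp = [
    {"id": 1, "domain": "A", "aux": [1, None], "coef": None, "ck": None},
    {"id": 1, "domain": "A", "aux": [1, 7], "coef": 12.5, "ck": 2},
]
cleaned, doubles = check_inp(inp)
```

The slide does not say whether a duplicated Id. Code stops the procedure, so the sketch only collects the duplicates and carries on.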

Slide 9: Output tables
Output tables (summary descriptive statistics on the calibration estimation process):
- Table 1: Statistics on estimates and final weights for the planned population
- Table 2: Statistics on the correction factors of the initial weights
- Table 3: Statistics on estimates and initial weights
- Table 4: Prefixed parameters for the iterative estimation procedure
- Table 5: Known totals, direct and final estimates, and differences
- Tabulate 1: Controls on the domains: known totals, direct estimates, ratios between known totals and direct estimates, sample totals
- Tabulate 2: Sample size (respondents) and population estimate with direct weights
- Tabulate 3: Controls on domains without sample units

Slide 10: Output file formats
- "genesees.log" (SAS log)
- "stampa1.txt" to "stampa6.txt" (tables)
- "stampe stime.htm" (tables)
- SAS data sets ("*.sas7bdat")

Slide 11: Output data sets (1/2)
Diagnostics (errors detected in the input step, if any):
- "missing": Id. Code missing (.)
- "noti-miss": planned population missing (.)
- "vuoti": domain is present in "Noti" but not in "Inp"
- "codici-doppi": Id. Code duplicated
- "csenzat": domain is present in "Inp" but not in "Noti"
- "savestime": shows the parameters supplied
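The two domain-coverage diagnostics amount to set differences between the domains listed in "Noti" and those found in "Inp". A minimal Python sketch, with a hypothetical function name and made-up domain codes:

```python
def domain_diagnostics(noti_domains, inp_domains):
    """Compare the domains of "Noti" (known totals) and "Inp" (respondents)."""
    noti, inp = set(noti_domains), set(inp_domains)
    return {
        "vuoti": sorted(noti - inp),    # known totals but no respondents
        "csenzat": sorted(inp - noti),  # respondents but no known totals
    }


# Hypothetical domain codes
diagnostics = domain_diagnostics(
    noti_domains=["D01", "D02", "D03"],
    inp_domains=["D02", "D03", "D03", "D04"],
)
```

Here "D01" would be reported in "vuoti" and "D04" in "csenzat"; repeated domain codes in "Inp" are expected, since each domain normally contains many respondents.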

Slide 12: Output data sets (2/2)
Statistics and final weights:
- "Pesifin": initial weight; correction factor; final weight; id.; conta; domain
- "stat": conta; max; min; sum; mean; var; cv (for initial weights, correction factors and final weights); iterations; maxiter; converge; constraints (c2); sample units in the domain (r2); distance function
- "stimedir": domain; totals of auxiliary variables; conta
- "stimefin": known total; direct estimate; final estimate; conta; difference between final estimate and known total

Slide 13: Structural Business Statistics (SBS) Surveys using Genesees (1/2)
- Small and Medium Enterprises (SME) Survey
- Information and Communication Technologies (ICT) Survey
- Structure of Earnings Survey (SES)
- Labor Cost Survey (LCS)
- Prodcom
- SBS Preliminary Estimates
- …

Slide 14: Structural Business Statistics (SBS) Surveys using Genesees (2/2)
Estimation of economic variables on enterprises according to:
- Istat traditional data production on enterprises
- Structural Business Statistics (SBS) EU Council Regulation No 58/97:
  - Preliminary estimates (1 estimation domain; t + 10 months)
  - Final estimates (3 estimation domains; t + 18 months)
  - Quality indicators and specific reports (3 estimation domains; t + 24 months): coefficient of variation, CV (3 domains); item and unit non-response rates (1 domain); specific reports on survey strategy and principal economic activity
t = reference year

Slide 15: Population of interest (1/2)

Number of Italian enterprises (SBS 2002), by economic activity sector (NACE Rev.1 Division) and number of persons employed (five size classes; column labels missing)

Sector | Size classes by number of persons employed (five columns) | Total
Manufacturing (10-41) | 463,052 | 54,359 | 26,229 | 10,506 | 1,553 | 555,699
Constructions (45) | 513,156 | 18,403 | 5,124 | 1,[…] | […] | […],900
Services (50-74) | 2,555,631 | 49,268 | 17,405 | 6,597 | 1,261 | 2,630,162
Total | 3,531,839 | 122,030 | 48,758 | 18,242 | 2,892 | 3,723,761

Number of persons employed in Italian enterprises (SBS 2002), by economic activity sector (NACE Rev.1 Division) and number of persons employed (five size classes; column labels missing)

Sector | Size classes by number of persons employed (five columns) | Total
Manufacturing (10-41) | 1,230,838 | 730,888 | 774,903 | 951,764 | 1,176,772 | 4,865,165
Constructions (45) | 1,045,386 | 237,915 | 145,778 | 98,091 | 47,850 | 1,575,020
Services (50-74) | 4,464,548 | 643,597 | 517,363 | 634,725 | 1,459,231 | 7,719,464
Total | 6,740,772 | 1,612,400 | 1,438,044 | 1,684,580 | 2,683,853 | 14,159,649

Slide 16: Population of interest (2/2)

Slide 17: Business Register ASIA
- Data sources: Tax Register, Chambers of Commerce, Social Security, Work Accident Insurance, Electric Power Board, SEAT telephone directory
- Statistical and probabilistic procedure for detecting each enterprise's main economic activity
- Variables in the register are the result of standardization, normalization and integration of information provided by administrative sources

Slide 18: Domains of study (SBS final estimates)

Code | Type of domain (partition of the population of interest) | Number of domains (in the partition)
DOM1 | NACE Rev.1.1 Class (4-digit) | 461
DOM2 | NACE Rev.1.1 Group (3-digit) by size class | 1,047
DOM3 | NACE Rev.1.1 Division (2-digit) by region | 984

Slide 19: SME Sampling strategy (current)
- N ≈ 3,723,000 enterprises (Business Register); enterprises with fewer than 10 persons employed account for 94.8% of all enterprises and 47.8% of total employment
- Stratified simple random sample
- H ≈ 26,000 strata (NACE Rev.1.1, size class, region)
- n ≈ 120,000 (negative coordination with other SBS surveys; multivariable and multidomain sample allocation)
- Survey technique: postal questionnaire; 2 call-backs
- Calibration estimators methodology (Deville and Särndal, 1992)
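Under a stratified simple random sample, one common way to obtain an initial weight adjusted for unit non-response is to divide the number of frame units in each stratum by the number of respondents in that stratum. The sketch below assumes this adjustment (response treated as uniform within a stratum); the slides do not state how the SME survey actually computes it, and the stratum labels and counts are hypothetical.

```python
def initial_weights(frame_counts, respondent_counts):
    """Return ({stratum: N_h / r_h}, [strata without respondents]).

    frame_counts: number of frame units N_h per stratum
    respondent_counts: number of respondents r_h per stratum
    """
    weights, empty = {}, []
    for stratum, n_frame in frame_counts.items():
        r = respondent_counts.get(stratum, 0)
        if r > 0:
            weights[stratum] = n_frame / r
        else:
            empty.append(stratum)  # no respondents: needs separate treatment
    return weights, empty


# Hypothetical strata keyed by (activity, size class, region)
frame_counts = {
    ("act1", "size1", "reg1"): 1200,
    ("act1", "size2", "reg1"): 90,
    ("act2", "size1", "reg2"): 15,
}
respondent_counts = {
    ("act1", "size1", "reg1"): 48,
    ("act1", "size2", "reg1"): 30,
}
weights, empty = initial_weights(frame_counts, respondent_counts)
```

With these counts each respondent in the first stratum represents 25 enterprises and each respondent in the second represents 3; the third stratum has no respondents and is returned separately rather than given a weight.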

Slide 20: Variables of interest
- Turnover
- Value added at factor cost
- Employment
- Total purchases of goods and services
- Personnel costs
- Wages and salaries
- Production value
- …
Totals of the variables of study are estimated for subpopulations of interest (domains), as required by the SBS EU Regulation.

Slide 21: Case study

Slide 22: Starting picture

Slide 23: Thank you!