
Model of transformation administrative data to statistical data Data used in Population and Housing Census 2011 – examples Janusz Dygaszewicz and Paweł Murawski Central Statistical Office POLAND

Outline
1. Purpose of the work on administrative sources
2. Data quality
3. Extract data
4. Transform data
5. Summary

Registers - data acquisition
Data owners: Ministry of Finance, Ministry of Interior and Administration, Ministry of Justice, Agricultural Social Insurance Fund, National Health Fund, Agency for Restructuring and Modernisation of Agriculture, Agricultural and Food Quality Inspection, Agency for Geodesy and Cartography, State Fund for Rehabilitation of Disabled Persons, County Offices, Commune Offices, Regional Offices, Telecoms, Energy Suppliers, Office for Foreigners, Social Insurance Institution, Housing Managers

Purpose of the work on administrative data
Obtaining a sufficiently complete data set (subjective and objective completeness) that corresponds to classification standards, definitions and basic categories, and thus allows the effective use of administrative data.

Data quality -measures-
1. Measuring the quality of administrative registers:
- timeliness of data
- methodological compatibility
- completeness
- identification standards used in the registry
- usefulness
- compatibility of data in administrative sources with data obtained in the study/survey
2. Measuring the quality of processing of register data:
- excessive coverage error rate
- incomplete coverage error rate
- subjective indicator of completeness
- objective indicator of completeness
- imputation rate
- data correction index
- index of integration of data from various sources
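
As a rough illustration of the processing indicators listed above, the sketch below computes coverage and imputation rates. The presentation does not define the indicators, so the formulas, function names and sample identifiers here are assumptions for illustration only.

```python
def coverage_rates(register_ids, target_ids):
    """Return (over-coverage rate, under-coverage rate).

    Assumed definitions (not given in the slides):
    - over-coverage: share of register units outside the target population,
    - under-coverage: share of target units missing from the register.
    """
    register_ids, target_ids = set(register_ids), set(target_ids)
    over = len(register_ids - target_ids) / len(register_ids)
    under = len(target_ids - register_ids) / len(target_ids)
    return over, under


def imputation_rate(values, imputed_flags):
    """Share of values that were imputed (assumed definition)."""
    return sum(imputed_flags) / len(values)


register = {"A1", "A2", "A3", "A4"}        # hypothetical unit IDs in a register
target = {"A2", "A3", "A4", "A5", "A6"}    # hypothetical target population
over, under = coverage_rates(register, target)
print(f"over-coverage: {over:.2f}, under-coverage: {under:.2f}")
```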

Extract data
- consolidation of data from various source systems, delivered in different data formats,
- extraction of the data into the production environment based on SAS software,
- conversion of the data into one format suitable for processing (SAS tables),
- validation of the imported data structure, which is an integral part of this process.
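
The extraction step can be sketched outside SAS as well. The Python example below is only an illustration under stated assumptions: the column names, the two input formats and the helper names are hypothetical, and a list of dictionaries stands in for the SAS tables used in production.

```python
import csv
import io
import json

EXPECTED_COLUMNS = ["person_id", "name", "city"]  # hypothetical target structure


def read_csv_source(text, delimiter=";"):
    """Parse delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json_source(text):
    """Parse a JSON array of objects into a list of row dictionaries."""
    return json.loads(text)


def validate_structure(rows, expected=EXPECTED_COLUMNS):
    """Raise ValueError if any row lacks an expected column."""
    for i, row in enumerate(rows):
        missing = [c for c in expected if c not in row]
        if missing:
            raise ValueError(f"row {i}: missing columns {missing}")
    # Keep only the expected columns, in a fixed order, as strings.
    return [{c: str(row[c]) for c in expected} for row in rows]


csv_text = "person_id;name;city\n1;ANNA;WARSZAWA\n"
json_text = '[{"person_id": 2, "name": "JAN", "city": "KRAKOW"}]'

# Both sources end up in one common structure.
table = validate_structure(read_csv_source(csv_text)) + validate_structure(
    read_json_source(json_text)
)
print(table)
```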

Extract data -examples-

Transform data
Data processing in the production environment consists of:
- profiling (creating a report on data quality),
- unification/standardization of data,
- parsing (separation) or combining of variables,
- standardization with schemes,
- conversion,
- validation,
- deduplication,
- data integration.

Transform data - profiling-
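
A profiling report of the kind mentioned for this step could look like the minimal sketch below. The variable names, the sample rows and the choice of statistics are assumptions for illustration, not the content of the original slide.

```python
from collections import Counter


def profile(rows, variables):
    """Build a simple per-variable quality report."""
    report = {}
    for var in variables:
        values = [row.get(var) for row in rows]
        filled = [v for v in values if v not in (None, "")]
        report[var] = {
            "total": len(values),
            "missing": len(values) - len(filled),
            "distinct": len(set(filled)),
            "most_common": Counter(filled).most_common(3),
        }
    return report


rows = [  # hypothetical register extract
    {"city": "WARSZAWA", "gender": "F"},
    {"city": "Warszawa", "gender": "1"},
    {"city": "", "gender": "M"},
]
for var, stats in profile(rows, ["city", "gender"]).items():
    print(var, stats)
```

Even this small report shows why the later steps are needed: the same city appears in two spellings, and gender is coded in more than one scheme.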

Transform data - standardization and parsing examples-
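
Since the slide's own examples are not preserved in the transcript, here is a hypothetical sketch of standardization followed by parsing: an address string is normalized (case, whitespace) and then separated into prefix, street, house number and flat number. The pattern and the list of prefixes are assumptions, not the rules used in the census.

```python
import re

# Hypothetical pattern: optional prefix, street name, house number,
# optional flat number after a slash.
ADDRESS_RE = re.compile(
    r"^\s*(?:(?P<prefix>UL|AL|PL|OS)\.?\s+)?"
    r"(?P<street>.+?)\s+"
    r"(?P<house>\d+[A-Z]?)"
    r"(?:\s*/\s*(?P<flat>\d+))?\s*$"
)


def parse_address(raw):
    """Standardize case and whitespace, then split an address into parts."""
    cleaned = " ".join(raw.upper().split())
    match = ADDRESS_RE.match(cleaned)
    if match is None:
        return None  # leave unparseable values for separate handling
    return match.groupdict()


print(parse_address("ul.  Marszałkowska 10/5"))
print(parse_address("Polna 3a"))
```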

Transform data - schemes examples-

Transform data - example: data cleaning report -
Columns: group of variables, variable, then for "Before cleaning" and "After cleaning" each: total, total incorrect, incorrect in %.
[Only the fractional parts of the percentages survive in this transcript; each row lists incorrect in % before cleaning, then after cleaning.]
Address of permanent residence
  COMMUNITY ,92% ,69%
  CITY ,77% ,99%
  STREET ,34% ,65%
  PREFIX ,00%
Address of residence
  COMMUNITY ,57% ,58%
  CITY ,13% ,86%
  STREET ,90% ,55%
  PREFIX ,00%
Corresponding address
  COMMUNITY ,59% ,50%
  CITY ,84% ,07%
  STREET ,17% ,50%
  PREFIX ,00%
Personal data
  NAME ,17% ,13%

Transform data - conversion: gender variable-
Categories: female, male. Source codes: F, M, 1, 2.

Transform data - conversion: marital status variable-
Source values 503 - married (M), ZNY - married (M) and 3 - married (M) are converted to: 3 married (M).
Source values 1 - bachelor, KWR - bachelor and 502 - bachelor are converted to: 1 bachelor.
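
A conversion of this kind is essentially a lookup table. The sketch below uses the source codes shown on the slide; the target labels, the "unknown" fallback and the function name are assumptions for illustration.

```python
# Source codes taken from the slide; the target labels and the
# "unknown" fallback are assumptions for illustration.
MARITAL_STATUS = {
    "3": "married",
    "503": "married",
    "ZNY": "married",
    "1": "bachelor",
    "502": "bachelor",
    "KWR": "bachelor",
}


def convert_marital_status(raw):
    """Map a register-specific code to the unified category."""
    key = str(raw).strip().upper()
    return MARITAL_STATUS.get(key, "unknown")


for code in ["3", "zny", 502, "KWR", "X"]:
    print(code, "->", convert_marital_status(code))
```

In practice the same code could mean different things in different registers, so a separate mapping per register may be the safer design.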

Transform data -validation-
- checking the data,
- correcting abnormal values according to the algorithms prepared by methodologists,
- where necessary, excluding from further processing the records that cannot be corrected.
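
The three validation steps above can be sketched as follows. The two rules (a postal code missing its dash, a birth year outside a plausible range) are hypothetical stand-ins; the actual algorithms prepared by the methodologists are not given in the presentation.

```python
def validate_record(record):
    """Return (possibly corrected record, list of uncorrectable errors).

    Both rules are hypothetical examples, not the census algorithms.
    """
    fixed = dict(record)
    errors = []

    # Correctable: five-digit postal code missing the dash, "00950" -> "00-950".
    code = fixed.get("postal_code", "")
    if len(code) == 5 and code.isdigit():
        fixed["postal_code"] = f"{code[:2]}-{code[2:]}"

    # Not correctable: an implausible birth year cannot be guessed.
    year = fixed.get("birth_year")
    if not isinstance(year, int) or not 1890 <= year <= 2011:
        errors.append("birth_year out of range")

    return fixed, errors


def validate_all(records):
    """Split records into accepted (possibly corrected) and excluded."""
    accepted, excluded = [], []
    for record in records:
        fixed, errors = validate_record(record)
        (excluded if errors else accepted).append(fixed)
    return accepted, excluded


records = [
    {"postal_code": "00950", "birth_year": 1980},
    {"postal_code": "00-950", "birth_year": 2050},
]
accepted, excluded = validate_all(records)
print("accepted:", accepted)
print("excluded:", excluded)
```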

Transform data - deduplication -
- removal of repeated units,
- requires detailed analysis, including analysis of the legal acts specific to each register,
- result of deduplication: one record with all the available unique information.
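
A minimal sketch of the idea "one record with all the available unique information" is shown below. The key name and the rule that the first non-empty value wins are assumptions; real registers would need register-specific rules for conflicting values.

```python
def deduplicate(records, key="person_id"):
    """Merge records sharing a key into one record.

    For each variable the first non-empty value is kept, so the merged
    record carries all available information. The key name and the
    "first non-empty wins" rule are assumptions for illustration.
    """
    merged = {}
    for record in records:
        target = merged.setdefault(record[key], {})
        for var, value in record.items():
            if value not in (None, "") and target.get(var) in (None, ""):
                target[var] = value
    return list(merged.values())


records = [
    {"person_id": "1", "name": "ANNA", "city": ""},
    {"person_id": "1", "name": "", "city": "WARSZAWA"},
    {"person_id": "2", "name": "JAN", "city": "KRAKOW"},
]
print(deduplicate(records))
```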

Transform data -example of deduplication process-

Transform data -data integration-
- the process of selecting the best, most current and correct value from several or a dozen registers,
- used to create a statistical register, which will be available for use by analysts.

Transform data -integration process: scheme-
Inputs: Register A, Register B, Register C
Linking: one ID, multiple identifiers, alternative linking keys
Selecting: algorithms selecting the best values, data completeness
Output: statistical register
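
The linking stage of the scheme can be sketched as a two-step lookup: first by a common identifier, then by an alternative key built from several variables. All key names and sample records below are hypothetical.

```python
def alt_key(record, keys):
    """Build a composite key, or None if any component is missing."""
    values = tuple(record.get(k) for k in keys)
    return values if all(values) else None


def link(register_a, register_b, id_key="person_id",
         alt_keys=("name", "birth_date")):
    """Pair each record of register_b with its match in register_a (or None)."""
    by_id = {r[id_key]: r for r in register_a if r.get(id_key)}
    by_alt = {}
    for r in register_a:
        key = alt_key(r, alt_keys)
        if key is not None:
            by_alt[key] = r

    pairs = []
    for rec in register_b:
        match = by_id.get(rec.get(id_key))          # step 1: one common ID
        if match is None:                           # step 2: alternative key
            match = by_alt.get(alt_key(rec, alt_keys))
        pairs.append((rec, match))
    return pairs


register_a = [
    {"person_id": "1", "name": "ANNA", "birth_date": "1980-01-01", "city": "WARSZAWA"},
    {"person_id": "2", "name": "JAN", "birth_date": "1975-05-05", "city": "KRAKOW"},
]
register_b = [
    {"person_id": "1", "name": "ANNA", "birth_date": "1980-01-01"},
    {"person_id": None, "name": "JAN", "birth_date": "1975-05-05"},
    {"person_id": None, "name": "EWA", "birth_date": None},
]
for rec, match in link(register_a, register_b):
    print(rec["name"], "->", match["city"] if match else "no match")
```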

Transform data - data integration: example of algorithm
Variable: Kraj_ur_kod
Conditions (each with TRUE and FALSE branches): kraj_ur_kod_KEP is not null; msce_ur_kod_POBYT is not null; kraj_ur_kod_GZM is not null
Selections: select kraj_ur_kod_KEP; select kraj_ur_kod_POBYT; select kraj_ur_kod_GZM
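
One possible reading of this flowchart is a fallback chain over the three registers. The sketch below assumes the conditions are tested in the order KEP, POBYT, GZM and that each TRUE branch selects the value from the same register; the flattened slide does not confirm either point. The field names come from the slide and the sample values are hypothetical.

```python
def select_kraj_ur_kod(record):
    """Select Kraj_ur_kod from the first register that passes its check.

    Assumed order of checks: KEP, then POBYT, then GZM. As on the slide,
    the POBYT branch tests msce_ur_kod_POBYT but selects kraj_ur_kod_POBYT.
    """
    if record.get("kraj_ur_kod_KEP") is not None:
        return record["kraj_ur_kod_KEP"]
    if record.get("msce_ur_kod_POBYT") is not None:
        return record.get("kraj_ur_kod_POBYT")
    if record.get("kraj_ur_kod_GZM") is not None:
        return record["kraj_ur_kod_GZM"]
    return None  # no register provides a value


# Hypothetical linked record: only the GZM register has a value.
linked = {
    "kraj_ur_kod_KEP": None,
    "msce_ur_kod_POBYT": None,
    "kraj_ur_kod_POBYT": None,
    "kraj_ur_kod_GZM": "PL",
}
print(select_kraj_ur_kod(linked))
```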

Data integration -example of process-

Summary
Common difficulties:
- poor quality data, missing values, duplicates,
- conflicting data,
- technical: size of the registers, time-consuming processing.
Benefits:
- obtaining relevant, useful, accurate data,
- improving the quality of the output data,
- selection of the best variables from multiple registers.

Thank you for your attention