Lyne Guertin Census Data Processing and Estimation Section Social Survey Methods Division Methodology Branch, Statistics Canada UNECE April 28-30, 2014.

Editing the 2011 Census data with CANCEIS and options considered for 2016
Statistics Canada / Statistique Canada

Outline
1. Overview of CANCEIS
2. Recent improvements to CANCEIS and to the 2011 E&I strategy
3. Options considered for 2016

1. Overview of CANCEIS (CANadian Census Edit and Imputation System)

CANCEIS users
- Domestic users (other than Census)
  - National Household Survey
  - Canadian Income Survey
  - Survey on Financial Security
  - Survey of Household Spending
  - Longitudinal and International Study of Adults

- Other countries (users, past users, or exploring CANCEIS)
  - Argentina, Australia, Brazil, Israel, Italy, Japan, New Zealand, Peru, Switzerland, UK, USA
- CSPA initiative (Common Statistical Processing Architecture)
  - Targeted CANCEIS in a pilot with New Zealand to test portability

Imputation methods available
- Deterministic imputation
- Donor imputation, based upon the principles of
  - minimum change
  - preserving the distribution of the data

New Imputation Methodology (NIM)
Developed by Mike Bankier in the 1990s
A. Apply edits
- Search for invalid values, missing values and inconsistencies
- Classify records as Passed or Failed
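The edit step can be pictured with a small Python sketch. It is an illustration only: the rule names, the variables `age` and `marital`, and the helper functions are hypothetical and are not taken from CANCEIS, which expresses edits through its own decision logic tables.

```python
# Hypothetical edit rules: each rule returns True when the record FAILS it.
EDITS = {
    # missing value
    "age_missing": lambda r: r.get("age") is None,
    # invalid value
    "age_out_of_range": lambda r: r.get("age") is not None
    and not (0 <= r["age"] <= 120),
    # inconsistency between two variables
    "married_under_15": lambda r: r.get("age") is not None
    and r["age"] < 15
    and r.get("marital") == "married",
}


def failed_edits(record):
    """Return the names of all edit rules that the record fails."""
    return [name for name, rule in EDITS.items() if rule(record)]


def classify(records):
    """Split records into (passed, failed) according to the edit rules."""
    passed, failed = [], []
    for record in records:
        (failed if failed_edits(record) else passed).append(record)
    return passed, failed


records = [
    {"age": 34, "marital": "married"},
    {"age": 9, "marital": "married"},
    {"age": None, "marital": "single"},
]
passed, failed = classify(records)
```

Passed records become the pool of potential donors; failed records go on to imputation.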

New Imputation Methodology (NIM) (cont'd)
B. Perform donor imputation
- Step 1: establish a list of best donors (i.e. those that most resemble the failed record)
- Step 2: find the best imputation actions for these donors
- Step 3: select an imputation action at random
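The three steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the CANCEIS algorithm: the distance is a plain count of differing fields, "best imputation action" is taken to mean the smallest set of fields copied from a donor that makes the record pass, and all function and variable names are hypothetical.

```python
import itertools
import random


def distance(failed, donor, fields):
    """Assumed distance: number of fields on which the two records differ."""
    return sum(failed[f] != donor[f] for f in fields)


def imputation_actions(failed, donor, fields, passes):
    """Step 2: smallest sets of donor values that make the record pass."""
    differing = [f for f in fields if failed[f] != donor[f]]
    for size in range(1, len(differing) + 1):
        actions = []
        for subset in itertools.combinations(differing, size):
            candidate = dict(failed)
            candidate.update({f: donor[f] for f in subset})
            if passes(candidate):
                actions.append(candidate)
        if actions:  # stop at the minimum number of changed fields
            return actions
    return []


def impute(failed, passed_records, fields, passes, n_donors=2, rng=None):
    rng = rng or random.Random()
    # Step 1: keep the donors that most resemble the failed record.
    donors = sorted(passed_records, key=lambda d: distance(failed, d, fields))
    candidates = []
    for donor in donors[:n_donors]:
        candidates.extend(imputation_actions(failed, donor, fields, passes))
    # Step 3: pick one of the retained imputation actions at random.
    return rng.choice(candidates) if candidates else None


fields = ["age", "marital"]
passes = lambda r: r["age"] is not None and not (
    r["age"] < 15 and r["marital"] == "married"
)
failed = {"age": 9, "marital": "married"}
passed_records = [
    {"age": 40, "marital": "married"},
    {"age": 8, "marital": "single"},
    {"age": 70, "marital": "widowed"},
]
result = impute(failed, passed_records, fields, passes, rng=random.Random(0))
```

Here each retained donor yields one single-field fix (change the age, or change the marital status), and the random draw between them is what helps preserve the distribution of the data.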

Advantages of this methodology
- Offers a practical solution to an operational problem
- Allows simplification of edits: uses a minimum set in relation to the donor chosen
- Computationally efficient
- Can deal with non-linear edits
- Data-driven imputation

CANCEIS Features
- Categorical, numerical and alphanumeric variables
- Large numbers of edits & large data files
- Portable, flexible & efficient
- All parameterized, so easy to customize
- Ten different distance functions to find best donors, which cover different types of variables

Distance Measure for Potential Donors
The distance is taken over all paired fields (i), where
- V_fi is the value of matching variable i for the failed record;
- V_pi is the value of matching variable i for the passed record;
- w_i is the weight of variable i (w_i ≥ 0);
- D_i is the distance function chosen for variable i (0 ≤ D_i ≤ 1).
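The definitions above are consistent with a weighted sum of per-variable distances. Assuming that form (the equation is not reproduced in this transcript, so this is a reconstruction rather than a quotation), the distance between a failed record f and a passed record p can be written as:

```latex
D_{fp} = \sum_{i} w_i \, D_i\!\left(V_{fi}, V_{pi}\right),
\qquad w_i \ge 0, \quad 0 \le D_i \le 1
```

Under this form, D_fp lies between 0 and the sum of the weights, and a variable with w_i = 0 plays no part in the donor search.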

CANCEIS System Components
- Inputs
  - Data
  - Data Dictionary
  - System Parameters
  - Decision Logic Tables
- CANCEIS Components
  - Donor Imputation
  - Deterministic Imputation
- Outputs
  - Imputed Data
  - Reports & Logs

2. Recent improvements to CANCEIS and to the 2011 E&I strategy

Improvements
- For 2011, CANCEIS was rewritten in C# (C-sharp) in a .NET environment
  - Easier to maintain
  - Improved efficiency (lower processing time)
  - Increased stability

Improvements (cont'd)
- Multi-threading now possible in donor imputation
  - Enables processing of multiple failed units at one time
  - Increases performance and reduces processing time
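Processing several failed units at once works because each failed record can be imputed independently against the same read-only pool of donors. The Python sketch below shows only that structure; CANCEIS itself is written in C#, and `impute_one`, `impute_all` and the toy nearest-donor rule are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def impute_one(failed, donors):
    """Toy rule: copy missing fields from the donor closest on known fields."""
    known = [k for k, v in failed.items() if v is not None]
    best = min(donors, key=lambda d: sum(failed[k] != d[k] for k in known))
    return {k: (best[k] if v is None else v) for k, v in failed.items()}


def impute_all(failed_records, donors, workers=4):
    """Impute each failed record in its own task; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: impute_one(r, donors), failed_records))


donors = [{"age": 30, "sex": "F"}, {"age": 60, "sex": "M"}]
failed_records = [{"age": None, "sex": "M"}, {"age": None, "sex": "F"}]
imputed = impute_all(failed_records, donors)
```

In standard CPython, threads do not speed up CPU-bound work of this kind, so a real Python port would more likely use processes; the point here is only that the failed units share no mutable state.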

Improvements (cont'd)
- CANCEIS is more user friendly
  - Before: could handle only .txt files (inputs/outputs)
  - Now: also handles data dictionaries in Excel and creates summary reports in HTML

Improvements (cont'd)
- Increased content and level of detail in the logs
  - Facilitates troubleshooting
  - Facilitates validating the desired strategy for each module

New features added
- Additional flexibility in specifying imputation parameters
- New parameter to specify that the staged search will not stop until an excellent donor is found
  - Continue to search if the target quality is not reached
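One way to picture the new parameter is the sketch below. It assumes that donors are searched in successive stages (for example, progressively wider pools) and that "excellent" means a distance at or below a target value; the slides do not say how CANCEIS defines either, so the function, its arguments and the `require_target` switch are all hypothetical.

```python
def staged_search(failed, stages, distance, target, require_target=True):
    """Search donor pools stage by stage.

    Without require_target, stop after the first stage that yields a donor.
    With require_target, keep going until the best distance reaches the target.
    Returns (best donor, its distance, number of the last stage searched).
    """
    best, best_d = None, float("inf")
    for stage_no, stage in enumerate(stages, start=1):
        for donor in stage:
            d = distance(failed, donor)
            if d < best_d:
                best, best_d = donor, d
        if best is not None and (best_d <= target or not require_target):
            return best, best_d, stage_no
    # Target never reached: fall back to the best donor seen overall.
    return best, best_d, len(stages)


age_distance = lambda f, d: abs(f["age"] - d["age"]) / 100
failed = {"age": 40}
stages = [[{"age": 70}], [{"age": 42}], [{"age": 40}]]

strict = staged_search(failed, stages, age_distance, target=0.05)
loose = staged_search(failed, stages, age_distance, target=0.05,
                      require_target=False)
```

With the switch on, the search moves past the poor first-stage donor and stops at stage 2; with it off, it settles for the stage 1 donor.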

Modification to the 2011 E&I strategy
- Group these five processes into one ethnocultural process:
  - Place of birth of parents
  - Immigration status
  - Aboriginal status
  - Citizenship
  - Visible minorities

Modification to the 2011 E&I strategy (cont'd)
- Goals:
  - Increase data coherence between processes by using one single donor to impute all variables
  - Reduce manual fixes after E&I
- Challenge: manage lots of edits & data

3. Options considered for 2016

- Planning E&I strategy for 2016
  - Evaluating the use of administrative data as an alternative source of data
  - Exploring whether the language processes could be grouped (mother tongue, home language, official language)
  - Exploring whether steps within processes could be grouped
  - Exploring whether processes could be run in parallel
  - Goals: improve quality, reduce processing time

- Continue improving CANCEIS to serve future requirements of the Census
  - Research and development ongoing, done by programmers and methodologists
- CANCEIS v5.2 to be released by Dec. 2014
  - Allowing DLTs and System Parameters in Excel
  - Revisited contents of Inputs & Outputs
  - Standardized naming convention
  - Improvements to default values of parameters

- Will offer the CANVERT conversion tool
  - Ensures smooth transition from v5.1 to v5.2
- Updated documentation will be provided
  - Basic User Guide (with two simple examples and basic features)
  - Comprehensive User Guide (with more examples, and all features)

Thank you for your attention!
For more information, please contact: Lyne Guertin ( )