
Editing And Imputation For Manufacturing Statistics At Statistics Canada
Marie Brodeur, Director General, Industry Statistics Branch
Santiago, Chile, March 15 to 17, 2011

Outline Of The Presentation
• Overview of the Manufacturing Program
• Centralized Process
• Surveys
• Overview of the UES Survey Process
• Post Collection Processing Inputs & Tools
• Use of Tax Data
• The many phases of UES Post Collection Process
• Managing the UES Post Collection Process

Statistics Canada

Business and Trade Statistics
[Organization chart]
• Branches: Industry Statistics; Economy-wide Statistics; Agriculture, Technology and Transportation Statistics
• Divisions, in the order shown on the chart: Manufacturing and Energy; Distributive Trades; Service Industries; Enterprise Statistics; Consumer Prices; International Trade; Producer Prices; Investment and Capital Stock; Enterprise Statistics; Agriculture; Small Business And Special Surveys; Science, Innovation And Electronic Information; Transportation

Manufacturing Distribution Of Sales

Who Are Manufacturers?
• Establishments primarily engaged in the physical or chemical transformation of materials and substances into new products
• Includes assembly of the component parts of manufactured goods, blending of materials, finishing of manufactured products by dyeing, heat treating, plating and similar operations
• Transformation of own materials or those owned by others
• Service outputs: custom work, repair and maintenance
• Product outputs: finished goods, intermediate goods

Manufacturing Program At Statistics Canada (STC)
• Monthly Survey of Manufacturing (MSM)
• Annual Survey of Manufactures and Logging (ASML)
• Series of sub-annual commodity surveys

General Characteristics Of The MSM
• Monthly indicator of manufacturing activity
• Last redesign in 1999
• Designed to be a reliable indicator for both trends and levels
• Establishment survey (n = 10,500)
• Stratified by province, NAICS and size

MSM Concepts
• Sales
  – Goods of own manufacture
• Inventories
  – Raw materials
  – Goods-in-process
  – Finished products
• Orders
  – New orders
  – Unfilled orders
• Goods purchased for resale (revenue and inventory)
  – These data are collected but not released
• Sales is the main concept; exceptionally, production is used for some industries (aerospace and shipbuilding)

Frame And Coverage
Measure | Simple | Complex
Total number of establishments on the Business Register | 2,278,730 | 110,557
Value of sales of all establishments on the Business Register | $2,214.9 billion | $1,859.1 billion
Total number of manufacturing establishments on the Business Register | 84,215 | 6,648
Value of sales of manufacturing establishments on the Business Register | $340.8 billion | $234.5 billion

MSM Sampling Plan
[Figure: sampling strata Take-All, Take-Some and Take-None; legend: Tax replaced, Survey]

MSM Sampling Plan: Use Of Tax
• Background
  – The Goods and Services Tax (GST) is the federal Value Added Tax
  – GST is collected by the Canada Revenue Agency (CRA)
  – The CRA provides tax data to Statistics Canada
• Information received includes the Business Number, revenue, tax remitted and input tax credit

Use Of Tax Data
• Who is replaced?
  – Single establishment enterprises
    • Replace 50% of sampled data with GST data
  – Chronic refusals
• Who is not replaced?
  – Very large single enterprise establishments
  – Complex units (i.e. multiple establishments), as found in the GST database
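The replacement rule on this slide can be illustrated with a small sketch. It is not the MSM production logic: the `Unit` fields, the random draw, and the reading that the 50% share applies to eligible units that are not chronic refusals are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Unit:
    unit_id: str
    n_establishments: int   # 1 = simple unit, >1 = complex unit
    very_large: bool        # hypothetical flag for "very large" units
    chronic_refusal: bool


def is_eligible(unit: Unit) -> bool:
    """Only simple units that are not very large may be tax replaced."""
    return unit.n_establishments == 1 and not unit.very_large


def choose_tax_replaced(units, share=0.5, seed=0):
    """Return the ids of units whose survey data are replaced by GST data."""
    rng = random.Random(seed)
    eligible = [u for u in units if is_eligible(u)]
    # Assumption: eligible chronic refusals are always replaced.
    refusals = [u for u in eligible if u.chronic_refusal]
    others = [u for u in eligible if not u.chronic_refusal]
    # Assumption: a simple random draw picks the replaced share.
    k = round(share * len(others))
    chosen = rng.sample(others, k)
    return {u.unit_id for u in refusals + chosen}
```

Complex and very large units never enter the draw, so they always stay in the surveyed portion.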

What Is The Annual Survey Of Manufactures And Logging (ASML)?
• Measures the contribution of manufacturing industries to economic activity in Canada
• In 2010, manufacturing accounted for 15% of GDP and 12% of total employment (SEPH)
• Key input to SNA Input-Output tables
• Survey collects data on
  – what commodities are produced (Make matrix)
  – where commodities are destined (provincial I/O tables)
  – what commodities and primary inputs are used in production (Use matrix)

Survey Coverage
• ASML is conducted under the umbrella of Statistics Canada’s Unified Enterprise Survey Program (UES)
• Same as MSM
• Establishments primarily engaged in manufacturing and logging activities and classified to NAICS 31, 32 and 33 as well as NAICS 113
• Estimates produced for 261 NAICS6 level industries
• Estimates produced for the 10 provinces and 3 territories

Content: Commodity Variables
• Revenue variables (16), expense variables (43), detailed opening and closing inventories (12), other financial (5)
• Sales or output variables are valued at producer or FOB factory gate prices, as required by the SNA
• Commodities consumed (inputs) and produced (outputs), both goods and services
• Collect commodity values and quantities (for selected goods)
• Services produced and consumed are collected as expense items and classified based on the Chart of Accounts (COA)

Types Of Administrative (Tax) Data
• From the Canada Revenue Agency (CRA)
• Agreement between the two agencies
  – T1 (unincorporated businesses)
  – T2 (incorporated businesses)
  – T4 (pay slips)
  – GST (Goods and Services Tax)
  – PD7 (payroll deduction accounts)

Editing And Imputation For Manufacturing Surveys

Why A Centralized Process?
• Best Practices
• Standardization of Processes
  – Cross Survey Comparisons
  – Enterprise Centric Processing/Coherence Analysis
• Efficient use of Resources
• Transportable Knowledge Across Survey Programs

Challenges Of A Centralized Process
• Remain Centralized
• Distribute processing
• Priority Setting
• Communication and Coordination

UES Post-Collection Processing
[Flow diagram; components as labelled: “Clean” Records, Tax Data, Pre-Grooming, Subject Matter Review & Correction Tool, Edit & Imputation, Allocation / Estimation, Central Data Store, USTART]

Collection
• Collection Period: February to early October
• Collection Processing System: Blaise
  – Blaise can be seen as being a Collection Control Center
  – Blaise has many functions:
    • Call Scheduler
    • Transaction history files
    • Audit Trail Files
    • And more

Blaise: Variables
• Questionnaire number
• Mail-out date
• Number of calls
• Length of the call
• Number of contact attempts
• Response code
• And more

Blaise: Bonuses Over The Years
• Blaise Transaction History (BTH) Files
  – Collection data analysis:
    • Produced a paper on the best time to call
    • Produced a paper on the maximum number of attempts
• Audit Trail Files
  – Find outliers
  – Identify difficult-to-answer questions

Collection
• Precontact (Dec-Jan)
  – Mostly for Business Register (BR) births; verification of contact information (name, address, …)
  – By phone (in a few cases, a letter or a fact sheet is sent)
• Mail-out of questionnaires (Jan-March)
  – 2 or 3 mail-out dates
• Follow-up in case of non-response for some units (begins about a month after mail-out)
  – Phone call, r or fax
• Mail-back of questionnaires
• Verifications of received questionnaires / Edits
  – Is the questionnaire complete or are some key variables missing? (Edit follow-up by phone in some cases)

Collection
• Coding of questionnaires (about 20 response codes)
  – Response, non-response, out-of-scope, …
• Imaging / Data capture (CADI - Computer Assisted Data Input)

Centralized Collection
[Flow diagram; steps as labelled: Pre-Contact (17K Businesses), Mailout (38K CEs), Receipt (75% target), Delinquent Follow-Up, Score Function, Capture / Imaging, Edit / Verification (BLAISE), “Clean” Records]

UES: Data Collection / Score Function
• Introduced in 2002, the UES score function is the main tool used at the collection stage to determine which priority to give for the follow-up of about 23,000 Collection Entities (CE) each year.
• Reduces collection costs yet retains data quality
• Similar to the collection goal of obtaining a high weighted coverage response rate.
• PRIORITY 1: Extensive follow-up for the larger revenue CEs in cases of non-response.
• PRIORITY 0: Minimum follow-up for the smaller CEs in cases of non-response.
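The slide does not give the score function itself, so the following is only a sketch of one way a revenue-based priority could be assigned. The cumulative coverage rule, the 0.75 target and the function name `assign_priority` are assumptions, not the UES formula.

```python
def assign_priority(ces, target_coverage=0.75):
    """Assign follow-up priority 1 or 0 to collection entities (CEs).

    ces: list of (ce_id, weighted_revenue) pairs for one industry group.
    The largest CEs get priority 1 until their cumulative weighted revenue
    reaches the target share of the group total; the rest get priority 0.
    """
    total = sum(revenue for _, revenue in ces)
    covered = 0.0
    priorities = {}
    # Walk through the CEs from largest to smallest weighted revenue.
    for ce_id, revenue in sorted(ces, key=lambda c: c[1], reverse=True):
        if total > 0 and covered < target_coverage * total:
            priorities[ce_id] = 1   # extensive follow-up
            covered += revenue
        else:
            priorities[ce_id] = 0   # minimum follow-up
    return priorities
```

With weighted revenues of 60, 25, 10 and 5, the two largest CEs reach the target and get priority 1; the two smallest get priority 0.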

Chart Of Accounts
[Diagram: collection concepts (Sales, Operating revenue, Cost of sales, Gross profit, Expenses, EBIT) connected to dissemination concepts (Outputs, Inputs, Value added, Shipments, Operating Surplus, GDP) through a link, bridge or concordance]

Expected Benefits Of A Chart Of Accounts
• Standardization in business data collection
• Higher survey response
• Increase in quality of data
• Comparison of data from various sources
• Increased efficiency in using administrative data

Links To Chart Of Accounts
[Diagram: the Chart of Accounts linked to the Establishment, Legal entity and Enterprise]

UES: Use Of Tax Data
• Validation (comparison)
  – Verify dubious collected data against the equivalent tax data record
• Imputation
  – One of the methods used for non-response
• Estimation
  – Below take-none
• Direct Data Replacement
• Update Business Register
• Allocation of survey data (use tax revenues, salaries and expenses)

Centralized Processing Systems And Databases
• Develop centralized systems
  – Move away from stand-alone
  – Single point of access for security
• Integrated Questionnaire Metadata System
• Edit and imputation
• Allocation and Estimation
• Data Warehouse

Enterprise Portfolio Managers
• Top 350 enterprises in Canada
• Status: Platinum, Gold, Silver, Bronze
• Personal visits
• Enterprise Profiling
• Coordination of mail-out and collection
• Enterprise / Establishment coherence
• Holistic Response Management
  – Strategic Response Unit
  – Escalation Process / Statistics Act

Review and Correction (Post-Capture)
• Done via an application which is a micro-editing tool
• Opportunity to perform edits and to manually correct data before the automated edit and imputation process
• Opportunity to gain an understanding of the quality of data coming in from the field

What Is Generally Done By SMOs During This Process?
• Ensure that industry codes are valid and response codes are correct
• Ensure that equivalent survey cells have consistent data
• Enter data for records that came in after the collection cut-off date
• Review high impact outliers in terms of profit, average salary, etc.
• Check comments made by respondents and collection staff

Why Is This Process Necessary?
• Reviewing and correcting records will increase the number and quality of donors for the automated edit and imputation (E&I) stage. This will improve the quality of data coming out of E&I.
• Need to assess the quality of collected data
• Determine whether there are problems with the questionnaire
• Identify cases where the respondent is unable to provide a given data point
• Determine whether there are enough data for E&I

What Should Not Be Done During This Process?
• Do not plug data for non-response records. They will be imputed during the automated E&I.

What Is E & I?
• Editing
  – Verify that parts add up to total
  – Ensure that there are no missing values where parts add up to total
  – There must be consistency between related variables
• Imputation
  – Changing values in fields which fail edit rules with a view to ensuring that the resulting data satisfy all edit rules. In practice, reported data will rarely be changed
  – Impute for missing data or partially responded data
  – Impute entire records in the case of total non-response
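The first two editing rules can be expressed as a single check. This is a minimal sketch, not the UES edit specification: the field names and the rounding tolerance `tol` are hypothetical.

```python
def sum_of_parts_edit(record, total_field, part_fields, tol=0.5):
    """Classify a record against a "parts add up to total" edit.

    Returns "missing" if the total or any part is absent, "inconsistent"
    if the parts do not add up to the total within the tolerance, and
    "pass" otherwise.
    """
    total = record.get(total_field)
    parts = [record.get(field) for field in part_fields]
    if total is None or any(part is None for part in parts):
        return "missing"        # candidate for imputation
    if abs(sum(parts) - total) > tol:
        return "inconsistent"   # fails the edit rule
    return "pass"
```

Records flagged "missing" or "inconsistent" are the ones that imputation would later have to resolve.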

Why Is E&I Necessary?
• To produce a complete and consistent data file that accounts for all sampled units
• Both units that did not respond to the survey and units that did not provide a complete response must be imputed
• Correct erroneous responses

E&I Terminology
• Data Group
  – Groupings (defined by SM) of records that will be kept together for imputation purposes
  – These groupings are based on multiple dimensions:
    • industry (NAICS)
    • geography (province)
• Data groups that will be used for a specific survey will depend on:
  – initial sample design (number of units sampled and the level of stratification used)
  – number of records that respond to the survey (a minimum of 5 or 10 records is required in a data group)
• May be changed during production if not enough donors
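A minimum donor count per data group implies some fallback when a group is too small. The slide does not say how groups are changed, so the sketch below assumes one simple option: collapse geography and keep industry. The record fields and the `"ALL"` label are hypothetical.

```python
from collections import Counter


def assign_data_groups(records, min_donors=5):
    """Map each record id to a data group.

    records: list of dicts with keys "id", "naics", "province", "is_donor".
    A record stays in its (NAICS, province) group when that group has at
    least min_donors donors; otherwise it falls back to (NAICS, "ALL").
    """
    donors_per_group = Counter(
        (r["naics"], r["province"]) for r in records if r["is_donor"]
    )
    groups = {}
    for r in records:
        key = (r["naics"], r["province"])
        if donors_per_group[key] >= min_donors:
            groups[r["id"]] = key
        else:
            # Assumed fallback: drop the province dimension.
            groups[r["id"]] = (r["naics"], "ALL")
    return groups
```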

E&I Terminology (continued)
• Edit Group
  – Grouping of variables within a record that will be processed together in an imputation method
  – Generally edit groups may be defined as follows for most surveys:
    • revenue and expense sections
    • employment section and provincial distribution of goods/services sold
  – Allows for a record to be a donor if it has clean data in one section even when other sections are blank; this increases the donor pool

E&I Terminology (continued)
• Key variables
  – Total operating revenue
  – Total operating expenses
  – Salaries
  – Cost of goods sold

The Stages Of The E&I System
• Pre-processing
• BANFF E & I System
• Post-Processing
• Allocation

Preprocessing
• Deterministic Edits
• Conditional edits - If A then B
• Sum of Parts (SOP)
• Assign 100% to percentage totals
• Impute reporting period
• Donor Outlier Detection

BANFF E & I System
• Impute for missing key variables as specified by subject matter (i.e. total revenue, total expenses)
• Impute for other missing variables:
  – Apply Historical Trend
  – Apply Current Year Trend
  – Use donor (for partial imputation)
• Select a donor for massive imputation for total non-response
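Donor selection for massive imputation can be sketched as a nearest-neighbour search within a data group. This is not the BANFF donor procedure: the matching variable `tax_revenue`, the absolute-distance measure and the direct copy of donor values are assumptions for illustration.

```python
def nearest_donor(recipient, donors, match_field="tax_revenue"):
    """Pick the donor closest to the recipient on the matching variable."""
    return min(donors, key=lambda d: abs(d[match_field] - recipient[match_field]))


def massive_impute(recipient, donors, fields, match_field="tax_revenue"):
    """Fill the listed fields of a total non-respondent from its nearest donor.

    Returns the imputed record and the id of the donor that was used.
    """
    donor = nearest_donor(recipient, donors, match_field)
    imputed = dict(recipient)
    for field in fields:
        imputed[field] = donor[field]   # donor value copied as-is
    return imputed, donor["id"]
```

Keeping the donor id alongside the imputed record lets the donor be chosen at one stage and its values applied at a later one.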

BANFF Algorithms
• DIFTREND - Historical trend imputation
• CURRATIO - Current ratio imputation
• PREVALUE - Value from the previous period for the same unit is imputed
• PREAUX - Historical value of a proxy variable for the same unit
• CURAUX - Current value of a proxy variable for the same unit
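The five estimators can be written as plain functions. This is one plausible reading of the descriptions on the slide, not BANFF's API or its exact formulas: the function names and argument names are hypothetical, `y` stands for the variable being imputed and `x` for the auxiliary (proxy) variable.

```python
def prevalue(prev_y):
    """PREVALUE: carry forward the unit's value from the previous period."""
    return prev_y


def preaux(prev_x):
    """PREAUX: use the unit's previous-period value of a proxy variable."""
    return prev_x


def curaux(cur_x):
    """CURAUX: use the unit's current value of a proxy variable."""
    return cur_x


def diftrend(prev_y, cur_x, prev_x):
    """DIFTREND: previous value adjusted by the unit's own auxiliary trend."""
    return prev_y * cur_x / prev_x


def curratio(cur_x, respondent_y, respondent_x):
    """CURRATIO: current proxy value times the y/x ratio among respondents."""
    return cur_x * sum(respondent_y) / sum(respondent_x)
```

For example, a unit that reported 200 last period and whose proxy variable moved from 100 to 110 would be imputed 220 under this reading of DIFTREND.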

Post-Processing
• Prorate components to ensure that they sum exactly to totals
• Perform a number of consistency checks to ensure that micro-data are valid
• Assign customer location (percentage cells)
• Massive Imputation (donor selected in the processor but applied in the post-processor)
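Prorating can be sketched as scaling each component by the ratio of the total to the current sum of parts. The rounding to whole units and the rule that the largest component absorbs the rounding residual are assumptions; the slide only states that components must sum exactly to totals.

```python
def prorate(parts, total):
    """Scale integer components so that they sum exactly to the total.

    parts: dict of component name -> value; total: integer target.
    """
    current = sum(parts.values())
    if current == 0:
        return dict(parts)   # nothing to scale
    adjusted = {name: round(value * total / current) for name, value in parts.items()}
    # Rounding can leave a small gap; the largest component absorbs it.
    residual = total - sum(adjusted.values())
    largest = max(adjusted, key=adjusted.get)
    adjusted[largest] += residual
    return adjusted
```

Components of 50, 30 and 30 against a total of 100 become 46, 27 and 27.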

Allocation - Definition & Purpose
Definition:
• Allocation is the distribution of survey and administrative data from their acquisition level (Collection Entity) to the targeted statistical units (Establishments or Locations) as defined on the survey frame.
Purpose:
• To provide fully-processed micro data on a fiscal year basis, for establishments or locations in-sample for the UES
• Determine the distribution of value added by province
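A proportional split is one simple way to move a Collection Entity value down to its establishments. The sketch below assumes that an auxiliary measure such as tax revenue drives the shares and that units are split equally when no auxiliary information exists; neither rule is stated on the slide, and the function name is hypothetical.

```python
def allocate(ce_value, auxiliary):
    """Distribute a collection-entity value across its establishments.

    auxiliary: dict of establishment id -> auxiliary measure (for example
    tax revenue). Shares are proportional to the auxiliary measure.
    """
    if not auxiliary:
        return {}
    total_aux = sum(auxiliary.values())
    if total_aux == 0:
        # Assumed fallback: equal shares when no auxiliary signal exists.
        return {est: ce_value / len(auxiliary) for est in auxiliary}
    return {est: ce_value * aux / total_aux for est, aux in auxiliary.items()}
```

Summing the allocated values over the establishments located in each province then gives a provincial distribution.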

Sample Survey Allocation

Managing The UES Post Collection Process
• Post Collection Operations Committee
  – Discuss production issues of common interest
  – Provide status reports on production and production readiness
• Divisional Production meetings
  – Working group level dealing with production issues relating to a specific subject matter division, including planning and ad hoc requests
• Post Collection Processing Teams
  – Structured by Subject Matter Division to provide the best support and to maximise subject matter expertise
• Change Management Requests
  – Improvements
• Service Request Management Portal (SRM)
  – Corrections

Future Directions
• IBSP (Integrated Business Statistics Project)
  – New and Improved UES, to consolidate and standardise processing for more annual and sub-annual business surveys
  – Start RY2013. To be completed for RY2015
  – Number of surveys to increase from 60 annual surveys to 120 annual and sub-annual surveys