1 An Evaluation of Changes to the Universe Extraction for Current Business Surveys at the U.S. Census Bureau Author: Carol S. King Presenter: Ruth E. Detlefsen Service Sector Statistics Division U.S. Census Bureau June 20, 2007
2 Business Sample Revision Approximately once every five years For retail, wholesale, and services sectors
3 Why a Business Sample Revision (BSR) To ensure that samples remain representative of the retail, wholesale, and services universes To redistribute burden for small and mid-size firms currently in the samples To introduce changes such as new industry codes, coverage expansion, etc.
4 Topics Changes made to the edit process Efforts to improve edit bounds Introduction of the Edit Tracking System
5 Business Register (BR) Master business list of employer firms and establishments. Contains: administrative record data, Economic Census data, and current survey data.
6 Universe Extraction Process Extract from the BR all retail, wholesale and services establishments Assign a NAICS-derived sampling recode to each inscope establishment Compute an estimate of each establishment’s annual sales or receipts, measure of size (MOS) Edit, review and correct the extracted data
7 Data Edits - Why Edit the Data? To find systematic problems in the data or input files To review data used to compute the establishment measure of size To review establishments in industries that may have unique reporting or classification issues To find data problems for individual cases that, if not corrected, would adversely affect sample sizes
8 Editing of Data in Prior Universe Extractions Thirteen edits done at the establishment level Edits tested Census and administrative data Edit failures assigned a priority for review Approximately 132,000 edit failures in BSR-2K
9 Editing for BSR-06 Performed edits at both the establishment and sampling unit level Reviewed establishment failures only if sampling unit failed Provided an interactive correction system
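The review rule on this slide (look at an establishment failure only if its sampling unit also failed) can be sketched as a simple filter. This is a minimal illustration, not the Census Bureau's system: the function name and the `estab_id` and `sampling_unit_id` keys are hypothetical.

```python
def establishments_to_review(estab_failures, failed_sampling_units):
    """Keep only establishment edit failures whose sampling unit also failed.

    estab_failures: list of dicts with hypothetical keys
        'estab_id' and 'sampling_unit_id'.
    failed_sampling_units: iterable of sampling unit ids that failed
        a sampling-unit-level edit.
    """
    failed = set(failed_sampling_units)
    return [e for e in estab_failures if e["sampling_unit_id"] in failed]


# Hypothetical example: only sampling unit SU1 failed its edit.
estab_failures = [
    {"estab_id": "E1", "sampling_unit_id": "SU1"},
    {"estab_id": "E2", "sampling_unit_id": "SU2"},
    {"estab_id": "E3", "sampling_unit_id": "SU1"},
]
to_review = establishments_to_review(estab_failures, ["SU1"])
```

Under this sketch, establishment E2 still fails its establishment edit but is never queued for review, which is how the sampling unit edits cut the review workload.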
10 Sampling Unit Edits for BSR-06 Identified noncertainty sampling units that increased the sample size by one or more units. This increase was called the effect. Identified certainty units that had MOS / β * payroll out of line with other certainty units in the same sampling recode.
11 Noncertainty Effect Edit Edit failures were grouped into three categories: Highest priority – unit added 15 or more units to the sample size Next highest priority – unit added 5 to 14 units to the sample size Lowest priority – unit added 1 to 4 units to the sample size
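The three priority categories above map directly onto thresholds on the effect. A minimal sketch follows; the function name and the returned labels are illustrative, and the effect itself (the number of units a sampling unit adds to the sample size) is assumed to have been computed already.

```python
from typing import Optional


def effect_priority(effect: int) -> Optional[str]:
    """Map a noncertainty unit's effect to a review priority.

    Returns None when the unit adds nothing to the sample size,
    that is, when it is not an edit failure.
    """
    if effect >= 15:
        return "highest"
    if effect >= 5:
        return "next highest"
    if effect >= 1:
        return "lowest"
    return None
```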
12 Maximum Effect for Selected Sampling Recodes
Columns: Trade, Sampling Recode, Initial, Week 4, Week 7
Services – Other Individual and Family Services
Services – Portfolio Management
Services – Travel Agencies 31610
Retail – Hardware Stores
Retail – Drinking Places 106
13 Certainty Edit Used a Hidiroglou-Berthelot (H-B) edit for certainty unit outliers Delineated three regions away from the acceptance region Failures in furthest region were given review priority Failures in other regions were reviewed as time permitted
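For readers unfamiliar with the H-B edit, the sketch below follows the standard Hidiroglou-Berthelot construction: a symmetric transform of each ratio around the median ratio, scaled by unit size, with bounds built from the quartiles of the resulting effects. Several things here are assumptions rather than details from the slides: the ratio is read as MOS / (β × payroll), the parameters `u` and `a` take commonly used values, and the three regions are modeled with three increasing multipliers in `c_values`, which are purely illustrative. All inputs are assumed to be strictly positive.

```python
from statistics import median, quantiles


def hb_regions(x, y, u=0.5, a=0.05, c_values=(4.0, 8.0, 12.0)):
    """Assign each unit an H-B region: 0 = accepted, 1..3 = increasingly far out.

    x: numerators (here assumed to be MOS), strictly positive.
    y: denominators (here assumed to be beta * payroll), strictly positive.
    u, a, c_values: hypothetical tuning parameters, not from the source.
    """
    ratios = [xi / yi for xi, yi in zip(x, y)]
    r_med = median(ratios)

    # Symmetric transform around the median ratio, weighted by unit size.
    effects = []
    for xi, yi, r in zip(x, y, ratios):
        s = 1.0 - r_med / r if r < r_med else r / r_med - 1.0
        effects.append(s * max(xi, yi) ** u)

    q1, e_med, q3 = quantiles(effects, n=4, method="inclusive")
    d_lo = max(e_med - q1, abs(a * e_med))
    d_hi = max(q3 - e_med, abs(a * e_med))

    regions = []
    for e in effects:
        region = 0
        # Thresholds are nested, so the last one exceeded is the farthest.
        for k, c in enumerate(sorted(c_values), start=1):
            if e < e_med - c * d_lo or e > e_med + c * d_hi:
                region = k
        regions.append(region)
    return regions


# Hypothetical certainty units in one sampling recode.
mos = [90, 95, 98, 100, 100, 102, 105, 110, 100, 5000]
beta_payroll = [100] * 10
regions = hb_regions(mos, beta_payroll)
```

In this toy data the last unit, whose ratio is 50 times the median, falls in the furthest region and would be reviewed first; units in regions 1 and 2 would be reviewed as time permits.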
14 BSR-06 Edit Failures Total establishment failures – 126,000 (132,000 for BSR-2K) Sampling unit failures - 3,769 Noncertainty edit failures – 2,637 (0.07% of all noncertainty sampling units) Certainty sampling units - 1,132 (5% of all certainty units)
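The counts on this slide can be cross-checked with a few lines of arithmetic. The implied universe sizes below are back-calculated from the rounded percentages on the slide, so they are rough approximations, not figures reported in the source.

```python
noncertainty_failures = 2_637
certainty_failures = 1_132

# The two components should add up to the reported sampling unit failures.
total_sampling_unit_failures = noncertainty_failures + certainty_failures

# Implied universe sizes from the rounded percentages (approximate only).
implied_noncertainty_units = noncertainty_failures / 0.0007  # 0.07%
implied_certainty_units = certainty_failures / 0.05          # 5%

# Relative change in establishment failures from BSR-2K to BSR-06.
estab_change_pct = (126_000 - 132_000) / 132_000 * 100
```

The components sum to the reported 3,769, and establishment-level failures fell by roughly 4.5 percent relative to BSR-2K; the larger saving came from reviewing only the establishments that belong to failing sampling units.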
15 Review and Correction of Edit Failures Establishments of sampling unit edit failures were reviewed based on the priority assigned from the establishment edits. Corrections were made as needed. If no correction was made, sampling unit was marked so it would not appear as a failure in subsequent edit runs. Edits were rerun and effects recomputed.
16 Sample Sizes Before and After Data Editing
Columns: Survey, Sample Size Before Data Editing, Sample Size After Data Editing, Percent Change in Sample Size
MRTS 13,460 12, %
MWTS 6,834 6, %
SAS 70,710 61, %
90,554 80, %
17 Ratio Edit Bounds in Previous BSR
Columns: Ratio, BSR-2K Upper Bound, BSR-2K Lower Bound
Administrative Payroll / Census Payroll: 99th percentile, 1/99th percentile
Census Receipts / β * Census Payroll: 99th percentile, 1/99th percentile
Census Inventory / Census Receipts:
MOS / Census Receipts: 10, 1/10
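A minimal sketch of percentile-based ratio bounds in the BSR-2K style. It assumes that "1/99th percentile" means the reciprocal of the 99th percentile, which matches the fixed 10 and 1/10 pair shown for MOS / Census Receipts; the function names are hypothetical.

```python
from statistics import quantiles


def percentile_bounds(ratios, pct=99):
    """Return (lower, upper): upper is the pct-th percentile of the ratios,
    lower is its reciprocal (an assumed reading of the slide)."""
    cuts = quantiles(ratios, n=100, method="inclusive")
    upper = cuts[pct - 1]
    return 1.0 / upper, upper


def fails_ratio_edit(numerator, denominator, lower, upper):
    """True when the ratio falls outside the [lower, upper] acceptance range."""
    ratio = numerator / denominator
    return ratio < lower or ratio > upper


# Hypothetical ratios 0.1, 0.2, ..., 10.1 for one group of establishments.
ratios = [k / 10 for k in range(1, 102)]
lower, upper = percentile_bounds(ratios)

# Fixed bounds, as on the slide for MOS / Census Receipts.
mos_flag = fails_ratio_edit(1100, 100, 1 / 10, 10)
```

Tying the lower bound to the reciprocal of the upper bound keeps the acceptance range symmetric on a log scale, so a ratio that is too small is treated the same way as one that is too large by the same factor.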
18 Review of Ratio Edit Bounds For BSR-06: –Plain Vanilla (PV) was examined for determining ratio bounds –establishments were grouped, within each sampling recode, by whether they were single-unit or multiunit establishments
19 Establishment Edit Bounds for BSR-06 PV used for the ratios: –Census Receipts / β * Census Payroll –Census Inventory / Census Receipts BSR-2K method used for the ratios: –Administrative Payroll / Census Payroll –MOS / Annualized Census Receipts
20 Edit Tracking System Which establishments were corrected When corrections were made Which variables were corrected The magnitude of the corrections.
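The four items tracked above suggest one log record per correction. The sketch below shows one way such a record and two simple summaries could look; the class, its attribute names, and the sample data (including the dates) are hypothetical and not taken from the actual Edit Tracking System.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Correction:
    estab_id: str        # which establishment was corrected
    variable: str        # which variable was corrected
    old_value: float
    new_value: float
    corrected_on: date   # when the correction was made

    @property
    def magnitude(self) -> float:
        """Signed size of the correction."""
        return self.new_value - self.old_value


def corrections_by_variable(log):
    """Count corrections per variable."""
    return Counter(c.variable for c in log)


def share_of_establishments_corrected(log, total_establishments):
    """Fraction of establishments with at least one correction."""
    return len({c.estab_id for c in log}) / total_establishments


# Hypothetical log entries.
log = [
    Correction("E1", "census_receipts", 5_000_000.0, 500_000.0, date(2006, 1, 9)),
    Correction("E1", "census_payroll", 80_000.0, 120_000.0, date(2006, 1, 9)),
    Correction("E2", "census_receipts", 250_000.0, 300_000.0, date(2006, 2, 1)),
]
counts = corrections_by_variable(log)
share = share_of_establishments_corrected(log, total_establishments=1_000)
```

Summaries of this kind show which fields attract the most corrections and what share of establishments were touched at all.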
21 Some Results from the Tracking System Corrections made throughout the 7-week period Fifteen correctable fields. Most corrections to six fields. Number of establishments with at least one change as a percent of the total number of establishments was small. Changes made impacted the sample sizes.
22 Conclusions Satisfied with the sampling unit edits and the resulting reduction in sample sizes. Improved our edit bound determination process, but would like to see whether more improvements can be made. Will use the output of the edit tracking system to determine what improvements can be made to the extraction process for the next sample revision.
23 Contact Information Author: Carol S. King Presenter: Ruth E. Detlefsen