Published by Jaden Daley. Modified over 11 years ago.
1
Generalized Census Processing System at the National Agricultural Statistics Service
Thomas Jacob, Carol House
National Agricultural Statistics Service, United States Department of Agriculture
International Conference on Establishment Surveys III, Montreal, June 18-21, 2007
2
Presentation Outline
- Census of Agriculture Overview
- 2002 Census Processing System
- Reasons for redesign
- Redesign initiatives
- Dashboard for continuous monitoring
- Can the system be more generalized?
- Acknowledgements
- Questions
3
Census of Agriculture Overview
- In 1997 the Census of Agriculture was transferred from the U.S. Bureau of the Census to NASS
- 2002: 3 million report forms mailed out
- 400+ system users in Headquarters and Field Offices
- Over 1,500 variables
- Over 110 published tables per state and for the U.S.
- Volume, volume, volume
4
2002 Census Processing System
- NASS contracted the National Processing Center (NPC) for:
  - Mail out, check in, capturing images and capturing data (OMR + ICR)
- SAS-based system for edit, imputation and analysis using Sybase and Redbrick databases
  - Edit specifications captured using decision logic tables (DLT)
  - Micro-level and macro-level analysis
  - Automated edit using DLT
  - Tried to implement Fellegi-Holt (FH) methodology and DLT as a two-tier edit
  - Goal of 80% of data not touched by analysts
OMR = Optical Mark Recognition; ICR = Intelligent Character Recognition
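The slide does not show what a decision logic table looks like, so the following is only a minimal sketch of the general idea: each row pairs a set of conditions with an action, and the automated edit applies every row whose conditions all hold. The `Rule` class, the `apply_dlt` function, the variable names and the two example edits are hypothetical and are not taken from the NASS system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    """One row of a hypothetical decision logic table."""
    name: str
    conditions: List[Callable[[Record], bool]]  # all must hold for the rule to fire
    action: Callable[[Record], None]            # modifies the record in place


def apply_dlt(record: Record, table: List[Rule]) -> List[str]:
    """Apply each rule in table order; return the names of the rules that fired."""
    fired = []
    for rule in table:
        if all(cond(record) for cond in rule.conditions):
            rule.action(record)
            fired.append(rule.name)
    return fired


def cap_harvested(rec: Record) -> None:
    # Hypothetical edit action: harvested acres cannot exceed cropland acres.
    rec["harvested_acres"] = rec["cropland_acres"]


def zero_sales(rec: Record) -> None:
    # Hypothetical edit action: negative sales are reset to zero.
    rec["sales"] = 0.0


TABLE = [
    Rule("harvested_gt_cropland",
         [lambda r: r["harvested_acres"] > r["cropland_acres"]],
         cap_harvested),
    Rule("negative_sales",
         [lambda r: r["sales"] < 0],
         zero_sales),
]

record = {"cropland_acres": 100.0, "harvested_acres": 120.0, "sales": 5000.0}
fired = apply_dlt(record, TABLE)
print(fired, record)
```

Keeping the rules as data rather than as hard-coded branches is what lets the same table drive both a batch edit and an interactive edit.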
5
What Worked Well
- Completed Census on schedule
- Questionnaire imaging
- Analysis: macro and micro tools
- % of records touched
- Disclosure routines worked well, but independently
6
Reasons for Redesign
- Increase system speed
  - Edit and imputation were extremely slow (could only edit 75 records at a time)
  - Issues with loads between databases
  - Slow communication lines
  - Inefficient database design
  - Nearest neighbor imputation used a sequential search
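To illustrate why a sequential donor search becomes a bottleneck, here is a minimal sketch of nearest neighbor imputation that scans every donor for each recipient. The function name, the Euclidean distance, the matching variables and the sample values are all assumptions made for illustration; the slides do not describe the distance function or the variables actually used.

```python
import math
from typing import List, Sequence


def nearest_donor(recipient: Sequence[float],
                  donors: List[Sequence[float]]) -> int:
    """Return the index of the closest donor by Euclidean distance.

    Sequential search: every donor is visited, so the cost per recipient
    grows linearly with the size of the donor pool.
    """
    if not donors:
        raise ValueError("donor pool is empty")
    best_index = 0
    best_dist = math.inf
    for i, donor in enumerate(donors):
        dist = math.dist(recipient, donor)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


# Hypothetical matching variables: (total acres, head of cattle).
donor_match = [(100.0, 10.0), (500.0, 40.0), (120.0, 12.0)]
# Hypothetical value to be imputed, taken from the chosen donor.
donor_values = [2000.0, 9000.0, 2600.0]

recipient_match = (118.0, 11.0)
idx = nearest_donor(recipient_match, donor_match)
imputed = donor_values[idx]
print(idx, imputed)
```

With R recipients and D donors this does R × D distance computations, which is one plausible reason a redesign would look for a more scalable search strategy. In practice the matching variables would usually be scaled before distances are compared.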
7
Reasons for Redesign
- Increase effectiveness and quality of the process
  - Minimize data capture errors
  - Time-consuming analysis
  - Inadequate dashboard for identifying influential records
  - Need for true interactive edit (IE)
  - Disclosure routine in old FORTRAN code
8
[Diagram: processing flow. Labeled components: CATI, Web, Paper Forms, SCAN, Images, KFI, Raw Data, Sybase/OLTP, Replication Server, Redbrick/OLAP, Batch Edit, DLT Edit, PRD, Donor Pool, Edit/Imputation/IE, Data Review, Interactive Edit, Analysis, Disclosure/Tabulation]
9
[Diagram: processing flow, repeated from slide 8]
10
Redesign Initiatives
- Multiple modes of data collection (CATI, Web, KFI, …), but the same module is used for loading data
- Key from Image (KFI) instead of scanning (OCR & OMR)
- Create an indicator denoting that additional information appeared on the report form (respondent notes, remarks, altered stubs)
- Create images for respondents who responded through CATI or Web
11
[Diagram: processing flow, repeated from slide 8]
12
Redesign Initiatives
- Batch edit in Unix, IE on the PC (local), using the same code and the same donors
- True interactive edit (IE)
- Dual screens for data review and image comparisons
- Improve donor search strategies: scalable using daemons and SAS/SHARE
- More use of Previously Reported Data (PRD)
13
[Diagram: processing flow, repeated from slide 8]
14
Redesign Initiatives
- Create new data models for both transactional (OLTP) and analytic (OLAP) databases
- Editing is in the OLTP environment; analysis is in the OLAP environment
- Introduce a replication server that moves and synchronizes data between OLTP and OLAP
- Perform more server-side processing using SAS/CONNECT to reduce interactive response times
OLTP = Online Transaction Processing; OLAP = Online Analytical Processing
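The division of labor between the editing (OLTP) store and the analysis (OLAP) store can be pictured with a toy change-log replication, sketched below. This is an illustration of the general pattern only; it does not describe how the replication server used by NASS works internally, and the classes `OltpStore` and `OlapStore` and the function `replicate` are hypothetical.

```python
from typing import Dict, List, Tuple

# (sequence number, record id, variable name, new value)
Change = Tuple[int, str, str, float]


class OltpStore:
    """Toy transactional store: edits are written here and logged in order."""

    def __init__(self) -> None:
        self.data: Dict[Tuple[str, str], float] = {}
        self.log: List[Change] = []

    def update(self, record_id: str, variable: str, value: float) -> None:
        self.data[(record_id, variable)] = value
        self.log.append((len(self.log) + 1, record_id, variable, value))


class OlapStore:
    """Toy analytic store: receives changes, never edited directly."""

    def __init__(self) -> None:
        self.data: Dict[Tuple[str, str], float] = {}
        self.last_applied = 0  # highest sequence number already applied

    def apply(self, changes: List[Change]) -> None:
        for seq, record_id, variable, value in changes:
            if seq <= self.last_applied:
                continue  # already applied; skipping makes re-delivery harmless
            self.data[(record_id, variable)] = value
            self.last_applied = seq


def replicate(source: OltpStore, target: OlapStore) -> int:
    """Ship changes the target has not yet seen; return how many were sent."""
    pending = [c for c in source.log if c[0] > target.last_applied]
    target.apply(pending)
    return len(pending)


oltp, olap = OltpStore(), OlapStore()
oltp.update("farm-001", "cropland_acres", 100.0)
oltp.update("farm-001", "cropland_acres", 95.0)
first = replicate(oltp, olap)
oltp.update("farm-002", "sales", 5000.0)
second = replicate(oltp, olap)
print(first, second, olap.data == oltp.data)
```

Because analysts query only the replicated copy, heavy analytic queries do not compete with interactive edits for the same tables.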
15
Redesign Initiatives
- Disclosure module converted to SAS/BASE
- The system is more metadata driven
- Provide quality control grids to monitor the effects of editing on the data
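The slides do not show the layout of a quality control grid. As one possible reading, the sketch below compares each variable's total before and after editing and counts how many records changed. The function name `edit_effect_grid`, the column names and the sample data are assumptions, and the sketch assumes the two lists hold the same records in the same order.

```python
from typing import Dict, List, Optional

Record = Dict[str, float]


def edit_effect_grid(before: List[Record], after: List[Record],
                     variables: List[str]) -> Dict[str, Dict[str, Optional[float]]]:
    """Summarize, per variable, how editing changed the data.

    Assumes before[i] and after[i] are the same record pre- and post-edit.
    """
    grid: Dict[str, Dict[str, Optional[float]]] = {}
    for var in variables:
        total_before = sum(r[var] for r in before)
        total_after = sum(r[var] for r in after)
        # Percent change is undefined when the pre-edit total is zero.
        pct = ((total_after - total_before) / total_before * 100
               if total_before else None)
        changed = sum(1 for rb, ra in zip(before, after) if rb[var] != ra[var])
        grid[var] = {
            "before": total_before,
            "after": total_after,
            "pct_change": pct,
            "records_changed": changed,
        }
    return grid


# Hypothetical data: one record had its acreage edited from 300 to 200.
before = [{"acres": 100.0, "sales": 10.0}, {"acres": 300.0, "sales": 30.0}]
after = [{"acres": 100.0, "sales": 10.0}, {"acres": 200.0, "sales": 30.0}]

grid = edit_effect_grid(before, after, ["acres", "sales"])
for var, row in grid.items():
    print(var, row)
```

A grid of this kind makes it easy to spot a variable whose total shifts sharply after an edit run, which is the sort of effect the monitoring is meant to catch.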
16
Dashboard for Continuous Monitoring
Implementing a Quality Control module to track four major areas in a proactive mode:
- Administrative
  - Management Information System (MIS) reports to track weekly progress
- Data
  - Monitor what the system is doing to the data
  - Tables, maps, graphs, outlier grids
  - Independent check of record-level inconsistencies
- Elapsed Times
  - Track how long key processes are taking to run
- System Stability
  - Track key indicators that can impact performance of databases, UNIX machines, SAS, etc.
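The "Elapsed Times" area could be fed by something as simple as a timing wrapper around each key process. The sketch below is a minimal illustration under that assumption; the names `timed` and `slow_runs`, the process name and the threshold are hypothetical and not part of the system described in the slides.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


@contextmanager
def timed(process_name: str, log: Dict[str, List[float]]) -> Iterator[None]:
    """Record the wall-clock duration of the enclosed block under process_name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the process raises, so failed runs are still visible.
        log[process_name].append(time.perf_counter() - start)


def slow_runs(log: Dict[str, List[float]], threshold_seconds: float) -> List[str]:
    """Return names of processes whose slowest run exceeded the threshold."""
    return sorted(name for name, runs in log.items()
                  if runs and max(runs) > threshold_seconds)


elapsed_log: Dict[str, List[float]] = defaultdict(list)

with timed("batch_edit", elapsed_log):
    sum(range(100_000))  # stand-in for a real processing step

print(dict(elapsed_log))
print(slow_runs(elapsed_log, threshold_seconds=60.0))
```

Collecting the durations per process name, rather than only the latest run, is what allows a dashboard to show trends and flag a step that is gradually slowing down.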
17
Can the system be more generalized?
- Wanted to have one system for surveys and censuses
- Metadata can handle both
- Imputation can handle different types of imputation
- A few surveys are using the system
- Survey analysts are reluctant to use DLT for survey edits
- FH methodology sent back to research for further evaluation
18
Acknowledgment
We want to thank each and every member of the 2007 Census Team for their tireless efforts to make the redesign initiatives a reality.
19
Questions?