Published by Sidney Waldridge Modified over 9 years ago
1
Input Data Warehousing
Canada’s Experience with Establishment-Level Information
Presentation to the Third International Conference on Establishment Statistics
Montreal, QC, June 20, 2007
2
Overview
- Introduction of data warehousing as a concept
- Approaches to holding data
- Introduction to Statistics Canada’s Unified Enterprise Survey (UES) Program
- Centralized warehousing of UES data
- Example of the data warehouse at work
3
Subject-matter areas need or generate different types of information
- Data to support collection
  - Questionnaires and supporting metadata
  - Frame and sample information
  - Status of each respondent during collection
- Survey data
- Administrative data
- Post-collection processing
  - Edits (metadata)
  - Imputation specifications
  - Allocation specifications
  - Generation of “clean” datasets
- Tabulation of estimates/analysis of results
  - Value of estimate
  - Data quality indicators
  - Suppression patterns
  - Analysis of coherence
Input Data
4
Input Data Warehouse
A copy of statistical input data specifically structured for querying and reporting
- Collection
- Post-collection processing
- Tabulation of estimates
5
Approaches to organizing information holdings
- Decentralized: in a completely decentralized approach, each subject matter area maintains its own input data.
- Centralized: a centralized data warehouse contains all input data from all subject matter program areas. All program areas need to use common concepts and standards for classification; otherwise a concordance has to be established among their systems.
- These are extremes along a continuum.
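The concordance mentioned on this slide can be pictured as a lookup table from one area's local classification codes to a common classification. The sketch below is illustrative only: the codes, the `CONCORDANCE` table and the `harmonize` function are hypothetical and are not taken from the UES.

```python
# Hypothetical concordance: local classification code -> common code.
CONCORDANCE = {
    "LOCAL_A": "COMMON_1",
    "LOCAL_B": "COMMON_1",
    "LOCAL_C": "COMMON_2",
}


def harmonize(records, concordance):
    """Recode records to the common classification.

    Returns the recoded records plus the sorted list of local codes
    that have no entry in the concordance, so gaps are visible
    instead of being silently dropped.
    """
    recoded = []
    unmapped = set()
    for rec in records:
        code = rec["code"]
        if code in concordance:
            recoded.append({**rec, "code": concordance[code]})
        else:
            unmapped.add(code)
    return recoded, sorted(unmapped)


# Made-up input records from one subject matter area.
records = [
    {"unit_id": 1, "code": "LOCAL_A"},
    {"unit_id": 2, "code": "LOCAL_C"},
    {"unit_id": 3, "code": "LOCAL_Z"},
]
recoded, unmapped = harmonize(records, CONCORDANCE)
```

Reporting the unmapped codes separately is a design choice for the sketch: it makes the cost of not sharing common standards explicit.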
6
Centralized approach
Advantages
- Economies of scale should lead to reduced overall development and maintenance costs
- Some human resource issues are eased (knowledge and skills retention and transfer)
- Eases integration of data to support data analysis, coherence analysis, etc.
- Allows subject-matter divisions to specialize in data analysis rather than data management
7
Decentralized approach
Advantages
- Specialized subject matter expertise is readily available
- Subject matter areas do not depend on a central authority to make changes, so flexibility is increased
- Care and control of the data are clearly established
8
Questions to address in moving to a more centralized environment
- What purpose does it serve?
- What must be done to the statistical model to ensure compatibility with other data sources?
- What mechanisms need to be in place to ensure a productive client-service relationship?
- Who is custodian of the data?
- Do the benefits of moving to a more centralized environment truly outweigh the costs?
9
Statistics Canada and the Unified Enterprise Survey Program
In the late 1990s, Statistics Canada undertook a major program to improve the quality of the provincial economic accounts released by the Agency and of the annual business surveys that feed into those accounts.
These surveys were integrated in order to increase the quality of the data they produce in terms of:
- Consistency
- Coherence
- Breadth
- Depth
10
Features of the UES
- Improved frame (business register)
- Sampling made consistent across surveys, with improved coverage
- Harmonized content and common collection applications
- Administrative data are used instead of survey data where possible and where the data are of good quality
- Common post-collection processing systems
- Common storage of data
- Central contact management system
- Improvements in outputs
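The rule about preferring administrative data can be read as a simple per-variable selection. The sketch below is only one way to express it: the function name `choose_value`, its parameters and the quality flag are assumptions made for illustration, and the slides do not describe how the UES actually implements this choice.

```python
def choose_value(survey_value, admin_value, admin_quality_ok):
    """Pick the value to use for one variable of one unit.

    The administrative value is preferred when it exists and has been
    flagged as good quality (hypothetical flag); otherwise the survey
    value is used. The source of the chosen value is returned as well
    so that it can be tracked downstream.
    """
    if admin_value is not None and admin_quality_ok:
        return admin_value, "admin"
    return survey_value, "survey"


# Made-up example values.
print(choose_value(survey_value=1200, admin_value=1150, admin_quality_ok=True))
print(choose_value(survey_value=1200, admin_value=1150, admin_quality_ok=False))
print(choose_value(survey_value=1200, admin_value=None, admin_quality_ok=True))
```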
11
Moving to a more centralized environment
What is the purpose?
- The UES data warehouse forms a repository of all the files created through the processing phases of UES and accompanying metadata.
- This supports the work of analysts and survey managers in subject matter divisions, collection managers, statistical methodologists and users in the System of National Accounts.
12
Moving to a more centralized environment
What must be done to the statistical model to ensure compatibility with other data sources?
- The statistical model for UES surveys forced the harmonization of concepts, definitions and classifications across surveys
- Integration of survey and administrative data required the mapping of tax data to survey data (harmonized conceptually as well as characteristically)
13
Moving to a more centralized environment
What mechanisms need to be in place to ensure a productive client-service relationship?
- Project management structure for the UES that crosses functional boundaries
- Change management function to ensure seamless integration of surveys into UES
14
Moving to a more centralized environment
Who is custodian of the data?
- ESD controls access to all common systems.
- Subject matter divisions are exclusively responsible for dissemination, including the determination of aggregations and data suppressions (due to quality and confidentiality).
15
Moving to a more centralized environment
Do the benefits of moving to a more centralized environment truly outweigh the costs?
- Reduction in development costs
- Development of best practices that can be shared across the bureau
- Single point of access for input data improves security of all UES-related data
- Rationalization of hardware to minimize the number of servers
16
The UES Data Warehouse
The UES Warehouse is centrally managed within Enterprise Statistics Division
Major components of the data warehouse include:
- Metadata repository
- Processing metadata
- Central data store (CDS)
- External data
  - Data that originate outside UES but have been integrated in the UES framework
17
The UES Data Warehouse
Systems interfacing with the data warehouse:
- Unified Tracking and Retrieval Tool (USTART)
- Integrated Questionnaire Metadata System (IQMS)
- UES Processing Interface
- Working Estimation Environment (WEE) interface
- Macro-data adjustment Facility
18
Operational applications
- Operational monitoring
- Coherence analysis
- Baseline information for operational research
- Quality measures (e.g. response rate analysis)
- Integrated data analysis
19
Response rates in collection
22
Final response rates
23
The centralized system in action
Outcomes
The centralized input data warehouse provides a centralized tool that allows users to track performance on a consistent basis
- Same method
- Same source data
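The "same method, same source data" point can be illustrated with a single response-rate function applied to one shared table of collection statuses. This is a minimal sketch under stated assumptions: the record layout, the survey names, the status values and the unweighted definition of the rate (responding units divided by sampled units) are all hypothetical and are not drawn from the UES warehouse.

```python
from collections import defaultdict

# Hypothetical shared table of collection statuses (one row per sampled unit).
STATUS_RECORDS = [
    {"survey": "retail", "unit_id": 1, "status": "responded"},
    {"survey": "retail", "unit_id": 2, "status": "refused"},
    {"survey": "retail", "unit_id": 3, "status": "responded"},
    {"survey": "retail", "unit_id": 4, "status": "no_contact"},
    {"survey": "wholesale", "unit_id": 5, "status": "responded"},
    {"survey": "wholesale", "unit_id": 6, "status": "responded"},
    {"survey": "wholesale", "unit_id": 7, "status": "responded"},
    {"survey": "wholesale", "unit_id": 8, "status": "refused"},
]


def response_rates(records):
    """Return the unweighted response rate per survey.

    Every survey is measured with the same definition on the same
    table, which is what makes the rates comparable.
    """
    sampled = defaultdict(int)
    responded = defaultdict(int)
    for rec in records:
        sampled[rec["survey"]] += 1
        if rec["status"] == "responded":
            responded[rec["survey"]] += 1
    return {survey: responded[survey] / sampled[survey] for survey in sampled}


rates = response_rates(STATUS_RECORDS)
```

With the made-up records above, `rates` holds 0.5 for the retail survey and 0.75 for the wholesale survey.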
24
Conclusion
- The centralized data warehouse offers benefits to statistical programs
- There are a number of conditions that must be fulfilled for success
  - Purpose
  - Data compatibility
  - Client-service relationship
  - Custodian of data
  - Cost-benefit