Spanish Census 2001 dissemination on the Internet using data warehouse techniques: the DWC project
Antonio ARGÜESO, INE (Spain)
Luxembourg, 21 October 2004
Population and Housing Census 2001: some figures
Data collection: November 2001 to March 2002
First results: July 2002. Population by age and sex at municipal level (8,000 municipalities)
Detailed results on the Internet: February 2004
The DWC project: a dissemination system based on data warehouse techniques
The "classical" output of a statistical operation is a predefined set of tables (plus metadata).
The output of the Population and Housing Census 2001 dissemination system is not only a set of tables, but also a four-step procedure for creating tables from a store containing microdata and cubes.
How to get the information? Four steps
Step 1: Create a table (or select one from a list of predefined tables)
Step 2: Choose the geographical scope:
  nation
  autonomous communities (17)
  provinces (50 + 2)
  municipalities (8,100)
  census sections (35,000; available next October)
Step 3: Choose the group: population, dwellings, buildings, households...
Step 4: Select variables or filters
...and navigate
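The four steps above can be read as the fields of a table request that is evaluated against stored microdata. The sketch below is only an illustration under stated assumptions: the record layout, the `TableRequest` class, the `build_table` function and the toy data are all hypothetical and are not taken from the DWC system itself.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical toy microdata: one record per person (invented values).
PERSONS = [
    {"province": "Madrid", "municipality": "Madrid", "sex": "F", "activity": "employed"},
    {"province": "Madrid", "municipality": "Getafe", "sex": "M", "activity": "employed"},
    {"province": "Madrid", "municipality": "Getafe", "sex": "F", "activity": "student"},
    {"province": "Sevilla", "municipality": "Sevilla", "sex": "F", "activity": "employed"},
    {"province": "Sevilla", "municipality": "Sevilla", "sex": "M", "activity": "student"},
]

# Hypothetical store: one list of records per group (step 3).
STORAGE = {"persons": PERSONS}


@dataclass
class TableRequest:
    title: str        # step 1: the table being created
    geo_level: str    # step 2: geographical scope, e.g. "province"
    universe: str     # step 3: group (persons, dwellings, ...)
    variables: tuple  # step 4: variables to cross
    filters: dict = field(default_factory=dict)  # step 4: optional filters


def build_table(request, storage):
    """Count records per (geographical unit, variable values) cell."""
    counts = Counter()
    for record in storage[request.universe]:
        # Keep only records that satisfy every filter.
        if all(record[key] == value for key, value in request.filters.items()):
            cell = (record[request.geo_level],) + tuple(
                record[name] for name in request.variables
            )
            counts[cell] += 1
    return counts


request = TableRequest(
    title="Employed persons by province and sex",
    geo_level="province",
    universe="persons",
    variables=("sex",),
    filters={"activity": "employed"},
)
table = build_table(request, STORAGE)
for cell, count in sorted(table.items()):
    print(cell, count)
```

Changing `geo_level` to `"municipality"` re-aggregates the same records at a finer level, which is one simple way to picture the "navigate" step.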
The DWC project: some figures
Calendar:
  Tendering process: March to July 2003
  First phase: on the Internet in February 2004 (all data on persons, dwellings and buildings at municipal level)
  Second phase: on the Internet in July 2004 (remaining data at municipal level: households)
  Third phase: on the Internet in October 2004, scheduled (all data below municipal level: 33,000 sections)
Costs:
  1 M € development (2003-2004)
  0.12 M € annual licenses
  0.2 M € improvements in 2005
The team:
  INE: about 8 people involved (2 full-time equivalents)
  Contractors: 8 to 10 people involved (5 full-time)
The DWC project: statistical confidentiality issues
1. Limits on the number of variables crossed

  Size of smallest municipality in a query    Max. no. of variables crossed (1)
  > 20,000 inhabitants                        No limit
  From 101 to 20,000 inhabitants              3 variables
  < 100 inhabitants                           1 variable

  (1) Sex and age group are not included: "1 variable" means 1 variable + sex + age group.

2. Different levels of detail for variables according to the size of the geographical units
  Example: age
    0, 1, 2, 3, ... (municipalities > 100 inhabitants)
    0-4, 5-9, 10-14, ... (all municipalities)
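The two confidentiality rules on this slide can be sketched as simple checks applied before a query runs. This is a minimal illustration, not the DWC implementation: the function names and the variable names "sex" and "age_group" are hypothetical, and a municipality of exactly 100 inhabitants is assumed to fall in the most restrictive band, in line with the "> 100 inhabitants" condition in the age example.

```python
# Hypothetical names for the two variables that do not count towards the limit.
FREE_VARIABLES = {"sex", "age_group"}


def max_crossed_variables(smallest_population):
    """Maximum number of crossed variables (sex and age group excluded).

    Returns None when there is no limit.
    Assumption: exactly 100 inhabitants is treated like the smallest band.
    """
    if smallest_population > 20_000:
        return None
    if smallest_population > 100:
        return 3
    return 1


def query_allowed(variables, municipality_populations):
    """Check a query against the limit set by its smallest municipality."""
    crossed = [name for name in variables if name not in FREE_VARIABLES]
    limit = max_crossed_variables(min(municipality_populations))
    return limit is None or len(crossed) <= limit


def finest_age_detail(smallest_population):
    """Finest age breakdown available for the smallest municipality queried."""
    if smallest_population > 100:
        return "single years"
    return "five-year groups"


# One counted variable plus sex and age group is allowed even when the
# query includes a municipality of 50 inhabitants.
print(query_allowed(["sex", "age_group", "activity"], [50, 30_000]))
# Two counted variables are not allowed for the same query scope.
print(query_allowed(["activity", "education"], [50, 30_000]))
```

Note that the limit depends on the smallest municipality in the query, so adding one very small municipality to a query restricts the whole table.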
The project: some conclusions / recommendations
1. DW techniques are not suitable for all projects: select the first project carefully.
2. Have enthusiastic leaders!
3. Cleaning and imputation: respect the timetable.
4. Statistical confidentiality requirements: define them precisely.
Thank you!