Urban Audit
Quality Check of 2003 Urban Audit Data
Eurostat Working Group on Regional Statistics, 3-5 November 2004
Willibald Croi, Landsis g.e.i.e. (Luxembourg)
supported by Prof. Risto LEHTONEN, University of Jyväskylä (Finland)
Control Methods
- Checking of variable totals against provided aggregates
- Checking of indicator values against control limits
- Time line controls
- Interactive checks
Control Methods
Systematic controls:
- Uni- and multi-variate controls
- Controls at each spatial level
- Controls across spatial levels
- Controls for each year/period
- Controls across years/periods
Checking of variable totals against provided aggregates
Example: UK001 (London); 2001; Total Resident Population
DE1040V = DE1002V + DE1003V (total = male + female)
- City level: 7.172.091 = 3.468.793 + 3.703.298
- LUZ level: 11.624.807 = 5.644.620 + 5.980.187
- SCD level (e.g. UK001D05007 Highgate): 10.294 = 4.962 + 5.332
- National level: 59.623.406 = 29.370.634 + 30.252.772
- Kernel level: 2.766.065 = 1.340.599 + 1.425.466
(Tolerances for estimates or rounding included)
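The totals check above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the Urban Audit: the function name `total_matches` and the `rel_tol` parameter are assumptions, and the slide does not say how large the tolerance for estimates or rounding is. The figures are the London 2001 values from the example.

```python
def total_matches(total, parts, rel_tol=0.0):
    """Return True if the sum of the parts is within rel_tol of the total.

    rel_tol is a hypothetical relative tolerance for estimated or rounded
    figures; the actual tolerance is not given in the source.
    """
    diff = abs(total - sum(parts))
    return diff <= rel_tol * abs(total)


# (level, DE1040V total, DE1002V male, DE1003V female) for UK001, 2001
records = [
    ("City", 7_172_091, 3_468_793, 3_703_298),
    ("LUZ", 11_624_807, 5_644_620, 5_980_187),
    ("SCD (UK001D05007)", 10_294, 4_962, 5_332),
    ("National", 59_623_406, 29_370_634, 30_252_772),
    ("Kernel", 2_766_065, 1_340_599, 1_425_466),
]

# Levels where the total does not equal male + female
failures = [
    level
    for level, total, male, female in records
    if not total_matches(total, (male, female))
]
print(failures)
```

With a zero tolerance every level in the example passes, so `failures` stays empty; a non-zero `rel_tol` would only be needed for estimated or rounded data.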
Checking of indicator values against control limits
Fixed ranges of accepted values according to common sense
- Ex.: EC3060I Proportion of households reliant on Social Security
- Accepted values in the range of 6-49% (GHK control)
- Of 1269 values in total, 643 are out of range…
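A fixed-range check of this kind might look as follows. This is a minimal sketch under stated assumptions: the `LIMITS` table and the function `out_of_range` are hypothetical names, the spatial unit codes and values in `sample` are invented for illustration, and only the 6-49% range for EC3060I comes from the slide.

```python
# Accepted range per indicator, in percent (only EC3060I is from the source)
LIMITS = {"EC3060I": (6.0, 49.0)}


def out_of_range(indicator, values):
    """Return the (unit, value) pairs that fall outside the fixed range."""
    low, high = LIMITS[indicator]
    return [
        (unit, value)
        for unit, value in values.items()
        if not (low <= value <= high)
    ]


# Hypothetical values for three made-up spatial units
sample = {"UNIT_A": 12.5, "UNIT_B": 3.1, "UNIT_C": 55.0}
flagged = out_of_range("EC3060I", sample)
print(flagged)
```

A result in which about half of all values are flagged, as on the slide, suggests that a fixed range chosen by common sense can be too narrow for the data.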
Checking of indicator values against control limits
Acceptable ranges of indicator values calculated using descriptive statistics
- Mean and SD of indicator values calculated per period, per country, per group of countries (EU12, EU15)
- At all spatial levels (i.e. city, LUZ, SCD1 & 2)
- Iterative process (excluding obvious outliers, e.g. values >100% in proportions), depending on relevance and data availability
- Individual values compared to ranges of ±x*SD from the mean
Checking of indicator values against control limits
Acceptable range = Mean ± 3*SD (according to indicator)
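The statistical control limits can be illustrated with a short sketch. Several points are assumptions rather than part of the described procedure: the function names, the use of the sample standard deviation, and a single pre-filtering pass that drops impossible proportions before the mean and SD are computed (the slides describe an iterative process without giving its details). The input values are invented.

```python
from statistics import mean, stdev


def control_limits(values, k=3.0):
    """Return (low, high) = mean -/+ k * sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return m - k * s, m + k * s


def flag_outliers(values, k=3.0, max_valid=100.0):
    """Flag values outside mean +/- k*SD.

    Obvious outliers (proportions below 0 or above max_valid) are left out
    when computing the limits, but are still compared against them.
    """
    plausible = [v for v in values if 0.0 <= v <= max_valid]
    low, high = control_limits(plausible, k)
    return [v for v in values if not (low <= v <= high)]


# Hypothetical proportions (%) for one indicator, country and period
values = [10.0, 12.0, 11.0, 13.0, 9.0, 12.0, 11.0, 10.0,
          11.0, 12.0, 10.0, 11.0, 150.0]
flagged = flag_outliers(values)
print(flagged)
```

Excluding the impossible value before computing the limits matters: if 150% were included, it would inflate the SD and widen the acceptable range.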
Time line controls
- Average growth rates will be calculated and cross-checked with national growth rates
- Depending on the number of available data points
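One way to sketch the growth rate cross-check is shown below. The slides do not say how the average growth rate is defined or how large a deviation from the national rate is acceptable, so both the compound annual growth rate used here and the `max_gap` threshold are assumptions, as are the function names and the sample figures.

```python
def average_growth_rate(start_value, end_value, years):
    """Compound average annual growth rate between two observations."""
    if years <= 0 or start_value <= 0:
        raise ValueError("need a positive start value and a positive time span")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def deviates_from_national(unit_rate, national_rate, max_gap=0.05):
    """True if the unit's rate differs from the national rate by more than
    max_gap (a hypothetical threshold, in rate points)."""
    return abs(unit_rate - national_rate) > max_gap


# Hypothetical example: a value rising from 100 to 121 over two years
city_rate = average_growth_rate(100.0, 121.0, 2)
print(round(city_rate, 4))
print(deviates_from_national(city_rate, national_rate=0.02))
```

A flagged deviation is only a prompt for review, since a city can legitimately grow faster or slower than its country.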
Time line controls
Ex.: SA1023V Average price for a house per m², 1991 – 2000, Finland:
Interactive checks
Individual controls for values that cannot be covered by systematic procedures:
- Values not yet controlled (missing value of an age group for a specific spatial unit, too few years available, etc.)
- Cases where means and SD are not meaningful
Error Reports
- Each record "out of control" will be flagged "?"
- Lists will be discussed, checked and agreed by Eurostat before being sent to the NUACs
- Lists of values to be controlled will contain metadata on the control procedure
- NUACs are responsible for validating (updating or confirming) the data
- Guidelines concerning the procedure will be given
Schedule
Work plan in two phases:
Phase 1: October – December 2004
- Error localisation started with implementation of systematic controls
- Error reports expected for week 49 (end Nov.)
- Validated data from NUACs expected by Eurostat before the end of the year
Phase 2: May – June 2005
- Control of data received since October 2004
Thank you for your patience!
willibald.croi@landsis.lu