Administrative Data and their Use in Economic Statistics
Vladimir Markhonko, United Nations Statistics Division
12/7/2007
Contents
- Definitions
- Advantages of using administrative data
- Common problems
- Quality of administrative data
- Using administrative data in practice
- Conclusions
Narrow Definition
Wider Definition
Administrative sources are sources containing information which is not primarily collected for statistical purposes.
Reasons for this Definition
- Privatisation of some government functions
- Growth of private sector “value-added re-sellers”
- User interest in new types of data
Benefits of Administrative Data
- Cost
  - Surveys / censuses are expensive, administrative data are often “free”
- Response burden
  - Reduced burden on data suppliers
  - Statistics can be compiled more frequently with no extra burden
Benefits of Administrative Data
- Coverage
  - Full coverage of target population
  - No survey errors and lower non-response
  - Better small-area data
- Timeliness (sometimes!)
- Public image
  - Making use of existing data can enhance the prestige of a statistical organisation by making it seem more efficient
Population Census Costs 2000-2001

Country    Total cost    Cost per person
UK         €367m         €6.2
Austria    €56m          €6.9
Finland    €0.8m         €0.2

Source: Eurostat, Documentation of the 2000 round of Population and Housing Censuses in the EU, EFTA and Candidate Countries; Table 22
Common Problems
- Administrative units do not always coincide with statistical units
  - Conversion via automatic rules for simple cases
  - Profiling for more complex cases
    - Gives a better understanding of complex business structures
    - Expensive and needs trained staff
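The split between automatic conversion and profiling can be illustrated with a minimal Python sketch. It assumes a hypothetical list of links between administrative units (for example VAT registrations) and statistical units (enterprises). The function name, the identifiers and the one-to-one rule are all assumptions made for illustration, not the rule set of any particular register.

```python
from collections import defaultdict

def convert_units(links):
    """Split administrative units into automatic conversions and profiling cases.

    links: list of (admin_id, enterprise_id) pairs (hypothetical layout).
    """
    admin_to_ent = defaultdict(set)
    ent_to_admin = defaultdict(set)
    for admin_id, ent_id in links:
        admin_to_ent[admin_id].add(ent_id)
        ent_to_admin[ent_id].add(admin_id)

    automatic, needs_profiling = {}, set()
    for admin_id, ents in admin_to_ent.items():
        ent_id = next(iter(ents))
        # Simple case (assumed rule): one admin unit linked to one enterprise, and vice versa.
        if len(ents) == 1 and len(ent_to_admin[ent_id]) == 1:
            automatic[admin_id] = ent_id
        else:
            # Many-to-one or one-to-many structures are left for manual profiling.
            needs_profiling.add(admin_id)
    return automatic, needs_profiling

# Invented example links, for illustration only.
links = [("VAT1", "E1"), ("VAT2", "E2"), ("VAT3", "E2"), ("VAT4", "E3"), ("VAT4", "E4")]
automatic, needs_profiling = convert_units(links)
```

Here only VAT1 is converted automatically. The other units belong to more complex structures and would be passed to trained profiling staff.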
Common Problems
- Different definitions and classifications
  - Administrative and statistical priorities are often different
  - Conversion matrices needed for different classifications
- Timeliness
  - Data arrive too late
  - Data relate to a different time period
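A conversion matrix between classifications can be sketched as a table of shares that redistributes values from administrative codes to statistical codes. The codes, shares and amounts below are invented for illustration. In practice the shares would have to be estimated, for example from units coded under both classifications.

```python
def convert_classification(values, matrix):
    """Redistribute amounts from administrative codes to statistical codes.

    values: {admin_code: amount}
    matrix: {admin_code: {stat_code: share}}, shares per admin code sum to 1.
    """
    result = {}
    for admin_code, amount in values.items():
        shares = matrix[admin_code]
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"shares for {admin_code} do not sum to 1")
        for stat_code, share in shares.items():
            result[stat_code] = result.get(stat_code, 0.0) + amount * share
    return result

# Hypothetical matrix: A1 maps wholly to S1, A2 is split between S1 and S2.
matrix = {"A1": {"S1": 1.0}, "A2": {"S1": 0.25, "S2": 0.75}}
turnover = {"A1": 100.0, "A2": 200.0}
converted = convert_classification(turnover, matrix)
```

Because each row of shares sums to 1, the converted total equals the original total. Only the distribution across codes changes.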
VAT Birth Lags
VAT Birth Lags
- 2/3 of businesses are on the register within 2 months of start-up
- Mean lag = 4 months due to “outliers”
- Median = approx. 40 days
- Some pre-register: negative lags
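The gap between mean and median lag can be reproduced with a small Python sketch. The lag values are invented. They were chosen only to mimic the pattern described above (a few very late registrations, one pre-registration) and are not taken from the VAT register.

```python
from statistics import mean, median

# Invented lags in days between start-up and registration.
# A negative value represents a business that registered before start-up.
lags_days = [-10, 20, 30, 40, 40, 55, 90, 150, 665]

mean_lag = mean(lags_days)      # pulled upwards by the single very late registration
median_lag = median(lags_days)  # barely affected by outliers

# Share of businesses on the register within 60 days of start-up.
share_within_60 = sum(lag <= 60 for lag in lags_days) / len(lags_days)
```

With these invented values the mean is 120 days and the median is 40 days, while two thirds of the businesses are registered within 60 days. For skewed lag distributions like this, the median is the more robust summary.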
Common Problems
- Change management
  - Risk of changes in government policy, thresholds, definitions, coverage etc.
  - Need contingency plans
- Data from multiple sources
  - Matching / linking issues
  - Data conflicts: priority rules
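A priority rule for conflicting values can be sketched as an ordered list of sources, where the first source with a non-missing value wins. The source names and their ordering below are hypothetical. The appropriate order depends on the variable and on what is known about each source.

```python
# Hypothetical ordering, most trusted source first for this variable.
PRIORITY = ["tax_register", "social_security", "survey"]

def resolve(candidates, priority=PRIORITY):
    """Return (value, source) from the highest-priority source that has a value.

    candidates: {source_name: value or None}
    """
    for source in priority:
        value = candidates.get(source)
        if value is not None:
            return value, source
    return None, None

# Invented conflicting employment figures for one business.
employment = {"survey": 12, "social_security": 15, "tax_register": None}
value, source = resolve(employment)
```

Returning the source along with the value keeps a record of where each figure came from. That record helps when a source later changes its definitions or coverage.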
Quality of Administrative Data
- There are many aspects to quality
- Administrative data will be better than survey data in some aspects but not in others
- It is important to look at overall quality
- Do the data meet the needs of users?
Three Aspects of Quality
- Quality of incoming data
- Quality of processing (matching, merging, ...)
- Quality of outputs: likely to be different to survey-based outputs, but are they better?
Quality Measurement
How to measure the quality of data from administrative sources?
- Comparing sources
- Quality check surveys
- Knowledge of source (metadata)
- Quality reports / templates
Quality Templates
Using Administrative Data
- Conversion to statistical concepts and definitions
- Linking / Matching
  - Exact Matching: linking records from two or more sources, often using common identifiers
  - Probabilistic Matching: determining the probability that records from different sources should match, using a combination of variables
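The two matching approaches can be contrasted in a short Python sketch. The record layout, field names and weights are assumptions. The score is a simple weighted string similarity built on the standard library's `difflib`, used here as a stand-in for a properly calibrated match probability, not as a full probabilistic linkage model.

```python
from difflib import SequenceMatcher

def exact_match(left, right, key="id"):
    """Link records that share the same non-empty identifier."""
    index = {rec[key]: rec for rec in right if rec.get(key)}
    return [(rec, index[rec[key]]) for rec in left if rec.get(key) in index]

def similarity(a, b):
    """String similarity between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec_a, rec_b, weights):
    """Weighted average similarity over several variables (assumed weights)."""
    total = sum(weights.values())
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items()) / total

# Invented records for illustration.
tax = [{"id": "123", "name": "Smith & Sons Ltd", "postcode": "AB1 2CD"}]
survey = [
    {"id": "123", "name": "Smith and Sons Limited", "postcode": "AB1 2CD"},
    {"id": None, "name": "Jones Haulage", "postcode": "ZZ9 9ZZ"},
]

exact_links = exact_match(tax, survey)
weights = {"name": 2, "postcode": 1}  # hypothetical: name counts twice as much
score_same = match_score(tax[0], survey[0], weights)
score_other = match_score(tax[0], survey[1], weights)
```

In practice, pairs scoring above an upper threshold would be accepted, pairs below a lower threshold rejected, and the remainder sent for clerical review. The thresholds would need to be tuned on the sources concerned.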
UK Business Register
Satellite Registers
Examples of Satellite Registers
- Tourism: hotel register (category, number of beds)
- Transport: vehicle or ship register (type, capacity)
- Distributive trades: buildings register (building size, sales area)
Conclusions
- Administrative sources should be defined in the widest sense
- There are many benefits in using administrative data, particularly reduced costs
- There are problems when using administrative data, but usually someone has found a solution
Conclusions
- Most problems can be reduced by effective planning and detailed knowledge of the source
- The benefits are often greater than the costs
Thank you for your attention.