The Use of Administrative Sources for Statistical Purposes: Common Problems and Solutions


1 The Use of Administrative Sources for Statistical Purposes Common Problems and Solutions

2 Public Opinion The level of public concern about government departments sharing data varies from country to country There is usually some suspicion of the motives for data sharing Sometimes public opinion favours data sharing

3 Solutions Adopt and publish a code of practice following international standards Clearly stated limits and rules may help reduce concerns The principle of the “one-way flow” of sensitive data must be understood by all

4 Publish cost-benefit analyses of the use of different sources It may be possible to claim that data are more secure – No questionnaires sent by post – Fewer clerical staff, so fewer people with access to data

5 Public Profile Direct contact with the public via surveys helps raise the profile of the statistical office The use of administrative data can reduce contact with the public and awareness of the work of the statistical office

6 Solutions Effective ‘marketing’ of the statistical office and data outputs Greater involvement with education institutions, business groups, and other target customers

7 Units Administrative units may be different to statistical units: – Job / person – Tax unit / enterprise – Dwelling / household They may need to be converted to meet statistical requirements

8

9 Group Exercise Statistical and Administrative Units – How Many Enterprises?

10 Enterprise Definition “the smallest combination of legal units that is an organisational unit producing goods or services, which benefits from a certain degree of autonomy in decision-making ... An enterprise may be a sole legal unit.” Source: EU Regulation 696/93 on statistical units

11 Examples Taken from “The Impact of Diverging Interpretations of the Enterprise Concept” - a study prepared for Eurostat by Statistics Netherlands with input from Denmark and UK

12 Example 1 Two legal units in an enterprise group have different 4-digit NACE codes; both sell mainly to third parties outside the group. They share buildings, management, purchases and employees.

13 Answers NL: Combine into one enterprise –Intensity of shared production factors UK: Combine into one enterprise –Intensity of shared production factors DK: Two separate enterprises –Both sell more than 50% outside the group

14 Example 2 Four legal units: A and B have different activities, no combined purchases, but share buildings. C and D share buildings, employees, and purchases. All four present themselves as one firm.

15 Answers NL/DK: A and B are separate enterprises. Combine C and D into one enterprise –because A and B operate on market terms, whilst C and D share production factors UK: All four in one enterprise –because they present themselves as one firm

16 Example 3 Three legal units: All produce mainly for external customers, they share management and purchases, and represent themselves as one firm. A and B share a building. B and C have the same activity, share employees and capital goods and cannot supply separate data.

17 Answers NL: Combine into one enterprise –All share management and purchases, and represent themselves as one firm UK/DK: Combine B and C into one enterprise, A is a separate enterprise –Because B and C are horizontally integrated, and data are only available for these two together

18 Example 4 Twelve legal units form an enterprise group. Only one is active, the others have no employees.

19 Answers NL: One enterprise which only consists of the active unit –Because units which are not active are not part of an enterprise UK: One enterprise which consists of all units –Because there is no point having separate enterprises for non-active units DK: Each unit is a separate enterprise –There are no strong ties between the units

20 Solutions Automatic rules for simple cases –These must be clear and consistent Statistical “adjustments” –E.g. the statistical unit is persons. The administrative unit is jobs. We know from a survey that working people have, on average, 1.15 jobs. This adjustment factor can therefore be used to estimate persons in employment from jobs Profiling
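
The jobs-to-persons adjustment on this slide can be sketched as a simple division. This is a minimal illustration, assuming the survey-based factor of 1.15 jobs per working person quoted above; the function name and the job count are hypothetical.

```python
def estimate_persons_from_jobs(job_count: float, jobs_per_person: float = 1.15) -> float:
    """Estimate persons in employment from an administrative count of jobs.

    jobs_per_person is the average number of jobs held by a working person,
    taken from a survey (1.15 in the slide's example).
    """
    if jobs_per_person <= 0:
        raise ValueError("jobs_per_person must be positive")
    return job_count / jobs_per_person


# Hypothetical administrative source reporting 1,150,000 jobs.
estimated_persons = estimate_persons_from_jobs(1_150_000)
print(round(estimated_persons))
```

The factor itself is an estimate, so the result inherits the sampling error of the survey it comes from.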

21 Profiling Definition Profiling is a method to analyse the legal, operational and accounting structure of an enterprise group at national and world level, in order to establish the statistical units within that group, their links, and the most efficient structures for the collection of statistical data. Source: Eurostat Business Registers Recommendations Manual, Chapter 19

22 Profiling Gives a better understanding of complex unit structures It is expensive and time-consuming It needs trained staff It is a compromise based on a trade-off between quality, quantity and the resources available

23 Quality Quantity Resources

24 Business Profiling in the UK 14 staff Approx. 1,500 cases per year – Including 100 public sector Mix of desk and visit profiling – Approx. 200 visits per year Should profilers also collect data from key businesses?

25 Definitions of Variables Administrative data are collected according to administrative concepts and definitions Administrative and statistical priorities are often different, so definitions are often different

26 Unemployment Statistical definition (ILO) – Out of work – Available for work – Actively seeking work Administrative definitions are often based on those claiming unemployment benefits
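
The gap between the two definitions can be made concrete with two predicates over the same person record. This is only a sketch: the field names are hypothetical, and the administrative rule is reduced to "claims unemployment benefit", which is the common case the slide mentions rather than any specific country's rule.

```python
from dataclasses import dataclass


@dataclass
class Person:
    in_work: bool
    available_for_work: bool
    actively_seeking_work: bool
    claims_unemployment_benefit: bool


def ilo_unemployed(p: Person) -> bool:
    """Statistical (ILO) definition: all three criteria must hold."""
    return (not p.in_work) and p.available_for_work and p.actively_seeking_work


def administratively_unemployed(p: Person) -> bool:
    """Simplified administrative definition: based on benefit claims only."""
    return p.claims_unemployment_benefit


# Out of work and looking for a job, but not claiming benefit:
# counted by the ILO definition, missed by the administrative one.
job_seeker = Person(False, True, True, False)
# Claiming benefit but not currently available for work:
# counted by the administrative definition, not by the ILO one.
claimant = Person(False, False, False, True)
```

Counting the records where the two predicates disagree is one way to document the differences and their impact.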

27 Solutions Know and document the differences and their impact Use other variables to derive or estimate the impact of the difference Statistical adjustments during data processing

28 Classifications Two scenarios: 1. Same classification system 2. Different classification systems

29 Same Classification Used for different purposes May not be a priority variable for the administrative source Different classification rules Different emphasis, e.g. specific activity rather than main activity

30 Solutions Understand how classification data are collected and what they are used for Provide coding expertise, tools and training to administrative data suppliers

31 Different Classifications (or different versions of the same classification) Not always a 1 to 1 correlation between codes Tools are needed to convert codes from one classification to another

32 Solutions (1) Stress the advantages of using a common classification Offer expertise to help re-classify administrative sources Give early notice of classification changes and help implement them across government

33 Solutions (2) Use text descriptions to re-code administrative data Use probabilistic conversion matrices to convert codes –This results in individual unit classifications not always being correct, but aggregate data should be OK

34 Example of a conversion matrix (Approx. 22% probability of correct code!)
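
A probabilistic conversion matrix of the kind described on slide 33 can be sketched as follows. The codes and probabilities are invented for illustration and do not come from the matrix shown on the slide; each old code is mapped to a new code drawn at random with the matrix probabilities.

```python
import random

# Hypothetical conversion matrix: old code -> {new code: probability}.
# The probabilities for each old code must sum to 1.
CONVERSION = {
    "A1": {"X10": 0.7, "X20": 0.3},
    "B2": {"Y30": 1.0},
}


def convert_code(old_code: str, rng: random.Random) -> str:
    """Draw a new code for one unit according to the matrix probabilities."""
    targets = CONVERSION[old_code]
    new_codes = list(targets)
    weights = list(targets.values())
    return rng.choices(new_codes, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
units = ["A1"] * 10_000
converted = [convert_code(code, rng) for code in units]
share_x10 = converted.count("X10") / len(converted)
print(f"Share of A1 units converted to X10: {share_x10:.3f}")
```

Any single unit may receive the wrong new code, but across many units the shares approach the matrix probabilities, which is why aggregates are more reliable than individual classifications.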

35 Missing Data Impute where possible Many different imputation methods are used. Two common methods are: – Deductive Imputation – Hot-deck Imputation
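
The two methods named on this slide can be sketched briefly. Deductive imputation fills a value that the other values determine exactly; hot-deck imputation copies a value from a similar "donor" record in the same data set. The record layout, field names, and the choice of a random donor within the same class are assumptions made for the example.

```python
import random


def deductive_impute(record: dict) -> dict:
    """Fill part_time when total = full_time + part_time determines it."""
    filled = dict(record)
    if (filled["part_time"] is None
            and filled["total"] is not None
            and filled["full_time"] is not None):
        filled["part_time"] = filled["total"] - filled["full_time"]
    return filled


def hot_deck_impute(records: list, field: str, class_field: str,
                    rng: random.Random) -> list:
    """Copy a missing value from a random donor in the same imputation class."""
    donors = {}
    for rec in records:
        if rec[field] is not None:
            donors.setdefault(rec[class_field], []).append(rec[field])
    result = []
    for rec in records:
        filled = dict(rec)
        pool = donors.get(filled[class_field])
        if filled[field] is None and pool:
            filled[field] = rng.choice(pool)
        result.append(filled)
    return result


deduced = deductive_impute({"total": 10, "full_time": 7, "part_time": None})

businesses = [
    {"id": 1, "size_class": "small", "employment": 4},
    {"id": 2, "size_class": "small", "employment": None},
    {"id": 3, "size_class": "large", "employment": 250},
    {"id": 4, "size_class": "large", "employment": None},
]
imputed = hot_deck_impute(businesses, "employment", "size_class", random.Random(0))
```

Records with no donor in their class are left missing rather than filled from an unrelated class.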

36 Case Study Eurostat have a project to develop enterprise demography They want to estimate the impact of enterprise births Employment of new enterprises is used, but this variable is often missing or unreliable for new units

37 Solutions Calculate turnover per head ratios to impute missing variables Ratios based on “similar” units by classification and size Problems with outliers therefore trimming used, e.g. x% or mean of inter-quartile range
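
The effect of trimming on a turnover per head ratio can be illustrated with a small sketch. It reads "mean of inter-quartile range" as the mean of the ratios lying between the first and third quartiles, which is one possible interpretation; the ratios below are invented, with one deliberate outlier.

```python
from statistics import mean, quantiles


def interquartile_mean(values: list) -> float:
    """Mean of the values lying between the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    middle = [v for v in values if q1 <= v <= q3]
    return mean(middle)


# Hypothetical turnover per head ratios for "similar" units
# (same classification and size band); 900 is an outlier.
ratios = [60, 62, 65, 68, 70, 71, 74, 900]

plain_mean = mean(ratios)
trimmed_mean = interquartile_mean(ratios)
print(plain_mean, trimmed_mean)
```

The single outlier pulls the untrimmed mean far above the typical unit, while the trimmed value stays close to the bulk of the distribution.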

38 Possible Distributions

39 Turnover per head ratios in practice
ISIC    TPH
...     ...
45.11   95
45.12   68
45.21   149
...     ...
A business has ISIC class 45.12, turnover is 200, employment is missing. What is the imputed employment value?

40 Imputed employment is: 200 / 68 = 2.94, which rounds to 3
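
The worked example from slides 39 and 40 can be written as a lookup followed by a division. The three ratios are the ones shown on slide 39; the function name and the decision to round to the nearest whole number are assumptions for the sketch.

```python
# Turnover per head (TPH) ratios by ISIC class, as listed on slide 39.
TPH_BY_ISIC = {"45.11": 95, "45.12": 68, "45.21": 149}


def impute_employment(isic_class: str, turnover: float) -> int:
    """Impute missing employment as turnover divided by the class TPH ratio."""
    ratio = TPH_BY_ISIC[isic_class]
    return round(turnover / ratio)


# The business from slide 39: ISIC class 45.12, turnover 200.
print(impute_employment("45.12", 200))  # 200 / 68 = 2.94, rounds to 3
```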

41 Ratios such as turnover per head are also very useful for validating updates, matching and detecting errors!
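
Using the same ratios for validation might look like the sketch below: a unit is flagged when its own turnover per head is far from the ratio for its class. The tolerance factor of 5 is an arbitrary assumption chosen for the example, not a recommended threshold.

```python
def is_plausible(turnover: float, employment: float, class_ratio: float,
                 tolerance: float = 5.0) -> bool:
    """Check a unit's turnover per head against its class ratio.

    The unit passes when its ratio lies within class_ratio / tolerance
    and class_ratio * tolerance. Units with no employment fail the check.
    """
    if employment <= 0:
        return False
    unit_ratio = turnover / employment
    return class_ratio / tolerance <= unit_ratio <= class_ratio * tolerance


# Class ratio 68 (ISIC 45.12 on slide 39).
consistent_update = is_plausible(turnover=200, employment=3, class_ratio=68)
suspect_update = is_plausible(turnover=200_000, employment=3, class_ratio=68)
```

A failed check does not prove an error; it marks an update or a match for clerical review.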

42 Timeliness Two Issues Data arrive too late Data relate to a different time period

43 Data arrive too late Data from annual tax returns are often only available several months after the end of the tax year, so they are unsuitable for monthly or quarterly statistics Lags in registering “real world” events

44 UK VAT Birth Lags (1)

45 UK VAT Birth Lags (2) 2/3 of businesses are on the register within 2 months of start-up Mean lag = 4 months due to “outliers” Median = Approx. 40 days Some pre-register - negative lags
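
The difference between the mean and the median lag comes from a few very late registrations. The sketch below uses invented lag values in days, not the UK VAT data, purely to show how a handful of outliers moves the mean while leaving the median almost unchanged; negative values stand for pre-registration.

```python
from statistics import mean, median

# Hypothetical registration lags in days (negative = registered before start-up).
lags_days = [-10, 5, 20, 30, 40, 45, 55, 90, 400, 545]

mean_lag = mean(lags_days)
median_lag = median(lags_days)
share_within_60_days = sum(lag <= 60 for lag in lags_days) / len(lags_days)

print(f"mean={mean_lag}, median={median_lag}, within 60 days={share_within_60_days:.0%}")
```

When adjusting for lags, the median or the full distribution is usually a better guide than the mean for this reason.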

46 Solutions Understand the length and impact of lags Adjust data accordingly Look for ways to reduce lags where possible

47 Different Time Periods Administrative reference period (e.g. Financial/tax year) may not be the same as the statistical reference period Monthly average versus point in time (e.g. employment data)

48 Different Time Periods

49 Solutions Statistical corrections or estimations using data from other reference periods Be aware of possible biases when using point in time reference dates
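
One simple statistical correction is to apportion financial-year values to a calendar year in proportion to the months that overlap. The sketch assumes activity is spread evenly over each financial year and uses a financial year ending in March as the default; both are assumptions for illustration, and the figures are invented.

```python
def calendar_year_estimate(fy_ending_this_year: float,
                           fy_ending_next_year: float,
                           fy_end_month: int = 3) -> float:
    """Estimate a calendar-year total from two overlapping financial years.

    fy_end_month is the last month of the financial year (3 = March).
    Months January..fy_end_month are taken from the financial year ending
    in the calendar year; the remaining months from the one ending next year.
    """
    if not 1 <= fy_end_month <= 12:
        raise ValueError("fy_end_month must be between 1 and 12")
    months_first = fy_end_month
    months_second = 12 - fy_end_month
    return (months_first * fy_ending_this_year
            + months_second * fy_ending_next_year) / 12


# Hypothetical turnover: 1200 for the financial year ending in March of the
# target year, 1600 for the financial year ending in March of the next year.
estimate = calendar_year_estimate(1200, 1600)
print(estimate)
```

The even-spread assumption breaks down for seasonal activities, where monthly or quarterly indicators from another source would give better weights.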

50 Using data from different sources Data from different sources may not agree This may be due to: – Different definitions, classifications, time periods, ... – Errors

51 Using data from different sources Group Exercise

52

53 Solutions Data validation checks Benchmarking against other sources Priority rules for updating from different sources Knowledge of source quality

54 Benchmarking The map compares UK business register and “Yellow Pages” coverage for England and Wales Key: Blue = More businesses on register Red = More businesses in “Yellow Pages”

55 Priority Rules Different sources can be given different priorities for different variables To stop a “low priority” source overwriting a “high priority” one – Use source codes – Use priority / quality markers – Store dates with variables – Load data in reverse priority order
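
The priority rules above can be sketched as an update function that stores a source code and a date with each variable and refuses updates from a lower-priority source. The source names, the priority rankings per variable, and the tie-break on dates are all hypothetical choices for the example.

```python
from datetime import date

# Hypothetical priorities per variable: a higher number wins.
PRIORITY = {
    "employment": {"survey": 2, "tax": 1},
    "address": {"tax": 2, "survey": 1},
}


def apply_update(unit: dict, variable: str, value, source: str, as_of: date) -> bool:
    """Update one variable of a register unit, respecting source priorities.

    Returns True if the value was stored, False if the update was rejected.
    """
    ranks = PRIORITY[variable]
    current = unit.get(variable)
    if current is not None:
        if ranks[source] < ranks[current["source"]]:
            return False  # a low-priority source never overwrites a high one
        if ranks[source] == ranks[current["source"]] and as_of <= current["as_of"]:
            return False  # same source priority: keep the more recent value
    unit[variable] = {"value": value, "source": source, "as_of": as_of}
    return True


unit = {}
first = apply_update(unit, "employment", 12, "tax", date(2020, 1, 31))
second = apply_update(unit, "employment", 15, "survey", date(2019, 12, 31))
third = apply_update(unit, "employment", 11, "tax", date(2020, 6, 30))
```

Loading sources in reverse priority order, the last option on the slide, achieves a similar result without explicit checks, because the highest-priority source is loaded last and overwrites the others.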

56 Resistance to Change Statisticians may resist the use of administrative data because they: –Do not trust data unless they collect them themselves; –Focus on negative quality aspects; –Have an over-optimistic view of the quality of survey data; –Assume survey respondents comply with statistical norms.

57 Solutions Education (courses like this!) Take a wider view of all the dimensions of quality, and focus on the impact on users Determine the real relative quality of survey and administrative data Identify how cost savings can be used to improve quality and / or increase outputs

58 Change Management Risk of changes in: – Government / administrative policy – Thresholds – Definitions – Coverage – Systems

59 High-Risk Times Immediately after an election Change of minister Change of government policy Change in EU legislation and ... when you least expect it!

60 Solutions Manage the Risk by: –Legal provisions –Contractual agreements –Regular contact with administrative colleagues –Anticipating changes –Contingency plans

61 Questions: Do you have contingency plans? Have you ever needed contingency plans?

62 Better to be proactive beforehand than have to react after the event!

63 Summary There are many problems to overcome when using administrative sources Most can be reduced by effective planning and management The benefits are still often greater than the costs

64 Group Discussion What is the main problem with using administrative sources in your area of statistics?

