Quality in statistics: the BR case

1 Quality in statistics: the BR case
Session: Quality indicators and quality measurement of Statistical Registers, 10 July 2008
Monica Consalvi, Giuseppe Garofalo, Caterina Viviano
Italian National Statistical Institute

2 Business Register vs Statistical Survey
BRs are statistical products with their own specificities:
Extensive use of administrative data
Heterogeneity and variability of inputs
Relevance of technological aspects
Output specificity (dissemination of micro data)
Heterogeneity of users
Continuous data updating

3 Business Register vs Statistical Survey – Quality specificities
Extensive use of administrative data
Compared with statistical surveys, the problem of quality is set in a different context: it can be resolved only ex post, because the data are known but not how they were generated.
Heterogeneity and variability of inputs
Quality indicators are needed for specific subsets of units and for different variables.
Relevance of technological aspects
Huge amount of data, complex procedures for data integration and for applying methodologies, changes over time in the applied rules (e.g. changes in classification, in the contents of administrative sources).
Output specificity
With micro data being disseminated, the assumption that "errors annul each other on average" no longer holds. In the BR, errors add to one another (e.g. over- and under-coverage).

4 Heterogeneity of users Continuous data updating
Heterogeneity of users
The BR's reference universe and updating period differ depending on whether it is used for STS or for SBS.
If Value Added is estimated with reference to the BR's universe, the quality of large units (e.g. activity code and size) is fundamental.
If Business Demography indicators take the BR as reference, the quality of the smaller units is very important.
Continuous data updating
Actual and spurious changes need to be told apart:
structural development of the economy: demographic aspects, changes in size, changes in economic activity;
process of revision of the register: the BR may acquire data referring to a previous time;
actual changes recorded at a later time: delay in recording birth/death, or in recording changes in characteristics, in the administrative registers.

5 The BR quality indicators
The system of quality indicators refers to three dimensions:
The phases of the BR's updating process
A framework of components of the quality
The factors for the building up of the indicators
Note: to evaluate a statistical product, a framework of quality criteria is needed against which evaluations can be carried out.

6 The phases of the BR's updating process
The BR is the result of a conceptual and physical integration of several administrative and statistical input sources:
1) Quality of the INPUT (input sources)
2) Quality of the PROCESS (matching, merging, editing, updating)
3) Quality of the OUTPUT

7 To monitor the BR quality the most frequently used components are:
A framework of components of the quality
To monitor the BR quality the most frequently used components are:
Coverage, in terms of both units and variables
Timeliness, in terms of delay in updating
Completeness
Accuracy
Note: the aim is the best possible quality on all these components. In practice there is a trade-off: maximising quality on, for example, completeness means waiting until all the sources are available, at the expense of timeliness…

8 The factors for the building up of the indicators
A methodological process for assessing variables coming from administrative sources. The quality of the ASIA register.
Five factors define a BR quality indicator: time, scope, subpopulation, variable and criterion.
The most important factor is the criterion: a method to evaluate, unit by unit, the correctness of the values of the variables of interest. The criteria are:
Compliance
Internal consistency
Temporal consistency
Metadata
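The five factors can be read as the fields of a small record that identifies one indicator. The sketch below is only an illustration under assumptions: the class and field names are hypothetical, the slide does not define "scope" or "subpopulation" further, and the example values are made up for the demonstration.

```python
from dataclasses import dataclass
from enum import Enum


class Criterion(Enum):
    """The four criteria named on the slide."""
    COMPLIANCE = "compliance"
    INTERNAL_CONSISTENCY = "internal consistency"
    TEMPORAL_CONSISTENCY = "temporal consistency"
    METADATA = "metadata"


@dataclass(frozen=True)
class IndicatorSpec:
    """Hypothetical container for the five factors of one BR quality indicator."""
    time: int            # reference year of the indicator
    scope: str           # assumption: what the indicator is computed on
    subpopulation: str   # subset of units the indicator refers to
    variable: str        # BR variable being assessed
    criterion: Criterion # how correctness is judged, unit by unit


# Illustrative values only, not taken from the Quality Declaration itself.
spec = IndicatorSpec(
    time=2005,
    scope="Tax Authority source",
    subpopulation="all units",
    variable="activity code",
    criterion=Criterion.INTERNAL_CONSISTENCY,
)
```

Making the record immutable keeps an indicator definition stable across reference years, so that only the computed value changes over time.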

9 Criteria (1) 1. Compliance 2. Internal consistency
1. Compliance
The value of a unit in the BR can be considered correct if it is sufficiently "close" to the reference value (external sources).
Compliance determines whether or not the BR complies with an external source.
Compliance comes close to reliability when the real value is not known.
2. Internal consistency
A value is deemed "correct" if it is coherent with the other variables of the same unit.
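As a minimal sketch of how these two criteria could be applied unit by unit: the functions below are hypothetical, the 10% tolerance is an arbitrary assumption standing in for "close", and the internal-consistency rule (an inactive unit should report no employees) is an invented example, not a rule stated on the slides.

```python
def is_compliant(br_value, reference_value, tolerance=0.10):
    """Compliance: the BR value is 'close' to the external reference value.

    'Close' is taken here as a relative difference within `tolerance`
    (an assumed threshold).
    """
    if reference_value == 0:
        return br_value == 0
    return abs(br_value - reference_value) / abs(reference_value) <= tolerance


def is_internally_consistent(unit):
    """Internal consistency: a value is coherent with other variables of the unit.

    Assumed example rule: a unit flagged as inactive should have zero employees.
    """
    return unit["active"] or unit["employees"] == 0


# Illustrative units (made-up values).
print(is_compliant(105, 100))
print(is_internally_consistent({"active": False, "employees": 12}))
```

Both functions return a yes/no verdict per unit, so an indicator can then be built as the share of units that pass.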

10 4. Quality without ‘witness’ (use of metadata)
3. Temporal consistency
Quality is defined on the basis of a comparison between two values in two different periods. Big changes over short time lags are classed as impossible or less plausible.
4. Quality without 'witness' (use of metadata)
A set of information included in the BR is used to measure quality without a reference value and with no element of comparison: variables of BR management or of the metadata system (validity date, estimation methodology, origin of data, data validation process).
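The temporal consistency criterion can be sketched as a plausibility check on the change between two periods. This is an illustration only: the function name and both thresholds are assumptions, since the slides do not say how "big" a change must be to count as implausible.

```python
def is_temporally_plausible(previous, current, max_rel_change=1.0, min_abs_change=10):
    """Return False when the change between two periods looks implausibly large.

    Assumed rule: small absolute changes are always accepted; larger ones are
    accepted only if the relative change stays within `max_rel_change`.
    """
    change = abs(current - previous)
    if change < min_abs_change:
        return True
    base = max(abs(previous), 1)  # avoid dividing by zero for empty units
    return change / base <= max_rel_change


# Illustrative employee counts for one unit in two consecutive periods.
print(is_temporally_plausible(20, 24))
print(is_temporally_plausible(20, 200))
```

Combining an absolute and a relative threshold avoids flagging very small units, where a change of a few employees is a large percentage but still plausible.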

11 Phase: Input / Component : timeliness / Factor: temp. consistency
Source: Social Security
Indicator: Percentage of records with declared employees by month

12 Phase: Input / Component : coverage / Factor: temp. consistency
Source: Chamber of Commerce
Indicator: Loss of information in dates of cessation
I(t)% = 1 - [N(t+1)/N(t)]
Table (cell layout lost in extraction). Headers: Supply's year (2001, 2002, 2003, 2004, 2005, 2006), BR reference year, I(t)%. Row label: Cessation date 2000. Values in extraction order: 19, 100, 20, 178, 31, -9,6, 14, 30.055, 36, -2,7, -, 35, -3,1, 4, -6,4, 28, 79.721.
I(2004) = ( )/ = -6,4
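The indicator I(t)% = 1 - [N(t+1)/N(t)] can be computed as in the sketch below. It rests on assumptions: N(t) is read as a count of records with a cessation date in supply t and N(t+1) as the corresponding count in the next supply, and the result is scaled to a percentage. The counts in the example are made up and are not the values of the slide's table.

```python
def loss_indicator(n_t, n_t_plus_1):
    """I(t)% = 100 * (1 - N(t+1)/N(t)), following the formula on the slide.

    n_t and n_t_plus_1 are assumed to be record counts from two consecutive
    supplies of the same administrative source.
    """
    if n_t <= 0:
        raise ValueError("N(t) must be positive")
    return 100.0 * (1.0 - n_t_plus_1 / n_t)


# Made-up counts: the second supply holds more records than the first,
# so the indicator is negative.
print(loss_indicator(100, 110))
```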

13 Phase: Process / Component : accuracy / Factor: metadata
Indicator: Variables Edit and Imputation

VAR     INDICATOR                     It=2005      It=2005%
NACE    N° edit                       [missing]    1,85 %
NACE    N° imputation                 87.628       43,31 %
NACE    N° edit without imputation    [missing]    56,69 %
Empl.   N° edit                       74.312       0,68 %
Empl.   N° imputation                 72.768       97,92 %
Empl.   N° edit without imputation    1.544        2,08 %
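The percentages in the Empl. rows can be checked from the counts: imputations and edits without imputation are shares of the total number of edits. The sketch below uses the three Empl. counts from the slide; the helper name is hypothetical. The share of edits over all records (0,68 %) is not recomputed because its denominator is not given.

```python
def share(part, total):
    """Percentage of `part` over `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Counts for the employment variable, reference year 2005, from the slide.
n_edit = 74312
n_imputation = 72768
n_edit_without_imputation = 1544

# The two sub-counts partition the edits.
assert n_imputation + n_edit_without_imputation == n_edit

print(share(n_imputation, n_edit))               # 97.92
print(share(n_edit_without_imputation, n_edit))  # 2.08
```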

14 Phase: Process / Component : accuracy / Factor: int. consistency
Source: Tax Authority
Indicator: out-of-date classification

INDICATOR                                                                              It=2005      It=2005%    Var_I[t-(t-1)]
N° records with out-of-date classification that are not decoded using NACE Rev 1.1    [missing]    9,53 %      0,84 %

15 Phase: Output / Component : accuracy / Factor: compliance
Source: SME sample survey
Indicator: differences in address and activity status

16 Phase: Output / Component : accuracy / Factor: temp. consistency
Indicator: coherence in activity status
$I_j = 100 - \left[ \frac{\sum_k x_{kj}\, e_k}{\sum_k x_{kj}} \times 100 \right]$
$I_{2005} = 97.8$
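A sketch of how this indicator could be computed, under loudly stated assumptions: the formula is read with sums over k in both numerator and denominator (the summation signs do not survive in the extracted text), x_kj is taken as a weight or count for group k in period j, and e_k as the share of incoherent cases in group k. None of these definitions is given on the slide, and the numbers below are invented.

```python
def coherence_indicator(x, e):
    """I_j = 100 - 100 * sum_k(x_kj * e_k) / sum_k(x_kj).

    x: weights or counts x_kj for each group k (assumed meaning)
    e: incoherence rates e_k in [0, 1] for each group k (assumed meaning)
    """
    if len(x) != len(e):
        raise ValueError("x and e must have the same length")
    total = sum(x)
    if total <= 0:
        raise ValueError("the sum of x must be positive")
    weighted_errors = sum(xi * ei for xi, ei in zip(x, e))
    return 100.0 - 100.0 * weighted_errors / total


# Invented example: four equally weighted groups, one fully incoherent.
print(coherence_indicator([1, 1, 1, 1], [0, 0, 0, 1]))
```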

17 The BR’s Quality Declaration (QD)
QD is a complex system of quality indicators.
QD is based on the concept of transparency: it supplies all the meaningful and useful tools to measure the different quality components at each stage of the process.
QD consists of rich documentation made up of a set of important direct and indirect indicators, with a time dimension for data, sources and variables.
QD contains:
metadata
a set of indicators that are easy to interpret

18 The BR’s Quality Declaration (QD)
Phases of the process, with components (C) and factors (F):
Input. C: timeliness, coverage, completeness. F: temporal consistency, internal consistency.
Process. C: coverage, accuracy. F: temporal consistency, internal consistency, metadata.
Output. C: timeliness, coverage, completeness, accuracy. F: compliance, internal consistency, metadata.
Note on Input: a qualitative assessment is given of the changes in the number of records over time, focusing on some relevant variables and with reference to the specific source.
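The phase-by-component-by-factor layout of the QD can be held in a small lookup structure, which makes it easy to ask, for instance, in which phases a given factor is used. The sketch below copies the entries listed on this slide; the variable and function names are hypothetical.

```python
# Components (C) and factors (F) per phase, as listed on the slide.
QD_MATRIX = {
    "input": {
        "components": ["timeliness", "coverage", "completeness"],
        "factors": ["temporal consistency", "internal consistency"],
    },
    "process": {
        "components": ["coverage", "accuracy"],
        "factors": ["temporal consistency", "internal consistency", "metadata"],
    },
    "output": {
        "components": ["timeliness", "coverage", "completeness", "accuracy"],
        "factors": ["compliance", "internal consistency", "metadata"],
    },
}


def phases_using(factor):
    """Return the phases whose indicators rely on the given factor."""
    return [phase for phase, spec in QD_MATRIX.items() if factor in spec["factors"]]


print(phases_using("metadata"))    # ['process', 'output']
print(phases_using("compliance"))  # ['output']
```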

19 The BR’s Quality Declaration (QD)
37 indicators have been identified:

20 The BR’s Quality Declaration (QD)

21 The BR’s Quality Declaration (QD)
The QD was disseminated to internal users for the first time in 2007.
Problems not yet solved:
Dissemination of a different version for external users, containing only metadata and indicators on the quality of output.
The need for a synthetic view of the proposed indicators through "compound indicators".
Internal users were involved in the discussion around the QD, but their suggestions have not yet been analysed in depth.

