European Conference on Quality in Official Statistics, Rome, July 2008 Community Innovation Survey: a Flexible Approach to the Dissemination of Microdata.

1 European Conference on Quality in Official Statistics, Rome, July 2008 Community Innovation Survey: a Flexible Approach to the Dissemination of Microdata Files for Research Daniela Ichim

2 Outline
Dissemination of Microdata Files for Research
Risk assessment
Disclosure limitation
Data quality
–Record linkage
–Data utility

3 Confidentiality against Dissemination
Find the right balance!
Disclosure scenarios

4 Community Innovation Survey
IDENTIFYING VARIABLES (STRUCTURAL VARIABLES)
–Nace
–Nuts
–Size
–Turnover (TURN)
CONFIDENTIAL VARIABLES (VARIABLES INVOLVED IN ANALYSES)
–Expenditures in innovation (RTOT, …)
–Number of patents, …

5 Confounding
[Diagram: categorical and numerical variables; units classified as safe or unsafe; k-anonymity]
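The k-anonymity idea on this slide can be sketched in Python: a record counts as safe when at least k records share its combination of categorical identifying variables. This is a minimal illustration, not the procedure used for the survey. The variable names follow slide 4; the sample records, the function name `unsafe_records`, and the threshold k = 2 are assumptions made up for the example.

```python
from collections import Counter

def unsafe_records(records, keys, k):
    """Return indices of records whose key combination occurs fewer than k times."""
    combos = [tuple(r[key] for key in keys) for r in records]
    counts = Counter(combos)
    return [i for i, c in enumerate(combos) if counts[c] < k]

# Hypothetical records; all values are invented for illustration only.
records = [
    {"nace": "A", "nuts": "IT1", "size": "small"},
    {"nace": "A", "nuts": "IT1", "size": "small"},
    {"nace": "A", "nuts": "IT2", "size": "large"},
    {"nace": "B", "nuts": "IT1", "size": "small"},
    {"nace": "B", "nuts": "IT1", "size": "small"},
]

# Only the third record has a unique combination of identifying variables.
flagged = unsafe_records(records, keys=("nace", "nuts", "size"), k=2)
print(flagged)
```

Coarsening the key set (for example, using only `nace`) lowers the number of flagged records, which is the trade-off between confidentiality and detail that the slides refer to.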

6 a) Given a threshold (on units)
b) Local Outlier Factor as a measure of the difference in density between a unit and its nearest neighbours
General risk function
[Formulas lost in extraction: distance between two units; density around a unit]
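The formulas of this slide did not survive in the transcript, so the following is only a sketch of the standard Local Outlier Factor definition (reachability distance, local reachability density, ratio of densities), not necessarily the exact risk function of the presentation. The function name `lof_scores`, the sample points, and k = 2 are assumptions. Ties among neighbours are broken by index, and duplicate points are not handled.

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor per point (simplified: exactly k neighbours each)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point, excluding the point itself
    neigh = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # k-distance: distance from a point to its k-th nearest neighbour
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours divided by the point's own density
    return [sum(lrd[j] for j in neigh[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical units described by two numerical variables; the last one is isolated.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (8.0, 9.0)]
scores = lof_scores(points, k=2)
print([round(s, 2) for s in scores])
```

Scores near 1 indicate a unit that sits in a region as dense as its neighbours' regions; a score well above 1 indicates an isolated unit, which is the kind of unit a risk measure would single out.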

7 Threshold - dissemination policy
Parameters
Cut-off point for density (LOF)
–quantiles
–automatic
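A quantile-based cut-off of the kind listed on this slide can be sketched as follows. The slide does not say which quantile or which quantile definition was used; the nearest-rank rule, the function name `flag_above_quantile`, the 80th percentile, and the scores below are all assumptions for illustration.

```python
def flag_above_quantile(scores, pct):
    """Return the nearest-rank pct-th percentile and the indices of scores above it."""
    ordered = sorted(scores)
    n = len(ordered)
    # ceil(pct * n / 100) computed in integer arithmetic to avoid float rounding
    rank = max(1, -(-pct * n // 100))
    cutoff = ordered[rank - 1]
    return cutoff, [i for i, s in enumerate(scores) if s > cutoff]

# Hypothetical density scores (for example, LOF values); invented for illustration.
lof_values = [0.9, 1.0, 1.1, 1.0, 1.2, 3.5, 0.95, 1.05, 6.0, 1.0]
cutoff, at_risk = flag_above_quantile(lof_values, pct=80)
print(cutoff, at_risk)
```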

8 Stratification variables
Analysis by Nace
[Plot panels: Nace A; all Nace]

9 Disclosure limitation
MFR (microdata file for research): selective masking
k-anonymity
Nearest neighbour
Micro-aggregation on tails
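The slide names micro-aggregation on tails without giving details. One common univariate variant can be sketched as follows, assuming fixed-size groups of consecutive sorted values in the upper tail, each replaced by its group mean. The function name `microaggregate_upper_tail`, the parameters `k` and `tail_size`, and the turnover values are hypothetical.

```python
def microaggregate_upper_tail(values, k, tail_size):
    """Replace the tail_size largest values by means of groups of at least k values."""
    if tail_size < k:
        raise ValueError("tail_size must be at least k")
    order = sorted(range(len(values)), key=lambda i: values[i])
    tail = order[-tail_size:]  # indices of the largest values, in ascending order
    masked = list(values)
    start = 0
    while start < len(tail):
        end = start + k
        # merge a short remainder into the last group so every group has >= k members
        if len(tail) - end < k:
            end = len(tail)
        group = tail[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean
        start = end
    return masked

# Hypothetical turnover values; the five largest are treated as the tail.
turnover = [10, 12, 11, 500, 900, 700, 13, 650, 800]
masked = microaggregate_upper_tail(turnover, k=2, tail_size=5)
print(masked)
```

Because each value is replaced by the mean of its group, the unweighted total is unchanged, while every masked tail value is shared by at least k units.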

10 Quality assessment
Dissemination
Confidentiality

11 Risk measure assessment
Quality of the external database D_E
Chambers of Commerce database
Record linkage

12 Record linkage

M* = 3      1 equal unit   less than 3 units   less than 3 units   less than 3 units
            within 10%     within 10%          within 20%          within 30%
NACE        88%            84%                 97%                 100%
NACE EMP    63%            60% (a)             74% (a)             87% (a)

M* = 5      1 equal unit   less than 5 units   less than 5 units   less than 5 units
            within 10%     within 10%          within 20%          within 30%
NACE        88%            73%                 87%                 96%
NACE EMP    63%            58% (a)             70% (a)             80% (a)

(a) 100% for enterprises with more than 250 employees
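The kind of count behind such a table can be sketched as follows: for each released record, count the units in an external database that agree on the categorical keys and lie within a relative tolerance on a numerical variable, then report the share of records with fewer than M* candidates. The slide does not state which variable the tolerance applies to; using turnover is an assumption, as are the function names and all data values.

```python
def candidate_count(record, external, keys, tol):
    """Count external units equal on keys with turnover within tol (relative) of the record's."""
    return sum(
        all(e[key] == record[key] for key in keys)
        and abs(e["turn"] - record["turn"]) <= tol * record["turn"]
        for e in external
    )

def share_with_few_candidates(released, external, keys, tol, m):
    """Share of released records that have fewer than m linkage candidates."""
    few = sum(candidate_count(r, external, keys, tol) < m for r in released)
    return few / len(released)

# Hypothetical external register and released file; values invented for illustration.
external = [
    {"nace": "A", "turn": 100},
    {"nace": "A", "turn": 105},
    {"nace": "A", "turn": 108},
    {"nace": "B", "turn": 200},
    {"nace": "B", "turn": 400},
]
released = [{"nace": "A", "turn": 102}, {"nace": "B", "turn": 210}]

share = share_with_few_candidates(released, external, keys=("nace",), tol=0.10, m=3)
print(share)
```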

13 Information content analysis
Information preservation
Selective masking
–Data utility
–Only identifying and confidential variables were modified.
–Only records at risk were modified. The weights were not modified.
–Weighted totals were preserved (coherence with the already published information).
Some statistical indicators, such as variances, were slightly modified.

14 Information content analysis: data utility
Assessment of the perturbation impact on ratios like RTOT/TURN
[Plot panels: Original; Selective masking; Individual ranking]
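One simple way to quantify the perturbation impact on a ratio such as RTOT/TURN is the relative change of the ratio for each record between the original and the masked file. The sketch below assumes this measure; the slide does not say which measure was used, and the function name `ratio_changes` and the data are hypothetical.

```python
def ratio_changes(original, masked):
    """Relative change of RTOT/TURN for each record, original versus masked."""
    changes = []
    for o, m in zip(original, masked):
        r_o = o["rtot"] / o["turn"]
        r_m = m["rtot"] / m["turn"]
        changes.append(abs(r_m - r_o) / r_o)
    return changes

# Hypothetical records; only the second record's turnover was perturbed.
original = [{"rtot": 10, "turn": 100}, {"rtot": 50, "turn": 200}]
masked_file = [{"rtot": 10, "turn": 100}, {"rtot": 50, "turn": 250}]

changes = ratio_changes(original, masked_file)
print([round(c, 3) for c in changes])
```

Records that were not modified show a change of zero, so under selective masking the distribution of these changes is concentrated on the records at risk.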

15 Conclusions
1. Confidentiality: risk measure based on the k-anonymity principle
Flexible:
a) continuous and categorical variables
b) easy to implement
c) consistent for extreme choices
2. Data utility: selective protection to achieve k-anonymity
3. Comparable dissemination: control both the risk of re-identification and information loss
QUALITY DIMENSIONS

