Published by Matteo Deverell. Modified over 9 years ago.
1
Eurostat Statistical Disclosure Control
2
Presented by Peter-Paul de Wolf, Statistics Netherlands (CBS)
3
Content
Introduction
What’s the problem?
– Specific for business statistics
Formalising the problem
What to do?
– Methods
– Software
Summary
4
Introduction
General definition of confidential data:
Data cannot be published “as is”
» By law (e.g. statistical law)
» Sensitive data (what’s sensitive?)
» Respondent considers it confidential
» …
5
Introduction
Physical protection
– Entrance
– Network
Legal protection
– Oath
Statistical Disclosure Control
– Protection of statistical output
6
What’s the problem?
Statistical output
Microdata
– Not often in case of business data
– Obvious: each record represents a single respondent
Tabular data
– In business data often magnitude tables
– Sometimes frequency tables
– But: aggregated data?!?!?!?
7
What’s the problem (frequency table)
Cell value itself not sensitive:
– All contributions are equal (1)
Spanning variables
– Identifying, e.g. NACE, Region
– Sensitive, e.g. “environmental offence” (illegal dumping of waste, illegal fishing, oil spills, …)
8
Example: number of ship-owners

          Environmental offence
Region    Yes    No    Total
…
A           9     0        9
...
9
What’s the problem (frequency table)
Example: number of ship-owners

          Environmental offence
Region    Yes    No    Total
…
B          14     2       16
...
10
What’s the problem (frequency table)
Example: number of ship-owners

          Environmental offence
Region    Yes    No    Total
…
C           1     1        2
...
11
What’s the problem (magnitude table)
Turnover (10^6 €) of instrument producing companies

          Region
          A             B             C             Total
Harps       58   151      47   123      36    98     141   372
Organs      71    16     124    21      24     9     219    46
Pianos      92     5     157     2      59     1     308     8
Other      800   302     934   362     651   287    2385   951
Total     1021   474    1262   508     770   395    3053  1377
12
What’s the problem (magnitude table)
Turnover (10^6 €) of instrument producing companies

          Region
          A             B             C             Total
Harps       58   151      47   123      36    98     141   372
Organs      71    16     124    21      24     9     219    46
Pianos      92     5     157     2      59     1     308     8
Other      800   302     934   362     651   287    2385   951
Total     1021   474    1262   508     770   395    3053  1377

?
13
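Formalising the problem
Suppose cell (Piano, A) consists of
Company X: 81 × 10^6 €
Company Y: 5 × 10^6 €
Other three: 2 × 10^6 € each
Total: 92 × 10^6 €
Company Y can subtract its own contribution from the published total: 92 - 5 = 87, which is within 7.4% of X's true turnover of 81!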
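The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the dictionary keys (`other_1` and so on) are hypothetical labels for the three unnamed companies, and all amounts are in 10^6 €.

```python
# Contributions to cell (Piano, A) in 10^6 EUR, as given on the slide.
# The keys other_1..other_3 are hypothetical labels for the "other three".
contributions = {"X": 81, "Y": 5, "other_1": 2, "other_2": 2, "other_3": 2}

# The published cell value is the sum of all contributions.
cell_total = sum(contributions.values())

# Company Y knows its own turnover and subtracts it from the published total.
# The result is an upper bound on the turnover of the largest company, X.
upper_bound_for_x = cell_total - contributions["Y"]

# Relative overestimate of X's true turnover.
relative_error = (upper_bound_for_x - contributions["X"]) / contributions["X"]

print(cell_total, upper_bound_for_x, f"{relative_error:.1%}")  # 92 87 7.4%
```

The smaller the combined turnover of the remaining companies, the closer Y's estimate gets to X's true value.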
14
Formalising the problem
General, objective rules needed
Threshold rule
Dominance rule or (n,k)-rule
p%-rule
p%-rule is favoured over (n,k)-rule and implies minimum of 3 contributors
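The slide only names the rules, so the following is a minimal sketch using their commonly used formulations. It assumes non-negative contributions. The function names and the default parameter values (`min_contributors=3`, `n=1`, `k=75`, `p=10`) are illustrative assumptions, not values taken from the presentation.

```python
def threshold_rule(contributions, min_contributors=3):
    """Sensitive if the cell has fewer than min_contributors contributors."""
    return len(contributions) < min_contributors


def dominance_rule(contributions, n=1, k=75.0):
    """(n,k)-rule: sensitive if the n largest contributions together
    exceed k% of the cell total."""
    ordered = sorted(contributions, reverse=True)
    total = sum(ordered)
    return total > 0 and sum(ordered[:n]) > k / 100 * total


def p_percent_rule(contributions, p=10.0):
    """p%-rule: sensitive if the total minus the two largest contributions
    is less than p% of the largest contribution."""
    ordered = sorted(contributions, reverse=True)
    if not ordered:
        return False
    largest = ordered[0]
    second = ordered[1] if len(ordered) > 1 else 0
    remainder = sum(ordered) - largest - second
    return remainder < p / 100 * largest


# Cell (Piano, A) from the slides, in 10^6 EUR.
piano_a = [81, 5, 2, 2, 2]
print(threshold_rule(piano_a))      # False: five contributors
print(dominance_rule(piano_a))      # True: 81 exceeds 75% of 92
print(p_percent_rule(piano_a))      # True: remainder 6 is below 8.1
print(p_percent_rule([50, 50]))     # True: remainder is 0
```

The last call shows why the p%-rule implies a minimum of 3 contributors: with one or two contributors the remainder is zero, so the cell is always flagged.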
15
What to do?
Redesign table
– Combine rows/columns
– Define different categories
Rounding
Add noise
Cell suppression
16
          Region
          A       B       C       D       Total
Harps       58      47      36      89     230
Organs      71     124      24      31     250
Pianos      92     157      59      28     336
Other      800     934     651     742    3127
Total     1021    1262     770     890    3943
17
Cell suppression

          Region
          A       B       C       D       Total
Harps       58      47      36      89     230
Organs      71     124      24      31     250
Pianos      92     157      59      28     336
Other      800     934     651     742    3127
Total     1021    1262     770     890    3943

X X X
18
Cell suppression

          Region
          A       B       C       D       Total
Harps       58      47      36      89     230
Organs      71     124      24      31     250
Pianos      92     157      59      28     336
Other      800     934     651     742    3127
Total     1021    1262     770     890    3943

X X X X XX
19
Cell suppression

          Region
          A       B       C       D       Total
Harps       58      47      36      89     230
Organs      71     124      24      31     250
Pianos      92     157      59      28     336
Other      800     934     651     742    3127
Total     1021    1262     770     890    3943

X X X XX X X X X
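The growing number of X marks across these slides suggests that suppressing only the sensitive cells is not enough. A minimal sketch of why, assuming (hypothetically, since the positions of the X marks are not recoverable from the transcript) that only cell (Pianos, A) is suppressed while the rest of the row and its total are published:

```python
# Row "Pianos" from the table in the slides (regions A-D) and its row total.
row = {"A": 92, "B": 157, "C": 59, "D": 28}
row_total = 336

# Assumption for illustration: only cell (Pianos, A) is suppressed.
published = {region: (None if region == "A" else value)
             for region, value in row.items()}

# An intruder recovers the suppressed value by subtracting the published
# cells from the published row total.
recovered = row_total - sum(v for v in published.values() if v is not None)

print(recovered)  # 92
```

Additional cells in the same row and column therefore have to be suppressed as well, so that the sensitive value can no longer be derived exactly from the marginal totals.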
21
Software
Latest version can be found on http://neon.vb.cbs.nl/casc
New Open Source version available end 2014
22
Contact/info
Glossary, handbook, project info
– http://neon.vb.cbs.nl/casc
Wiley book
pp.dewolf@cbs.nl