1
Spontaneous recognition: Risk or distraction
Presentation by Felix Ritchie Bristol Business School Budapest
2
Spontaneous recognition (SR)
- recognising respondents in microdata
- examples:
  - street, gender, age
  - university department, gender, job title
  - University of Tasmania
  - Thurso power station
- need not be correct
- can lead to perceptions of insecurity
- important confidentiality risk
  - Scientific Use Files (SUFs) and Secure Use Files (SecUFs)
- much effort expended on eliminating SR
3
SR: meaningless concept?
- you can’t prove it doesn’t exist
- you can’t prove that it has occurred [unless IC occurs: see below…]
- it’s lawful
- it ignores human nature
- statistical protection excessive
4
Better: identity confirmation (IC)
- actively taking steps to confirm suspicions of identity
- Examples:
  - cross-checking with other variables
  - checking other data sources
  - speculating on the identity of respondents with others
5
Why is IC better?
- Focuses on the unlawful activity
- Easier to detect
- Fewer assumptions needed about ability to breach confidentiality
- Management interventions well suited to this
  - can be covered in contracts/training
  - no need to damage data unnecessarily
- Exploits knowledge of actual human behaviour
- Reinforces trust models
- Emphasis is on engagement in research community
- Greater researcher buy-in = lower risk
- Researchers encouraged to report SR, not punished
6
Does this matter? An example
- Community Innovation Survey: European business survey on R&D and innovation activities
- Old method:
  - Assumes intruder, external dataset for matching, SR
  - 100% of all continuous variables microaggregated
- New method:
  - Assumes idiot, potential for IC
  - additional guidelines for researchers issued
  - 1% of one continuous variable microaggregated
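To make the scale of the difference concrete, here is a minimal sketch of univariate fixed-size microaggregation: values are sorted, grouped with their k nearest neighbours, and each value is replaced by its group mean. This is an illustration only, not the procedure actually used for the Community Innovation Survey. The function names, the group size `k`, and the choice to protect the largest 1% of values are all assumptions; the slide does not say which 1% was treated.

```python
import math
from statistics import mean


def microaggregate(values, k=3):
    """Fixed-size univariate microaggregation (illustrative sketch).

    Sorts the values, forms groups of k neighbours, and replaces every
    value by the mean of its group. A short final group is merged into
    the previous one so that no group has fewer than k members.
    """
    if not values:
        return []
    if len(values) < k:
        raise ValueError("need at least k values to form one group")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    out = list(values)
    for group in groups:
        group_mean = mean([values[i] for i in group])
        for i in group:
            out[i] = group_mean
    return out


def microaggregate_top(values, k=3, fraction=0.01):
    """Microaggregate only the largest `fraction` of values (assumed rule).

    At least k values are always treated, so one full group can be formed.
    All other values are returned unchanged.
    """
    n_top = max(k, math.ceil(len(values) * fraction))
    order = sorted(range(len(values)), key=lambda i: values[i])
    top = order[-n_top:]
    out = list(values)
    protected = microaggregate([values[i] for i in top], k)
    for i, new_value in zip(top, protected):
        out[i] = new_value
    return out


# "Old method" analogue: every value of the variable is perturbed.
all_treated = microaggregate([10, 12, 11, 50, 52, 51, 900], k=3)

# "New method" analogue: only the extreme tail of one variable is perturbed.
turnover = list(range(1, 201))
tail_treated = microaggregate_top(turnover, k=3, fraction=0.01)
```

In the first call every record is altered; in the second, 197 of the 200 values are left exactly as collected, which is the sense in which data damage becomes the residual rather than the default.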
7
So why do data owners focus on SR?
- almost all NSI (national statistical institute) decision-making is defensive
  - Defensive: “I won’t do this unless I’m certain it’s safe”
  - Facilitating: “I will do this unless I can’t do it safely”
- uses hypothetical worst cases rather than empirical evidence
- not possible to disprove
- requires those wanting to release data to prove safety
- no requirement for NSI to produce evidence
- statistical disclosure control literature entirely defensive
8
No-one gets fired for this…
“I’m not convinced yet that all the confidentiality risks have been minimised”
…which they should be
9
Dealing with SR
- Identify the likelihood of genuine SR
- Deal with the trivial
  - blindingly obvious recognition
  - no research value in the troublesome variables
- Focus on the likelihood of IC
- Use real evidence, not hypothetical modelling
- Consider management interventions as well as statistical solutions
- Keep data damage as the residual
- Ideally do all this from a facilitating perspective