Imputation as a Practical Alternative to Data Swapping


1 Imputation as a Practical Alternative to Data Swapping
Saki Kinney, David Wilson, Alan Karr (RTI); Kelly Kang (NSF) 29th July 2019

2 Statistical Disclosure Control
Agencies frequently publish microdata files that have been altered to protect the confidentiality of the individuals represented in the dataset. These data still retain many important statistical properties of the original data. Statistical disclosure control (SDC) methods strive to balance data quality and confidentiality protection.
Examples of SDC methods:
Data reduction: top-coding, coarsening, rounding, suppression
Data perturbation: data swapping, synthetic data (imputation)

3 Data Swapping and Imputation
In data swapping, selected records have their values for selected variables swapped with those of other, similar records. This adds uncertainty to any attempted record linkage. In synthetic data applications, larger portions of a dataset, sometimes the entire dataset, typically have their values replaced with multiple imputations. This can provide high disclosure protection while allowing users to make valid inferences that account for the uncertainty introduced by the disclosure protection.
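The swapping idea can be illustrated with a minimal sketch. This is not the procedure used by the authors or by any agency: the function name `swap_values`, the toy variables (`region`, `age`, `income`), and the rule that partners must share the same `region` are all assumptions made for illustration.

```python
import random
from collections import Counter


def swap_values(records, swap_var, match_var, rate, seed=0):
    """Swap `swap_var` between random pairs of records that share `match_var`.

    Roughly `rate` of the records in each group end up in a swapped pair.
    """
    rng = random.Random(seed)
    out = [dict(r) for r in records]  # copy so the input is left untouched

    # Group row indices by the matching variable so partners are "similar".
    groups = {}
    for i, row in enumerate(out):
        groups.setdefault(row[match_var], []).append(i)

    for indices in groups.values():
        n_pairs = int(len(indices) * rate / 2)
        chosen = rng.sample(indices, 2 * n_pairs)
        # Pair up the sampled rows and exchange their values.
        for a, b in zip(chosen[::2], chosen[1::2]):
            out[a][swap_var], out[b][swap_var] = out[b][swap_var], out[a][swap_var]
    return out


# Hypothetical toy microdata: 20 records in two regions.
records = [
    {"region": "N" if i % 2 else "S", "age": 20 + i, "income": 1000 * i}
    for i in range(20)
]
swapped = swap_values(records, swap_var="income", match_var="region", rate=0.4)

# The multiset of incomes is unchanged, so the marginal distribution is exact.
same_marginal = Counter(r["income"] for r in records) == Counter(
    r["income"] for r in swapped
)
```

Because values are only moved between records, the marginal distribution of `income` is preserved exactly, but its relationship with unswapped variables such as `age` is disturbed for the swapped rows.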

4 Our paper
We propose to apply imputation in the paradigm of swapping. That is, a selected portion of records have their values replaced with (single) imputations.
Compared to swapping:
Imputation is simpler to implement with open source software
The model-based approach is more flexible, intuitive, and transparent
Imputation approximately preserves marginal distributions, whereas swapping preserves them exactly
Imputation better preserves relationships between perturbed and unperturbed variables
Higher perturbation levels can be used, enhancing disclosure protection
We conducted experiments to demonstrate the benefits. See the poster for results and discussion.
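A rough sketch of single imputation applied to a selected portion of records follows. The slides do not specify the imputation model, so everything here is an assumption: the function name `impute_selected`, a one-predictor linear regression fitted to all records, residual resampling to add noise, and the toy variables `age` and `income`.

```python
import random


def impute_selected(records, target, predictor, rate, seed=0):
    """Replace `target` with a single model-based imputation for a random
    subset of about `rate` of the records. Returns (new_records, chosen_rows)."""
    rng = random.Random(seed)
    out = [dict(r) for r in records]  # copy so the input is left untouched
    xs = [r[predictor] for r in out]
    ys = [r[target] for r in out]
    n = len(out)

    # Ordinary least squares for target = intercept + slope * predictor
    # (assumed model; fitted here to all records).
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

    chosen = sorted(rng.sample(range(n), int(n * rate)))
    for i in chosen:
        # Model prediction plus a resampled residual, so imputed values
        # keep realistic spread around the fitted line.
        out[i][target] = intercept + slope * xs[i] + rng.choice(residuals)
    return out, chosen


# Hypothetical toy microdata: income loosely increasing with age.
data_rng = random.Random(1)
records = [
    {"age": a, "income": 500 * a + data_rng.gauss(0, 2000)} for a in range(20, 60)
]
imputed, chosen = impute_selected(records, "income", "age", rate=0.25)
```

Unlike swapping, the imputed values are new draws, so the marginal distribution of `income` is only approximately preserved, while the modeled relationship between `income` and `age` carries over to the perturbed rows.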
