
Data Masking for Disclosure Limitation

Written by:
DOI: 10.1002/9781118445112.stat00064.pub2

Abstract

Governmental agencies that conduct surveys and censuses collect data from respondents with the purpose of releasing them in the form of statistical summaries. The more detailed the summary is, the more likely it is that a data intruder will be able to extract confidential data about individual respondents from the released data. However, there are various ways of redesigning the data product and/or modifying the data themselves to protect the data while preserving their usefulness. We discuss methods that achieve these two goals: (i) a data intruder will not be able to extract, with high confidence, confidential data directly from the data product or derive confidential microdata from several data products; and (ii) the released data are still quite detailed and useful to most data users, including researchers. Such “data-masking” methods comprise a fast-growing field often called statistical disclosure control.


We discuss some simpler methods that have been used for decades, such as detail reduction, cell suppression, and data swapping; some methods developed in the 1990s, such as rank swapping, data shuffling, and multiplicative noise; and some methods developed in the recent decade, such as randomization of microdata with constraints (PRAM) and synthetic data.
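
As a rough illustration only, two of the methods named above, multiplicative noise and rank swapping, might be sketched as follows. The function names, the uniform noise distribution, the `spread` and `window` parameters, and the sample incomes are all assumptions made for this sketch; they are not taken from the article and do not represent any agency's production procedure.

```python
import random


def multiplicative_noise(values, spread=0.1, seed=0):
    """Multiply each value by a random factor in [1 - spread, 1 + spread].

    The uniform distribution is an assumption chosen for simplicity.
    """
    rng = random.Random(seed)
    return [v * rng.uniform(1 - spread, 1 + spread) for v in values]


def rank_swap(values, window=2, seed=0):
    """Swap each value with a partner at most `window` ranks above it.

    Every record takes part in at most one swap, so the set of released
    values is unchanged; only their assignment to records is altered.
    """
    rng = random.Random(seed)
    # Record indices ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = list(values)
    swapped = set()
    for pos, i in enumerate(order):
        if i in swapped:
            continue
        last = min(pos + window, len(order) - 1)
        candidates = [order[p] for p in range(pos + 1, last + 1)
                      if order[p] not in swapped]
        if not candidates:
            continue  # no unswapped partner within the window
        j = rng.choice(candidates)
        masked[i], masked[j] = values[j], values[i]
        swapped.update((i, j))
    return masked


# Hypothetical microdata: one income value per respondent.
incomes = [21000, 35000, 36000, 52000, 54000, 120000]
noisy_incomes = multiplicative_noise(incomes)
swapped_incomes = rank_swap(incomes)
```

Both sketches show the trade-off the abstract describes: the noise keeps each value within a fixed percentage of the original, and the swap keeps the overall distribution intact, while neither releases every respondent's exact value in its original record.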

Page Last Revised - October 28, 2021