Data Masking for Disclosure Limitation

DOI: 10.1002/9781118445112.stat00064.pub2

Paul B. Massell, Michael H. Freiman, and Laura V. McKenna

Abstract

Governmental agencies that conduct surveys and censuses collect data from respondents with the purpose of releasing them in the form of statistical summaries. The more detailed the summary, the more likely it is that a data intruder will be able to extract confidential data about individual respondents from the released data. However, there are various ways of redesigning the data product and/or modifying the data themselves to protect the data while preserving their usefulness. We discuss methods that achieve these two goals: (i) a data intruder will not be able to extract, with high confidence, confidential data directly from the data product or derive confidential microdata from several data products; and (ii) the released data are still quite detailed and useful to most data users, including researchers. Such “data-masking” methods comprise a fast-growing field often called statistical disclosure control.


We discuss some simpler methods that have been used for decades, such as detail reduction, cell suppression, and data swapping; some methods developed in the 1990s, such as rank swapping, data shuffling, and multiplicative noise; and some methods developed in recent decades, such as randomization of microdata with constraints (PRAM) and synthetic data.
