Most statistical agencies are concerned with the dual challenge of releasing quality data and reducing, if not totally eliminating, the risk of divulging private information. Various data masking procedures such as data swapping, cell suppression, use of synthetic data and random noise perturbations have been recommended and used in practice to meet these two objectives. This paper investigates properties of random noise multiplication as a data masking procedure, especially for tabular magnitude data. We study effects of multiplicative noise on both data quality and disclosure risk. We establish that quite generally under independent random noise multiplication, the moments and correlations of the original data can be unbiasedly recovered from their noise-perturbed versions. In the context of tabular magnitude data, we show that independent multiplicative noises affect the quality of a cell total more for sensitive cells than for non-sensitive cells. For assessing disclosure risk and choosing a suitable noise distribution we use the prediction error variance in a very conservative scenario, where for a target unit, an intruder knows the perturbed cell total as well as all values within the cell, except the target unit's value. We also derive some interesting properties of a balanced noise method, proposed recently by Massell and Funk (2007a, b). Specifically, we prove that for any set of units, the perturbed total is symmetrically distributed around the total of the corresponding original values, so a perturbed total is an unbiased estimate of the original total. The reduction in the variance of a cell total, from the balancing mechanism, is also ascertained.
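The moment-recovery claim can be illustrated with a short simulation. This is a minimal sketch, not the report's own procedure: it assumes each masked value is Z = X·R with noise R drawn independently of X, so that E[Z^k] = E[X^k]·E[R^k], and dividing a masked sample moment by the known noise moment E[R^k] gives an unbiased estimate of E[X^k]. The Uniform(0.9, 1.1) noise distribution, the lognormal test data, and the function names `mask` and `recover_moment` are all assumptions chosen for illustration.

```python
import random


def mask(values, noise_sampler):
    """Multiply each value by an independent noise draw (Z = X * R)."""
    return [x * noise_sampler() for x in values]


def recover_moment(masked, k, noise_moment_k):
    """Estimate E[X^k] from masked data, given the known noise moment E[R^k]."""
    sample_moment = sum(z ** k for z in masked) / len(masked)
    return sample_moment / noise_moment_k


rng = random.Random(0)

# Hypothetical "original" magnitude data (illustrative only).
original = [rng.lognormvariate(3.0, 0.5) for _ in range(50_000)]

# Assumed noise distribution: Uniform(a, b), with closed-form moments.
a, b = 0.9, 1.1
noise_m1 = (a + b) / 2                  # E[R]
noise_m2 = (a * a + a * b + b * b) / 3  # E[R^2]

masked = mask(original, lambda: rng.uniform(a, b))

# Moments of the original data, which the data user never sees.
true_m1 = sum(original) / len(original)
true_m2 = sum(x * x for x in original) / len(original)

# Estimates computed from the masked data alone.
est_m1 = recover_moment(masked, 1, noise_m1)
est_m2 = recover_moment(masked, 2, noise_m2)

print(f"first moment:  original {true_m1:.3f}, recovered {est_m1:.3f}")
print(f"second moment: original {true_m2:.3f}, recovered {est_m2:.3f}")
```

Under these assumptions the recovered moments should agree closely with those of the original data, even though no individual original value is released. The recovery relies on the noise moments being known to the data user.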
Data quality, disclosure risk, noise variance, tabular data, unbiasedness, variance inflation.
Nayak, Tapan K., Bimal Sinha, and Laura Zayatz. (2010). Statistical Properties of Multiplicative Noise Masking for Confidentiality Protection. Statistical Research Division Research Report Series (Statistics #2010-05). U.S. Census Bureau. Available online at <http://www.census.gov/srd/papers/pdf/rrs2010-05.pdf>.
Source: U.S. Census Bureau, Statistical Research Division
Published online: March 5, 2010
Last revised: February 28, 2010