Noise Multiplication for Statistical Disclosure Control of Extreme Values in Log-normal Regression Samples

CDAR2014-04
Martin Klein, Thomas Mathew, and Bimal Sinha

Introduction

Statistical agencies must control disclosure risk when releasing data to the public. If income data on individuals or businesses are released, it may be possible to match extremely large values to specific individuals or businesses that are known to be wealthy, especially if additional information is available on the same units in the dataset. The purpose of the present investigation is to explore noise multiplication as a strategy for protecting large values in a dataset from disclosure, and to develop methodology for analyzing the resulting data under the assumption that the sensitive variable follows a log-normal distribution. We assume that the log-scale mean of the sensitive variable is described by a linear regression on a set of non-sensitive covariates, and that the goal of the data analysis is to draw inference on the regression parameters. We focus on the log-normal distribution because it is widely used for modeling income data ([6]; [8]; [11]; [20]), and because extreme values in income data usually need disclosure protection.
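The setting described above can be illustrated with a minimal sketch. It simulates a sensitive variable whose logarithm follows a linear regression on one covariate, then multiplies only the values above a cutoff by random noise. The function names, the uniform noise distribution on [0.9, 1.1], the 95th-percentile cutoff, and the parameter values are all assumptions made for illustration; the introduction does not specify them, and this is not presented as the authors' procedure.

```python
import math
import random


def simulate_lognormal_regression(n, beta, sigma, rng):
    """Draw (x, y) pairs with log(y) = beta[0] + beta[1] * x + N(0, sigma^2)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)  # hypothetical non-sensitive covariate
        log_y = beta[0] + beta[1] * x + rng.gauss(0.0, sigma)
        data.append((x, math.exp(log_y)))
    return data


def mask_large_values(data, cutoff, rng, low=0.9, high=1.1):
    """Multiply y by independent uniform noise only when y exceeds the cutoff.

    The uniform noise on [low, high] is an assumed choice for this sketch.
    The boolean flag marks perturbed records for illustration only.
    """
    released = []
    for x, y in data:
        if y > cutoff:
            released.append((x, y * rng.uniform(low, high), True))
        else:
            released.append((x, y, False))
    return released


rng = random.Random(0)
original = simulate_lognormal_regression(1000, beta=(10.0, 0.5), sigma=0.8, rng=rng)

# Assumed cutoff: the empirical 95th percentile of the simulated values.
cutoff = sorted(y for _, y in original)[int(0.95 * len(original))]

released = mask_large_values(original, cutoff, rng)
n_masked = sum(flag for _, _, flag in released)
print(f"cutoff = {cutoff:.2f}, perturbed records = {n_masked}")
```

Values at or below the cutoff are released unchanged, so only the upper tail carries noise. An analysis of the released data would need to account for that noise when estimating the regression parameters.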
