We consider some aspects of using an additive mixture noise model for masking real microdata, as a generalization of using normally distributed masking noise introduced by Roque. We introduce a simplified procedure for computing additive mixture noise and assess the effectiveness of this approach in terms of information loss measures and record re-identification. We concentrate on the information loss statistics for the variance/covariance matrix of the full data set and of arbitrary subsets. We consider some of the information loss statistics introduced by Domingo-Ferrer and introduce some analytic alternatives. For the full data sets, the analytic properties are well preserved and the data masking is effective. The analytic properties are less well preserved on subsets of this highly skewed data. We include some of the SAS programs used in the study.
Source: U.S. Census Bureau, Statistical Research Division
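
To make the idea of additive mixture noise concrete, the following is a minimal Python sketch, not the simplified procedure or the SAS programs of the study. It assumes each standardized noise coordinate is drawn from an equal-weight mixture of two normals centred at +m and -m with variance 1 - m^2 (so it has mean 0 and variance 1), and that the noise is then coloured with the Cholesky factor of the sample covariance and scaled by sqrt(d). Under these assumptions the noise covariance is d times the data covariance, and the masked covariance is approximately (1 + d) times the original. The names `mask`, `mixture_unit_noise`, `d`, and `m`, and the synthetic skewed data, are illustrative choices only.

```python
import math
import random

def covariance(rows):
    """Sample covariance matrix (n-1 denominator) of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    p = len(a)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def mixture_unit_noise(rng, m):
    """Draw from 0.5*N(+m, 1-m^2) + 0.5*N(-m, 1-m^2): mean 0, variance 1."""
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return rng.gauss(sign * m, math.sqrt(1.0 - m * m))

def mask(rows, d=0.1, m=0.9, seed=0):
    """Return rows + noise, where the noise covariance is d * cov(rows)."""
    rng = random.Random(seed)
    p = len(rows[0])
    L = cholesky(covariance(rows))
    scale = math.sqrt(d)
    masked = []
    for r in rows:
        z = [mixture_unit_noise(rng, m) for _ in range(p)]
        # Colour the independent unit-variance draws with L, then scale.
        e = [scale * sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        masked.append([r[i] + e[i] for i in range(p)])
    return masked

# Synthetic, skewed (lognormal) two-variable data, purely for illustration.
data_rng = random.Random(42)
rows = []
for _ in range(5000):
    a = data_rng.gauss(0.0, 1.0)
    b = 0.6 * a + 0.8 * data_rng.gauss(0.0, 1.0)
    rows.append([math.exp(a), math.exp(b)])

d = 0.1
masked_rows = mask(rows, d=d)
orig_cov = covariance(rows)
masked_cov = covariance(masked_rows)
# Ratio of masked to original variances; expected to be close to 1 + d.
variance_ratios = [masked_cov[i][i] / orig_cov[i][i] for i in range(2)]
```

Because the inflation factor 1 + d is known, an analyst could rescale the masked covariance to estimate the original one on the full file. The same correction computed on a small subset relies on that subset's own covariance resembling the full-file covariance, which is one reason results on subsets of skewed data can be less stable.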