A Local l-Diversity Mechanism for Privacy Protected Categorical Data Collection

Tapan K. Nayak and Xiaoyu Zhai

We consider the task of protecting respondents' privacy when collecting data on categorical variables. Any mechanism for masking the true value of a respondent can be viewed as a randomized response (RR) procedure, and its prudent planning depends crucially on the given privacy criterion. We examine some existing privacy criteria and describe their drawbacks. We show that a previous notion of average security is inappropriate. Several other criteria, which simply impose upper bounds on certain probability ratios of the RR design, inflict severe data utility loss unless the number of categories is fairly small. This applies to local differential privacy (LDP), which is a leading privacy criterion, and reveals substantial statistical inefficiency of the RAPPOR procedure, which has been in use by Google, Apple and others. We propose a new privacy procedure that is similar to l-diversity but operates locally for each respondent. The procedure is simple to implement, and its privacy protection is easy to understand and communicate to survey participants. We give an unbiased estimator of the probability vector of all categories and prove its minimaxity within a class of estimators under squared error loss. We argue that the new procedure offers a better privacy-utility trade-off than LDP.
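To make the randomized response (RR) idea concrete, here is a minimal sketch of a generalized randomized response mechanism for k categories, together with the standard method-of-moments unbiased estimator of the category probability vector. This is illustrative only: the uniform-perturbation design and the function names are assumptions for exposition, not the paper's local l-diversity mechanism.

```python
import random

def grr_perturb(true_cat, k, p, rng=random):
    """Report the true category with probability p; otherwise report
    one of the other k-1 categories, chosen uniformly at random."""
    if rng.random() < p:
        return true_cat
    other = rng.randrange(k - 1)          # index among the k-1 other categories
    return other if other < true_cat else other + 1

def grr_estimate(reports, k, p):
    """Unbiased estimate of the category probability vector pi.
    Since E[q_j] = pi_j * (p - noise) + noise with noise = (1-p)/(k-1),
    inverting the observed frequencies q gives an unbiased estimator."""
    n = len(reports)
    q = [0.0] * k
    for r in reports:
        q[r] += 1.0 / n
    noise = (1.0 - p) / (k - 1)
    return [(qj - noise) / (p - noise) for qj in q]
```

Note that the estimated probabilities always sum to exactly 1 by construction, though individual components can fall outside [0, 1] in small samples, which is a known feature of unbiased RR estimators.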
