Just Released: Noisy Measurement File for the 2010 DHC Demonstration Data

June 30, 2023: The U.S. Census Bureau today released the noisy measurement file (NMF) for the 2010 Census Production Settings Demographic and Housing Characteristics File (DHC).

This is the second set of demonstration noisy measurement files. Each set applies the production settings used for the 2020 Census data to the 2010 Census data.

We released the 2010 Census Production Settings Redistricting Data (P.L. 94-171) Demonstration NMF on April 3. On June 15 we released the NMF for the 2020 Census redistricting data as a research data product, and we’ll release the corresponding NMF for the 2020 Census DHC later this fall.

The combined redistricting data and DHC noisy measurement files contain approximately 25 trillion numbers and occupy 33 terabytes in compressed form. Because of their size, the files are housed at, and accessible through, off-site locations.

The noisy measurement files are an intermediate output of the Disclosure Avoidance System’s TopDown Algorithm (TDA). The TDA generates noisy measurements when it applies differentially private noise to each of the tabulations from the confidential data. Because the noise can result in internal and hierarchical inconsistencies within the tables we publish, the TDA completes a final step called “post-processing.” This post-processing improves accuracy for lower-level geographies and corrects those inconsistencies before the tables or Privacy-Protected Microdata Files (PPMFs) are published.
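
To make the inconsistency problem concrete, here is a minimal Python sketch of the two stages described above. It is an illustration only, not the TDA itself: the rounded Gaussian noise, the function names (`noisy`, `measure`, `post_process`), and the proportional-rescaling step are all assumptions chosen for brevity, and the example counts are made up.

```python
import random

def noisy(count, sigma, rng):
    """Add rounded Gaussian noise to one count (a stand-in noise model, assumed)."""
    return count + round(rng.gauss(0, sigma))

def measure(parent, children, sigma, rng):
    """Noise the parent total and each child count independently.

    Because the draws are independent, the noisy children generally do not
    sum to the noisy parent, and small counts can come out negative.
    """
    return noisy(parent, sigma, rng), [noisy(c, sigma, rng) for c in children]

def post_process(noisy_parent, noisy_children):
    """Toy post-processing: nonnegative integers whose sum equals the parent.

    Hypothetical approach: clip negatives, rescale children in proportion to
    the clipped values, then round with the largest-remainder method.
    """
    total = max(0, noisy_parent)
    weights = [max(0, c) for c in noisy_children]
    if sum(weights) == 0:
        weights = [1] * len(weights)  # no usable signal: split evenly
    w = sum(weights)
    result = [total * x // w for x in weights]
    remainders = [total * x % w for x in weights]
    leftover = total - sum(result)
    # Hand the remaining units to the children with the largest remainders.
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        result[i] += 1
    return total, result

if __name__ == "__main__":
    rng = random.Random(2010)
    # Made-up example: one county-like total and three block-like counts.
    parent, children = measure(12, [7, 4, 1], sigma=3, rng=rng)
    print("noisy:", parent, children, "children sum:", sum(children))
    print("post-processed:", post_process(parent, children))
```

The noisy values printed first correspond loosely to what a noisy measurement file holds. The second line shows a consistent table of the kind a post-processing step must produce, here by a far simpler rule than any production system would use.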

This public release gives researchers and data scientists the opportunity to independently process the files, complete analysis, and conduct valuable assessments of the confidentiality protections.

Page Last Revised - June 30, 2023