Customizing protections for each data product is an iterative process that requires data user engagement and feedback. As we produce demonstration data and performance metrics during Disclosure Avoidance System (DAS) development, we’ll post that information here.
For more information, view this brief: Why the Census Bureau Chose Differential Privacy
The “2010 Demonstration Data Products Suite – Redistricting and DHC” is a suite of files based on 2010 Census results to help data users analyze the impact of the new 2020 Census Disclosure Avoidance System.
The files incorporate the final production settings chosen for both the 2020 Census Redistricting Data (Public Law 94-171) Summary File and the Demographic and Housing Characteristics File (DHC). Included in the released suite of files is the 2010 Census Production Settings Redistricting Data (P.L. 94-171) Demonstration Noisy Measurement File (2023-04-03) (2010 Redistricting NMF).
Noisy Measurement Files are the intermediate output of the Disclosure Avoidance System’s TopDown Algorithm (TDA). The TDA generates noisy measurements when it applies differentially private noise to each of the tabulations from the confidential data. Because the noise can produce internal and hierarchical inconsistencies within the tables we publish, the TDA completes a final step called “post-processing.” This step corrects those inconsistencies before the tables or Privacy-Protected Microdata Files (PPMFs) are published.
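The two stages described above can be illustrated with a minimal sketch. It is a toy stand-in under stated assumptions, not the TDA itself: the noise is a rounded Gaussian draw, the parent total is treated as exact, and the post-processing is a simple clamp-and-rescale rather than the optimization the production system uses. The function names, block labels, and counts are all hypothetical.

```python
import random

def noisy_measurements(true_counts, scale, rng):
    """Add independent integer noise to each tabulation.

    Assumption: rounded Gaussian noise stands in for the DAS noise mechanism.
    """
    return {k: v + round(rng.gauss(0, scale)) for k, v in true_counts.items()}

def post_process(noisy_children, parent_total):
    """Toy post-processing: remove negatives, then force children to sum to the parent.

    Uses integer largest-remainder rescaling; this is an illustrative
    substitute for the TDA's actual post-processing step.
    """
    clamped = {k: max(0, v) for k, v in noisy_children.items()}
    total = sum(clamped.values())
    if total == 0:
        # Nothing positive survived the clamp: fall back to an even split.
        clamped = {k: 1 for k in clamped}
        total = len(clamped)
    result = {k: v * parent_total // total for k, v in clamped.items()}
    remainders = {k: v * parent_total % total for k, v in clamped.items()}
    shortfall = parent_total - sum(result.values())
    # Hand the leftover units to the children with the largest remainders.
    for k in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        result[k] += 1
    return result

rng = random.Random(0)
true_blocks = {"block_a": 12, "block_b": 0, "block_c": 3}  # hypothetical counts
tract_total = sum(true_blocks.values())

noisy = noisy_measurements(true_blocks, scale=2.0, rng=rng)
published = post_process(noisy, tract_total)
print("noisy:", noisy)          # may contain negatives and need not sum to the tract
print("published:", published)  # non-negative integers that sum to the tract total
```

The noisy values correspond to what a Noisy Measurement File would hold in this toy setting; the post-processed values are non-negative and consistent with the parent geography, at the cost of no longer being unbiased draws around the true counts.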
This public release gives researchers and data scientists the opportunity to process the files independently, analyze them, and assess the confidentiality protections.
This product provides detailed demographic and housing characteristics about the nation and local communities. We encourage data users to aggregate small populations and geographies to improve accuracy and diminish implausible results.
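A short simulation shows why aggregation helps. It is a sketch under simplifying assumptions: every small unit receives independent Gaussian noise of the same scale, and the unit counts and noise scale are hypothetical rather than actual DAS settings. When units are summed, the combined noise grows roughly with the square root of the number of units while the true total grows linearly, so the relative error shrinks.

```python
import random
import statistics

def mean_relative_error(true_counts, scale, group_size, trials, rng):
    """Average |noisy - true| / true over groups of `group_size` adjacent units."""
    errors = []
    for _ in range(trials):
        # Assumption: independent Gaussian noise per unit, same scale everywhere.
        noisy = [c + rng.gauss(0, scale) for c in true_counts]
        for start in range(0, len(true_counts), group_size):
            true_sum = sum(true_counts[start:start + group_size])
            noisy_sum = sum(noisy[start:start + group_size])
            errors.append(abs(noisy_sum - true_sum) / true_sum)
    return statistics.mean(errors)

rng = random.Random(1)
blocks = [20] * 64  # hypothetical: 64 small units with 20 people each

single = mean_relative_error(blocks, scale=5.0, group_size=1, trials=200, rng=rng)
grouped = mean_relative_error(blocks, scale=5.0, group_size=16, trials=200, rng=rng)
print(f"relative error, single units:    {single:.3f}")
print(f"relative error, groups of 16:    {grouped:.3f}")
```

In this setup the grouped estimates carry a noticeably smaller relative error than the single-unit estimates, which is the effect the guidance above relies on.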
For more information about how differential privacy is applied to the DHC: Disclosure Avoidance and the 2020 Census: How the TopDown Algorithm Works
A subset of DHC tables was included in early iterations of DAS demonstration data. In the Redistricting Data section below, see:
Improvements in the design, processing and coding of the 2020 Census allow the release of data for almost five times as many detailed race and ethnic groups as were possible in 2010.
On January 31, 2023, the Census Bureau released a Proof of Concept to help data users understand how a new disclosure avoidance framework based on differential privacy may impact the 2020 Detailed DHC-A. The Proof of Concept includes proposed content and disclosure avoidance settings, which have not been finalized by the Census Bureau.
Subjects: Household type and tenure information for the same detailed race and ethnicity groups and American Indian and Alaska Native tribes and villages mentioned for the Detailed DHC-A.
2020 geographies: Nation, state, county, places (cities and towns), census tracts, and American Indian/Alaska Native/Native Hawaiian (AIANNH) areas.
Planned release date: September 2024.
Additional information about the release of the Detailed DHC-B is available in the newsletter Census Bureau Provides Updates on 2020 Census Data Products.
The S-DHC tables reflect especially complex relationships between the characteristics about households and the people living in them. These complex characteristics supplement the data about households and people available in the DHC product. We often refer to these tables as “complex person-household join tables” or “join tables.” Some tables are repeated by race and ethnicity.
Subjects: Data that combine characteristics about households and the people living in them, including the total population in households, average household size by age and tenure, average family size, household and family type for people under 18 years old, and total population in households by tenure.
2020 geographies: Nation, state.
Planned release date: September 2024.
Public Law 94-171 directs the Census Bureau to provide the data that may be used for redistricting to the governors and the officers or public bodies having responsibility for redistricting in each of the 50 states.
This product is the first from the 2020 Census that includes demographic and housing characteristics about detailed geographic areas including states, counties and places.
Subjects: Voting age, race, Hispanic or Latino origin, housing occupancy status, group quarters population by major group quarters type
Lowest level of geography: Census block
Access: FTP site in August (links to data files and support materials are available on the Decennial Census P.L. 94-171 Redistricting Data Summary Files page); data.census.gov on September 16
Date: Released on FTP August 12, 2021; the same data released on data.census.gov on September 16, 2021
Beginning in October 2019, the Census Bureau released a series of demonstration data products that applied iterative development versions of the 2020 Census Disclosure Avoidance System (DAS) to published 2010 Census data. The first two demonstration data sets covered both redistricting and Demographic and Housing Characteristics data (DHC, known in earlier censuses as Summary File 1, or “SF1”). In August 2020, pandemic-triggered operational delays required the Census Bureau to focus development on the redistricting data in an attempt to meet the statutory data release deadline. Demonstration data from September 17, 2020, forward focused solely on the redistricting data. (See Development Timeline)
The Census Bureau produced Detailed Summary Metrics and Privacy-Protected Microdata Files (PPMFs) to assist with data user analysis. IPUMS NHGIS converted the PPMFs into tabular format for ease of use. Data users evaluated each iteration and provided feedback that helped shape the algorithm and settings throughout the development process. On June 8, 2021, the Census Bureau’s Data Stewardship Executive Policy Committee chose the final settings for production of the redistricting data. The data were released August 12, 2021.
Note that the Privacy-Protected Microdata Files are the underlying untabulated microdata used to generate the Detailed Summary Metrics. Although their contents look like individual records, they are all privacy-protected through the application of differentially private statistical noise.
On June 8, 2021, the U.S. Census Bureau’s Data Stewardship Executive Policy Committee (DSEP) selected the settings and parameters for the Disclosure Avoidance System (DAS) for the 2020 Census redistricting data (P.L. 94-171).
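As an illustration of how record-style microdata relate to published tables, the sketch below counts records by a combination of fields. The field names and values are hypothetical and do not reflect the actual PPMF record layout; the point is only that tabulation is a grouped count over privacy-protected records.

```python
from collections import Counter

# Hypothetical records: field names and values are illustrative only,
# not the PPMF record layout.
records = [
    {"county": "001", "voting_age": True},
    {"county": "001", "voting_age": True},
    {"county": "001", "voting_age": False},
    {"county": "003", "voting_age": True},
]

def tabulate(rows, keys):
    """Count rows for each combination of the given fields."""
    return Counter(tuple(row[k] for k in keys) for row in rows)

table = tabulate(records, ["county", "voting_age"])
for cell, count in sorted(table.items()):
    print(cell, count)
```

Because every record in such a file is already the output of the disclosure avoidance process, any table built this way inherits the same protection.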
This is the sixth and final set of Privacy-Protected Microdata Files (PPMFs) for the redistricting data that allow data users to compare the effect of the Disclosure Avoidance System settings on previously published 2010 Census data. These and previous PPMFs are only intended to demonstrate the redistricting data, not the Demographic and Housing Characteristics File (DHC) or other 2020 Census data products.
There are two sets of Privacy-Protected Microdata Files (PPMFs), record layouts, and Detailed Summary Metrics in this release:
We encourage data users to closely analyze these demonstration data. Feedback received by May 28, 2021, will be considered. Email feedback to: 2020DAS@census.gov; include “April PPMF” in the subject line.
Particularly useful feedback would describe:
We will provide additional metrics and educational webinars throughout the month of May to help you with that analysis. (Subscribe to our newsletter for the release and other updates.)
This release corrects a coding error discovered after publication of v. 2020-09-17, along with other minor refinements.
Subscribe to our digital newsletter for the latest updates in DAS development.
We appreciate your engagement and encourage you to email comments and suggestions to 2020DAS@census.gov.