Today, the Census Bureau released new demonstration data and performance metrics for the 2020 Census Demographic and Housing Characteristics File (DHC). The demonstration data apply the current iteration of the disclosure avoidance system (DAS) to 2010 Census data. This allows a side-by-side comparison of the impact of this version of the DAS against the published 2010 Census tables.
“Person” Tables Today; “Housing” Tables Targeted for Release in April
Today’s release includes the proposed “person” tables. Content includes sex, age, Hispanic origin, race, relationship to householder, and group quarters. Some person tables are repeated by race and ethnicity.
We are targeting an April 2022 release for the proposed “housing” tables, which include content on tenure, vacancy, household type, family type, and characteristics of the householder. Some of the housing tables are also repeated by race and ethnicity.
Please note that we are publishing the demonstration data in table format only; there is not a Privacy-Protected Microdata File (PPMF) published with this release. While the microdata file is used to generate the tabular data, it includes details at geographic levels we are not planning to publish for certain tables. Forgoing PPMF publication for these demonstration data allows us to achieve greater accuracy for the published tables for the same level of confidentiality protection.
Updated Detailed Summary Metrics Available
Metrics for the first round of DHC demonstration data now populate what had been blank placeholder tables in the Detailed Summary Metrics documents created during development of the 2020 Census (P.L. 94-171) Redistricting Data Summary File. Although the housing tables won’t be released until April, the metrics for both the person and housing tables are available today. Keep in mind that the impact comparison is imperfect because the 2010 data used the “swapping” method of disclosure avoidance.
Because the DHC demonstration data are consistent with the PL 94-171 Production Settings PPMF (2021-06-08), any metrics tables that also appeared in that release will show the same numbers in this release. For example, the mean absolute error of total population at the county level is the same as it was in the earlier release.
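For readers who want to reproduce a metric of this kind from the tables, the mean absolute error mentioned above can be sketched as follows. This is a minimal illustration, not the Census Bureau's metrics code: the function name, the county keys, and all counts are hypothetical and made up for the example.

```python
def mean_absolute_error(published, demonstration):
    """Average of |demonstration - published| over geographies present in both inputs."""
    shared = sorted(set(published) & set(demonstration))
    if not shared:
        raise ValueError("the two tables have no geographies in common")
    return sum(abs(demonstration[g] - published[g]) for g in shared) / len(shared)


# Hypothetical county total-population counts (illustrative only, not Census data).
published_2010 = {"county_a": 1200, "county_b": 560, "county_c": 98000}
das_demo = {"county_a": 1203, "county_b": 556, "county_c": 98001}

county_mae = mean_absolute_error(published_2010, das_demo)
print(f"County-level MAE of total population: {county_mae:.2f}")
```

In practice the two dictionaries would be loaded from the published 2010 table and the matching demonstration table, keyed by the same geographic identifier.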
Two Feedback Cycles for This Phase of DHC DAS Development
This is the first of two feedback cycles for this new phase of DAS development. Although we released DHC demonstration data based on the 2010 Census in October 2019 and May 2020, these new feedback rounds will be the first to include all proposed 2020 DHC tables, and they will have separate comment periods. While the first round splits the person and housing tables into separate releases, the comment period for both releases will close 30 days after release of the housing tables in April. See below for more information on how to submit feedback.
As a reminder, our current focus is on DHC development. We’ll share updates on Detailed DHC development in the months ahead.
DHC Changes Based on Your Crosswalk Feedback Will Appear in the Next Round of Demonstration Data
Based on your feedback so far, we are proposing additional tables and geographical granularity. The proposed DHC changes will be included in the second round of demonstration data in April. Person tables have not changed since the last version of the crosswalk. The list of changes is in the crosswalk spreadsheet change log. The final scope and granularity of those proposed changes depend on ongoing analysis.
Progress Since the DHC DAS Demonstration Data Released in October 2019 and May 2020
These demonstration data reflect initial efforts to meet an array of accuracy targets based on external stakeholder feedback and internal program requirements. We work to meet the targets during development by making a series of adjustments (we call this “tuning”) to the amount of privacy-loss budget (PLB) applied to different sets of tabulations. These targets were established by subject matter experts in the Census Bureau’s Demographic Programs Directorate.
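As a rough picture of what one tuning adjustment might look like, the sketch below raises the share of a fixed budget given to one set of tabulations and rescales the other shares so the total is unchanged. Everything in it is an assumption made for illustration: the function name, the tabulation names, the shares, and the total budget value are hypothetical, and the proportional rescaling rule is not taken from the DAS itself.

```python
def retune(shares, target, new_share):
    """Set one tabulation's share and rescale the others so all shares still sum to 1."""
    if target not in shares:
        raise KeyError(target)
    if not 0 < new_share < 1:
        raise ValueError("new_share must be strictly between 0 and 1")
    remaining = sum(v for k, v in shares.items() if k != target)
    scale = (1 - new_share) / remaining
    return {k: (new_share if k == target else v * scale) for k, v in shares.items()}


# Hypothetical shares of the privacy-loss budget (illustrative only).
shares = {"total_population": 0.5, "age_by_sex": 0.3, "race_by_ethnicity": 0.2}
total_plb = 10.0  # hypothetical total budget, not an actual DAS setting

tuned = retune(shares, "age_by_sex", 0.4)
budgets = {name: total_plb * share for name, share in tuned.items()}
for name, budget in budgets.items():
    print(f"{name}: {budget:.2f}")
```

The point of the sketch is the trade-off: with the total held fixed, giving more budget to one set of tabulations leaves less for the others.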
When we speak about the accuracy of data, we are speaking about statistical accuracy. Consistent with the Office of Management and Budget’s Statistical Policy Directive #1 (as codified in Title III of the Foundations for Evidence-Based Policymaking Act of 2018), statistical accuracy means that our publicly released data products meet established information quality guidelines while also protecting the confidentiality of respondents’ information and, when appropriate, providing information on limitations of the data that may assist data users in determining the suitability of the data for their purposes.
We continue to tune the DAS to meet the targets we set, but not all tables in the first round of demonstration data will meet those marks. Data user feedback on these demonstration data will enable further tuning and optimization of these privacy-loss budget allocations. Your feedback throughout this process will also inform the setting of the privacy-loss budget used for the production (final) settings.
As you analyze the data, recall that block-level data are fit-for-use when aggregated into geographically contiguous larger entities. They are not intended to be fit-for-use as a unit of analysis.
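One way to follow this guidance when evaluating the tables is to sum block counts into larger areas before comparing them with the published values. The sketch below assumes a hypothetical block-to-area lookup and made-up counts; it does not check geographic contiguity, which the analyst would need to ensure when building the lookup.

```python
from collections import defaultdict


def aggregate(counts, block_to_area):
    """Sum block-level counts into the larger area each block is assigned to."""
    totals = defaultdict(int)
    for block, count in counts.items():
        totals[block_to_area[block]] += count
    return dict(totals)


def relative_errors(published, demonstration):
    """Absolute error as a fraction of the published count, skipping zero counts."""
    return {
        g: abs(demonstration[g] - published[g]) / published[g]
        for g in published
        if published[g] > 0
    }


# Hypothetical lookup and counts (illustrative only, not Census data).
block_to_area = {"b1": "area_1", "b2": "area_1", "b3": "area_1", "b4": "area_2", "b5": "area_2"}
published_blocks = {"b1": 12, "b2": 30, "b3": 7, "b4": 45, "b5": 21}
demo_blocks = {"b1": 15, "b2": 27, "b3": 9, "b4": 44, "b5": 23}

area_pub = aggregate(published_blocks, block_to_area)
area_demo = aggregate(demo_blocks, block_to_area)

block_rel = relative_errors(published_blocks, demo_blocks)
area_rel = relative_errors(area_pub, area_demo)
print("largest block-level relative error:", round(max(block_rel.values()), 3))
print("largest area-level relative error:", round(max(area_rel.values()), 3))
```

In this made-up example the block-level differences partly offset one another once summed, so the aggregated areas show smaller relative error than the individual blocks. Real results will vary by table and geography.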
The following tables highlight some of the improvements since the earlier demonstration data releases (Oct. 2019 and May 2020):
Areas We’re Still Working to Improve in the Person Tables
DAS development is an ongoing, iterative process. We’ve identified areas for improvement in this round of demonstration data and are actively working to address them prior to release of the second round, including:
We are continuing to review the person and housing tables to identify additional areas for improvement. In addition, your feedback on areas for improvement will be instrumental. We will review your feedback to determine how to further tune the DAS.
How to Submit Feedback
Data user feedback has been instrumental to the iterative development of the DAS. We encourage you to examine the tables and associated detailed summary metrics to assess whether they are fit for your data uses. Specifically, based on your analysis, we would like to know which data would be deemed unusable for your use cases. It will be helpful if you can identify the problematic table, level of geography, and a description of the use case, along with the likely implications should the data be released as is.
We ask that you submit your comments on both today’s person tables and the upcoming housing tables within 30 days after release of the housing tables to 2020DAS@census.gov using the subject “2020 Census Data Products.” We’ll provide a more definitive deadline with the release of the housing demonstration data.
We may choose to publish detailed summaries of the comments we receive. If so, we will make every effort to remove identifying information, such as names of individuals and organizations.
Join us for a webinar on Tuesday, March 22, at 3:00 p.m. ET to learn more about new 2010 demonstration data for the DHC. For log-in and more information, visit our webinar page.