On March 10, 2022, the U.S. Census Bureau will release coverage estimates for the 2020 Census. These are our estimates of how well the 2020 Census covered — or included — everyone in the nation and in certain demographic groups.
Specifically, coverage estimates give us some statistical insight into whether the census counts for certain groups may be too low, meaning the census likely missed some people, or whether the counts may be too high, indicating some people may have been counted in error or more than once. We call these estimated undercounts and overcounts, respectively.
The estimates will come from two sources: Demographic Analysis (DA) and the Post-Enumeration Survey (PES).
Using DA and the PES to analyze the census results is just one of many ways we evaluate the quality of the 2020 Census. We also released a variety of operational metrics, worked with outside experts, and are currently working on a series of extensive evaluations and assessments of 2020 Census operations. More information about each of these is available on our 2020 Census Data Quality page. Together, all of these measures provide a fuller picture of the quality of the census.
In this blog, we describe what DA and the PES can tell us in next week’s release and how DA and the PES differ. Although DA and the PES both estimate “net coverage error,” which is the difference between the census count and an estimate of the number of people, they estimate the number of people using different approaches.
To understand what DA can tell us, it’s important to understand how DA estimates are produced.
With DA, we estimate the number of people using birth and death records, data on international migration, and Medicare enrollment records. The source data and methods for DA estimates are completely independent of the 2020 Census. In fact, we released DA estimates by age, sex, broad race categories, and Hispanic origin months before releasing the first census results.
As the census results are tabulated, we can compare them with DA population estimates and see where there may be net coverage errors. For DA, net coverage error is the difference between the census counts and DA population estimates. If it’s positive, it implies a net overcount. If it’s negative, it implies a net undercount.
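To make the sign convention concrete, here is a minimal Python sketch of the comparison described above. The function name, the made-up counts, and the choice to express the rate as a percentage of the benchmark estimate are assumptions for illustration; they are not taken from DA documentation.

```python
def net_coverage_error(census_count: float, benchmark_estimate: float) -> dict:
    """Compare a census count with an independent population estimate."""
    if benchmark_estimate <= 0:
        raise ValueError("benchmark_estimate must be positive")

    # Net coverage error: census count minus the independent estimate.
    difference = census_count - benchmark_estimate

    # Assumption: the rate is expressed as a percentage of the benchmark.
    rate_percent = 100.0 * difference / benchmark_estimate

    if difference > 0:
        direction = "net overcount"
    elif difference < 0:
        direction = "net undercount"
    else:
        direction = "no net error"

    return {
        "difference": difference,
        "rate_percent": rate_percent,
        "direction": direction,
    }


# Made-up numbers for illustration only; not real census or DA figures.
example = net_coverage_error(census_count=990_000, benchmark_estimate=1_000_000)
print(example)
```

Because the result is a net figure, a value near zero does not by itself mean that no one was missed; omissions and erroneous counts can offset each other.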
With the 2020 Census results already released, we have shared the net coverage error rates for four population groups so far:
We described each of these results (and further explained net coverage error) in the blog, Using Demographic Benchmarks to Help Evaluate 2020 Census Results.
Next week, we will release DA net coverage error estimates by:
We have not released these detailed 2020 Census results by age and sex. However, we are using a special 2020 Census file, not yet published, with added confidentiality protections from the Census Bureau’s new disclosure avoidance system to analyze the population by single year of age and sex and for various age groups. It is important to make DA net coverage error estimates available to the public at the same time as the PES coverage results since together, they provide a more complete picture. Using prerelease 2020 Census results with appropriate confidentiality protections in our comparisons with DA enables us to do this.
For race and Hispanic origin, we need more information from the 2020 Census results before DA can analyze net coverage error for those characteristics. As we explained in the Using Demographic Benchmarks to Help Evaluate 2020 Census Results blog, the census uses different race categories than the historical birth and death records we use to produce DA estimates. We’ll need to reconcile the differences and create a file, called the 2020 modified race file. With the modified race file and additional demographic information to come from the census, we will be able to make better comparisons, resulting in a fuller picture of 2020 Census quality for certain groups.
The PES produces an estimate of the number of people using a survey. We then compare the census counts to the PES estimate and calculate the difference between the two. We also estimate how many people were counted correctly, missed, duplicated or counted by mistake in the census.
The 2020 PES uses a technique called “dual-system estimation,” with the two systems being the PES and the census. With this technique, the survey independently interviews people, asks where they lived on April 1, 2020, and then matches that information to the census results. More information about the survey is available in The Post-Enumeration Survey: Measuring Coverage Error blog.
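The core idea behind dual-system estimation can be sketched with the textbook two-source (Lincoln-Petersen) estimator. This is a simplified illustration, not the estimator the PES actually uses, which involves survey weights and further adjustments. The function names and all numbers below are hypothetical.

```python
def dual_system_estimate(census_correct: int, survey_total: int, matched: int) -> float:
    """Textbook two-source population estimate.

    census_correct: people correctly counted in the first source (census)
    survey_total:   people found by the second, independent source (survey)
    matched:        people found in both sources
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    if matched > min(census_correct, survey_total):
        raise ValueError("matches cannot exceed either source total")
    # Assumes the two sources are independent and matching is error-free.
    return census_correct * survey_total / matched


def implied_missed_by_both(census_correct: int, survey_total: int, matched: int) -> float:
    """People implied to be in neither source under the same assumptions."""
    census_only = census_correct - matched
    survey_only = survey_total - matched
    return census_only * survey_only / matched


# Made-up numbers for illustration only.
estimate = dual_system_estimate(census_correct=900, survey_total=800, matched=720)
missed = implied_missed_by_both(census_correct=900, survey_total=800, matched=720)
print(estimate, missed)
```

In this toy example the four cells (in both, census only, survey only, in neither) add up to the estimated total, which is why the quality of the matching step matters so much for the final estimate.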
Next week, we’ll have the first results from the PES. We will release an estimate of net coverage error — the difference between the census count and the PES estimate — for the nation. We’ll also release the difference between the census counts and the PES estimates for a variety of demographic characteristics at the national level including:
We will also release estimates of the proportions of census records that are estimated to be correct, that are estimated to be wrong, or for which we don’t have enough information to be sure one way or the other. We refer to these as the “components of census coverage” and the three groups as “correct enumerations,” “erroneous enumerations,” and “whole-person imputations.” We also divide the estimated population size into people who were correctly enumerated in the census and person omissions (people who should have been enumerated but were not). Many of the omitted people may have been accounted for in the whole-person census imputations. More information about the components is available in The Post-Enumeration Survey: Measuring Coverage Error blog.
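The relationship between the components and net coverage error can be illustrated with simple arithmetic. The sketch below assumes the three record groups add up to the census count and that the population estimate splits into correct enumerations plus omissions, as described above. The function name and the figures are hypothetical, and the real PES components are survey-based estimates rather than exact tallies.

```python
def coverage_components(correct: float, erroneous: float, imputations: float,
                        population_estimate: float) -> dict:
    """Relate the components of census coverage to net coverage error."""
    # The census count is the sum of the three groups of census records.
    census_count = correct + erroneous + imputations

    # Omissions: people in the estimated population who were not
    # correctly enumerated in the census.
    omissions = population_estimate - correct

    shares = {
        "correct_pct": 100.0 * correct / census_count,
        "erroneous_pct": 100.0 * erroneous / census_count,
        "imputation_pct": 100.0 * imputations / census_count,
    }
    return {
        "census_count": census_count,
        "omissions": omissions,
        "net_coverage_error": census_count - population_estimate,
        **shares,
    }


# Made-up numbers for illustration only.
result = coverage_components(correct=940, erroneous=30, imputations=30,
                             population_estimate=1010)
print(result)
```

Under these assumptions, net coverage error equals erroneous enumerations plus whole-person imputations minus omissions, so a small net error can coexist with sizable components on each side.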
In the summer, we’ll release additional coverage estimates, including state-level results. More information on what’s scheduled for release this summer is available on the Post-Enumeration Surveys page.
The design of the PES included many activities to minimize potential survey errors. These activities included:
More information about the design of the PES is available in The Design of the Post-Enumeration Survey for the 2020 Census.
Through dual-system estimation, we use both the census and the PES — drawing strength from both. This can reduce the impact of possible errors that might influence each system individually and allows us to estimate the number of people missed in both data collections — the census and the PES. Yet, errors in the PES operations can affect the quality of the coverage estimates. A source and accuracy statement will be released along with the PES coverage report. This document describes the typical errors in surveys and discusses how they might impact the PES results.
Of course, conducting the survey amid the COVID-19 pandemic posed a number of challenges. Many in-person surveys collecting data in the summer of 2020 suffered high levels of nonresponse. The PES mitigated this by delaying interviews until many shutdowns had ended, increasing response.
Despite challenges related to PES data collection and operations, we think the PES estimates will produce a helpful picture of the census coverage. They will provide valuable insights into how census coverage differs by a variety of demographic characteristics, especially those that are unavailable in DA.
PES and DA are two different approaches to analyzing coverage in the census. A summary of the two approaches is available on the page Similarities and Differences Between the Demographic Analysis and Post-Enumeration Survey Coverage Programs.
Because the PES is based on interviews, it can produce estimates of coverage by reported characteristics that might change over time or not be available in administrative records. For example, the PES reports coverage by the race categories used in the census.
On the other hand, a strength of DA is that it primarily used existing data sources, so the pandemic did not affect DA’s methodology, as it did for the PES and census. DA estimates the population as of April 1, 2020. By that time, the pandemic had had a relatively small impact on births, deaths and international migration. We accounted for any additional deaths by using preliminary monthly mortality data from the National Center for Health Statistics.
Another key difference is that DA coverage estimates include people living in group quarters and Remote Alaska areas, while the PES excludes them. These groups are excluded from the PES estimates because many of them are likely to move between the census and PES interviews, making the matching and follow-up unreliable.
The fact that the PES is a survey also means that it is subject to sampling and nonsampling errors, such as nonresponse error. DA is subject to a different set of errors and assumptions. The differences in these errors can result in different estimates between the PES and DA.
The release of DA and PES coverage estimates on March 10 will provide two related but distinct pictures of 2020 Census coverage. They are two important pieces in continuing to evaluate overall census quality. Together, they will also inform how we plan for the 2030 Census. We look forward to discussing the data in the coming weeks.