Although we undertake extensive efforts to accurately count everyone in the decennial census, sometimes people are missed or duplicated. Census errors can result in a smaller or larger population count than the actual number of people. The U.S. Census Bureau estimates the true population using the post-enumeration survey.
Since 1950, the Census Bureau has used a post-enumeration survey (although it has gone by various names) to measure this error and the accuracy of the census. The post-enumeration survey creates a precise alternative estimate of the number of people in the United States. We then compare the census counts to the survey’s estimate and calculate the proportion of people in the estimated true population who were missed, duplicated, or counted by mistake in the census.
The post-enumeration survey is one of the many ways we evaluate the quality of the census. For example, we also compare census counts to other population benchmarks as described in our recent Using Demographic Benchmarks to Help Evaluate 2020 Census Results blog.
This blog provides more information about this decade’s post-enumeration survey, including when to expect the results.
The 2020 Post-Enumeration Survey (PES) uses a technique called “dual-system estimation,” with the two systems being the survey and the census. With this technique, the survey independently interviews people, asks where they lived on April 1, 2020, and then matches that information to the census results.
The survey takes more than two years and involves enumerating housing units and people from scratch in about 10,000 blocks across the country.
After we match these housing units and people to the list of addresses and people in the census, we’re able to determine who was counted:
We use this information to estimate how many people are in the U.S. and how many people were correctly counted in the census.
In fact, we use both the post-enumeration survey and census data in the estimation process. One advantage of dual-system estimation is that it combines two imperfect sources to estimate the population size more accurately than either could on its own.
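To make the idea concrete, here is a minimal sketch of the textbook (Lincoln-Petersen) form of a dual-system estimate. It is an illustration only: the function name and all numbers are hypothetical, and the production PES estimation involves sampling weights and modeling that this sketch leaves out.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook Lincoln-Petersen estimate of the total population.

    census_count:  people counted by the first system (the census)
    survey_count:  people counted by the second system (the survey)
    matched_count: people found in both systems
    """
    if matched_count <= 0:
        raise ValueError("at least one matched person is required")
    return census_count * survey_count / matched_count


# Hypothetical counts for one area, for illustration only.
estimate = dual_system_estimate(census_count=900, survey_count=800, matched_count=750)
print(estimate)  # 960.0
```

In this made-up example both sources fall short of the estimate (900 and 800 versus 960), which is the sense in which two imperfect counts together can say more than either does alone.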
For a visual explanation of how dual-system estimation works, watch this video on the 2010 PES. We called the PES “Census Coverage Measurement” back then, but the basic statistical methods are the same.
As we did for the 2010 Census, we plan to provide two types of results from the post-enumeration survey:
“Net coverage error” is the difference between the census count and the PES estimate of the actual number of people in the U.S.
If the net coverage error is negative, it means the census counts were too low and the census missed some people. We call this an undercount. If it is positive, it means the census counts were too high, indicating some people may have been counted more than once. We call this an overcount.
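The sign convention can be sketched in a few lines. The function names and numbers below are hypothetical, and expressing the rate as a percentage of the PES estimate is an assumption made for this illustration.

```python
def net_coverage_error(census_count, pes_estimate):
    """Census count minus the PES estimate of the actual population."""
    return census_count - pes_estimate


def net_coverage_error_rate(census_count, pes_estimate):
    """Net coverage error as a percentage of the PES estimate (assumed denominator)."""
    return 100 * (census_count - pes_estimate) / pes_estimate


def describe(error):
    """Label the error using the sign convention in the text."""
    if error < 0:
        return "undercount"
    if error > 0:
        return "overcount"
    return "no net error"


# Hypothetical numbers, for illustration only.
error = net_coverage_error(census_count=990, pes_estimate=1000)
print(error, describe(error))  # -10 undercount
print(net_coverage_error_rate(census_count=990, pes_estimate=1000))  # -1.0
```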
We also look at the difference between the census counts and the PES estimates for a variety of demographic characteristics including:
Then we compare the net coverage error rates across the various demographic groups to determine differences in how the groups were counted in the census. When a group has a larger or smaller net undercount than the country as a whole, we call this a “differential net undercount.”
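As a rough sketch of that comparison, the snippet below subtracts a national rate from each group's rate. The group names and rates are hypothetical placeholders, not PES results, and a real comparison would also account for sampling error.

```python
def differential(group_rate, national_rate):
    """How far a group's net coverage error rate sits from the national rate."""
    return group_rate - national_rate


# Hypothetical net coverage error rates in percent, for illustration only.
national_rate = -1.0
group_rates = {"group_a": -3.0, "group_b": 0.5}

differentials = {
    group: differential(rate, national_rate) for group, rate in group_rates.items()
}
print(differentials)  # {'group_a': -2.0, 'group_b': 1.5}
```

A negative value here means the group was covered less completely than the nation as a whole, and a positive value means it was covered more completely.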
Through PES fieldwork, follow-up, and analysis, we try to estimate the proportions of census records that are correct, that are erroneous, or for which we don’t have enough information to be sure one way or the other.
We’ll report the “components of coverage” by breaking final census counts into the three groups:
To determine the size of these categories, we count how many census records need whole-person imputation, and the PES estimates the other two groups.
For census records in the PES sample blocks, we ask detailed questions about where people were living on April 1, 2020, in an independent follow-up interview. This follow-up interview is independent because we do not tell the people we interview who was counted in the census. Instead, we ask them to tell us about the household on April 1, 2020.
We then look for those people in the census files. If we find them, we have confirmation that the census record was correctly enumerated.
We use the information from the sample to estimate the number of correct and erroneous enumerations in the entire census file.
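The sketch below illustrates the general idea of scaling sample results up to the full census file. It assumes a simple unweighted proportion, whereas a real survey would use sampling weights. The function name and all numbers are hypothetical.

```python
def components_of_coverage(census_total, imputations, sample_correct, sample_erroneous):
    """Split a census total into three components.

    Whole-person imputations are counted directly. The remaining records are
    split into correct and erroneous enumerations using the share observed in
    the sample (assumption: a simple unweighted proportion).
    """
    remaining = census_total - imputations
    sample_total = sample_correct + sample_erroneous
    if sample_total <= 0:
        raise ValueError("the sample must contain at least one record")
    erroneous = remaining * sample_erroneous / sample_total
    correct = remaining - erroneous
    return {
        "correct": correct,
        "erroneous": erroneous,
        "whole_person_imputations": imputations,
    }


# Hypothetical numbers, for illustration only.
parts = components_of_coverage(
    census_total=1000, imputations=20, sample_correct=95, sample_erroneous=5
)
print(parts)
```

By construction, the three components add back up to the census total.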
It’s important to keep in mind that the PES — like all sample surveys — has its own sources of error. We try to measure many of these errors and will present them in a Source and Accuracy Statement.
We expect to release preliminary results from the 2020 PES in the first quarter of calendar year 2022 and a second set in the summer of 2022. The first release will provide estimates of population coverage overall and for important demographic groups for the nation. The second release will provide estimates of population coverage for states and by some census operations, as well as for coverage of housing units. The reports will be similar to post-enumeration survey reports from 2010. New this decade, the data tables will also be available at data.census.gov.
The 2020 PES was designed to support national- and state-level estimates of census coverage. We plan to disseminate many of the tables from past decades, including national coverage estimates by race, Hispanic origin, age group, sex, and tenure (owners and renters). We will also produce numerous national tables showing components of census coverage by operational variables, such as the people enumerated through the Nonresponse Followup operation and by Type of Enumeration Area. In most regards, we plan to report 2020 Census coverage by characteristics similar to those we reported in the 2010 Census Coverage Measurement program.
One notable difference is that we do not plan to include tables showing census coverage for large counties or places. The methods used to estimate census coverage in 2010 were developed assuming a much larger sample than we have in the 2020 PES.
In 2010, the county and place estimates of net coverage were “synthetic estimates,” meaning they were modeled using averages across areas with a similar demographic composition. They were not “direct estimates” based on observed coverage in the specific county or place. For this reason, the 2010 estimates of net coverage for counties and places may not have reflected the true coverage of those areas. As a result, our measure of the potential error in the estimates (the estimated mean squared error) did not always correctly capture the model error in the 2010 estimates. We learned this early in the planning for the 2020 PES.
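To show what separates a synthetic estimate from a direct one, here is a simplified sketch of the general technique. It is not the 2010 methodology: the group names, the correction factors, and the counts are all hypothetical.

```python
def synthetic_estimate(area_census_counts, group_factors):
    """Apply broad-area coverage factors to a small area's census counts.

    area_census_counts: census count per demographic group in the small area
    group_factors:      estimated population / census count per group, taken
                        from a larger area (hypothetical values here)
    """
    return sum(
        count * group_factors[group] for group, count in area_census_counts.items()
    )


# Hypothetical factors averaged over a larger area, for illustration only.
group_factors = {"group_a": 1.02, "group_b": 0.99}

county_x = {"group_a": 600, "group_b": 400}
county_y = {"group_a": 600, "group_b": 400}

print(synthetic_estimate(county_x, group_factors))
print(synthetic_estimate(county_y, group_factors))
```

Because the two hypothetical counties have the same demographic composition, they receive the same synthetic estimate no matter how well each was actually counted. That is why such estimates may not reflect an area's true coverage.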
Earlier this decade, after reviewing the county and place estimates of the 2010 Census coverage, we concluded that we needed considerable research on the methods used to produce sub-state estimates and their mean squared errors. Given the sample size for the 2020 PES and the assumptions required to make sub-state estimates, we cannot include county or place estimates in the 2020 PES reports.
The post-enumeration survey will help us estimate how well the census covered — or counted — the population. The survey, along with other ways of evaluating quality, provides a measure of overall census quality and helps us identify ways to improve the next census.