The Post-Enumeration Survey: Measuring Coverage Error

December 16, 2021

Written by:

Timothy Kennel, Assistant Division Chief for Statistical Methods, Decennial Statistical Studies Division

Estimated reading time: 8 minutes

Although we undertake extensive efforts to accurately count everyone in the decennial census, sometimes people are missed or duplicated. Census errors can result in a smaller or larger population count than the actual number of people. The U.S. Census Bureau estimates the true population using the post-enumeration survey.

Since 1950, the Census Bureau has used a post-enumeration survey (although it has gone by various names) to measure this error and the accuracy of the census. The post-enumeration survey produces an independent estimate of the number of people in the United States. We then compare the census counts to the survey’s estimate and calculate the proportion of people in the estimated true population who were missed, duplicated, or counted by mistake in the census.

The post-enumeration survey is one of the many ways we evaluate the quality of the census. For example, we also compare census counts to other population benchmarks as described in our recent Using Demographic Benchmarks to Help Evaluate 2020 Census Results blog.

This blog provides more information about this decade’s post-enumeration survey, including when to expect the results.

How the 2020 Post-Enumeration Survey Works

The 2020 Post-Enumeration Survey (PES) uses a technique called “dual-system estimation,” with the two systems being the survey and the census. With this technique, the survey independently interviews people, asks where they lived on April 1, 2020, and then matches that information to the census results.

The survey takes more than two years and involves enumerating housing units and people from scratch in about 10,000 blocks across the country.

After we match these housing units and people to the list of addresses and people in the census, we’re able to determine who was counted:

In the census only.
In the PES only.
In both the census and PES.

We use this information to estimate how many people are in the U.S. and how many people were correctly counted in the census.

We use both the post-enumeration survey and census data in the estimation process. One advantage of dual-system estimation is that it combines two imperfect sources to estimate the population size more accurately than either could on its own.
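The counting logic behind dual-system estimation can be illustrated with the textbook capture-recapture (Lincoln-Petersen) formula. This is a simplified sketch, not the estimator used in the actual PES: it assumes the two sources are independent and that matching is error-free, and it ignores erroneous census records. The function name and the example counts are hypothetical.

```python
def dual_system_estimate(census_only: int, pes_only: int, both: int) -> float:
    """Textbook Lincoln-Petersen estimate of total population size.

    Assumes the census and the survey are independent and that people are
    matched between them without error (a simplification).
    """
    if both <= 0:
        raise ValueError("need at least one person found in both sources")
    census_total = census_only + both  # everyone found by the census
    pes_total = pes_only + both        # everyone found by the survey
    return census_total * pes_total / both


# Hypothetical counts for one small area (not real data).
estimate = dual_system_estimate(census_only=50, pes_only=40, both=900)
observed = 50 + 40 + 900
missed_by_both = estimate - observed  # people neither source found
print(round(estimate, 1), round(missed_by_both, 1))
```

In this sketch the estimate exceeds the number of people actually observed, because the formula also accounts for people missed by both the census and the survey.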

For a visual explanation of how dual-system estimation works, watch this video on the 2010 PES. We called the PES “Census Coverage Measurement” back then, but the basic statistical methods are the same.

What the PES Provides

As we did for the 2010 Census, we plan to provide two types of results from the post-enumeration survey:

Net coverage error.
Components of coverage.

Net Coverage Error

“Net coverage error” is the census count minus the PES estimate of the actual number of people in the U.S.

Figure: Comparing the Census Count and Post-Enumeration Survey (PES) Estimate

If the net coverage error is negative, it means the census counts were too low and the census missed some people. We call this an undercount. If it is positive, it means the census counts were too high, indicating some people may have been counted more than once. We call this an overcount.
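The sign convention can be sketched in a few lines. The function name and the input numbers are hypothetical, and expressing the error as a percentage of the PES estimate is an assumption made for this illustration.

```python
def net_coverage_error(census_count: float, pes_estimate: float) -> tuple[float, float, str]:
    """Return (net error, net error as a percent of the PES estimate, label)."""
    if pes_estimate <= 0:
        raise ValueError("PES estimate must be positive")
    net = census_count - pes_estimate    # negative means the census was too low
    rate = 100.0 * net / pes_estimate    # assumed denominator: the PES estimate
    if net < 0:
        label = "undercount"
    elif net > 0:
        label = "overcount"
    else:
        label = "no net error"
    return net, rate, label


# Hypothetical numbers: a census count of 990 against a PES estimate of 1,000.
print(net_coverage_error(990, 1000))
```

Because missed people and duplicated people offset each other, a net error near zero does not by itself mean that every person was counted correctly.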

We also look at the difference between the census counts and the PES estimates for a variety of demographic characteristics including:

- Age group.
- Sex.
- Race – specifically for the following races, alone or in combination with one or more other races:
  - White (with a breakout for non-Hispanic White alone).
  - Black.
  - Asian.
  - American Indian and Alaska Native (with breakouts for American Indians and Alaska Natives living on reservations, living on American Indian Areas off reservation, and living in the rest of the U.S.).
  - Native Hawaiian or Pacific Islander.
  - Some Other Race.
- Hispanic origin.

Then we compare the net coverage error rates across the various demographic groups to determine differences in how the groups were counted in the census. When a group has a larger or smaller net undercount than the country as a whole, we call this a “differential net undercount.”
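As a rough sketch of that comparison, the snippet below subtracts a national net coverage error rate from each group's rate. The group names and all rates are made up for illustration, and the function name is hypothetical.

```python
def differential_net_coverage(group_rates: dict[str, float], national_rate: float) -> dict[str, float]:
    """Difference between each group's net coverage error rate and the national rate.

    Rates are in percent; a negative rate means a net undercount.
    """
    return {group: rate - national_rate for group, rate in group_rates.items()}


# Made-up rates in percent (not real PES results).
national = -0.5
groups = {"Group A": -2.0, "Group B": 0.5}
differences = differential_net_coverage(groups, national)
print(differences)
```

Here Group A is undercounted by more than the country as a whole, while Group B is covered better than the national rate.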

Components of Coverage

Through PES fieldwork, follow-up, and analysis, we try to estimate the proportions of census records that are correct, that are wrong, or for which we don’t have enough information to be sure one way or the other.

We’ll report the “components of coverage” by breaking final census counts into three groups:

Correct enumerations. These refer to people counted in the census who were living in the U.S. on April 1, 2020 (the reference day for the census). According to the PES, the individuals should have been and were counted in the census. Of the household population counted in the 2010 Census, we estimated that 94.7% were enumerated correctly. (I explain below how we determine whether the enumeration is correct.)
Erroneous enumerations. According to the PES, these include duplicate records of people who were correctly counted in the census as well as people who were counted but should not have been. For example, they may have died before April 1, 2020, or may have been just visiting the country. Of the household population counted in the 2010 Census, we estimated that 3.3% were enumerated erroneously.
Whole-person imputations. For some records in the census, we didn’t receive a response with enough characteristics, so we used a statistical technique called whole-person imputation to fill in the blanks. (More information about imputation is described in a recent blog.) In 2010, 2.0% of the count fell into this category.

To determine the size of these categories, we count how many census records need whole-person imputation, and the PES estimates the other two groups.
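The arithmetic of that split can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the counts are made up, and the single `erroneous_rate` input stands in for whatever the PES actually estimates.

```python
def components_of_coverage(census_count: float, imputed_count: float, erroneous_rate: float) -> dict[str, float]:
    """Split a census count into three component shares, in percent.

    imputed_count is tallied directly from census records. erroneous_rate is
    a hypothetical survey-based estimate of the fraction of the remaining
    (non-imputed) records that are erroneous.
    """
    if census_count <= 0 or not 0 <= imputed_count <= census_count:
        raise ValueError("counts are inconsistent")
    if not 0.0 <= erroneous_rate <= 1.0:
        raise ValueError("erroneous_rate must be between 0 and 1")
    non_imputed = census_count - imputed_count
    erroneous = non_imputed * erroneous_rate
    correct = non_imputed - erroneous
    return {
        "correct": 100.0 * correct / census_count,
        "erroneous": 100.0 * erroneous / census_count,
        "whole_person_imputations": 100.0 * imputed_count / census_count,
    }


# Made-up inputs: 1,000 records, 20 of them imputed, 5% of the rest erroneous.
shares = components_of_coverage(1000, 20, 0.05)
print({name: round(value, 1) for name, value in shares.items()})
```

The three shares always sum to 100%, just as the 2010 figures of 94.7%, 3.3%, and 2.0% do.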

How Do We Know Whether an Enumeration Is Correct?

For census records in the PES sample blocks, we ask detailed questions about where people were living on April 1, 2020, in an independent follow-up interview. This follow-up interview is independent because we do not tell the people we interview who was counted in the census. Instead, we ask them to tell us about the household on April 1, 2020.

We then look for those people in the census files. If we find them, we have confirmation that the census record was correctly enumerated.

We use the information from the sample to estimate the number of correct and erroneous enumerations in the entire census file.
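One generic way to scale sample results up to a full file is a weighted ratio estimate, sketched below. This is a standard survey technique used here for illustration only, not a description of the actual PES estimator. The record structure, the weights, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SampledRecord:
    weight: float     # hypothetical sampling weight for this census record
    is_correct: bool  # outcome of the follow-up and matching for this record


def estimate_enumerations(sample: list[SampledRecord], census_records: float) -> tuple[float, float]:
    """Return (estimated correct, estimated erroneous) records in the full file."""
    total_weight = sum(record.weight for record in sample)
    if total_weight <= 0:
        raise ValueError("sample must carry positive total weight")
    correct_weight = sum(record.weight for record in sample if record.is_correct)
    correct_rate = correct_weight / total_weight  # weighted share judged correct
    return correct_rate * census_records, (1.0 - correct_rate) * census_records


# Made-up sample of three records and a made-up file of 1,000 records.
sample = [
    SampledRecord(weight=100.0, is_correct=True),
    SampledRecord(weight=100.0, is_correct=False),
    SampledRecord(weight=200.0, is_correct=True),
]
print(estimate_enumerations(sample, census_records=1000))
```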

It’s important to keep in mind that the PES — like all sample surveys — has its own sources of error. We try to measure many of these errors and will present them in a Source and Accuracy Statement.

PES Results

We expect to release preliminary results from the 2020 PES in the first quarter of calendar year 2022 and a second set in the summer of 2022. The first release will provide estimates of population coverage overall and for important demographic groups for the nation. The second release will provide estimates of population coverage for states and by some census operations, as well as for coverage of housing units. The reports will be similar to post-enumeration survey reports from 2010. New this decade, the data tables will also be available at data.census.gov.

The 2020 PES was designed to support national- and state-level estimates of census coverage. We plan to disseminate many of the tables from past decades, including national coverage estimates by race, Hispanic origin, age group, sex, and tenure (owners and renters). We will also produce numerous national tables showing components of census coverage by operational variables, such as the people enumerated through the Nonresponse Followup operation and by Type of Enumeration Area. In most regards, we plan to report 2020 Census coverage by characteristics similar to those we reported in the 2010 Census Coverage Measurement program.

One notable difference is that we do not plan to include tables showing census coverage for large counties or places. The methods used to estimate census coverage in 2010 were developed assuming a much larger sample than we have in the 2020 PES.

In 2010, the county and place estimates of net coverage were “synthetic estimates,” meaning they were modeled using averages across areas with a similar demographic composition. The estimates were not “direct estimates” based on observed coverage in the specific county or place. For this reason, the 2010 estimates of net coverage for counties and places may not have reflected the true coverage of those areas. As a result, our measure of the potential error in the estimates (the estimated mean squared error) did not always correctly capture the model error in the 2010 estimates. We learned this early in the planning for the 2020 PES.
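The general idea of a synthetic estimate can be sketched as follows. This is a generic illustration, not the 2010 model: the group names, the counts, and the coverage factors (a broader-area PES estimate divided by the census count for each group) are hypothetical.

```python
def synthetic_estimate(area_census_counts: dict[str, float], coverage_factors: dict[str, float]) -> float:
    """Apply group-level coverage factors from a broader area to a small area's counts.

    Each factor is assumed to be (estimated population / census count) for a
    demographic group, averaged over many areas rather than observed locally.
    """
    return sum(
        count * coverage_factors[group]
        for group, count in area_census_counts.items()
    )


# Made-up factors and two made-up counties with the same demographic mix.
factors = {"Group A": 1.02, "Group B": 0.99}
county_1 = {"Group A": 1000, "Group B": 500}
county_2 = {"Group A": 2000, "Group B": 1000}
for county in (county_1, county_2):
    census_total = sum(county.values())
    implied_coverage = synthetic_estimate(county, factors) / census_total
    print(round(implied_coverage, 4))
```

Both counties get the same implied coverage because they share a demographic mix, regardless of how well each one was actually counted. That is why a synthetic estimate may not reflect true local coverage.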

Earlier this decade, after reviewing the county and place estimates of the 2010 Census coverage, we concluded that we needed considerable research on the methods used to produce sub-state estimates and their mean squared errors. Given the sample size for the 2020 PES and the assumptions required to make sub-state estimates, we cannot include county or place estimates in the 2020 PES reports.

Summary

The post-enumeration survey will help us estimate how well the census covered — or counted — the population. The survey, along with other ways of evaluating quality, provides a measure of overall census quality and helps us identify ways to improve the next census.

Page Last Revised - March 23, 2022
