Quality in a Census Part 5

Mon Aug 30 2010
Robert Groves

Comparing the census counts to alternatives is one way to get a sense of how good the census is; if all alternative ways of estimating the size of the population yield about the same conclusions, we feel good about the results. The last post on quality described “demographic analysis” as an alternative way of measuring the population of the United States.

This post describes the post-enumeration sample survey approach to measuring the population. (This decade’s version of the post-enumeration survey is called “Census Coverage Measurement.”) Post-enumeration sample surveys have been part of US censuses for some time. For 2010 a survey will be used to estimate the number of persons missed by the census as well as those erroneously enumerated (e.g., duplicates and visitors from other countries).

A post-enumeration survey draws two samples – one from the full population in complete ignorance of whether sample cases were covered in the census; another from the census address universe. After a survey is done, each address and each person enumerated within the address are carefully matched to the census file, as a way of determining which cases were captured by both the census and the survey or by only one of the methods. From this matching operation, estimates of the misses and erroneous enumerations are made.
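
As an illustration only, the step from matched counts to a population estimate can be sketched with the textbook two-sample capture-recapture (Lincoln-Petersen) formula. This post does not spell out the estimator, so the formula, the function name, and all counts below are assumptions, and the treatment of erroneous enumerations is deliberately simplified.

```python
def dual_system_estimate(census_count, erroneous, survey_count, matched):
    """Textbook two-sample (Lincoln-Petersen) estimate of the true population.

    census_count: persons enumerated in the census
    erroneous:    census enumerations judged erroneous (e.g., duplicates)
    survey_count: persons found by the independent survey
    matched:      survey persons who were also found in the census file
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    # Remove erroneous enumerations before scaling up.
    correct = census_count - erroneous
    # The share of survey persons the census also captured estimates coverage.
    coverage_rate = matched / survey_count
    return correct / coverage_rate


# Hypothetical counts for one small area, not real census data.
estimate = dual_system_estimate(census_count=1000, erroneous=20,
                                survey_count=500, matched=450)
net_undercount = estimate - 1000
```

With these made-up counts the census covered 90 percent of the survey persons, so the 980 correct enumerations scale up to an estimate of about 1,089 people, a net undercount of roughly 89.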

Like demographic analysis, the post-enumeration sample survey would, in its ideal form, offer completely accurate estimates of differential undercount, the tendency for some populations to be covered by the census less well than others. However, the ideal post-enumeration survey is never achieved.

In an ideal post-enumeration sample survey,

a) The likelihood that a sample person is measured in the survey is completely independent of the likelihood that the person is measured in the census. (More loosely stated, those who were covered by the census and those not covered by the census have the same probability of being measured in the survey.)

Problem: This independence is not fully achievable.

Fix: The field staff working on the post-enumeration survey are different from those working on the census, and use different materials. Overlap between the operations is kept to a minimum. This means that the operations are kept independent, but if those who are reluctant to respond in a census are also reluctant to respond in the survey, the problem remains.

b) The probability of being captured in the census is the same for all persons, and the probability of being measured in the survey is likewise the same for all persons.

Problem: The assumption is violated in both the census and the survey; the probabilities of capture range widely among people with different lifestyles (e.g., very mobile young singles who live by themselves vs. nuclear families).

Fix: Group people with similar characteristics who share similar probabilities of being captured in the census; use statistical models to reduce the effect of violating this assumption.
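
A small numerical sketch may help show why grouping matters. It assumes the textbook Lincoln-Petersen estimator and two invented groups whose sizes and capture probabilities are purely hypothetical; it works with expected counts rather than simulated data, and it is not the model actually used for the 2010 survey.

```python
# Hypothetical groups: (true size, census capture prob., survey capture prob.)
groups = {
    "mobile young singles": (1000, 0.70, 0.60),
    "nuclear families": (4000, 0.95, 0.90),
}


def expected_counts(size, p_census, p_survey):
    """Expected census, survey, and matched counts, assuming the two
    captures are independent within a group."""
    return size * p_census, size * p_survey, size * p_census * p_survey


def lincoln_petersen(census, survey, matched):
    """Textbook two-sample estimate of population size."""
    return census * survey / matched


counts = [expected_counts(*g) for g in groups.values()]

# Pooled: ignore the groups and apply the estimator to the totals.
pooled = lincoln_petersen(*(sum(c[i] for c in counts) for i in range(3)))

# Grouped: estimate each group separately, then add the estimates.
grouped = sum(lincoln_petersen(*c) for c in counts)
```

Under these assumptions the grouped estimate recovers the true total of 5,000, while the pooled estimate comes out near 4,922. Ignoring the difference in capture probabilities understates the population, which is the effect the grouping is meant to reduce.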

c) The respondent to the survey correctly reports his or her April 1, 2010, residence and household composition.

Problem: The survey interviewing begins in mid-August 2010; some persons, especially people who have moved, may have difficulty recalling their April 1 residence status.

Fix: Additional fieldwork and statistical models can be used to mitigate the effect of incomplete information and reporting errors. Further, an auxiliary study has been mounted to estimate what difficulties persons have in accurately reporting their April 1 residence.

d) The survey operation collects the information requested completely.

Problem: Not all persons in the sample survey will be contacted or agree to participate; those who don’t may have distinctive characteristics on geography or person-level attributes.

Fix: We try to contact proxies, neighbors or others who know the nonrespondent cases. Statistical models will also be used in an attempt to remove the nonresponse errors from the sample survey.

Finally, an inherent weakness of a post-enumeration survey is that it is based on just a sample of the population, not the total population. This means that all estimates from it are subject to instability due to sampling variability. This variability, however, can be measured and reported, as with the “margins of error” or sampling error figures we are accustomed to seeing in surveys.
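
For readers who want to see what such a margin of error looks like, here is a rough sketch for a single sample proportion, such as the share of survey persons matched to the census. It assumes simple random sampling and a normal approximation, which a real post-enumeration survey with its complex design would not satisfy; the function name and the counts are hypothetical.

```python
import math


def match_rate_margin_of_error(matched, sample_size, z=1.96):
    """Sample proportion and its approximate 95% margin of error.

    Assumes simple random sampling and a normal approximation; this is an
    illustration of the idea, not the survey's actual variance method.
    """
    p = matched / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    return p, z * standard_error


# Hypothetical sample: 450 of 500 survey persons matched to the census.
rate, moe = match_rate_margin_of_error(matched=450, sample_size=500)
```

With these invented numbers the match rate is 0.90 with a margin of error of roughly 0.026, or about 2.6 percentage points either way. A larger sample would shrink that margin.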

As this admittedly high-level description itself demonstrates, the statistical complexity of the post-enumeration survey is high; the technical expertise required to construct the estimates and evaluate them is considerable. Whenever possible, violations of the assumptions underlying the undercount estimates will themselves be investigated, but no one involved believes that perfection will be attained. Instead, as with demographic analysis, we will openly document the strengths and weaknesses of the estimates from the post-enumeration survey, so that all can form their own judgments about the usefulness of those estimates for evaluating the census.

We will not have the statistical results of the post-enumeration survey until 2012, so we will have to wait a while to compare this alternative way of measuring the population to the 2010 census. Stay tuned!
