U.S. Census Bureau

Appendix B

Technical Appendix

1. Introduction

This study uses demographic benchmarks and other analytic techniques to evaluate the results of the Census 2000 Dress Rehearsal. In the Dress Rehearsal sites of Sacramento City, California, and Menominee County, Wisconsin, the independent benchmarks are used to evaluate the census estimates and Integrated Coverage Measurement (ICM) results. In the South Carolina site (eleven counties, including Columbia City), the demographic benchmarks are used to evaluate the census results and Post Enumeration Survey (PES) coverage estimates.

The use of independent demographic benchmarks to evaluate the census and ICM results relies on five estimates or data sets. Each involves the development of population or housing benchmarks in the test sites for 1998 (and comparable estimates for 1990 and earlier where possible) using a different technique. Some of the benchmarks refer to the total population; others pertain to limited age groups only. Our objective is to determine whether the different benchmarks can together provide an independent basis to evaluate the effectiveness of the one-number census and ICM in 1998. Because the demographic estimates carry some uncertainty, no one method can stand alone. If, however, all or most estimates are consistent in pointing to a reduction in the differential undercount in the Dress Rehearsal sites, we can have more confidence in the results.

2. Methodology

The five approaches to developing independent demographic benchmarks are as follows:

  1. Population benchmarks based on previous census results (1970-1990).

  2. Independent benchmarks of the total population in the test sites in 1998.

  3. Independent demographic benchmarks for the population aged 0-9 for 1998 and historical coverage indicators.

  4. Independent benchmarks of the population 65+ based on Medicare data.

  5. Independent benchmarks of the school-age population based on school enrollment data.

The development of the above benchmarks is described below.

The methods employed in this project use aggregate level administrative data to produce the independent benchmarks. Other research is being conducted at the Census Bureau which uses administrative records at the individual level (i.e., matching of records). That project is not included in this work, but a comprehensive integrated coverage measurement program in 2000 would draw from both approaches (aggregate and individual levels).

2.1 Historical Decennial Census Data

The detailed historical data available from the decennial censuses provide the backbone for some of the demographic benchmarks in this and other demographic studies. These data include housing unit and population counts, group quarters (GQ) population, vacancy rates, persons per household, age/sex distributions, and race/Hispanic origin distributions. The decennial data for 1990 serve as the benchmark for the postcensal estimates described in Section 2.2. Historical census data for the population under 15 are inputs to the demographic analysis coverage indicators described in Section 2.3.

2.2 Independent benchmarks of the total population in the dress rehearsal sites in 1998.

The Census Bureau, as part of its postcensal population estimates program, produces annual estimates of the total population for states, counties, and places. The estimates for the 1990s are based on the 1990 census, carried forward with estimated components of change (births, deaths, net migration). The estimates program represents a cost-effective, operationally feasible, and timely source of independent benchmarks for evaluating the results in the three Dress Rehearsal sites.

The 1998 population estimates are produced with the following demographic accounting equation:

  TP98       =       TC90 + B90-98 - D90-98 + M90-98 (1)
where:        
  TP98       =       "Census-level" estimate for the total population (all ages) in 1998  
  TC90       =       Census count of the total population in 1990  
  B90-98       =       Births occurring between the census (1990) and the estimate date (1998)  
  D90-98       =       Deaths to the population occurring between the census (1990) and the estimate date (1998)  
  M90-98       =       Estimated net migration occurring between the census (1990) and the estimate date (1998) (includes domestic and international migration)  
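As a rough illustration of equation 1, the sketch below carries a 1990 census count forward to 1998 with the three components of change. The function name and all input values are hypothetical and chosen only to show the arithmetic; they are not Dress Rehearsal figures.

```python
def postcensal_estimate(census_1990, births, deaths, net_migration):
    """Equation 1: TP98 = TC90 + B90-98 - D90-98 + M90-98."""
    return census_1990 + births - deaths + net_migration


# Hypothetical inputs for one site (not actual Dress Rehearsal figures).
tp98 = postcensal_estimate(
    census_1990=100_000,   # TC90
    births=14_000,         # B90-98
    deaths=7_000,          # D90-98
    net_migration=-2_000,  # M90-98, negative for net out-migration
)
print(tp98)  # 105000
```

The result is a "census-level" estimate: it inherits whatever net undercount was present in the 1990 count.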

A description of the Census Bureau methodology of producing subnational population estimates is provided by Batutis (1994), Long (1993) and Sink (1996). An evaluation of the accuracy of the 1990 postcensal population estimates (compared to 1990 census counts) is provided by Davis (1994). For an additional independent source, we have obtained the population estimates for Sacramento City from the California State agency (Demographic Research Unit, Department of Finance) and for Menominee County from the Wisconsin State agency (Demographic Services Center). The California estimate is derived using the housing unit method; the Wisconsin estimates are based on a combination of the ratio difference method and the composite method. We were not able to obtain independent estimates from South Carolina (that State uses Census Bureau estimates).

For this evaluation, allowance is also made for net underenumeration in 1990 (the standard population estimates are "census level"). This is accomplished by adding a factor for undercount to the estimate from equation 1:

  TP'98       =       TP98 + TU90 (2)
where:        
  TP'98       =       Adjusted-level estimate for the total population in 1998  
  TU90       =       Estimate of net undercount in the 1990 census.

The factor for undercount can be based on the 1990 PES and supplemented by illustrative estimates developed in Section 2.3.
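The adjustment in equation 2 can be sketched as follows. The helper that converts an undercount rate into a count is an assumption added for illustration: it treats the rate as a percent of the true (adjusted) 1990 population, which the text does not specify for the PES-based factor. All names and values are hypothetical.

```python
def undercount_from_rate(census_count, rate_percent):
    """Convert a net undercount rate into a count (TU90).

    Assumption: the rate is a percent of the true population,
    rate = 100 * U / (C + U), which gives U = C * rate / (100 - rate).
    """
    return census_count * rate_percent / (100.0 - rate_percent)


def adjusted_estimate(census_level_estimate, net_undercount_1990):
    """Equation 2: TP'98 = TP98 + TU90."""
    return census_level_estimate + net_undercount_1990


# Hypothetical values: a 1990 count of 100,000 with a 2 percent net
# undercount, and a census-level 1998 estimate of 105,000.
tu90 = undercount_from_rate(census_count=100_000, rate_percent=2.0)
tp98_adjusted = adjusted_estimate(census_level_estimate=105_000,
                                  net_undercount_1990=tu90)
print(round(tu90), round(tp98_adjusted))  # 2041 107041
```

Note that equation 2 adds the 1990 undercount as a fixed amount; it does not age or migrate the undercounted population forward to 1998.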

2.3 Independent demographic benchmarks for the population aged 0-9 in 1998 and historical coverage indicators for 1960-1990.

The demographic benchmarks described in Section 2.2 focus on the total population only. To assess relative undercount for specific age-sex-race groups, this section describes an approach that can provide useful historical coverage benchmarks.

New research is being conducted to produce demographic indicators of coverage for the youngest age groups (0-9) at the State and county level. The measurement of coverage for young children is singled out for three reasons:

  1. Undercoverage is relatively high in these ages (West and Robinson, 1998), and differentials by race are significant (mirroring those of the total population)

  2. The development of subnational estimates for younger ages is more feasible than older ages because error in measuring net migration is reduced

  3. The indicators can be produced for a series of censuses (e.g., 1960-1990, 1998), providing important historical measures of change in coverage at the subnational level.

Preliminary research on the development of this methodology is described in Robinson (1994). The methodology can be extended to provide crude indicators of coverage for the Dress Rehearsal sites. The method was successfully used in a comprehensive evaluation of the quality of the 1995 test census results for Oakland, CA, Paterson, NJ, and six parishes in Louisiana (see Robinson, 1996).

To produce historical demographic indicators of population and coverage for States or counties, birth and death statistics are compiled from available data and net migration is estimated on the basis of changes in cohort size between successive censuses. The equations for the specific estimates from the 1990 and 1998 censuses are as follows:

  0-9P'90       =       B80-90/brc80-90 - D80-90 + M80-90 (3)
  0-9P'98       =       B88-98/brc88-98 - D88-98 + M88-98 (4)
where:        
  0-9P'x       =       Demographic estimate for the population aged 0-9 in 1990 and 1998  
  Bx,x+10       =       Births occurring in the intercensal period x to x+10 (1980-90, 1988-98)  
  brcx,x+10       =       Estimated registration completeness of births in the intercensal period x to x+10  
  Dx,x+10       =       Deaths occurring to the birth cohort in the intercensal period x to x+10 (1980-90, 1988-98). Deaths have initially been estimated with life table survival rates.  
  Mx,x+10       =       Estimated net migration occurring in the intercensal period x to x+10 (1980-90, 1988-98)  

The actual calculations in equations 3-4 are carried out in single-year-of-age detail (0, 1, ..., 8, 9).
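A minimal sketch of equations 3-4 in single-year-of-age detail is shown below. It assumes the four components are already available for each of the ten birth cohorts; the function names and the input values are hypothetical.

```python
def cohort_estimate(births, completeness, deaths, net_migration):
    """Equations 3-4 for a single birth cohort: B / brc - D + M."""
    return births / completeness - deaths + net_migration


def population_aged_0_to_9(births, completeness, deaths, net_migration):
    """Sum the single-year cohort estimates for ages 0 through 9."""
    series = (births, completeness, deaths, net_migration)
    if any(len(s) != 10 for s in series):
        raise ValueError("expected one value per single year of age, 0-9")
    return sum(
        cohort_estimate(b, c, d, m)
        for b, c, d, m in zip(births, completeness, deaths, net_migration)
    )


# Hypothetical inputs, one value per age 0..9 (identical here for brevity).
births = [990] * 10          # registered births for each cohort
completeness = [0.99] * 10   # estimated share of births that were registered
deaths = [10] * 10           # deaths to each cohort
migration = [5] * 10         # estimated net migration for each cohort

estimate = population_aged_0_to_9(births, completeness, deaths, migration)
print(round(estimate))  # 9950
```

Dividing registered births by the completeness factor inflates them to an estimate of all births before deaths and migration are applied.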

The difference between the estimated population (0-9P'x) and the census count (0-9Cx) is an indicator of the net census undercount and net undercount rate:

  0-9Ux       =       0-9P'x - 0-9Cx (5)
  0-9rx       =       (0-9Ux/0-9P'x) * 100 (6)
where:        
  0-9Ux       =       indicator of net undercount of the population aged 0-9 at time x  
  0-9rx       =       indicator of net undercount rate (percent)  
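Equations 5-6 reduce to a difference and a percentage, as the sketch below shows. The estimate and census count used here are hypothetical values, not results from any site.

```python
def net_undercount(estimate, census_count):
    """Equation 5: demographic estimate minus census count."""
    return estimate - census_count


def net_undercount_rate(estimate, census_count):
    """Equation 6: net undercount as a percent of the demographic estimate."""
    return 100.0 * net_undercount(estimate, census_count) / estimate


# Hypothetical values for the population aged 0-9 in one site.
estimate_0_9 = 9_950   # demographic estimate
census_0_9 = 9_552     # census count

print(net_undercount(estimate_0_9, census_0_9))       # 398
print(net_undercount_rate(estimate_0_9, census_0_9))  # 4.0
```

With this sign convention, a negative value would indicate that the census count exceeds the demographic estimate.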

For the initial testing of this subnational demographic methodology for censuses to 1990, the net migration component (M) is estimated from the ratio of the census count for the population aged 10-14 in one census to the census count for ages 0-4 in the previous census. An adjustment is made for mortality and for relative change in coverage of the cohort between censuses. The equation is illustrated for the 1980-90 net migration estimate:

  10-14MR80-90       =       [10-14C90 / (0-4C80 * SR80-90)] * ru80-90 (7)
where:        
  10-14MR80-90       =       Ratio of the cohort aged 10-14 in 1990 to 0-4 in 1980  
  10-14C90 & 0-4C80       =       Census count of the cohort aged 10-14 in 1990 and 0-4 in 1980  
  SR80-90       =       Survival rate from age 0-4 in 1980 to age 10-14 in 1990 (intercensal period)  
  ru80-90       =       Change in census coverage (1990 versus 1980) of the population aged 10-14 in 1990 and 0-4 in 1980  

In application, if the ratio (MR) for age 10-14 in 1990 is greater than 1.0 we infer net in-migration and if the ratio is less than 1.0 we infer net out-migration. Migration "rates" and migration amounts (M's) for individual ages 0 to 9 are interpolated from the implied rate for age 10-14 (MR).
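The cohort ratio in equation 7 and the inference drawn from it can be sketched as follows. The function names, the input values, and the handling of a ratio of exactly 1.0 are illustrative assumptions; the interpolation to individual ages 0 to 9 is not specified in enough detail here to reproduce and is left out.

```python
def migration_ratio(count_10_14_in_1990, count_0_4_in_1980,
                    survival_rate, coverage_change):
    """Equation 7: MR = [C90 / (C80 * SR)] * ru."""
    expected_survivors = count_0_4_in_1980 * survival_rate
    return count_10_14_in_1990 / expected_survivors * coverage_change


def migration_direction(ratio):
    """Interpret the ratio as described in the text."""
    if ratio > 1.0:
        return "net in-migration"
    if ratio < 1.0:
        return "net out-migration"
    return "no net migration implied"  # boundary case, added for completeness


# Hypothetical cohort counts for one county.
mr = migration_ratio(
    count_10_14_in_1990=5_390,
    count_0_4_in_1980=5_000,
    survival_rate=0.98,     # SR80-90
    coverage_change=1.0,    # ru80-90, 1.0 means no change in coverage
)
print(round(mr, 3), migration_direction(mr))  # 1.1 net in-migration
```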

The migration estimates for subnational areas can be improved by using actual census data on net migration specific to the area (from the questions on State of birth and residence 5 years ago). For the estimates at the county level, historical trends can be supplemented with current administrative sources (e.g., school enrollments) to develop preliminary net migration estimates for the 0-9 population (Ahmed, 1998).

The subnational demographic indicators of coverage for the population aged 0-9 (as carried out in equations 3-4) are actually developed separately for Blacks and Nonblacks. Inconsistencies in the reporting of race affect the quality of the race-specific estimates, but the measures for Blacks can still provide crude indicators of racial differentials in coverage.

Although we focus on the period since 1990 in this study, coverage measures for the population under age 10 since 1960 can be derived. We will develop these estimates in later research.

2.4 Independent benchmarks of the population 65+ in the test sites in 1998 based on Medicare data.

Medicare data have been used extensively in the development of demographic analysis coverage estimates for the nation and in the production of postcensal population estimates for States and counties. Medicare tabulations for the most recent available date (1997) provide independent benchmarks for assessing the 1998 census results for the population 65 and over. Aggregate Medicare tabulations are not available below the county level, so this evaluation focuses on the results for the eleven counties in South Carolina. The population is too small in Menominee County to produce reliable results.

We utilize Medicare data to assess coverage of the population 65 and older as follows:

  Ratiox       =       65+Px / 65+Mx (8)
where:        
  Ratiox       =       Ratio of census population to Medicare enrollment (in 1990 or 1998)  
  65+Px       =       Census population 65 and over (in 1990 or 1998)  
  65+Mx       =       Count of the number of persons aged 65 and over enrolled in Medicare (in 1990 or 1998)  

The ratios of the census population to Medicare enrollment in 1990 and 1998 are used to broadly assess change in coverage. If the ratio in 1998 is greater than the ratio in 1990 we infer an improvement in census coverage of the population aged 65 and over; if the ratio in 1998 is lower we infer a decline in coverage.

The ratios themselves cannot be used as direct measures of coverage because of known differences between the census and Medicare universes. First, no allowance is made for underenrollment in the Medicare files (estimated to be about 3 percent nationally). Second, the county of residence in the census could differ from that reported in the Medicare file (e.g., the address of a doctor's office). As long as the underenrollment and residency reporting remain about the same in 1990 and 1998, the change in the ratios can be used as a rough indicator of change in coverage. The Medicare data for the eleven counties in South Carolina were obtained from the Health Care Financing Administration.
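The sketch below works through the Medicare comparison with made-up numbers. It assumes a hypothetical county whose Medicare enrollment is the same in both years, so that only census coverage changes; the function names and all values are illustrative.

```python
def census_to_medicare_ratio(census_65_plus, medicare_65_plus):
    """Equation 8: census population 65+ divided by Medicare enrollment 65+."""
    return census_65_plus / medicare_65_plus


def describe_change(ratio_1990, ratio_1998):
    """Interpret the change in the ratio as described in the text."""
    if ratio_1998 > ratio_1990:
        return "improvement in coverage"
    if ratio_1998 < ratio_1990:
        return "decline in coverage"
    return "no change in coverage"


# Hypothetical county: Medicare enrollment is 9,700 in both years, while
# the census counts 9,600 persons aged 65+ in 1990 and 9,800 in 1998.
ratio_1990 = census_to_medicare_ratio(9_600, 9_700)
ratio_1998 = census_to_medicare_ratio(9_800, 9_700)

print(round(ratio_1990, 3), round(ratio_1998, 3))
print(describe_change(ratio_1990, ratio_1998))  # improvement in coverage
```

In this example the 1998 ratio exceeds 1.0 because Medicare enrollment understates the population, which is why the ratio is read only for its change between years and not as a coverage rate.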

2.5 School Enrollment Data

Administrative data on school enrollment are inexpensive and provide independent benchmarks for evaluating coverage of the school-age population. The school enrollment data, which are quite complete (especially for public schools), can be compared with the enumerated school-age population to provide coverage indicators in the Dress Rehearsal sites at the time of the 1990 census and the Dress Rehearsal. This provides an effective means to measure change in completeness of coverage between the two points in time (1990 and 1998), in a manner similar to the use of the Medicare data for the older population.

We utilize school enrollment data to infer the change in coverage as follows:

  Ratiox       =       7-14Px / SEx (9)
where:        
  Ratiox       =       Ratio of census population to school enrollment (in 1990 or 1998)  
  7-14Px       =       Census population aged 7-14 (in 1990 or 1998)  
  SEx       =       Count of the number of persons enrolled in grades 1 to 8 (in 1990 or 1998)

If the ratio in 1998 is greater than the ratio in 1990 we infer an improvement in census coverage of the school-age population; if the ratio in 1998 is lower we infer a decline in coverage. As with the Medicare-based ratios, the ratios themselves cannot be used as direct measures of coverage because of known differences between the census and school enrollment universes. First, no allowance is made for children not enrolled in school (including those in institutions and those who are home-schooled). Second, the county of residence in the census could differ from that reported in the school file (e.g., the location of the school's address). As long as the enrollment levels and school districts remain about the same in 1990 and 1998, the change in the ratios can be used as a rough indicator of change in coverage. The data on school enrollment for the eleven counties in South Carolina were obtained from the State education agency.
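Because the school enrollment comparison is made county by county, the sketch below tabulates the change in the equation 9 ratio for several areas at once. The county labels, the data layout, and every number are hypothetical placeholders, not South Carolina data.

```python
def enrollment_ratio(census_7_14, enrolled_grades_1_8):
    """Equation 9: census population aged 7-14 over enrollment in grades 1-8."""
    return census_7_14 / enrolled_grades_1_8


def ratio_changes(sites):
    """Return the 1998 ratio minus the 1990 ratio for each site.

    `sites` maps a site name to {year: (census_7_14, enrollment)}.
    """
    changes = {}
    for name, years in sites.items():
        ratio_1990 = enrollment_ratio(*years[1990])
        ratio_1998 = enrollment_ratio(*years[1998])
        changes[name] = ratio_1998 - ratio_1990
    return changes


# Hypothetical inputs: (census population 7-14, enrollment in grades 1-8).
sites = {
    "County A": {1990: (4_750, 5_000), 1998: (5_200, 5_300)},
    "County B": {1990: (2_940, 3_000), 1998: (2_850, 3_000)},
}

for name, change in ratio_changes(sites).items():
    if change > 0:
        direction = "improvement"
    elif change < 0:
        direction = "decline"
    else:
        direction = "no change"
    print(f"{name}: {change:+.3f} ({direction})")
```

In this made-up example County A shows a higher ratio in 1998 (an inferred improvement) and County B a lower one (an inferred decline).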

References

Ahmed, Bashir. 1998. "The Use of School Enrollment Data to Develop County Population Age 5-17." Forthcoming.

Batutis, M. J. 1994. Subnational Estimates of Total Population by the Tax Return Methodology. U.S. Bureau of the Census, Population Division, Washington, D.C.

Davis, S. 1994. Evaluation of Post-Censal County Estimates for the 1980's. U.S. Bureau of the Census, Population Division, Working Paper Series, Number 5. Washington, D.C.

Long, J. 1993. Post-Censal Population Estimates: States, Counties, and Places. U.S. Bureau of the Census, Population Division, Working Paper Series, Number 3. Washington, D.C.

Robinson, J. Gregory. 1996. "Evaluation of CensusPlus and Dual System Estimates Results with Independent Demographic Benchmarks," Integrated Coverage Measurement (ICM) Evaluation Project 15. U.S. Bureau of the Census, Washington, D.C.

Sink, L. 1996. Estimates of Population of Counties by Age, Sex, Race and Hispanic Origin: 1990-1994. Release PE-48 (Methodology), U.S. Bureau of the Census. Washington, D.C.

West, Kirsten and J. Gregory Robinson. 1998. "What Do We Know About the Undercount of Children," U.S. Bureau of the Census. Paper presented at the Annual Meeting of the Southern Demographic Association, Annapolis, Maryland, October 1998.

Authors: J. Gregory Robinson, Kirsten West, and Arjun Adlakha (Population Division)
Created: May 26, 2000