To compare figures from the various data sources, one must first consider how comparable the data are. Because each data set analyzed here is based on a different census or survey design and methodology, several factors have the potential to significantly affect counts of the foreign-born population. The following sections address these areas of comparability.
The March 2000 CPS and the C2SS are surveys. As such, they produce only samples upon which statistical inferences can be made for the total population, as measured by Census 2000.
The March 2000 CPS is based on a sample of 50,000 households within 754 primary sampling units throughout the United States (Technical Paper 63). Households are inducted into the sample based on the 1990 Census address file and its subsequent updates throughout the decade. The sample is restricted to the "civilian, non-institutionalized" population. Consequently, individuals residing in many group quarters, such as army barracks, hospitals and prisons, are excluded.3 A sample household participates in the survey over a 16-month period, during which it is surveyed during the first four months, dropped the following eight months, and then rotated back in the remaining four months. Because respondent households respond to questions regarding nativity only once upon inclusion in the survey, a data value for a nativity item in the March 2000 Current Population Survey may reflect the response received as many as 15 months prior. Subsequent changes to a respondent's citizenship status after this time would not be known.
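The rotation schedule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and it assumes, as the text states, that the nativity items are asked once, in the household's first month in the survey.

```python
def in_sample(month_in_panel: int) -> bool:
    """Return True if the household is interviewed in this month (1-16) of its span."""
    if not 1 <= month_in_panel <= 16:
        raise ValueError("month_in_panel must be between 1 and 16")
    # Months 1-4: interviewed; months 5-12: dropped; months 13-16: interviewed again.
    return month_in_panel <= 4 or month_in_panel >= 13


def nativity_lag(month_in_panel: int) -> int:
    """Months elapsed since the nativity items were asked (assumed: month 1 only)."""
    if not in_sample(month_in_panel):
        raise ValueError("household is not interviewed in this month")
    return month_in_panel - 1


interview_months = [m for m in range(1, 17) if in_sample(m)]
max_lag = max(nativity_lag(m) for m in interview_months)
print(interview_months)  # months in which the household is interviewed
print(max_lag)           # longest possible lag, in months
```

Under these assumptions the longest lag occurs in the household's sixteenth and final month, which matches the 15-month figure cited above.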
Like the CPS, the C2SS is a national household sample that excludes group quarters; it covers 700,000 households. Unlike the CPS, however, the C2SS uses a sampling frame based on the 2000 Master Address File (MAF), the same source used for Census 2000. The C2SS was conducted from January through December 2000.
The population universe for Census 2000 is the resident population of the United States as of 1 April 2000. All housing units are identified and enumerated using the 2000 MAF. Nativity questions, however, are included only on the long form, a sample of the U.S. population. Because the Census seeks to count the entire U.S. resident population, the data include individuals living in group quarters. For the purposes of this comparison, however, the provisional Census 2000 data are divided into household and group quarters populations.
Table 1 summarizes these basic differences in coverage across the three data sources. Differences in sampling frames/address files, respondent pools, and especially survey/census durations provide possible explanations for differences in nativity estimates across the data sources under analysis.
The data sources in question were collected using a mixture of census and survey methodologies. The following sections address some of these differences.
Mode of collection
The nativity data collected in the CPS are acquired in the initial interview of the sample household. That is, questions that concern place of birth, citizenship status and year of entry are asked by a CPS interviewer at the time the household enters the survey. Consequently, nearly all CPS nativity data come from person-to-person interactions, where probing can produce more reliable responses.
The Census 2000 data rely heavily on self-administered mail-back questionnaires from long-form recipients. Enumerators usually contact individuals only if no questionnaire is received by the Census Bureau after a reasonable duration. The General Accounting Office report, "Status of Nonresponse Follow-up and Key Operations" (2000), reports a long-form mail-return rate of 54.1 percent. Measures to acquire missing information include Nonresponse Follow-up (NRFU), Coverage Edit Follow-up (CEFU) and the Coverage Improvement Follow-up (CIFU), which involve enumerator interaction. (These response rates refer to completion of the actual questionnaire; individual item response rates are discussed and presented below.)
The C2SS data were collected by means of mail-back questionnaires, follow-up telephone calls and follow-up personal visits. All sample households received an announcement of their selection as part of the survey, followed by a questionnaire. If the questionnaire was not returned within the time frame specified, a second questionnaire was delivered to the household. Households that did not return questionnaires were subject to Computer Assisted Telephone Interview (CATI) completion of the survey. One of every three remaining non-response households was visited by field interviewers for Computer Assisted Personal Interview (CAPI) completion of the survey. Table 2 presents the mode of data collection for the Census 2000 Supplementary Survey, by nativity. In brief, these figures show that data for the foreign born were more likely to be collected via personal interview - the third and final data collection attempt - than by other methods: 48.4 percent of foreign-born respondents received CAPI data collection, while only 31.4 percent of natives were surveyed in such a manner.
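The three-stage C2SS follow-up sequence can be summarized as a simple decision function. This is a sketch, not the survey's actual processing logic: the function and argument names are hypothetical, and it assumes that a household selected for a personal visit completes the interview. The last two lines restate the CAPI shares quoted above as a percentage-point gap.

```python
def c2ss_collection_mode(returned_by_mail: bool,
                         completed_by_phone: bool,
                         selected_for_visit: bool) -> str:
    """Return the (assumed) final collection mode for one sample household."""
    if returned_by_mail:
        return "mail"          # first or second questionnaire was returned
    if completed_by_phone:
        return "CATI"          # telephone follow-up succeeded
    if selected_for_visit:
        return "CAPI"          # household fell in the one-in-three visit subsample
    return "no follow-up"      # remaining nonrespondent, not selected for a visit


# CAPI shares reported in the text (Table 2), in percent.
capi_share = {"foreign born": 48.4, "native": 31.4}
capi_gap = round(capi_share["foreign born"] - capi_share["native"], 1)
print(capi_gap)  # gap in percentage points
```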
The varying modes of data collection described above, especially concerning potentially sensitive questions of nativity, may contribute to observed differences in estimates.
Question position, format and skip patterns
The sequence of the nativity items (i.e., the order and position of the questions within each questionnaire), the manner in which they are stated, and the skip patterns contained within may influence certain response outcomes. As shown in Table 3, although the order of the nativity items relative to each other never changes, their location within the overall questionnaire differs across the three instruments. Further, differences exist with respect to question and response wording as well as with certain skip patterns.
Figure 1 presents the actual place-of-birth questions used in each of the three data sources. For Census 2000 and C2SS, the place-of-birth items are nearly identical: "Where was this person born?" followed by two check boxes and fields (or delimited spaces) for the name of the state of birth (if in the United States) or the country of birth (if elsewhere). The March 2000 CPS, however, phrases the question somewhat differently - "In what country were/was __________ born?" - followed by a field in which the interviewer must enter the appropriate country code (derived from adjacent computer screens).
Figure 2 shows that the citizenship item, including question and response options, is identical for Census 2000 and the C2SS: "Is this person a citizen of the United States?" followed by four citizenship categories and a residual non-citizen category. Again, however, the CPS differs. It uses a skip pattern based on the responses to the place-of-birth questions, bypassing the citizenship question altogether if either the respondent or his/her parents report the United States as the place of birth. Only those individuals who report a foreign country of birth for self and parents are questioned about citizenship, through as many as three questions: "(Are/Is) ... a citizen of the United States?" "(Were/was) ... born a citizen of the United States?" and "Did ... become a citizen of the United States through naturalization?"
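The CPS skip pattern can be expressed as a single condition. This is a minimal sketch under one reading of the description above, namely that the citizenship questions are asked only when the respondent and both parents all report a foreign place of birth. The function name and the treatment of birthplace as a plain string are illustrative assumptions, not the instrument's actual coding.

```python
US = "United States"


def cps_asks_citizenship(own_birthplace: str,
                         mother_birthplace: str,
                         father_birthplace: str) -> bool:
    """Return True if the citizenship questions would be asked (assumed reading)."""
    birthplaces = (own_birthplace, mother_birthplace, father_birthplace)
    # Any report of the United States as a place of birth skips the item entirely.
    return all(place != US for place in birthplaces)


def census_asks_citizenship() -> bool:
    """Census 2000 and the C2SS pose the citizenship item with no such skip."""
    return True


print(cps_asks_citizenship("Mexico", "Mexico", "Mexico"))         # asked
print(cps_asks_citizenship("Mexico", "United States", "Mexico"))  # skipped
```

The contrast between the two functions is the point of comparison: in the CPS, citizenship for skipped respondents must be derived from the birthplace responses rather than reported directly.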
Finally, as shown in Figure 3, all three data products share nearly identical year-of-entry items. Posed to all respondents who report a non-U.S. place of birth, the question asks, "When did (this person/_________) come to live in the United States?" followed by spaces for a four-digit year response.
In sum, although Census 2000 and C2SS closely resemble each other concerning nativity item placement and formats, the CPS differs sufficiently, especially with respect to the citizenship item, to warrant caution when making comparisons.
Apart from differences in the overall mode of data collection, individuals may return census and survey questionnaires without responding to certain items. Differences in item response rates could contribute to data incomparability.
Although all three data sources contain similar questions regarding nativity, Table 4 shows that response rates for the nativity items vary across data sources and population sub-groups. Rates for both the entire population and foreign born only are shown. Excluding group quarters, response rates among the total population range from 91.7 percent (Census 2000) to 99.0 percent (CPS) for the place-of-birth item, a difference of 7.3 percentage points. Response rates for the citizenship item range from 95.7 percent (Census 2000) to 98.8 percent (CPS) among the total population (excluding group quarters). Finally, response rates for the year-of-entry item5 among the foreign born range from 88.0 percent (Census 2000) to 91.9 percent (C2SS).
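The ranges quoted above can be restated as percentage-point spreads. The sketch below uses only the lowest and highest rates given in the text for each item; the rate for the third data source in each case appears in Table 4 and is not reproduced here. The dictionary layout and function name are illustrative.

```python
# Lowest and highest item response rates quoted in the text, in percent.
quoted_rates = {
    "place of birth (total population)": {"Census 2000": 91.7, "CPS": 99.0},
    "citizenship (total population)": {"Census 2000": 95.7, "CPS": 98.8},
    "year of entry (foreign born)": {"Census 2000": 88.0, "C2SS": 91.9},
}


def spread(rates_by_source: dict) -> float:
    """Return the gap between the highest and lowest rate, in percentage points."""
    values = rates_by_source.values()
    # Round to one decimal place to avoid floating-point noise in the difference.
    return round(max(values) - min(values), 1)


spreads = {item: spread(rates) for item, rates in quoted_rates.items()}
for item, gap in spreads.items():
    print(f"{item}: {gap} percentage points")
```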
In sum, among the civilian, non-institutionalized population, response rates appear to be slightly lower for the provisional Census 2000 data than for CPS or C2SS.
Each of the three data sources is maintained by a separate group within the Census Bureau (and, in the case of CPS, the Bureau of Labor Statistics). As a result, decisions regarding data imputation and edits ultimately rest with different decision makers. Owing to the complexity of the edit and imputation processes adopted by each group, this report acknowledges that differences do exist, not only in nativity item specifications, but in other items (race, Hispanic origin, sex) that contribute to the overall DA estimates. Specifically, an examination of the Census 2000 and C2SS edit specifications by the Ethnic and Hispanic Statistics Branch reveals little difference among nativity items. However, owing to the unique skip pattern of the CPS mentioned earlier, the imputation/edit specifications differ significantly from those of the other two data sources (Costanzo, Davis and Malone 2001; Hansen 1994). These differences in data evaluation and reconstruction introduce another source of potential lack of comparability.
Weights are assigned to each record to estimate the number of individuals in the total population represented by the sample data. Differences in weighting techniques may introduce potential sources of incomparability.
Because the CPS and C2SS differ in sampling design, their weighting schemes differ as well. Further, as mentioned earlier, the provisional Census 2000 data are a sample of the enumerated population and thus are weighted as well. Details of the weighting process for the provisional Census 2000 data are contained in Appendix C.
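As a generic illustration of how record weights produce population estimates, the sketch below sums the weights of the sample records that meet a condition. The records and weights are entirely hypothetical, and the code does not reproduce the weighting procedure of any of the three data sources.

```python
# Hypothetical sample records; each weight is the number of people the record represents.
records = [
    {"foreign_born": True, "weight": 1500},
    {"foreign_born": False, "weight": 2000},
    {"foreign_born": True, "weight": 1800},
    {"foreign_born": False, "weight": 2200},
]


def weighted_count(sample, condition) -> int:
    """Estimate a population count as the sum of weights over matching records."""
    return sum(record["weight"] for record in sample if condition(record))


foreign_born_estimate = weighted_count(records, lambda r: r["foreign_born"])
total_estimate = weighted_count(records, lambda r: True)
foreign_born_share = foreign_born_estimate / total_estimate

print(foreign_born_estimate, total_estimate, round(foreign_born_share, 3))
```

Because each estimate is a sum of weights, two sources with identical responses but different weighting schemes can still yield different foreign-born counts.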