4.1. Is the change in the Hispanic population reasonable?
While Census 2000 data revealed that the Hispanic population had increased substantially between 1990 and 2000 (57.9 percent), the total population grew by only 13.2 percent.14 This dramatic growth plus the disparity between the Census 2000 figure (35.3 million) and the demographic estimate for July 1, 2000 (32.2 million) prompted the Census Bureau to reexamine assumptions about international migration used in the development of population estimates.15
Shortly after the census was completed, new international migration (legal and unauthorized) assumptions were developed for the Hispanic population based on a series of demographic research reports.16 Although the number of Hispanics counted in Census 2000 was higher than expected, the growth of the Hispanic population was deemed reasonable after the demographic review showed that the Census Bureau had originally underestimated Hispanic international migration during the 1990s. Hispanic population estimates were adjusted to reflect these new findings (e.g., the adjusted July 1, 2000 estimate was 35.6 million).17
4.2. Improved response rates
Sequencing of the Hispanic question before the race question may have finally resolved the problem of how to improve the response level of the Hispanic question without the use of a field content-edit followup.18 Compared with results from the 1990 census, the Census 2000 total allocation rate fell from 10.4 percent to 5.6 percent (Table 3). In addition, the Hispanic allocation rates declined in every state except Alaska, New Mexico, Vermont, and Wyoming. The total number of imputations for the Hispanic Origin question also dropped during the period from 25.5 million in 1990 to 16.8 million in 2000.
4.3. Improved imputation methodology
During Census 2000, in addition to improving the response rate and thereby reducing the number of required imputations, the Census Bureau improved the imputation process by:
- Reducing the proportion of imputed "hot deck" cases. In 1990, 75.6 percent of census data allocations were processed using the "hot deck" method. In Census 2000, only 41.2 percent of the imputations were processed in this manner.19
- Introducing the use of Spanish and non-Spanish surnames into the hot deck procedure. When the hot deck procedure was used, donors with Spanish surnames were used to impute values to Spanish surnamed cases with missing values (and vice versa for non-Hispanic cases). Approximately 31.4 percent of all imputations and 8.1 percent of all Hispanic imputation cases were affected by this enhancement of the hot deck procedure.
- Combining the race and Hispanic imputation procedures. In 1990, the race and Hispanic origin edit and imputation procedures were executed independently. This approach apparently contributed to the propagation of relatively rare race/Hispanic origin combinations (for example, Black Mexicans), although these unusual combinations may also have resulted in part from misreporting. In Census 2000, the race and Hispanic origin edit specifications were integrated, and the rules for the ‘within household’ and ‘hot deck’ procedures were restricted so that Hispanic donors and donees were matched before race was assigned.
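The surname-restricted hot deck described in the list above can be sketched roughly as follows. This is a minimal illustration and not the Census Bureau's production procedure: the record layout, the `spanish_surname` flag, and the function name `hot_deck_impute` are assumptions introduced here. In a sequential hot deck, the most recently reported value in each donor class is held "on deck" and assigned to the next record in the same class that has a missing value.

```python
def hot_deck_impute(records):
    """Sequential hot deck restricted by surname class.

    Hypothetical record layout: each record is a dict with a boolean
    'spanish_surname' flag and a 'hispanic' value that is None when missing.
    """
    last_donor = {}  # surname class -> most recently reported value
    imputed_records = []
    for rec in records:
        rec = dict(rec)  # work on a copy so the input is left unchanged
        surname_class = rec["spanish_surname"]
        if rec["hispanic"] is None:
            # Impute only when a donor from the same surname class exists.
            if surname_class in last_donor:
                rec["hispanic"] = last_donor[surname_class]
                rec["imputed"] = True
        else:
            # A reported value becomes the current donor for its class.
            last_donor[surname_class] = rec["hispanic"]
        imputed_records.append(rec)
    return imputed_records


# Illustrative records only; these are not census data.
records = [
    {"id": 1, "spanish_surname": True, "hispanic": "Hispanic"},
    {"id": 2, "spanish_surname": False, "hispanic": "Not Hispanic"},
    {"id": 3, "spanish_surname": True, "hispanic": None},
    {"id": 4, "spanish_surname": False, "hispanic": None},
]
result = hot_deck_impute(records)
```

Keeping one donor per surname class is what prevents a non-Spanish-surnamed donor from supplying a value to a Spanish-surnamed case, and vice versa. A record with no earlier donor in its class is left missing in this sketch; a production system would need a fallback rule.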
4.4. Good overall response consistency as measured by reinterview and by comparison with the 1990 version of the Hispanic question
Since 2000, the Census Bureau has conducted several studies evaluating the quality of Census 2000 data, including the Census 2000 Content Reinterview Survey (CRS) and the Alternative Questionnaire Experiment (AQE).20 Major findings from these studies are discussed below.21
Shortly after the last decennial census, the Census Bureau randomly selected a sample of 30,000 households that had received the long form in 2000. One person from each household in this sample was interviewed by telephone by an experienced field representative. The primary goal of the survey was to evaluate the quality of the census data by comparing responses provided in the telephone interview with those reported on the census questionnaire. Using data from this Content Reinterview Survey, analysts developed an index of inconsistency in reporting.
Results from the CRS show that the consistency of edited data for the Hispanic origin question varied by category. On the index of inconsistency, "Not Hispanic," "Mexican," "Puerto Rican," and "Cuban" responses fell in the good range (less than 20). "Other Hispanic" scored in the moderate range (20 to 50). "Multiple non-Hispanic," "Multiple Hispanic," and "Mixed Origin" scored in the poor range (over 50) (Table 4).22
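For readers unfamiliar with the measure, the following sketch shows how an index of inconsistency for a single response category is commonly computed in reinterview studies. The text above does not give the formula, so the definition used here (the gross difference rate divided by the rate expected if census and reinterview responses were independent, multiplied by 100) should be read as an assumption about how the CRS figures were derived. The function names and the counts in the example are hypothetical.

```python
def index_of_inconsistency(a, b, c, d):
    """Category-level index of inconsistency from a 2x2 census/reinterview table.

    a: in the category in both census and reinterview
    b: in the category in the census only
    c: in the category in the reinterview only
    d: in the category in neither
    """
    n = a + b + c + d
    p_census = (a + b) / n
    p_reinterview = (a + c) / n
    gross_difference_rate = (b + c) / n
    # Disagreement rate expected if the two responses were independent.
    expected = p_census * (1 - p_reinterview) + p_reinterview * (1 - p_census)
    if expected == 0:
        raise ValueError("index is undefined when the category is empty or universal")
    return 100 * gross_difference_rate / expected


def consistency_range(index):
    """Classify an index value using the ranges quoted in the text."""
    if index < 20:
        return "good"
    if index <= 50:
        return "moderate"
    return "poor"


# Made-up counts for illustration; they are not CRS results.
example_index = index_of_inconsistency(a=90, b=5, c=5, d=900)
example_range = consistency_range(example_index)
```

With these illustrative counts the index is about 5.8, which falls in the good range. Because the denominator depends on how common the category is, small categories can score poorly even when few cases disagree.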
The Alternative Questionnaire Experiment was designed to measure the total effect of the changes in the census mail questionnaire from 1990 to 2000 by comparing two independent random samples of households. About 10,500 households received the 1990-style short form, while about 25,000 households received the Census 2000 short form. The 1990-style form retained the 1990 question wording, categories, order, and format, but incorporated some recognizable elements of the Census 2000 design. Because this experiment was conducted by mail, the results of the study are generalizable only to the Census 2000 mailout-mailback universe.
According to the results of the AQE, changes to the Census 2000 questionnaire led to improved reporting of Hispanic origin as measured by item nonresponse. For example, the overall item nonresponse to the question of Hispanic origin was 3.3 percent in the Census 2000-style questionnaire, compared with 14.5 percent in the 1990-style questionnaire. Nonresponse to the race question by Hispanics was also reduced by nearly 10 percentage points, from 30.5 percent in the 1990-style questionnaire to 20.8 percent in the 2000-style questionnaire.
4.5. Weaknesses in the Census 2000 Hispanic data
4.6. Less than expected growth for specific Hispanic groups; Substantial growth in reporting of "generic" Hispanic terms; Evidence that question wording and format led respondents to report more general responses instead of more specific responses
Despite many positive findings, the Alternative Questionnaire Experiment (discussed above) revealed that the Census 2000 mail questionnaire probably produced a few unwanted results. There was no difference between the two groups (those receiving the 1990-style questionnaire and those receiving the 2000-style questionnaire) in the percentage of people reporting Hispanic origin (about 11.1 percent of each group surveyed). However, members of the group receiving the 2000-style questionnaire were less likely to report a specific Hispanic group (e.g., Mexican, Cuban, Puerto Rican) and more likely to report a general Hispanic term (e.g., Hispanic, Latino, Spanish) than the sample that received the 1990-style questionnaire. Specifically, the AQE found that 92 percent of the Hispanics who responded to the 1990-style form provided a specific Hispanic group identity, compared with 80 percent of those who responded to a 2000-style form. Thus, the 2000-style forms produced about 10 percentage points more general responses than the 1990-style forms. AQE results suggest this difference is probably due to the combined effects of changes in the question wording (e.g., removal of the word "origin," which appeared on the 1990 form, and addition of the term "Latino" to the 2000 form) and the elimination of specific Hispanic origin examples from the Census 2000 questionnaire.
Findings from Census 2000 compared with those from the 1990 census show a similar pattern (Table 2). In Census 2000, the proportion of the Hispanic population providing a specific origin was 83.9 percent, compared with 93.6 percent in the 1990 census.23 In addition, the shares of the Hispanic population reporting each of the largest specific groups declined. The Mexican origin population declined from 61.2 percent in 1990 to 59.3 percent in 2000; Puerto Rican origin declined from 12.1 percent to 9.7 percent; Cuban origin declined from 4.8 percent to 3.5 percent. On the other hand, the general responses all increased, most of them dramatically. For example, the percent "Latino" increased from less than 0.1 percent to 1.2 percent, "Hispanic" increased from 1.8 percent to 6.6 percent, "Spanish" increased slightly from 2.0 percent to 2.2 percent, and "Other Hispanic" increased from 2.6 percent to 5.8 percent.
Two independent external studies by Roberto Suro of the Pew Hispanic Center and John Logan of the Lewis Mumford Center raised additional concerns about the accuracy of the detailed Hispanic group data.24 Although both analysts praised the Census Bureau for producing a good total count of Hispanics, Suro and Logan both provide additional evidence that the Census 2000 Hispanic question may have significantly underestimated the size of some Hispanic groups in the United States.
In 2002, auditors from the General Accounting Office (GAO) met with Census Bureau staff to discuss the decision that led to the selection of the version of the Hispanic question used in Census 2000, as well as the results from Census 2000. The GAO noted "the Bureau’s lack of agency wide guidelines for its decisions on the level of quality needed to release data to the public" when the agency realized that question wording and format may have adversely affected reporting of detailed Hispanic groups.25
At the request of members of Congress and the Latino community, the Census Bureau further analyzed the Census 2000 Hispanic data in an effort to ascertain what kinds of detailed responses individuals might have provided in lieu of the more general responses they did provide. Using a ‘what if’ scenario, Census Bureau staff devised a simulation model that generated detailed Hispanic values based on information derived from responses to other census questions such as nativity and ancestry.26 For example, if a respondent reported "Latino" in the Hispanic origin question and indicated he was born in Mexico, he was coded "Mexican" (a detailed response) in the simulation model.
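The recoding rule in the example above can be sketched as a simple lookup. This is only an illustration of the idea under stated assumptions: the actual simulation criteria are not given in the text, the `BIRTHPLACE_TO_GROUP` table and the function name `refine_origin` are hypothetical, and only place of birth is used here (an ancestry-based rule would follow the same pattern).

```python
from typing import Optional

# General terms named in the text as non-specific Hispanic responses.
GENERAL_TERMS = {"Hispanic", "Latino", "Spanish"}

# Hypothetical lookup from reported place of birth to a detailed group.
BIRTHPLACE_TO_GROUP = {
    "Mexico": "Mexican",
    "Cuba": "Cuban",
    "Puerto Rico": "Puerto Rican",
}


def refine_origin(origin: str, birthplace: Optional[str]) -> str:
    """Replace a general response with a detailed group when place of
    birth supports it; otherwise keep the reported response."""
    if origin in GENERAL_TERMS and birthplace in BIRTHPLACE_TO_GROUP:
        return BIRTHPLACE_TO_GROUP[birthplace]
    return origin


recoded = refine_origin("Latino", "Mexico")       # general response, Mexico-born
unchanged = refine_origin("Latino", None)         # no birthplace information
kept_detail = refine_origin("Cuban", "Mexico")    # detailed response is never overwritten
```

Note that a detailed response is left untouched even when place of birth points elsewhere, which matches the stated purpose of refining general responses only.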
When the criteria for refining the Hispanic detail were applied, the numbers and proportions for many of the detailed groups increased. For example, the category Spaniard increased by about 69 percent (Census 2000 results = 112,999; simulation model results = 190,656). In fact, all the detailed Hispanic groups increased by at least 24 percent except Mexican (7 percent), Puerto Rican (4 percent), and Cuban (5 percent). The biggest numerical gainer was the Mexican category (Census 2000 results = 20.9 million; simulation results = 22.3 million, a gain of 1.4 million). Mexican cases accounted for nearly half (47 percent) of all cases reassigned from a general to a specific response in the simulation model (1.4 million out of 3.1 million).
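The percentage changes quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the helper name `percent_change` is introduced here for illustration.

```python
def percent_change(before, after):
    """Percentage change from a baseline value to a new value."""
    return 100 * (after - before) / before


# Spaniard: Census 2000 count versus simulation model count.
spaniard_change = percent_change(112_999, 190_656)

# Mexican: totals in millions, as rounded in the text.
mexican_change = percent_change(20.9, 22.3)

# Mexican share of all cases reassigned from general to specific (millions).
mexican_share_of_reassigned = 100 * 1.4 / 3.1
```

The first two values round to 69 percent and 7 percent, matching the text. The rounded totals of 1.4 million and 3.1 million give a share of about 45 percent, so the 47 percent reported above presumably rests on unrounded counts.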
Because information from the place of birth and ancestry questions is available only for long form census cases, the simulation study findings may be of limited use in refining decennial census data.27 However, this methodology opens up interesting possibilities with regard to the American Community Survey, which is conducted annually.
Another group of Census Bureau analysts compared the Census 2000 Hispanic data results with data reported in the Census 2000 Supplemental Survey (C2SS).28 These researchers found that both data sources provided similar totals for the Hispanic population. On the other hand, they noted Census 2000 produced lower detailed Hispanic group rates and higher general group rates. The report suggests:
"the observed differences are due to the use of examples in the C2SS during telephone and personal visit interviewing. These aids were not provided during Census 2000 operations, although one could argue that the presence of the Hispanic origin checkbox groups act as examples. This reasoning does not explain why the Mexican percentage is also lower in Census 2000."29
Although the universes for these two data collection systems are somewhat different (Census 2000 represents the total population, i.e., the population in both households and group quarters; the C2SS represents the household population alone), the results of this comparison add to the evidence that the Census 2000 question may have influenced respondents to report general Hispanic answers.
4.7. Evidence of slight decline in response consistency as measured by reinterview
Finally, even though response consistency was in the good range for the Hispanic question, response consistency declined between 1990 and 2000 not only for the Hispanic question overall, but also for the Mexican and Puerto Rican origin categories (Table 4). These differences can be partly explained by two factors: a) the Census 2000 CRS questionnaire used a somewhat different Hispanic question than the one that appeared on the mail form, whereas the 1990 CRS used the same question as the mail form, and b) the Census 2000 CRS used more Hispanic categories to derive the index of inconsistency than did the 1990 CRS. Thomas et al. note:
"the level of index is sensitive to the number and detail of categories in a classification system as well as to the distribution of the population over these categories."30
Nevertheless, these data add one more piece of evidence concerning the effect of the changes in the Census 2000 questionnaire on the reporting of detailed and general Hispanic responses.
4.8. What is our overall assessment of data quality for the Census 2000 Hispanic question?
When viewed from the perspective that emerged following our evaluation of the 1990 census results, Census 2000 Hispanic data appear to be of very high quality. Reversing the order of the race and Hispanic questions addressed the problem of nonresponse, as shown by the relatively low allocation rates in 2000 (compared with 1990) and by results from the Alternative Questionnaire Experiment. Refinements to the data editing process, particularly the new rule for imputing origin from people of the same race and imputing race from people with the same origin, dramatically reduced the artificial creation of relatively rare race/Hispanic origin combinations.
Despite the almost universal approval of the Hispanic population totals, it is clear that some of the changes introduced in Census 2000, such as the omission of the examples in the Hispanic question, probably encouraged respondents to provide general rather than detailed responses. This result casts a shadow on the quality of detailed Hispanic data. In the next section of this paper, we will discuss efforts to address this problem as it relates to Census 2010 and the American Community Survey.
Given that the prime legislative mandate for Hispanic data is to provide an accurate count of the Hispanic population, and given the improvements the Census Bureau introduced in Census 2000 that largely addressed this directive, our overall assessment is that the Hispanic data quality is quite good.31