Census Bureau

3. SURVEY DESIGN AND METHODOLOGY


This section provides general information on the design of the survey, the survey methodology, the editing of race data, and the data analysis methods used. Additional detail on the survey design and methodology is provided in Appendix B.

3.1 Design of the Survey

3.1.1 Experimental Design

The 1996 Race and Ethnic Targeted Test (RAETT) is the major vehicle for testing alternative versions of the race and Hispanic origin questions for Census 2000. The primary objectives of the RAETT were to test the effects of:

To test these objectives, eight questionnaires (panels) were included in this survey: one control panel and seven experimental panels. Each of the experimental panels was designed to assess one or more of the proposed changes, as shown in Table 3-1. For example, a comparison of Panel B to Panel A allows us to evaluate the effect of including a multiracial category, while a comparison of Panel B to Panel D allows us to evaluate the effect of asking the race and Hispanic origin questions in a different sequence. Table 3-2 on the following page provides a detailed description of the question design features for these eight panels.

Table 3-1. Experimental Design

Options for reporting more than one race, by order of the race and Hispanic origin questions:

| Order of race and Hispanic origin questions | None | Multiracial category | Mark one or more / Mark all that apply |
| --- | --- | --- | --- |
| Hispanic origin asked first | Panel A | Panel B, Panel G | Panel C, Panel H |
| Race asked first | | Panel D | |
| Race and Hispanic origin asked together in one question | | Panel E | Panel F |

Table 3-2. Race and Hispanic Origin Question Design Features by Panel (1)

| Panel A | Panel B | Panel C | Panel D | Panel E | Panel F | Panel G | Panel H |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Separate race, Hispanic origin questions | Separate race, Hispanic origin questions | Separate race, Hispanic origin questions | Separate race, Hispanic origin questions | Combined race, Hispanic origin, ancestry question | Combined race, Hispanic origin, ancestry question | Separate race, Hispanic origin questions | Separate race, Hispanic origin questions |
| Modified 1990 Census race question (2) | "Multiracial or biracial" category | "Mark one or more races..." instruction | "Multiracial or biracial" category | "Multiracial or biracial" category | "Mark one or more boxes..." instruction | "Multiracial or biracial" category | "Mark all that apply" instruction |
| Separate categories: "Indian (Amer.)" "Eskimo" "Aleut" | Combined category, "Indian (Amer.) or Alaska Native" | Combined category, "Indian (Amer.) or Alaska Native" | Combined category, "Indian (Amer.) or Alaska Native" | Combined category, "Indian (Amer.) or Alaska Native" | Combined category, "Indian (Amer.) or Alaska Native" | Combined category and spell out "American Indian or Alaska Native" | Combined category, "Indian (Amer.) or Alaska Native" |
| "Hawaiian"; "Guamanian" categories | "Hawaiian"; "Guamanian" categories | "Hawaiian"; "Guamanian" categories | "Native Hawaiian"; "Guamanian or Chamorro" categories | Combined category, "Asian or Pacific Islander" | Combined category, "Asian or Pacific Islander" | "Native Hawaiian"; "Guamanian or Chamorro" categories | "Hawaiian"; "Guamanian" categories |
| No alphabetization | No alphabetization | No alphabetization | No alphabetization | No alphabetization | No alphabetization | Alphabetize Asian and Pacific Islander groups | No alphabetization |
| Modified 1990 Census Hispanic origin question (2) | Modified 1990 Census Hispanic origin question | Modified 1990 Census Hispanic origin question | Modified 1990 Census Hispanic origin question | Combined question | Combined question | Modified 1990 Census Hispanic origin question | Modified 1990 Census Hispanic origin question |
| 1995 test census sequence: Hispanic origin followed by race | Hispanic origin followed by race | Hispanic origin followed by race | Race followed by Hispanic origin | Combined question | Combined question | Hispanic origin followed by race | Hispanic origin followed by race |

(1) Terminology for the Black and Hispanic origin population is consistent across all panels. All forms have consistent sequencing of sex, age, and relationship as the first three questions.

(2) See Appendix B for the modifications to the 1990 Census race and Hispanic origin questions.

3.1.2 Sample Selection

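It was critical that the RAETT sample allow inferences to be made for six population groups: White ethnic, Black, Hispanic, American Indian, Asian or Pacific Islander, and Alaska Native (the targeted samples listed in Table 3-3).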

To accomplish this, six independent sampling frames were created based on 1990 race, Hispanic origin, and ancestry data. In order to obtain samples that included a higher proportion of the targeted population groups than would result from a stratified national sample, only areas with high proportions of households for each special targeted population group were included in the sampling frames. For example, the Hispanic targeted sample would contain only areas with high proportions of Hispanic households. Appendix B details the criteria used to maintain a sufficiently large sampling frame while maintaining a high proportion of the targeted population groups.

The six sampling frames represent only a fraction of the total housing units in the United States. In some instances the sampling frames are limited to 15 or fewer states. Table 3-3 provides data on the number of states and housing units in each sampling frame. For each targeted sample, the final column in Table 3-3 provides an approximation of the number of occupied housing units (households) in each sampling frame (e.g., Black) as a proportion of occupied housing units in the United States containing such persons. For example, the 10 percent in the last column of Table 3-3 for the Black targeted sample means that the sampling frame for the Black targeted sample contained only approximately 10 percent of the Black households in the United States.

Table 3-3. Characteristics of the Sampling Frames

| Targeted sample | States | Housing units (in thousands) | Households in sampling frame as a percent of total U.S. households (containing race/ancestry group) |
| --- | --- | --- | --- |
| White ethnic | 29 | 156 | 1 |
| Black | 34 | 1,495 | 10 |
| Hispanic | 15 | 1,190 | 15 |
| American Indian | 18 | 35 | 2 |
| Asian or Pacific Islander | 8 | 119 | 3 |
| Alaska Native | Alaska only, 20 villages | 2 | 8 |

An independent, systematic sample of housing units was selected from each of these six frames. When a housing unit was selected, the next seven housing units were also taken, thus forming fairly homogeneous clusters of eight housing units. The eight housing units in the clusters were then randomly assigned to the eight panels such that each housing unit was assigned to only one panel. The sample was allocated to the eight panels as shown in Table 3-4. Since the RAETT analysis is based only on questionnaires returned by mail, the sample allocation was designed so the expected number of returned questionnaires across panels within each targeted sample would be approximately the same; variation in mail response rates was expected across targeted samples. As shown in Table 3-4, the total sample consisted of 112,100 housing units.

Table 3-4. Mailout Sample Size (Housing Units) by Panel and Targeted Sample

| Targeted sample | Panel A | Panel B | Panel C | Panel D | Panel E | Panel F | Panel G | Panel H | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| White ethnic | 2,710 | 2,710 | 1,355 | 2,710 | 2,710 | 2,710 | 1,240 | 1,355 | 17,500 |
| Black | 4,126 | 4,126 | 2,063 | 4,126 | 4,126 | 4,126 | 1,794 | 2,063 | 26,550 |
| Hispanic | 4,126 | 4,126 | 2,063 | 4,126 | 4,126 | 4,126 | 1,794 | 2,063 | 26,550 |
| American Indian | 2,450 | 2,450 | 1,225 | 2,450 | 2,450 | 2,450 | 1,150 | 1,225 | 15,850 |
| Asian or Pacific Islander | 3,660 | 3,660 | 1,830 | 3,660 | 3,660 | 3,660 | 1,740 | 1,830 | 23,700 |
| Alaska Native | 650 | 650 | (NA) | 650 | (NA) | (NA) | (NA) | (NA) | 1,950 |
| Total | 17,722 | 17,722 | 8,536 | 17,722 | 17,072 | 17,072 | 7,718 | 8,536 | 112,100 |

(NA) Not applicable. Panel not included for the Alaska Native targeted sample.
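
The cluster selection and panel assignment described in Section 3.1.2 can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure actually used: the function name `select_clusters`, the sampling interval, and the use of a simple ordered list as the frame are hypothetical, and the sketch assigns one housing unit per cluster to every panel, so it does not reproduce the unequal panel sizes in Table 3-4.

```python
import random

PANELS = ["A", "B", "C", "D", "E", "F", "G", "H"]

def select_clusters(frame, interval, cluster_size=8, seed=0):
    """Systematic selection of cluster starts from an ordered frame.

    Each selected unit plus the next cluster_size - 1 units forms a cluster,
    and the units in a cluster are randomly assigned to the panels so that
    each unit receives exactly one panel. All parameters are hypothetical.
    """
    if interval < cluster_size:
        raise ValueError("interval must be at least cluster_size to avoid overlap")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    assignments = {}
    for first in range(start, len(frame) - cluster_size + 1, interval):
        cluster = frame[first:first + cluster_size]
        panels = PANELS[:cluster_size]
        rng.shuffle(panels)  # random one-to-one assignment of units to panels
        for unit, panel in zip(cluster, panels):
            assignments[unit] = panel
    return assignments

# Example with a made-up frame of 1,000 housing unit identifiers.
assignments = select_clusters(list(range(1000)), interval=100)
```

Taking adjacent housing units keeps each cluster geographically compact, which is why the clusters are described as fairly homogeneous; randomizing panels within a cluster then spreads that similarity evenly across the questionnaire versions.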

3.2 Survey Methodology

In order to maximize the number of questionnaires returned by mail, a mailout strategy developed in testing after the 1990 Census was used in the RAETT. A prenotice letter (advising the household that a questionnaire would arrive shortly) was mailed to all sampled housing units on June 14, 1996. This was followed by the initial questionnaire mailout on June 18. The RAETT census day was June 22, and a reminder card was sent on June 26. Finally, in an attempt to maximize response rates, a replacement questionnaire was mailed on July 16. The replacement questionnaire was only sent to the households that had not returned the initial questionnaire.

Because the Hispanic targeted sample contained many Spanish-speaking households, the eight forms were translated into Spanish. Each household in the Hispanic targeted sample was mailed both English and Spanish forms, and the respondents could choose which form to fill out and return. Just over 11,000 forms were completed and returned in the Hispanic targeted sample; of these, almost 38 percent were Spanish forms.

The response rates for the six targeted samples are provided in Table 3-5. The response rate is the ratio of the total number of questionnaires returned to the total mailed questionnaires that could be delivered by the United States Postal Service. Results provided in this report are based on responses from persons in households who filled out and returned a form. These results apply only to those who responded to the mailout questionnaire and cannot be generalized to persons in households who did not complete and return a questionnaire.

Table 3-5. Mail Response Rate by Targeted Sample

| Targeted sample | Mail response rate (percent) | Number of returns |
| --- | --- | --- |
| White ethnic | 71.3 | 12,471 |
| Black | 47.4 | 12,577 |
| Hispanic | 44.1 | 11,714 |
| American Indian | 53.1 | 8,411 |
| Asian or Pacific Islander | 55.2 | 13,081 |
| Alaska Native | 34.0 | 663 |
| Total | 52.6 | 58,917 |
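
The response rate calculation can be sketched as follows. The text defines the denominator as the number of mailed questionnaires that could be delivered, but that count is not given in this section; as an assumption made only for illustration, the sketch uses the mailout totals from Table 3-4 as the denominator. The function name `response_rate` is hypothetical.

```python
# Mailout sample sizes (Table 3-4 totals) and mail returns (Table 3-5).
# ASSUMPTION: mailout totals stand in for the deliverable counts,
# which this section does not report.
MAILED = {
    "White ethnic": 17500,
    "Black": 26550,
    "Hispanic": 26550,
    "American Indian": 15850,
    "Asian or Pacific Islander": 23700,
    "Alaska Native": 1950,
}
RETURNS = {
    "White ethnic": 12471,
    "Black": 12577,
    "Hispanic": 11714,
    "American Indian": 8411,
    "Asian or Pacific Islander": 13081,
    "Alaska Native": 663,
}

def response_rate(returned, delivered):
    """Mail response rate in percent, rounded to one decimal place."""
    return round(100.0 * returned / delivered, 1)

rates = {name: response_rate(RETURNS[name], MAILED[name]) for name in MAILED}
total_rate = response_rate(sum(RETURNS.values()), sum(MAILED.values()))
```

Under that assumption the computed values agree with the published rates in Table 3-5, for example 71.3 percent for the White ethnic targeted sample and 52.6 percent overall.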

3.3 Editing of Race Data

For the mail return form that did not contain an option for reporting more than one race in the race question (Panel A), the 16 race categories were collapsed into the following five categories for analysis: White, Black, American Indian or Alaska Native, Asian or Pacific Islander, and Other race. The mail return forms containing the option for reporting more than one race (Panels B through H) varied in the number of race categories each had; some forms had six categories while others had 15 categories. Regardless, the various race categories were collapsed into six categories for analysis: the same five categories plus multiracial category/multiple race. An unrequested multiple response category was also included in some analyses to reflect those respondents who checked more than one race category when the instructions said to mark only one. The five panels with an unrequested multiple response category were A, B, D, E, and G. Those respondents who reported more than one race were aggregated in several ways using the write-in entries. These different methods are discussed in detail in Appendix B.

For the mail return race data, if only one specific race was provided by a respondent and it did not agree with its associated major race category, it was reclassified into the appropriate major race category. For example, a write-in entry coded as Cape Verdean provided in the "Other Asian or Pacific Islander" group would have been reassigned to the "Other race" category. Cases in which the respondent provided two or more write-ins for a category were not reclassified. Most write-in entries were computer coded using a master file built from the 1990 census; however, those entries that could not be coded by the computer were coded by expert clerical coders. A more detailed discussion of the response coding techniques is in Appendix B.
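
The reclassification rule in the preceding paragraph can be sketched in a few lines. This is a minimal illustration, not the edit program that was used: the lookup table `WRITE_IN_MAJOR_CATEGORY` and the function `edit_race` are hypothetical, and the table contains only the single example given in the text rather than the full master file.

```python
# Hypothetical lookup from a coded write-in entry to its major race category.
# Only the example from the text is included; the real master file built
# from the 1990 census is not reproduced here.
WRITE_IN_MAJOR_CATEGORY = {"Cape Verdean": "Other race"}

def edit_race(reported_category, write_ins):
    """Return the major race category after the single write-in edit.

    reported_category: major category of the group where the write-in appeared.
    write_ins: list of coded write-in entries for that group.
    """
    if len(write_ins) != 1:
        # No write-in, or two or more write-ins: not reclassified.
        return reported_category
    coded = WRITE_IN_MAJOR_CATEGORY.get(write_ins[0])
    if coded is not None and coded != reported_category:
        # The single write-in disagrees with its group: move it.
        return coded
    return reported_category

edited = edit_race("Asian or Pacific Islander", ["Cape Verdean"])
```

In this sketch a write-in that is missing from the lookup is left in its reported category; how uncoded entries were actually handled is described in Appendix B, not here.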

3.4 Data Analysis Methods

3.4.1 Statistics Used

The effects of different race and Hispanic origin questions on the mail return questionnaires were evaluated by comparing item nonresponse rates and the distribution of responses across panels. The item nonresponse rate for a panel gives the proportion of persons who did not answer the question. The race distributions, excluding item nonresponse, were compared at the category level using the proportion within each category, to determine whether patterns of response differed across panels.
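
The two statistics can be illustrated with a small sketch. The function name `summarize_panel`, the use of `None` to mark item nonresponse, and the example responses are all hypothetical and are not taken from the RAETT data.

```python
from collections import Counter

def summarize_panel(responses):
    """Compute the item nonresponse rate and the response distribution.

    responses: one entry per person; None marks a person who did not
    answer the question. The distribution excludes item nonresponse,
    so its proportions sum to 1 over the persons who answered.
    """
    total = len(responses)
    answered = [r for r in responses if r is not None]
    nonresponse_rate = (total - len(answered)) / total if total else 0.0
    counts = Counter(answered)
    distribution = {cat: n / len(answered) for cat, n in counts.items()}
    return nonresponse_rate, distribution

# Made-up example: four persons, one of whom left the question blank.
rate, dist = summarize_panel(["White", "Black", None, "White"])
```

Excluding nonrespondents from the denominator of the distribution keeps a change in item nonresponse from being mistaken for a shift between race categories.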

3.4.2 Panel Comparison Methods

To test for differences between two panels, estimates for the percent in a category and the item nonresponse were calculated, as well as the standard errors of those estimates, and the standard error of the difference between the two estimates. The standard error of the estimates measures the amount of variation in the estimates due to sampling. Because data from the RAETT are based on a sample survey and not on a complete census of households in each sampling frame, the results are subject to sampling error. Standard errors are not shown in the text tables; they are included in the detailed tables in Appendix D.

Confidence intervals at the 90 percent level were used to test for significant differences. The standard method of constructing confidence intervals was employed, i.e., the standard error of the difference was multiplied by a value from the Student's t-distribution and this product was added to and subtracted from the estimate of the difference. If zero is included in the confidence interval, then no significant difference exists (i.e., the apparent difference may be due to sampling error). The t-distribution value was 1.645 for most comparisons. Some comparisons required an adjustment to account for the effect of multiple comparisons in order to maintain an overall confidence level of 90 percent. The adjustment was used for two comparisons: Panel A to Panel C; and Panel A to Panel H. The effect of using an adjustment factor is to make the individual comparison tests more conservative (i.e., less likely to detect a significant difference), while maintaining the overall confidence level for the two comparisons. For those comparisons where a multiple comparison adjustment was needed, a value of 1.95 was used.
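
The confidence interval test can be sketched as follows. Several assumptions are made for illustration only. The standard error of the difference is computed as if the two panels were independent simple random samples, which ignores the clustered design; the formula actually used is not given in this section. The critical values are taken from the standard normal distribution, which the t-distribution approaches for large samples. The report does not name the multiple comparison adjustment; a Sidak-type adjustment for two comparisons is shown only because it yields a value close to 1.95. The function name `difference_test` and the example proportions are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def difference_test(p1, n1, p2, n2, critical=1.645):
    """Confidence interval for the difference between two proportions.

    ASSUMPTION: independent simple random samples, so the standard error
    of the difference is sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).
    Returns the difference, the interval, and whether zero is excluded.
    """
    se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    lower = diff - critical * se_diff
    upper = diff + critical * se_diff
    significant = not (lower <= 0.0 <= upper)
    return diff, (lower, upper), significant

# Two-sided 90 percent critical value for a single comparison.
z_single = NormalDist().inv_cdf(0.95)

# ASSUMPTION: Sidak-type adjustment for two comparisons at an overall
# 90 percent level; each comparison is run at level 0.90 ** (1 / 2).
per_comparison_level = 0.90 ** (1 / 2)
z_adjusted = NormalDist().inv_cdf(1 - (1 - per_comparison_level) / 2)

# Made-up proportions and sample sizes, not RAETT results.
diff, interval, significant = difference_test(0.60, 1000, 0.50, 1000)
```

Because the adjusted critical value is larger than the unadjusted one, the adjusted interval is wider, which is what makes the Panel A to Panel C and Panel A to Panel H comparisons more conservative.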

Throughout this report, statements that a treatment had an effect indicate that the differences in the percent in the category or in the item nonresponse were statistically significant at the 90 percent confidence level. Conversely, statements that a treatment had no effect indicate that such differences were not statistically significant at the 90 percent confidence level. Since the sample design for each targeted sample used equal probability of selection methods, unweighted data were used in all the analyses in this report. Usually, weights are used if one desires to make inferences (e.g., population totals) about the target population; however, in the RAETT only estimates of population proportions were made.


Source: U.S. Census Bureau, Population Division and
Decennial Statistical Studies Division

Last Revised: October 31, 2011 at 10:03:14 PM