Appendix C. 1996 National Content Survey Sample Design
Design of the Survey
The 1996 National Content Survey (NCS), also known as the U.S. 2000 Census Test, is the major vehicle for testing subject content and specific question wording, format, and sequencing of items for Census 2000. Thirteen different questionnaires were included in this survey: seven simple (short) forms and six sample (long) forms. The universe for the NCS is housing units in 1990 decennial census mailback areas only. These areas represent about 95 percent of the country.
The NCS sample design divided this universe into two strata based on 1990 race, Hispanic origin, and tenure (i.e., owner or renter) variables. The first stratum has a high proportion of Black persons, persons of Hispanic origin, and renters. This stratum is designated as the low coverage area (LCA) stratum and represents 20 percent of the NCS universe. The second stratum has comparatively low proportions of such persons and is designated as the high coverage area (HCA) stratum.
Stratified sampling was used to select a national sample of 94,500 housing units for the NCS. The sample was allocated to the panels (seven simple form panels and six sample form panels). For each of the seven simple form panels, a sample of 2,400 housing units was selected from the HCA stratum and 3,600 housing units from the LCA stratum. For each of the sample form panels, a sample of 3,500 housing units was selected from the HCA stratum, and a sample of 5,250 housing units was selected from the LCA stratum. The response rate for the simple form panels was approximately 72 percent at the national level, 77 percent in the HCA stratum, and 52 percent in the LCA stratum.
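The allocation described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the survey's own processing; the dictionary layout and function names are illustrative assumptions, while the panel counts and per-panel sample sizes are those stated in the text.

```python
# Panel counts and per-panel stratum sample sizes, as stated in the text.
# The structure and names here are illustrative, not from the NCS itself.
PANELS = {
    "simple": {"n_panels": 7, "HCA": 2400, "LCA": 3600},
    "sample": {"n_panels": 6, "HCA": 3500, "LCA": 5250},
}


def total_sample(panels):
    """Sum housing units over all panels and both strata."""
    return sum(p["n_panels"] * (p["HCA"] + p["LCA"]) for p in panels.values())


def stratum_totals(panels):
    """Housing units selected in each stratum, summed over all panels."""
    return {
        stratum: sum(p["n_panels"] * p[stratum] for p in panels.values())
        for stratum in ("HCA", "LCA")
    }


print(total_sample(PANELS))    # 94500
print(stratum_totals(PANELS))  # {'HCA': 37800, 'LCA': 56700}
```

The per-panel figures reproduce the stated national sample of 94,500 housing units.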
Results provided in this report are based on four simple form panels and apply only to persons in households who filled out and returned these forms. Results cannot be generalized to persons in households that did not complete and return a questionnaire, or to persons in households who did not reside in 1990 decennial census questionnaire mailback areas.
Computer-assisted telephone reinterviews were conducted in May and June 1996 to assess the reliability of the information collected on the mailback forms. In the LCA stratum, telephone reinterviews were attempted with each household that returned a completed questionnaire, while in the HCA stratum, telephone reinterviews were attempted with half of the households that returned a completed questionnaire.
Approximately 2,550 households per panel were selected for reinterview (1,700 in the LCA and 850 in the HCA). The proportion of completed reinterviews was 77 percent (74 percent for LCA and 84 percent for HCA). Whenever possible, the respondent in these telephone reinterviews was the household member who completed and mailed back the questionnaire.
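As a consistency check on these figures, the overall completion proportion can be approximated as the average of the stratum proportions weighted by the number of households selected. This is a sketch under the assumption that the overall figure is a simple pooled proportion; the inputs are the rounded values given above, so the result is only approximate, and the variable names are illustrative.

```python
# Approximate per-panel reinterview selections and completion proportions,
# taken from the rounded figures in the text.
selected = {"LCA": 1700, "HCA": 850}
completion = {"LCA": 0.74, "HCA": 0.84}

# Pooled proportion: expected completed reinterviews over households selected.
completed = sum(selected[s] * completion[s] for s in selected)
overall = completed / sum(selected.values())

print(round(100 * overall))  # 77
```

The pooled value agrees with the 77 percent reported for both strata combined.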
This subsection describes procedures followed in data collection and data preparation that are pertinent to the data analysis and how results are presented. These include: the collapsing of racial categories for analysis, how the race question and the Hispanic origin question were asked in the telephone reinterview, and how responses to open-ended categories such as "Other race" and "Other Asian or Pacific Islander" were coded.
For the two mail return forms containing race questions without a multiracial category (Panels 1 and 3), the 16 race categories were collapsed into the following five categories for analysis: White, Black, American Indian, Asian or Pacific Islander, and Other race. For the mail return forms containing a multiracial category (Panels 2 and 4), the 17 race categories were collapsed into six categories: the same five categories plus a multiracial category.
The telephone reinterview used the race and Hispanic origin questions that were used in the 1990 census, modified for telephone interviewing, with one exception: one-half of the households that completed forms providing a multiracial response option in the original survey were also given this multiracial option in the reinterview. This information was collected again in the telephone reinterview to evaluate the extent of inconsistent answers. Additional questions, such as preferences for race and Hispanic origin terminology, were also asked in the telephone reinterview. These covered, for example, "African-American" instead of "Black" and "Latino" instead of "Hispanic origin".
The question on race in the computer-assisted telephone reinterview was asked in two parts using "unfolding response category" methodology. This methodology worked as follows. In the telephone reinterview, households that did not have a multiracial option in their mail questionnaire were asked to select one of the following seven categories: White; Black; American Indian; Eskimo; Aleut; Asian or Pacific Islander; or Other race. Respondents selecting Asian or Pacific Islander were then asked to choose from 10 specific Asian or Pacific Islander subgroups. Telephone reinterviews of respondents whose mail questionnaire contained a multiracial category followed the same method, but a multiracial option was included along with the other seven categories for half of those reinterviewed. The same "unfolding response category" methodology was used for the "Other Hispanic" category of the Hispanic origin question in the telephone reinterview.
Data about race collected in the reinterview were collapsed into five major categories (six for respondents given the multiracial option) to be comparable with the mail return data. Write-in entries in the mail return questionnaires and follow-up entries in the telephone reinterviews were coded to the standard categories described earlier. Most entries were computer coded using a master file built from the 1990 census. Entries that could not be coded by the computer were coded by expert clerical coders.
To assess the reliability of the information collected on the mailback forms, the rate of disagreement for each category and the overall rate of disagreement were compared across panels. These estimates compare responses from the mailback forms with those given in the telephone reinterview and are a measure of the inconsistency of reporting. The rate of disagreement for a particular response category, c, is the total number of persons reported in category c on either the mailback form or the reinterview, but not both, divided by the total number of persons with responses on both measures. The overall rate of disagreement is the total number of persons with inconsistent responses on the mailback form and reinterview divided by the total number of persons with responses on both measures.
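The two rates defined above can be sketched in code as follows. This is a minimal illustration, not the procedure actually used to produce the report's estimates: the function names, the use of None for a missing response, and the example records are all assumptions made for the sketch, and no survey weighting is applied.

```python
def _answered_on_both(pairs):
    """Keep persons with a response on both the mail form and the reinterview."""
    both = [(m, r) for m, r in pairs if m is not None and r is not None]
    if not both:
        raise ValueError("no persons with responses on both measures")
    return both


def category_disagreement_rate(pairs, category):
    """Persons in `category` on exactly one measure, over persons answering both."""
    both = _answered_on_both(pairs)
    discordant = sum((m == category) != (r == category) for m, r in both)
    return discordant / len(both)


def overall_disagreement_rate(pairs):
    """Persons whose two responses differ, over persons answering both."""
    both = _answered_on_both(pairs)
    return sum(m != r for m, r in both) / len(both)


# Hypothetical (mail response, reinterview response) pairs, one per person.
example = [
    ("White", "White"),
    ("Black", "Black"),
    ("Other race", "White"),
    ("White", "White"),
    ("Black", None),  # no reinterview response: excluded from both rates
]

print(category_disagreement_rate(example, "White"))  # 0.25
print(category_disagreement_rate(example, "Black"))  # 0.0
print(overall_disagreement_rate(example))            # 0.25
```

In the example, one of the four persons with responses on both measures is reported as White on the reinterview only, so the rate of disagreement for White and the overall rate are both 0.25.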
This report presents findings at national, stratum, and subdomain levels. Subdomains for analyzing the race question included Hispanic and not Hispanic, and native born and foreign born. The subdomain classifications are based on information collected in the reinterview, not on information reported on the mail return form. Persons who did not provide a response to a subdomain reinterview item are not included in a subdomain classification and are therefore excluded from the subdomain analysis. The different race and Hispanic origin questions on the mail return questionnaires are evaluated by comparing item missing data rates and the distribution of responses (excluding persons with no response) across panels. The item missing data rate for a panel gives the proportion of persons for whom the question was not answered. The distributions of responses are compared to determine if patterns of response differ across panels.
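The two panel-level measures described above can be illustrated with a short sketch. It is an unweighted illustration only; the function names, the use of None to mark an unanswered question, and the example responses are assumptions made for the sketch rather than details from the survey.

```python
from collections import Counter


def item_missing_data_rate(responses):
    """Proportion of persons for whom the question was not answered (None)."""
    if not responses:
        raise ValueError("no persons in panel")
    return sum(r is None for r in responses) / len(responses)


def response_distribution(responses):
    """Share of each response among persons who answered the question."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return {}
    counts = Counter(answered)
    return {category: n / len(answered) for category, n in counts.items()}


# Hypothetical responses to one question for the persons in one panel.
panel = ["White", "Black", None, "White", None]

print(item_missing_data_rate(panel))  # 0.4
print(response_distribution(panel))
```

Here two of five persons have no response, giving an item missing data rate of 0.4, and the distribution is computed over the remaining three persons only.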