
Census STF3A USER NOTE 2: Clarification of Differences Between 100-Percent Counts and Sample Estimates
Estimated population and housing unit totals based on tabulations from only the sample questionnaires (sample tabulations) may differ from the official counts as tabulated from every census questionnaire (100-percent tabulations). Such differences result, in part, because the sample tabulations are based on information from a sample of households rather than from all households (sampling error). Differences also can occur because the interview situation (length of questionnaire, effect of the interviewer, and so forth) and the processing rules differ somewhat between the 100-percent and sample tabulations. These types of differences are reflected in what is called nonsampling error. For a more detailed description, see Nonsampling Errors.
The 100-percent data are the official counts and should be used as the source of information on items collected on the 100-percent questionnaire, such as race, Hispanic origin, age, and number of rooms in housing. This is especially appropriate when the primary focus is on counts of the population or housing units for small areas such as census tracts and block groups, and for American Indian and Alaska Native areas. For estimates of counts of persons and housing units by characteristics asked only on a sample basis (such as education, labor force status, income, and source of water), the sample estimates should be used within the context of the error associated with them.
Many users are interested in tabulations of items collected on the sample cross-classified by items collected on a 100-percent basis such as age, race, gender, Hispanic origin, and housing units by tenure. Given the way the weights were applied during sample tabulations, generally there is exact agreement between sample estimates and 100-percent counts for total population and total housing units for most geographic areas. At the state and higher levels, we also would expect that sample estimates and 100-percent counts for population by race, age, gender, and Hispanic origin and for housing units by tenure, number of rooms, and so on, would be reasonably similar and, in some cases, the same. At smaller geographic levels, including census tract, there is still general agreement between 100-percent counts and sample estimates of total population or housing units. At smaller geographic levels, however, there will be expected differences between sample estimates and 100-percent counts for population by race, age, gender, and Hispanic origin and for housing units by tenure, number of rooms, and so on. In these cases, users may want to consider using derived measures (mean, median, and so on) or percent distributions. Whether using absolute numbers or derived measures for small population groups and for a small number of housing units in small geographic areas, users should be cautioned that the sampling error associated with these data may be large.
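As an illustration of the percent distributions mentioned above, the following is a minimal sketch in Python. The function names and all of the figures are hypothetical and are not taken from any census file; the example assumes a single small area in which the sample estimate and the 100-percent count of total housing units agree, while the breakdown by tenure does not.

```python
def percent_distribution(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in counts}
    return {category: 100.0 * value / total for category, value in counts.items()}


def compare(full_counts, sample_estimates):
    """Compare sample estimates with 100-percent counts, category by category.

    Returns, for each category, the difference in absolute numbers and the
    difference in percentage points between the two percent distributions.
    """
    full_pct = percent_distribution(full_counts)
    sample_pct = percent_distribution(sample_estimates)
    return {
        category: {
            "count_diff": sample_estimates[category] - full_counts[category],
            "pct_point_diff": sample_pct[category] - full_pct[category],
        }
        for category in full_counts
    }


# Hypothetical figures for one small area: housing units by tenure.
full = {"owner": 620, "renter": 380}      # 100-percent counts (illustrative)
sample = {"owner": 640, "renter": 360}    # weighted sample estimates (illustrative)

for category, diffs in compare(full, sample).items():
    print(category, diffs["count_diff"], round(diffs["pct_point_diff"], 1))
```

Expressing the comparison in percentage points does not remove the sampling error; it only puts the two tabulations on a common scale, so the caution about small groups in small areas still applies.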
Even though the differences between sample estimates and 100-percent counts for these categories are generally small, the differences for the American Indian as well as the Hispanic origin populations are relatively larger than for other groups. The following provides some explanation for these differences.
State-level sample estimates of the number of American Indians are generally higher than the corresponding 100-percent counts. It appears the differences are primarily the result of proportionately higher reporting of "Cherokee" tribe on sample questionnaires. This phenomenon occurs primarily in off-reservation areas. The reasons for the greater reporting of Cherokee on sample forms are not fully known at this time. The Census Bureau will do research to provide more information on this phenomenon.
For the Hispanic origin population, sample estimates at the state level are generally lower than the corresponding 100-percent counts. Most of the difference is caused by differences between the 100-percent and sample processing of the Hispanic origin question on the sample questionnaire when the respondent did not mark any response category. When processing the sample, we used written entries in race or Hispanic origin as well as responses to questions asked only on the sample, such as ancestry and place of birth. These procedures led to a lower proportion of persons being assigned as Hispanic in sample processing than were assigned during 100-percent processing. The Census Bureau will evaluate the effectiveness of the 100-percent and sample procedures.
As we have done in previous censuses, we will evaluate the quality of the data and make this information available to data users. In the meantime, both 100-percent and sample data serve very important purposes and, therefore, should be used within the limitations of the sampling and nonsampling errors.
Source: U.S. Census Bureau
Last Revised: Thursday, 26-Jan-2012 17:47:33 EST