With the exception of the 1998 CPS data that are used in the 3-year averages, all CPS data that are used in the 1999 estimates were reweighted using weights that were controlled to Census 2000 population estimates. Previously, weights were controlled to demographic population estimates that were 1990-based and updated through the population estimates program. Although the CPS weights were controlled to Census 2000, the CPS sample remained 1990-based.
State Model Changes
The most important change to the state models for 1999 was the use of Census 2000 data in defining regression variables in place of the 1990 census data that were used previously. Two further aspects of the models were changed for 1999: (1) we used Census 2000 estimates as regression variables, instead of using "census residuals" (residuals from fitting an analogous model to the census data) as was done in most cases in past years; and (2) we specified an informative prior distribution for the regression coefficients of the administrative records predictors in our poverty ratio models.
Change (1) was made for a narrow technical reason and had minor effects on the results. In the census year (income year 1999), using census data as a regression variable in the model for the CPS data gives model-based estimates identical to those obtained using census residuals, as long as the regression variables used in the census equation to construct the census residuals also appear in the CPS model equation. This was the case for the median income model and the 65 and over poverty ratio model. For the 0-4, 5-17, and 18-64 poverty ratio models, however, the census equation included the food stamp participation variable, which is no longer included in the CPS equation. This variable was significant in the census equation for 1999, though it has generally been insignificant in CPS equations since 1997, including results for 1999 and preliminary results for 2000. Its significance in the census equation for 1999, despite its insignificance in the 1999 CPS equations, is most likely due to the very low levels of sampling error in the census estimates relative to the CPS estimates, which result in much lower standard errors for parameters estimated in the census equations. Rather than pair 1999 CPS equations that exclude the food stamp participation variable with census residuals computed from a model that includes it, we used CPS models with census data, rather than census residuals, as a regression variable.
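The equivalence described above can be illustrated with a small simulation. The sketch below is not the SAIPE state model: it assumes ordinary least squares as a stand-in for the actual estimation method, uses simulated data, and all variable names (X, z, census, cps) are hypothetical. It shows only the underlying algebra: when the census equation uses the same regressors as the CPS equation, the census estimate and the census residual span the same column space alongside those regressors, so the fitted values agree; when the census equation includes an extra variable, they generally do not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 51  # hypothetical number of areas

# Simulated data; every number here is made up for illustration.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # regressors in the CPS equation
z = rng.normal(size=n)  # stand-in for a variable used only in the census equation
census = X @ np.array([0.1, 0.5, -0.3]) + 0.4 * z + rng.normal(scale=0.2, size=n)
cps = 0.8 * census + X @ np.array([0.0, 0.1, 0.1]) + rng.normal(scale=0.3, size=n)


def ols_fitted(design, y):
    """Return least-squares fitted values of y on the columns of design."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta


# Fit the CPS equation with the census estimate itself as a regressor.
fit_level = ols_fitted(np.column_stack([X, census]), cps)

# Case A: census residuals come from an equation with the same regressors X.
resid_same = census - ols_fitted(X, census)
fit_resid_same = ols_fitted(np.column_stack([X, resid_same]), cps)
gap_same = float(np.max(np.abs(fit_level - fit_resid_same)))

# Case B: the census equation also includes z, which the CPS equation omits.
resid_extra = census - ols_fitted(np.column_stack([X, z]), census)
fit_resid_extra = ols_fitted(np.column_stack([X, resid_extra]), cps)
gap_extra = float(np.max(np.abs(fit_level - fit_resid_extra)))

print(f"same regressors: max gap in fitted values = {gap_same:.2e}")
print(f"extra regressor: max gap in fitted values = {gap_extra:.2e}")
```

In case A the gap is at the level of floating-point rounding. In case B it is not, which is the situation that arose for the 0-4, 5-17, and 18-64 poverty ratio models.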
Change (2) was made because of the much different status of the census poverty ratio estimates when they refer to the same year as the CPS estimates being modeled rather than to an earlier year. Discussion of the rationale behind the informative prior used and of its construction is given in the 1999 State Level Estimation Details. The use of the informative prior yielded small reductions in the state prediction error variances for the poverty ratio models, and hence, slightly narrower confidence intervals for the true poverty ratios. The prior had very minor effects on the point estimates. We did not use an informative prior distribution for coefficients in the median income model because median income models fitted for 1989 and 1999 rejected the assumption underlying the priors used for the poverty ratio models, namely, that certain regression coefficients were near zero. In fact, all regression coefficients in the fitted median income models were statistically significant.
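The general mechanism by which an informative prior narrows intervals can be sketched with a one-coefficient conjugate normal model. This is a generic textbook illustration, not the prior documented in the 1999 State Level Estimation Details: the zero-centered normal prior, the known noise variance, the function name posterior_for_slope, and all data values are assumptions chosen for the example.

```python
def posterior_for_slope(x, y, noise_var, prior_var=None):
    """Posterior mean and variance of a single slope (no intercept).

    Assumes y_i = slope * x_i + noise with known noise variance, and a
    normal prior on the slope centered at zero. prior_var=None means a
    flat (noninformative) prior.
    """
    data_precision = sum(v * v for v in x) / noise_var
    prior_precision = 0.0 if prior_var is None else 1.0 / prior_var
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = post_var * sum(a * b for a, b in zip(x, y)) / noise_var
    return post_mean, post_var


# Made-up data for illustration only.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [-0.3, 0.1, 0.0, 0.2, 0.1]
noise_var = 0.04

flat_mean, flat_var = posterior_for_slope(x, y, noise_var)
info_mean, info_var = posterior_for_slope(x, y, noise_var, prior_var=1.0)

print(f"flat prior:        mean={flat_mean:.4f}, variance={flat_var:.5f}")
print(f"informative prior: mean={info_mean:.4f}, variance={info_var:.5f}")
```

Because the prior adds precision, the posterior variance of the coefficient can only decrease, and any prediction variance that depends on it decreases as well. With a weak prior, as in this example, both the variance reduction and the shift in the posterior mean are small.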
County Model Changes
Two substantive changes were made in the 1999 county median household income model. One, the 1999 model is multiplicative instead of additive. Two, we used a new set of predictors in the 1999 model.
The motivation for replacing the additive model with a multiplicative model is both theoretical and empirical. Several qualities of household income suggest that the errors associated with its measurement increase as incomes increase. The multiplicative form results in variance estimates, and therefore confidence intervals, that increase with the point estimates of median household income. The change represents an empirical improvement over the old model due to its better model diagnostics, improved fit statistics, and a lower estimate of variance for most counties.
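The link between the multiplicative form and interval width can be sketched as follows. This is a simplified illustration, not the SAIPE county model: it assumes the model is fitted on the log scale, that an interval is formed there and exponentiated, and that the critical value is 1.645 (a 90 percent normal interval). The function name and the input values are hypothetical.

```python
import math


def interval_from_log_scale(log_estimate, log_se, z=1.645):
    """Back-transform a symmetric log-scale interval to the dollar scale.

    z=1.645 is an assumed critical value for a 90 percent normal interval.
    """
    lower = math.exp(log_estimate - z * log_se)
    upper = math.exp(log_estimate + z * log_se)
    return lower, upper


# Two hypothetical counties with the same log-scale standard error
# but different median household incomes.
log_se = 0.05
low_lo, low_hi = interval_from_log_scale(math.log(30000.0), log_se)
high_lo, high_hi = interval_from_log_scale(math.log(60000.0), log_se)

width_low = low_hi - low_lo
width_high = high_hi - high_lo

print(f"median 30000: interval width = {width_low:.0f}")
print(f"median 60000: interval width = {width_high:.0f}")
print(f"width ratio = {width_high / width_low:.3f}")
```

With equal uncertainty on the log scale, the dollar-scale interval width is proportional to the point estimate, so doubling the median doubles the width. An additive model with constant error variance would instead give both counties the same width.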
We chose a new set of predictors for two reasons. One, the new set of predictors is more parsimonious and, across all years for which county SAIPE estimates were produced, provides a slight improvement in model performance. Two, the a priori theoretical and empirical relationship between the predictors and median household income corresponds well with the observed parameter estimates over all years for which county SAIPE estimates were produced. While our primary concern is with the quality of the median household income estimates, the dependent variable of the model, a well-behaved and well-understood relationship between the predictors and the dependent variable improves the process of model diagnostics and outlier detection.