Using GVFs to Approximate Variance Estimates
The Census Bureau provides two forms for approximate variance estimation: generalized variance functions (GVFs) and tables of standard errors (the square root of the variance) for different estimated numbers and percentages. The generalized estimates indicate the magnitude of the sampling error in the survey estimates. They serve as convenient ways to summarize the sampling errors for a broad variety of estimates.
The GVFs for SIPP were derived by modeling the standard error behavior of groups of estimates with similar standard errors. The mathematical form of the function adopted is
$$ s = \sqrt{a x^2 + b x} $$
where s represents the standard error and x the value of an estimate. The parameters a and b are derived on the basis of a selected group of estimates. They are updated annually and are included in the source and accuracy statement that accompanies each SIPP data file for a panel. It is essential to use the parameter estimates for a specific panel and to follow the instructions to apply necessary adjustments to obtain the correct estimates for subgroups. Besides GVFs, the Census Bureau provides summary tables of general standard errors. Those estimates are also available in the source and accuracy statements. The following examples show how to use GVFs to estimate the standard errors of estimated numbers and of sample means. The use of GVFs and tables of standard errors is described in the source and accuracy statements for each panel.
Before looking at the examples, the user should note that the generalized variance estimates for estimating the standard errors of other statistics may not be accurate for small subgroups. Using the 1984 SIPP Panel, Bye and Gallicchio (1989) developed variance functions for participants in the Old-Age, Survivors, and Disability Insurance (OASDI) and Supplemental Security Income (SSI) programs. They found that for estimates of less than 10 million, the generalized standard error estimates provided by the Census Bureau were 1.20 to 1.75 times larger than those obtained from the variance functions developed specifically for that subgroup.
The approximate standard error, s, of an estimated number of persons (or households or families) can be obtained by the formula
$$ s = \sqrt{a x^2 + b x} $$
where a and b are the parameters associated with the estimate for the particular reference period, and x is the weighted estimate. This equation is appropriate for the standard errors of estimated numbers and should not be applied to estimates of dollar values.
Suppose that the number of households with monthly household income above $6,000 is estimated from Wave 1 of the 1991 Panel to be 472,000. The approximate values of a and b from Table 6 of the source and accuracy statement of the 1991 Panel are a = –0.0001005 and b = 9,286. Then, the standard error, s, of this estimated number is given by
$$ s = \sqrt{(-0.0001005)(472{,}000)^2 + (9{,}286)(472{,}000)} \approx 66{,}000 $$
The approximate 90 percent confidence interval for the estimated number can be computed as x ± 1.64 s, which ranges from 364,000 to 580,000. Therefore, a conclusion that the average estimate derived from all possible samples lies within an interval computed in this way would be correct for roughly 90 percent of all samples.
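The calculation for an estimated number can be sketched in Python as follows. This is a minimal illustration, assuming the GVF takes the form s = sqrt(a·x² + b·x), which reproduces the interval of 364,000 to 580,000 quoted above. The function names are hypothetical and are not part of any Census Bureau software.

```python
import math


def gvf_standard_error(x, a, b):
    """Approximate standard error of an estimated number x.

    Assumes the GVF form s = sqrt(a*x^2 + b*x), with a and b taken from
    the source and accuracy statement for the panel being analyzed.
    """
    variance = a * x ** 2 + b * x
    if variance < 0:
        raise ValueError("a*x^2 + b*x is negative; check a, b, and x")
    return math.sqrt(variance)


def confidence_interval(estimate, standard_error, z=1.64):
    """Return (low, high) for estimate +/- z * standard_error.

    z = 1.64 is the multiplier the text uses for a 90 percent interval.
    """
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


# Values from the worked example: Wave 1 of the 1991 Panel.
x = 472_000
a = -0.0001005
b = 9_286

s = gvf_standard_error(x, a, b)
low, high = confidence_interval(x, s)

# Round to the nearest thousand, as in the text.
print(f"s = {round(s, -3):,.0f}")
print(f"90% CI: {round(low, -3):,.0f} to {round(high, -3):,.0f}")
```

The check for a negative value under the square root is a defensive addition: because a is negative, the expression can become negative for very large x, where the approximation no longer applies.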
A mean is defined here to be the average quantity of some characteristic (other than the number of persons or households) per person or household. For example, a mean could be the average monthly household income of females 25 to 54 years of age. The formula used to estimate the standard error of a mean, x̄, is
$$ s_{\bar{x}} = \sqrt{\frac{b}{y}\, s^2} $$
where y is the size on which the estimate is based, s² is the estimated population variance of the characteristic, and b is the parameter associated with the particular type of characteristic. Because of the approximations used in developing this formula, an estimate of the standard error of the mean obtained from this formula will generally underestimate the true standard error.
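The estimated population mean is computed with the formula
$$ \bar{x} = \frac{\sum_i w_i x_i}{\sum_i w_i} $$
where x_i is the value of the characteristic for sample unit i and w_i is its final weight, and the estimated population variance can be computed as
$$ s^2 = \frac{\sum_i w_i x_i^2}{\sum_i w_i} - \bar{x}^2 $$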
with the use of standard software for weighted data. Suppose that, based on Wave 1 data of the 1991 Panel, the mean monthly cash household income for females aged 25 to 54 is $2,530, the weighted number of females in this age range is y = 39,851,000, and the population variance is estimated to be s² = 3,159,887. When the appropriate b parameter of 7,514 from Table 6 of the source and accuracy statement for the 1991 Panel is used, the estimated standard error of this mean is
$$ s_{\bar{x}} = \sqrt{\frac{7{,}514}{39{,}851{,}000}\,(3{,}159{,}887)} \approx \$24 $$
Thus, the 90 percent confidence interval, computed as the estimated mean plus or minus 1.64 times its standard error,
$$ \bar{x} \pm 1.64\, s_{\bar{x}} $$
ranges from $2,491 to $2,569. Therefore, a conclusion that the average estimate derived from all possible samples lies within an interval computed in this way would be correct for roughly 90 percent of all samples.
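The standard error of a mean can be sketched the same way. This illustration assumes the formula sqrt((b / y) · s²), which is consistent with the interval quoted above, and it assumes the usual weighted-data definitions of the mean and population variance (sum of w·x over sum of w, and sum of w·x² over sum of w minus the squared mean). The function names are hypothetical.

```python
import math


def weighted_mean_and_variance(values, weights):
    """Weighted mean and population variance (assumed definitions):

    mean     = sum(w * x) / sum(w)
    variance = sum(w * x^2) / sum(w) - mean^2
    """
    total_weight = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_weight
    mean_of_squares = sum(w * x * x for x, w in zip(values, weights)) / total_weight
    return mean, mean_of_squares - mean ** 2


def mean_standard_error(b, y, s2):
    """Approximate standard error of a mean: sqrt((b / y) * s2)."""
    return math.sqrt(b / y * s2)


# Values from the worked example: Wave 1 of the 1991 Panel.
mean_income = 2_530   # estimated mean monthly cash household income
y = 39_851_000        # weighted number of females aged 25 to 54
s2 = 3_159_887        # estimated population variance
b = 7_514             # b parameter from the source and accuracy statement

se = mean_standard_error(b, y, s2)

# The text rounds the standard error to whole dollars before
# forming the interval, so the same is done here.
se_rounded = round(se)
low = mean_income - 1.64 * se_rounded
high = mean_income + 1.64 * se_rounded

print(f"standard error = ${se_rounded}")
print(f"90% CI: ${round(low):,} to ${round(high):,}")
```

In practice, `weighted_mean_and_variance` would be applied to the person-level or household-level records and their final weights to obtain the mean and s² before the GVF adjustment is made.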
Page Last Modified: May 9, 2006