All statistical estimates contain error, and there has never been a complete census of income and poverty for all school districts in the nation. The exact size and direction of these errors are unknown, although their relative magnitude can often be estimated. The methodology used to produce the SAIPE school district estimates is designed to minimize these errors.
For SAIPE 2010, a methodological change was implemented that makes the guidance detailed below less useful. Details on this change are contained in Estimation Procedure Changes. Briefly, one of the inputs to the school district estimation process, the decennial 2000 sample estimates of school-age poverty, was replaced with estimates derived from the latest five-year sample of the American Community Survey (ACS 2006-10). Preliminary evaluation indicates that the ACS five-year estimate, used as an estimator of current-year poverty, has a lower relative error than the decennial 2000 estimate. This change is therefore expected to improve the precision of the final SAIPE school district estimates. A fuller evaluation is underway to give more detail on how the uncertainty of these estimates has improved.
The remainder of this document provides an explanation of the SAIPE program’s estimation of the average size of these statistical errors for SAIPE 2009. Given the methodological change for SAIPE 2010, this guidance should be viewed as an approximate upper bound on the uncertainty present in SAIPE 2010 school-district estimates.
Before discussing the average statistical error, it is helpful to briefly review the methodology by which the school district poverty estimates are currently created. The school district poverty estimates are based on a shares model, whereby the SAIPE county poverty estimates are allocated to the pieces of school districts within each county. The within-county poverty shares are estimated by combining poverty shares from two sources: estimated poverty shares from the long-form sample data from the prior census (Census 2000), and current-year poverty shares calculated from aggregated data from federal individual income tax returns. In general, the relative contributions of these two sources to the poverty share estimates depend on the extent to which tax returns for the county can be assigned (geo-coded) to the particular school districts within the county. If almost all county tax returns can be geo-coded, the poverty share estimates will be very close to the shares from the tax data. If a substantial number of county tax returns cannot be geo-coded, the poverty share estimates will be close to those from Census 2000. For more details of the estimation procedure, please see Estimation Details for School Districts.
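The allocation step of the shares model can be illustrated with a minimal sketch. It assumes the within-county poverty shares have already been estimated; how the Census 2000 and tax-based shares are combined is not reproduced here. The function name, the district labels, and the renormalization of shares so that they sum to one are illustrative assumptions, not part of the SAIPE procedure as documented.

```python
def allocate_county_estimate(county_poor, shares):
    """Split a county poverty estimate across school district pieces.

    county_poor: county-level estimate of children in poverty.
    shares: mapping of district piece -> estimated poverty share.
    Shares are rescaled to sum to 1 (an assumption of this sketch),
    so the pieces always add back to the county estimate.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    return {piece: county_poor * s / total for piece, s in shares.items()}


# Hypothetical county with three school district pieces.
pieces = allocate_county_estimate(1000, {"A": 0.5, "B": 0.3, "C": 0.2})
```

Because the pieces are defined as shares of the county total, any error in the county estimate carries through proportionally to every district piece within that county.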
Given this methodology, the statistical errors cannot be separately identified as sampling error, model error, or measurement error. Thus, the relative magnitude of the combination of all these errors is estimated. The resulting relative error is different from the standard error reported for Census Bureau survey estimates, which is solely a measure of sampling error.
Relative error in the school district estimates comes from errors in the estimation components, and so the relative error can be divided into four main components:
To estimate the magnitude of relative errors in the school district poverty estimates, an evaluation study was performed. First, 1999 SAIPE school district poverty estimates were created using current SAIPE methodology. The resulting estimates were then compared to the Census 2000 school district poverty estimates (which also refer to 1999). Differences between these two sets of estimates reflect both errors in the 1999 SAIPE school district poverty estimates and sampling errors in the Census 2000 poverty estimates. The squared differences between these estimates provided data that were used, with certain assumptions, to estimate a simple model that produced estimates of the mean squared errors (MSEs) in the 1999 SAIPE school district poverty estimates. This model reflected variation in how the MSEs depend on the poverty shares, and also variation in MSEs between groups of school districts defined by their corresponding county population sizes and geocoding rates. For a detailed description of this evaluation, see Calculating Coefficient of Variation for the Minimum Change School District Poverty Estimates and the Assessment of the Impact of Nongeocoded Tax Returns.
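The reason squared differences carry information about the MSE can be sketched as follows. The notation is introduced here for illustration only: $\hat{S}_i$ is the 1999 SAIPE estimate for district $i$, $\hat{C}_i$ is the Census 2000 estimate, and $\theta_i$ is the unknown true value. The sketch assumes that the census estimate is unbiased and that its sampling error is uncorrelated with the SAIPE error; these may not be the exact assumptions used in the evaluation study.

```latex
\begin{align*}
\hat{S}_i - \hat{C}_i
  &= (\hat{S}_i - \theta_i) - (\hat{C}_i - \theta_i), \\
E\!\left[(\hat{S}_i - \hat{C}_i)^2\right]
  &= E\!\left[(\hat{S}_i - \theta_i)^2\right]
   + E\!\left[(\hat{C}_i - \theta_i)^2\right] \\
  &= \mathrm{MSE}(\hat{S}_i) + \mathrm{Var}(\hat{C}_i).
\end{align*}
```

Under these assumptions, subtracting an estimate of the census sampling variance from the squared difference leaves a quantity whose expectation is the MSE of the SAIPE estimate.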
The MSE model just mentioned was applied to the current data to produce MSE estimates for each school district for the current estimation year. The MSEs were then converted to coefficients of variation (CVs), defined here as the square root of the MSE divided by the estimate. These results are summarized below through the medians of the CVs across groups of school districts defined by their total population sizes.
| Total Population of School District | Median CV |
|---|---|
| 65,000 and up | 0.15 |
The coefficient of variation presented is a measure of the relative error in an estimate, calculated as the square root of the estimated MSE of the poverty estimate divided by the poverty estimate itself. To obtain the half-width of an approximate 90% confidence interval (under the assumption that the poverty estimate itself is unbiased), expressed as a proportion of the poverty estimate, multiply the table value by 1.645. For example, for a school district with a total population of 75,000 individuals, an approximate measure of the relative error of the SAIPE poverty estimate is 0.15, or 15%. The approximate 90% confidence interval for this school district would therefore extend 0.15 × 1.645 × 100%, or about 25%, above and below the estimate.
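The arithmetic above can be written as a short sketch. The function names and the poverty count of 2,000 children are hypothetical and chosen only for illustration; the multiplier 1.645 and the CV of 0.15 come from the text.

```python
import math

Z_90 = 1.645  # multiplier for an approximate 90% confidence interval


def cv_from_mse(mse, estimate):
    """Coefficient of variation: sqrt(MSE) divided by the estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return math.sqrt(mse) / estimate


def relative_margin_90(cv):
    """Half-width of the 90% interval as a proportion of the estimate."""
    return Z_90 * cv


def confidence_interval_90(estimate, cv):
    """Approximate 90% interval, assuming the estimate is unbiased."""
    margin = relative_margin_90(cv) * estimate
    return estimate - margin, estimate + margin


# District with total population of 65,000 or more: median CV of 0.15.
margin = relative_margin_90(0.15)                  # about 0.25 of the estimate
low, high = confidence_interval_90(2000, 0.15)     # hypothetical count of 2,000
```

Since the table reports a median CV for a group of districts, an interval built this way is a rough guide for a typical district in that group rather than a district-specific measure.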