The accuracy of the school district estimates was evaluated for a number of school district characteristics chosen on the basis of prior research on school district poverty and population estimates (Miller 2001; National Research Council 2000). Keeping the characteristics examined consistent allows for comparisons across decades to identify improvements in estimation methods. The following characteristics of school districts were examined:
- Total census population in 1990;
- Total census population in 2000;
- Percent change in census population from 1990 to 2000;
- Census division; and
- School district type.
See Tables 8 and 9 for summaries of the MALPEs and MAPEs by these characteristics of school districts. Analyzing the accuracy of the school district estimates by these characteristics helps identify the types of school districts for which the synthetic ratio method works relatively well and the types for which it is problematic. Where the accuracy of the school district estimates differs by geography, size, or population change, bias in the population estimates may adversely affect the estimates of the percent of children in poverty (when combined with SAIPE’s poverty estimates) and the distribution of Title I federal funds to school districts.
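The two summary statistics used throughout this section can be sketched as follows. This is a minimal illustration, not the report's own code: it assumes the conventional definitions (MALPE as the mean of the signed percent errors, MAPE as the mean of their absolute values, with an optional population weight), and the function names and district figures are hypothetical.

```python
def percent_errors(estimates, census):
    """Signed percent error of each estimate relative to its census count."""
    return [100.0 * (e - c) / c for e, c in zip(estimates, census)]


def malpe(estimates, census):
    """Mean algebraic percent error: positive means overestimation on average."""
    errors = percent_errors(estimates, census)
    return sum(errors) / len(errors)


def mape(estimates, census, weights=None):
    """Mean absolute percent error, optionally weighted (e.g. by census population)."""
    abs_errors = [abs(x) for x in percent_errors(estimates, census)]
    if weights is None:
        return sum(abs_errors) / len(abs_errors)
    return sum(w * a for w, a in zip(weights, abs_errors)) / sum(weights)


# Hypothetical districts: one small district overestimated, two larger ones underestimated.
census = [1000, 5000, 20000]
estimates = [1100, 4750, 19000]

bias = malpe(estimates, census)                      # signed errors cancel out
unweighted = mape(estimates, census)                 # every district counts equally
weighted = mape(estimates, census, weights=census)   # large districts dominate

print(f"MALPE {bias:.1f}, MAPE {unweighted:.1f}, weighted MAPE {weighted:.1f}")
```

In this made-up case the MALPE is zero even though no district is estimated exactly, which is why the signed and absolute measures are reported side by side. The weighted MAPE is lower than the unweighted one because the largest error belongs to the smallest district.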
A. 1990 Census Population
Almost half (46.0 percent) of school districts had a total population under 5,000 in 1990. These school districts accounted for only 5.7 percent of the school-age population. School districts with 20,000 or more people represented 19.3 percent of all school districts, but contained 73.0 percent of the school-age children in 1990 (see Table 10). As larger school districts were more likely to make up larger proportions of the county populations, it may be easier to estimate the school district populations for larger school districts with the synthetic ratio method.
Figure 2 shows the MALPEs for the total and school-age populations by the school district total population size in the 1990 census. The synthetic ratio method consistently underestimated the school-age population for all size categories of school districts. The largest MALPE was for school districts with total populations of 10,000 to 19,999, which were underestimated by 5.4 percent.
For school districts with less than 5,000 people, the synthetic ratio method overestimated the total population by 2.0 percent. The total population was underestimated for all larger school districts by 1.2 to 2.0 percent (MALPEs). This is in contrast to the findings reported for the 1980 Census-based estimates of the 1990 population where the total population was overestimated for all school districts, particularly the largest and smallest (Miller 2001). The differences were likely due to differences in errors for the county estimates across decades where the county population estimates were too high for the 1980s and too low for the 1990s (Blumerman and Christenson 2002).
Figure 3 presents the unweighted and weighted MAPEs for the estimates of the school-age populations by the school district size in 1990. Similar to the results found in the previous work (Miller 2001), the unweighted error for school districts with less than 5,000 people (16.2 percent) is about 50 percent higher than the errors for larger school districts (ranging from 9.2 to 11.1 percent). Weighting the MAPEs by the school-age population in Census 2000 somewhat reduced the differences in errors across size categories, but there was still a steady decline in average errors with increasing school district size. Figure 4 shows similar results for the total population estimates: the larger the school districts, the smaller the average errors.
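A comparison of unweighted and weighted MAPEs by size category could be tabulated along the following lines. This is a hedged sketch: the category boundaries are assumptions (only some of them are named in the text), the district records are invented, and the weighting by the census school-age count follows the description above rather than any published code.

```python
from bisect import bisect_right
from collections import defaultdict

# Assumed category boundaries; the report's exact size classes may differ.
BOUNDS = [5000, 10000, 20000, 40000]
LABELS = ["under 5,000", "5,000-9,999", "10,000-19,999", "20,000-39,999", "40,000+"]


def size_category(total_pop):
    """Map a total population to its size-category label."""
    return LABELS[bisect_right(BOUNDS, total_pop)]


def mape_by_size(districts):
    """districts: (total_pop_1990, school_age_estimate, school_age_census_2000) tuples.

    Returns {label: (unweighted MAPE, MAPE weighted by census school-age count)}.
    """
    groups = defaultdict(list)
    for total_pop, est, cen in districts:
        abs_error = abs(100.0 * (est - cen) / cen)
        groups[size_category(total_pop)].append((abs_error, cen))
    result = {}
    for label, rows in groups.items():
        unweighted = sum(a for a, _ in rows) / len(rows)
        weighted = sum(a * w for a, w in rows) / sum(w for _, w in rows)
        result[label] = (unweighted, weighted)
    return result


# Hypothetical records, for illustration only.
districts = [
    (3000, 480, 400),      # 20 percent absolute error
    (4500, 880, 800),      # 10 percent absolute error
    (25000, 4750, 5000),   # 5 percent absolute error
    (30000, 5700, 6000),   # 5 percent absolute error
]
by_size = mape_by_size(districts)
for label, (unweighted, weighted) in by_size.items():
    print(f"{label}: unweighted {unweighted:.1f}, weighted {weighted:.1f}")
```

Within a category, weighting pulls the average toward the errors of the districts with more children, so it narrows the gap between categories only when the larger districts in a category are also the better-estimated ones.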
The school-age population estimates had larger errors (MALPEs and MAPEs) than the total population estimates, which suggests that it is more difficult to correctly distribute the population by age within counties (and consequently school districts) than to estimate the total county (and school district) populations.
B. Census 2000 Population
Of the 14,310 school districts, 6,252 (43.7 percent) had total populations of less than 5,000 in Census 2000, a decline from 46.0 percent in 1990 (see Table 10). These school districts accounted for only 4.5 percent of the school-age population. School districts with 20,000 or more people represented 21.8 percent of all school districts, but contained 77.3 percent of the school-age children in 2000. Both the proportion of large school districts and the proportion of children in those districts were higher in Census 2000 than in the 1990 census.
Not surprisingly, the relationship between the size of the school districts in 2000 and the size of the average errors was similar to that for the size of the school districts in 1990. Figure 5 shows that the school-age population was underestimated in all sizes of school districts, ranging from -2.2 percent for school districts with under 5,000 people to -6.1 percent for school districts with 20,000 to 39,999 people. The total population was overestimated by 3.0 percent for school districts with less than 5,000 people and was underestimated for all other school district size categories by 1.4 percent to 3.2 percent.
The unweighted MAPEs for the school-age population were almost two-thirds higher for school districts with less than 5,000 people than for larger school districts (see Figure 6). Weighting the MAPEs by the school-age population in Census 2000 reduced the average errors for both the largest and smallest school districts, but changed the MAPE values for the other size categories very little. The smallest school districts also had the largest unweighted MAPEs for the total population, 11.5 percent for school districts with populations under 5,000 and 7.0 to 7.5 percent for the larger school districts (see Figure 7). As with the school-age population, the differences in average errors by school district size were substantially reduced when weighted by population. These findings suggest that smaller school districts may need special treatment in future school district estimates and research.
The relationship between school district size and average errors was similar to that found for county size and average errors. When April 1, 2000 county estimates were compared with the Census 2000 results, larger counties tended to have lower MAPEs. This was also true when comparing April 1, 1990 estimates with the 1990 census data (Blumerman and Christenson 2002). The similarities may be due to both the nature of creating population estimates and the inclusion of county estimates in the calculation of school district population estimates.
C. Percent Population Change, 1990 - 2000
The differences in the accuracy of the estimates for the total and school-age populations in school districts by the percent of population change from 1990 to 2000 are striking. Figure 8 shows that for school districts with school-age population declines of more than 10 percent, the estimates of the school-age population were on average too high (MALPE of 18.0 percent). The estimates for school districts with population increases of 10 percent or more were on average 12.9 percent too low. The synthetic ratio method overestimated by 25.7 percent (MALPE) the total population for school districts with more than a 10 percent decline in population in the 1990s. The school districts that experienced total population declines of 5 percent to 10 percent were overestimated by 7.7 percent. School districts with population increases of 10 percent or more were underestimated by 8.1 percent. For school districts with more moderate population changes (decreases up to 5 percent through increases up to 10 percent), the synthetic ratio method performed relatively well, with MALPEs of -0.5 percent to 4.5 percent for the total population and -3.7 percent to 0.9 percent for the school-age population.
The unweighted MAPE for school districts with declines in the school-age population of 10 percent or more was 20.4 percent (see Figure 9). The next largest MAPE was for school districts with population increases of 10 percent or more (14.5 percent). For the school districts with changes of 10 percent or less, the MAPEs show that the average errors were about the same (7.8 to 8.6 percent). Weighting the MAPEs again reduced the differences among the categories of population change, but still shows that the largest errors occurred for the school districts with the largest percent changes.
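The grouping by percent population change can be illustrated with a short sketch. The cut points below are assumptions loosely following the categories named in the text, and the districts are hypothetical. The point is only to show how the sign of the MALPE separates declining districts (overestimated) from fast-growing ones (underestimated).

```python
from collections import defaultdict


def pct_change(pop_1990, pop_2000):
    """Percent change in census population from 1990 to 2000."""
    return 100.0 * (pop_2000 - pop_1990) / pop_1990


def change_category(change):
    """Assumed cut points; the report's exact category boundaries may differ."""
    if change < -10:
        return "decline of more than 10%"
    if change < -5:
        return "decline of 5% to 10%"
    if change < 0:
        return "decline of up to 5%"
    if change < 10:
        return "increase of up to 10%"
    return "increase of 10% or more"


def malpe_by_change(districts):
    """districts: (census_1990, census_2000, estimate_2000) tuples."""
    groups = defaultdict(list)
    for pop_1990, pop_2000, est in districts:
        label = change_category(pct_change(pop_1990, pop_2000))
        groups[label].append(100.0 * (est - pop_2000) / pop_2000)
    return {label: sum(errs) / len(errs) for label, errs in groups.items()}


# Hypothetical districts, for illustration only.
districts = [
    (1000, 800, 1000),    # fell 20 percent, estimate 25 percent too high
    (2000, 1700, 1870),   # fell 15 percent, estimate 10 percent too high
    (5000, 6000, 5400),   # grew 20 percent, estimate 10 percent too low
    (4000, 4100, 4100),   # grew 2.5 percent, estimate exact
]
by_change = malpe_by_change(districts)
for label, value in by_change.items():
    print(f"{label}: MALPE {value:+.1f}")
```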
For school districts where the total population declined by more than 10 percent between 1990 and 2000, the MAPE was 26.1 percent for the total population estimates, over twice the mean errors for school districts that experienced population declines of less than ten percent or population growth (ranging from 4.6 percent to 10.3 percent, see Figure 10). The second largest MAPE was for school districts with population increases of ten percent or more (10.3 percent). Weighting the MAPEs with the Census 2000 population reduced the MAPE for school districts with the largest population declines by over half to 12.6 percent.
The synthetic ratio method does not perform well when estimating school districts with extreme population changes, though the errors are attributable in part to errors in the county estimates. This was also true when comparing April 1, 2000 county estimates with Census 2000 data. Counties with the largest percent population change from 1990 to 2000, whether growth or decline, had the largest MAPEs (Blumerman and Christenson 2002). These findings demonstrate how the assumption that school district populations change at the same rates as the counties in which they lie fails to capture large population changes and redistribution within these counties. These results also suggest that small school districts with relatively large population changes are among the most difficult to estimate accurately.
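The constant-share assumption can be made concrete with a small worked example. The sketch below assumes the simplest form of a synthetic ratio estimate (each district keeps its base-census share of the county total). The method evaluated in this report may apply ratios in more detail, for example by age group. District names and populations are hypothetical, and the county estimate is taken to be exactly right so that all error comes from redistribution within the county.

```python
def synthetic_ratio_estimates(base_pops, county_estimate):
    """Allocate a county estimate to districts by their base-census shares."""
    county_base = sum(base_pops.values())
    return {d: county_estimate * p / county_base for d, p in base_pops.items()}


def percent_error(estimate, census):
    """Signed percent error of an estimate relative to the census count."""
    return 100.0 * (estimate - census) / census


# Hypothetical county with two districts.
base_1990 = {"A": 8000, "B": 2000}      # base census counts
census_2000 = {"A": 9400, "B": 1600}    # A grew 17.5 percent, B fell 20 percent
county_estimate_2000 = 11000            # equals the true county total

estimates = synthetic_ratio_estimates(base_1990, county_estimate_2000)
errors = {d: percent_error(estimates[d], census_2000[d]) for d in estimates}

for d in sorted(errors):
    print(f"District {d}: estimate {estimates[d]:.0f}, error {errors[d]:+.1f} percent")
```

Both districts are assigned the county's 10 percent growth, so the small declining district is overestimated by a wide margin and the growing district is underestimated. This is the same direction of bias reported above.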
D. Census Divisions
As described above (Sections III.A. and III.B.), the school district population can be difficult to estimate for states with relatively large population changes, such as California, Arizona, New Mexico, and Texas. States with many small school districts and relatively small school district populations, such as Nebraska, North Dakota, and Montana, also had relatively large errors when comparing the estimates to the Census 2000 standard (see Tables 2 through 5).
Tables 8 and 9 summarize differences in the accuracy of the school district population estimates by Census Division (see footnote 3). The estimates produced with the synthetic ratio method underestimated the school-age population in all divisions, particularly in the south and east (West South Central, South Atlantic, Middle Atlantic, New England, and East South Central Divisions), but also in the Mountain Division. The total population was underestimated in the New England and South Atlantic Divisions and overestimated in the West North Central and Pacific Divisions. The mean errors were quite small for the other five Census Divisions. The relatively small differences across Divisions were likely due to the effects of combining many sizes of school districts with large ranges of population changes into single categories.
Consistent with the findings for individual states, the Pacific and Mountain Divisions had the largest MAPEs for the total and school-age population estimates. These Divisions include the states of Arizona, Colorado, Idaho, Nevada, New Mexico, Oregon, Utah, and Washington, some of the fastest growing states during the 1990s. The smallest MAPEs were in the South Atlantic and East South Central Divisions, which contain some of the states with the slowest growth in the past decade, as well as five of the states for which most or all of the school district boundaries were identical to county boundaries. The states with coterminous county and school district boundaries have lower errors on average because the county population estimates were more accurate than the school district estimates. Weighting the errors by population greatly reduced the differences in the errors by Census Division.
Similar to what was found with the bias associated with the percent change in school districts from 1990 to 2000, Census Divisions that experienced the largest population growth from 1990 to 2000 contained school districts whose populations were the most difficult to estimate accurately. These findings suggest that the synthetic ratio method may yield school district population estimates of acceptable accuracy for most states and Census Divisions. The development of alternative methodology could focus on the fastest changing areas.
E. Type of School District
This report attempts to determine the limits of the synthetic ratio method for estimating the population in school districts, and it may prove useful to determine whether the method produces estimates that differ in accuracy by types of school districts. Of the 13,876 school districts included in statistics for the estimated school-age populations, 17.3 percent were Elementary School Districts (ESDs), 3.4 percent were Secondary School Districts (SSDs), and 79.2 percent were Unified School Districts (USDs, see Table 8). The distribution was similar for the 14,256 school districts included in the statistics for the estimates of the total population: 18.8 percent were ESDs, 3.4 percent were SSDs, and 77.7 percent were USDs (see Table 9).
There were also five areas outside of school districts for which the school-age population size was large enough (30 people or more) to be included in the evaluation statistics discussed above. For the estimates of the total population, there were ten areas outside of school districts that met the minimum size criteria and were included in the analyses presented in this report. However, when MALPEs and MAPEs were calculated separately by school district type, the interpretation is limited for these areas outside school districts because only five or ten observations were used. For example, the school-age population was underestimated by 10.9 percent (MALPE) and the total population for areas outside school districts was overestimated by 10.1 percent. The unweighted MAPEs had even more extreme average errors of 30.0 percent for the school-age population and 35.9 percent for the total population. As these errors for the areas outside of school districts were relatively large, a series of MALPEs and MAPEs (not shown) were also calculated for the school district characteristics described above with the areas outside of school districts excluded. The differences between the MALPEs and MAPEs presented in this paper and for the subset with the areas outside school districts excluded were negligible for all the evaluation characteristics.
The MALPEs in Table 8 show that the school-age population was overestimated for ESDs by 1.5 percent, underestimated by 16.5 percent for SSDs, and underestimated by 4.5 percent for USDs. The differences in mean errors were much smaller for the total population, which was overestimated by 3.2 percent for ESDs and underestimated for SSDs and USDs by 1.2 percent and 0.7 percent, respectively.
The MAPEs for the estimates of the school-age populations were almost 50 percent higher for the ESDs and SSDs than for the USDs (see Figure 11). These differences remained after weighting by the school-age population. In contrast, the MAPE for the ESDs for the total population estimates was about two times higher (14.6 percent) than those for the SSDs (7.2 percent) and the USDs (7.8 percent) (see Figure 12). These differences were smaller for weighted MAPEs (9.5 percent for ESDs and 6.5 percent for both SSDs and USDs), but still show that the estimates were least accurate for the Elementary School Districts.
One reason the errors were higher for the ESDs and SSDs may be that the average population in Census 2000 was roughly twice as large for USDs as for SSDs and more than three times as large as for ESDs (1,196 for ESDs, 2,206 for SSDs, and 4,406 for USDs). As shown in Sections IV.A. and IV.B. above, estimates for school districts with smaller populations tended to be less accurate. In addition, the SSD mean errors apply to the smallest number of school districts (477 for the school-age population and 491 for the total population), so outliers would have greater effects on the errors. The underestimation of the older population of children (including 12- to 17-year-olds) may have occurred because a relatively large immigrant population in this age group was underestimated in the county estimates for the 1990s.
The results also indicate that there may be some differences in the age-to-grade distributions used to assign relevant children to overlapping school districts. The assignment of relevant children to school districts in the 1990 census starting population was based on averages of 1988-1990 CPS data. In the Census 2000 standard used for the evaluation, relevant children were assigned to school districts based on Census 2000 sample data by sex, race, Hispanic origin, and Census Region. It is possible that the age-to-grade distributions may have changed over the ten-year period and introduced additional error into the estimates of Elementary and Secondary School Districts that do not affect the estimates for children in Unified School Districts.
Finally, the statistics for the estimates for the total population may be somewhat skewed because the errors associated with overlapping school districts may be counted more than once in the computation of the MAPEs and MALPEs. However, there is little evidence of bias created by including overlapping school districts in the statistics more than once. Tables 11 and 12 show MAPEs for a subset of school districts that includes only USDs and school districts with boundaries that were not coterminous with county boundaries. The MAPEs demonstrate similar patterns of errors by school district characteristics with the largest errors for the smallest school districts and for those with the largest percent changes from 1990 to 2000.
F. Errors for School Districts with Total Populations of 20,000 or More
One of the major conclusions supported by the findings presented above and in prior work (Miller 2001) is that the synthetic ratio method performs relatively well for larger school districts. In order to further test this finding, a set of statistics was calculated only for the school districts with total populations of 20,000 or more. Table 13 shows some substantial differences between the MAPEs by school district characteristics. For example, the MAPE for the school-age population estimate for school districts with more than a 10 percent decrease in the total population from 1990 to 2000 was 20.4 percent for all school districts compared with 12.3 percent for the subset of larger school districts. Large differences also occurred for the New England, West North Central, West South Central, Mountain, and Pacific Divisions (which contain many of the smaller school districts) and Elementary School Districts. The errors were smaller with only the larger school districts included for every category except for those with population declines of 5 percent to 10 percent. The ranges of errors across characteristics were smaller when only the school districts with populations of 20,000 or more were included in the MAPEs, supporting the conclusion that the size of the population influences the size of the errors for all other school district characteristics.
G. Errors for School Districts that Were Not Coterminous with County Boundaries
Where school district boundaries are identical to county boundaries, the errors for the school district population estimates are due entirely to the errors in the county population estimates by methodological design. To determine if including school districts with boundaries that were coterminous with county boundaries substantially reduced the mean errors for the school district estimates, MAPEs were also calculated with these school districts excluded. All the school districts in Florida, Nevada, Maryland, and West Virginia were coterminous with county boundaries, and Hawaii and the District of Columbia contained single school districts.
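The effect of excluding coterminous districts on a summary MAPE can be sketched as below. The records are hypothetical and the flag marking a district as coterminous is an assumed field, not part of any published file layout. The example only illustrates the comparison described above: recomputing the same statistic on the subset whose boundaries differ from county boundaries.

```python
def mape(rows):
    """Unweighted mean absolute percent error over (estimate, census, flag) rows."""
    return sum(abs(100.0 * (est - cen) / cen) for est, cen, _ in rows) / len(rows)


# Hypothetical records: (estimate, census count, coterminous with county?)
rows = [
    (10200, 10000, True),   # error inherited entirely from the county estimate
    (5150, 5000, True),
    (1800, 2000, False),
    (4600, 4000, False),
]

mape_all = mape(rows)
mape_non_coterminous = mape([r for r in rows if not r[2]])

print(f"All districts: {mape_all:.1f}")
print(f"Coterminous districts excluded: {mape_non_coterminous:.1f}")
```

In this invented case the coterminous districts carry only the smaller county-level errors, so dropping them raises the average. A difference in that direction would indicate that their inclusion lowers the overall mean errors.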
Table 14 shows the differences between MAPEs for the two sets of school districts for the states with the largest differences. While most of the errors increased when coterminous school districts were excluded, the MAPEs for the school-age population for Louisiana and Virginia increased slightly because only a few school districts were included in the analyses for those two states. There were slight increases in the MAPEs for the school-age and total population by size, percent population change, and school district type when the school districts with boundaries identical to county boundaries were excluded (see Tables 15 and 16). The largest differences were for the largest school district size categories and for the largest percent changes, whether population growth or decline. Not surprisingly, the largest changes were for the categories for which the most school districts were excluded as having boundaries coterminous with county boundaries.