
Evaluation of the 1990 School District Level Population
Estimates Based on the Synthetic Ratio Approach

Esther R. Miller

Population Division
U.S. Census Bureau
Washington, DC 20233-8800

September 2001

Working Paper Series No. 54

DISCLAIMER:

This paper reports the results of research and analysis undertaken by Census Bureau staff. It has undergone a more limited review than official Census Bureau publications. This report is released to inform interested parties of research and to encourage discussion.


ABSTRACT

The Census Bureau was tasked with conducting research and evaluation and developing a methodology to produce updated estimates of the total population and the total number of school-age children in each school district. This paper provides an overview of the methodology and its limitations, the steps necessary to create the synthetic population estimates, the problems we encountered, and the results from our evaluation of the data.


Table of Contents

Chapter I: Introduction
Chapter II: Development of Methodology
Chapter III: Evaluating the Ratio Approach
Chapter IV: Results of the Evaluation
Chapter V: Conclusions and Plans to Improve School District Estimates
Chapter VI: References
Chapter VII: Appendix


Acknowledgments:

The author would like to thank Paul Siegel (Small Area Income and Poverty Estimates), Bashir Ahmed and Signe Wetrogan (Population Division) for their invaluable comments and contributions to this research.


I. INTRODUCTION

The elementary and secondary schools in the United States depend on federal dollars to supplement programs for disadvantaged children. Title 1 of the Elementary and Secondary Education Act provides a means for the Department of Education (DOE) to distribute federal funds to school districts.

Prior to School Year (SY) 1997/1998, the distribution of federal dollars to school districts was carried out in a two-step process. First, the DOE allocated federal dollars to counties. States then had the responsibility to distribute the federal dollars to school districts. In order to determine the amount of money to allocate to a state, the DOE used the most recent decennial data on the number of school-age children in poverty in each county within the state. States then used a variety of data sources to allocate the monies down to the school districts including special decennial census tabulations of the number of school-age children in poverty in each school district.

In 1994, Congress enacted a law authorizing the Department of Education to allocate Title 1 funds directly to school districts, beginning with school year 1997/1998. In doing so, Congress also specified that the DOE use updated estimates of the number of school-age children in poverty in each school district rather than the once-a-decade measures from the decennial census.

The Census Bureau was tasked with conducting research and evaluation and developing a methodology to produce updated estimates of the number of school-age children in poverty. Because the distribution of the funds also requires updated estimates of the total population and the total number of school-age children in each school district, the Census Bureau also had to develop methodologies for these data requirements.

This paper focuses on the development and evaluation of the methodologies to produce updated estimates of the total population and the total number of school-age children in each school district. It is divided into five sections. Section I is the introduction. Section II describes the methodology developed to produce the updated population estimates for school districts and issues that affect the production and subsequent accuracy of the estimates; Section III describes the methodology used to evaluate the school-district estimates; Section IV presents the results of the evaluation; and Section V presents conclusions and discusses plans to improve the population estimates for school districts. A discussion of the development and evaluation of the methodology to produce updated estimates of the number of school-age children in poverty is presented in a separate paper.1

II. DEVELOPMENT OF METHODOLOGY

This section presents an overview of the methodology used to produce the estimates of the total population and the school-age population in each school district.

As noted in the prior section, the Census Bureau was tasked with developing the methodology to produce updated estimates of the total population and the school-age population in each school district. To comply with the legislation, the methodology had to be developed and implemented for the allocation of funds for school year 1997/1998.

Although the Census Bureau did have a program to develop and produce annual estimates of the population of functioning governmental units, the methodologies developed for those estimates could not be used to produce updated estimates of school districts. Therefore, it was necessary for the Census Bureau to construct a new methodology to produce the population estimates for school districts.

Factors Affecting Development of Methodology

In developing the methodology, we encountered a number of factors which complicate the development of estimates for school districts.

School Districts are Small with Unique Boundaries

School districts are small with unique boundaries. As such, little Census or other data are available as input to an estimation methodology. In 1990, there were 15,226 school districts in the United States. Table 1 shows that approximately 50 percent of these school districts have a total population of less than 5,000 people. Approximately 82 percent of all school districts have an estimated total population of less than 20,000 people (U.S. Census Bureau, 1997).

Table 1. Percent Distribution of All School Districts by Population Size and
of School-age Children by Population Size of School District: 1990

School District     Percent of        Cumulative Percent   Percent of           Cumulative Percent of
Population Size     School Districts  of School Districts  School-age Children  School-age Children
Under 5,000         49.2              49.2                 6.0                  6.0
5,000 - 9,999       17.0              66.2                 7.7                  13.7
10,000 - 19,999     15.6              81.8                 13.4                 27.1
20,000 - 39,999     9.7               91.5                 15.4                 42.5
40,000 or more      8.5               100.0                57.6                 100.1
Total in 1990       15,226            15,226               45.3 million         45.3 million

In most parts of the United States, school district boundaries are unique in that they do not coincide with other governmental units for which data are regularly tabulated. There are only seven states where school district boundaries coincide with county boundaries, accounting for only 928 of the 15,226 school districts in the United States. Although most school districts are confined to a single county, some cross county boundaries, further adding to the complications in developing an estimation methodology.

School Districts are Defined by Relevant Grades

School districts are defined according to the grade levels served by the school district. Therefore the estimates of the number of school-age children in each school district had to be calculated according to the grade level served by the school district. In 1990, about 74 percent of the school districts across the United States served grade levels kindergarten through 12th grade. The remainder of the school districts served only specific grades such as kindergarten through 6th grade (22 percent) or 9th through 12th grade (4 percent).2

For those school districts which served only partial grade levels, it was necessary to translate the grade levels served back to relevant ages. The 1990 census data on highest grade completed together with data from the October supplements of the 1988, 1989, and 1990 Current Population Surveys provided the necessary information to develop a grade to age relationship.

The translation of grade to age was done so that each school-age child could be assigned to one and only one school district. Thus, the sum of school-age children across school districts equals the total number of school-age children in the United States. However, this is not true for the sum of the total population across school districts. Because a district that provides elementary grades may cover the same territory as a district that provides middle school grades, the total population living in that territory is counted once in each of the overlapping districts. Thus, the sum of the estimates of the total population for all school districts cannot be compared with the total population of the United States.

School District Boundaries Change Over Time

Several changes may occur to school districts over time. School districts can annex new territory over time; school districts can close; and new school districts can be created. In order to maintain correct and up to date boundaries, the Census Bureau must periodically survey school districts to obtain current boundary information.

Additionally, the changes to boundaries complicate the complete evaluation of any methodology.

Choosing the Ratio Methodology

The complexities outlined above and the scarcity of data available for school districts led the Census Bureau to choose a ratio or synthetic approach to produce the school district estimates. In choosing the ratio approach, the Census Bureau decided to rely upon the 1990 census to provide a starting point and the annual estimates of the county population to provide the basis for change. The annual estimates of the total population for counties would provide the basis for change in the total population for school districts. The annual estimates of the population by age for counties would provide the basis for change in the school-age population for school districts. This approach assumes that all school districts within a county change at the county rate. The formula for developing the estimates for the post 1990 period is:

P(sd t) = P(sd 1990) / P(county 1990) * P(county t)

where:
P(sd t) = Estimated school district population in current boundaries for time t
P(sd 1990) = School district population in current boundaries from 1990 census
P(county 1990) = County population from 1990 census
P(county t) = County population for time t

While most school districts are confined to a single county, some do cross county boundaries. For those cases where the school district crosses county boundaries, it is necessary to construct a separate ratio and separate estimates for the school district piece in each county. In these cases, as a final step, the separate school district county pieces are summed to produce the school district estimate.
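
The ratio calculation, including the summing of county pieces for a district that crosses county boundaries, can be sketched in Python as follows. This is only an illustration of the formula above: the function name, the dictionary layout, and every population figure are hypothetical and are not Census Bureau code or data.

```python
def synthetic_ratio_estimate(pieces):
    """Estimate a school district population at time t.

    `pieces` (a hypothetical layout) holds one entry per school
    district-county piece: the 1990 census population of the piece, the
    1990 census population of the county, and the county estimate for time t.
    """
    total = 0.0
    for piece in pieces:
        share = piece["sd_1990"] / piece["county_1990"]  # 1990 share of the county
        total += share * piece["county_t"]               # share is held constant
    return total


# Hypothetical district lying in two counties (illustrative numbers only).
district = [
    {"sd_1990": 4_000, "county_1990": 40_000, "county_t": 44_000},
    {"sd_1990": 1_000, "county_1990": 20_000, "county_t": 19_000},
]
estimate = synthetic_ratio_estimate(district)
print(f"Estimated district population at time t: {estimate:,.0f}")
```

A district confined to a single county is simply the special case of a list with one piece.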

Assumptions Associated with Ratio Approach

The ratio approach assumes that the ratio of the school district population to the county population will remain constant over time. In other words, it assumes that the population in each school district county piece changes at the same rate as that of the county. However, in reality this may not be the case. If the county population is estimated to decline, but the school district population in that county increases or vice versa, the resulting estimates of the school district population will be biased.

The estimate is further complicated when a school district crosses county boundaries. In that case, the ratio method assumes that each school district-county piece grows at the rate of that county. In a school district that crosses county boundaries, one of the counties it comprises may see a population spurt whereas the other county may experience a decline in population. When the two county pieces are summed together, the school district population may be underestimated or overestimated, depending upon the size of the school district pieces.
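
A small numerical sketch may help show how the constant-ratio assumption produces bias when a district moves against its county's trend. All figures below are invented for illustration, and the percent error is expressed relative to the later "true" count, which is an assumption of this sketch.

```python
# Hypothetical single-county district (all numbers are illustrative).
sd_base, county_base = 2_000, 50_000   # base-year census counts
county_later = 47_500                  # the county declines by 5 percent
sd_true_later = 2_200                  # the district actually grows by 10 percent

# The ratio method holds the district's share of the county constant.
sd_estimate = sd_base / county_base * county_later

# Signed percent error relative to the later count (assumed denominator).
percent_error = (sd_estimate - sd_true_later) / sd_true_later * 100
print(f"estimate = {sd_estimate:,.0f}, percent error = {percent_error:.1f}")
```

Here the district is underestimated because it is forced to follow the county's decline.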

III. EVALUATING THE RATIO APPROACH

Development of Ratio Estimates for Evaluation

To do a complete evaluation of the school district methodology, we need to have school district data at two points in time. The data for the 1980 and 1990 censuses provide us with that opportunity. To evaluate the ratio methodology, we used the 1980 census as the base, developed an estimate for 1990, and compared the estimate to the 1990 census data. The estimates were produced for both the total population and the school-age population aged 5-17 years. For this evaluation, we developed four sets of synthetic population estimates.

Set 1: County Estimates-Based Model

To evaluate the ratio approach applied to an estimate of the county population (as would be the case in the post 1990 period), we must develop a 1990 estimate for the county. For this test, we used the 1990 estimate of the county population that had been developed using our standard county estimates approaches and based on the 1980 census.3

To produce these estimates, we first compute the ratio of the school district population to county population using the 1980 census data. Then we apply the ratio to the 1980-based estimate of the 1990 county population developed by the Census Bureau. This evaluation measures the effect of the ratio approach as well as any error caused by the estimate of the county population.

P(sd 1990) = P(sd 1980) / P(county 1980) * P(county 1990)

where:
P(sd 1990) = Estimated school district population in 1990
P(sd 1980) = School district population from 1980 census
P(county 1980) = County population from 1980 census
P(county 1990) = Estimated county population in 1990

Set 2: County Count-Based Model

This approach is very similar to Set 1 except that the ratios are multiplied by the 1990 census count of the county population rather than the 1980-based estimate. As in Set 1, we assume that all school districts within the county change at the same rate as the county. Although only estimates would be available for the post-1990 period, this set is a good benchmark against which to judge all other model-based estimates.

In this approach, we multiply the ratio of the 1980 school district population to 1980 county population by the 1990 census county population.

P(sd 1990) = P(sd 1980) / P(county 1980) * P(county 1990)

where:
P(sd 1990) = Estimated school district population in 1990
P(sd 1980) = School district population from 1980 census
P(county 1980) = County population from 1980 census
P(county 1990) = County population from 1990 census

Set 3: State Growth-Based Estimates

This approach is similar to Set 2 except that it assumes that the school districts all change at the same rate as that of the state. To develop the estimates, we multiply the ratio of the 1990 state population to 1980 state population by the 1980 school district population.

P(sd 1990) = P(State 1990) / P(State 1980) * P(sd 1980)

where:
P(sd 1990) = Estimated school district population in 1990
P(State 1990) = State population from 1990 census
P(State 1980) = State population from 1980 census
P(sd 1980) = School district population from 1980 census

Set 4: National Growth-Based Estimates

This approach is also similar to Sets 2 and 3 except that it assumes that the school districts all change at the same rate as that of the entire United States. To develop this estimate, we multiply the ratio of the 1990 national population to 1980 national population by the 1980 school district population.

P(sd 1990) = P(National 1990) / P(National 1980) * P(sd 1980)

where:
P(sd 1990) = Estimated school district population in 1990
P(National 1990) = National population from 1990 census
P(National 1980) = National population from 1980 census
P(sd 1980) = School district population from 1980 census

Note that the assumptions underlying the models may not be realistic. For example, the population growth in a school district does not necessarily correspond to the growth in its county or state. Similarly, it is not reasonable to assume that each and every school district will grow at the same rate as the nation.
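
The four sets differ only in which larger area supplies the growth factor, so they can be sketched with one helper function. The function name and all input figures below are hypothetical; they are chosen only to show how the same 1980 district count yields four different 1990 estimates.

```python
def ratio_estimate(sd_base, area_base, area_later):
    """Scale the base-year district population by the growth of a larger area."""
    return sd_base / area_base * area_later


# Hypothetical inputs (illustrative only, not census figures).
sd_1980 = 5_000
county_1980 = 100_000
county_1990_estimate = 108_000   # 1980-based estimate of the county (Set 1)
county_1990_census = 110_000     # 1990 census count of the county (Set 2)
state_1980, state_1990 = 4_000_000, 4_200_000
nation_1980, nation_1990 = 200_000_000, 219_000_000

estimates = {
    "Set 1": ratio_estimate(sd_1980, county_1980, county_1990_estimate),
    "Set 2": ratio_estimate(sd_1980, county_1980, county_1990_census),
    "Set 3": ratio_estimate(sd_1980, state_1980, state_1990),
    "Set 4": ratio_estimate(sd_1980, nation_1980, nation_1990),
}
for name, value in estimates.items():
    print(f"{name}: {value:,.0f}")
```

Multiplying the district-to-area ratio by the later area population is algebraically the same as multiplying the district count by the area's growth ratio, which is how Sets 3 and 4 are written above.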

Creating a Comparable Universe of School Districts Across the Decade

To do a complete evaluation of the methodology, we need a comparable universe of school districts over the 1980 to 1990 time period. Optimally, for our analysis we would use a matched 1980 and 1990 file, geocoded to identical school district boundaries. The advantage of this type of file is that we would not need to make assumptions about school district boundaries across the decade.

If the Census Bureau had a 1980 data file geocoded to the 1990 school district geography we could simply apply synthetic ratios to 1990 census data and compare the expected value to the "truth" in 1990. If we were able to geocode 1990 data into 1980 school district geography, we could administer the same approach. However, neither data set is available.

Because we do not have files geocoded to the same boundaries, we concluded that we needed to prepare a universe of school districts that are "equivalent" across the decade. The starting point for our universe is the total number of school districts in 1990 (15,226; see Table 2). We first excluded the 928 school districts that were coterminous with county boundaries, because the stable shares approach perfectly predicts the 1990 population for this set of school districts.

Table 2. Universe of School Districts for Evaluation of Synthetic
Estimates of Population: 1980 to 1990

                                                                          School Districts    School-age Children
Type of School District                                                   Number    Percent   Number (in thousands)   Percent
Total 1990                                                                15,226    100.0%    45,339                  100.0%
District or piece coterminous with county boundaries (note 1)             928       6.1%      10,116                  22.3%
Districts eligible for the synthetic ratio evaluation                     14,298    93.9%     35,223                  77.7%
Limited Grade Range (note 2)                                              4,018     26.4%     7,308                   16.1%
Newly formed (note 3)                                                     416       2.7%      775                     1.7%
County boundaries changed from 1980 to 1990                               12        0.1%      62                      0.1%
School district county pieces did not match up across the decade          609       4.0%      1,742                   3.84%
School districts with a population size of less than 31 people (note 4)   42        0.3%      0                       0%
Districts in Evaluation                                                   9,201     60.7%     27,079                  55.9%


1 Includes 15 new districts containing 85,068 school-age children in districts.

2 Includes non-unified districts and 13 districts containing 23,189 school-age children in counties which changed boundaries between 1980 and 1990. Also includes 213 new districts containing 39,829 school-age children.

3 Districts with an ID number in 1990 but no ID number in 1980.

4 We excluded these school districts due to the large errors they contributed to the analysis.

Essentially, we could apply the synthetic ratio approach to the remaining 14,298 districts. However, in order to have an "equivalent" universe file over the decade, we also removed:

  1. School districts with limited grade ranges (4,018)4;
  2. School districts which were newly formed between 1980 and 1990 (416);
  3. School districts in counties where the county boundaries changed between 1980 and 1990 (12)5;
  4. School districts whose county pieces did not match up across the decade (609); and
  5. School districts with a population size of less than 31 people (42).

The final universe for the 1980-1990 evaluation file contained 9,201 matched school district identification numbers.
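
The arithmetic behind the final universe can be checked with a few lines of Python. The counts are those given in the text and in Table 2; the variable names are ours.

```python
total_1990 = 15_226
coterminous_with_county = 928
eligible = total_1990 - coterminous_with_county

# Districts removed to obtain an "equivalent" universe across the decade.
exclusions = {
    "limited grade range": 4_018,
    "newly formed between 1980 and 1990": 416,
    "county boundaries changed": 12,
    "county pieces did not match up": 609,
    "population of less than 31 people": 42,
}
in_evaluation = eligible - sum(exclusions.values())
print(eligible, in_evaluation)
```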

Evaluation Measures

To compare and evaluate the estimates, we used two standard statistical measures: (1) the Mean Absolute Percent Error (MAPE), and (2) the Mean Algebraic Percent Error (MALPE).6 The MAPE is computed as the sum, for all school districts, of the absolute percent difference between the estimate and the 1990 census figure, divided by the number of school districts. The MAPE measures the accuracy of the estimates. The MALPE is computed in the same manner, except that the sign of each difference is retained, so it measures the direction of the error. A positive MALPE indicates overestimation of a population and a negative MALPE indicates underestimation.
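
A minimal sketch of the two measures follows. It assumes that each percent error is expressed relative to the census figure, so that a positive value means an overestimate; the function names and the three-district example are hypothetical.

```python
def percent_errors(estimates, census):
    """Signed percent error of each estimate relative to the census count
    (the census count as denominator is an assumption of this sketch)."""
    return [(e - c) / c * 100 for e, c in zip(estimates, census)]


def mape(estimates, census):
    """Mean Absolute Percent Error: average size of the error, sign ignored."""
    errors = percent_errors(estimates, census)
    return sum(abs(x) for x in errors) / len(errors)


def malpe(estimates, census):
    """Mean Algebraic Percent Error: average error with the sign retained."""
    errors = percent_errors(estimates, census)
    return sum(errors) / len(errors)


# Three hypothetical districts: one overestimated, one underestimated, one exact.
estimates = [1_100, 1_900, 5_000]
census = [1_000, 2_000, 5_000]
mape_value = mape(estimates, census)
malpe_value = malpe(estimates, census)
print(f"MAPE = {mape_value:.1f}, MALPE = {malpe_value:.1f}")
```

Because positive and negative errors offset one another in the MALPE but not in the MAPE, the MALPE is never larger in absolute value than the MAPE.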

We also examined weighted MAPEs. The unweighted statistics treat each school district with equal importance, regardless of size. The weighted MAPEs, on the other hand, take into consideration the size of a school district, measured by the total population or the school-age population in that school district. Weighting by the total population in each school district addresses the size of the school district population affected. Weighting by the number of school-age children indicates how accurate the estimates are for the districts containing the average child.
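
The effect of weighting can be sketched as follows. The function name and data are hypothetical, and using the census count of each district as its weight is an assumption of this sketch; the text specifies only that the weights are the total population or the school-age population of the district.

```python
def weighted_mape(estimates, census, weights):
    """Absolute percent errors averaged with district-size weights."""
    weighted_sum = sum(
        w * abs(e - c) / c * 100 for e, c, w in zip(estimates, census, weights)
    )
    return weighted_sum / sum(weights)


# Hypothetical districts: the small districts have the larger percent errors.
estimates = [1_100, 1_900, 5_000]
census = [1_000, 2_000, 5_000]

unweighted = weighted_mape(estimates, census, [1, 1, 1])   # equal weights
weighted = weighted_mape(estimates, census, census)        # size weights
print(f"unweighted = {unweighted:.1f}, weighted = {weighted:.1f}")
```

When the largest percent errors occur in the smallest districts, as in this example, the weighted MAPE falls below the unweighted MAPE.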

IV. RESULTS OF THE EVALUATION

For purposes of this evaluation, we developed four sets of synthetic population estimates. Set 1 uses the ratio approach and the 1980-based county population estimate. Set 2 is similar except that it uses the 1990 census data for the county rather than the 1980-based estimate. The differences between Set 1 and Set 2 represent the additional error in the ratio approach introduced by using an estimate of the population rather than the census counts. Sets 3 and 4 represent alternatives to a county-based approach. Set 3 assumes that the school districts all grow at the same rate as the state, while Set 4 assumes that the school districts all grow at the national rate.

Overall Quality

As shown in Table 3 and Figure 1, the county count-based estimates have the smallest unweighted MAPEs (12.6 and 16.0), followed by the county estimates-based (13.3 and 16.9), the state growth-based (16.4 and 18.9), and national growth-based estimates (18.9 and 20.6). This pattern holds both for total population and school-age population aged 5-17, whether the MAPEs are weighted or unweighted.

Table 3. Mean Absolute Percent Errors (MAPEs) in Synthetic Estimates of the Total Population and School-age Population, Selected School Districts: 1980 to 1990

                                 Unweighted Percent Error                        Weighted Percent Error
Type of Synthetic Method         Total Population   School-age Population 5-17   Total Population   School-age Population 5-17
Set 1: County Estimates-based    13.3               16.9                         9.6                12.0
Set 2: County Count-based        12.6               16.0                         9.2                10.4
Set 3: State Growth-based        16.4               18.9                         11.8               13.3
Set 4: National Growth-based     18.9               20.6                         13.9               16.6

Figure 1. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population
and School-age Population by Method to Estimate 1990 School Districts

Table 4 presents the results of comparing the MAPEs across each set of estimates. As shown in the first row of Table 4, we lose only a minor amount of accuracy when we use an estimate rather than the census count as the base for the 1990 county data. Comparing Set 1 to Set 3 and Set 4 indicates that the use of the ratio approach at the county level is superior to one that uses state or national growth rate assumptions.

Table 4. Comparison of the Percent Differences Between Mean Absolute Percent Errors (MAPEs) for the Total Population and School-age Population by Synthetic Ratio Methodology, Selected School Districts: 1980 to 1990

                                                                           Unweighted Percent Error                   Weighted Percent Error
Percent Differences Between Synthetic Ratio Estimates                      Total Population   School-age Population   Total Population   School-age Population
County Count-based and County Estimates-based = (Set 2 - Set 1)/Set 2      -5.6               -4.3                    -5.6               -15.4
County Count-based and State Growth-based = (Set 2 - Set 3)/Set 2          -30.2              -28.3                   -18.1              -27.9
County Count-based and National Growth-based = (Set 2 - Set 4)/Set 2       -50.0              -51.1                   -28.8              -59.6
State Growth-based and National Growth-based = (Set 3 - Set 4)/Set 3       -13.2              -15.1                   -8.3               -19.9
County Estimates-based and State Growth-based = (Set 1 - Set 3)/Set 1      -23.3              -22.9                   -11.8              -10.8
County Estimates-based and National Growth-based = (Set 1 - Set 4)/Set 1   -42.1              -44.8                   -21.9              -38.3
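
As an illustration of how the row formulas are applied, the sketch below computes several of the percent differences for the unweighted total-population MAPEs reported in Table 3. The function and variable names are ours, and only that one column is shown.

```python
# Unweighted total-population MAPEs from Table 3.
mape_total_unweighted = {"Set 1": 13.3, "Set 2": 12.6, "Set 3": 16.4, "Set 4": 18.9}


def percent_difference(base, other, mapes=mape_total_unweighted):
    """(base - other) / base, expressed as a percent, as in the row labels."""
    return (mapes[base] - mapes[other]) / mapes[base] * 100


for base, other in [("Set 2", "Set 1"), ("Set 2", "Set 3"), ("Set 2", "Set 4"),
                    ("Set 1", "Set 3"), ("Set 1", "Set 4")]:
    print(f"({base} - {other})/{base}: {percent_difference(base, other):.1f}")
```

A negative value means that the second set in the comparison has the larger MAPE, that is, it is less accurate than the base set.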

Using the MAPEs as our measure of accuracy, we would conclude that the Set 2 approach is the most accurate for estimating the school district population. However, the Set 2 (county count-based) estimates can be produced only for a census year. Therefore, if we must rely on the synthetic approach between censuses, we need to use county population estimates. As shown by the comparison to Sets 3 and 4, the use of the county estimate is superior to a method that uses state or national growth rate assumptions. For this reason, the remainder of this section reports results from the county estimates-based MAPEs and MALPEs.

Quality of the Estimates by Demographic and Economic Characteristics

To evaluate the amount of "bias" or other patterns in the county estimates-based school district estimates, we selected ten economic and demographic characteristics. These characteristics are a subset of those the National Academy of Sciences used to evaluate poverty estimates at the county level.7 The ten characteristics are:

  1. Size of the School District in 1980;
  2. Size of the School District in 1990;
  3. Population Growth, 1980-1990;
  4. Percent Poor School-age Children in 1980;
  5. Percent Poor School-age Children in 1990;
  6. Numerical Change in Poverty Rate for Children, 1980-1990;
  7. Census Division;
  8. Percent Hispanic in 1980;
  9. Percent Black in 1980; and
  10. Percent Group Quarters in 1980.

Table 5 shows both the unweighted and weighted MAPEs8 and unweighted MALPEs for total population, by the selected characteristics. Similarly, Table 6 shows the unweighted and weighted MAPEs and unweighted MALPEs by characteristics for school-age population aged 5-17. Additionally, the two tables present the total population (or school-age population) and the percent of the population in each category.9

Table 5. Mean Absolute Percent Errors (MAPEs) and Mean Algebraic Percent Errors (MALPEs) for Selected School Districts, by School District Characteristics: Total Population, 1980 to 1990


Total Population
(first N, %N, MAPE, MALPE: unweighted, N = number of school districts; second N, %N, MAPE: weighted by the total population, N in thousands)

Demographic Characteristic N %N MAPE MALPE N %N MAPE

Total 9,201 100.0% 13.3 5.0 137,698 100.0% 9.6

School District Population, 1980*
   Under 5,000 4,438 48.2 18.1 8.8 9,562 6.9 13.3
   5,000 - 9,999 1,745 19.0 8.6 1.3 13,459 9.8 9.3
   10,000 - 19,999 1,504 16.3 8.5 0.6 23,099 16.8 9.2
   20,000 - 39,999 883 9.6 9.3 2.4 26,372 19.2 8.9
   40,000 or more 631 6.9 9.4 3.2 65,206 47.4 9.5

School District Population, 1990**
   Under 5,000 4,333 47.1 18.3 10.5 8,731 6.3 12.3
   5,000 - 9,999 1,693 18.4 8.6 0.7 12,196 8.9 8.6
   10,000 - 19,999 1,541 16.7 8.6 0.1 21,960 15.9 8.5
   20,000 - 39,999 920 10.0 9.1 0.1 25,449 18.5 8.9
   40,000 or more 714 7.8 9.7 -0.6 69,361 50.4 10.0

Population Growth, 1980-1990
   Decrease of 10% or more 1,946 21.1 29.2 28.9 11,497 8.3 17.3
   Decrease of 5.0 - 9.9% 1,231 13.4 8.0 7.2 16,042 11.7 9.5
   Decrease of 0.1 - 4.9% 1,385 15.1 6.4 4.8 23,739 17.2 6.5
   Increase of 0.0 - 4.9% 1,178 12.8 5.5 2.0 20,536 14.9 5.5
   Increase of 5.0 - 9.9% 917 10.0 5.8 -0.1 15,672 11.4 6.6
   Increase of 10% or more 2,544 27.6 13.7 -10.9 50,212 36.5 11.9

Percent Poor School-age Children, 1980
   Zero 157 1.7 33.9 10.6 109 0.1 25.4
   0.1 - 5.9 1,581 17.2 13.7 1.1 32,126 23.3 11.3
   6.0 - 8.9 1,425 15.5 11.6 3.0 23,388 17.0 8.6
   9.0 - 12.4 1,530 16.6 11.8 4.3 21,273 15.4 8.6
   12.5 - 16.4 1,429 15.5 12.1 5.6 19,807 14.4 7.9
   16.5 - 23.9 1,617 17.6 12.0 6.4 23,340 16.9 10.2
   24 or more 1,462 15.9 16.5 9.3 17,655 12.8 10.0

Percent Poor School-age Children, 1990
   Zero 390 4.2 45.1 30.4 286 0.2 18.9
   0.1 - 5.9 1,616 17.6 10.9 -2.1 30,168 21.9 11.3
   6.0 - 8.9 1,119 12.2 10.5 2.4 17,712 12.9 8.8
   9.0 - 12.4 1,322 14.4 10.1 3.3 19,082 13.9 8.2
   12.5 - 16.4 1,297 14.1 10.1 3.7 17,682 12.8 8.7
   16.5 - 23.9 1,674 18.2 11.6 4.9 22,310 16.2 8.1
   24 or more 1,783 19.4 16.5 10.1 30,458 22.1 10.7

Change in Poverty Rate for Children, 1980-1990
   Decrease of 10% or more 724 7.9 24.1 14.9 1,461 1.1 12.4
   Decrease of 5.0 - 9.9% 882 9.6 14.0 5.5 6,037 4.4 9.5
   Decrease of 0.1 - 4.9% 2,583 28.1 11.1 2.3 42,612 30.9 9.5
   Increase of 0.0 - 4.9% 2,651 28.8 10.9 2.8 52,522 38.1 9.2
   Increase of 5.0 - 9.9% 1,307 14.2 10.8 4.6 23,085 16.8 8.8
   Increase of 10% or more 1,054 11.5 19.8 10.9 11,982 8.7 13.0

Census Division
   New England 873 9.5 9.0 -1.9 12,017 8.7 5.3
   Middle Atlantic 1,453 15.8 7.8 1.1 26,113 19.0 6.7
   South Atlantic 199 2.2 13.4 7.2 8,060 5.9 9.2
   East North Central 1,854 20.1 9.0 2.6 29,501 21.4 8.1
   East South Central 367 4.0 11.1 3.9 9,496 6.9 9.5
   West North Central 1,974 21.5 13.2 7.2 13,051 9.5 9.1
   West South Central 1,390 15.1 17.2 5.5 18,206 13.2 16.8
   Mountain 520 5.7 33.9 22.1 7,867 5.7 9.8
   Pacific 571 6.2 21.2 9.0 13,387 9.7 13.3

Percent Hispanic 1980
   0.0 483 5.2 22.7 8.7 250 0.2 14.9
   0.1 - 0.9 5,282 57.4 10.0 3.5 60,230 43.7 8.0
   1.0 - 4.9 2,313 25.1 15.7 5.8 45,582 33.1 9.6
   5.0 - 9.9 416 4.5 17.3 7.1 12,624 9.2 11.9
   10.0 - 25.0 378 4.1 22.8 13.8 12,728 9.2 12.5
   More than 25 329 3.6 19.9 7.0 6,285 4.6 14.1

Percent Black 1980
   0.0 1,944 21.1 19.0 11.3 3,334 2.4 10.1
   0.1 - 0.9 4,363 47.4 11.4 3.0 49,660 36.1 9.2
   1.0 - 4.9 1,413 15.4 11.0 1.0 35,376 25.7 9.5
   5.0 - 9.9 474 5.2 12.9 5.5 11,985 8.7 9.3
   10.0 - 25.0 579 6.3 13.9 6.2 21,015 15.3 9.9
   More than 25 428 4.7 14.7 8.6 16,327 11.9 10.7

Percent Group Quarter Residents, 1980
   0.0 - 0.18 3,633 39.5 16.9 6.9 17,978 13.1 11.1
   0.19 - 1.3 2,314 25.1 11.1 3.2 48,224 35.0 10.8
   1.4 - 2.4 1,438 15.6 10.4 4.1 30,050 21.8 8.2
   2.5 - 10.0 1,508 16.4 9.1 3.0 35,545 25.8 8.3
   More than 10.0 308 3.3 21.4 10.9 5,900 4.3 9.5


Table 6. Mean Absolute Percent Errors (MAPEs) and Mean Algebraic Percent Errors (MALPEs) for Selected School Districts, by School District Characteristics: School-age Population, 1980 to 1990


School-age Population
(first N, %N, MAPE, MALPE: unweighted, N = number of school districts; second N, %N, MAPE: weighted by the number of school-age children, N in thousands)

Demographic Characteristic N %N MAPE MALPE N %N MAPE

Total 9,201 100.0% 16.9 4.3 25,336 100.0% 12.0

School District Population, 1980*
   Under 5,000 4,438 48.2 22.8 6.9 1,948 7.7 15.7
   5,000 - 9,999 1,745 19.0 10.9 0.1 2,684 10.6 11.6
   10,000 - 19,999 1,504 16.3 10.8 1.1 4,363 17.2 11.4
   20,000 - 39,999 883 9.6 12.4 4.7 4,723 18.6 11.9
   40,000 or more 631 6.9 12.4 5.0 11,619 45.9 11.8

School District Population, 1990**
   Under 5,000 4,333 47.1 23.0 8.4 1,791 7.1 14.8
   5,000 - 9,999 1,693 18.4 11.1 -0.6 2,412 9.5 10.9
   10,000 - 19,999 1,541 16.7 10.9 0.5 4,147 16.4 10.8
   20,000 - 39,999 920 10.0 12.3 2.9 4,533 17.9 12.1
   40,000 or more 714 7.8 12.4 1.2 12,452 49.1 12.3

Population Growth, 1980-1990
   Decrease of 10% or more 1,946 21.1 32.8 28.8 2,180 8.6 17.1
   Decrease of 5.0 - 9.9% 1,231 13.4 11.5 4.9 2,875 11.3 11.7
   Decrease of 0.1 - 4.9% 1,385 15.1 11.0 4.0 4,131 16.3 10.7
   Increase of 0.0 - 4.9% 1,178 12.8 9.7 1.0 3,590 14.2 9.5
   Increase of 5.0 - 9.9% 917 10.0 10.3 -0.2 2,882 11.4 9.9
   Increase of 10% or more 2,544 27.6 16.2 -11.4 9,678 38.2 13.2

Percent Poor School-age Children, 1980
   Zero 157 1.7 48.4 9.8 19 0.1 28.3
   0.1 - 5.9 1,581 17.2 18.2 6.0 5,792 22.9 15.0
   6.0 - 8.9 1,425 15.5 14.9 2.6 4,302 17.0 11.6
   9.0 - 12.4 1,530 16.6 14.4 1.5 3,871 15.3 11.5
   12.5 - 16.4 1,429 15.5 14.7 3.1 3,580 14.1 9.6
   16.5 - 23.9 1,617 17.6 14.4 3.5 4,274 16.9 11.7
   24 or more 1,462 15.9 21.3 8.4 3,498 13.8 11.3

Percent Poor School-age Children, 1990
   Zero 390 4.2 59.9 38.1 47 0.2 22.3
   0.1 - 5.9 1,616 17.6 14.5 2.0 5,412 21.9 15.0
   6.0 - 8.9 1,119 12.2 12.6 1.2 3,256 12.9 11.8
   9.0 - 12.4 1,322 14.4 12.9 1.3 3,484 13.8 11.0
   12.5 - 16.4 1,297 14.1 12.7 0.4 3,214 12.7 10.8
   16.5 - 23.9 1,674 18.2 14.1 1.5 4,093 16.2 10.3
   24 or more 1,783 19.4 20.9 8.6 5,830 23.0 12.0

Change in Poverty Rate for Children, 1980-1990
   Decrease of 10% or more 724 7.9 31.8 16.4 283 1.1 14.4
   Decrease of 5.0 - 9.9% 882 9.6 16.7 3.6 1,116 4.4 12.4
   Decrease of 0.1 - 4.9% 2,583 28.1 13.7 2.4 7,734 30.5 12.2
   Increase of 0.0 - 4.9% 2,651 28.8 14.2 2.4 9,603 37.9 11.9
   Increase of 5.0 - 9.9% 1,307 14.2 13.6 1.8 4,271 16.9 10.7
   Increase of 10% or more 1,054 11.5 25.3 9.1 2,328 9.2 14.1

Census Division
   New England 873 9.5 15.3 -0.9 1,927 7.6 10.6
   Middle Atlantic 1,453 15.8 11.8 4.2 4,381 17.3 9.9
   South Atlantic 199 2.2 13.5 7.5 1,450 5.7 10.3
   East North Central 1,854 20.1 11.2 1.2 5,536 21.8 9.5
   East South Central 367 4.0 13.7 4.4 1,828 7.2 11.0
   West North Central 1,974 21.5 17.4 2.8 2,476 9.8 11.7
   West South Central 1,390 15.1 20.1 5.9 3,671 14.5 17.9
   Mountain 520 5.7 38.9 21.0 1,651 6.5 13.0
   Pacific 571 6.2 24.1 7.0 2,416 9.5 15.6

Percent Hispanic 1980
   0.0 483 5.2 35.1 9.9 50 0.2 20.4
   0.1 - 0.9 5,282 57.4 12.8 2.2 11,041 43.6 10.2
   1.0 - 4.9 2,313 25.1 19.5 6.5 8,233 32.5 12.4
   5.0 - 9.9 416 4.5 18.5 3.4 2,237 8.8 14.2
   10.0 - 25.0 378 4.1 26.1 12.1 2,378 9.4 14.4
   More than 25 329 3.6 23.9 7.1 1,396 5.5 16.6

Percent Black 1980
   0.0 1,944 21.1 25.4 8.6 688 2.7 12.8
   0.1 - 0.9 4,363 47.4 14.2 1.9 9,501 37.5 12.4
   1.0 - 4.9 1,413 15.4 15.3 3.4 6,295 24.8 12.7
   5.0 - 9.9 474 5.2 14.4 5.3 2,088 8.2 10.9
   10.0 - 25.0 579 6.3 16.0 6.7 3,690 14.6 11.6
   More than 25 428 4.7 15.0 7.2 3,073 12.1 10.7

Percent Group Quarter Residents, 1980
   0.0 - 0.18 3,633 39.5 21.3 6.4 3,538 14.0 13.7
   0.19 - 1.3 2,314 25.1 13.9 3.6 9,212 36.4 13.2
   1.4 - 2.4 1,438 15.6 12.9 2.0 5,546 21.9 10.1
   2.5 - 10.0 1,508 16.4 13.3 1.5 6,167 24.3 11.1
   More than 10.0 308 3.3 22.7 8.5 873 3.4 12


Figures 2 through 11 are pictorial representations of the weighted and unweighted MAPEs for both the total and school-age population, by demographic and economic characteristics.

Size of the School District in 1980 (See Figure 2 and Tables 5 and 6)

Figure 2. Mean Absolute Percent Errors (MAPEs) for Estimates of Total
Population and School-age Population by Size of School District: 1980

Size of the School District in 1990 (See Figure 3 and Tables 5 and 6)

Figure 3. Mean Absolute Percent Errors (MAPEs) for Estimates of Total
Population and School-age Population by Size of School District: 1990

Population Growth, 1980-1990 (See Figure 4 and Tables 5 and 6)

Figure 4. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population
and School-age Population by Population Growth: 1980-1990

Percent Poor School-age Children in 1980 (See Figure 5 and Tables 5 and 6)

Figure 5. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population
and School-age Population by Percent Poor School-age Children: 1980

Percent Poor School-age Children in 1990 (See Figure 6 and Tables 5 and 6)

Figure 6. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population
and School-age Population by Percent Poor School-age Children: 1990

Numerical Change in Poverty Rate for Children, 1980-1990 (See Figure 7 and Tables 5 and 6)

Figure 7. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population and
School-age Population by Change in Poverty Rate for Children: 1980-1990

Census Division (See Figure 8 and Tables 5 and 6)

Figure 8. Mean Absolute Percent Errors (MAPEs) for Estimates of Total
Population and School-age Population by Census Division: 1980

Percent Hispanic in 1980 (See Figure 9 and Tables 5 and 6)

Figure 9. Mean Absolute Percent Errors (MAPEs) for Estimates of Total
Population and School-age Population by Percent Hispanic: 1980

Percent Black in 1980 (See Figure 10 and Tables 5 and 6)

Figure 10. Mean Absolute Percent Errors (MAPEs) for Estimates of Total
Population and School-age Population by Percent Black: 1980

Percent Group Quarters in 1980 (See Figure 11 and Tables 5 and 6)

Figure 11. Mean Absolute Percent Errors (MAPEs) for Estimates of Total Population and
School-age Population by Percent Group Quarters (GQ): 1980

V. CONCLUSIONS AND PLANS TO IMPROVE SCHOOL DISTRICT ESTIMATES

This paper evaluated the 1990 school district level population estimates developed by the synthetic ratio approach. For both the total population and the school-age population (ages 5-17), four sets of synthetic estimates were produced: (1) the 1990 county estimates-based estimates; (2) the 1990 county count-based estimates; (3) state growth-based estimates; and (4) national growth-based estimates. To evaluate the estimates, we used both the Mean Absolute Percent Error (MAPE) and the Mean Algebraic Percent Error (MALPE). We examined the variations in the MAPEs and MALPEs by selected demographic and economic characteristics.
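The two evaluation statistics can be illustrated with a minimal sketch. It is not the appendix's notation or the Census Bureau's code: the function names, the choice to weight by the census count, the rule of skipping districts with a zero count, and the three sample districts are all assumptions made for illustration.

```python
def percent_errors(estimates, counts):
    """Pair each signed percent error with its census count.

    Assumption: districts with a zero census count are skipped,
    because the percent error is undefined for them.
    """
    return [(100.0 * (est - cnt) / cnt, cnt)
            for est, cnt in zip(estimates, counts) if cnt != 0]


def malpe(estimates, counts):
    """Mean Algebraic Percent Error: the average of the signed percent errors."""
    errors = percent_errors(estimates, counts)
    return sum(pe for pe, _ in errors) / len(errors)


def mape(estimates, counts, weighted=False):
    """Mean Absolute Percent Error, optionally weighted by the census count."""
    errors = percent_errors(estimates, counts)
    if weighted:
        total = sum(cnt for _, cnt in errors)
        return sum(abs(pe) * cnt for pe, cnt in errors) / total
    return sum(abs(pe) for pe, _ in errors) / len(errors)


# Hypothetical districts: two small ones that miss by 10 percent in
# opposite directions and one large one that is estimated exactly.
estimates = [110, 90, 1000]
counts = [100, 100, 1000]

print("Unweighted MAPE:", mape(estimates, counts))
print("Weighted MAPE:  ", mape(estimates, counts, weighted=True))
print("MALPE:          ", malpe(estimates, counts))
```

In this made-up example the overestimate and the underestimate cancel in the MALPE but not in the MAPE, and weighting by population lowers the MAPE because the large district is estimated well.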

To summarize, the state growth-based and national growth-based models produced the least accurate estimates. They are feasible alternatives, but a school district's growth rate is unlikely to match the state's or the nation's. The county count-based and the county estimates-based models performed similarly, although the former provided more accurate estimates than the latter. The differences were especially apparent for small school districts, districts with high and low poverty rates, and districts with high and low growth rates. However, the county count-based estimates can be produced only for census years. Therefore, if we must rely on synthetic estimates, we need to use the county estimates-based model.
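The contrast between the share-based and growth-based models can be sketched as follows. This is a hedged illustration, not the paper's estimator: it assumes the share-based models hold a district's 1980 share of its county constant and apply it to a 1990 county total (an estimate for Set 1, a census count for Set 2), and that the growth-based models apply the state or national 1980-1990 growth ratio to the district's 1980 population. The function names and all figures are hypothetical.

```python
def share_based_estimate(district_1980, county_1980, county_1990):
    """Apply the district's 1980 share of its county to a 1990 county total.

    Assumption: a district with no 1980 population gets a 1990 estimate
    of zero, since its share of the county is zero.
    """
    if county_1980 == 0:
        return 0.0
    return district_1980 * county_1990 / county_1980


def growth_based_estimate(district_1980, area_1980, area_1990):
    """Apply a larger area's (state or nation) growth ratio to the district."""
    return district_1980 * area_1990 / area_1980


# Hypothetical figures: a district of 2,000 people in a county that grew
# from 10,000 to 12,000, within a state that grew from 1.00 to 1.05 million.
district_1980 = 2000
print(share_based_estimate(district_1980, 10_000, 12_000))
print(growth_based_estimate(district_1980, 1_000_000, 1_050_000))
```

With these figures the share-based estimate follows the county's 20 percent growth, while the state growth-based estimate assumes only 5 percent growth. A gap of this kind is one way the growth-based models can drift from local conditions.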

What are our plans to improve the school district estimates?

The Census Bureau is required to produce school district level population estimates for SY 1995/1996 and every two years thereafter. For SY 1995/1996 and SY 1997/1998, the synthetic estimates were based on data from the 1990 census and updated county estimates thereafter. For SY 1999/2000, we will use the Census 2000 data.

However, for post-2000 school district estimates, we plan to conduct further research to improve the estimates. These research plans include:


1 See http://www.census.gov/hhes/www/saipe/schooltoc.html for the documentation.

2 Special tabulation by the U.S. Census Bureau.

3 See http://www.census.gov/popest/archives/methodology/90s-st-co-meth.txt for the methodology.

4 Most school districts cover the grade range of K-12. These are known as unified school districts. A non-unified school district does not cover grades K-12 but instead covers elementary, middle, or high school grades. If a school district is not unified across the decade, it is not possible to determine whether the grades the district includes are the same across time (U.S. Department of Education, 1999).

5 We assumed the school district boundaries did not change if the identification number did not change over the decade. This assumption may not always be correct because the state did not always assign new IDs when land was annexed over the decade, political boundaries changed, etc. (U.S. Department of Education, 1999).

6 See Appendix for the formulas for school district estimators and evaluation statistics for the models. The appendix includes references to both population and poverty estimates. There are some slight differences in the terminology. Our text refers to MALPE whereas the appendix refers to MALP. Additionally, Model-based refers to our Set 1, census county-based refers to our Set 2, and the naive-based refers to our growth-based estimates. Thanks to William R. Bell for providing the statistical explanation for the computations (U.S. Census Bureau, 1998).

7 See National Research Council, 1998.

8 We do not discuss weighted MALPEs because the sum of the MALPEs for each economic or demographic characteristic would equal zero if all of the school districts in each county were represented in our sample. The weighted MALPEs are therefore meaningless to the analysis.

9 The unweighted number of school districts in each category of the demographic and economic characteristics remains the same across Table 5 and Table 6. This is because the demographic and economic categories (e.g., Size of the School District in 1980 or Percent Poor School-age Children in 1980) were defined based on the characteristics of the total population in a school district. For example, if the total population in a school district is 9,000 and the school-age population in the district is 4,500, the school district falls into the 5,000 - 9,999 school district population category. In Table 5, the total population is determined by weighting the number of school districts by the population in each school district. In Table 6, we determined the school-age population by weighting the number of school districts by the number of school-age children in each school district.

10 The findings in the last two bullets above are consistent with the findings shown in Table 2, in that about one half of all school districts have fewer than 5,000 people. The difference is that Table 2 is based on the total number of school districts as of 1989-1990 (15,226 school districts), whereas the evaluation universe is based on 9,201 districts.

11 Special tabulation by the U.S. Census Bureau.

12 Special tabulation by the U.S. Census Bureau.

13 When there were no related poor school-age children in 1980, the shares methodology predicted that the percentage of children in poverty in 1990 would be zero as well. These situations occur in very small school districts. As a result, the predictions are not accurate, and there is a high degree of error between the predictions and the truth.

14 When there are no children in poverty, the percent difference (for the school district) is undefined and excluded from our tabulations. Even with the missing values removed, smaller school districts continue to contribute disproportionately to the high MAPEs.

15 Special tabulation by the U.S. Census Bureau.

16 Special tabulation by the U.S. Census Bureau.


REFERENCES

National Research Council, 1998. Small-Area Estimates of Children in Poverty, Interim Report 2, Evaluation of Revised 1993 County Estimates for Title I Allocations. Panel on Estimates of Poverty for Small Geographic Areas, C.F. Citro, M.L. Cohen, and G. Kalton, eds., Committee on National Statistics. Washington, D.C.: National Academy Press.

U.S. Census Bureau, 1997. Table presented to the National Academy of Sciences, Panel on Estimates of Poverty for Small Geographic Areas, Sixth Plenary Meeting, November 4, 1997.

U.S. Census Bureau, 1998. Appendix provided to the National Academy of Sciences, Panel on Estimates of Poverty for Small Geographic Areas, Ninth Plenary Meeting, October 2-3, 1998.

U.S. Department of Education, 1999. National Center for Education Statistics. Classification Evaluation of the 1994-95 Common Core of Data: Public Elementary/Secondary Education Agency Universe Survey, NCES 1999-316, by Stephen Owens. Project Officer: Beth Young. Washington, DC.


APPENDIX

Formulas for School District Estimators and Evaluation Statistics