The Census Bureau's latest series of state population projections for 1995-2025 was prepared in 1995 and released in 1996. This paper examines the performance of this series of projections during its first five years. Using the census 2000 counts and estimated births, deaths, domestic migration, and international migration from administrative records, this paper examines the accuracy of projected total population and projected components of change for the 50 states and the District of Columbia. The paper also examines the historical trend of projection accuracy and the geographic variation of projection accuracy by U.S. regions and subdivisions. A multiple regression analysis is used to analyze the relative impact of errors in the projected components of change, errors in state estimates, and 1990 census undercount on the accuracy of the latest state population projections. A discussion of the accuracy of national projections is also included.
We found that the latest series of state population projections is more accurate than previous projection series. The projections continue to perform poorly in the West. The percent errors in domestic migration continue to be the highest among all projected components of change, followed by international migration. The projected births had the lowest average percent errors.
The results from the multiple regression analysis show that the percent errors in the projected births had the largest impact on the accuracy of projections, followed by international migration and projected deaths. The percent errors in domestic migration cannot explain the variation of projection accuracy among the 50 states and the District of Columbia. When the 1990 census undercount and the accuracy of state estimates were taken into account, the percent errors in state estimates explained most of the errors in state projections, followed by the census undercount, and the direct impacts of the percent errors in all projected components of change were reduced. The errors in state estimates were also found to be correlated with the 1990 census undercount rates. Thus, it is concluded that the 1990 census undercount is responsible for a large proportion of the errors in state estimates, which, in turn, affect the accuracy of state projections. In addition, the national projections that were used to control the state projections were also affected by the census undercount and the accuracy of the national estimates.
This paper reflects the results of research undertaken by Census Bureau staff. It has undergone a more limited review than official Census Bureau publications. This paper has been prepared to inform interested parties of on-going research and to encourage discussion.
|Table 1:||Mean Absolute Percent Errors (MAPE), Mean Algebraic Percent Errors (MALPE) of State Population Projections for 2000 as Compared with Census 2000|
|Table 2:||Mean Absolute Percent Errors of Various State Population Projections Series for 5 Years Ahead by Region|
|Table 3:||Mean Absolute Percent Errors (MAPE), Mean Algebraic Percent Errors (MALPE) of State Projections for 2000 Adjusted for 1990 Census Net Undercount|
|Table 4:||Difference between State Estimates for 2000 and Census 2000 Counts, and Mean Absolute Percentage Errors of State Estimates|
|Table 5:||Mean Absolute Percent Errors of Projected Components of Change as Compared with the Estimated Components of Change between July 1, 1995 and July 1, 2000|
|Table 6:||Correlation Matrix of Absolute Percent Projection Errors and Independent Variables|
|Table 7:||Standardized Regression Coefficients of Independent Variables on Absolute Percent Error of State Projections|
|Table 8:||U.S. Population Projections, Census 2000 Count, and Vintage 2000 Estimates: April 1, 2000|
|Table 9:||Projected and Estimated Components of Change of the U.S. Population, 1999 and 2000|
|Figure 1:||Percent Difference Between State Projections for 2000 and Census 2000 Counts, Series A and B, Ranked by Series A (Projections - Census)|
|Figure 2:||Percent Difference Between State Projections for 2000 and Census 2000 Counts, with and without 1990 Census Undercount Adjustment - Series A|
|Figure 3:||Percent Difference Between State Projections for 2000 and Census 2000 Counts, with and without 1990 Census Undercount Adjustment - Series B|
|Figure 4:||Percent Difference Between State Estimates for 2000 and Census 2000 Counts, with 1990 Census as Enumerated and with Undercount Adjusted Base (Estimates - Census)|
|Figure 5:||Mean Absolute Percent Difference of Projected and Estimated Components of Change between 1995 and 2000 in 50 States and D.C.|
|Figure 6:||Absolute Percent Errors of Projected Domestic Migration between 1995 and 2000 and Absolute Percent Errors of State Projections for 2000 (Series A)|
|Appendix A:||Comparison between State Population Projections for 2000 and Census 2000 Population - Regions, Divisions, and States|
|Appendix B:||1990 Census Undercount Rates and State Population Projections for 2000 with and without Undercount Adjustment - Regions, Divisions, and States|
|Appendix C:||Comparison between Estimated State Population for 2000 and Census 2000 Population - Regions, Divisions, and States|
|Appendix D:||1990 Census Undercount Adjusted Population, Estimated Components of Change between 1990 and 2000, and Difference between the Estimates for 2000 and Census 2000 Population - Regions, Divisions, and States|
|Appendix E:||Projected and Estimated Components of Change between July 1, 1995 and July 1, 2000 - Regions, Divisions, and States (Series A)|
|Appendix F:||Projected and Estimated Components of Change between July 1, 1995 and July 1, 2000 - Regions, Divisions, and States (Series B)|
|Appendix G:||Ranking of Absolute Percentage Errors of Projected Components of Change between July 1, 1995 and July 1, 2000 by State - Series A|
|Appendix H:||Ranking of Absolute Percentage Errors of Projected Components of Change between July 1, 1995 and July 1, 2000 by State - Series B|
The purpose of the paper is to evaluate the Census Bureau's latest series of state population projections for the years 1995-2025 (Campbell, 1996b, PPL-47). Based on the census 2000 results, this paper examines the performance of the projections for only the first five years. Using the census 2000 counts and estimated births, deaths, domestic migration, and international migration from administrative records, this paper examines the accuracy of projected total population and projected components of change for the 50 states and the District of Columbia. This paper also examines the historical trend and regional differences in projection accuracy. A multiple regression analysis is also used to analyze the relative impact of errors in the projected components of change, errors in state estimates, and 1990 census undercounts on the accuracy of the state population projections. This study will provide information about the accuracy of the projections and the sources of error for use in improving current projection models and procedures.
The paper begins with an overview of the methodology of the state projections, followed by a discussion of potential factors affecting the accuracy of the projections. Then it presents a comparison of projected 2000 state population and the census 2000 count with and without adjustment for census undercount. The assessment of the accuracy of state estimates against the census 2000 is also made because the estimates were used for the starting base year population for the projections. The projected components of change - births, deaths, domestic migration, and international migration - between 1995 and 2000 are compared with the most recent estimates of component change in the same period based on the administrative records data compiled in the Census Bureau's population estimates program.
Then, the paper presents the relationships between the factors affecting the accuracy of projections in a multiple regression analysis to demonstrate the proportion of errors explained by these factors collectively and independently. The analysis provides information about the relative importance of each factor affecting the accuracy of state projections while holding other factors constant. In addition, the accuracy of the national projections is also discussed to demonstrate the dependency of state projections on the accuracy of the national projections.
The cohort survival component method is used by the Census Bureau to prepare the state population projections. The components of population change - births, deaths, and migration - are projected separately. The method requires separate projection assumptions for each birth cohort by single year of age, sex, race, and Hispanic origin. The race and Hispanic origin groups were non-Hispanic White; non-Hispanic Black; non-Hispanic American Indian, Eskimo, and Aleut; non-Hispanic Asian and Pacific Islander; Hispanic White; Hispanic Black; Hispanic American Indian, Eskimo, and Aleut; and Hispanic Asian and Pacific Islander. The detailed components for the projections and assumptions were derived from vital statistics, administrative records, 1990 census data, state population estimates, and the middle series of the national population projections (P25-1130, 1996).
The cohort component method used to produce the projections for every year from 1995 to 2025 is based on the following formula:
P1 = P0 + B - D + DIM - DOM + IIM - IOM
P1 = population at the end of the period
P0 = Population at the beginning of the period
B = births during the period
D = deaths during the period
DIM = domestic in-migration during the period
DOM = domestic out-migration during the period
IIM = international in-migration during the period
IOM = international out-migration during the period
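The accounting identity above can be written as a one-line function. The following is only a minimal sketch; the function name and the input figures are hypothetical and are not taken from the projections themselves.

```python
def end_of_period_population(p0, births, deaths, dom_in, dom_out, intl_in, intl_out):
    """Apply P1 = P0 + B - D + DIM - DOM + IIM - IOM for one period."""
    return p0 + births - deaths + dom_in - dom_out + intl_in - intl_out


# Hypothetical inputs for a single state and a single year.
p1 = end_of_period_population(
    p0=1_000_000,
    births=15_000,
    deaths=9_000,
    dom_in=40_000,
    dom_out=35_000,
    intl_in=6_000,
    intl_out=1_000,
)
print(p1)
```

In the actual projections this accounting is applied separately to each age, sex, and race/Hispanic origin group rather than to a state total.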
The 1990 census base population estimates for 1994 were used as the starting base population to launch the projections. The first projected 1995 results were later adjusted to agree with the 1995 state population estimates when they became available. First, survival rates were used to survive each age-sex-race/Hispanic group forward one year. Then the state-to-state migration rates were applied to the survived population in each state. The projected out-migrants were subtracted from the state of origin and added to the state of destination as in-migrants. Then the immigrants from abroad were added to each group, while emigrants were subtracted. The population under one year of age was created by applying age-race/Hispanic specific birth rates to females of childbearing age. The number of births by sex and race/Hispanic origin were survived forward and exposed to the migration rates to derive the population under one year of age. The results of each age group were adjusted to agree with the national population projections by single year of age, sex, and race/Hispanic origin.
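The sequence of steps described above (survive, move domestic migrants between states, add and subtract international migrants, then control to the national total) can be sketched for a single age-sex-race/Hispanic group as follows. This is an illustrative simplification, not the Census Bureau's program: the function names, the two-state data, and all rates are hypothetical, the births step is omitted, and proportional scaling is only one assumed way of making the state results agree with a national control total.

```python
def project_one_year(pop, survival, migration_rates, immigrants, emigrants):
    """Project one demographic group forward one year in every state.

    pop             -- dict: state -> population at the start of the year
    survival        -- dict: state -> one-year survival rate
    migration_rates -- dict: (origin, destination) -> out-migration rate
    immigrants      -- dict: state -> immigrants from abroad
    emigrants       -- dict: state -> emigrants to abroad
    """
    # Step 1: survive the group forward one year.
    survived = {s: pop[s] * survival[s] for s in pop}

    # Step 2: apply state-to-state rates to the survived population;
    # out-migrants leave the origin and arrive in the destination.
    result = dict(survived)
    for (origin, dest), rate in migration_rates.items():
        movers = survived[origin] * rate
        result[origin] -= movers
        result[dest] += movers

    # Step 3: add immigrants from abroad and subtract emigrants.
    for s in result:
        result[s] += immigrants.get(s, 0.0) - emigrants.get(s, 0.0)
    return result


def control_to_national(state_pop, national_total):
    """Scale state values proportionally so they sum to the national total
    (an assumed adjustment rule, used here only for illustration)."""
    factor = national_total / sum(state_pop.values())
    return {s: v * factor for s, v in state_pop.items()}


# Hypothetical two-state example.
projected = project_one_year(
    pop={"A": 1000.0, "B": 2000.0},
    survival={"A": 0.99, "B": 0.98},
    migration_rates={("A", "B"): 0.02, ("B", "A"): 0.01},
    immigrants={"A": 5.0},
    emigrants={"B": 2.0},
)
controlled = control_to_national(projected, national_total=3000.0)
```

Domestic migration only moves people between states, so it leaves the sum over states unchanged; only survival and international migration change the total before the national control is applied.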
Two sets of state population projections were prepared based on different models used in projecting the domestic migration component. The migration trend data used in both projections were based on state-to-state migration flows extracted from annual matches of Internal Revenue Service (IRS) individual income tax returns. The data contain 19 observations from 1975-76 to 1993-94 on each of the 2,550 state-to-state migration flows (51 x 50 matrix). Two models were used to project these migration flows into the future:
(1) Series A used a time series model - regression of changes in the natural logarithms of the migration rates. The first five years of the projections used the time series projections exclusively. The next ten years of projections were interpolated from the time series projections toward the mean of the series. The final 15 years used the series mean exclusively.
(2) Series B is an economic model. Changes in state-to-state migration rates were derived from the relationship between changes in the migration rates and Bureau of Economic Analysis projected changes in employment in the origin and the destination states. Detailed assumptions and procedures used in the projections are described in the Census Bureau's report, PPL-47 (Campbell, 1996b).
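The three-phase schedule described for Series A in item (1) can be illustrated with a simple weighting function. The sketch below assumes a linear interpolation between the time series value and the series mean during projection years 6 through 15; the source says only that these years were "interpolated," so the linear form, the function names, and the example rates are all assumptions.

```python
def time_series_weight(projection_year):
    """Weight placed on the time series projection in projection year 1..30.

    Years 1-5 use the time series exclusively (weight 1), years 16-30 use
    the series mean exclusively (weight 0), and years 6-15 are assumed to
    fall on a straight line between those two end points.
    """
    if projection_year <= 5:
        return 1.0
    if projection_year >= 16:
        return 0.0
    return (16 - projection_year) / 11


def blended_migration_rate(time_series_rate, mean_rate, projection_year):
    """Combine the time series rate and the series mean for one year."""
    w = time_series_weight(projection_year)
    return w * time_series_rate + (1 - w) * mean_rate


# Hypothetical rates: the blended value drifts from 0.030 toward 0.020.
for year in (1, 5, 10, 16, 30):
    print(year, round(blended_migration_rate(0.030, 0.020, year), 4))
```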
Based on the methodology of the state population projections, several factors need to be considered in order to evaluate the accuracy of the projections.
To assess the accuracy of the projections, most studies compare the projections with the census count for the census year or with the most recent population estimates available for the inter-censal or post-censal years (Smith and Sincich, 1990, 1992; Wetrogan and Campbell, 1990; Campbell, 1996a, 1997). Changes in net undercount between the two censuses affect the validity of the measurement of projection accuracy. According to the Accuracy, Coverage, and Evaluation (ACE) survey and the Demographic Analysis (DA) by the Census Bureau, the net undercount rates in the census 2000 are significantly lower than in the 1990 census (Robinson, et al., 2001). Based on the Post Enumeration Survey, the 1990 census had a net national undercount of 1.6 percent, while the net undercount rate for the census 2000 was reduced to 0.06 percent, based on a similar quality-check survey (Census 2000 Initiative, 2001). Therefore, we would expect that the projected 2000 population based on the 1990 census would understate the 2000 population as compared with the census 2000 counts.
The 1995-2025 state projections were based on July 1, 1994 state population estimates as the first base year population and then the first projection year was adjusted to agree with the 1995 state population estimates. The accuracy of the state population estimates definitely affects the base year population for projections. To assess the accuracy of the projections, we also need to examine the accuracy of the state estimates against the census 2000 population.
The state population estimates were derived from the Census Bureau's annual county estimates based on a component of change method. To derive natural increase, the Census Bureau uses vital statistics (births and deaths) collected from the National Center for Health Statistics and state agencies in the Federal-State Cooperative Program for Population Estimates (FSCPE). For the migration component, the Census Bureau uses annual matches of extracts of IRS individual income tax returns to derive migration rates for the population under 65 in each state and county. Immigration data from the Immigration and Naturalization Service (INS) were used to derive the number of legal immigrants by state of intended residence. The Census Bureau also estimates the number of residual foreign born and emigrants for the states. In addition, data on the movement of the federal civilian population were used as another component of change for state estimates. Medicare enrollments were used to estimate the population 65 and over. Finally, the county estimates were controlled to the national estimates and the state estimates were derived.
Since the state estimates are derived from the component method by adding the components of change to the base year population, the accuracy of state estimates depends on the accuracy of each component of change and of the census population count as well. In other words, assessing the accuracy of the estimates also faces the problem of different net undercount rates in the two censuses. Since the net undercount rates in the census 2000 are lower than the rates in the 1990 census, we should expect the estimated population for 2000 to be under-estimated overall.
Since the state projections are derived from the demographic accounting of births, deaths, domestic migration, and international migration, the quality of input data and methodologies for deriving projection assumptions for each component will definitely affect the accuracy of the projections. To assess the accuracy of the projected components of change, we use the most recent available statistics compiled by the Census Bureau for the Population Estimates Program between 1995 and 2000.
The accuracy of the projected components of change is affected by the input data, selection of the starting point of various rates used in the projections, and the statistical models used in projecting each component. Instead of examining the procedures to derive these components, this paper is limited to the comparison of the projected total births, deaths, net domestic migration, and net international migration with current statistics.
The results of the state population projections were controlled to agree with the most recent national population projections as the final stage of the procedure. The accuracy of the national projections will therefore affect the accuracy of the state projections. For example, the national projections to which the current series of state projections were controlled showed 274.0 million people in 2000, while the census 2000 showed 281.4 million. A difference of 7.4 million between the projected national population and the census count will definitely affect the accuracy of the state projections when the state projections are controlled to agree with the national projections.
The projected 2000 population in the most recent series of national projections (working paper #38) also shows a significantly lower projected population than the census 2000 count. It is due primarily to a higher undercount rate in the 1990 census than in the census 2000. A brief note of evaluation of the national population projections is presented at the end of the paper.
To assess the impact of national population projections on the state projections, it is necessary to compare the projections with and without national controls. However, this paper will limit itself to discussion of the accuracy of the projected national population to infer its impact on the accuracy of the state projections.
Most projections are based on the assumption that population change can be predicted if current or historical demographic trends continue in the future. However, this is not always the case. Therefore, we can anticipate that the projections for areas which experience dramatic socioeconomic changes will not be as accurate as those for areas with stable socioeconomic conditions. The population change between 1990 and 2000 can be used to measure whether or not a state has experienced dramatic changes.
In addition, previous studies also indicate that population size affects the accuracy of population projections. This is mainly due to the relationship between the so-called "true demographic rates" (fertility, mortality, and migration rates) and population size. Since the detailed demographic rates by age, sex, and race for small states are likely to have small numbers in many cells or many empty cells, these rates for smaller population bases will be unstable.
The paper uses two measures to evaluate the accuracy and bias of the projections. To measure accuracy and bias, we need a "true population" to compare for the same year. Normally, the decennial census count and inter-censal estimates are used as the "true population." Due to undercount and coverage issues, there may not actually be a "true population." Therefore, the measurement of accuracy should be considered as an approximation.
The most commonly used measurement of accuracy of the projections is Mean Absolute Percent Error (MAPE), which is the average error when the direction of error (positive or negative) is ignored. The measurement indicates the magnitude of the errors among a specific number of geographic units. The formula for the MAPE is:
MAPE = (Sum(|projection - census|/census*100))/n
Where, n is the number of states. MAPEs are calculated for the United States (the states and the District of Columbia), where n is 51, and for each census region or division, where n equals the number of states in each region or division. This is used as a measure of accuracy of forecast or projections (Smith and Sincich, 1990, 1992).
The second measure is Mean Algebraic Percent Error (MALPE), which takes into account the direction of error. It has been used as a measure of forecast bias, whether under-projected or over-projected (Smith and Sincich, 1990, 1992). The formula for the MALPE is:
MALPE = (Sum((projection - census)/census *100))/n
It has been argued that the MAPE overstates the error of projections or estimates because a few extreme outliers would make the average (arithmetic mean) higher than is typical (Tayman and Swanson, 1999; Tayman, Swanson, and Barr, 1999; Swanson, Tayman, and Barr, 2000). However, in order to compare the results with previous studies that used MAPEs, and to compare errors across different variables, this study uses the MAPE to discuss the accuracy of the projections.
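The two measures defined above can be computed in a few lines. The sketch below follows the MAPE and MALPE formulas as given; the function names and the three-state input values are hypothetical and chosen only to show how errors of opposite sign cancel in the MALPE but not in the MAPE.

```python
def percent_errors(projections, census_counts):
    """Percent error for each state: (projection - census) / census * 100."""
    return [(p - c) / c * 100 for p, c in zip(projections, census_counts)]


def mape(projections, census_counts):
    """Mean Absolute Percent Error: direction of error is ignored."""
    errors = percent_errors(projections, census_counts)
    return sum(abs(e) for e in errors) / len(errors)


def malpe(projections, census_counts):
    """Mean Algebraic Percent Error: direction of error is kept."""
    errors = percent_errors(projections, census_counts)
    return sum(errors) / len(errors)


# Hypothetical values for three states (percent errors of -2, +3, and -1).
projections = [980.0, 1030.0, 495.0]
census_counts = [1000.0, 1000.0, 500.0]

print(round(mape(projections, census_counts), 2))
print(round(malpe(projections, census_counts), 2))
```

With these inputs the absolute errors average to about 2 percent while the signed errors average to about zero, which is why the MAPE is read as a measure of accuracy and the MALPE as a measure of bias.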
In addition, because the state projections were prepared as of July 1 for each year, it is necessary to develop an April 1, 2000 projection to compare with the census 2000. The July 1, 2000 projections are converted to April 1, 2000 based on the following formula:
P2000(4/1) = P1999 * (P2000/P1999)^(9/12)
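The conversion moves the population nine months along a constant-growth path between two consecutive mid-year values. A minimal sketch follows, assuming that P1999 and P2000 denote the July 1, 1999 and July 1, 2000 projections; the function name and the example populations are hypothetical.

```python
def april_1_projection(p_july_1999, p_july_2000):
    """Geometric interpolation from July 1, 1999 to April 1, 2000.

    April 1, 2000 falls 9 months into the 12-month interval, so the
    annual growth ratio is raised to the power 9/12.
    """
    return p_july_1999 * (p_july_2000 / p_july_1999) ** (9 / 12)


# Hypothetical state growing by 1 percent over the year.
p_april = april_1_projection(1_000_000, 1_010_000)
print(round(p_april))
```

Applying the remaining three months of growth at the same rate, that is multiplying the result by the annual ratio raised to the power 3/12, returns the July 1, 2000 value.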
As shown in Table 1, Series A of the state projections produced a mean absolute percent error (MAPE) of 2.6 and Series B had a slightly lower MAPE of 2.4. The Mean Algebraic Percent Error (MALPE) was -1.4 percent for Series A and -1.7 percent for Series B. This indicates a general tendency for the two series to under-project the state populations, as expected given the higher undercount rates in the 1990 census. The projected 2000 population exceeds the census 2000 count in only 10 states for Series A and in only 9 states for Series B (see Appendix A).
|Region and Division||Number of States||Series A MAPE||Series A MALPE||Series B MAPE||Series B MALPE|
|East North Central||5||1.54||-1.54||1.36||-1.36|
|West North Central||7||1.60||-0.01||1.43||-0.30|
|East South Central||4||0.89||-0.89||0.87||-0.87|
|West South Central||4||2.29||-2.29||2.22||-2.22|
Source: Campbell, Paul R. "Population Projections for States by Age, Sex, Race, and Hispanic Origin: 1995 to 2025," PPL-47, U.S. Census Bureau, Population Division, October, 1996
The MAPEs for Series A show that the projections are more accurate in the Midwest (1.6%) and less accurate in the West (3.8%). Most of the states in the Midwest had percent errors below 2.6 percent. Only Illinois had a percent error of over three percent (3.1%).
The MAPEs for the West vary dramatically from state to state. Generally, the projections are less accurate in Mountain states (MAPE of 4.4 percent) with a wide range of levels of accuracy -- from 7.0 percent for Arizona and 7.6 percent for Nevada to -1.8 percent for New Mexico and -1.7 percent for Utah. The percent errors in the Pacific states also vary significantly ranging from -4.2 percent for California to -1.0 percent for Washington. (See Series A in Appendix A)
The MAPE for the South is about the same level of accuracy as the average of 50 states and the District of Columbia (2.6%). However, the percent errors in the South also vary dramatically from state to state. The MAPE for the South Atlantic division is higher (3.5%) than other divisions in the South, while the MAPE for the East South Central division is significantly lower than other divisions with an MAPE of 0.9 percent.
The percent errors of the projections for states in the Northeast region also vary widely, from -4.9 percent for Rhode Island to -0.7 percent for Pennsylvania and 1.1 percent for Vermont. However, the variation of percent errors in the Northeast states is much smaller than in the West and South.
The MAPEs for the Series B projections show a pattern of variation among the four regions and the states similar to that of Series A. Generally, the percentage errors of Series B are very close to those of Series A (see Figure 1).
Despite the errors just described, the current set of projections tends to be more accurate than the earlier projections produced before the 1990s. According to Smith and Sincich (1992), the MAPEs for the Census Bureau's state projections after 5 years ranged from 3.1 to 5.0 percent for earlier versions of the projections (1955 through 1980). Wetrogan and Campbell (1990) analyzed the Census Bureau's previous series of state projections from 1965 (P25-375) to 1980 (P25-937) and found that the MAPEs for the first five years of projections ranged from 3.0 to 5.2 percent.
To extend this comparison to the series of projections after 1980, the MAPEs for the 1986 Series (P25-1017), 1988 Series (P25-1053), and 1992 Series (P25-1111) were calculated for comparison with the current series. As shown in Table 2, the overall accuracy of the state population projections has improved since the 1986 Series (P25-1017), which had an MAPE of 2.6. The first projections series after 1990 (P25-1111) was even more impressive, with an MAPE of 1.6 for Series A for the first 5 years. The MAPE for the latest series, PPL-47, then returned to 2.6, the same level as the two previous series from the late 1980s.
|Projection Reports||P25-375||P25-477||P25-796||P25-937||P25-1017||P25-1053||P25-1111||Current PPL-47|
|Jump off Year||1965||1970||1975||1980||1986||1988||1992||1995|
|Evaluation Year||1970||1975||1980||1985||1991||1993||1997||2000||2000||Revised **|
|Evaluation Yr Data||Census||Estimates||Census||Estimates||Estimates||Estimates||Estimates||Census||Estimates*||Estimates|
|Series I-D||Series I-E||Series II-A||Series A||Series A||Series A||Series A||Series A|
|Series II-D||Series II-B||Series B||Series B||Series B||Series B||Series B|
|Series C||Series C|
Wetrogan, Signe I., 1988, "Projections of the Population of States by Age, Sex, and Race: 1988 to 2010," U.S. Census Bureau, Current Population Reports, Series P25-1017, U.S. Government Printing Office, Washington, D.C.
Wetrogan, Signe I., 1990, "Projections of the Population of States by Age, Sex, and Race: 1989 to 2020," U.S. Census Bureau, Current Population Reports, Series P25-1053, U.S. Government Printing Office, Washington, D.C.
Wetrogan, Signe I. and Paul R. Campbell, 1990, "Evaluation of State Population Projections," presented at the Population Association of America Annual Meetings, Toronto, Canada, May 3-5.
Campbell, Paul R., 1994, "Population Projections for States, by Age, Sex, Race, and Hispanic Origin: 1993 to 2020," U.S. Census Bureau, Current Population Reports, P25-1111, U.S. Government Printing Office, Washington, D.C.
Campbell, Paul R., 1996, "Population Projections for States, by Age, Sex, Race, and Hispanic Origin: 1995 to 2025," U.S. Census Bureau, Population Division Working Papers, PPL-47.
*The 2000 state population estimates as of 4/1/2000 are obtained from the Population Estimates Program with extrapolation.
**The revised 2000 state estimates are obtained from the Population Estimates Program with interpolation to 4/1 from 7/1/2000.
At first glance, the current projections series appears to perform worse than the 1992 series (MAPE of 2.6 vs. 1.6). This is misleading, because the MAPE for the 1992 series was evaluated against the 1997 estimates for its first five years, while the MAPE for the current series is evaluated against the census 2000. The 1997 state population estimates were consistent with the 1990 census, which had higher rates of undercount than did Census 2000. Comparisons based on different census bases are not valid.
If we use the same series of state estimates extrapolated from 1999 to 2000, instead of census 2000 counts, to calculate the MAPE for the current series, the MAPE falls to 1.5, slightly lower than that of the previous series (see the second-to-last column of Table 2). However, if we use the 1990 census based 2000 estimates (revised by the Census Bureau's estimates program), interpolated from 1999 estimates, the MAPE for Series A of the current series increases to 1.7 (see the last column of Table 2). Nevertheless, the MAPEs for Series B based on the revised estimates series are still lower than those of the previous series. Therefore, we can conclude that the projections series after 1990 are generally better than the earlier series.
Table 2 also shows that the state projections continue to do poorly in the West as compared with other regions, no matter which series of projections is examined. The projection errors for the Midwest states have been very stable, within the range of 1.0 to 1.8, since the 1975 projections series. The projections for the Northeast have improved over time, but the South has had the smallest MAPEs since the 1988 projections series when estimates are used to measure the accuracy.
As mentioned above, the census 2000 had a higher coverage rate than the 1990 census. The projections based on the 1990 census will certainly tend to under-project the population. Thus, if we used the 1990 census undercount adjusted population for projections, we should see a reduction in percentage errors. Instead of re-running the lengthy projections program in this study, the 2000 projections were adjusted with state specific undercount rates in 1990. The results show an improvement of the projections.
|Region and Subdivision||Number of States||Series A MAPE||Series A MALPE||Series B MAPE||Series B MALPE|
|East North Central||5||0.98||-0.85||0.91||-0.67|
|West North Central||7||1.47||0.63||1.27||0.35|
|East South Central||4||0.95||0.95||0.97||0.97|
|West South Central||4||0.73||-0.17||0.87||-0.09|
Source: Campbell, Paul R. "Population Projections for States by Age, Sex, Race, and Hispanic Origin: 1995 to 2025," PPL-47, U.S. Census Bureau, Population Division, October, 1996
As Table 3 shows, the MAPE for all states was reduced from 2.6 to 2.2 for Series A and from 2.4 to 2.0 for Series B. The number of states with a percentage error of less than 1.0 percent increases from 10 to 20 for Series A, and from 13 to 18 for Series B (see Appendix B). The projections improve in all but 12 states for Series A and 15 states for Series B (see Figures 2 and 3). These exceptions are states that were originally over-projected, or states with low percentage errors that become over-projected after the undercount adjustment. The MAPEs for all regions except the West were reduced after adjusting for undercount. The MAPEs for the West after adjustment are higher because many states in that region were over-projected initially. For example, the 2000 projections for Idaho, Montana, Wyoming, Hawaii, and Alaska were above the census 2000 count in both Series A and Series B. Once their projected populations were inflated by the undercount rates, the Mean Absolute Percentage Error for the region became higher.
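The mechanism described above, in which the adjustment helps under-projected states but hurts over-projected ones, can be shown with a small sketch. The exact adjustment formula is not given in the text; the code assumes the projection is simply multiplied by (1 + undercount rate), which is one plausible reading of "inflated by the undercount rates." The function names, populations, and rates are hypothetical.

```python
def percent_error(projection, census):
    """Signed percent error of a projection against the census count."""
    return (projection - census) / census * 100


def adjust_for_undercount(projection, undercount_rate_pct):
    """Inflate a projection by a state-specific 1990 net undercount rate.

    Assumption: a simple multiplicative inflation; the paper does not
    state the formula that was actually used.
    """
    return projection * (1 + undercount_rate_pct / 100)


census_2000 = 1_000_000  # hypothetical census count for both example states

# An under-projected state: the adjustment moves it toward the census count.
under_before = percent_error(980_000, census_2000)
under_after = percent_error(adjust_for_undercount(980_000, 1.6), census_2000)

# An over-projected state: the same adjustment moves it further away.
over_before = percent_error(1_010_000, census_2000)
over_after = percent_error(adjust_for_undercount(1_010_000, 1.0), census_2000)

print(round(under_before, 2), round(under_after, 2))
print(round(over_before, 2), round(over_after, 2))
```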
One crucial factor affecting the accuracy of the state projections is the use of state estimates as the base year population to launch the projections. If the estimates are not accurate, the projections will be automatically inaccurate. The evaluation of the estimates against the census 2000 count faces the same issue of census undercount as evaluating the projections. Therefore, a comparison of the 1990 census base estimates and the estimates adjusted for net census undercount is also made.
Table 4 shows the difference between the census 2000 count and the estimated 2000 population by region and division. The estimates based on the official 1990 census count under-estimated the U.S. population by 2.4 percent or a total of 6.8 million people. Almost all states had the estimated population lower than the census count except West Virginia (See Appendix C). The West had the highest MAPE of 3.2, followed by the South, and the Northeast region. The Midwest had the lowest MAPE (1.4%). However, in terms of divisions in the regions, the Mountain division had the highest MAPE, followed by South Atlantic states, the similar pattern of the geographic distribution of errors for the state population projections. (See Table 1)
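|Region and Division||Difference between Estimates and Census 2000, 1990 Census Base: Number||Percent||Difference between Estimates and Census 2000, Undercount Adjusted Base: Number||Percent||MAPE: 1990 Census Base||MAPE: Undercount Adjusted Base|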
|East North Central||-473,715||-1.05||-161,533||-0.36||1.13||0.79|
|West North Central||-324,135||-1.68||-222,802||-1.16||1.65||1.06|
|East South Central||-339,067||-1.99||-65,363||-0.38||1.91||0.50|
|West South Central||-859,452||-2.73||-181,631||-0.58||2.56||0.83|
If we use the 1990 population adjusted for net census undercount as the base to derive the estimates, we see a dramatic reduction in estimation errors. All states show a reduction in errors except Alaska, Michigan, and West Virginia, where the errors remain low (see Figure 4). The under-estimation for the U.S. as a whole decreases from 6.8 million to 2.9 million, a 57 percent reduction (Table 4). The negative percent difference for the entire U.S. decreases from 2.4 to 1.0 percent. The mean absolute percent error (MAPE) for all states drops from 2.6 percent to 1.5 percent. The reduction is so large that every region shows lower estimation errors (see Table 4). Since births and deaths are considered more accurate than other components, the 2.9 million discrepancy between the estimates adjusted for 1990 census net undercount and the census 2000 count could be attributed to the migration component, most likely the underestimation of net international migration for the nation.
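The 57 percent figure quoted above can be checked with a short calculation. This is a minimal sketch; the function name `percent_reduction` is introduced here for illustration, and the inputs are the rounded values (in millions) given in the text.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# U.S. under-estimation in millions, as quoted in the text:
# 6.8 with the 1990 census base, 2.9 with the undercount-adjusted base.
us_reduction = percent_reduction(6.8, 2.9)
print(f"Reduction in under-estimation: {us_reduction:.0f} percent")
```

Because the inputs are rounded to one decimal place, the result is only approximate, but it agrees with the reduction reported from Table 4.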
Since the Cohort-Component Method was used to produce the state projections, the accuracy of every component affects the accuracy of the projections. To evaluate the accuracy of each component (births, deaths, and migration), the most current vital statistics and migration data from administrative records were used. The Census Bureau routinely compiles annual component data for its Population Estimates Program. Because the components of change in the state projections run from mid-year to mid-year, as in the population estimates, we can compare the projected components of change for 7/1/1995 to 6/30/2000 with the estimated components for the same period.
As Table 5 and Figure 5 show, projected births are more accurate than the other components, with the lowest Mean Absolute Percentage Errors, followed by deaths. Net domestic migration is the least accurate component: its MAPE reached 193.3 for Series A and 174.2 for Series B. The MAPE of net international migration was 31.5 for both Series A and Series B. The MAPEs for births and deaths are also about the same in the two series. Only the MAPEs of domestic migration differ, which reflects the fact that the only primary difference between Series A and Series B is the model used to project domestic migration.
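|Region and Division||Series A: Births||Series A: Deaths||Series A: Net domestic migration||Series A: Net international migration||Series B: Births||Series B: Deaths||Series B: Net domestic migration||Series B: Net international migration|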
|East North Central||2.4||7.4||213.6||29.8||2.5||7.4||222.5||29.8|
|West North Central||2.7||8.2||166.4||36.2||2.6||8.1||143.8||36.1|
|East South Central||4.3||4.0||147.4||17.6||4.3||4.0||138.7||17.7|
|West South Central||5.3||5.5||58.7||26.5||5.3||5.5||65.1||26.5|
Source: Campbell, Paul R. "Population Projections for States by Age, Sex, Race, and Hispanic Origin: 1995 to 2025," PPL-47, U.S. Census Bureau, Population Division, October 1996.
Estimated components of change are derived from the Census Bureau's Population Estimates Program.
Although the birth component has the lowest MAPEs, they vary from region to region. The projected births for the West had the highest MAPE (9.6 percent), compared with 2.6 in the Midwest. However, the MAPE for New England (9.1) is comparable to that in the Pacific division. The Mountain states had the highest MAPE for projected births (9.9 for Series A).
The percent difference for births in Series A differs from state to state, ranging from 0.1 percent for South Carolina to 27.9 percent for the District of Columbia (see Appendix G). The projected numbers of births for South Carolina, New Jersey, Delaware, Illinois, Iowa, and Missouri are the most accurate, with percent errors of less than 1.0 percent. California, Maine, Hawaii, Vermont, Utah, Nevada, and D.C. are among the worst in projected births (errors of 11 percent or higher). The discrepancies for births in Series B are about the same as in Series A.
Projected deaths are more accurate in the West, followed by the South. The MAPE for projected deaths is highest in the Northeast, with the highest MAPE in the Middle Atlantic states (see Table 5). The states with the highest discrepancies in the death component (more than 10 percent) are the District of Columbia, New York, Rhode Island, California, Hawaii, Massachusetts, Nevada, New Jersey, and Illinois (see Appendix G). In contrast, South Carolina, Utah, Wyoming, and Alaska have the smallest error rates (less than 1.0 percent).
Some states have similar levels of accuracy for projected births and deaths, such as South Carolina (the best) and the District of Columbia (the worst). However, some states show opposite results for projected births and deaths. For example, Utah has a 15.5 percent error in projected births, among the worst, but a 0.9 percent error in projected deaths, among the best (see Appendix G). The percent errors of projected deaths in Series B are about the same as in Series A.
Net domestic migration had a much wider range of percentage errors among states, from 2.3 percent for Georgia to 2,245 percent for Utah (Series A). The estimated net domestic migration for Utah between 1995 and 2000 was -5,247, but the projected net domestic migration was 112,548. The states with the highest errors in projected domestic migration, with absolute percentage errors of 200 percent or higher, are Montana, Indiana, New Mexico, Vermont, Wyoming, South Dakota, Alabama, Nebraska, California, Kansas, and Idaho (see Appendix G).
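The Utah figure illustrates why percent errors for net migration can become very large: the estimated value used as the base is small and of opposite sign to the projection. The sketch below assumes the absolute percent error is computed as the absolute difference divided by the absolute value of the estimated component; the paper does not spell out this formula, but the assumption reproduces the reported 2,245 percent. The function name is introduced here for illustration.

```python
def abs_percent_error(projected, estimated):
    """Absolute percent error of a projected component against the estimate.

    Assumption: the estimate is the base, taken in absolute value so that a
    negative net migration figure still yields a positive percentage.
    """
    return abs(projected - estimated) / abs(estimated) * 100.0

# Utah, net domestic migration 1995-2000 (Series A), figures from the text.
utah_error = abs_percent_error(projected=112_548, estimated=-5_247)
print(f"Utah absolute percent error: {utah_error:,.0f} percent")
```

A small base in the denominator inflates the percentage even when the absolute miss (about 118,000 people here) is modest relative to the state's population.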
The MAPE for the net domestic migration for the West is the highest among the four regions, especially among the Mountain states. (Table 5) The South had the lowest mean absolute percent error. However, the variations in the MAPE among divisions are very substantial. For example, the East South Central states had MAPEs close to 150 percent, while the South Atlantic states had a MAPE of 17.6 percent.
The variation in average absolute percent errors in projected domestic migration seems to have no clear relationship with geographic location or population size. For example, the Mountain and New England divisions, where many small states are located, had percent errors of 606.6 percent and 139.7 percent respectively for Series A, and 554.0 percent and 113.1 percent for Series B. Yet Arizona and Nevada, with low error rates for projected domestic migration (19.3 percent and 20.0 percent respectively), are located in the Mountain division, where projection errors are the highest. The percent error in projected domestic migration for California, the largest state, is substantial (253.9 percent), while the error for New Hampshire, one of the smallest states, is only 3.6 percent. This suggests that there is no systematic pattern in percent errors in projected domestic migration among the 50 states and the District of Columbia.
The percent errors in projected domestic migration in Series B are generally lower than those in Series A except for the East North Central states (Table 5 and Appendix H). However, the overall variation of the errors among regions and subdivisions is about the same. As in Series A, Utah and Montana have the highest percent errors of the projected domestic migration in Series B.
The percent discrepancies between projected and estimated net international migration were higher in the West and the Northeast. Again, the Mountain and New England states have the highest percent errors (see Table 5 and Appendix G). Generally, the Mid-Atlantic, East South Central, and Pacific states have lower percent errors in projected international migration. However, the errors show no particular pattern by the location of specific states. For example, the states with the lowest and highest errors in international migration, New Hampshire (2.8%) and Rhode Island (109.5%), are both located in New England.
Similar to domestic migration, the percent errors in international migration are not associated with the population size. For example, Texas (60.0%) is among the states with the highest percent error in projected international migration while New York (10.0%) and California (16.6%) are among the states with relatively lower errors in projected international migration.
The description of the errors (MAPEs and MALPEs) of the state projections, state estimates, and projected components of change presented above does not provide sufficient information to quantify the relationships among errors. It is only possible to say that domestic migration has the highest percent errors among the four components. The description cannot tell us the extent to which the errors in projected domestic migration contributed to the variation of errors in state population projections among the 50 states and the District of Columbia. A further question is to what extent the potential sources of projection error, such as the undercount rates, errors in state estimates, and errors in projected components of change, affect the accuracy of state projections collectively and independently. To answer this question, a multiple regression analysis is necessary.
The dependent variable for the analysis is the absolute percent error of state projections. The independent variables are the 1990 census net undercount rate, the absolute percent error of state estimates, and the absolute percent errors of projected births, deaths, net domestic migration, and net international migration. In addition, the percent population change between 1990 and 2000 is used to measure the uncertainty of the projections in predicting future trends. Since the pattern of projection errors for Series B is very close to that for Series A, the following analysis presents Series A only.
(1). Correlation between Projection Error and Independent Variables
Before presenting the results of the multiple regression analysis, we need to present the correlation between dependent and independent variables - the gross relationship between two variables without holding other variables constant. Table 6 shows the simple correlations among these variables. As expected from the discussion above, the projection errors are highly correlated with percent error in state estimates (correlation coefficient of 0.72), and also related to the 1990 census undercount rates (0.47). The projection error is also associated with population change (0.42) -- a dramatic change in population would usually produce a larger error in projections.
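The simple correlations discussed here are ordinary Pearson coefficients computed across the 51 state-level observations. The sketch below shows the calculation in plain Python; the two short lists are made-up illustrative values, not figures from Table 6, and the function name is introduced here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical absolute percent errors for five states (illustration only).
projection_error = [1.2, 3.4, 0.8, 2.9, 4.1]
estimates_error = [1.0, 2.8, 1.1, 2.5, 3.9]
r = pearson(projection_error, estimates_error)
print(f"r = {r:.2f}")
```

A coefficient computed this way measures only the gross association between two variables; it does not hold any other variable constant.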
|Variables Absolute % Error||Absolute Percent|
|Projections Error||Undercount Rate||Estimates Error||Births Error||Deaths Error||Domestic Mig Error||International Mig Error||Pop Change 1990-2000|
The general perception is that the percent errors in the projected components should be the primary source of errors in the projections because the projections were based on the cohort-component method. As expected, the error in projected births is significantly correlated with the projection errors (0.57). However, the percent errors in projected deaths and international migration correlate only moderately with errors in population projections. Surprisingly, the percent error in domestic migration has no correlation with percent projection errors. This indicates that a state with a higher percent error in projected domestic migration does not necessarily have a higher percent error in projections, as can be seen from Figure 6. This may also reflect the problems of measuring domestic migration based on IRS data. Changes in tax laws, problems in geo-coding the addresses on tax returns, and different levels of population coverage may all contribute to the uncertainty of this variable. The migration flows used in the projections may not reflect the true migration, but the estimated net domestic migration used to evaluate the projected domestic migration may not reflect the true migration either.
(2). Multiple Regression of Factors Affecting Projection Accuracy
The simple correlation between two variables may include the impact of other variables on the specific variable. For example, the correlation between errors in projected births and errors in projected population may be due to the impact of state estimates and census undercount on the projected births because the census undercount and state population estimates affect the accuracy of population base to derive fertility rates for the projections. In other words, the impact of errors in births on projection errors is also due to the effects of errors in state estimates or census undercount on projections at the same time. The results of the multiple regression analysis in Table 7 show the importance of each variable contributing independently to the projection errors while holding other variables constant in three conditions and how much all the variables together can explain the projection errors.
Table 7 shows the standardized regression coefficients of the independent variables on percent projection error in 3 models. Model 1 includes only percent errors in births, deaths, domestic migration, and international migration. Model 2 includes census undercount rates and state estimates errors, in addition to the variables in model 1. Model 3 includes one more variable - population change between 1990 and 2000.
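Standardized regression coefficients of the kind reported in Table 7 can be obtained from the correlations alone: if R is the correlation matrix of the predictors and r the vector of their correlations with the dependent variable, the coefficients solve R·beta = r, and R-square equals the dot product of beta and r. The sketch below illustrates this in plain Python on a small made-up data set; it is not the authors' software or data, and all names are introduced here for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def standardized_betas(predictors, y):
    """Return (standardized coefficients, R-square) for y on the predictors."""
    rxx = [[pearson(p, q) for q in predictors] for p in predictors]
    rxy = [pearson(p, y) for p in predictors]
    betas = solve(rxx, rxy)
    r_square = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_square

# Made-up example: two uncorrelated predictors and y = 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]
betas, r_square = standardized_betas([x1, x2], y)
print([round(b, 3) for b in betas], round(r_square, 3))
```

Moving from one model to the next in this framework simply means adding predictors to the list. When a new predictor is correlated with the existing ones, the existing coefficients change, which is the mechanism behind the shifts between models described in the text.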
The errors in the projected components, as shown in model 1, explain 40 percent of the variation in projection error (R-square of 0.40). The percent error in projected births carries most of the weight (coefficient of 0.53), followed by international migration (0.21). The errors in projected deaths and domestic migration do not explain the variation in percent projection errors among the 50 states and the District of Columbia. Surprisingly, when the other components are held constant, the error in domestic migration has a slight negative relationship with projection error (-0.18 in Table 7). This further points to the problem of measuring domestic migration in the state population estimates and population projections.
When the net census undercount rate and the percent errors in state estimates are included in the regression, the combined set of variables explains over 60 percent of the variation in projection errors. Most of the projection error originally explained by the projected components of change is now accounted for by the percent errors in state population estimates and the net census undercount. The standardized coefficient of percent errors in births is reduced from 0.53 to 0.16. The percent error in state estimates stands out as the most important variable in explaining errors in the state population projections (0.46), followed by the net census undercount (0.23).
The reason for such dramatic shifts in explaining the errors in projections is that the state population estimates are not only used as the starting population base to launch projections, but also are used as the controls to develop population base for fertility, mortality, and migration rates. This can be seen from the correlation between percent errors in projected births and percent errors in state estimates (0.59), and the correlation between errors in projected deaths and state estimates (0.35).
In model 3, the percent population change is included in the regression to see whether difference in rates of population change can explain the variation of errors in projections due to uncertainty of predicting the turning point of population growth. The results show that although population change correlates significantly with projection error (0.42 in Table 6), its net impact on the projection errors becomes unnoticeable when other variables are taken into account.
|Independent Variables||Series A|
|Model 1||Model 2||Model 3|
|Absolute % error in projected births||0.525*||0.164||0.166|
|Absolute % error in projected deaths||0.092||0.084||0.068|
|Absolute % error in projected domestic migration||-0.180||-0.113||-0.114|
|Absolute % error in projected international migration||0.212*||0.143||0.144|
|1990 census undercount rate||--||0.231*||0.236*|
|Absolute % error in state estimates||--||0.464*||0.485*|
|Absolute % population change 1990-2000||--||--||-0.031|
As mentioned before, the results of the state projections were controlled to the national population projections, so the accuracy of the national projections automatically affects the accuracy of the state projections. The national projections series used to control the state totals gave a projected U.S. population of 274,055,000 as of April 1, 2000, an under-projection of 7.4 million compared with the census 2000 count of 281,422,000. The percent difference of 2.62 percent between the national projections and the census 2000 U.S. population is about the same as the MAPE of the state population projections. The latest national projections to the year 2100, released in January 2000, show a projected population of 274,659,000 in 2000, an under-projection of 6.8 million (see Table 8).
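The 7.4 million and 2.62 percent figures follow directly from the two totals quoted above. A minimal sketch, assuming the percent difference is taken relative to the census count (the variable names are introduced here):

```python
# Totals as of April 1, 2000, as quoted in the text (rounded to thousands).
projected_us = 274_055_000  # national series used to control the state projections
census_2000 = 281_422_000   # census 2000 count

difference = projected_us - census_2000
percent_difference = difference / census_2000 * 100.0

print(f"Difference: {difference:,}")
print(f"Percent difference: {percent_difference:.2f}")
```

The negative sign indicates an under-projection; its magnitude matches the figures given in the text.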
The accuracy of the national projections is also affected by the 1990 Census net undercount and the accuracy of national estimates. As Table 8 shows, if the 1990 census undercount rates were applied to the projected total population, the under-projection of the U.S. population would have been reduced dramatically -- from 6.8 million to 2.4 million if the 1990 PES (Post-Enumeration Survey) undercount rate were used, and to 2.2 million if the DA (Demographic Analysis) undercount rate were used. The percent errors for the projections would have been reduced from 2.4 percent to 0.9 percent with PES rate adjustment and to 0.8 percent with DA rate adjustment. This suggests that if the projections had been based on the 1990 population adjusted for net census undercount, the Census Bureau's latest U.S. projections would have been more accurate.
|Official/Adjustment||2000 Projections*||Estimates Vintage 2000||Census 2000||Projections - Census||Projections - Estimates|
|Adjustment based on:|
|PES undercount rate**||278,989,377||278,947,158||--||-2,432,529||-0.86||42,219||0.02|
|DA undercount rate**||279,181,631||279,139,384||--||-2,240,275||-0.80||42,248||0.02|
* Population Projections of the United States: 1999 to 2100 (Population Division Working Paper No. 38).
** The adjustment for the 1990 census undercount is based on the following information. Official U.S. population: 248,709,873. Undercount adjusted (PES): 252,730,369. Net undercount rates: Post-Enumeration Survey (PES) 1.58; Demographic Analysis (DA) 1.65.
Source: U.S. Census Bureau, ESCAP II: Demographic Analysis Results, October 13, 2001, and https://www.census.gov/dmd/www/pdf/understate.pdf.
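The adjusted figures in Table 8 appear to be consistent with inflating the unadjusted total by the undercount rate, that is, multiplying by (1 + rate/100). This multiplicative form is an inference from the published numbers rather than a formula stated in the paper. The sketch below applies it to the projected April 1, 2000 population of 274,649,908; the census 2000 total of 281,421,906 is implied by the differences shown in the table, and the function and variable names are introduced here.

```python
def adjust_for_undercount(population, undercount_rate_pct):
    """Inflate a population total by a net undercount rate given in percent.

    Assumption: adjusted = population * (1 + rate / 100), inferred from Table 8.
    """
    return round(population * (1 + undercount_rate_pct / 100.0))

projected_2000 = 274_649_908  # unadjusted national projection, 4/1/2000
census_2000 = 281_421_906     # implied by the differences in Table 8

for label, rate in (("PES", 1.58), ("DA", 1.65)):
    adjusted = adjust_for_undercount(projected_2000, rate)
    diff = adjusted - census_2000
    pct = diff / census_2000 * 100.0
    print(f"{label}: adjusted={adjusted:,} diff={diff:,} pct={pct:.2f}")
```

Under this assumption the calculation reproduces the adjusted projections, differences, and percent differences in the PES and DA rows of Table 8.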
Since the national population projections also use the most current national population estimates as the base, the accuracy of the national estimates affects the accuracy of the national projections. As Table 8 shows, the national estimates also under-estimated the national population by 6.8 million, and there is no significant difference between the projected U.S. population (274,649,908) and the estimated population (274,608,346) as of 4/1/2000. Since the national estimates were also based on the 1990 census population as enumerated, the errors due to the net census undercount also affect the accuracy of the national estimates. Therefore, when the estimates are adjusted by the 1990 net census undercount rates in the same way as the projections, the differences between the estimates and census 2000 are about the same as the differences for the projections. It becomes obvious that the 1990 net census undercount has seriously affected the accuracy of both the population estimates and the projections.
The national projections were based on the component method. The latest national projections were done in 1999 and released in 2000. In order to evaluate the accuracy of projected components of change for the first two years, we compare the projected 1999 and 2000 components with the most recent statistics. As Table 9 shows, the projections under-projected the number of births by 65,000 for 1999 and 148,000 for 2000 (1.65% and 3.66%) based on the provisional NCHS report. The projections under-projected the number of deaths by 19,000 for 1999 and 11,000 for 2000 (0.81% and 0.47%). If the projections had been based on the 1990 population adjusted for net census undercount, the projected births and deaths would have increased to some extent due to the larger population base. The percent errors of projected births and deaths should also be reduced. Therefore, we can conclude that the projected births and deaths for the first two years are quite accurate.
|Projections - Estimates|
|Net International Migration||960,215||970,368||864,844||880,119||95,371||90,249||11.03||10.25|
Table 9 also shows that the projections of net international migration in 1999 and 2000 are higher than the estimated figures by 10 to 11 percent. Since the projections of international migration were based on the estimates of international migration, the errors of the national projections for the first two years are largely due to the errors of the estimates of international migration. The errors of this component also affected the accuracy of the state population projections.
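The 10 to 11 percent figures can be reproduced from the net international migration row of Table 9. The sketch below assumes, from the order of the values in that row, that the first pair gives the projected 1999 and 2000 figures and the second pair the estimated figures, with the percent difference taken relative to the estimate.

```python
# Net international migration from Table 9 (column order is an assumption).
projected = {1999: 960_215, 2000: 970_368}
estimated = {1999: 864_844, 2000: 880_119}

results = {}
for year in (1999, 2000):
    diff = projected[year] - estimated[year]
    pct = diff / estimated[year] * 100.0
    results[year] = (diff, round(pct, 2))
    print(f"{year}: difference={diff:,} percent={pct:.2f}")
```

Both years show an over-projection of roughly 90,000 to 95,000 migrants, in line with the differences and percentages listed in the table.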
The accuracy of state projections depends upon many factors. We have shown that the magnitude of the errors depends on the accuracy of the census counts, of the national projections used to control the state projections, of the state estimates, and of the projected components. Overall, the latest state projections series has been more accurate than previous state projections series. The projections continue to perform poorly in the West. The state population estimates used as the population base to start the projections have a level of error similar to that of the projections, largely due to the net undercount in the 1990 census.
The percent errors in domestic migration continue to be the highest among the projected components of change, followed by the international migration. The projected births had the lowest average percent errors. However, the states with lower percent errors in projected domestic migration do not necessarily have more accurate state projections.
The multiple regression analysis further confirms that errors in the state estimates are the most important variable contributing to the state projection errors. The errors in the projected components - births, deaths, domestic migration and international migration - should have contributed a significant amount of error to the projections. However, when the state estimates and the 1990 census net undercount are taken into account, the impact of errors from the components becomes less. Since the 1990 census net undercount affected a large portion of errors in the state estimates, the net census undercount also had a significant impact on the accuracy of the projections. The census undercount and the accuracy of the U.S. population estimates also affect the accuracy of the national projections, which in turn, affect the accuracy of the state projections. This further indicates the importance of the accuracy of base year population in producing accurate projections.
When the state estimates and 1990 census net undercount are not taken into account, the errors in projected births explain most of the error, followed by the error in international migration. The errors in projected deaths contributed less to the errors in the projections. However, the errors in domestic migration cannot explain the projection errors although the MAPE of the domestic migration is the highest among the components. This further indicates the difficulty of projecting the migration component in the population projections.
These results suggest that to improve the projections, we need to pay special attention to the accuracy of the base year population and the accuracy of the population estimates. Since the net undercount rates in census 2000 are relatively low, we would expect that new projections based on census 2000, or on estimates based on census 2000, will not be influenced by the 2000 census net undercount to the same extent as the current projections were by the 1990 census net undercount. Therefore, it is necessary first to ensure the accuracy of projected births, because this component explains the largest proportion of projection errors among the components. Doing so will be more cost-effective because any improvement in projecting births can have a noticeable effect on projection accuracy. In contrast, it may take more effort to improve the domestic migration component, because its direct impact is mixed: it can go in either direction depending on other errors. This does not mean we should ignore this important component. Rather, we should recognize that efforts to improve it may not produce the expected results. In other words, we do not need a complicated model to project migration. What we need is a simple, reasonable, and understandable model that can be explained to the user. Demographers have repeatedly found that complex techniques do not produce more accurate forecasts or projections (Smith and Sincich, 1992).
Campbell, Paul R. 1994, "Population Projections for States by Age, Sex, Race, and Hispanic Origin: 1993 to 2020" Current Population Reports, P25-1111, U.S. Government Printing Office, Washington, D.C.
Campbell, Paul R. 1996a, "How Accurate were the Census Bureau's State Population Projections for the early 1990's?" Paper presented at the Federal Forecasters Conference, Washington, D.C. May 2.
Campbell, Paul R. 1996b, "Population Projections for States by Age, Sex, Race, and Hispanic Origin: 1995 to 2025," PPL-47, U.S. Census Bureau, Population Division, October.
Campbell, Paul R. 1997, "An Evaluation of the Census Bureau's 1995 to 2025 State Population Projections - One Year Later." U. S. Census Bureau, Paper presented at the Population Association of America Meeting, Washington D.C. (March)
Census 2000 Initiative, 2001, "Census Bureau Says No To Adjustment; Review Finds Duplicates Wipe Out Most of Net Undercount," News Alert, October 18, 2001.
Day, Jennifer Cheeseman, 1996, "Population Projections of the United States by Age, Sex, Race and Hispanic Origin: 1995 to 2050." U.S. Census Bureau, Current Population Reports, Series P25-1120, Government Printing Office, Washington D.C.
Hollmann, Frederick, Tammany Mulder, and Jeffrey Kallan, 1999, "Methodology and Assumptions for the Population Projections of the United States: 1999 to 2100." U.S. Census Bureau, Population Division Working Paper, No. 38, December, 1999.
Robinson, J. Gregory, 2001a, "Accuracy and Coverage Evaluation: Demographic Analysis Results," U.S. Census Bureau, DSSD Census 2000 Procedure and Operations Memorandum Series B-4* (March 12, 2001)
Robinson J. Gregory, 2001b, "ESCAP II: Demographic Analysis Results," U.S. Census Bureau, Executive Steering Committee for ACE Policy II, Report No. 1 (October 13, 2001).
Smith, Stanley K. and Terry Sincich, 1990, "The Relationship Between the Length of the Base Period and Population Forecast Errors," Journal of the American Statistical Association, Vol. 85, No. 410, 1367-1375.
Smith, Stanley K. and Terry Sincich, 1992, "Evaluating the Forecast Accuracy and Bias of Alternative Population Projections for States." International Journal of Forecasting, 8, 495-508.
Smith, Stanley K. and Scott Cody, 1994, "Evaluating the Housing Unit Method, A Case Study of 1990 Population Estimates in Florida," Journal of the American Planning Association, Vol. 60, No. 2 (Spring).
Swanson, David A., Jeff Tayman, and Charles F. Barr, 2000, "A Note on the Measurement of Accuracy for Subnational Demographic Estimates." Demography, Vol. 37, No. 2, May 2000: 193-201.
Tayman, Jeff and David A. Swanson, 1999, "On the Validity of MAPE as a Measure of Population Forecast Accuracy," Population Research and Policy Review: 18: 299-322.
Tayman, Jeff, David A. Swanson, and Charles F. Barr, 1999, "In Search of the Ideal Measure of Accuracy for Subnational Demographic Forecasts." Population Research and Policy Review, 18: 387-409.
U.S. Census Bureau, https://www.census.gov/population/www/projections/st_yr95to00.html
U.S. Census Bureau, https://www.census.gov/population/www/projections/st_comp-chg.html
U.S. Department of Commerce, Press Release, 2001, "Statement of Acting Census Bureau Director William Barron Regarding the Adjustment Decision." CB01-CS.08, October 17, 2001.
Wetrogan, Signe I. and Paul R. Campbell, 1990, "Evaluation of State Population Projections." Paper presented at the Population Association of America, Toronto, Canada, May 3-5.