July 1, 2002
Working Paper Series No. 50
An earlier version of this paper was presented at The Direction of Fertility in the United States meeting in Alexandria, Virginia, October 2, 2001.
This paper reports the results of research and analysis undertaken by Census Bureau Staff. It has undergone a more limited review than official Census Bureau publications. This report is released to inform interested parties of research and to encourage discussion. Please direct questions to Tammany Mulder at (301) 763-6137 (email@example.com).
During the 1900s, knowledge of population trends and their future repercussions for the size and distribution of the population became increasingly important as the US experienced major shifts in fertility and net immigration. Population forecasts produced by the Census Bureau are used widely, informing researchers, planners, legislators, and many others about the future course of population change. Because forecasts are based on a compilation of reasonable assumptions for the components of population change, they are subject to inherent uncertainty, and it is essential to educate customers about the amount of uncertainty within the forecasts for the population and the components of population change. To date, the Census Bureau has not published a comprehensive analysis of the accuracy of its forecasts. The aim of this research is to address this gap and systematically evaluate the accuracy of the existing Census Bureau forecasts both in terms of their ability to predict the national population as well as individual components of change.
Overall, the Census Bureau has greatly improved the level of accuracy found within its forecasts. Recent forecasts produced in the 1990s have minimized the inherent uncertainty and provide a reliable product for consumers in the short term. Improvement in the forecast reliability is, in all likelihood, the result of the stabilization of the components of population change. This study also reveals that forecasters failed to foresee turning points in population trends, resulting in erroneous forecasts, particularly for fertility and net immigration. The inadequate base data used for certain series severely reduced accuracy at the start of the forecast. Consequently, those forecasts maintained higher levels of error throughout the forecast period. In addition, the assumptions formulated by the Bureau were often outperformed by simple assumptions of constancy. The research presented here represents a contribution to the discussion of population forecasting accuracy for the United States; however, additional research is needed.
Table of Contents
Complexities in Assessing the Accuracy of Forecasts
- Choosing Among Multiple Forecast Series
- Measurement of Forecast Error at Multiple Levels
- Forecast Error Patterns
- Explanation of Indicators
- Comparison of the Census Bureau Forecast Models with a Naïve Model
- Potential Biases Present Within the Estimate and Forecast Series
Total Population Growth Rate Forecasts
- Overall Accuracy and Duration-Specific Forecast Error of the Population Growth Rate Forecasts
- Comparison of Growth Rate Forecast Models
- Summary of Forecast Error for Growth Rates
Components of Change Forecasts
- Fertility Forecasts Error Analysis
- Overall Accuracy of Fertility Forecasts
- Duration-Specific Forecast Error for Fertility
- Comparison of Fertility Forecast Models
- Summary of Forecast Error for Fertility
- Mortality Forecasts Error Analysis
- Overall Accuracy of Mortality Forecasts
- Duration-Specific Forecast Error for Mortality
- Comparison of Mortality Forecast Models
- Summary of Forecast Error for Mortality
- Net Immigration Forecasts Error Analysis
- Overall Accuracy of Net Immigration Forecasts
- Duration-Specific Forecast Error for Net Immigration
- Comparison of Net Immigration Forecast Models
- Summary of Forecast Error for Net Immigration
|Table 1.||Summary of the U.S. Census Bureau National Population Projections Products: 1947 to 1999|
|Table 2.||Error Statistics for the Forecasted Annual Growth Rate for the Total US Resident Population: 1947 to 1999|
|Table 3.||Percent Error for the Total U.S. National Population Forecasted Annual Growth Rates: 1947 to 1999|
|Table 4.||Error Statistics for the Forecasted Number of Births for the Total US Resident Population: 1963 to 1999|
|Table 5.||Error Statistics for the Forecasted Crude Birth Rates for the Total US Resident Population: 1963 to 1999|
|Table 6.||Percent Error for the Fertility Forecasts of the US: 1963 to 1999|
|Table 7.||Error Statistics for the Forecasted Number of Deaths for the Total US Resident Population: 1963 to 1999|
|Table 8.||Error Statistics for the Forecasted Crude Death Rates for the Total US Resident Population: 1963 to 1999|
|Table 9.||Percent Error for the Mortality Forecasts of the US: 1963 to 1999|
|Table 10.||Percent Error for Net Immigration Forecasts of the US: 1963 to 1999|
|Table 11.||Error Statistics for the Forecasted Number of Immigrants Net of Emigration for the Total US Resident Population: 1963 to 1999|
|Table 12.||Error Statistics for the Forecasted Crude Net Immigration Rates for the Total US Resident Population: 1963 to 1999|
|Graph 1.||The Annual Growth Rates for the Total Population of the United States: 1947 to 1999|
|Graph 2.||The General Fertility Rate for the Total Population of the United States: 1943 to 1998|
|Graph 3.||The Observed and Forecasted Crude Birth Rates for the Total Population of the United States: 1964 to 1999|
|Graph 4.||The Multiple Series MAPE by Single Year for Each Component of Change and the Respective Crude Rates for the Total Population of the United States|
|Graph 5.||The Observed and Forecasted Crude Death Rates for the Total Population of the United States: 1964 to 1999|
|Graph 6.||The Observed and Forecasted Number of Deaths for the Total Population of the United States: 1964 to 1999|
|Graph 7.||The Observed and Forecasted Crude Net Immigration Rates for the Total Population of the United States: 1964 to 1999|
|Graph 8.||Comparison of the Multiple Series RMSE for the Crude Immigration Rate for the Total Population of the United States|
|Table B-1.||Error Statistics of the Forecasted Annual Total US Resident Population: 1947 to 1999|
|Table B-2.||Error Statistics of the Forecasted Annual Growth Rate for the Total US Resident Population: 1947 to 1999|
Population projections are computations of future population size and characteristics based on separating the total population into its basic components of fertility, mortality, and migration and estimating the probable trends in each component under various assumptions (Srinivasan, 1998). National projections give planners, legislators, policy makers, and researchers, among others, a glimpse of possible future demographic trends for the population and the forces acting to produce population change. The U.S. Census Bureau, in collaboration with Thompson and Whelpton of the Scripps Foundation, began producing population projections and estimates for the national population in the 1940s. Following the first collaborative publication, the Census Bureau independently produced approximately eighteen primary forecasts for the national population (Whelpton, Eldridge, and Siegel, 1947). Because projections are simply a compilation of reasonable assumptions as to what will happen to the current population in future years, the accuracy of forecasts will depend on the validity of the assumptions and the accuracy with which the assumptions are quantified. Correspondingly, it is critical for the consumers of population projections to recognize the level of uncertainty found within population forecasts both in terms of their overall accuracy as well as in terms of the specific components of population change.
To date, the Census Bureau has not published a comprehensive analysis of the accuracy of its forecasts, which means customers depend on the expertise of the demographers producing the product. Long (1987), Stoto (1983), and Ascher (1978) each evaluated the forecast accuracy for the growth rate of the total population, while Ahlburg (1982) evaluated the accuracy of the Census Bureau forecasts of total births. However, these analyses have not been updated since their original publication. The aim of this research is to address this gap and systematically evaluate the accuracy of the existing Census Bureau forecasts both in terms of their ability to predict the national population as well as individual components of change.
Projections are used to plan the delivery of services such as education, health facilities, employment, water and utilities, communications, transportation, and housing stock, among many others; to guide the distribution of federal and state resources; and to help producers and sellers of goods and services predict future markets for their products. Moreover, in addition to understanding the overall size of the national population in the future, planners and policy-makers have an equally important stake in getting an accurate reading of the age and sex composition of the future population (Srinivasan, 1998). An evaluation of the accuracy of the national population forecasts and their components of change will allow consumers to become more discriminating users of population forecasts. In addition, the research gives forecasters greater insight into how to improve their ability to forecast and where potential problems or biases exist.
The present paper evaluates the accuracy of Census Bureau population forecasts using an ex-post facto approach. That is, the performance of a forecast is evaluated relative to what was observed, which is operationalized here as intercensal estimates from 1947 to 1989, and the post-censal estimates from 1990 to 1999, produced by the Census Bureau (1990, 1993, 1995, 1999, 2000a). In addition, the present study evaluates the assumptions used as input variables in the cohort component method. Specifically, this research will attempt to answer two research questions. First, how accurately did the Census Bureau forecast the total population and its respective components of change? Second, did the forecasts for the population and components produced by the Census Bureau perform more accurately than a naïve model assuming constant rates?
Given that this paper represents the first effort to evaluate the accuracy of U.S. population projections on a comprehensive scale, few precedents exist regarding how to properly conduct the assessment. The next section details the complexities involved in assessing the accumulated national projections to date. This is followed by a section on the specific research design used to address these complexities in this paper. Next, the paper presents the results of the accuracy assessment in two sub-sections: population growth rate forecasts and components of change forecasts. The paper closes with a discussion and conclusions.
For the purposes of this research, the following terminology, which is consistent with language used among demographers and adapted from Smith and Sincich (1991), will be used to describe forecasts throughout the text:
|Base year:||The most recent estimate used to begin the forecast;|
|Target year:||The designated point¹ (year) the forecast reaches;|
|Forecast period:||The interval between first forecast year after the base year and target year;|
|Forecast error:||The difference between the observed and the forecast population at a designated point in forecast period.|
When discussing population projections, demographers often specify the difference between a "forecast" and a "projection." A projection generally represents possible population trends, while forecasts are produced to represent real population trends. In order to analyze the accuracy of the projections, the "preferred" middle series is used (U.S. Census Bureau, 2000b). In other words, this is the series the Bureau feels is most likely to take place, typifying a forecast. Furthermore, the object here is to analyze "forecast error," meaning the difference between forecast results and estimates.
B. RESEARCH DESIGN AND METHODS
Complexities in Assessing the Accuracy of Forecasts
Table 1 summarizes the base years, the forecast periods, the authors, and the type of series produced in each Census Bureau forecast product since 1947. Assessing the accuracy of this accumulated body of forecasts is inherently complex and requires a multi-pronged approach.
First, for any given national forecasts generated by the Census Bureau, multiple series are produced to represent the potential uncertainty experienced when forecasting the future population. Generally, a middle or "preferred" series forecast is produced with several alternate series based on differing assumptions for the components of change. Second, measurement of error can be calculated at three different levels: 1) forecast error by individual year of forecast; 2) averages of forecast error within intervals of a forecast period; and 3) averages of forecast errors across multiple series for specific points in the forecast period. Consequently, it is possible to examine forecast error resulting within individual series (defined as the error occurring within a specific series forecast period) as well as across multiple forecast series. Third, accuracy evaluations for individual and multiple series are approached from two perspectives: 1) the overall degree of accuracy for the forecasts; and 2) the pattern of error experienced at different points in the forecast. This separation permits analysis of how well the forecaster performed in general, which components of change potentially contributed to the error, and how much error may be attributable to the model upon which the forecasts were built. A fourth complexity inherent in evaluating the accuracy of the national population forecasts is that there is no consensus among forecasters as to the best indicator of forecast error to use. Fifth, because population change is driven by the trends for three components – births, deaths, and migration – forecasts of future population size and growth are built upon assumptions about the annual rate of population growth, as well as trends in the individual components of population change over time. Consequently, the accuracy of any forecast can be assessed according to its ability to forecast the population as well as forecasting the individual components of population change. 
Sixth, because forecasts are created using various assumptions, the forecasts can be compared to simplified or "naïve" models with assumptions of no change in future trends, providing a benchmark to compare Census Bureau forecast error. Lastly, forecast error may be skewed by biases present in the population estimates and forecasts and the individual components of change.
The Research Methods section provides the details of how these levels of complexity will be addressed in the present paper.
1) Choosing Among Multiple Forecast Series
In the recent past, the Census Bureau produced a middle series forecast and several alternate series based on differing assumptions for the components of change. Because the Census Bureau refers to the middle series as the "preferred series," and consumers commonly use this series, it is used hereafter for analytic purposes (U.S. Census Bureau, 2000b). The final column of Table 1 specifies the series used in this paper. For ease of discussion, each series will be identified by its respective base year (column 1). To evaluate the accuracy of the forecasts for the total population, seventeen forecasts were analyzed with base years ranging from 1947 to 1994 (U.S. Census Bureau, 1949 to 1996; Whelpton, Eldridge, and Siegel, 1947). Twelve series for the components of change are available from 1964 to 1994 (U.S. Census Bureau, 1964 to 1996).
Error for the total population is measured by its annual percentage rate of change, or annual growth rate, which is calculated using the exponential formula shown in Appendix A. Measurement of error for population projections can be influenced by the size of the projected population and the forecast length (Stoto, 1983). Use of the growth rate for the total population and the rate for the components of change removes any effects of the potential error from population size or the length of the forecast period. Evaluation of forecast accuracy for the growth rate of the total population builds on existing research by Long (1987), Stoto (1983), and Ascher (1978). Comparison of total births follows existing research by Ahlburg (1982).
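Appendix A is not reproduced in this section, so the following is only a minimal sketch of the growth rate calculation. It assumes the common exponential form r = ln(P_end / P_start) / n × 100; the function name and the sample populations are hypothetical and are not taken from the Census Bureau series.

```python
import math

def annual_growth_rate(pop_start, pop_end, years=1.0):
    """Average annual exponential growth rate, in percent.

    Assumes r = ln(pop_end / pop_start) / years * 100. The formula in the
    paper's Appendix A may differ in detail.
    """
    if pop_start <= 0 or pop_end <= 0 or years <= 0:
        raise ValueError("populations and interval length must be positive")
    return math.log(pop_end / pop_start) / years * 100.0

# Made-up populations for two consecutive June 30 dates.
rate = annual_growth_rate(200_000_000, 202_000_000)
print(f"{rate:.3f} percent per year")
```

Because the rate is a ratio of two populations scaled by the interval length, it can be compared across series with different population sizes and forecast lengths, which is the property the text relies on.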
Ex-post facto evaluation compares the forecast results with the historical population that was actually observed. Therefore, to evaluate the performance of past forecasts, each series is compared with intercensal (1947 to 1989) or postcensal (1990 to 1999) national estimates for the total population from 1947 to 1999. The forecast components of change and the corresponding crude rates are compared with the components produced as a part of the Census Bureau national estimates and vital statistics from the National Center for Health Statistics from 1963 to 1999 (National Center for Health Statistics, 1993; U.S. Census Bureau, 1990, 2000a; Ventura et al., 1999, 2000). Both the estimated and the forecast population growth rates are calculated for annual intervals ending on June 30, while the components of change are summed for calendar years. Note that the lengths of forecasts vary, ranging from 7 to 101 years, and that the forecast periods of subsequent forecasts always overlap to some extent with those of prior forecasts. Because few forecast series for the components of change and the total population are available in a consistent time series beyond 20 years in length, this analysis does not extend past the 20-year period.
Forecasts and their input assumptions are produced with detail by characteristics such as age, sex, race, and Hispanic origin, which could in principle support a more detailed analysis. This additional detail, however, is either not available in a consistent time series or not categorized in a consistent manner across products since 1947. Therefore, this analysis pertains only to the total number and crude rates for the total population and the components of change.
2) Measurement of Forecast Error at Multiple Levels
A complicating factor in evaluating forecast error is that it can be calculated at different levels. It is possible to analyze an individual point in the forecast, the individual series to determine the error for specific products, as well as the error for multiple forecast series (one series per product) averaged to assess the aggregation of error generally associated with the Census Bureau forecasts. The schematic diagram shown in Model 1 depicts how these types of accuracy assessments are made and how they compare to one another. In each case, forecast error terms — the difference between the observed and the forecast population — are used.
First, consider the assessment of the level of error for the forecast error term using a series with the base year 1947 (S1) (see Model 1). The years analyzed in this forecast period cover 1948 (S1+1) to 1955 (S1+8). Notice that for each year in this forecast period, a forecast error term is calculated within each cell of column (2) as the difference between the forecast and the observed values, both in terms of the population and the components of change.2 Each cell conveys the error that occurred at a specific point in the forecast period. In this particular instance, the forecast period contained 8 years.
The second level of interest, the individual series, represents the average of the error associated with any specific interval of interest, for example, over the first 5 years of the forecast, the first 10 years, etc. Referring to Model 1 (column 2, final row), using the same 1947 based series, the gray-filled cells of column (2) show how in the case of the 1 to 5-year interval, the five forecast error terms are summed and divided by five. The same logic applies to the other targeted intervals.
The third level of accuracy assessment relates to aggregating the past forecast error to reflect on experience in a cumulative manner. In this case, multiple series, the middle series from each product, are used for the input. Specifically, for any given year in a forecast period (e.g., the 1st, 2nd, 3rd,..., 20th), forecast error terms are averaged across each product for the specific time elapsed from the base dates of the series. An example of the formula for assessing the accuracy of the forecasts for their first year (point) is depicted in Model 1 as the bold-framed cells. The forecast error terms for the first year in each series are summed, then divided by the number of series included (final column). Again, this same logic extends to each of the other period target years in the series.
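The three levels described above can be sketched as follows. This is an illustration only: the data layout (a dictionary of error terms keyed by base year), the function names, and the numbers are assumptions, not the layout of Model 1 or actual forecast errors.

```python
def interval_mean(errors, first, last):
    """Level 2: mean of one series' error terms for forecast years first..last (1-based)."""
    if first < 1 or last < first:
        raise ValueError("interval must satisfy 1 <= first <= last")
    window = errors[first - 1:last]
    if len(window) != last - first + 1:
        raise ValueError("series is shorter than the requested interval")
    return sum(window) / len(window)

def duration_mean(series_errors, duration):
    """Level 3: mean across series of the error term at one forecast year.

    Series that do not reach `duration` years are skipped.
    """
    terms = [errs[duration - 1] for errs in series_errors.values()
             if len(errs) >= duration]
    if not terms:
        raise ValueError("no series reaches that duration")
    return sum(terms) / len(terms)

# Hypothetical error terms by base year; position i holds forecast year i + 1.
series_errors = {
    1947: [0.2, 0.4, 0.1, -0.3, 0.6, 0.8, 0.5, 0.9],
    1949: [-0.1, 0.3, 0.5, 0.7, 0.2, 0.4],
    1953: [0.5, 0.1, -0.2],
}

point_error = series_errors[1947][0]                      # Level 1: a single cell
first_five = interval_mean(series_errors[1947], 1, 5)     # Level 2: years 1 to 5
first_year_all = duration_mean(series_errors, 1)          # Level 3: year 1, all series
```

In practice the averaged quantity would be one of the indicators discussed in the next sections (for example the absolute percent error) rather than the raw error term.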
3) Forecast Error Patterns
Accuracy evaluation can be approached from two perspectives. Until now, the focus has been on evaluating overall forecast error, which relates strictly to the general performance of the forecast(s). The second, more specific perspective recognizes that, in addition to overall series error, there may also be patterns of error across time. In other words, how well did the forecasts perform throughout the length of the forecast period, and does a particular pattern exist? Smith and Sincich (1991: pg. 261) found that "... there is a linear or nearly linear relationship between forecast accuracy and the length of the forecast horizon,..." Uncovering these patterns helps to decipher the relationship between the error attributed to the different components of change, and to determine whether the components demonstrate different patterns of change throughout the forecast period. In order to assess the patterns of error throughout the forecast period, a supplemental analysis is presented for both individual and multiple series. Hereafter, duration-specific forecast error refers to these observed patterns of error. The indicators used to measure overall error also measure the duration-specific forecast error for both the individual and multiple series.
4) Explanation of Indicators
Statistics used to measure the accuracy of forecasting methodology and assumptions originated from economic forecasting analysis. Demographers and statisticians apply these statistics to measure the accuracy of population forecasts at the national and sub-national level. Researchers have not reached a consensus as to which indicators are most indicative of the accuracy of national population forecasts (Ahlburg, 1992; Armstrong and Collopy, 1992). Consequently, several statistics are often used to afford analysis from different perspectives. Some of the most common, and those used in this report, include the percent error, the mean percent error, the mean absolute percent error, the median absolute percent error, and the root mean squared error. The equations of the aforementioned statistics are presented in Appendix A.
The percent error (PE) is defined as the difference between the actual value and the forecast value (forecast error term) divided by the actual value. PE accounts for the direction of error and may be positive or negative. Negative values indicate underestimation and positive values indicate overestimation. The mean percent error (MPE) is the average of the percent errors in the forecast series for a specified interval. The forecast error term used to calculate the PE is referenced in column (2) of Model 1. The MPE is referenced in the final rows of these respective columns.
The mean absolute percent error (MAPE) also calculates the difference between actual and forecast values, but is the average of the absolute value (irrespective of whether the error is positive or negative) of the error terms. Positive and negative errors therefore reinforce each other, rather than cancel each other. Each forecast error term is weighted equally. The MAPE is commonly used by forecasters because of the ease of calculation, analysis, and reliability (Tayman and Swanson, 1996). In addition, Swanson, Tayman, and Barr (2000: pg. 193) argue that the MAPE possesses "...highly desirable statistical and mathematical properties." The MAPE, however, is an arithmetic mean with an asymmetrical distribution and is prone to being influenced by outlier values, thereby tending to underestimate accuracy. Consequently, the aforementioned authors argue that in reference to evaluating the accuracy of sub-national estimates, the MAPE may lack validity. Contrary to the arithmetic mean, the median is not influenced by outlier values within the distribution. Consequently, the median absolute percent error (MdAPE) was calculated as a supplementary statistic and is presented in the data.
Another statistic commonly used to measure the accuracy of population forecasts is the root mean squared error (RMSE). Forecast error terms are squared and averaged, and the square root of that average is taken, providing a statistic in the same unit of analysis as the original variable. In comparison to the MAPE, the RMSE gives additional weight to larger error terms because of squaring. Both statistics are arithmetic means and are therefore influenced by outliers, but the RMSE gives even greater weight to those series experiencing large error values. The root mean squared percent error (RMSPE) provides the same properties as the RMSE, but is expressed as a percent.
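The indicators described above can be computed as in the sketch below. The exact equations are in Appendix A, which is not reproduced here, so two choices are assumptions: the percent error is taken as forecast minus observed, divided by observed (so that negative values indicate underestimation, as stated in the text), and the function names are hypothetical.

```python
import math
from statistics import mean, median

def percent_errors(forecast, observed):
    """Signed percent errors; negative means the forecast was too low (assumed sign)."""
    return [(f - o) / o * 100.0 for f, o in zip(forecast, observed)]

def error_summary(forecast, observed):
    """Return MPE, MAPE, MdAPE, RMSE and RMSPE for paired forecast/observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("need two non-empty sequences of equal length")
    pe = percent_errors(forecast, observed)
    ape = [abs(p) for p in pe]
    raw = [f - o for f, o in zip(forecast, observed)]
    return {
        "MPE": mean(pe),                                   # direction of error
        "MAPE": mean(ape),                                 # magnitude, equal weights
        "MdAPE": median(ape),                              # magnitude, outlier resistant
        "RMSE": math.sqrt(mean([e * e for e in raw])),     # original units
        "RMSPE": math.sqrt(mean([p * p for p in pe])),     # percent
    }

# Made-up growth rates (percent), not Census Bureau data.
stats = error_summary(forecast=[1.10, 0.90, 1.20], observed=[1.00, 1.00, 1.00])
```

With these made-up values the positive and negative errors partly cancel in the MPE but not in the MAPE, and the single larger error pulls the RMSPE above the MAPE, which illustrates the weighting differences described in the text. Observed values of zero would need separate handling.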
These evaluative statistics apply to the individual and the multiple series analysis for both the overall forecast error and the duration-specific forecast error. To assess overall error, the PE is used to measure the forecast error that occurred at specified points in the forecast period (1, 5, 10, 15, 20 years). The MPE and the remainder of the statistics present the average within an individual series forecast period at specified intervals (5, 10, 15, and 20 year intervals). These indicators also measure the average across multiple series at designated points of the forecast period (1st, 5th, 15th, and 20th year from the base) as opposed to within series averages. Duration-specific forecast error is measured using the same indicators; however, for multiple series each indicator is analyzed annually (for each point) as opposed to designated points.
5) Comparison of the Census Bureau Forecast Models with a Naïve Model
Each Census Bureau forecast is based on a complex set of assumptions about how patterns of fertility, mortality and migration will behave over time. In order to understand the uncertainty related to these assumptions, each component of population change, as well as the population growth rate, is compared with a "naïve" model. Comparing the forecasts with a simplified naïve model assuming no change in future trends provides a benchmark to evaluate and compare the error experienced by the forecast model (Keyfitz, 1977: pg. 230). It provides additional insight into the assumptions made both in the long and short term of the forecast period. Lastly, it contributes to the knowledge of the quality of base data used for the forecast.
The naïve model is created by assuming the annual growth rate for the total population or the crude rates for the individual components remained constant as of the base year or "jump-off" population for the forecasts. For example, annual growth rates for the forecasts produced from 1967 to 1990 in P25-381 are compared with the constant annual growth rate for 1966, the designated population base of that forecast. The naïve model for number of deaths, however, cannot be simply held constant, as this would not be representative of actual trends. The naïve numbers of deaths were recalculated for each series based on the associated forecast population and the constant crude death rate. The RMSE is also calculated for the naïve model to determine whether the assumptions made within the forecast performed better than simply forecasting a constant. Therefore, if the value of the forecast RMSE is smaller than the naïve RMSE, the forecast assumptions or forecast growth rates outperformed the naïve model.
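A minimal sketch of the naïve comparison follows. The function names are hypothetical, the crude death rate is assumed to be expressed per 1,000 population, and the numbers are invented for illustration.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error of paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def naive_constant_rate(base_rate, horizon):
    """Naive model: the base-year rate repeated for every forecast year."""
    return [base_rate] * horizon

def naive_deaths(forecast_population, base_crude_death_rate):
    """Naive deaths: forecast population times the constant base-year crude
    death rate (assumed here to be per 1,000 population)."""
    return [p * base_crude_death_rate / 1000.0 for p in forecast_population]

def forecast_beats_naive(forecast, naive, observed):
    """True when the forecast RMSE is smaller than the naive RMSE."""
    return rmse(forecast, observed) < rmse(naive, observed)

# Invented growth rates (percent) for a three-year forecast period.
observed = [1.0, 0.9, 0.8]
forecast = [1.0, 0.95, 0.85]
naive = naive_constant_rate(1.1, len(observed))
better = forecast_beats_naive(forecast, naive, observed)
```

The number of deaths is rebuilt from the forecast population rather than held constant because a fixed count would ignore population growth, which is the reason given in the text.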
6) Potential Biases Present Within the Estimate and Forecast Series
An accurate assessment of forecast error depends upon the characteristics and the quality of both the estimates and forecast series for the population and the components of change. Therefore, it is important to discuss discrepancies and irregularities found between and within data sources.
The postcensal national population estimates are derived from the most recent national census. This complete enumeration often contains error relating to such issues as underenumeration and data problems in the estimation of population change. Following the census, the postcensal estimates are adjusted for the error of closure. The 1980 census results determined that the 1970s population estimates underestimated the total population by approximately 5 million people in 1980. Consequently, the 1970s estimates were adjusted for the error of closure by adding approximately ½ million people, compounding each year. Therefore, the base populations used for the 1972, 1974, and 1976 series forecasts were off by the respective adjustments in the first forecast period year. For 1972, the forecast erred by 1 million, or .54 percent; for 1974, by 2.0 million, or .95 percent; and for 1976, by 3.0 million, or 1.39 percent. The forecast growth rates were compared with growth rates revised after the forecast production.
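The size of the error-of-closure adjustment for each base year can be checked with a short calculation. The sketch assumes a strictly linear build-up of 500,000 people per year after 1970, which is a simplification of the "approximately ½ million" described above; the function name is hypothetical.

```python
def closure_adjustment(year, census_year=1970, annual_increment=500_000):
    """Cumulative error-of-closure adjustment for a given estimate year.

    Assumes the adjustment accumulates linearly between the 1970 and 1980
    censuses; the actual adjustments were only approximately this size.
    """
    if not census_year <= year <= census_year + 10:
        raise ValueError("year must fall within the intercensal decade")
    return (year - census_year) * annual_increment

for base_year in (1972, 1974, 1976, 1980):
    print(base_year, closure_adjustment(base_year))
```

Under this assumption the adjustments for the 1972, 1974, and 1976 base years come to 1, 2, and 3 million, and the total reaches 5 million by 1980, matching the approximate figures in the text.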
Identification of a single middle series permits the comparison of error across products and the error experienced by each individual series. Therefore, in addition to analyzing the forecast error for each series, the error is calculated for the combination of series at specific points in the forecast period. Note that in Table 1 several products produced before 1974 failed to designate a specific middle series. Alternatively, four series were created based on differing assumptions ranging from lowest to highest values, which are not equidistant in value. In order to create a middle series for evaluation, we computed the average of the two series between the lowest and highest valued alternatives. This was done for the total population, births, and deaths, and is specified in Table 1, column 6. Among the products included in this research, eight products with base dates between 1953 and 1972 did not designate a middle series. Five series, produced between 1963 and 1972, are averaged for the components of change.
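The construction of a substitute middle series can be sketched as below. The function name and the sample values are hypothetical, and the sketch assumes the four alternative series are supplied in order from lowest to highest.

```python
def middle_of_four(series_low_to_high):
    """Average the two inner series of four alternatives, year by year.

    Assumes the input is ordered from the lowest to the highest alternative,
    so the second and third entries are the two 'between' series.
    """
    if len(series_low_to_high) != 4:
        raise ValueError("expected exactly four alternative series")
    _, lower_mid, upper_mid, _ = series_low_to_high
    if len(lower_mid) != len(upper_mid):
        raise ValueError("inner series must cover the same forecast years")
    return [(a + b) / 2.0 for a, b in zip(lower_mid, upper_mid)]

# Invented births (thousands) for two forecast years under four alternatives.
alternatives = [
    [3900, 3950],   # lowest
    [4000, 4100],
    [4200, 4300],
    [4600, 4800],   # highest
]
middle = middle_of_four(alternatives)
```

Because the alternatives are not equidistant, this average is not the midpoint of the lowest and highest series; it only stands in for a preferred series where none was designated.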
The universe for net immigration changed throughout the history of Census Bureau forecasts. For most of the products, net immigration referred to net civilian immigration with the Armed Forces Overseas (AFO) population as part of the base population. The Census Bureau changed the definition of net civilian immigrants to net migration to the U.S. and began treating the AFO as a separate universe by not including it within the base population. The national estimates and national forecasts used this methodology beginning in the 1990s (U.S. Census Bureau, 1993). Therefore, to maintain consistency, the AFO population was added to each total population estimate and forecast. For the total population forecasts, the AFO population in the base year was simply held constant throughout the forecast period.
Before the 1986 forecast series, the assumed number of immigrants for the national forecasts included neither undocumented immigrants nor emigrants from the U.S. Following the 1980 census, the national estimates included the number of net undocumented immigrants and emigrants (U.S. Census Bureau, 1990). As discussed later, undocumented immigration began to increase in the 1970s. Consequently, the observed number of immigrants net of emigration and the corresponding rates for the observed estimates from 1970 to 1979 were adjusted upward by 76,000 for each elapsed year after 1970, to include the movement of these groups. The forecast series produced before 1986 did not include these flows in their universe. Therefore, for this analysis, the series produced from 1963 to 1983 are compared with the adjusted observed number (and rates) of immigrants net of emigration, hereafter referred to simply as immigrants and the net immigration rate. In addition, the naïve model used the adjusted observed estimates to create forecasts. Consequently, Census Bureau net immigration forecasts for 1970, 1972, 1974, 1976, and 1982 are compared to a naïve model based on the adjusted observed data mentioned above.
Total Population Growth Rate Forecasts
The U.S. population tripled between 1900 and 1999 as the nation maintained growth rates ranging between approximately a high of 2.0 percent and a low of .6 of a percentage point, with current rates leveling off near .9 percentage points (U.S. Census Bureau, 1999). Graph 1 presents the annual growth rate for the total population from 1947 to 1999, the respective years analyzed for this research. Analyses of how well the Census Bureau forecast the nation's growth trends are first discussed for the multiple series followed by a discussion of each individual series. As mentioned earlier, accuracy assessment is approached from two perspectives: 1) in terms of overall error in the series; and 2) in terms of duration-specific forecast error. Overall error is analyzed for the direction of error (the tendency of the forecast growth rate to generally over- or underestimate the observed growth rate, which is measured using the PE and MPE) and the magnitude of error (which is measured with the MAPE and RMSE). The duration-specific forecast error analyzes the pattern of the error throughout the forecast. Lastly, a comparison of the naïve and the forecast model will be made using the RMSE results.
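A minimal sketch of the error measures named above may help fix ideas. It assumes the conventional definitions, with percent error taken as forecast minus observed, divided by observed, so that negative values indicate underestimation. The function names and example values are hypothetical and are not drawn from the tables.

```python
import math
from statistics import mean, median


def percent_errors(forecast, observed):
    """PE for each forecast year: positive = overestimate, negative = underestimate."""
    return [100.0 * (f - o) / o for f, o in zip(forecast, observed)]


def summary_errors(forecast, observed):
    """Direction (MPE) and magnitude (MAPE, MdAPE, RMSE) of forecast error."""
    pe = percent_errors(forecast, observed)
    ape = [abs(e) for e in pe]
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return {
        "MPE": mean(pe),                # signed errors can cancel out
        "MAPE": mean(ape),              # average size of error, sign ignored
        "MdAPE": median(ape),           # less sensitive to outliers than the MAPE
        "RMSE": math.sqrt(mean(squared)),  # in the units of the series itself
    }


# Hypothetical growth rates (percent per year).
stats = summary_errors(forecast=[1.0, 1.2, 0.8], observed=[1.0, 1.0, 1.0])
```

In this example the over- and underestimates cancel, giving an MPE near zero while the MAPE is about 13 percent, which illustrates why direction and magnitude are reported separately.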
Because previous authors have examined the historical performance of the forecast population growth rate, the following discussion will remain brief (Ascher, 1978; Stoto, 1983; Long, 1987). This research improves and extends existing research by: 1) evaluating forecasts that are more recent; 2) utilizing more recent national estimates and vital statistics data for the observed series; 3) comparing individual and multiple series results; 4) increasing the sample size for multiple series error statistics; and 5) calculating several statistics to compare results.
1) Overall Accuracy and Duration-Specific Forecast Error of the Population Growth Rate Forecasts
The multiple series and individual series statistics presented in Table 2 allow for an assessment of whether the total growth rate is generally over- or underestimated by the Census Bureau. As shown in the final column of row (1), the multiple series MPEs for the annual growth rate indicate that the Census Bureau generally underestimated growth rates within the first five years (MPE= -3.8 at the fifth year). In contrast, beyond the five-year period, on average the growth rates were overestimated, as indicated by positive MPEs.
Table 3 presents the percent error occurring at designated points of the forecast period (1st, 5th, 10th, 15th, and 20th years). The wide variations between the MPE, MAPE, and MdAPE (Table 2), and the wide range between individual PEs, within each of the four target forecast periods, indicate that potential outliers influence the multiple error statistics. The PEs range between -26.5 percent (1974) and 6.4 percent (1966) at the first year and between -48.6 percent (1947) and 29.2 percent (1963) at the fifth year (n=17). This implies that the multiple error statistics are not representative of the general performance for the growth rates forecast between 1947 and 1999. Within the more recent forecast publications, the Census Bureau includes multiple series RMSE results for the growth rate of the total population as a way of addressing the uncertainty of their forecasts (U.S. Census Bureau, 1996). The RMSE results question the validity of such multiple series growth rate statistics and underscore the need to examine individual series.
An evaluation of the statistics for the individual series reveals a more complex trend of over- and underestimation. Forecasts produced in 1955 and earlier consistently underestimated growth rates. This trend reversed for series produced between 1957 and 1972. Following 1972, the growth rate for each series is again underestimated. Of the seven forecast series produced between 1974 and 1994, three series resulted in small overestimates in the first five years (MPE=3.9, 1.9, and 7.6 percent respectively). Otherwise, within and beyond the five-year period, growth rates for those series were underestimated.
For series with base years between 1947 and 1957, the accuracy improved from series to series within the first five years. Series produced in 1947 and 1949 have the largest percent errors at the fifth and tenth year period, with five year MAPEs of 31.2 percent and 18.5 percent respectively (Table 2). Series produced in 1953, 1955, and 1957 improved in overall accuracy within ten years, averaging 11.5 percent for 1955 and 15.6 percent for 1953. Series 1957 experienced the lowest MAPE of 2.0 percent within the first five years for all series. The accuracy decreased for this series throughout the remainder of the forecast period.
Forecasts for 1963, 1966, 1969, and 1970 did not generally improve in accuracy over the 1953, 1955, and 1957 series in the first five years. The 1972 series showed an improvement, but then the 1974 and 1976 series showed more error. Series 1974 and 1976 increased in error within the first five years with MAPEs of 20.8 and 21.5 percent respectively, from the improved 1972 MAPE of 4.1 percent. The increase in error and the pattern of underestimation for the 1974 and 1976 series may be the result of the error of closure adjustment made to the intercensal estimates mentioned above. When not allowing for the error of closure, Long (1987) calculated lower RMSEs for 1974 and 1976. Within the first four years of the forecast period, Long (1987) calculated a RMSE of .09 percentage points for 1974 and .18 percentage points for 1976 (Table 1). In comparison, when accounting for the error of closure, Long obtained RMSEs similar to the results presented in Table 2 (Table 1A).
The accuracy of forecast growth rates improved after the 1970s within the first five years. The MAPEs ranged between a low of 2.5 percent (1991) and a high of 9.9 percent (1986). Forecasts produced in 1982, 1991, and 1994, for the first five years improve in accuracy with MAPE values below 4 percent. Although series 1986 and 1992 maintain higher five-year MAPEs of 9.9 and 7.6 percent than those produced after the 1970s, these series still maintain lower averages than most previous series.
An analysis of the percent error in Table 3 and the statistics in Table 2 reveals that the pattern of error throughout the forecast periods (the duration-specific forecast error) did not increase linearly for each series. To the contrary, certain series both under- and overestimate the growth rate throughout the period. In addition, the magnitude of error fluctuated throughout certain series. For example, the PE changes direction throughout the forecast period of twenty years for 11 of the 17 series. In addition, the error does not generally increase in size throughout the forecast period; i.e., as the growth rate is forecast for longer time intervals, the error does not generally increase. Both the percent error statistics and the average error statistics for the individual series demonstrate this trend. The MAPEs and MdAPEs for series 1953, 1974, and 1976, among others, both increase and decrease beyond the five-year period.
2) Comparison of Growth Rate Forecast Models
Table 2 shows the results for the naïve and Census Bureau forecast model RMSE. At the fifth year period, on average the naïve model outperformed the forecast model. The forecast model RMSE of .30 percentage points at the fifth year is larger than the naïve model RMSE of .18 percentage points, a difference of .12 percentage points (n=17). This trend changed throughout the average forecast period. Beyond five years, the disparity between models diminished and the performance of the naïve model deteriorated more than that of the forecast model. At ten years, the difference decreased by -.05 percentage points (n=13). At twenty years, the trend reversed and the RMSE for the naïve model increased to .46 percentage points compared with a smaller forecast RMSE of .43 percentage points (n=10).
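The comparison between the two models can be sketched as below. The sketch assumes that the naïve model simply holds the base-year value constant over the forecast period. All values are hypothetical growth rates in percent, and the function names are illustrative.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error, in the units of the series (here percentage points)."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def naive_forecast(base_value, horizon):
    """Naive model: the base-year value is assumed to persist unchanged."""
    return [base_value] * horizon


# Hypothetical five-year example: observed growth declines while the forecast rises.
observed = [1.0, 0.9, 0.9, 0.8, 0.8]
bureau = [1.1, 1.2, 1.2, 1.3, 1.3]
naive = naive_forecast(base_value=1.0, horizon=len(observed))

bureau_rmse = rmse(bureau, observed)
naive_rmse = rmse(naive, observed)
naive_wins = naive_rmse < bureau_rmse  # smaller RMSE means the better performer
```

A forecast that moves in the wrong direction after a turning point is penalized more heavily than one that assumes no change, which is the pattern described for several of the series discussed here.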
Individual series analysis indicates that the naïve model generally outperformed each forecast model, with the exception of 1955, 1957, and 1963, throughout most of the twenty-year forecast period. Within the first five-year period, the RMSE for the forecast model was smaller than or equal to that of the naïve model for 8 of the 17 series (47.1 percent). Of the 51 points compared for all series combined, the naïve model outperformed the forecast model 32 times (62.8 percent). Nonetheless, approximately half (51.0 percent) of the 51 comparison points maintain differences smaller than .10 of a percentage point.
Recent forecasts indicate an improvement in the Census Bureau forecast model over the naïve model for the short term (5 years). The forecast models for series 1982, 1991, 1992, and 1994 outperformed the naïve model within the first five years, with very small RMSEs ranging between .03 percentage points and .08 percentage points. Beyond five years, however, the RMSEs for the naïve model are smaller for series 1982 and 1986.
3) Summary of Forecast Error for Growth Rates
Except for the 1974 and 1976 series, the pattern of under- and overestimation and level of accuracy for the individual series are closely related to the Census Bureau's assumptions for fertility and will be discussed in detail in the following sections. The first two forecast series, 1947 and 1949, greatly underestimated the overall population growth rate as fertility rates began to rise in 1947, resulting in the Baby Boom. Short-term (five year) accuracy improved between 1953 and 1957 as growth rates remained at high levels resulting from high fertility rates. Following 1957, the growth rate began to decline, while the Census Bureau continued forecasting high growth rates. The forecast growth rates for the total population became more accurate within the recent past, with average error statistics (excluding the MPE) falling below 10 percent within the first five years for the past five series as population growth stabilized in the 1980s and 1990s. The average error generally increased after the five year forecast period; however, the direction and magnitude of error did not increase or decrease in a consistent manner. Because of large outlier error terms, the multiple forecast error statistics do not represent the actual error experienced overall for the Census Bureau's forecasts. In general, the naïve model outperformed the cohort component forecast, particularly in the latter half of the forecast period. Except for the 1957 series, the naïve model outperformed the forecast model for a minimum of one point in the measured forecast periods for each series. In contrast, recent cohort component forecasts consistently outperformed the naïve model in the first five years. The overall error remained high in comparison to a naïve model until the 1980s and 1990s.
Components of Change Forecasts
Fertility Forecasts Error Analysis
Throughout the first part of the 1900s, fertility rates in the United States declined until 1946 when rates increased dramatically. Graph 2 depicts the trends of the U.S. general fertility rate (births per 1,000 15- to 44-year old women) between 1943 and 1998. Following World War II, fertility rates among American women increased from 85.9 births per 1,000 women in childbearing age to 101.9 births between 1945 and 1946, representing an increase of 16.0 births (National Center for Health Statistics, 1993). Fertility rates remained unusually high, peaking at 122.7 births per 1,000 women in 1957. After 1957, rates declined until the mid 1960s. Referred to as the Baby Boom, this historic abnormality in U.S. fertility occurred between 1946 and 1964. Subsequent to the Baby Boom, except for small increases in the later part of the 1960s into the early 1990s, fertility remained stable. After 1973, fertility rates ranged between a low of 65.2 births in 1976 and a high of 70.9 births in 1990, which is a difference of 5.7 births.
Of the three components of population change, fertility assumptions are subject to the largest levels of uncertainty. When formulating fertility assumptions as inputs for the cohort component model, demographers must attempt to forecast the trends of American women by age and in the more recent past, by race and Hispanic origin. This encompasses anticipating changes in many variables that directly or indirectly affect fertility, such as contraceptive prevalence, marital status, and female labor force participation rates. Most importantly, demographers try to anticipate potential turning points and/or the stability of the current trends.
For series produced in 1963 to 1972, the Census Bureau formulated fertility assumptions using a cohort fertility methodology as opposed to building from estimates of period fertility. That is, series were formulated based on the completed fertility of cohorts of women in childbearing ages and further adjusted for timing patterns. Timing patterns were generally based on age-specific fertility rates from past years and the average age of childbearing.3 Assumptions pertaining to the expected level of completed fertility and timing patterns did not remain consistent across products. Estimates for the ultimate completed fertility rates were generally formulated using birth expectation data from different surveys and demographic theory, such as stable population theory and replacement level fertility (U.S. Census Bureau, 1970).4
Series produced in 1974, 1976, and 1982 continued the use of the cohort fertility model; however, timing patterns used previously were replaced with assumptions about short- and long-term fertility trends. These trends were also based on survey-generated birth expectations data as well as theory. Estimates used for the fertility assumptions for 1986 and 1991 continued to be based on the cohort fertility method while using Box-Jenkins time series methods to forecast short-term trends. The two most recent series, 1992 and 1994, switched to a period fertility methodology and assumed that the current age- and race-specific fertility rates remained constant throughout the forecast period.
To calculate the number of live births for a designated forecast period, age-specific birth rates were applied to the average number of women in childbearing ages. Once calculated, the births were survived forward to account for infant mortality. The number of births was summed for each calendar year. The crude birth rate is defined as the number of births per 1,000 people occurring within a calendar year.
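The calculation described above can be sketched as follows. Several details are assumptions made for illustration only: the average number of women is taken as the mean of the start-of-period and end-of-period populations, infant mortality is represented by a single survival ratio, and the age groups, rates, and function names are hypothetical.

```python
def forecast_births(rates_per_1000, women_start, women_end):
    """Apply age-specific birth rates to the average number of women in each age group."""
    births = 0.0
    for age_group, rate in rates_per_1000.items():
        average_women = (women_start[age_group] + women_end[age_group]) / 2
        births += rate / 1000 * average_women
    return births


def surviving_births(births, infant_survival_ratio):
    """Survive births forward to account for infant mortality (single ratio assumed)."""
    return births * infant_survival_ratio


def crude_birth_rate(births, total_population):
    """Births per 1,000 people in the calendar year."""
    return 1000 * births / total_population


# Hypothetical inputs for two age groups.
rates = {"20-24": 100.0, "25-29": 120.0}
women_start = {"20-24": 1000, "25-29": 2000}
women_end = {"20-24": 1200, "25-29": 2000}

births = forecast_births(rates, women_start, women_end)
survivors = surviving_births(births, infant_survival_ratio=0.99)
cbr = crude_birth_rate(births, total_population=25000)
```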
1) Overall Accuracy of Fertility Forecasts
According to the MPE for the multiple series, the Census Bureau consistently tended to overestimate the fertility of American women with the absolute level of error decreasing in the 1990s. Tables 4 and 5 show that multiple series MPEs for the number of births and the crude birth rate never fell below 12 percent. Within the twenty-year forecast period, the average error falls to percentages below the average error experienced within the first ten-year period. The MAPE for births increased from 13.9 percent within five years, to 28.3 percent and 29.4 percent at the tenth and fifteenth forecast period years, followed by a decline in the average error to 26.8 percent within 20 years. The average errors for crude birth rates are generally smaller than those experienced for the number of births.
Examination of the individual series forecasts for births and the crude rate reveals a consistent trend of overestimation until series 1982. Graph 3 displays the estimated or actual crude birth rates and the forecast crude rates for each series. According to the average statistics for the number of births (Table 4), the series produced from 1963 to 1972 greatly overestimated the number of births in comparison to later series. The series with the largest error during the first five years, 1970, experienced a MAPE of 29.0 percent. This error increased to 37.1 percent during the ten-year period and 39.4 percent within fifteen years. MAPEs for the remaining series (1963, 1966, 1969, and 1972) ranged between 12.5 and 17.6 percent during the first five years and 20.3 and 37.1 percent within ten years. The series for 1972, however, did not increase as rapidly with an average error remaining between 18 and 21 percent throughout the period. Series 1963 and 1966 experienced the largest MAPE statistics, 42.9 percent and 46.2 percent respectively, for long term forecast periods (15 and 20 years).
Table 6 shows that PEs for the first year of forecast births and rates for 1966, 1970, and 1972 were larger than those for other series. The first-year PEs of 10.7 percent for 1972 (CBR=11.3) and 8.6 percent for 1970 (CBR=9.1) indicate that these series began with inadequate base data. In addition, 1970 represents a turning point in fertility trends as the number of births declined from 1970 to 1973. Each forecast with base dates before 1974 failed to incorporate the decline and subsequent stability in fertility patterns seen throughout the early and mid-1970s.
After 1972, forecast error for the number of births decreased substantially from previous series, with continued improvement in the recent past. During the first five years, the MAPE for series produced after 1972 ranged between a low of .5 percent (1991) and a high of 8.3 percent (1986), and within ten years between 4.0 percent (1982) and 9.3 percent (1986). The lowest error was experienced throughout all periods by the 1991 and 1994 series. Within five years, series 1991 had a MAPE of .5 percent and series 1994 a MAPE of .9 percent.
2) Duration-Specific Forecast Error for Fertility
Graph 4 shows the multiple MAPEs for each component of population change for the twenty-year forecast period for each single year. This MAPE represents the average absolute error occurring on the specific year of the forecast period. Error for the number of births increased throughout the first 9 years and began to stabilize past 10 years. The average error for the crude birth rate stabilized and actually declined after ten years. This trend is attributable to specific series included with the later forecast periods and the actual trend of fertility. Specifically, series 1972, 1974, 1976, and 1982 first overestimated fertility. Later in their respective forecast periods, these series then underestimated fertility. The series underestimated fertility as the observed number of births increased in the 1980s. Therefore, because observed fertility trends increased during the 1980s and particular series forecast an eventual decline in the long term (with forecast periods falling within this time interval), the referenced series average error statistics decreased later in the forecast period. In contrast, the early series, 1963 to 1969, consistently overestimated fertility during a period of decline following the Baby Boom.
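The duration-specific multiple series MAPE described above can be sketched as below: for each year of the forecast period, the absolute percent errors of every series that reaches that duration are averaged. The percent errors shown are hypothetical, and the function name is illustrative.

```python
from collections import defaultdict


def duration_specific_mape(pe_by_series):
    """Average absolute PE across series at each year of the forecast period.

    `pe_by_series` maps a series base year to its list of percent errors,
    where position 0 is the first forecast year.
    """
    grouped = defaultdict(list)
    for percent_errors in pe_by_series.values():
        for duration, pe in enumerate(percent_errors, start=1):
            grouped[duration].append(abs(pe))
    return {duration: sum(v) / len(v) for duration, v in sorted(grouped.items())}


# Hypothetical PEs: later series have shorter observed forecast periods.
pe_by_series = {
    1972: [10.0, 12.0, -4.0],
    1976: [2.0, -6.0],
    1982: [-1.0],
}
mape_by_year = duration_specific_mape(pe_by_series)
```

Because fewer series contribute at longer durations, the set of series included in the average changes along the forecast period, which is one reason the averaged error can stabilize or decline.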
3) Comparison of Fertility Forecast Models
Analysis of the RMSE for the multiple series statistics indicates the naïve model forecast the number of births and the crude rate more accurately (Tables 4 and 5). In addition, the values for the naïve model RMSE remained at least 40 percent smaller for the number of births than the Census Bureau forecasts throughout the forecast period. During the first ten years, the multiple series RMSE for the forecasts was 1.2 million births (CBR RMSE=5.0), in comparison to 495.1 thousand births (CBR RMSE=3.0) for the naïve model. The large disparity continues throughout the twenty-year period, with the naïve RMSE remaining smaller than the average error experienced in the first five years of the Census Bureau forecast series.
Before the 1974 series, the naïve model outperformed each forecast series for births and the crude birth rate. The RMSEs for the naïve model never fell below 84.8 thousand for the number of births, maintaining high levels of error for each series. Within ten years, the naïve RMSE ranged between a low of 235 thousand births per year and a high of 604 thousand births. In reference to recent forecasts beginning in 1974, the forecast model outperformed the naïve model for the number of births. Of the 16 points measured throughout the periods of the remaining seven series following 1972, the forecast RMSE was smaller than the naïve RMSE at 11 points (68.8 percent) of the targeted forecast periods. The assumptions made for the 1976 series consistently outperformed the naïve model throughout the entire twenty-year period. A constant forecast of births or birth rates for the 1986 series, however, would have performed better. In contrast, the naïve model for the crude birth rate outperformed the Census Bureau forecast in general. Of the 16 points observed after 1972, the naïve model RMSEs for the crude rate were greater than the forecast RMSEs at only six points, compared with eleven points for the number of births.
4) Summary of Forecast Error for Fertility
The Census Bureau remained extremely optimistic about fertility trends remaining at levels experienced during the Baby Boom from 1963 to 1972, despite the continued decline experienced following the peak in 1957. Error decreased for series 1974 and 1976 because of two main factors. The 1974 series reduced the number of alternate series from four to three, resulting in one middle series with a lower completed fertility of 2.1, compared with an average of 2.5 and 2.1 for 1972. In addition, the number of births that actually occurred began to increase in the long-term forecast period. The 1976 series improved over the 1974 series by further reducing the short-term assumptions. In addition to a general improvement in the level of accuracy, the 1974 forecast began a trend of outperforming the naïve model of constant rates, with the exception of the 1986 model.
In contrast, the 1982 and 1986 series were conservative and resulted in underestimating births. Series 1982 continued the use of the cohort fertility approach, while the 1986 series used a Box-Jenkins time series model for short-term forecasts. The completed fertility level was further reduced to 1.9 for 1982 and 1.8 for 1986. Following the 1990 turning point, the number of births remained stable. Accuracy improved for series 1991, which continued the use of the time series model, increased the completed fertility to 2.1, and abandoned the racial convergence assumption, among other changes. This stability, combined with improved assumptions, permitted a more accurate forecast for those series produced within that decade. High levels of accuracy for short-term forecasts were duplicated for the 1994 series, which abandoned the cohort fertility method and assumed constant trends among the largest racial groups.5
The results of the comparison between forecast models differed for the number of births and the crude rate. The Census Bureau forecasts for the number of births were more accurate in the recent past. This is not necessarily true for the crude rate forecasts.
In summary, accuracy for the number of births improved in the recent past. Improved accuracy, however, does not seem to be explicitly determined by the different approaches toward deriving forecast assumptions (cohort vs. period) used to forecast short-term trends.
Mortality Forecasts Error Analysis
Mortality rates decreased consistently throughout the 20th century as life expectancy at birth increased from 47.3 years in 1900 to 77.0 in 1999, an increase of 29.7 years in approximately 100 years (Anderson, 1999; U.S. Census Bureau, 2000b). Graph 5 displays the observed and forecast crude death rates from 1964 to the present. Crude death rates generally decreased throughout the 1960s and 1970s, falling from 9.4 deaths per 1,000 people in 1964 to 8.6 deaths by 1977, a time span of 13 years. Following 1977, the rate remained stable, ranging between 8.5 and 8.8 deaths for 21 years. As rates stabilized or decreased, the base population continued to grow in size, resulting in an increase in the number of deaths. The number of deaths steadily increased from approximately 1.8 million in 1964 to 2.4 million in 1999. Graph 6 displays the observed number of deaths from 1964 to 1999. Between 1964 and 1983, the number of deaths increased from 1.8 to 2.0 million. Beyond 1983, the number of deaths increased to 2.4 million. These trends differ by age, sex, race, and Hispanic origin at the national level (Anderson, 1999). For the purposes of this research, only the forecast number of deaths and the crude death rate for the total population will be examined.
To forecast trends in mortality, age-specific death rates and survival rates are used as inputs to the cohort component model to survive the population forward. Rates are generally calculated by single year of age, sex, and more recently race and Hispanic origin. Mortality forecast assumptions formulated between 1963 and 1986 depended on life tables created by the Social Security Administration and were adapted to the needs of the Census Bureau. Before 1982, one set of rates was used as inputs for the model. Forecasts following 1976 produced a low, middle, and high mortality series. For series produced in 1991 forward, the Census Bureau used its own forecast life tables based primarily on the rate of mortality change experienced in previous decades.
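The step of surviving the population forward can be illustrated with a heavily simplified sketch. It collapses the model to age alone (the forecasts described here also distinguish sex and, more recently, race and Hispanic origin), treats the last age group as open-ended, and leaves births to be added separately. The function name and all values are hypothetical.

```python
def survive_forward(population_by_age, survival_rates):
    """Age a population by one year and return (next population, deaths).

    Index i holds the population aged i; the final index is an open-ended
    group, so its survivors remain in that group.
    """
    last = len(population_by_age) - 1
    next_population = [0.0] * len(population_by_age)
    deaths = 0.0
    for age, people in enumerate(population_by_age):
        survivors = people * survival_rates[age]
        deaths += people - survivors
        next_population[min(age + 1, last)] += survivors
    # Age 0 stays empty here: births enter through the fertility component.
    return next_population, deaths


# Hypothetical three-age-group population and survival rates.
next_population, deaths = survive_forward(
    population_by_age=[1000.0, 2000.0, 500.0],
    survival_rates=[0.99, 0.995, 0.9],
)
```

In this framing, a forecast that understates improvements in survival rates yields too many deaths, which corresponds to the overestimation of deaths reported for the earlier series.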
1) Overall Accuracy of Mortality Forecasts
Compared to births, deaths are not as numerous and exhibit less fluctuation over time. Therefore, mortality forecasts are subject to smaller numeric magnitudes than fertility and exhibit smaller summary error statistics. Tables 7 and 8 present the error statistics for the forecast number of deaths and the crude death rates. Multiple series error statistics for the number of deaths begin with a MAPE of 5.1 percent (CDR=5.6 percent) at the fifth year of the forecast period. At the twentieth year, the MAPE reaches its highest value of 12.2 percent (CDR=9.7 percent). On average, the error terms for the number of deaths and the crude rates increased throughout the forecast periods. Correspondingly, mortality trends forecast by the Census Bureau were generally too conservative and failed to adequately forecast improvements in life expectancy.
Similar to the results for the individual fertility series, the overall accuracy of the individual mortality series for the number of deaths and the crude rates improved dramatically in the recent past. Graph 6 displays the individual series forecast for deaths and the actual number of deaths. Forecasts produced in 1976 and earlier consistently overestimated deaths. Beginning in 1963, error terms generally increased within the first five years for each series, peaking at 1974 (with the exception of series 1972 and 1974 beyond the fifteen year forecast period). Within the first five years, series 1974 was inaccurate by 9.9 percent (for both the MPE and MAPE), up from 1.8 percent for series 1963. Table 9 displays the PEs for the number of deaths and the crude death rates. Again, series 1974 experienced the largest error term, with a PE of 8.2 percent at the first year for deaths and 9.1 percent for the crude rate.
Following series 1974, the level of accuracy improved. For the 1976 series, the MAPE for the number of deaths fell to 4.6 percent during the first five years, and it fell again to .91 percent for the 1982 series. Forecast deaths and crude rates produced after the 1976 series were consistently more accurate than previous series, except for 1992, which had a MAPE of 3.8 percent within the first five years. The MAPE within the first five years for series produced after 1982, excluding 1992, ranged between .9 percent and 1.3 percent. For series 1982 and 1986 with forecast periods beyond five years, the MAPE remained near 1.0 percent and 1.1 percent.
2) Duration-Specific Forecast Error for Mortality
Multiple series error statistics increased throughout the forecast period for both the numbers of deaths and the crude death rates. The crude rate, however, accumulated less error throughout the forecast period. (This can also be witnessed for individual series.) Graph 4 shows the multiple MAPEs for each component of population change for the twenty-year forecast periods by single year. The MAPE remains stable after ten years for both deaths and the crude rate. Within ten years, the crude rates demonstrated lower average error statistics, increasing the gap between the MAPEs for the number and the rate of deaths as the forecast periods lengthened.
The duration-specific forecast error for individual series deaths generally increased throughout the forecast period, with the exception of 1974 and 1986. In contrast, for crude rate forecasts with periods of fifteen years and longer, the average error declined at twenty years for series 1966 and 1969. Series 1974 and 1982 experienced smaller averages within fifteen years than within ten years, followed by an increase within 20 years for 1974.
3) Comparison of Mortality Forecast Models
A comparison of the multiple series forecast and naïve model RMSEs indicates that the naïve model outperformed the forecast series throughout the entire forecast period for both the number of deaths and the crude rates. The difference between the two models' RMSEs diminished within the twenty-year period for the forecast number of deaths and the crude rate, with the Census Bureau forecast outperforming the naïve model within 20 years for deaths. The multiple series forecast number of deaths RMSE of 265.5 thousand is smaller than the naïve RMSE of 278.9 thousand. In contrast, the naïve model multiple series RMSE for the crude rate outperformed the forecast series by .19 deaths per 1,000 people at twenty years.
For the individual series forecasts, the naïve model of a constant number of deaths and crude rates outperformed the forecast series for every series, with the exception of 1982, 1986, and 1991, and long-term forecasts for 1963 and 1966. Naïve models for series 1974, 1976, and 1986 produced RMSEs below 50 thousand deaths throughout the entire forecast period and were superior to the performance of Census Bureau forecasts. Within five years, the naïve model RMSE for 1976 averaged 19.6 thousand deaths, the lowest RMSE reported for deaths.
4) Summary of Forecast Error for Mortality
Beginning in 1963, the Census Bureau generally underestimated improvements in life expectancy. Particular forecasts produced after 1976, in contrast, slightly overestimated improvement. Forecasts produced between 1963 and 1974 gradually increased in error, highlighting a trend of the Census Bureau's historically conservative approach toward forecasting improvements in life expectancy. Recent forecasts experienced superior performance. This improvement in accuracy may be indicative of the stabilization of mortality trends beginning in the late 1970s. In addition, the Census Bureau began producing a middle series mortality assumption for the 1982 series, potentially further contributing to the overall level of mortality forecast accuracy. Similar to fertility, the error terms for the number of deaths are slightly larger throughout the forecast period than those for the crude rate as they are more dependent on the size of the forecast population. Multiple series forecast error generally increased throughout the forecast horizon, stabilizing after the 10th year of the forecast period. Lastly, except for three series, the naïve mortality models outperformed the Census Bureau forecasts. In comparison to fertility, the most recent forecasts, series 1992 and 1994, did not exhibit superior performance relative to the naïve model.
Net Immigration Forecasts Error Analysis
Net immigration for the United States is largely determined by domestic policy and the type of immigration occurring at any given point in history. For example, over 80 percent of the immigrants entering the U.S. in 1999 were attributable to family reunification policy or to refugee status (Kramer, 1999: pg. 2). In addition, the types of immigrants are controlled through bureaucratic and/or political means. During the 1970s, however, research found that the number of undocumented immigrants increased dramatically (Passel and Woodrow, 1987). This increase remains at levels researchers are unable to determine directly. The Census Bureau's current knowledge of net immigration is dependent on legal immigration data from the Immigration and Naturalization Service (INS). Given the limitation of data on the current level of net migration and the inability to predict domestic and international policy, forecasts of this component are especially problematic.
Consequently, the historical forecasts for net immigration have remained conservative. Except for the most recent release in 2000 and the 1986 series, net immigration was assumed to remain constant throughout the forecast period for each series. Graph 7 depicts the observed and forecast crude rate of net immigration for each series produced beginning in 1963. The forecast number of immigrants was applied each year as a constant number with a constant age and sex distribution. Recent products assumed separate distributions by age, sex, race, and Hispanic origin. Characteristics of the net immigrant populations experienced around the time of the base year generally represented the forecast distributions.
As a result of these complicating factors and those mentioned above in relation to emigration, undocumented immigration, and the change in the universe, serious limitations exist in evaluating the accuracy of net immigration forecasts. Nevertheless, it may still be profitable to examine these data at some level to further understand how they affect the results of the forecast and what they indicate about trends. Analysis of the immigration component for this report is conducted at a general level.
1) Overall Accuracy of Net Immigration Forecasts
The forecast number of immigrants and the net immigration rate are consistently underestimated in each forecast, and the magnitude of error for both variables is larger than for either of the components of population change discussed previously (Table 10). For multiple series error, the MPE for the number of immigrants is -21.0 percent at the fifth year of the forecast period, indicating underestimation (Table 11). The RMSE at five years is 189.2 thousand immigrants. The MPE increased to -36.5 percent at the tenth year and to -50.2 percent at twenty years. The MAPE statistics for the number of immigrants and for the rate correspond with the MPE statistics.
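The relationship among these statistics can be sketched in a few lines. The example below assumes the conventional definitions (percent error as forecast minus observed, divided by observed, so that negative values indicate underestimation); the function names and the input values are hypothetical and are not taken from the tables in this paper.

```python
import math


def percent_error(forecast, observed):
    """Percent error (PE); negative values mean the forecast was too low."""
    return (forecast - observed) / observed * 100.0


def summarize(forecasts, observations):
    """Return MPE, MAPE, and RMSE for paired forecast and observed values."""
    pairs = list(zip(forecasts, observations))
    pes = [percent_error(f, o) for f, o in pairs]
    n = len(pes)
    return {
        "MPE": sum(pes) / n,                    # signed: shows direction of bias
        "MAPE": sum(abs(p) for p in pes) / n,   # unsigned: shows size of error
        "RMSE": math.sqrt(sum((f - o) ** 2 for f, o in pairs) / n),
    }


# Hypothetical numbers of net immigrants (thousands): a constant forecast
# set against rising observed values.
forecast = [400.0, 400.0, 400.0, 400.0]
observed = [500.0, 500.0, 400.0, 800.0]

stats = summarize(forecast, observed)
print(stats)
```

When no individual error is positive, as in this hypothetical case, the MAPE equals the MPE with its sign removed. This would be one reason for the two statistics to correspond when a component is underestimated in every forecast.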
Among individual series forecasts, series 1976 demonstrated the worst overall accuracy and series 1966 the best. The recent series for 1991, 1992, and 1994 are more accurate within the first five years than past forecast series. Series 1992 had the smallest average error within the first five years, with a MAPE of 5.5 percent. The PEs for the first year of the forecast indicate that the base number of immigrants used to create the forecasts is often of poor quality. Table 10 displays the PE for both the crude rates and the number of immigrants. PEs for the number of immigrants range between -0.3 percent for the 1992 series and -24.0 percent for 1982. Of the twelve series in the first year, only five experience PEs smaller than 10 percent in magnitude.
2) Duration-Specific Forecast Error for Net Immigration
Because the number of immigrants increased between 1963 and 1999, the individual series forecasts of constant numbers and rates of immigrants resulted in increasing error throughout the forecast period. As previously stated, the multiple series MAPE began at over 20 percent at the fifth year (n=13) and increased to over 50 percent at the twentieth year (n=6). Graph 4 displays the MAPE by single year for each component. The MAPE for both the number and the rate is larger throughout the entire forecast period than the error for fertility and mortality. A large proportion of the error accumulated between the first and ninth years, increasing from approximately 10 percent to over 35 percent, a 25 percentage point increase. For individual series, the MAPE within twenty years ranged between a low of 21.9 percent for 1966 and a high of 41.8 percent for series 1976 (n=6).
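The way a multiple series statistic is assembled at each forecast year, and why n falls as the duration lengthens, can be sketched as follows. This is an illustration under assumptions, not the procedure used to produce the tables: the function name, the series labels, and the percent errors are all hypothetical.

```python
from collections import defaultdict


def mape_by_duration(series_errors):
    """Average absolute percent error across series at each forecast year.

    series_errors maps a series label to a list of percent errors, where
    the first entry is forecast year 1. A series that has not yet been
    observed for a given duration contributes nothing to that year, so n
    shrinks as the duration grows.
    Returns {forecast_year: (mape, n)}.
    """
    buckets = defaultdict(list)
    for pes in series_errors.values():
        for year, pe in enumerate(pes, start=1):
            buckets[year].append(abs(pe))
    return {year: (sum(v) / len(v), len(v)) for year, v in sorted(buckets.items())}


# Hypothetical percent errors for three series of different lengths.
errors = {
    "older series": [-10.0, -20.0, -30.0],
    "middle series": [-6.0, -12.0],
    "recent series": [2.0],
}

result = mape_by_duration(errors)
for year, (mape, n) in result.items():
    print(year, mape, n)
```

In this hypothetical case the later forecast years rest only on the older series, so a multiple series statistic at long durations reflects a smaller and older set of forecasts than the same statistic at five years.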
3) Comparison of Net Immigration Forecast Models
For multiple series error statistics, the naïve model outperformed the Census Bureau forecast model. At the tenth year of the forecast, the RMSE for the naïve model of 244.0 thousand immigrants was smaller than the Census Bureau RMSE of 321.8 thousand. Series 1974, 1991, 1992, and 1994 are the only forecasts that outperformed the naïve model (with the exception of 1970 within the first five years). For crude rates, only three series (1970, 1991, and 1992) outperformed the naïve model and only within the first five years. The naïve model is based on adjusted numbers for net undocumented immigrants and emigrants in the 1970s and afterward. Graph 8 displays the multiple series RMSE for both models for the forecast crude rate of net immigration. This offers a hypothetical representation of the RMSE for the Census Bureau forecasts if the base error were reduced and the adjustment for undocumented immigrants and emigrants were included. With the exception of the first three periods, the RMSE for the net immigration rate could be smaller, as indicated by the naïve model.
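Because most series also held net immigration constant, much of the difference between the two models lies in the base value. The sketch below illustrates that point under stated assumptions: all values are hypothetical, the function names are introduced here for illustration, and the "adjusted" base simply stands in for a base that includes undocumented immigration and emigration.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def constant_forecast(base_value, horizon):
    """Hold the base value constant for every year of the horizon."""
    return [base_value] * horizon


# Hypothetical observed net immigration (thousands), rising over time.
observed = [600.0, 650.0, 700.0, 750.0]

# Both forecasts are constant; they differ only in the assumed base.
published = constant_forecast(450.0, len(observed))  # unadjusted base (assumed)
naive = constant_forecast(550.0, len(observed))      # adjusted base (assumed)

published_rmse = rmse(published, observed)
naive_rmse = rmse(naive, observed)
print(round(published_rmse, 1), round(naive_rmse, 1))
```

In this hypothetical case both constant forecasts fall further behind as observed immigration rises, but the one started from the higher base carries a smaller error at every year, which mirrors the role the text assigns to base data quality.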
4) Summary of Forecast Error for Net Immigration
Given that actual net immigration increased throughout the period between 1963 and 1999, the forecast assumptions of constant trends resulted in consistent underestimation. Error terms increased throughout the forecast period and remained higher than those for the fertility and mortality forecasts. Because most of the series begin with large forecast error terms in the first year, the base data used may be contributing a large proportion of the error throughout the forecast period. Nonetheless, net immigration forecasts have improved in the recent past. This improvement is also evident when comparing the naïve and Census Bureau forecast models of net immigration. The naïve model consistently outperformed the Census Bureau forecast model, with the exception of the fifth year average for 1991, 1992, and 1994, for both the number of immigrants and the crude rate. In spite of this, the naïve results are not a dramatic improvement over the Census Bureau forecasts.
D. DISCUSSION OF RESULTS
This paper has evaluated the accuracy of population growth forecasts produced by the Census Bureau beginning with the 1947 series publication. To summarize the findings, the research questions asked previously are restated. First, how accurately did the Census Bureau forecast the total population and its respective components of change? In general, the forecasts produced by the Census Bureau overestimated total population growth. A detailed analysis of the components of population change, however, revealed a more complex pattern of over- and underestimation.
Erroneous assumptions about fertility following the Baby Boom era were largely responsible for a pattern of overestimation of the total population. Specifically, the growth rate forecast performance worsened for the series produced between 1957 and 1972. The number of births and the crude rate were severely overestimated between series 1963 and 1972, influencing the forecast growth rate. Before the 1957 series and following the 1972 series, annual growth rates were underestimated. Therefore, had the fertility component not been so severely overestimated, the forecast results might have been much more conservative and might have underestimated growth, as was the case before the 1957 series and after the 1972 series.
The mortality component of change generally contributes the least error to the forecast model in comparison to fertility and possibly net immigration. The MAPEs for both the number of deaths and the crude rates begin below 5 percent at the first year and never rise above 15 percent within the twenty-year period.
The assumptions of constant levels of net immigration consistently produced underestimated series, as the observed number of immigrants increased continually for over thirty years. Forecasts were further troubled by poor-quality base data.
Recent forecasts for series 1991, 1992, and 1994 show improvement in accuracy over previous series within the first five years. Series 1991 and 1994 forecasts for fertility and mortality maintain smaller average errors than previous forecasts, while the net immigration forecast errors are smaller for the 1991 and 1992 series. This improvement in accuracy may be indicative of the stabilization of the components of change of the total population. In addition, the level of detail has expanded as more race and Hispanic origin groups were added to the product, the terminal age of the population data rose, and the quality of input data improved.6
The duration-specific forecast error generally increased throughout the forecast period for both multiple series and individual series for the growth rate and the components of change. The magnitude by which the error increased differs for each component of population change. Net immigration consistently maintains the highest level of error throughout the multiple series statistics, followed by fertility and mortality. Fertility error increased rapidly within the first half of the average forecast period, then stabilized in the latter half. This stabilization of error is most likely the result of an eventual increase in the actual fertility of American women following a major decline, in conjunction with Census Bureau assumptions for long-term fertility trends. Mortality maintains the smallest error and, in contrast to the net immigration and fertility forecasts, remains stable past the tenth forecast year.
Secondly, did the forecasts for the population and the components of change produced by the Census Bureau perform more accurately than a naïve model assuming constant change? With the exception of the recent forecasts of 1991, 1992, and 1994, and the earlier series 1955, 1957, and 1963, the naïve models outperformed the Census Bureau forecasts for the growth rate and each component of population change. It is evident that the Census Bureau's inability to forecast turning points in trends greatly diminishes the accuracy of each forecast series.
The assumption of constancy for the naïve model outperformed the Census Bureau forecasts for series experiencing a change in trends. In contrast, once the population stabilized in the recent past or experienced minimal to moderate change before the Baby Boom, the Census Bureau forecasts generally outperformed the naïve model.
During the 1900s, knowledge of population trends and their future repercussions for the size and distribution of the population became increasingly important as the U.S. experienced major shifts in fertility and net immigration. Population forecasts produced by the Census Bureau are used widely, informing researchers, planners, legislators, and many others, on the future course of population change. Because forecasts are subject to inherent uncertainty, as they are based on a compilation of reasonable assumptions for the components of population change, it is essential to educate customers as to the amount of uncertainty within the forecasts for the population and the components of population change. Throughout the second half of the century, the forecasts produced by the Census Bureau improved in accuracy as a result of several factors including improvements in data quality and methodology. Nonetheless, this study reveals that forecasters failed to foresee turning points in population trends, resulting in erroneous forecasts, particularly for fertility and net immigration. In addition, with the exception of net immigration, the assumptions formulated by the Bureau were often outperformed by simple assumptions of constancy.
Recent forecasts produced in the 1990s minimize the inherent uncertainty and provide a reliable product for consumers. The forecast reliability is, in all likelihood, the result of the stabilization of the components of population change.
This research addresses the error experienced for general characteristics of the forecasts. Previous studies by Long (1987), Stoto (1983), and Ascher (1978) examined the accuracy of the Census Bureau population growth rate for individual series. The present study makes a contribution to this body of research by using a multi-pronged approach, combining the analysis of the individual error terms, individual series error, and multiple series error. In contrast to previous studies, this research evaluates and compares the accuracy results with multiple statistical tools, strengthening its validity. This study is unique in that it is the only one to systematically analyze the forecast error for the population growth rates in combination with the respective components of change for the U.S. Census Bureau. In addition, this research represents the only detailed accuracy analysis of the net immigration and mortality forecasts.
In order to reduce uncertainty for future products, further analysis is necessary to understand the uncertainty in forecasting specific characteristics of the population, such as the forecasts of the race and Hispanic origin distribution and the age-specific assumptions for the components of change. Correspondingly, a detailed analysis comparing the specific assumptions made between products and analysis of additional characteristics such as age- and race or Hispanic origin-specific assumptions may strengthen the understanding of the weakness in the chosen assumptions.
1 Throughout the text, "point" refers to a finite time interval within the forecast period.
2 For forecasts produced before 1986, the components of change were published for the mid-year population (July 1), whereas rates for products published in 1986 and afterward were calculated for calendar years. Therefore, for purposes of this research, the components of change and their respective rates were recalculated to represent the calendar year event.
3 Please refer to the original publication for further discussion of assumptions and methodology.
4 To collect birth expectation data, the Census Bureau used national survey data from the Growth of American Families Studies, the University of Michigan national sample surveys, and the Current Population Survey (U.S. Census Bureau, 1967, 1984).
5 Fertility among non-Hispanic White, non-Hispanic Black, and non-Hispanic American Indian women remained at constant levels, while rates for Hispanic and Asian women were assumed to decline.
6 Beginning with the 1991 series, the Census Bureau produced forecasts with greater detail for race and Hispanic origin groups. The vital statistics data and the estimates were used to forecast four race groups by Hispanic and non-Hispanic origin. In 1982, the age distribution of the forecast population was extended from 85 years and over to 100 years and over. Lastly, for the 1991 series, the detail for immigrants was expanded to five types of immigration to the U.S. (U.S. Census Bureau, 1984, 1992).