Evaluation and Replication of Ahmed and Robinson Estimates
As discussed above, Ahmed and Robinson applied a residual technique to create emigration estimates based on the 1980 and 1990 censuses. The residual technique is a general tool developed to compare an expected population at a specific point in time to an enumerated population at the same point in time in order to isolate a population of interest. In the case of emigration, this translates as applying a cohort-survival method to the foreign-born population enumerated in an earlier census and comparing that expected population to the corresponding population enumerated in a later census. Theoretically, the difference should be the population that emigrated between the two censuses.
In order to produce the 1990 emigration estimates, Ahmed and Robinson (p. 2) summarized the methodology with the following equations:
$$E_{1980-1990} = (P_{1980} - D_{1980-1990}) - P_{1990}$$
$$E_{1980-1990} = S_{1990} - P_{1990}$$
where:
$E_{1980-1990}$ is the estimated number of foreign born who emigrated between 1980 and 1990
$P_{1980}$ is the foreign-born population enumerated in the 1980 census
$D_{1980-1990}$ is the number of deaths experienced from 1980 to 1990 by the foreign-born population enumerated in 1980
$P_{1990}$ is the foreign-born population that entered before 1980 and was enumerated in 1990
$S_{1990}$ is the 1980 foreign-born population survived to 1990
The above methodology produces a stock number of emigrants and the emigration rates for the 10-year period (1980 to 1990) for the foreign-born population that entered before 1980. This method does not apply to the foreign-born population that entered between 1980 and 1990. Therefore, the emigration rates calculated for the before 1980 entrants, adjusted for the average length of stay and mortality, were applied to the foreign-born population enumerated in the 1990 census that entered between 1980 and 1990. A detailed discussion of the methodology is included below in the discussion of the replication efforts of Ahmed and Robinson's estimates.
To replicate the estimates produced by Ahmed and Robinson, we began by locating all existing papers, programs, and results found internally at the Census Bureau. This included obtaining original SAS programs written by Ahmed, input and output data, spreadsheets created by Ahmed, life tables tabulated by the Projections Branch, internal Census Bureau memorandums and notes, and working papers published in the Population Division working paper series. Once all the available information was gathered, we began replication of the estimates by following Ahmed and Robinson's methodology, piecing together programs and data sets. A majority of the replication efforts resulted in creating new programs.
Appendix A presents a detailed outline of the steps used to replicate the estimates based on SAS programs, both those existing prior to the replication efforts and those developed during the process. The main steps are listed below in order, with a reference to the detailed step number from Appendix A in parentheses. These steps do not represent the actual order of steps taken by Ahmed and Robinson.
- Preliminary emigration estimates for the population that entered before 1980. (Step 3)
- Emigration estimates for the population that entered before 1980 for selected countries. (Step 4)
- Emigration estimates for the population that entered between 1980 and 1990 for selected countries. (Step 6)
- Emigration estimates for the population that entered between 1980 and 1990 for non-selected countries. (Step 8)
- Emigration estimates for the population that entered before 1980 for non-selected countries. (Step 10)
- Final estimates of emigration for the 1980-1990 decade. (Step 11)
- Preliminary emigration estimates for population that entered before 1980
To begin, preliminary estimates of the emigrant population were created for the foreign-born population that entered before 1980. Data used to create the estimates were the 1980 census of the resident foreign-born population as of April 1, the 1990 census of the resident foreign-born population as of April 1, and 10-year survival rates calculated from 1990 life tables generated by the Population Projections Branch of the US Census Bureau. The 1980 census data were used to calculate the 1990 expected population, and the 1990 census provided the enumerated population against which the expected population was compared. Survival rates were used to age the 1980 population 10 years to create the expected 1990 population.
The 1980 census results were available from a summary file previously adjusted for unknown country of birth. Characteristics include sex, age collapsed to 14 categories (see Appendix B), country of birth by 40 country groups (see Appendix C), citizenship status, and period of entry collapsed to 1975-1980, 1970-1974, 1960-1969, and before 1960. The methodology used to impute unknown country of birth for this file was not determined. During the estimates procedure it was discovered that the 1980 file contained people placed in incorrect age groups with respect to their period of entry. In all likelihood, this error occurred in the census editing procedure when imputing unknown characteristics of respondents. These cases were removed from the data set.
Data for 1990 were generated from the 1990 census (Summary Tape File 3 - Sample Data) and recoded for estimates purposes. Characteristics of the foreign-born population include sex, age collapsed to five-year age groups, race and Hispanic origin, country of birth by 111 country groups (see Appendix C), citizenship status, and period of entry collapsed to 1987-1990, 1985-1986, 1982-1984, 1980-1981, 1975-1979, 1970-1974, 1960-1969, and before 1960. Originally, the 1990 country of birth coding contained an additional code for those enumerated as being born abroad, at sea, and not specified, but this population was proportionally distributed prior to the emigration processing.
In order to produce reliable estimates using a large enough data set to provide "stability of the rates" (p. 4), Ahmed re-classified both the 1980 and 1990 census country of birth codes from 40 country groups into categories of four country groups, each theorized to represent the following race and ethnic groups: Hispanic, non-Hispanic White, Black, and Asian and Pacific Islander. Table 3 presents the list of countries used to create each mutually exclusive race and Hispanic origin group. The classification for each country was based on the reporting of race and Hispanic origin and country of birth of the foreign-born population in the 1990 census. We were unable to replicate this task due to time constraints. Although the race and origin groups are actually country groups and are not the census reported race and origin values, they are referred to hereafter as race groups to prevent confusion.
Survival rates were needed to obtain the 1990 expected population. National resident population life tables for 1990, generated by the Population Projections Branch, were used to calculate the ten-year survival rates needed to survive the 1980 foreign-born population of before 1980 entrants. Life tables were available for four race groups (White; Black; American Indian, Eskimo, and Aleut; and Asian and Pacific Islander) by Hispanic and non-Hispanic origin, and for Hispanic origin only. To calculate survival rates by sex, age groups, and the four race groups based on country groups, Ahmed and Robinson used the non-Hispanic White, Hispanic only, Black, and Asian and Pacific Islander life tables. The Black and Asian and Pacific Islander life tables were generated for both the Hispanic and non-Hispanic population. See Ahmed and Robinson for survival rates (1994, p. 30).
Once the data sets were acquired and recoded, Ahmed and Robinson calculated a preliminary residual estimate of 1980-1990 emigrants for the population that entered before 1980. The expected foreign-born population for 1990 was calculated by surviving the 1980 enumerated foreign-born population by sex, age, race group, and period of entry. The difference between the expected 1990 population that entered before 1980 and the enumerated 1990 population for those entering before 1980 is the estimated number of emigrants who left between 1980 and 1990. Estimates were created by age, sex, country of birth groups (40), and period of entry. Emigrants for each of the 40 countries were maintained in the estimate process, irrespective of race group designation. See Ahmed and Robinson for the preliminary residual results (1994, p. 17).
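The preliminary residual calculation described above can be sketched in a few lines. This is a minimal illustration, not a reconstruction of the original SAS programs: the function names, the cell keys, and all numbers are hypothetical, and the sketch assumes the 1990 counts have already been re-keyed to the same cohort cells as the 1980 counts.

```python
# Minimal sketch of the residual method: E = S_1990 - P_1990 per cell.
# Cell keys (sex, age group in 1980) and all values are illustrative
# assumptions, not census figures.

def expected_population(pop_1980, survival_10yr):
    """Survive each 1980 cell forward ten years (S_1990)."""
    return {cell: pop_1980[cell] * survival_10yr[cell] for cell in pop_1980}


def residual_emigrants(pop_1980, survival_10yr, pop_1990):
    """Expected 1990 population minus enumerated 1990 population."""
    expected = expected_population(pop_1980, survival_10yr)
    # A cell missing from the 1990 data is treated as zero enumerated.
    return {cell: expected[cell] - pop_1990.get(cell, 0.0) for cell in expected}


pop_1980 = {("F", "25-34"): 1000.0, ("M", "25-34"): 800.0}
survival_10yr = {("F", "25-34"): 0.98, ("M", "25-34"): 0.97}
pop_1990 = {("F", "25-34"): 900.0, ("M", "25-34"): 790.0}

emigrants = residual_emigrants(pop_1980, survival_10yr, pop_1990)
for cell, value in emigrants.items():
    print(cell, round(value, 1))
```

In this made-up example the female cell yields a positive residual (980 expected against 900 enumerated), while the male cell yields a negative one (776 expected against 790 enumerated). A negative cell of this kind means more people were enumerated in 1990 than the surviving 1980 population allows.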
Ideally, the residual would provide the total number of emigrants for each country as a positive number. The preliminary residual, however, resulted in negative differences for the aggregate value for particular countries. After conducting background research into why particular countries obtained negative values, Ahmed and Robinson separated the countries into selected and non-selected countries. The non-selected countries were those countries that showed a statistically significant negative difference. We were unable to locate or replicate the statistical tests conducted for the negative differences. Non-selected countries were Mexico, El Salvador, Guatemala, Peru, Haiti, Jamaica, Trinidad and Tobago, and Other South and East Asia. In addition, the selected and non-selected countries were then separated into the corresponding race groups previously assigned.
In summary, the data set was divided into two separate groups, the non-selected and selected countries. Both country groups are available by the four race groups, age groups, sex, country of birth, and period of entry. None of the non-selected countries fell into the non-Hispanic White race group.
- Emigration estimates for the population that entered before 1980 for selected countries
After categorizing the data into the selected and non-selected countries, estimates were derived of the number of emigrants and the emigration rate for selected countries only by age, sex, race group, country of birth, and period of entry. Emigration estimates were derived for 1970-1979, 1960-1969, and before 1960. The 1990 expected population for before 1980 entrants was compared to the observed population enumerated in 1990 that entered before 1980. Negative differences were still obtained for countries within the age, sex, and race group distributions, but not for the aggregate number of emigrants. The negative cases were removed from the data set. Based on the steps listed above, we replicated Ahmed's results of the 1980-1990 number of emigrants and the emigration rate for before 1980 entrants for selected countries by age, sex, and race.
- Emigration estimates for the population that entered between 1980 and 1990 for selected countries
Next, the number of emigrants and the emigration rate for selected countries were calculated for the population that entered between 1980 and 1990. Estimates for the 1980-1990 period of entry were not produced using the residual methodology, as was done for before 1980 entrants. Instead, an emigration rate for 1980-1990 entrants was estimated and applied to the 1990 enumerated population that entered within the same period.
To begin, the foreign-born population enumerated in the 1990 census that entered between 1980 and 1990 by age, sex, and country of birth was isolated. The race groups were coded and the non-selected countries were removed. The period of entry for the 1990 census was collected in four categories (1987-1990, 1985-1986, 1982-1984, and 1980-1981). The following steps were taken to produce the 1980-1990 selected country emigration estimates:
- The proportions of the 1980-1990 foreign-born entrant population enumerated in the 1990 census were calculated by age, sex, period of entry, and race.
- The average length of stay for 1980-1990 was determined for each age group. The proportions of foreign born in each period of entry and age group by sex and race were multiplied by the average potential number of years an immigrant could reside in the U.S. based on period of entry. The proportions for 1987-1990 entrants were multiplied by 1.63 years, 1985-1986 by 4.25 years, 1982-1984 by 6.75 years, and 1980-1981 by 9.25 years. Ahmed and Robinson define the average length of stay as, "... the weighted average of the duration to April 1, 1990 from the middle points of the years 1980-1981, 1982-1984, 1985-1986, and 1987-1990 (p. 8)."
- The mid-point of age in 1990 was determined by calculating the middle age value within an age group. For example, the middle age value for the 0-4 year age group (i.e., 0 to exact age 5) is 2.5 years.
- Average age at time of entry was then calculated by subtracting the average length of stay from the mid-point of age in 1990.
- To calculate 1980-1990 entrant emigration rates, the 1970-1979 entrant emigration rates were used as a base. Taken from Step 2 for selected countries, the 1970-1979 rate's age distribution was expanded from 13 to 15 categories to match the 1990 age groups and the ten-year rates were annualized.
- The 1970-1979 entrant emigration rates were multiplied by the average length of stay to weight the rates for the number of years migrants are able to emigrate.
- Survival rates were subtracted from the adjusted 1970-1979 entrant emigration rates to account for mortality. Rates were estimated by sex, age, and race using the 1990 life tables referenced above, based on the age at the time of entry and the length of stay.
- Estimates of the theoretical number of 1980-1990 entrants were derived by dividing the enumerated 1990 foreign-born population that entered between 1980 and 1990 by the adjusted 1970-1979 emigration rates to include the population subject to mortality by age, sex, and race.
- The theoretical number of 1980-1990 entrants was multiplied by the adjusted 1970-1979 emigration rate. This step produced estimates of the number of emigrants for selected country 1980-1990 entrants by age, sex, race group, and country of birth.
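The first four steps in the list above (entry-period proportions, average length of stay, mid-point of age, and average age at entry) can be sketched as follows. The year multipliers are the ones quoted in the text; the function names and the entrant counts are hypothetical. The later rate adjustments are deliberately left out, because the text does not give explicit formulas for them.

```python
# Sketch of the length-of-stay and age-at-entry steps for one
# age/sex/race cell. Counts are illustrative assumptions.

# Average potential years of U.S. residence by period of entry (from the text).
YEARS_BY_ENTRY_PERIOD = {
    "1987-1990": 1.63,
    "1985-1986": 4.25,
    "1982-1984": 6.75,
    "1980-1981": 9.25,
}


def average_length_of_stay(counts_by_period):
    """Weight each period's years by its share of the cell's 1980-1990 entrants."""
    total = sum(counts_by_period.values())
    return sum(
        YEARS_BY_ENTRY_PERIOD[period] * count / total
        for period, count in counts_by_period.items()
    )


def age_group_midpoint(lower, upper_inclusive):
    """Mid-point of an age group; 0-4 spans age 0 to exact age 5, giving 2.5."""
    return (lower + upper_inclusive + 1) / 2


def average_age_at_entry(lower, upper_inclusive, counts_by_period):
    """Mid-point of age in 1990 minus the average length of stay."""
    return age_group_midpoint(lower, upper_inclusive) - average_length_of_stay(
        counts_by_period
    )


# Hypothetical 1990 counts of 1980-1990 entrants aged 25-29 in one cell.
counts = {"1987-1990": 400, "1985-1986": 300, "1982-1984": 200, "1980-1981": 100}
stay = average_length_of_stay(counts)
entry_age = average_age_at_entry(25, 29, counts)
print(round(stay, 3), round(entry_age, 3))
```

With these made-up counts, the weights are 0.4, 0.3, 0.2, and 0.1, so the average length of stay is about 4.2 years and the average age at entry for the 25-29 group is about 23.3 years.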
- Emigration estimates for the population that entered between 1980 and 1990 for non-selected countries
Emigration estimates for the selected countries for before 1980 and 1980-1990 entrants were produced in Step 2 and Step 3. In the next two steps, non-selected country emigrants and the emigration rates were calculated separately for both the before-1980 and 1980-1990 entrants.
As described above, non-selected countries were coded as the countries with a statistically significant negative number of emigrants calculated in Step 1. Based on research presented in their paper, Ahmed and Robinson assumed that these countries were likely to have lower rates of emigration than the selected countries. Various percentages were tested before arriving at the conclusion that halving the emigration rate from the selected countries would be the most reasonable assumption. Non-selected 1980-1990 entrant emigration rates were calculated by multiplying the adjusted 1970-1979 emigration rates, calculated in Step 3, by 0.5.
Once the 1980-1990 non-selected emigration rates were calculated, the steps taken in Step 3 were duplicated for the non-selected countries. The 1990 ten-year survival rates by sex, age, and race were subtracted from the non-selected 1980-1990 emigration rates to account for mortality. The theoretical number of 1980-1990 entrants was estimated by dividing the enumerated 1990 foreign-born population by the 1980-1990 non-selected emigration rates adjusted for mortality.
- Emigration estimates for the population that entered before 1980 for non-selected countries
To estimate the number of emigrants for those who entered before 1980 for non-selected countries, the same steps were taken as in Step 2, except that different emigration rates were used. The emigration rates produced in Step 2 for selected countries were halved and applied to the non-selected countries by age, sex, race groups, and period of entry (1970-1979, 1960-1969, before 1960).
- Final estimates of emigration for the 1980-1990 decade
Estimates for the before 1980 and 1980-1990 entrants for selected and non-selected countries were aggregated, maintaining the age, sex, race, period of entry, and selected/non-selected country characteristics.
Evaluation of Oosse Estimates
Time limitations permitted only a detailed analysis of the estimates produced by Oosse, as opposed to a complete validation effort. Therefore, only the methodology used to create the estimates will be discussed. Oosse used Quattro Pro spreadsheets to create the final 1980-1990 emigration rates.
As mentioned above, Oosse's estimates were created to update the emigration assumption for the processing of the national estimates and the demographic analysis estimates. Historically, a constant number of emigrants was used as input into the estimated net international migration component. Emigration rates for the foreign-born population allowed demographers to vary emigration estimates as the size and composition of the foreign-born population changed from year to year. In addition, the new estimates were expanded to single year of age from 0 to 115 years of age and 14 country groups. The following are the general steps taken to produce the revised set of estimates.
- Produced new emigration estimates for 1980 to 1990 based on Ahmed and Robinson's estimates.
In order to produce updated estimates based on the Ahmed and Robinson estimates, we collected the 1980 and 1990 census foreign-born results, survival rates calculated by Ahmed and Robinson, and emigration data supplied by Ahmed.
- (a) To begin, Oosse calculated the expected 1990 foreign-born population by applying 10-year survival rates to the 1980 foreign-born population by age, sex, country of birth (40), and period of entry. The population was aggregated by country of birth and sex for analysis purposes.
- (b) Oosse then calculated an average 1980 foreign-born population by age, sex, period of entry, and country of birth by averaging the 1980 census of the foreign born and the expected 1990 foreign-born population calculated in Step (a) above.
- (c) Emigration rates estimated by Ahmed and Robinson for the pre-1980 periods of entry were applied to the average 1980 foreign-born population, resulting in estimates of the average annual number of emigrants for the before 1980 entrants by age, sex, country of birth (40), period of entry, and selected/non-selected country status.
- (d) After calculating the average annual before 1980 entrant emigration estimates, Oosse calculated the annual number of emigrants for the 1980-1990 period of entry. The 1990 census file was recoded for unknown country of birth for 1980-1990 entrants using a proportional distribution based on the country of birth distribution reported by Ahmed and Robinson. The adjusted 1980-1990 entrant emigration rate estimates produced by Ahmed and Robinson were applied to the 1980-1990 enumerated foreign-born population. These calculations resulted in the 1980-1990 annual number of emigrants by age, sex, and country of birth (40) for 1980-1990 entrants.
- (e) A new estimate of the 1980-1990 annual number of emigrants was calculated by summing across each period of entry by age, sex, country of birth, and selected/non-selected country status.
- (f) Oosse compared the new annual emigration estimates to Ahmed and Robinson's final estimates. Based on this comparison, Oosse calculated sex and country of birth-specific rake factors that were applied to the new annual emigrant estimates to replicate the distribution across the 40 countries of birth originally estimated by Ahmed and Robinson. These calculations produced an estimate of the annual number of emigrants for 1980 to 1990 by age, sex, and country of birth.
- (g) The average 1985 foreign-born population was calculated by averaging the 1980 and 1990 enumerated foreign-born population by age, sex, and country of birth. The theoretical number of unauthorized migrants in the base population was removed by applying a proportion of the unauthorized population to each country. Adjustments were made to 13 countries. Table 5 presents a list of the 13 countries.
- (h) 1980 to 1990 annual emigration rates were calculated by dividing the estimated annual number of emigrants for 1980 to 1990 calculated in Step (f) by the average 1985 foreign-born population calculated in Step (g). The rates were calculated by sex, age groups, and country of birth (40). Rates for Cuba were assumed to be zero.
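The raking and rate calculations in the last steps of the list above can be sketched as follows. This is an illustration under stated assumptions, not Oosse's spreadsheet logic: the function names, the country label "X", and every number are hypothetical. The rake factor is assumed to be the ratio of the target total to the new total within each sex and country of birth, and the unauthorized-migrant adjustment and the zero rate for Cuba are omitted.

```python
# Sketch: rake new emigrant estimates to target totals by (sex, country),
# then divide by an average mid-decade population to obtain annual rates.
# Keys are (age group, sex, country); all values are illustrative.

def totals_by_sex_country(emigrants):
    """Sum emigrants over age within each (sex, country)."""
    totals = {}
    for (age, sex, country), count in emigrants.items():
        totals[(sex, country)] = totals.get((sex, country), 0.0) + count
    return totals


def rake_factors(new_totals, target_totals):
    """Assumed definition: target total divided by new total."""
    return {key: target_totals[key] / new_totals[key] for key in new_totals}


def apply_rake(emigrants, factors):
    """Scale every age cell by its (sex, country) factor."""
    return {
        (age, sex, country): count * factors[(sex, country)]
        for (age, sex, country), count in emigrants.items()
    }


def average_population(pop_1980, pop_1990):
    """Simple average of the two enumerations, cell by cell."""
    return {cell: (pop_1980[cell] + pop_1990[cell]) / 2 for cell in pop_1980}


def annual_rates(emigrants, avg_pop):
    """Annual emigrants divided by the average population."""
    return {cell: emigrants[cell] / avg_pop[cell] for cell in emigrants}


new_emigrants = {("20-24", "F", "X"): 30.0, ("25-29", "F", "X"): 70.0}
target_totals = {("F", "X"): 120.0}  # hypothetical target total to match

factors = rake_factors(totals_by_sex_country(new_emigrants), target_totals)
raked = apply_rake(new_emigrants, factors)

pop_1980 = {("20-24", "F", "X"): 1000.0, ("25-29", "F", "X"): 2000.0}
pop_1990 = {("20-24", "F", "X"): 1400.0, ("25-29", "F", "X"): 2200.0}
rates = annual_rates(raked, average_population(pop_1980, pop_1990))
print(rates)
```

Raking by a single factor per sex and country preserves the age pattern of the new estimates while forcing their totals to agree with the target distribution.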
- Created estimates of the annual number of emigrants for 1980 to 1990 and the respective rates in greater demographic detail.
Application of emigration estimates to the net international migration component required the adjustment of the 1980-1990 emigration estimates. The age distributions for the estimated number of emigrants and the average 1985 population were expanded to single year of age from 0 to 115 years and the country of birth groups collapsed from 40 to 14 groups. The age distribution was disaggregated by proportionally distributing the total number within each age group for both the five-year and ten-year groups. For emigrants and the 1985 average population in the 75 years and older category, the number was expanded to single year of age based on proportions in a 1989-1991 stable population life table published by NCHS.
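The expansion to single years of age can be sketched as a proportional split of each group total. The text does not state which proportions were used within the five-year and ten-year groups, so the equal-share default below is an assumption. The function name and the weights shown for the open-ended group are hypothetical placeholders for life table proportions.

```python
# Sketch: distribute an age-group total across single years of age.
# Equal shares are an assumed default; weights stand in for
# life-table-based proportions in the open-ended 75+ group.

def split_age_group(total, ages, weights=None):
    """Return {age: share of total}, proportional to weights."""
    ages = list(ages)
    if weights is None:
        weights = [1.0] * len(ages)  # assumed: equal share per single year
    scale = sum(weights)
    return {age: total * w / scale for age, w in zip(ages, weights)}


# Five-year group 20-24 with an illustrative total of 500.
even_split = split_age_group(500.0, range(20, 25))

# Truncated illustration of an older group with made-up declining weights.
weighted_split = split_age_group(60.0, [75, 76, 77], weights=[3.0, 2.0, 1.0])
print(even_split, weighted_split)
```

Because the shares are normalized by the sum of the weights, the single-year values always add back to the original group total.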
For Demographic Analysis-Population Estimates (DAPE) research project replication purposes, we maintained the original 40 countries of birth distribution to allocate race and Hispanic origin characteristics. The imputations were created based on the results of the 1990 census. In conclusion, the estimated number of annual emigrants for 1980-1990, the respective emigration rates, and the average 1985 population by sex, single year of age (0-115), race, Hispanic origin, and country of birth were distributed internally to be used in the production of estimates by other DAPE task teams.
Creation of New 1990-2000 Foreign-born Emigration Estimates
In addition to replicating and analyzing the existing foreign-born emigration estimates, Task Team 6 endeavored to create new estimates for the 1990 to 2000 decade. Following the guidelines set forth by the DAPE Core Team and recognizing the existing time limitations, we adopted the methodology used by Ahmed and Robinson and attempted to produce 1990-2000 foreign-born emigration estimates by single year of age, sex, race, Hispanic origin, country of birth, and period of entry.
The methodology used previously was altered because additional data and research to support alternative methodological assumptions were available. Similar to Ahmed and Robinson, we began with the 1990 and 2000 enumerated foreign-born population from the census sample files. Survival rates were generated using 1999 life tables generated by the Population Projections Branch of the US Census Bureau.
The census population universe is the total resident population, which includes both the unauthorized and temporary migrant populations. Because we are estimating the emigration of the legal permanent resident foreign-born population only, the presence of these populations in the universe creates biases in the estimates. In contrast to the 1980-1990 methodology, we were able to isolate the temporary migrant population present in the population universe using the temporary migrant estimates produced by Task Team 8 for 1990 and 2000. Unauthorized migration estimates for 1990 and 2000, however, were created as part of the emigration estimates production process based on existing research (Passel 1999).
Estimates produced by Ahmed and Robinson used four race groups and applied race-Hispanic origin-based survival rates. For the 1990-2000 estimates, a race variable based on country of birth was not created. The 1990 Census race and Hispanic origin variables were used, which were proportionally distributed for Census 2000 based on reported age, sex, and country of birth in both the 1990 and 2000 censuses. Survival rates were calculated from life tables of the total population by sex and single year of age.
Once the unauthorized and temporary migrant populations were removed from the 1990 and 2000 foreign-born populations, we generated a preliminary estimate of the before 1990 entrant emigrants, as Ahmed and Robinson did in Step 1 (above) for the before-1980 entrants. New estimates were generated by single year of age with no race detail.
Preliminary residual estimates produced several countries with negative results. The negative results may originate from overestimating mortality, erroneous assumptions about the unauthorized or temporary migrant population, and changes in census coverage of the foreign-born population between censuses. In all likelihood, the enumeration of the foreign-born population in 2000 improved over the 1990 census. Therefore, the foreign-born population for those who entered before 1990 could be larger in 2000 than is theoretically possible based on the results of the 1990 census. In addition, respondents may provide erroneous responses to the period of entry on the census.
Time limitations prevented the completion of 1990-2000 based estimates. Based on the presence of large negative values for several countries, it is necessary to revisit the original methodology and assumptions used to create the estimates.