Annual Updates to the Data and Methodology Used to Produce Population Estimates

Each year, the population of the United States grows and evolves. Births, deaths and migration are the three components of population change. A myriad of factors can affect the magnitude and speed of the changes.

For example, economic uncertainty can affect attitudes toward fertility and mobility, a global pandemic can cause mortality to spike, and policy changes can shape the number of people moving in or out of the country.

Through it all, staff in the U.S. Census Bureau’s Population Estimates Program (PEP) produce annual estimates of the population that reflect year-to-year changes. We ensure the estimates account for the various factors influencing the population by incorporating the most current data and refined methods.

In this blog, we take a deeper dive into the process for developing the annual estimates. Before getting into the specifics, we cover some basics that provide essential context. 

Population Estimates 101

As noted, each year, PEP produces population and housing unit estimates that reflect demographic and housing change since the date of the most recent census. Each set of estimates features a time series of data from that census date to July 1 of the current year and is known as a “vintage.”

With the release of each new vintage, the full time series is revised to include the latest data and method updates, so the most recent vintage supersedes the prior vintages.

The population estimates are not produced for a calendar year but for an “estimates year,” which is the period from July 1 to June 30. In many of our products, we report the population on July 1, which is the midpoint of the calendar year. The midpoint is important for anything we do by age — such as applying migration rates — because at that point, approximately half the population has had its birthday.
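
As a small illustration of the July 1 to June 30 period described above, the following Python sketch finds the estimates year that contains a given date. The function name `estimates_year_bounds` is hypothetical and is not part of any Census Bureau tool.

```python
from datetime import date


def estimates_year_bounds(d: date) -> tuple[date, date]:
    """Return the (July 1, June 30) period that contains the date d."""
    # Dates from July 1 onward fall in the period that starts that same year;
    # earlier dates fall in the period that started the previous July 1.
    start_year = d.year if d.month >= 7 else d.year - 1
    return date(start_year, 7, 1), date(start_year + 1, 6, 30)


# A date in March 2025 belongs to the period that began on July 1, 2024.
start, end = estimates_year_bounds(date(2025, 3, 15))
print(start, end)
```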

For the nation, states and counties, we produce:

  • Population estimates by age, sex, race and Hispanic origin.
  • The components of change — births, deaths and migration.
  • Housing unit estimates.

We also produce population totals for cities and towns. Any of the estimates produced for counties can be aggregated into metropolitan and micropolitan areas, which are built from one or more counties.
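
To show what aggregating counties into larger areas can look like, here is a minimal Python sketch. The county names, area names and population values are made-up placeholders, and the helper `aggregate_to_areas` is hypothetical. It only demonstrates summing county totals by the area each county is assigned to.

```python
from collections import defaultdict


def aggregate_to_areas(county_pop, county_to_area):
    """Sum county population estimates into the areas they belong to."""
    totals = defaultdict(int)
    for county, pop in county_pop.items():
        area = county_to_area.get(county)
        if area is not None:  # counties not assigned to a listed area are skipped
            totals[area] += pop
    return dict(totals)


# Placeholder inputs for illustration only (not real estimates).
county_pop = {"County A": 120_000, "County B": 45_000, "County C": 8_000}
county_to_area = {
    "County A": "Metro X",
    "County B": "Metro X",
    "County C": "Micro Y",
}

area_totals = aggregate_to_areas(county_pop, county_to_area)
print(area_totals)
```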

National, state and county population estimates are calculated using the cohort-component method. This approach starts with a base population and adjusts it by adding births, subtracting deaths, and accounting for net migration (the difference between people moving in and out of an area). The base population for the current estimates is the U.S. population on April 1, 2020 (the reference day for the census), and much of the data for the base comes from the 2020 Census.
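
The arithmetic described above can be sketched in a few lines of Python. This is a simplified illustration on population totals only. The actual cohort-component method also works by characteristics such as age, which this sketch leaves out. The class and function names are hypothetical, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Components:
    """Components of change for one period (illustrative structure)."""
    births: int
    deaths: int
    in_migration: int
    out_migration: int

    @property
    def net_migration(self) -> int:
        # Net migration is the difference between people moving in and out.
        return self.in_migration - self.out_migration


def roll_forward(base_population: int, components: list[Components]) -> list[int]:
    """Start from the base and apply each period's components in order."""
    series = [base_population]
    for c in components:
        # Add births, subtract deaths, and account for net migration.
        series.append(series[-1] + c.births - c.deaths + c.net_migration)
    return series


# Made-up values: a base of 1,000 people followed by two periods of change.
series = roll_forward(
    1_000,
    [Components(12, 9, 30, 25), Components(11, 10, 20, 28)],
)
print(series)
```

Because every period builds on the one before it, a revision to the base or to any earlier period's components changes all later values in the series.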

Population Estimates Life Cycle

To produce the population estimates each year, we adhere to what we call the population estimates life cycle (Figure 1). These are the main steps in the life cycle:

  • Research: Exploring new datasets or pursuing improvements to the current methodology. Not all research can be completed in one cycle, and big research projects often take more than one vintage to fully implement.
  • Simulation: Testing new data or methodologies in the estimates before we implement a change.
  • Review: Carefully ensuring that all estimates are as accurate as possible, consistent and demographically plausible. We review at each stage of the population estimates life cycle.
  • Production: Processing of the population estimates. Production starts in the fall after we receive the input data needed for the new vintage.
  • Dissemination: Publishing the estimates. Each vintage of estimates is released on a rolling basis usually from December through June. Typically, data tables are accompanied by a news release and a combination of blogs, data visualizations and datasets loaded into the Census Bureau's application programming interface (API).

Importantly, we are always seeking to improve the population estimates; therefore, in theory, each aspect of the life cycle could be improved from one vintage to another. However, not all improvements can be fully developed in a single cycle. In those cases, we strive to implement continual, incremental enhancements from year to year. 

Although the number and impact of improvements vary from vintage to vintage, most updates fall within two categories:

  • Base population updates.
  • Revisions to the components of change.

In both cases, this can include new or revised input data or changes to the methodology. When we make changes to the methodology, we typically use the new methods for the full time series, which means that the estimates for prior years could change. 

There can also be changes to how we bring the base and the components together to produce the estimates, but this type of change is less frequent.

Base Population Updates

The base population is the starting point for the estimates and is typically developed from the most recent census, featuring only minimal updates to data or methodology over the course of the decade.

Due to post-pandemic challenges affecting the availability of the 2020 Census files, we adapted by developing a base population that integrates 2020 Census data, Vintage 2020 population estimates, and the results of the 2020 Demographic Analysis. This is known as the “blended base.”

This innovative approach introduced the need for annual research into the inputs and mechanics of the base population, particularly as more detail from the 2020 Census became available to us. Furthermore, it ushered in a new era in which method updates to the base population are possible when there is strong evidence to suggest an improvement could, and should, be made.  

Earlier in 2025, we released the 2020 Modified Age and Race Census (MARC) file, which features 2020 Census data on race that have been reassigned to the categories we use in PEP. This file gives us the flexibility to phase out the blended base in favor of a base population drawn more heavily from the 2020 Census results.

Incorporating the 2020 MARC file will have a big impact on the racial characteristics of the forthcoming Vintage 2025 estimates (the estimates by demographic detail are currently scheduled for release in June 2026) because this will be the first time that we are using race data from the 2020 Census in the population estimates.  

The race data from the 2010 Census, which were used in the blended base, differ considerably from those in the 2020 Census. Ideally, we try to limit large base population changes to the beginning of the decade when we have a new census, but the 2020 MARC file was not available until earlier this year.

Revisions to the Components of Change

The most common population estimate changes from one vintage to the next are revisions to the components of change. These changes come from using new or more current input data as well as from methodological changes to improve the accuracy of the estimated components.

Numerous datasets are ingested and processed for each new vintage of estimates, including internal Census Bureau data and data from other federal agencies.

  • The internal data include the 2020 MARC file and the American Community Survey (ACS) 1-year and 5-year files. 
  • We receive administrative records on births and deaths from the National Center for Health Statistics (NCHS), tax filing records from the Internal Revenue Service (IRS), and Medicare enrollment records from the Centers for Medicare & Medicaid Services (CMS), among other inputs.

For all data, we strive to utilize the most current version possible in the population estimates. However, in some cases, the other agencies’ data lag by a year or two relative to the vintage we are processing because of the time they need to produce them.

For example, final, full birth and death records are two years behind, meaning that for Vintage 2025, we will use data from 2023 for much of the processing. For these vital records, NCHS is able to make other provisional data available to us: national birth and death totals current as of June 2025.

The IRS records we use for domestic migration are usually filed in the year that we make the estimates, so they are very current. The ACS data that we use for international migration lag by one year, so for Vintage 2025, the most recent data are from the 2024 ACS 1-year and the 2020-2024 ACS 5-year files.
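
The lags described above can be written down as a small lookup. The Python sketch below is only an illustration. The source labels and function names are hypothetical, and treating the IRS lag as zero years is our reading of "very current" rather than a documented rule.

```python
# Lag, in years, between the vintage and the most recent data from each source.
INPUT_LAG_YEARS = {
    "nchs_final_vital_records": 2,  # final, full birth and death records
    "acs": 1,                       # American Community Survey files
    "irs_tax_filings": 0,           # assumption: filed in the vintage year
}


def latest_data_year(vintage: int, source: str) -> int:
    """Most recent data year available from a source for a given vintage."""
    return vintage - INPUT_LAG_YEARS[source]


def acs_5yr_window(vintage: int) -> tuple[int, int]:
    """First and last year covered by the most recent ACS 5-year file."""
    end = latest_data_year(vintage, "acs")
    return end - 4, end


for source in INPUT_LAG_YEARS:
    print(source, latest_data_year(2025, source))
print("ACS 5-year window:", acs_5yr_window(2025))
```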

When we’re working with lagged data, our methodology features assumptions used to project the data forward to cover the gap. This is significant because it represents a mechanism to reflect current trends in the components that otherwise may not be picked up by the inputs.     

Below are examples of recent changes to the components:

  • At the start of the COVID-19 pandemic, the NCHS deaths data did not capture the increase in deaths. A combination of new/more current files from the agency and a methodological adjustment to the deaths component improved the accuracy of the estimates.
  • Also during the COVID-19 pandemic, the lagged ACS-based estimates of net international migration did not yet account for large changes in international migration flows. In response, PEP staff sought out and incorporated other federal agencies’ data to adjust the level of international migration.
  • In Vintage 2024, we used data from the Department of Homeland Security, Department of State, and the Institute of International Education to account for the high numbers of humanitarian migrants (such as refugees and asylum seekers) who entered the United States from 2022 to 2024.

These are just some examples of revisions to the components of population change since 2020. They represent how PEP staff must remain aware of real-time changes to the population and be ready to seek reliable and empirical sources of data to confirm changes or patterns in the components and integrate these changes into the estimates.

A common thread weaving throughout many of these changes is that international migration is typically the most challenging component of population change to estimate and project. One reason is that patterns in international migration flows — unlike births or deaths — can shift dramatically in a short period of time. Additionally, whereas NCHS provides a stable source of highly reliable and comprehensive vital records, migration data are often limited, and the methods used to produce these estimates require numerous assumptions.

Since 2020, we have used administrative data from other agencies as a benchmark to inform adjustments to the level of non-U.S.-born immigration to the United States. We described this process in detail in a blog when we released the first set of estimates from Vintage 2024 in December 2024.

More recently, policy changes have led to increases in emigration, or people moving out of the country. As part of our ongoing process for improving our estimates, PEP staff consulted with outside experts and researched alternative data sources to develop new adjustments for the Vintage 2025 estimates (more details on these updates will be available in a blog that accompanies the first Vintage 2025 data release, currently scheduled for later this month). 

Conclusion

In January, we are scheduled to release the first estimates from Vintage 2025: national, state and Puerto Rico total population; voting-age population; and components of change. Following the estimates life cycle described above, this upcoming vintage will feature newly developed estimates for July 1, 2025, as well as updated estimates for April 1, 2020, and for July 1 of each year from 2020 through 2024.

To help data users keep track of these changes, all vintage-to-vintage updates will be highlighted in the Vintage 2025 Release Notes, as well as in the updated methodology statement.

While we will remain focused on the rest of the Vintage 2025 products, we will do so with one eye to all the ways the U.S. population is growing and changing in 2026. 

Page Last Revised - January 15, 2026