Modifications to the Entropy Balance Weighting for the 2016-2020 ACS 5-Year Release


As explained in a previous blog post, the 2020 ACS data collection operations were significantly impacted during the COVID-19 pandemic. These disruptions prevented us from collecting information from certain segments of the population. This resulted in substantial nonresponse bias in the 2020 ACS data, which means the characteristics of people who responded to the survey were significantly different from those of the people who did not respond. Because of this bias, the resulting data would not have been representative of the U.S. population.

We delayed the 2016-2020 ACS 5-year release to refine our methodology to reduce the impact of the nonresponse bias on the estimates and to ensure the methodology performed appropriately at various levels of geography.

Our revised methodology incorporated the entropy balance weighting (EBW) methodology, which was used to produce the 2020 ACS 1-year experimental data products, into our standard production methodology. The goal was to use the EBW to reduce the bias in the 2020 portion of the 5-year data while keeping our standard methodology as intact as possible, so that we could still produce estimates for all of our standard 5-year geographic areas.

Difference Between the Entropy Balance Weighting Used in the 2020 ACS 1-Year Experimental Estimates and the 2016-2020 ACS 5-Year Estimates

The biggest difference between our use of the EBW for the 2020 1-year experimental estimates and for the 5-year estimates was that, for the 1-year, we were not able to incorporate the EBW into the standard weighting methodology in the short amount of time available. For the 5-year, we made it a top priority to maintain as much of the production estimation methodology as possible. Nonresponse bias affected a larger proportion of the data underlying the 2020 ACS 1-year experimental estimates. Because the 2016-2020 ACS 5-year estimates also draw on four other years of data, the nonresponse bias in the 2020 data had proportionally less impact on them than on the 1-year estimates.

We also enhanced the EBW methodology for the 5-year estimates in three ways.

  • First, we included additional population controls at smaller geographies for: 
    • Race/Hispanic origin by sex by age bins at the county level;
    • Population at the tract level for many tracts. 
  • Second, we used additional administrative data beyond what was included in the 2020 1-year EBW methodology. For the 5-year estimates, we used administrative records information for:    
    • Income, employment, financial, and household structure data from the Internal Revenue Service (IRS) 1040 and 1099 forms; 
    • Program benefit data from the Social Security Administration (SSA);
    • Demographic data from the 2010 Census and the SSA;
    • Industry data and firm size data from the Census Bureau’s Business Register;
    • Third-party data on home values. 
  • Third, we adjusted the weights to make key estimates consistent with estimates from prior ACS files and year-to-year changes from the experimental files. Starting with the administrative data, the EBW used year-to-year changes in the following characteristics to create weights that are consistent with estimates in prior ACS data:
    • Household (within race, Hispanic origin, sex, and age) - income, home value, building type, tenure, rent, number of vehicles, food stamp participation;
    • Person (by race, Hispanic origin, sex, and age) - income, earnings, marital status, labor force participation (weeks/hours), citizenship, poverty, health insurance coverage, education.
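
To make the idea of reweighting toward known totals more concrete, the following is a minimal Python sketch of generic entropy balancing. It is not the production EBW code. It assumes the textbook formulation, in which each adjusted weight equals the base weight times an exponential tilt, and the tilt is chosen so that weighted totals match targets while the weights stay as close as possible to the base weights. The function name, the variables (X, base_w, targets), and the toy numbers are all hypothetical.

```python
import numpy as np

def entropy_balance(X, base_w, targets, max_iter=100, tol=1e-10):
    """Hypothetical helper: return weights w_i = base_w_i * exp(x_i . lam)
    whose weighted column totals of X equal `targets`."""
    X = np.asarray(X, dtype=float)
    base_w = np.asarray(base_w, dtype=float)
    targets = np.asarray(targets, dtype=float)
    lam = np.zeros(X.shape[1])
    scale = max(1.0, np.max(np.abs(targets)))
    for _ in range(max_iter):
        w = base_w * np.exp(X @ lam)
        gap = X.T @ w - targets            # how far weighted totals are from targets
        if np.max(np.abs(gap)) < tol * scale:
            break
        hess = X.T @ (w[:, None] * X)      # curvature of the dual objective
        lam -= np.linalg.solve(hess, gap)  # plain Newton step, no line search
    return base_w * np.exp(X @ lam)

# Toy data (assumed): six respondents, each with a base weight of 10.
# Column 0 is a constant (controls the population total);
# column 1 flags a characteristic that is underrepresented among respondents.
X = np.array([[1, 1], [1, 1], [1, 0], [1, 0], [1, 0], [1, 0]])
base_w = np.full(6, 10.0)
targets = np.array([60.0, 30.0])  # assumed totals from an outside source

w = entropy_balance(X, base_w, targets)
print(np.round(w, 3))
```

In this toy case the two flagged respondents are weighted up and the other four are weighted down, so the flagged group sums to 30 while the overall total stays at 60. A production system would use many more constraints and would need safeguards (for example, step-size control and checks for infeasible targets) that this sketch omits.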

Final Steps to Produce the 2016-2020 ACS 5-Year Estimates

To produce the 2016–2020 ACS 5-year estimates, we modified steps in the estimation methodology. We partially processed the four years of data from 2016–2019 using our standard methods, then combined those data with the 2020 data that had been processed using the EBW methodology. Once the data from 2016–2020 were pooled together, we applied population and housing unit control totals, derived from estimates produced by the Population Estimates Program, as we normally do. We also conducted a more exhaustive review of the 5-year data than usual.
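
The pooling and control-total step can be illustrated with a small Python sketch. This is a deliberately simplified, single-dimension ratio adjustment under assumed inputs, not the production procedure, which is considerably more involved. The function name, the cell labels "A" and "B", and all weights and control totals are hypothetical.

```python
import numpy as np

def apply_control_totals(weights, cells, controls):
    """Hypothetical helper: scale weights within each cell so that the
    weighted total in that cell equals its control total."""
    weights = np.asarray(weights, dtype=float)
    cells = np.asarray(cells)
    adjusted = weights.copy()
    for cell, total in controls.items():
        mask = cells == cell
        current = weights[mask].sum()
        if current <= 0:
            raise ValueError(f"no weighted records in cell {cell!r}")
        adjusted[mask] *= total / current
    return adjusted

# Assumed inputs: 2016-2019 records carry standard weights,
# 2020 records carry weights from the EBW step.
w_2016_2019 = np.array([12.0, 8.0, 10.0, 10.0])
cells_2016_2019 = np.array(["A", "A", "B", "B"])
w_2020 = np.array([5.0, 15.0])
cells_2020 = np.array(["A", "B"])

# Pool all five years, then apply the control totals to the pooled file.
pooled_w = np.concatenate([w_2016_2019, w_2020])
pooled_cells = np.concatenate([cells_2016_2019, cells_2020])
controls = {"A": 30.0, "B": 28.0}  # assumed population controls per cell

final_w = apply_control_totals(pooled_w, pooled_cells, controls)
print(final_w)
```

Because the adjustment is applied after pooling, records from all five years in a cell share the same scaling factor, which mirrors the order of operations described above.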

This modification will be applied to subsequent 5-year estimates that include data collected in 2020.

For additional details on how the EBW was incorporated into the standard methodology, view the 2016–2020 ACS 5-year Accuracy of the Data document.


