
A Machine Learning Approach for Counting Language Minority Groups in the United States

Written by:
RRS2025-03

Abstract

The U.S. Voting Rights Act (VRA) prohibits discrimination at the polls based on language minority status. The VRA requires the U.S. Census Bureau to identify these language minorities using data on the voting-age population, including the numbers of citizens, individuals with limited English proficiency, and individuals with limited education. In the 2021 cycle of determining which jurisdictions (states, counties, cities) must provide voting materials in languages in addition to English, Census Bureau statisticians developed both frequentist and Bayesian models to estimate the population sizes of language minority groups. In this paper, we present a new machine learning model that outperformed the 2021 statistical models for some language minority groups. Our model was developed within the random forest (RF) framework and adopts the beta-binomial posterior as the objective function for constructing RF trees. This choice is in the spirit of soft computing: the new RF method relaxes the objective function typically used for RF to accommodate the unique structure of the VRA data.
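The abstract does not spell out the objective function, so the following is only an illustrative sketch of how a beta-binomial criterion could score a single tree split; it is not the authors' implementation. It assumes each unit contributes a count of "successes" out of a number of trials, places a hypothetical Beta(a, b) prior on the node-level proportion, and uses the log marginal likelihood of the pooled counts as a stand-in for the objective. The names `node_score` and `best_split`, the prior values, and the toy data are all assumptions made for illustration.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def node_score(successes, trials, a=1.0, b=1.0):
    """Log marginal likelihood of the pooled counts in a node under a
    Beta(a, b) prior. Binomial coefficients are omitted because they
    cancel when comparing a parent node with its children."""
    k = sum(successes)
    n = sum(trials)
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def best_split(x, successes, trials, a=1.0, b=1.0):
    """Scan thresholds on one feature and return (threshold, gain) for the
    split that most increases the beta-binomial score; (None, 0.0) if no
    split improves on the unsplit node."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    parent = node_score(successes, trials, a, b)
    best = (None, 0.0)
    for pos in range(1, len(order)):
        lo, hi = order[pos - 1], order[pos]
        if x[lo] == x[hi]:
            continue  # no threshold separates equal feature values
        left, right = order[:pos], order[pos:]
        gain = (
            node_score([successes[i] for i in left], [trials[i] for i in left], a, b)
            + node_score([successes[i] for i in right], [trials[i] for i in right], a, b)
            - parent
        )
        if gain > best[1]:
            best = ((x[lo] + x[hi]) / 2, gain)
    return best


# Toy data (hypothetical): two low-rate units and two high-rate units.
x = [0.1, 0.2, 0.8, 0.9]
successes = [1, 2, 18, 19]
trials = [20, 20, 20, 20]
threshold, gain = best_split(x, successes, trials)
```

In this toy example the selected threshold falls between the low-rate and high-rate units. A full random forest would repeat this search over bootstrap samples and random feature subsets, which the sketch leaves out.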

Page Last Revised - April 24, 2025