Examining Multilingualism in the United States Using ACS Language Write-ins

Written by:
Working Paper Number: SEHSD-WP2026-02

Statistics on language use in the United States are available through the American Community Survey (ACS), which asks respondents whether they speak a language other than English (LOTE) at home. Currently, LOTE responses are recorded as write-ins, and only the first language listed is processed for further analysis, even when a respondent reports multiple non-English languages. In this project, I develop a text-processing algorithm that reclassifies complete LOTE write-in responses into standardized language categories and identifies the most commonly spoken languages among multilingual individuals in the United States. Using the reclassified 2017-2021 5-year ACS dataset, I also compare the demographic characteristics of multilingual, bilingual, and English-only speakers through descriptive statistics and regression modeling. The analysis reveals significant differences across the three speaker groups along several demographic and socioeconomic dimensions, including race and ethnicity, educational attainment, and labor market outcomes.
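The reclassification step described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the lookup table `LANGUAGE_CATEGORIES`, the separator pattern, the function names, and the rule that one non-English language means "Bilingual" and two or more means "Multilingual" are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical lookup table for illustration; NOT the ACS language code list.
LANGUAGE_CATEGORIES = {
    "spanish": "Spanish",
    "espanol": "Spanish",
    "french": "French",
    "tagalog": "Tagalog",
    "filipino": "Tagalog",
    "mandarin": "Chinese",
    "cantonese": "Chinese",
    "chinese": "Chinese",
}

# Assumed separators between languages in a single write-in response.
_SEPARATORS = re.compile(r"\s*(?:,|;|/|&|\+|\band\b)\s*", flags=re.IGNORECASE)


def _normalize(token):
    """Lowercase, strip accents, and drop non-letter characters."""
    ascii_text = (
        unicodedata.normalize("NFKD", token)
        .encode("ascii", "ignore")
        .decode("ascii")
        .lower()
    )
    return re.sub(r"[^a-z ]", "", ascii_text).strip()


def parse_write_in(text):
    """Return distinct standardized categories in order of first mention."""
    categories = []
    for token in _SEPARATORS.split(text.strip()):
        key = _normalize(token)
        if not key:
            continue
        # Unmatched tokens are kept under a catch-all label (an assumption).
        category = LANGUAGE_CATEGORIES.get(key, "Unclassified")
        if category not in categories:
            categories.append(category)
    return categories


def speaker_group(text):
    """Assumed grouping: 0 LOTE = English only, 1 = Bilingual, 2+ = Multilingual."""
    count = len(parse_write_in(text))
    if count == 0:
        return "English only"
    return "Bilingual" if count == 1 else "Multilingual"
```

In this sketch, a response such as "Mandarin/Cantonese" collapses to the single category "Chinese", so the level of aggregation in the lookup table directly affects who is counted as multilingual. A production classifier would also need to handle misspellings and ambiguous language names, which this sketch does not attempt.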

Page Last Revised - January 13, 2026