Creating an Automated Industry and Occupation Coding Process for the American Community Survey

Abstract

Every year the American Community Survey (ACS) collects data from millions of individuals on a variety of topics, including the industry and occupation in which they work. These data are collected through a series of open-ended questions. Clerical coders then read the open-ended responses and assign a numeric industry code and a numeric occupation code to each.

Coding industry and occupation for the ACS is a massive operation, with over 2 million industry and occupation codes assigned every year. To reduce costs, a process was developed that assigns industry and occupation codes from the open-ended responses using a logistic regression model. This paper discusses the development of this model and the early results. Beginning in 2012, 56% of industry codes and 43% of occupation codes are expected to be assigned through this automated coding process for the ACS.
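
The general idea of coding write-in text with a logistic regression model can be illustrated with a minimal sketch. It is not the model described in the paper. It makes several assumptions that are not in the abstract: a bag-of-words representation of the response, a multinomial (softmax) logistic regression with no intercept term, a confidence threshold below which a case is left for clerical coders, and made-up training responses with placeholder codes "A" and "B" rather than real Census codes. All function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy write-ins with placeholder codes (not ACS data, not Census codes).
TRAIN = [
    ("registered nurse hospital", "A"),
    ("nurse at clinic", "A"),
    ("staff nurse hospital", "A"),
    ("truck driver freight", "B"),
    ("delivery truck driver", "B"),
    ("long haul driver", "B"),
]


def tokenize(text):
    """Split a write-in response into lowercase word tokens."""
    return text.lower().split()


def predict_proba(weights, codes, tokens):
    """Softmax probability of each code given the tokens (no intercept, for brevity)."""
    z = {c: sum(weights[c].get(t, 0.0) for t in tokens) for c in codes}
    top = max(z.values())  # subtract the max for numerical stability
    exp = {c: math.exp(v - top) for c, v in z.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}


def train(examples, epochs=200, lr=0.5):
    """Fit one weight per (code, token) pair by stochastic gradient descent."""
    codes = sorted({code for _, code in examples})
    weights = {c: defaultdict(float) for c in codes}
    for _ in range(epochs):
        for text, label in examples:
            tokens = tokenize(text)
            probs = predict_proba(weights, codes, tokens)
            for c in codes:
                # Gradient of the log loss with respect to each active token weight.
                grad = probs[c] - (1.0 if c == label else 0.0)
                for t in tokens:
                    weights[c][t] -= lr * grad
    return weights, codes


def autocode(model, text, threshold=0.8):
    """Return (code, probability); code is None when confidence is below threshold.

    The threshold rule is an assumption of this sketch: low-confidence cases
    would be routed to clerical coders instead of being coded automatically.
    """
    weights, codes = model
    probs = predict_proba(weights, codes, tokenize(text))
    best = max(probs, key=probs.get)
    confidence = probs[best]
    return (best if confidence >= threshold else None), confidence


model = train(TRAIN)
print(autocode(model, "nurse hospital"))  # familiar words: coded automatically
print(autocode(model, "manager"))         # unseen words: left for a clerical coder
```

In a sketch like this, the threshold controls the trade-off between the share of cases coded automatically and the error rate among them; raising it sends more responses to clerical coders.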

Page Last Revised - October 8, 2021