A Large Scale, High Quality U.S. Occupational Database: Results from Merged IRS and ACS Write-Ins

Working Paper Number: SEHSD-WP2024-26

In this paper, we describe and analyze a new dataset consisting of matched ACS and IRS Form 1040 occupation reports. This dataset allows validation and quality analysis of the IRS’s large Form 1040 occupational write-in database by comparing it with the high-quality ACS write-in and coding process. We analyze the similarity between the two datasets along both the token and semantic dimensions. We find a bimodal distribution of response quality in the token dimension: over 50 percent of the ACS sample is a high-quality token match with its IRS counterpart, but there is also a significant set of apparent non-matches.
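
As a rough illustration of what a token-level comparison between two occupation write-ins could look like, the sketch below computes a Jaccard overlap on lowercased word tokens. The abstract does not say which token similarity measure the paper uses, so the Jaccard measure, the function names `tokenize` and `token_jaccard`, and the example write-ins are all assumptions made for illustration, not the authors' method or data.

```python
import re


def tokenize(write_in: str) -> set[str]:
    """Lowercase a write-in and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", write_in.lower()))


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the two token sets (an assumed measure, not the paper's).

    Returns 0.0 when both write-ins are empty, so blank pairs do not
    count as matches.
    """
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical write-in pairs, not taken from the ACS or IRS data.
print(token_jaccard("Registered Nurse", "registered nurse"))
print(token_jaccard("Registered Nurse", "Nurse"))
print(token_jaccard("Registered Nurse", "Truck driver"))
```

Under a measure like this, scores near 1 correspond to near-identical write-ins and scores near 0 to pairs with no shared words, which is one way a two-peaked distribution of match quality could show up. A semantic comparison would need a different approach, since two write-ins can describe the same occupation while sharing no tokens.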

Page Last Revised - October 29, 2024