Michael Starsinic
The American Community Survey (ACS) is a continuous monthly survey that gathers the data historically collected by the decennial census long form sample. Full implementation of the ACS began in January 2005, when the sample expanded to approximately three million housing unit addresses, covering all counties and county equivalents in the 50 states, the District of Columbia, and Puerto Rico.

A single year’s worth of ACS sample is not adequate to publish estimates for all geographic areas for which long form estimates were published in Census 2000. Instead, 1-year estimates are published only for geographic areas with a population of at least 65,000. For smaller areas, several years of ACS sample are pooled to create "period" estimates. The first estimates based on three years of pooled ACS data, covering 2005 through 2007, were published in 2008 for all areas with a population of at least 20,000. Estimates for all geographic areas, including census tracts and block groups, will be published using five years’ worth of pooled ACS data; the first five-year estimates, covering 2005-2009, will be published in 2010 (U.S. Census Bureau 2009).
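The publication thresholds described above can be summarized in a short sketch. The function and constant names here are hypothetical, chosen only for illustration; the population cutoffs are the ones stated in the text.

```python
# Population thresholds stated in the text; the names are illustrative only.
ONE_YEAR_MIN_POPULATION = 65_000
THREE_YEAR_MIN_POPULATION = 20_000


def available_estimate_periods(population: int) -> list[str]:
    """Return the ACS estimate periods published for an area of this population."""
    periods = []
    if population >= ONE_YEAR_MIN_POPULATION:
        periods.append("1-year")
    if population >= THREE_YEAR_MIN_POPULATION:
        periods.append("3-year")
    # Five-year estimates are published for all areas,
    # including census tracts and block groups.
    periods.append("5-year")
    return periods


if __name__ == "__main__":
    for pop in (250_000, 40_000, 1_200):
        print(pop, available_estimate_periods(pop))
```

An area of 40,000 people, for example, would receive 3-year and 5-year estimates but no 1-year estimates.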

The ACS follows in the footsteps of the long form in publishing a very large array of data products, accessible through the Census Bureau’s American FactFinder (AFF) website. The ACS creates several thousand data products, some containing hundreds of individual estimates, for thousands of different geographic areas: over 6,000 areas for 1-year data and over 13,000 for 3-year data. That adds up to hundreds of millions of estimates released each year. The ACS program recognizes that not all of these estimates are of high quality: many may be based on only a handful of sampled observations, and others are zero because no sample cases in the geographic area have the characteristic in question.

One way the ACS has chosen to address the problem of low-reliability data is a process of "data quality filtering" for 1-year and 3-year data products, which identifies the products with the highest concentrations of low-reliability estimates and prevents their publication on AFF. This paper documents research that attempts to answer two questions about the ACS’s filtering procedures:

  • How does the current data quality filtering methodology affect the reliability of the data that the ACS publishes, for both 1-year and 3-year data products?
  • How would several alternative filtering methods affect the reliability of the estimates published under those rules?

