“Low Response Score” Indicator Arises Out of Crowdsourcing Solution

Thu Oct 02 2014
Written by: Nancy A. Bates, Research and Methodology Directorate; and Chandra Erdman, Center for Statistical Research and Methodology, Research and Methodology Directorate

In September 2012, the U.S. Census Bureau announced a global crowdsourcing competition. The contest – dubbed the “Census Return Rate Challenge” – encouraged teams and individuals to compete for prize money for predicting 2010 Census mail-return rates. The challenge asked participants to model geographic variations in return rates using predictive variables found in the updated 2012 Planning Database.

The challenge was a success. Over 244 teams and individual competitors submitted solutions. Bill Bame, a software developer from Maryland, submitted the winning model. The Bame model included 342 variables and employed data mining and machine learning techniques. Twenty-four of his top 25 predictors came from the 2012 Planning Database. With these variables, we developed an ordinary least squares (OLS) regression model to predict the likelihood of self-response. The resulting predicted rate is referred to as the “low response score.” See Erdman and Bates (2014) for a full description of the methodology.
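
As a rough illustration of the kind of model described above, the sketch below fits an ordinary least squares regression with NumPy and uses the fitted values as a score for each area. It is not the published model: the two predictors (`pct_renter`, `pct_vacant`), the coefficients, and the data are synthetic and hypothetical, and the example assumes the score is read as a predicted non-return rate in percent. Erdman and Bates (2014) describe the actual variables and methodology.

```python
import numpy as np


def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def predict(beta, X):
    """Return fitted values for the rows of X under coefficients beta."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta


# Synthetic data for illustration only (hypothetical predictors and rates).
rng = np.random.default_rng(0)
n = 200
pct_renter = rng.uniform(0, 100, n)  # hypothetical predictor
pct_vacant = rng.uniform(0, 30, n)   # hypothetical predictor
X = np.column_stack([pct_renter, pct_vacant])

# Assumed "true" relationship plus noise, used only to generate the toy data.
nonreturn_rate = 10 + 0.15 * pct_renter + 0.30 * pct_vacant + rng.normal(0, 2, n)

beta = fit_ols(X, nonreturn_rate)
scores = predict(beta, X)  # one predicted rate per (synthetic) area
```

In this sketch a higher score marks an area where lower self-response is predicted, which is how the score is used in the paragraphs that follow.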

Areas with low self-response require costly follow-up by telephone or in person. Using the low response score and the wealth of information in the Planning Database, we can identify areas that are likely to have low rates of self-response and develop tailored strategies to increase these rates.

The low response score is provided for each census tract and block group in the 2014 Planning Database, a publicly available database containing socioeconomic, housing and demographic variables from the 2010 Census and the 2008-2012 American Community Survey. The low response score and the updated 2014 Planning Database go hand-in-hand, and survey practitioners can use them in many ways. For example, one can use the score to stratify samples, separating areas with a low likelihood of survey and census participation from areas with a high likelihood.
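
One way the stratification idea might look in practice is sketched below. The scores, the number of strata, and the sampling rates are all made up for illustration, and the function names are hypothetical. Areas are split into equal-sized strata by score, and a different sampling rate is then applied in each stratum.

```python
import numpy as np


def assign_strata(scores, n_strata=3):
    """Label each area 0..n_strata-1 by score quantile (0 = lowest scores)."""
    scores = np.asarray(scores, dtype=float)
    # Interior cut points at equally spaced quantiles of the scores.
    cuts = np.quantile(scores, np.linspace(0, 1, n_strata + 1)[1:-1])
    return np.digitize(scores, cuts, right=True)


def sample_by_stratum(strata, rates, seed=0):
    """Draw a simple random sample within each stratum at the given rate."""
    rng = np.random.default_rng(seed)
    chosen = []
    for stratum, rate in rates.items():
        idx = np.flatnonzero(strata == stratum)
        if len(idx) == 0:
            continue
        # Take at least one area, and never more than the stratum holds.
        k = min(len(idx), max(1, int(round(rate * len(idx)))))
        chosen.extend(rng.choice(idx, size=k, replace=False))
    return np.sort(np.array(chosen))


# Hypothetical scores for nine areas (higher = lower expected self-response).
scores = np.array([5, 12, 18, 22, 25, 31, 35, 40, 44])
strata = assign_strata(scores, n_strata=3)

# Illustrative rates only: sample every area in the highest-score stratum.
sample = sample_by_stratum(strata, rates={0: 0.34, 1: 0.34, 2: 1.0})
```

Oversampling the highest-score stratum is only one possible design choice. A practitioner could just as well use the strata to allocate follow-up resources or to balance a sample across areas.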

Used in tandem, the score and database help survey and census planners to identify hard-to-count areas and understand why such areas are hard to count. This knowledge can then be applied to manage field resources and develop targeted self-response and nonresponse follow-up strategies more efficiently.
