Center for Statistical Research and Methodology (CSRM)

Small Area Estimation

Motivation: Small area estimation is important in light of a continual demand by data users for finer geographic detail of published statistics and for various subpopulations. Traditional demographic sample surveys designed for national estimates do not provide large enough samples to produce reliable direct estimates for small areas such as counties and even most states. The use of valid statistical models can provide small area estimates with greater precision; however, bias due to an incorrect model or failure to account for informative sampling can result.
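
The precision gain from modeling can be illustrated with a minimal sketch of the composite estimate under a basic area-level (Fay-Herriot type) model. This is an illustration only: the function name, the inputs, and all numbers are hypothetical, and both the regression prediction and the model variance are treated as known, which a real application would have to estimate.

```python
def fay_herriot_shrink(direct, synthetic, sampling_var, model_var):
    """Composite estimate for one area under a basic area-level model.

    direct       : design-based survey estimate for the area
    synthetic    : regression prediction for the area (assumed given)
    sampling_var : sampling variance of the direct estimate (treated as known)
    model_var    : variance of the area random effect (treated as known here)
    """
    if sampling_var < 0 or model_var < 0 or sampling_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    # Weight placed on the direct estimate; it shrinks as sampling error grows.
    gamma = model_var / (model_var + sampling_var)
    estimate = gamma * direct + (1.0 - gamma) * synthetic
    # Posterior variance when the regression and model variance are known.
    mse = gamma * sampling_var
    return estimate, gamma, mse


# Hypothetical areas: same direct estimate, very different sample sizes.
large_area = fay_herriot_shrink(0.20, 0.15, sampling_var=0.0001, model_var=0.0009)
small_area = fay_herriot_shrink(0.20, 0.15, sampling_var=0.0081, model_var=0.0009)
```

In the sketch, the area with the large sampling variance is pulled strongly toward the regression prediction, and its model-based error measure is far below its sampling variance. That gain holds only if the model is valid, which is the bias caveat noted above.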

Research Problem:

  • Development/evaluation of multilevel random effects models for capture/recapture methods.
  • Development of small area models to assess bias in synthetic estimates.
  • Development of expertise using nonparametric modeling methods as an adjunct to small area estimation models.
  • Development/evaluation of Bayesian methods to combine multiple models.
  • Development of models to improve design-based sampling variance estimates.
  • Extension of current univariate small area models to handle multivariate outcomes.
  • Development of models to improve uncertainty estimates for design-based estimates near boundaries (e.g., counts near 0, rates near 0 or 1).
  • Development of formal methodology for generating small area applications by screening variables from Census Bureau and other federal statistical sample surveys for concordance with American Community Survey variables.

Potential Applications:

  • The development and evaluation of binary random effects models for small area estimation in the presence of informative sampling cuts across many small area issues at the Census Bureau.
  • Using nonparametric techniques may help determine fixed effects and ascertain the distributional form of the random effects.
  • Improving the design-based sampling variance estimates leads to better small area models, which assume that these sampling error variances are known.
  • For practical reasons, separate models are often developed for counties, states, etc. There is a need to coordinate the resulting estimates so smaller levels sum up to larger ones in a way that correctly accounts for accuracy.
  • Using the American Community Survey to improve the precision of estimates from other smaller surveys.
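
One simple way to make lower-level estimates sum to a higher-level total while accounting for accuracy is to spread the discrepancy in proportion to each estimate's variance, so that less precise estimates absorb more of the adjustment. The sketch below is a generic illustration, not the Census Bureau's procedure. It assumes the higher-level total is treated as fixed and the lower-level estimates are uncorrelated. The function name and all numbers are hypothetical.

```python
def benchmark_to_total(estimates, variances, target_total):
    """Adjust estimates so they sum to target_total.

    Minimizes sum((adjusted - estimate)**2 / variance) subject to the
    sum constraint, which allocates the gap in proportion to the variances.
    """
    if len(estimates) != len(variances):
        raise ValueError("estimates and variances must have the same length")
    total_var = sum(variances)
    if total_var <= 0:
        raise ValueError("at least one variance must be positive")
    gap = target_total - sum(estimates)
    return [e + (v / total_var) * gap for e, v in zip(estimates, variances)]


# Hypothetical county estimates, their variances, and a state-level total.
county_estimates = [120.0, 80.0, 40.0]
county_variances = [4.0, 6.0, 10.0]
benchmarked = benchmark_to_total(county_estimates, county_variances, 250.0)
```

Here the county estimates sum to 240, so the gap of 10 is split 2, 3, and 5 across the three counties, with the least precise county moving the most.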

Accomplishments (October 2017 - September 2018):

  • Developed a Multinomial-Dirichlet model for the population counts of school district pieces from county population totals, and compared the results against a Generalized Poisson small area model.
  • Obtained several new results, both theoretical and empirical, related to the understanding of functional and structural measurement error Fay-Herriot models, as well as of naïve models that ignore measurement errors.
  • Obtained several new results on the properties of confidence intervals for proportions for complex surveys, including the addition, testing, and analysis of the logit interval to our factorial simulation design, the analysis of the mean squared error, variance, and bias of the proposed effective sample size estimator, and the development and testing of user-friendly R functions associated with this work.
  • Extended previous work on loglinear models for longitudinal estimators of quarterly changes in labor force status and healthcare coverage to gross flows with multiple outcome categories, using fixed-effect models to exhibit small area estimates for smaller gross flows, by state (chapter to appear in 2019 edited book on Longitudinal Surveys).
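
For readers unfamiliar with the logit interval mentioned above, the following is a minimal sketch of its textbook form with a design-effect adjustment. It is not the R code developed in this work, and the effective sample size used here (sample size divided by an assumed design effect) is a common simplification rather than the estimator proposed in that research. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def logit_interval(p_hat, n, deff=1.0, level=0.95):
    """Logit-transformed confidence interval for a survey proportion.

    p_hat : estimated proportion, strictly between 0 and 1
    n     : nominal sample size
    deff  : assumed design effect; n / deff is used as the effective sample size
    """
    if not 0.0 < p_hat < 1.0:
        raise ValueError("p_hat must be strictly between 0 and 1")
    if n <= 0 or deff <= 0:
        raise ValueError("n and deff must be positive")
    n_eff = n / deff
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Delta-method standard error of logit(p_hat).
    se_logit = 1.0 / math.sqrt(n_eff * p_hat * (1.0 - p_hat))
    center = math.log(p_hat / (1.0 - p_hat))

    def expit(t):
        return 1.0 / (1.0 + math.exp(-t))

    # Back-transforming keeps both endpoints inside (0, 1).
    return expit(center - z * se_logit), expit(center + z * se_logit)


# Hypothetical example: a rare characteristic in a clustered sample.
srs_interval = logit_interval(0.05, 400)
clustered_interval = logit_interval(0.05, 400, deff=2.0)
```

Because the interval is built on the logit scale, it stays within (0, 1) even for proportions near the boundary, and a larger design effect widens it.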

Short-Term Activities (FY 2019):

  • Rebuild the design-based samples from the Artificial Population dataset (ACS 2008-2012) to allow for both area- and unit-level models to be evaluated.
  • Develop generalized variance functions for count estimates for school district population and poverty estimates from the ACS.
  • Develop a small area model for small sample R x C tables over areas (e.g., counties) using spatial effects to capture similarities of neighboring areas.
  • Find examples of using the American Community Survey to improve the precision of estimates from other smaller surveys.
  • Extend work on small area analysis of gross flows in longitudinal surveys to random-effect models; this research involves extension of model-assisted pseudo-likelihood estimation to random-effect generalized logistic regression models.
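
A generalized variance function for count estimates is often written as relvariance = a + b / x, where x is the estimated count. The sketch below fits that form by ordinary least squares using only the standard library. It is an illustration under stated assumptions: the functional form, the unweighted fit, the function names, and the data are hypothetical choices, not the method planned for the ACS work, and practical fits are often weighted or iterated.

```python
def fit_gvf(estimates, variances):
    """Fit relvariance = a + b / x by ordinary least squares.

    estimates : estimated counts (all positive)
    variances : direct variance estimates for those counts
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two (estimate, variance) pairs")
    u = [1.0 / x for x in estimates]                           # regressor 1/x
    relvar = [v / x ** 2 for x, v in zip(estimates, variances)]
    n = len(u)
    u_bar = sum(u) / n
    r_bar = sum(relvar) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    if sxx == 0:
        raise ValueError("estimates must not all be equal")
    b = sum((ui - u_bar) * (ri - r_bar) for ui, ri in zip(u, relvar)) / sxx
    a = r_bar - b * u_bar
    return a, b


def gvf_variance(x, a, b):
    """Smoothed variance implied by the fitted curve: x**2 * (a + b / x)."""
    return a * x ** 2 + b * x


# Made-up counts and variances, for illustration only.
counts = [100.0, 500.0, 2000.0, 10000.0]
direct_variances = [210.0, 1250.0, 8000.0, 120000.0]
a_hat, b_hat = fit_gvf(counts, direct_variances)
smoothed = gvf_variance(1000.0, a_hat, b_hat)
```

The fitted curve can then supply a smoothed variance for an area whose direct variance estimate is too unstable to use on its own.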

Longer-Term Activities (beyond FY 2019):

  • Develop models to simultaneously estimate population and poverty in subcounty areas such as school districts and census tracts.
  • Incorporate spatial modeling into the small area effects and develop tests for their necessity.
  • Develop methods for estimating the change over time of small area parameters and for constructing interval estimates for this change.
  • Develop a standard approach to modeling sampling variance for rate and count estimates based on small sample sizes from complex surveys.
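
As a point of reference for the change-over-time item above, the simplest interval for a difference between two periods is a normal-approximation interval whose variance accounts for the correlation between the two estimates. The sketch below shows only that baseline, with a hypothetical function name and made-up inputs. It does not represent the methods to be developed, which would need to handle model-based estimates and estimated correlations.

```python
import math
from statistics import NormalDist


def change_interval(est1, var1, est2, var2, corr=0.0, level=0.95):
    """Normal-approximation interval for est2 - est1.

    corr is the assumed correlation between the two period estimates,
    for example from overlapping samples or a shared model.
    """
    if var1 < 0 or var2 < 0 or not -1.0 <= corr <= 1.0:
        raise ValueError("variances must be non-negative and corr in [-1, 1]")
    cov = corr * math.sqrt(var1 * var2)
    # Guard against tiny negative values from floating-point rounding.
    var_diff = max(var1 + var2 - 2.0 * cov, 0.0)
    diff = est2 - est1
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) * math.sqrt(var_diff)
    return diff, diff - half_width, diff + half_width


# Made-up poverty rates for one area in two periods.
independent = change_interval(0.18, 0.0004, 0.21, 0.0004, corr=0.0)
correlated = change_interval(0.18, 0.0004, 0.21, 0.0004, corr=0.5)
```

Positive correlation between the two periods narrows the interval for the change, which is why ignoring it can understate the evidence for a real difference.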

Selected Publications:

Arima, S., Bell, W. R., Datta, G. S., Franco, C., and Liseo, B. (2017). “Multivariate Fay-Herriot Bayesian Estimation of Small Area Means Under Functional Measurement Error,” Journal of the Royal Statistical Society--Series A, 180(4), 1191-1209.

Bell, W. R., Chung, H. C., Datta, G. S., and Franco, C. (In Press). “Measurement Error in Small Area Estimation: Functional vs. Structural vs. Naïve Models,” Survey Methodology.

Franco, C. and Bell, W. R. (2013). “Applying Bivariate/Logit Normal Models to Small Area Estimation,” In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association. 690-702.

Franco, C. and Bell, W. R. (2015). “Borrowing information over time in binomial/logit normal models for small area estimation,” Joint issue of Statistics in Transition and Survey Methodology, 16, 4, 563-584.

Franco, C., Little, R. J. A., Louis, T. A., and Slud, E. V. (2018). “Comparative Studies of Confidence Intervals for Proportions in Complex Surveys,” Journal of Survey Statistics and Methodology.

Datta, G., Ghosh, M., Steorts, R., and Maples, J. (2011). “Bayesian Benchmarking with Applications to Small Area Estimation,” TEST, Volume 20, Number 3, 574-88.

Huang, E., Malec, D., Maples, J., and Weidman, L. (2007). “American Community Survey (ACS) Variance Reduction of Small Areas via Coverage Adjustment Using an Administrative Records Match,” Proceedings of the 2006 Joint Statistical Meetings, American Statistical Association, Alexandria, VA, 3150-3152.

Janicki, R. (2011). “Selection of prior distributions for multivariate small area models with application to small area health insurance estimates.” JSM Proceedings, Government Statistics Section. American Statistical Association, Alexandria, VA.

Janicki, R. and Vesper, A. (2017). “Benchmarking Techniques for Reconciling Small Area Models at Distinct Geographic Levels,” Statistical Methods & Applications, 26, 557-581. DOI: https://doi.org/10.1007/s10260-017-0379-x

Janicki, R. (2016). “Estimation of the Difference of Small Area Parameters from Different Time Periods,” Research Report Series (Statistics #2016-01), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Joyce, P. and Malec, D. (2009). “Population Estimation Using Tract Level Geography and Spatial Information,” Research Report Series (Statistics #2009-3), Statistical Research Division, U.S. Census Bureau, Washington, DC.

Malec, D. (2005). “Small Area Estimation from the American Community Survey Using a Hierarchical Logistic Model of Persons and Housing Units,” Journal of Official Statistics, 21 (3), 411-432.

Malec, D. and Maples, J. (2008). “Small Area Random Effects Models for Capture/Recapture Methods with Applications to Estimating Coverage Error in the U.S. Decennial Census,” Statistics in Medicine, 27, 4038-4056.

Malec, D. and Müller, P. (2008). “A Bayesian Semi-Parametric Model for Small Area Estimation,” in Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh (eds. S. Ghoshal and B. Clarke), Institute of Mathematical Statistics, 223-236.

Maples, J. and Bell, W. (2007). “Small Area Estimation of School District Child Population and Poverty: Studying Use of IRS Income Tax Data,” Research Report Series (Statistics #2007-11), Statistical Research Division, U.S. Census Bureau, Washington, DC.

Maples, J. (2011). “Using Small-Area Models to Improve the Design-Based Estimates of Variance for County Level Poverty Rate Estimates in the American Community Survey,” Research Report Series (Statistics #2011-02), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Maples, J. (2017). “Improving Small Area Estimates of Disability: Combining the American Community Survey with the Survey of Income and Program Participation,” Journal of the Royal Statistical Society-Series A, 180(4), 1211-1227.

Slud, E. and Maiti, T. (2006). “Mean-Squared Error Estimation in Transformed Fay-Herriot Models,” Journal of the Royal Statistical Society-Series B, 239-257.

Slud, E. and Maiti, T. (2011). “Small-Area Estimation Based on Survey Data from Left-Censored Fay-Herriot Model,” Journal of Statistical Planning & Inference, 3520-3535.

Contact: Jerry Maples, Ryan Janicki, Carolina Franco, Gauri Datta, Kyle Irimata, Bill Bell (R&M), Eric Slud

Funding Sources for FY 2018:

  • 0331 - Working Capital Fund / General Research Project
    Various Decennial, Demographic, and Economic Projects

Annual and Quarterly Reports

Source: U.S. Census Bureau | Research and Methodology Directorate | Center for Statistical Research & Methodology | (301) 763-9862 (or lauren.emanuel@census.gov) |   Last Revised: October 02, 2018