
Center for Statistical Research and Methodology (CSRM)

Survey Sampling: Estimation and Modeling

Motivation: Survey sampling helps the Census Bureau provide timely and cost-efficient estimates of population characteristics. Demographic sample surveys estimate characteristics of people or households such as employment, income, poverty, health, insurance coverage, educational attainment, or crime victimization. Economic sample surveys estimate characteristics of businesses such as payroll, number of employees, production, sales, revenue, or inventory. Survey sampling also helps the Census Bureau assess the quality of each decennial census. Estimates are produced using design-based or model-based estimation techniques. Methods and topics across the three program areas (Demographic, Economic, and Decennial) include: sample design, estimation and use of auxiliary information (e.g., sampling frame and administrative records), weighting methodology, adjustments for non-response, proper use of population estimates as weighting controls, variance estimation, effects of imputation on variances, coverage measurement sampling and estimation, coverage measurement evaluation, evaluation of census operations, uses of administrative records in census operations, improvement in census processing, and analyses that aid in increasing census response.

Research Problem:

  • How to design and analyze sample surveys from "frames" determined by non-probabilistically sampled observational data to achieve representative population coverage. To make census data products based jointly on administrative and survey data fully representative of the general population, as our current surveys are, new sampling designs and analysis methods will have to be developed.
  • How can administrative records, supported by new research on matched survey and administrative lists, be used to increase the efficiency of censuses and sample surveys?
  • How can inclusion in observational or administrative lists be modeled jointly with indicator and mode of survey response, so that traditional survey methods can be extended to merged survey and non-survey data?
  • Can non-traditional design methods such as adaptive sampling be used to improve estimation for rare characteristics and populations?
  • How can time series and spatial methods be used to improve ACS estimates or explain patterns in the data?
  • Can generalized weighting methods be formulated and solved as optimization problems to avoid the ambiguities resulting from multiple weighting steps and to explicitly allow inexact calibration?
  • How can we detect and adjust for outliers and influential sample values to improve sample survey estimates?
  • What models can aid in assessing the combined effect of all the sources of sampling and nonsampling error, including frame coverage errors and measurement errors, on sample survey estimates?
  • What experiments and analyses can inform the development of outreach methods to enhance census response?
  • Can unduplication and matching errors be accounted for in modeling frame coverage in censuses and sample surveys?
  • How can small-area or other model-based methods be used to improve interval estimates in sample surveys, to design survey collection methods with lowered costs, or to improve Census Bureau imputation methods?
  • Can classical methods in nonparametrics (e.g., using ranks) improve estimates from sample surveys?
  • How can we measure and present uncertainty in rankings of units based on sample survey estimates?
  • Can Big Data improve results from censuses and sample surveys?
  • How to develop and use bootstrap methods for expressing uncertainty in estimates from probability sampling?
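
As a rough illustration of the last question above, the following is a minimal sketch of a weighted total estimate with a naive with-replacement bootstrap standard error. It assumes each weight is the inverse of the unit's inclusion probability, and it ignores stratification, clustering, and finite population corrections, all of which a production method for a complex survey would need to handle. The function names `weighted_total` and `bootstrap_se` are hypothetical and do not refer to any Census Bureau software.

```python
import random
import statistics


def weighted_total(values, weights):
    """Weighted estimate of a population total: sum of weight * value."""
    return sum(w * y for y, w in zip(values, weights))


def bootstrap_se(values, weights, n_boot=1000, seed=0):
    """Naive bootstrap standard error of the weighted total.

    Assumption: sampled units are resampled with replacement as if they
    were independent draws; design features are ignored.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        # Draw n unit indices with replacement and recompute the estimate.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            weighted_total([values[i] for i in idx], [weights[i] for i in idx])
        )
    return statistics.stdev(estimates)


if __name__ == "__main__":
    y = [12.0, 7.5, 9.0, 15.0, 11.0]
    w = [20.0, 20.0, 25.0, 25.0, 10.0]
    print(weighted_total(y, w), bootstrap_se(y, w))
```

The spread of the resampled estimates serves as the uncertainty measure; comparing it with a replication-based variance estimate is one way such a bootstrap could act as a check on existing methods.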

Potential Applications:

  • Improve estimates and reduce costs for household surveys by introducing new design and estimation methods.
  • Produce improved ACS small area estimates through the use of time series and spatial methods, where those methods improve upon small area methods using covariates recoded from temporal and spatial information.
  • Streamline documentation and make weighting methodology more transparent by applying the same nonresponse and calibration weighting adjustment software across different surveys.
  • Develop new procedures for adjusting weights or reported values in the monthly trade surveys and surveys of government employment, based on statistical identification of outliers and influential values, to improve the accuracy of estimates of monthly level and of month-to-month change.
  • Provide a synthesis of the effect of nonsampling errors on estimates of net census coverage error, erroneous enumerations, and omissions and identify the types of nonsampling errors that have the greatest effects. Employ administrative records to improve the estimates of census coverage error.
  • Measure and report uncertainty in rankings in household and economic sample surveys.
  • Develop bootstrap methods for expressing uncertainty as an alternative source of published variance estimates and as a check on existing methods of producing variances in Census Bureau sample surveys.
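
To make the idea of calibration weighting adjustment concrete, here is a minimal sketch of raking (iterative proportional fitting) of unit weights to two sets of control totals. It is an illustration only, not the Census Bureau's software: the function name `rake`, its arguments, and the fixed iteration count are assumptions, and it presumes every category contains at least one unit with positive weight.

```python
def rake(weights, rows, cols, row_targets, col_targets, n_iter=50):
    """Rake unit weights so weighted category totals match two margins.

    rows[i] and cols[i] are the category labels of unit i; row_targets and
    col_targets map each label to its control total (hypothetical inputs).
    """
    w = list(weights)
    for _ in range(n_iter):
        # Alternate between the two margins, rescaling within each category.
        for labels, targets in ((rows, row_targets), (cols, col_targets)):
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            w = [wi * targets[lab] / totals[lab] for wi, lab in zip(w, labels)]
    return w


if __name__ == "__main__":
    adjusted = rake(
        weights=[1.0, 1.0, 1.0, 1.0],
        rows=["a", "a", "b", "b"],
        cols=["x", "y", "x", "y"],
        row_targets={"a": 60.0, "b": 40.0},
        col_targets={"x": 50.0, "y": 50.0},
    )
    print(adjusted)
```

Running the same routine with different control totals for different surveys is the kind of reuse the transparency goal above has in mind; a generalized approach would instead pose the adjustment as an optimization problem, which this sketch does not attempt.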

Accomplishments (October 2017 - September 2018):

  • Developed variance estimation methodology in Small Area Estimation settings in which Dirichlet-multinomial models for proportions are coupled with Horvitz-Thompson estimates, as in the analyses supporting the Census Bureau’s determinations for Language Ballot Assistance under the Voting Rights Act. The new Hybrid Variance Estimation methodology, which combines Successive Difference Replication with Parametric Bootstrap, has been developed for more general application and is now being supported by simulation studies documenting its effectiveness.
  • Contributed to team development of methods for producing differentially private decennial census tabulations conforming to legally mandated error-free disclosure of block-level population totals under Public Law 94 as well as to Title 13 requirements for nondisclosure of individual-level data.
  • Demonstrated the potential for a market segmentation from an external source to improve self-response propensity models using data from the 2010 Census and the American Community Survey.
  • Demonstrated that market segmentation from an external source can aid in providing useful information about problems in the Census enumeration of young children.
  • Assisted DSMD staff in diagnosing the cause of anomalous variance estimates using weight-replication methods (Balanced Repeated Replication or BRR on Nonself-representing strata, and Successive Difference Replication or SDR on Self-representing strata) on Current Population Survey data, resulting in follow-on research on general causes of biases in replication-based variance estimates (two 2018 JSM Proceedings papers).
  • Developed a simple and novel measure of uncertainty for an estimated ranking, with supporting theory and a visualization, illustrated using American Community Survey travel-time-to-work data.
  • Extended the current equal proportions methodology by appealing to probability sampling results in Wright (2017).
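
For readers unfamiliar with the equal proportions method mentioned in the last item, the following is a minimal sketch of the standard priority-value formulation (each unit receives one seat, then each remaining seat goes to the unit with the largest population divided by the square root of n(n+1), where n is its current seat count). It shows only the standard method, not the extension based on Wright (2017); the function name `equal_proportions` and the tie-breaking by list position are assumptions made for illustration.

```python
import heapq
import math


def equal_proportions(populations, seats):
    """Apportion seats among units by the method of equal proportions."""
    k = len(populations)
    if seats < k:
        raise ValueError("need at least one seat per unit")
    alloc = [1] * k  # every unit starts with one seat
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-p / math.sqrt(2), i) for i, p in enumerate(populations)]
    heapq.heapify(heap)
    for _ in range(seats - k):
        _, i = heapq.heappop(heap)  # unit with the highest priority value
        alloc[i] += 1
        n = alloc[i]
        heapq.heappush(heap, (-populations[i] / math.sqrt(n * (n + 1)), i))
    return alloc


if __name__ == "__main__":
    print(equal_proportions([100, 200, 300], 6))  # [1, 2, 3]
```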

Short-Term Activities (FY 2019):

  • Continue to refine model diagnostics techniques for small area prediction within language minority groups in connection with the determinations of alternative language election assistance by jurisdiction and American Indian Area under Section 203 of the Voting Rights Act.
  • Continue research via simulation studies and theoretical development to document the effectiveness of the Hybrid Variance Estimation methodology, combining SDR with parametric bootstrap, under the small area estimation framework used in supporting the Voting Rights Act Section 203(b) determinations of jurisdictions mandated to provide language ballot assistance.
  • Develop differentially private methods to estimate the variance and confidence intervals of differentially private estimates when post-processing has been applied to the differentially private outputs.
  • Continue to conduct statistical analyses focusing on problems in the Census enumeration of young children.
  • Contribute to statistical analyses that support the 2020 Census communications campaign.
  • Contribute to the analyses of results from a side-by-side test of influential value detection methodology, for which data are currently being collected in an economic sample survey.
  • Conduct theoretical and computational research on a new EM pseudolikelihood approach to the model-assisted estimation of multi-level (random-effect) models in complex surveys, for application to the longitudinal small area estimation project on estimating gross flows in the CPS.
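
The Successive Difference Replication (SDR) component referred to above can be sketched with the usual replicate-based variance formula, in which the variance estimate is 4/R times the sum of squared deviations of the R replicate estimates from the full-sample estimate. The sketch below assumes the replicate estimates have already been computed from replicate weights; constructing those weights and the parametric bootstrap part of the hybrid method are not shown. The function name `sdr_variance` is hypothetical.

```python
def sdr_variance(full_estimate, replicate_estimates):
    """SDR variance estimate: (4 / R) * sum((theta_r - theta_0) ** 2)."""
    r = len(replicate_estimates)
    if r == 0:
        raise ValueError("at least one replicate estimate is required")
    return 4.0 / r * sum((t - full_estimate) ** 2 for t in replicate_estimates)


if __name__ == "__main__":
    # Toy numbers for illustration only, not survey results.
    print(sdr_variance(10.0, [9.0, 11.0, 10.0, 10.0]))  # 2.0
```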

Longer-Term Activities (beyond FY 2019):

  • Develop methodology to increase understanding of the undercoverage of young children in censuses and surveys and contribute to improving coverage of this group.
  • Contribute to statistical analyses that support the 2020 Census communications campaign.
  • Develop software that is re-usable and easily implementable for small area prediction within language minority groups in connection with the determinations of ballot language assistance by jurisdiction and American Indian Area under Section 203 of the Voting Rights Act.
  • Further investigate the statistical implications and assumptions of formal privacy (e.g. differential privacy) methods in order to understand how the methods may impact the use of data products.
  • Develop statistical methods and theory related to the use of differential privacy to release data from unequal probability sampling surveys. A specific focus of this research would be on how to account for the sampling probabilities/weights in the planning of the privacy budget.
  • Develop probability sampling methods targeted to the complement of an administrative records database within a survey frame such as the MAF; this research will require combining statistical models for joint dependence of administrative records and survey or census response, to be incorporated into new response propensity models in terms of which the survey data can be analyzed.
  • Develop spatial models and associated small area estimation techniques in terms of Generalized Linear Mixed Models (GLMMs) with covariates recoded to incorporate local spatial geographic/demographic/economic effects, and compare the performance of these models with Bayes-hierarchical models currently being developed elsewhere at the Census Bureau using American Community Survey data. Such GLMM spatial models may also be applicable to the evaluation of canvassing and address status changes in the MAF.
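
As background for the differential privacy items above, here is a minimal sketch of the textbook Laplace mechanism applied to a single count. It is a generic illustration and not the method used for any Census Bureau product: the function name `laplace_count`, the default sensitivity of 1, and the sampling approach are assumptions, and a floating-point implementation like this one is not suitable for production use. It does not address sampling weights, which is the open question raised in the list.

```python
import random


def laplace_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(2018)
    print(laplace_count(100, epsilon=0.5, rng=rng))
```

A smaller epsilon spends less of the privacy budget on the statistic and yields noisier output, which is why the allocation of the budget across published quantities matters.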

Selected Publications:

Ashmead, R., Slud, E., and Hughes, T. (2017), “Adaptive Intervention Methodology for Reduction of Respondent Contact Burden in the American Community Survey,” Journal of Official Statistics, 33(4), 901-919.

Ashmead, R. and Slud, E. (2017), “Small area model diagnostics and validation with applications to the Voting Rights Act Section 203,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Dong, K., Trudell, T., Slud, E., and Cheng, Y. (In Press). “Understanding Variance Estimator Bias in Stratified Two-Stage Sampling,” Proceedings of the Survey Research Methods Section of the American Statistical Association.

Franco, C., Little, R., Louis, T., and Slud, E. (2014). “Coverage Properties of Confidence Intervals for Proportions in Complex Sample Surveys,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Franco, C., Little, R., Louis, T., and Slud, E. (In Press). “Comparative Study of Confidence Intervals for Proportions in Complex Sample Surveys,” Journal of Survey Statistics and Methodology.

Griffin, D., Slud, E., and Erdman, C. (2014). “Reducing Respondent Burden in the American Community Survey's Computer Assisted Personal Visit Interviewing Operation - Phase 3 Results,” ACS Research and Evaluation Memorandum #ACS 14-RER-28.

Hogan, H. and Mulry, M. H. (2014). “Assessing Accuracy of Postcensal Estimates: Statistical Properties of Different Measures,” in N. Hogue (Ed.), Emerging Techniques in Applied Demography. Springer. New York.

Hunley, Pat. (2014). “Proof of Equivalence of Webster’s Method and Willcox’s Method of Major Fractions,” Research Report Series (Statistics #2014-04), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Ikeda, M., Tsay, J., and Weidman, L. (2012). “Exploratory Analysis of the Differences in American Community Survey Respondent Characteristics Between the Mandatory and Voluntary Response Methods,” Research Report Series (Statistics #2012-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Wash. DC.

Joyce, P., Malec, D., Little, R., Gilary, A., Navarro, A., and Asiala, M. (2014). “Statistical Modeling Methodology for the Voting Rights Act Section 203 Language Assistance Determinations,” Journal of the American Statistical Association, 109 (505), 36-47.

Klein, M. and Wright, T. (2011). “Ranking Procedures for Several Normal Populations: An Empirical Investigation,” International Journal of Statistical Sciences, Volume 11 (P.C. Mahalanobis Memorial Special Issue), 37-58.

Klein, M., Wright, T., and Wieczorek, J. (2018). “A Simple Joint Confidence Region for A Ranking of K Populations: Application to American Community Survey’s Travel Time to Work Data,” Research Report Series (Statistics #2018-04), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Lu, B. and Ashmead, R. (In Press). “Propensity Score Matching Analysis for Causal Effect with MNAR Covariates,” Statistica Sinica.

Mulry, M. H. (2014). “Measuring Undercounts in Hard-to-Survey Groups,” in R. Tourangeau, N. Bates, B. Edwards, T. Johnson, and K. Wolter (Eds.), Hard-to-Survey Populations. Cambridge University Press, Cambridge, England.

Mulry, M.H., Kaputa, S., and Thompson, K. (2018). “Initial M-estimation Parameter Settings for Detection and Treatment of Influential Values,” Journal of Official Statistics, 34(2), 483–501. http://dx.doi.org/10.2478/JOS-2018-0022

Mulry, M.H. and Keller, A. (2017). “Comparison of 2010 Census Nonresponse Followup Proxy Responses with Administrative Records Using Census Coverage Measurement Results.” Journal of Official Statistics. 33(2). 455–475. DOI: https://doi.org/10.1515/jos-2017-0022

Mulry, M.H., Nichols, E. M., and Hunter Childs, J. (2017). “Using administrative records data at the U.S. Census Bureau: Lessons learned from two research projects evaluating survey data.” In Biemer, P.P., Eckman, S., Edwards, B., Lyberg, L., Tucker, C., de Leeuw, E., Kreuter, F., and West, B.T. Total Survey Error in Practice. Wiley. New York. 467-473.

Mulry, M. H., Nichols, E. M., and Childs, J. Hunter (2016). “A Case Study of Error in Survey Reports of Move Month Using the U.S. Postal Service Change of Address Records,” Survey Methods: Insights from the Field. Retrieved from http://surveyinsights.org/?p=7794

Mulry, M. H., Oliver, B. E., and Kaputa, S. J. (2014). “Detecting and Treating Verified Influential Values in a Monthly Retail Trade Survey.” Journal of Official Statistics, 30(4), 1–28.

Mulry, Mary H., Oliver, B., Kaputa, S., and Thompson, K. J. (2016). “Cautionary Note on Clark Winsorization.” Survey Methodology 42 (2), 297-305. http://www.statcan.gc.ca/pub/12-001-x/2016002/article/14676-eng.pdf

Nagaraja, C. and McElroy, T. (2015). “On the Interpretation of Multi-Year Estimates of the American Community Survey as Period Estimates.” Published online, Journal of the International Association of Official Statistics.

Shao, J., Slud, E., Cheng, Y., Wang, S., and Hogue, C. (2014). “Theoretical and Empirical Properties of Model Assisted Decision-Based Regression Estimators,” Survey Methodology 40(1), 81-104.

Slud, Eric. (2015). “Impact of Mode-based Imputation on ACS Estimates,” American Community Survey Research and Evaluation Memorandum, #ACS-RER-O7.

Slud, E. and Ashmead, R., (2017), “Hybrid BRR and Parametric-Bootstrap Variance Estimates for Small Domains in Large Surveys,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Slud, E., Grieves, C., and Rottach, R. (2013). “Single Stage Generalized Raking Weight Adjustment in the Current Population Survey,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Slud, E. and Thibaudeau, Y. (2010). “Simultaneous Calibration and Nonresponse Adjustment,” Research Report Series (Statistics #2010-03), Statistical Research Division, U.S. Census Bureau, Washington, DC.

Thibaudeau, Y., Slud, E., and Gottschalck, A. (2017), “Modeling Log-linear Conditional Probabilities for Estimation in Surveys,” Annals of Applied Statistics, 11 (2), 680-697.

Trudell, T., Dong, K., Slud, E., and Cheng, Y. (In Press). “Computing Replicated Variance for Stratified Systematic Sampling,” Proceedings of the Survey Research Methods Section of the American Statistical Association.

Wieczorek, J. (2017). “RankingProject: The Ranking Project: Visualizations for Comparing Populations,” R package version 0.1.1. URL: https://cran.r-project.org/package=RankingProject.

Wright, T. (2012). “The Equivalence of Neyman Optimum Allocation for Sampling and Equal Proportions for Apportioning the U.S. House of Representatives,” The American Statistician, 66 (4), 217-224.

Wright, T. (2013). “A Visual Proof, a Test, and an Extension of a Simple Tool for Comparing Competing Estimates,” Research Report Series (Statistics #2013-05), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Wright, T., Klein, M., and Wieczorek, J. (2013). “An Overview of Some Concepts for Potential Use in Ranking Populations Based on Sample Survey Data,” 2013 Proceedings of the World Congress of Statistics (Hong Kong), International Statistical Institute.

Wright, T. (2014). “A Simple Method of Exact Optimal Sample Allocation under Stratification with Any Mixed Constraint Patterns,” Research Report Series (Statistics #2014-07), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Wright, T. (2014). “Lagrange’s Identity and Congressional Apportionment,” The American Mathematical Monthly, 121, 523-528.

Wright, T. (2016). “Two Optimal Exact Sample Allocation Algorithms: Sampling Variance Decomposition Is Key,” Research Report Series (Statistics #2016-03), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, DC.

Wright, T. (2017). “Exact Optimal Sample Allocation: More Efficient Than Neyman,” Statistics and Probability Letters, 129, 50-57.

Wright, T., Klein, M., and Wieczorek, J. (In Press). “A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals,” The American Statistician.

Wright, T. (2018). “No Calculation When Observation Can Be Made,” in A.K. Chattopadhyay and G. Chattopadhyay (Eds), Statistics and Its Applications, Springer Singapore, 139-154.

Contact: Eric Slud, Mary Mulry, Michael Ikeda, Patrick Joyce, Robert Ashmead, Martin Klein, Ned Porter, Tommy Wright

Funding Sources for FY 2018:

  • 0331 - Working Capital Fund / General Research Project
    Various Decennial, Demographic, and Economic Projects

Source: U.S. Census Bureau | Research and Methodology Directorate | Center for Statistical Research & Methodology | (301) 763-9862 (or lauren.emanuel@census.gov) |   Last Revised: October 02, 2018