Sampling Estimation & Survey Inference

Motivation:

Survey sampling helps the Census Bureau provide timely and cost-efficient estimates of population characteristics. Sampling methodology remains at the center of innovation at the Census Bureau as evidenced by three recent major efforts: the Household Trends and Outlook Pulse Survey (formerly Household Pulse Survey); the Small Business Pulse Survey; and the Annual Integrated Economic Survey. Demographic sample surveys estimate characteristics of people or households such as employment, income, poverty, health, insurance coverage, educational attainment, or crime victimization. Economic sample surveys estimate characteristics of businesses such as payroll, number of employees, production, sales, revenue, or inventory. Survey sampling also helps the Census Bureau assess the quality and coverage of each decennial census. Estimates are produced by use of design-based estimation techniques or model-based estimation techniques. Methods and topics across the three program areas (Demographic, Economic, and Decennial) include: sample design, estimation and use of auxiliary information (e.g., sampling frame and administrative records), weighting methodology, adjustments for nonresponse, proper use of population estimates as weighting controls, variance estimation, effects of imputation on variances, coverage measurement sampling and estimation, coverage measurement evaluation, evaluation of census operations, uses of administrative records in census operations, improvement in census processing, and analyses that aid in increasing census response.

 

Research Problems:

·   How to design and analyze sample surveys from "frames" determined by non-probabilistically sampled observational data to achieve representative population coverage. To make census data products based jointly on administrative and survey data fully representative of the general population, as our current surveys are, new sampling designs and analysis methods will have to be developed.

·   How can inclusion in observational or administrative lists be modeled jointly with indicator and mode of survey response, so that traditional survey methods can be extended to merged survey and non-survey data?

·   Can non-traditional design methods such as adaptive sampling be used to improve estimation for rare characteristics and populations?

·   How can time series and spatial methods be used to improve ACS estimates or explain patterns in the data?

·   Can generalized weighting methods be formulated and solved as optimization problems to avoid the ambiguities resulting from multiple weighting steps and to explicitly allow inexact calibration?

·   What models can aid in assessing the combined effect of all the sources of sampling and nonsampling error, including frame coverage errors and measurement errors, on sample survey estimates?

·   What experiments and analyses can inform the development of outreach methods to enhance census response?

·   Can unduplication and matching errors be accounted for in modeling frame coverage in censuses and sample surveys?

·   How can small-area or other model-based methods be used to improve interval estimates in sample surveys, to design survey collection methods with lowered costs, or to improve Census Bureau imputation methods?

·   Can classical methods in nonparametrics (e.g., using ranks) improve estimates from sample surveys?

·   How can we measure and present uncertainty in rankings of units based on sample survey estimates?

·   Can data from sources other than censuses and sample surveys be used to improve results from censuses and sample surveys?

·   How to develop and use bootstrap methods for expressing uncertainty in estimates from probability sampling?
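As a point of reference for the bootstrap question above, the following is a minimal sketch of the naive bootstrap standard error of a sample mean. It assumes a simple random sample and ignores strata, clusters, and weights, so it is not a method for complex survey designs. The function name, the data values, and the number of replicates are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_se(sample, n_reps=1000, seed=0):
    """Naive bootstrap standard error of the sample mean.

    Assumption: `sample` is a simple random sample. Complex designs
    would need design-aware resampling, which is not shown here.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicate_means = []
    for _ in range(n_reps):
        # Resample n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicate_means.append(statistics.fmean(resample))
    # The spread of the replicate means estimates the standard error.
    return statistics.stdev(replicate_means)


# Hypothetical data, for illustration only.
incomes = [31.0, 45.5, 52.0, 38.2, 61.3, 47.9, 29.4, 55.0]
se = bootstrap_se(incomes)
```

For a simple random sample, the result should be close to the usual analytic standard error of the mean, which offers a quick sanity check on the resampling code.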

 

Current Subprojects:

·   Integration of data from probability and nonprobability samples (Wright, Chen, Mulry, Ikeda)

·   The Ranking Project: Methodology Development and Evaluation (Wright, Yau, Wieczorek/Colby College, Hall)

·   Optimal Allocation Methods: Sample Allocation and Apportionment (Wright)

·   Replication methods for variance estimation: understanding successive difference replication and bootstrapping (Joyce, Wright)
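For readers unfamiliar with sample allocation, the sketch below shows the classical textbook Neyman allocation, in which the stratum sample size is proportional to the stratum size times the stratum standard deviation. It is not the exact optimal allocation methodology developed in this subproject. The largest-remainder rounding step is an ad hoc assumption, the sketch does not enforce bounds such as a minimum sample size per stratum, and the stratum sizes and standard deviations are hypothetical.

```python
import math


def neyman_allocation(n, sizes, sds):
    """Classical Neyman allocation: n_h proportional to N_h * S_h.

    Rounding uses largest remainders so the integer allocations sum to n.
    This rounding is an illustrative choice, not an exact optimal method.
    """
    products = [N * S for N, S in zip(sizes, sds)]
    total = sum(products)
    exact = [n * p / total for p in products]  # real-valued allocation
    alloc = [math.floor(x) for x in exact]
    leftover = n - sum(alloc)
    # Give the remaining units to the strata with the largest fractional parts.
    order = sorted(range(len(exact)),
                   key=lambda h: exact[h] - alloc[h],
                   reverse=True)
    for h in order[:leftover]:
        alloc[h] += 1
    return alloc


# Hypothetical strata, for illustration only.
sizes = [1000, 3000, 6000]  # stratum population sizes N_h
sds = [20.0, 10.0, 5.0]     # stratum standard deviations S_h
alloc = neyman_allocation(100, sizes, sds)
```

In this example the smallest stratum receives a quarter of the sample despite holding a tenth of the population, because its standard deviation is the largest.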

 

Potential Applications:

·   Improve estimates and reduce costs for household sample surveys by introducing new design and estimation methods, possibly to compensate for smaller sample sizes.

·   Provide a synthesis of the effect of nonsampling errors on estimates of net census coverage error, erroneous enumerations, and omissions and identify the types of nonsampling errors that have the greatest effects. Employ administrative records and other data sources to improve the estimates from probability samples.

·   Measure and report uncertainty in estimated rankings in household and economic sample surveys.

·   Develop bootstrap methods for expressing uncertainty as an alternative source of published variance estimates and as a check on existing methods of producing variances in Census Bureau sample surveys.

 

Accomplishments (October 2020-September 2024):

·   Published a document providing a high-level discussion of the research and methodology underlying the use of administrative records in the 2020 Census enumeration of households living in housing units at some addresses in Nonresponse Follow-up.

·   Analyzed and published empirical results on reliability of the TopDown Algorithm (TDA) output using the 2020 Census redistricting data production settings version (epsilon = 17.14) of the TDA for all block groups (proxy for districts) and other geographic areas in the United States. Empirical results pointed to a minimum TOTAL population for providing reliable counts for different geographic areas.

·   Published on the Census Bureau’s website two “interactive” Research Data Visualizations as part of The Ranking Project (Comparisons of A State with Each Other State/Estimated Ranking of All States). For each of 89 Topics and the years 2018, 2019, 2021, and 2022 (now also 2025) based on published/official American Community Survey 1-year data, the first visual shows statistical comparisons of a state with each of the other states; and using the same data, the second visual shows statistical uncertainty in the overall estimated ranking of all fifty-one states (includes DC) using a novel joint confidence region.

·   Published a paper documenting the statistical research and development for the Section 203, Voting Rights Act, 2021 Determinations identifying which of nearly 8,000 jurisdictions must provide voting materials in languages in addition to English.

·   Developed methods, some novel, for handling missing values in poststratification variables and variables used in the probability (RDD) and nonprobability (web-panel-based) 2019-2020 Tracking Surveys on attitudes to the decennial census.

·   Completed draft of a paper to tighten a joint confidence region for an overall estimated ranking of K populations by optimal allocation of an overall sample size among the K populations.

·   Completed and published a research report, “Understanding Chao’s Method of Probability Proportional to Size Sampling,” that is somewhat longer than Chao (1982) and aims to provide more details, clarity, and proofs.

·   Published a research report presenting a new joint confidence region (DIFF) based on a family of confidence intervals for differences of two (k and k*) population parameters; proved a condition under which DIFF shows no greater uncertainty than the uncertainty of the INDI joint confidence region.

·   Published a short article showing the complete details of the 2020 apportionment computations.

·   Published an article demonstrating how several well-known mathematical and statistical results can be derived easily using Lagrange’s Identity (1773).
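The apportionment computations mentioned among the accomplishments above follow the method of equal proportions. As a rough illustration, the sketch below gives each state one seat and then awards each remaining seat to the state with the largest priority value, P / sqrt(n(n + 1)), where P is the state's population and n its current number of seats. The function name and the three-state populations are hypothetical, and the sketch assumes there are at least as many seats as states.

```python
import heapq
import math


def equal_proportions(populations, seats):
    """Apportion `seats` among states by the method of equal proportions.

    Assumption: seats >= number of states, so every state gets one seat first.
    """
    alloc = {state: 1 for state in populations}
    # heapq is a min-heap, so priorities are stored as negative values.
    heap = [(-pop / math.sqrt(1 * 2), state)
            for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)  # state with the largest priority value
        alloc[state] += 1
        n = alloc[state]
        # Priority value for this state's next possible seat.
        next_priority = populations[state] / math.sqrt(n * (n + 1))
        heapq.heappush(heap, (-next_priority, state))
    return alloc


# Hypothetical populations, for illustration only.
pops = {"A": 9000, "B": 5000, "C": 1000}
result = equal_proportions(pops, 10)
```

The heap keeps only one pending priority value per state, so each seat is assigned in logarithmic time in the number of states.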

 

Short-Term Activities (FY 2025 – FY 2027):

·   Study literature and undertake empirical studies, evaluations, or simulations integrating probability & nonprobability methods.

·   Study and document replication methods for estimating variances.

·   Improve methodology for measuring uncertainty in rankings.

·   Extend methodology for exact optimal sample allocation and apportionment.

 

Longer-Term Activities (beyond FY 2027):

·   Study literature and undertake empirical studies, evaluations, or simulations integrating probability & nonprobability methods.

·   Study and document replication methods for estimating variances.

·   Improve methodology for measuring uncertainty in rankings.

·   Extend methodology for exact optimal sample allocation and apportionment.

 

Selected Publications (Journal Articles, Peer Review):

Wright, T. (2025). “Optimal Tightening of the KWW Joint Confidence Region for a Ranking,” Statistics and Probability Letters, Vol 217, 110288, https://doi.org/10.1016/j.spl.2024.110288.

Joyce, P. and McElroy, T. (2024). “Modeling Survey Time Series Data with Flow-Observed CARMA Processes,” Journal of Official Statistics, Vol. 40(4), 601-632, DOI: 10.1177/0282423X241286236

Mulry, M.H., Tello-Trillo, C.J., Mule, V.T., and Keller, A. (2024). “Comparison of administrative Records – Rosters to Census Self Responses and Nonresponse Follow-up Responses,” Statistical Journal of the International Association of Official Statistics, Vol 40, No. 1, 41-52.

Slud, E., Hall, A., and Franco, C. (2024). “Small Area Estimates for Voting Rights Act Section 203(b) Coverage Determinations,” Calcutta Statistical Association Bulletin, 76(1), 137-159, https://doi.org/10.1177/00080683231215985.

Mulry, M.H. and Mule, V.T. (2022). “Advances in the Use of Capture-Recapture Methodology in the Estimation of U.S. Census Coverage Error,” In Recent Advances on Sampling Methods and Educational Statistics. In Honor of S. Lynne Stokes. Editors Hon Keung Tony Ng and Daniel F. Heitjan, 93–116, ISSN 2524-7735, https://doi.org/10.1007/978-3-031-14525-4

Nayak, T.K. (2021). “A Review of Rigorous Randomized Response Methods for Protecting Respondent's Privacy and Data Confidentiality,” in Methodology and Applications of Statistics: A Volume in Honor of C.R. Rao on the Occasion of his 100th Birthday, ed. B.C. Arnold, N. Balakrishnan and C.A. Coelho, New York: Springer, pp. 319-341.

Wright, T. (2021). “From Cauchy-Schwarz to the House of Representatives: Application of Lagrange’s Identity,” Mathematics Magazine, Vol 94, 244-256.

Mulry, M., Bates, N., and Virgile, M. (2021). “Viewing Participation in Censuses and Surveys through the Lens of Lifestyle Segments,” (print) Journal of Survey Statistics and Methodology, doi:10.1093/jssam/smaa006.

Zhai, X., and Nayak, T.K. (2021). “A Post-randomization Method for Rigorous Identification Risk Control in Releasing Microdata,” Journal of Statistical Theory and Practice, 15, Article 8, https://doi.org/10.1007/s42519-020-00143-2.

Wright, T. (2020). “A General Exact Optimal Sample Allocation Algorithm: With Bounded Cost and Bounded Sample Sizes,” Statistics and Probability Letters, Vol 165, Article 108829.

Klein, M., Wright, T., and Wieczorek, J. (2020). “A Joint Confidence Region for an Overall Ranking of Populations,” Journal of the Royal Statistical Society, Series C, 69, Part 3, 589-606.

Franco, C., Little, R., Louis, T., and Slud, E. (2019). “Comparative Study of Confidence Intervals for Proportions in Complex Sample Surveys,” Journal of Survey Statistics and Methodology, 7, 334-364.

Slud, E. and Thibaudeau, Y. (2019). “Multi-Outcome Longitudinal Small Area Estimation, A Case Study,” Statistical Theory and Related Fields. Special Issue on Small Area Estimation, 3, 136-149.

Wright, T., Klein, M., and Wieczorek, J. (2019). “A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals,” The American Statistician, Vol 73, No 2, 165-178.

Chai, J. and Nayak, T. (2018). “A Criterion for Privacy Protection in Data Collection and its Attainment via Randomized Response Procedures,” Electronic Journal of Statistics 12 (2), 4264-4287.

de Oliveira, V., Wang, B., and Slud, E. (2018). “Spatial Modeling of Rainfall Accumulated over Short Periods of Time,” Journal of Multivariate Analysis, 166, 129-149.

Lu, B. and Ashmead, R. (2018). “Propensity Score Matching Analysis for Causal Effects with MNAR Covariates,” Statistica Sinica, 28, 2005-2025.

Mulry, M.H., Kaputa, S., and Thompson, K. (2018). “Initial M-estimation Parameter Settings for Detection and Treatment of Influential Values,” Journal of Official Statistics, 34(2), 483–501. http://dx.doi.org/10.2478/JOS-2018-0022

Nayak, T., Zhang, C., and You, J. (2018). “Measuring Identification Risk in Microdata Release and Its Control by Post‐randomisation,” International Statistical Review, 86 (2), 300-321.

Slud, E., Vonta, I., and Kagan, A. (2018). “Combining Estimators of a Common Parameter across Samples,” Statistical Theory and Related Fields, 2, 158-171.

Wright, T. (2018). “No Calculation When Observation Can Be Made,” in A.K. Chattopadhyay and G. Chattopadhyay (Eds), Statistics and Its Applications, Springer Singapore, 139-154.

Ashmead, R., Slud, E., and Hughes, T. (2017). “Adaptive Intervention Methodology for Reduction of Respondent Contact Burden in the American Community Survey,” Journal of Official Statistics, 33(4), 901-919.

Mulry, M.H. and Keller, A. (2017). “Comparison of 2010 Census Nonresponse Follow-up Proxy Responses with Administrative Records Using Census Coverage Measurement Results,” Journal of Official Statistics, 33(2), 455–475. DOI: https://doi.org/10.1515/jos-2017-0022

Mulry, M.H., Nichols, E. M., and Hunter Childs, J. (2017). “Using Administrative Records Data at the U.S. Census Bureau: Lessons Learned from Two Research Projects Evaluating Survey Data,” in Biemer, P.P, Eckman, S., Edwards, B., Lyberg, L., Tucker, C., de Leeuw, E., Kreuter, F., and West, B.T. Total Survey Error in Practice. Wiley. New York. 467-473.

Thibaudeau, Y., Slud, E., and Gottschalck, A. (2017). “Modeling Log-linear Conditional Probabilities for Estimation in Surveys,” Annals of Applied Statistics, 11 (2), 680-697.

Wieczorek, J. (2017). “RankingProject: The Ranking Project: Visualizations for Comparing Populations,” R package version 0.1.1. URL: https://cran.r-project.org/package=RankingProject.

Wright, T. (2017). “Exact Optimal Sample Allocation: More Efficient Than Neyman,” Statistics and Probability Letters, 129, 50-57.

Mulry, M.H., Nichols, E.M., and Childs, J. Hunter (2016). “A Case Study of Error in Survey Reports of Move Month Using the U.S. Postal Service Change of Address Records,” Survey Methods: Insights from the Field. Retrieved from http://surveyinsights.org/?p=7794

Mulry, M.H., Oliver, B., Kaputa, S., and Thompson, K.J. (2016). “Cautionary Note on Clark Winsorization.” Survey Methodology 42 (2), 297-305. http://www.statcan.gc.ca/pub/12-001-x/2016002/article/14676-eng.pdf

Nayak, T. and Adeshiyan, S. (2016). “On Invariant Post‐randomization for Statistical Disclosure Control,” International Statistical Review, 84 (1), 26-42.

Nayak, T., Adeshiyan, S. and Zhang, C. (2016). “A Concise Theory of Randomized Response Techniques for Privacy and Confidentiality Protection,” Handbook of Statistics, 34, 273-286.

Nagaraja, C. and McElroy, T. (2015). “On the Interpretation of Multi-Year Estimates of the American Community Survey as Period Estimates.” Published online, Journal of the International Association of Official Statistics.

Hogan, H. and Mulry, M. H. (2014). “Assessing Accuracy of Postcensal Estimates: Statistical Properties of Different Measures,” in N. Hogue (Ed.), Emerging Techniques in Applied Demography. Springer. New York.

Joyce, P., Malec, D., Little, R., Gilary, A., Navarro, A., and Asiala, M. (2014). “Statistical Modeling Methodology for the Voting Rights Act Section 203 Language Assistance Determinations,” Journal of the American Statistical Association, 109 (505), 36-47.

Mulry, M. H. (2014). “Measuring Undercounts in Hard-to-Survey Groups,” in R. Tourangeau, N. Bates, B. Edwards, T. Johnson, and K. Wolter (Eds.), Hard-to-Survey Populations. Cambridge University Press, Cambridge, England.

Mulry, M.H., Oliver, B.E., and Kaputa, S.J. (2014). “Detecting and Treating Verified Influential Values in a Monthly Retail Trade Survey,” Journal of Official Statistics, 30(4), 1–28.

Shao, J., Slud, E., Cheng, Y., Wang, S., and Hogue, C. (2014). “Theoretical and Empirical Properties of Model Assisted Decision-Based Regression Estimators,” Survey Methodology 40(1), 81-104.

Tang, M., Slud, E., and Pfeiffer, R. (2014). “Goodness of Fit Tests for Linear Mixed Models,” Journal of Multivariate Analysis, 130, 176-193.

Wright, T. (2014). “Lagrange’s Identity and Congressional Apportionment,” The American Mathematical Monthly, 121, 523-528.

Wright, T. (2012). “The Equivalence of Neyman Optimum Allocation for Sampling and Equal Proportions for Apportioning the U.S. House of Representatives,” The American Statistician, 66 (4), 217-224.

Klein, M. and Wright, T. (2011). “Ranking Procedures for Several Normal Populations: An Empirical Investigation,” International Journal of Statistical Sciences, Volume 11 (P.C. Mahalanobis Memorial Special Issue), 37-58.

 

Selected Publications (CSRM Research Reports, CSRM Studies, Proceedings Papers, and Other):

Wright, T. (2024b). “Joint Confidence Region for a Ranking Based on Differences,” Research Report Series (Statistics #2024-03), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. (2024a). “Understanding and Optimal Tightening of the KWW Joint Confidence Region for a Ranking,” Research Report Series (Statistics # 2024-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. (2023). “Understanding Chao’s Method of Probability Proportional to Size Sampling,” Research Report Series (Statistics # 2023-05), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Mulry, M. (2023). “Comparisons of Administrative Record Rosters to Census Self-Responses and NRFU Household Member Responses,” Research Report Series (Statistics # 2023-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Slud, E. and Morris, D. (2022). “Methodology and Theory for Design-Based Calibration of Low-Response Household Surveys with Application to the Census Bureau 2019-2020 Tracking Survey,” Research Report Series (Statistics # 2022-03), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Slud, E., Franco, C., Hall, A., and Kang, J. (2022). “Statistical Methodology (2021) for Voting Rights Act, Section 203 Determinations,” Research Report Series (Statistics # 2022-06), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. and Irimata, K. (August 5, 2021). “Empirical Study of Two Aspects of the TopDown Algorithm Output for Redistricting: Reliability & Variability (August 5, 2021 Update),” Study Series (Statistics # 2024-02), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. and Irimata, K. (May 28, 2021). “Empirical Study of Two Aspects of the TopDown Algorithm Output for Redistricting: Reliability & Variability,” Study Series (Statistics # 2024-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Trudell, T., Dong, K., Slud, E., and Cheng, Y. (In Press). “Computing Replicated Variance for Stratified Systematic Sampling,” Proceedings of the Survey Research Methods Section of the American Statistical Association.

Wright, T. and Irimata, K. (2020). “Variability Assessment of Data Treated by the TopDown Algorithm for Redistricting,” Study Series (Statistics # 2020-02), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T., Klein, M., and Slud, E. (2020). “A Deterministic Retabulation of Pennsylvania Congressional District Profiles from 115th Congress to 116th Congress,” Study Series (Statistics # 2020-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. (2019). “Direct Proof of Exact Sample Allocation Optimality with Cost Constraints,” Research Report Series (Statistics # 2019-03), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Dong, K., Trudell, T., Slud, E., and Cheng, Y. (2018). “Understanding Variance Estimator Bias in Stratified Two-Stage Sampling,” Proceedings of the Survey Research Methods Section of the American Statistical Association.

Klein, M., Wright, T., and Wieczorek, J. (2018). “A Simple Joint Confidence Region for A Ranking of K Populations: Application to American Community Survey’s Travel Time to Work Data,” Research Report Series (Statistics #2018-04), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Slud, E., Ashmead, R., Joyce, P., and Wright, T. (2018). “Statistical Methodology (2016) for Voting Rights Act, Section 203 Determinations,” Research Report Series (Statistics # 2018-12), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Ashmead, R. and Slud, E. (2017). “Small Area Model Diagnostics and Validation with Applications to the Voting Rights Act Section 203,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Slud, E. and Ashmead, R. (2017). “Hybrid BRR and Parametric-Bootstrap Variance Estimates for Small Domains in Large Surveys,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Wright, T. (2016). “Two Optimal Exact Sample Allocation Algorithms: Sampling Variance Decomposition Is Key,” Research Report Series (Statistics #2016-03), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Mulry, M. (2016). “Using 2010 census Coverage Measurement Results to Compare census Nonresponse Follow-up Proxy Responses with Administrative Records,” Research Report Series (Statistics # 2016-04), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Slud, Eric. (2015). “Impact of Mode-based Imputation on ACS Estimates,” American Community Survey Research and Evaluation Memorandum, #ACS-RER-O7.

Hunley, Pat. (2014). “Proof of Equivalence of Webster’s Method and Willcox’s Method of Major Fractions,” Research Report Series (Statistics #2014-04), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T. (2014). “A Simple Method of Exact Optimal Sample Allocation under Stratification with Any Mixed Constraint Patterns,” Research Report Series (Statistics #2014-07), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T., Klein, M., and Wieczorek, J. (2014). “Ranking Populations Based on Sample Survey Data,” Research Report Series (Statistics # 2014-12), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Franco, C., Little, R., Louis, T., and Slud, E. (2014). “Coverage Properties of Confidence Intervals for Proportions in Complex Sample Surveys,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Griffin, D., Slud, E., and Erdman, C. (2014). “Reducing Respondent Burden in the American Community Survey's Computer Assisted Personal Visit Interviewing Operation - Phase 3 Results,” ACS Research and Evaluation Memorandum #ACS 14-RER-28.

Slud, E., Grieves, C., and Rottach, R. (2013). “Single Stage Generalized Raking Weight Adjustment in the Current Population Survey,” Proceedings of Survey Research Methods Section, American Statistical Association, Alexandria, VA.

Wright, T. (2013). “A Visual Proof, a Test, and an Extension of a Simple Tool for Comparing Competing Estimates,” Research Report Series (Statistics #2013-05), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Wright, T., Klein, M., and Wieczorek, J. (2013). “An Overview of Some Concepts for Potential Use in Ranking Populations Based on Sample Survey Data,” 2013 Proceedings of the World Congress of Statistics (Hong Kong), International Statistical Institute.

Ikeda, M., Tsay, J., and Weidman, L. (2012). “Exploratory Analysis of the Differences in American Community Survey Respondent Characteristics between the Mandatory and Voluntary Response Methods,” Research Report Series (Statistics #2012-01), Center for Statistical Research & Methodology, U.S. Census Bureau, Washington, D.C.

Slud, E. and Thibaudeau, Y. (2010). “Simultaneous Calibration and Nonresponse Adjustment,” Research Report Series (Statistics #2010-03), Statistical Research Division, U.S. Census Bureau, Washington, D.C.

 

Contact:

Tommy Wright, Mary Mulry, Michael Ikeda, Patrick Joyce, Sixia Chen (ASA/NSF/Census Research Fellow/University of Oklahoma Health Sciences)

 

Funding Sources for FY 2025-2030:

0331 – Working Capital Fund / General Research Project

Various Decennial, Demographic, and Economic Projects

Page Last Revised - July 16, 2025