
Empirical Study of Two Aspects of the Topdown Algorithm Output for Redistricting: Reliability & Variability (August 5, 2021 Update)

SSS2021-02
Tommy Wright and Kyle Irimata

Abstract

This two-part study updates the empirical results for ongoing research and development that were reported in [6]. In this update, the tables and figures report data output from the same version of the TopDown Algorithm that will produce the "2020 Census Redistricting Data (Public Law 94-171) Summary File". Except for wording changes due to changes in the data output, the wording throughout is the same as in [6]. The TopDown Algorithm (TDA) [1] is being used to protect the confidentiality of respondent data collected during the 2020 Census. Following the 2010 Census, the swapping methodology (SWA) [7] was applied to respondent data to protect confidentiality.

In Part I, we propose an empirically based answer to the question: "What is the minimum TOTAL population of a district to have reliable characteristics of various demographic groups?" To answer this question, we use data treated by the 2020 Census redistricting data production settings version (ϵ = 17.14, for the person file) of the TDA for all block groups (proxy for districts) in the United States. We also consider "places and minor civil divisions (MCDs)" as proxies for districts. Empirical results suggest that a minimum TOTAL between 450 and 499 people in a block group provides reliable characteristics of various demographic groups in that block group based on the TDA. Similarly, a minimum TOTAL between 200 and 249 is observed to provide reliable characteristics for places and MCDs. No Congressional or state legislative district failed our test for reliability. It is important to keep in mind that these results are comparisons to the swapped 2010 Census data. They do not evaluate reliability relative to the actual enumeration in 2010, because the 2010 redistricting data contained statistical uncertainty due to swapping.

Part II updates the results reported in [6], which used ϵ = 10.3; this study uses ϵ = 17.14. The objective is to assess the variability of data results from applying the 2020 Census redistricting data production settings version of the TDA to the 2010 Census Edited File (2010 CEF) for Rhode Island and for three additional jurisdictions. Our approach has two parts: (1) to report observations on the variability of results among 25 runs of the TDA and (2) to report observations on the variability between the results of the 25 runs of the TDA and the published 2010 Census Public Law 94-171 data. We observe that variability in data results from the TDA increases as we consider smaller pieces of geography and population. Variability with the 2020 Census redistricting data production settings version of the TDA (ϵ = 17.14) tends to be less than what we reported in [6] with the 2021-04-28 version, where ϵ = 10.3.
