Update on 2020 Census Data Processing and Quality

Bottom Line Up Front

The Census Bureau has begun processing the data collected for the 2020 Census. Data collection for the decennial census is always a herculean task, and 2020 was no exception. To the usual list of challenges were added the COVID-19 pandemic, hurricanes, wildfires, civil unrest, and a condensed schedule. Through the hard work of thousands of dedicated employees, both temporary and permanent, the Census Bureau was able to overcome these challenges and account for 99.98% of the addresses included in the 2020 Census nationwide. Fundamental to this operational success was the deployment of new technologies that yielded a more accurate address list from which to work, along with tools that dramatically increased the productivity of our enumerators in the field. Some have suggested that these challenges would lead to an inaccurate or even a failed census. Our analysis of the data is just underway, and as with all prior censuses, we’re seeing and working through data quality issues as we prepare the data for tabulation. As I discuss further below, some issues appear to be pandemic related, but most are what we experience with every decennial census and other Census Bureau surveys. Importantly, we’ve not uncovered anything so far that would suggest that the 2020 Census will not be fit for its constitutional and statutory purposes.

COVID-19’s Impact on the Schedule for the 2020 Census

The decennial census comprises many operations that require years of planning and coordination. Data collection commenced as planned in January 2020 with the Remote Alaska component of our Update Enumerate operations. This was also the time we were learning of the first U.S. case of COVID-19. We began collecting responses via the internet from most U.S. households on March 12, the day after the World Health Organization declared a pandemic and just a few days before the president issued a two-week stay-at-home order. Twice in March, we announced two-week delays in 2020 Census field operations, including the onboarding and training of thousands of temporary enumerators. In April, we announced a plan to recommence field operations on June 1. This plan essentially shifted the original schedule by three months, with field data collection ending on Oct. 31 rather than July 31. Because of this, we requested an extension to the deadlines for the delivery of statutorily mandated 2020 Census data products.

However, as the summer progressed, it became clear that an extension would not be granted, forcing us to compress the schedule to comply with the Dec. 31, 2020, deadline for delivery of the apportionment counts. To accomplish this, we needed to adjust the plans for both field data collection and post-collection data processing. The bulk of the work for field data collection consists of sending enumerators to households that have not already self-responded. For the 2020 Census, over a third of the approximately 152 million residential addresses in the country required one or more visits by our enumerators. Put simply, the time needed to complete the Nonresponse Followup (NRFU) workload is a function of how many cases enumerators can complete each hour (i.e., enumerator productivity) and the total number of hours worked. In 2010, enumerators recorded responses on paper forms. In 2020, we deployed iPhones on which enumerators received optimally routed daily case assignments, entered responses, and tracked their hours and mileage expenses. From our 2018 Census Test in Providence, R.I., we knew that this technology upgrade substantially improved productivity, such that we expected enumerators to complete 1.55 cases per hour versus 1.05 in 2010. Next, we deployed pay and bonus incentives designed both to get more enumerators hired and trained and to encourage them to work more than the average of 19 hours per week. While we were somewhat successful in hiring enumerators to replace the usual attrition, we were less successful in encouraging them to work additional hours per week. What did help, however, was that realized productivity was much higher than predicted. Our iPhone-enabled enumerators finished data collection by completing an average of 1.92 cases per hour, nearly double the productivity of their predecessors in 2010.
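The relationship between workload, productivity and hours worked can be sketched as simple arithmetic. The snippet below is illustrative only. The productivity figures (1.05, 1.55 and 1.92 cases per hour) and the 19-hour week come from the text above, while the workload of 50 million cases and the staff of 300,000 enumerators are hypothetical round numbers chosen for the example, not Census Bureau figures. It also treats every case as one unit of work, which is a simplification.

```python
def weeks_to_complete(workload_cases, cases_per_hour, enumerators, hours_per_week):
    """Estimate weeks of fieldwork: workload / (productivity * total weekly hours)."""
    weekly_capacity = cases_per_hour * enumerators * hours_per_week
    return workload_cases / weekly_capacity


# Hypothetical inputs (assumptions for illustration, not official figures).
WORKLOAD = 50_000_000
ENUMERATORS = 300_000
HOURS_PER_WEEK = 19  # average hours per week quoted in the text

# Cases per hour quoted in the text.
RATES = {
    "2010 (paper forms)": 1.05,
    "2020 (expected)": 1.55,
    "2020 (realized)": 1.92,
}

for label, rate in RATES.items():
    weeks = weeks_to_complete(WORKLOAD, rate, ENUMERATORS, HOURS_PER_WEEK)
    print(f"{label}: {weeks:.1f} weeks")
```

Because the time needed is inversely proportional to productivity, the same workload and staffing finish faster by exactly the ratio of the two rates, whatever hypothetical workload is plugged in.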

Another key area of innovation for the 2020 Census was a set of improvements to our Master Address File (MAF), which featured strong partnerships with tribal, federal, state and local agencies and the use of modern Geographic Information Systems (GIS) to ensure we had a complete and accurate list of addresses to enumerate. Using GIS tools, we were able to reduce our expensive in-field Address Canvassing operation from 100% of addresses in 2010 to only 35% of addresses in 2020. Our accurate and updated MAF also made it easier for residents to respond through what we call “NonID response,” where a housing unit ID is not required to respond. We’ve received over 22 million such responses and have been able to match the address the respondents provided to the MAF in over 90% of the cases.

This successful deployment of technology was critical in enabling us to complete the count during the pandemic. Not only did technology make our enumerators more productive, it allowed us to adjust to conditions in the field much more efficiently than in the past. Being able to manage cases daily and message enumerators in real time allowed managers in our area census offices to adjust operations efficiently when confronted with the many challenges Mother Nature threw at us this year. It also aided our efforts to redeploy enumerators to areas lagging in the count. We redeployed over 26,000 enumerators outside their home areas and in many cases to different states. This included getting an extra 1,530 enumerators to Louisiana, which was hard hit by Hurricanes Laura and Delta. This, together with the hard work of our incredibly dedicated field staff, enabled us to get every state, the District of Columbia and Puerto Rico above a 99% completion rate and to reach a national rate of 99.98%.

Early Impressions on the Quality of the 2020 Census

The processing of the data collected from a census is just as important to ensuring quality as data collection itself, and it is a large and complex task on its own. It is during post-collection data processing that we ensure responses are coded to the correct location, that duplicate responses are removed, and that a final, complete universe of the U.S. population is created to which we can apply a variety of quality metrics. Prior to beginning post-collection data processing, we conducted an early review of the data as they were being collected, which was considerably more review than in past censuses. These early, real-time examinations of the data were not able to provide a definitive picture of the overall U.S. population, but they did allow us to assess initial indicators of data quality and to identify and fix several data processing errors. This work is just starting and, as in the past, will continue through the release of the results from our most rigorous and thorough quality assessment, the Post-Enumeration Survey (PES).

We’re seeing increased interest in these metrics from the media, stakeholders and professional organizations. The American Statistical Association issued a report discussing the metrics they would like to see, and I’ve reached out to them to discuss priorities and how they might assist the Census Bureau further. We are committed to publishing quality metrics as they become available. The first major release will come in December, when we publish our Demographic Analysis estimates, which provide an independent assessment of the 2020 population counts.

It’s premature to definitively describe the quality of the 2020 Census or assess its fitness for use. But it is possible to look at some preliminary metrics to get a general sense of where we stand. I’ve already discussed one such metric, the completion rate, which measures whether we can resolve the status of an address in our MAF and determine the number of residents living there. With a 99.98% completion rate, we know that we were able to get at least some basic information for nearly every known residential address in the country. Another metric is the self-response rate, which measures the percentage of addresses where a householder completed the census online, over the phone, or by returning a paper questionnaire. At 67%, the 2020 Census self-response rate is slightly higher than what we achieved in 2010. This is important because self-responses yield the most complete and accurate information.

Households that do not self-respond are visited at least once by an enumerator. If our enumerators are unable to speak to a resident after one or more visits, we attempt to resolve that address with high-quality administrative records that have accurate data for that address. This is new to the 2020 Census and has allowed us to accurately enumerate approximately 5.6% of the nation’s addresses. If, after repeated visits, we fail to contact a resident at an address that does not have high-quality administrative data, our enumerators will attempt to get basic information from a knowledgeable proxy such as a neighbor. The proxy rate is the share of all NRFU interviews conducted with proxy respondents. While still preliminary and subject to change as we continue to process the data and eliminate duplicates, the 2020 proxy rate of approximately 24% appears close to the 2010 rate. There is some evidence of pandemic impacts, however, as preliminary results show that proxy rates in college towns are higher. Many residents of off-campus housing left town during the self-response phase, which coincided with the first wave of the pandemic, and were unlikely to have returned by the time enumerators visited to follow up. To reiterate, the proxy rates are based upon early tabulations, and a clearer, more definitive picture of the overall extent of proxy usage will emerge once all post-collection data processing has been completed.
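The follow-up sequence and the metrics described above can be illustrated with a small sketch. Everything in it is hypothetical: the `Address` record, its field names, the order of checks, and the sample data are invented for illustration and do not reflect the Census Bureau’s actual systems or business rules.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Address:
    """Hypothetical record of what happened at one address."""
    self_responded: bool = False
    resident_interviewed: bool = False     # enumerator reached a resident
    has_good_admin_records: bool = False   # high-quality administrative data exist
    proxy_interviewed: bool = False        # e.g. a neighbor answered


def resolution(addr):
    """Classify an address by the first source that resolved it."""
    if addr.self_responded:
        return "self_response"
    if addr.resident_interviewed:
        return "nrfu_resident"
    if addr.has_good_admin_records:
        return "admin_records"
    if addr.proxy_interviewed:
        return "nrfu_proxy"
    return "unresolved"


def summarize(addresses):
    """Compute the early quality metrics as fractions of the address list."""
    counts = Counter(resolution(a) for a in addresses)
    total = len(addresses)
    interviews = counts["nrfu_resident"] + counts["nrfu_proxy"]
    return {
        "self_response_rate": counts["self_response"] / total,
        "admin_records_share": counts["admin_records"] / total,
        # Proxy rate: share of NRFU interviews conducted with a proxy.
        "proxy_rate": counts["nrfu_proxy"] / interviews if interviews else 0.0,
        "completion_rate": 1 - counts["unresolved"] / total,
    }


# Invented sample of ten addresses.
sample = (
    [Address(self_responded=True)] * 6
    + [Address(resident_interviewed=True)] * 2
    + [Address(has_good_admin_records=True)]
    + [Address(proxy_interviewed=True)]
)
for name, value in summarize(sample).items():
    print(f"{name}: {value:.1%}")
```

Note that the proxy rate uses NRFU interviews as its denominator rather than all addresses, which is why it can be much larger than the share of addresses resolved by proxy.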

A final early metric of quality is the completeness of responses. A challenge for all surveys is getting respondents to answer every question fully and accurately, and the decennial census is no exception. Preliminary indications are that item nonresponse rates for the questions on date of birth, sex, race and Hispanic origin are higher than in 2010. Some observers were concerned that the compressed schedule would lead enumerators to accept more incomplete responses. However, we are seeing elevated item nonresponse rates for all response modes, which suggests something else is at work. This will require additional analysis as processing progresses. But note that the Census Bureau has well-established procedures for coping with missing items on the decennial census and its other surveys.

In summary, while there’s much more work to be done to assess the quality of the 2020 Census and its fitness for its constitutional and statutory uses, early indications are that, despite the many challenges that confronted us in conducting the 2020 Census, we do not yet have evidence of any unusual issues. No census is perfect. The 2020 Census won’t be perfect either, but the imperfections seen thus far can be addressed in standard ways, as in prior censuses.

Related blogs

Random Samplings Blog
2020 Census Operational Statistics on the Planning Database
The Census Bureau released the 2023 PDB, the first PDB to include operational statistics from the 2020 Census, including metrics on online self-response.

Random Samplings Blog
Updates to OMB’s Race/Ethnicity Standards
OMB published the results of its review of SPD 15 and issued updated standards for collecting and reporting race and ethnicity data across federal agencies.

Random Samplings Blog
Upcoming 2020 Census Coverage Estimates
The U.S. Census Bureau released coverage estimates for the 2020 Census.

Random Samplings Blog
The Post-Enumeration Survey: Measuring Coverage Error
Although we undertake extensive efforts to accurately count everyone in the decennial census, sometimes people are missed or duplicated.

Random Samplings Blog
Using Demographic Benchmarks to Help Evaluate 2020 Census Results
One of the primary methods of evaluating the quality of a census is comparing the results to other population benchmarks.

Random Samplings Blog
Programa de Evaluaciones y Experimentos del Censo del 2020
This blog describes the series of formal evaluations that measure different aspects of census operations and challenges.

Random Samplings Blog
2020 Census Program for Evaluations, Experiments, and Assessments
This blog describes the series of formal evaluations and assessments that measure different aspects of census operations and specific challenges.

Random Samplings Blog
Improvements to the 2020 Census Race and Hispanic Origin Question Designs, Data Processing, and Coding Procedures
This blog discusses how we improved the census questions on race and Hispanic origin, also known as ethnicity, between 2010 and 2020.

How We Complete the Census When Demographic and Housing Characteristics Are Missing
Although we strive to obtain all demographic and housing data from every individual in the census, missing data are part of every census process.

Random Samplings Blog
Censo del 2020: Métricas de calidad, Publicación 2
This blog provides highlights from the second set of 2020 Census operational quality metrics.

Random Samplings Blog
2020 Census Operational Quality Metrics: Release 2
Today we released the second round of 2020 Census operational quality metrics.

Random Samplings Blog
Examining Operational Quality Metrics
The Census Bureau is taking a multifaceted approach to studying the quality of the 2020 Census, so as to produce a more complete and informative picture.

Random Samplings Blog
Comparisons to Benchmarks as a Measure of Quality
Data quality is multidimensional and so approaching it from multiple angles produces a more insightful and holistic picture of a dataset.

Random Samplings Blog
2020 Census Data Review
For the 2020 Census, we are conducting one of the most comprehensive reviews in recent census history.

Random Samplings Blog
Revisión de los datos del Censo del 2020
In this blog we discuss how we are conducting one of the most comprehensive data reviews in recent census history for the 2020 Census.

Random Samplings Blog
Completing the Census When Households or Group Quarters Don't Respond
As we continue to process 2020 Census responses, people have asked what happens when we don’t get a response from an address.

Random Samplings Blog
Cómo completamos el censo cuando los hogares no responden
As we continue to process 2020 Census responses, people have asked what happens when we don’t get a response from an address.

Random Samplings Blog
Administrative Records and the 2020 Census
Each decade we are asked, “Why don’t you just use the information the government already has about me for the census? Why ask me again?”

Random Samplings Blog
Los registros administrativos y el Censo del 2020
This blog describes how the 2020 Census used administrative records to count people who did not respond.

Census Operations
Introduction to Quality Indicators: Operational Metrics
In the coming weeks, the U.S. Census Bureau will release the first set of results from the 2020 Census. Our goal for every census is to count everyone once, only once, and in the right place.

Random Samplings Blog
2020 Census Group Quarters
As we continue processing 2020 Census results, we’d like to provide more information on how we count people living in group quarters (GQs).

Census Operations
Finding 'Anomalies' Illustrates 2020 Census Quality Checks Are Working
We’re in the midst of data processing for the 2020 Census. As Acting Census Bureau Director Ron Jarmin acknowledged in a recent blog, we’ve discovered some “anomalies” along the way that we’re looking into and resolving.

Random Samplings Blog
Encontrar ‘anomalías’ demuestra que los controles de calidad funcionan
On March 9, 2021, the U.S. Census Bureau published a blog (in English) about the “anomalies” we found while processing 2020 Census data.

Random Samplings Blog
Adapting Field Operations to Meet Unprecedented Challenges
As we process census responses and analyze the quality of the 2020 Census, it’s helpful to look back at some of the unprecedented challenges we faced during this census.

Random Samplings Blog
Adaptación de las operaciones de campo para enfrentar desafíos
The U.S. Census Bureau shared information in a blog post on March 1, 2021, about how conducting a census is an enormous task, even under ideal circumstances.

Random Samplings Blog
Ensuring a Robust and Accurate Data Quality Analysis in the 2020 Census
Asking outside experts to review our work is standard operating procedure at the U.S. Census Bureau. It underscores our commitment to quality and transparency.

Random Samplings Blog
Timeline for Releasing Redistricting Data
We expect to deliver the redistricting data to the states and the public by Sept. 30, 2021.

Random Samplings Blog
Census Data Processing 101
Michael Thieme describes how census data processing works to ensure the census is accurate.

Director's Blog
2020 Census Processing Updates
I’m writing to provide an update on data processing for the 2020 Census.

Page Last Revised - October 8, 2021