Research to Improve Data on Race and Ethnicity

Background

The United States Census Bureau has a long history of conducting research to improve questions and data on race and ethnicity. Since the 1970s, the Census Bureau has conducted content tests to research and improve the design and function of different questions, including questions on race and ethnicity. Since the 1980 Census, the Census Bureau has collected race and ethnicity data following U.S. Office of Management and Budget (OMB) guidelines, and these data are based upon self-identification.

One challenge these data collections have faced in recent decades is that Americans view “race” and “ethnicity” differently now in the 21st century than in decades past. Since 1980, Census Bureau research has found that an increasing number of people found the separate ethnicity and race question categories confusing or expressed a desire to see their own specific group on the census questionnaire. Our research has also found that, over time, a growing number of people do not identify with any of the official OMB race categories as prescribed in the original 1977 standards or the revised 1997 standards, and this led to an increasing number of respondents who were racially classified as “Some Other Race.”

In fact, in 2000 and in 2010, the Some Other Race (SOR) population, which was intended to be a small residual category, was the third largest race group. This was primarily due to reporting by Hispanics who did not identify with any of the OMB race categories; Hispanics make up the overwhelming majority of those classified as SOR. In addition, segments of other populations, such as Afro-Caribbean and Middle Eastern or North African populations, did not identify with any of the OMB race categories and identified as SOR.

Research on Race and Ethnicity

The Census Bureau’s 2010 Census Alternative Questionnaire Experiment (AQE) and the 2015 National Content Test (NCT)

Taking note of this, throughout the decade of the 2010s, Census Bureau researchers explored different strategies for improving respondent understanding of the questions, as well as improving the accuracy of the resulting data produced on race and ethnicity. This research began in 2008, with the design of the 2010 Census Alternative Questionnaire Experiment (AQE) Research on Race and Hispanic Origin, which at the time was the most comprehensive research effort on race and Hispanic origin ever undertaken by the Census Bureau. In 2012, the AQE research was completed, and the results demonstrated promising strategies that combined race and ethnicity into one question and addressed challenges and complexities of race and Hispanic origin measurement and reporting.

While the 2010 AQE research set the foundation, we still needed additional empirical research to test prospective question designs for the content of the 2020 Census, particularly with the new emphasis on using web-based designs for data collection. Thus, throughout 2014 and 2015, our Census Bureau research team shared and discussed plans for testing different question designs, and participated in numerous public dialogues about the research plans to obtain community feedback. The ultimate goal of this research was to improve the question design and data quality for the 2020 Census, while addressing community concerns that we have heard over the past several years, including the call for more detailed, disaggregated data for our diverse American experiences as German, Mexican, Korean, Jamaican, and myriad other identities. This research effort culminated in the 2015 National Content Test (NCT), which was conducted to explore ways to improve our race/ethnicity questions, to better measure and represent our nation's myriad racial/ethnic identities, and to build upon extensive research on race and ethnicity previously conducted by the Census Bureau.

Results from 2015 NCT Research

In October 2016, we began discussing the 2015 NCT’s preliminary results on race and ethnicity with the public. The 2015 NCT results built upon the 2010 AQE results, showing no changes to distributions for major groups; obtaining decreased reporting of “Some Other Race”; achieving lower item nonresponse for the combined race/ethnicity question than for the separate race and ethnicity questions; and gaining higher overall consistency of race/ethnicity reporting for Hispanics. The 2015 NCT also yielded a very important improvement on the 2010 AQE research: it obtained the same or higher levels of detailed reporting across all groups, including for Hispanics and Asians, through the use of an innovative combined question design with multiple detailed checkboxes and write-in response areas.

Further, our NCT research explored ways to collect and tabulate data for respondents of Middle Eastern or North African (MENA) heritage. During the 1990s, as part of the public comment process for the 1997 OMB Standards, OMB received a number of requests to add an ethnicity category for Arabs and Middle Easterners to the minimum collection standards, but OMB encouraged further research on how to collect and improve data on this population. The 2010 AQE was part of the research effort on how to collect and improve data for the MENA population, as findings from AQE focus groups revealed that a number of MENA participants did not see themselves in the current race and ethnicity response categories, and focus group participants often recommended a separate Middle Eastern, North African, or Arab category.

The Census Bureau also conducted extensive outreach with MENA community leaders and experts leading up to the 2020 Census about the development of a MENA category. In response to a December 2014 Federal Register Notice about the plans for the 2015 NCT, the Census Bureau received thousands of public comments from the MENA community supporting the testing of a MENA category. These sentiments were echoed during a May 2015 Census Bureau Forum on Ethnic Groups from the Middle East and North Africa, where over 30 MENA experts were updated on the 2015 NCT plans for testing a MENA category and invitees shared their feedback on a potential MENA category. The experts provided feedback on the term “Middle Eastern or North African,” as well as the Census Bureau’s working classification of MENA and potential tabulations of MENA responses to the question(s) on race and ethnicity in the 2020 Census.

These findings and ongoing dialogues with stakeholders led to the testing of a separate Middle Eastern or North African category in the 2015 NCT. The NCT research findings show that the use of a distinct MENA category elicits higher-quality data, and that people who identify as MENA use the MENA category when it is available, whereas they have trouble identifying as only MENA when no category is available.

2016 American Community Survey Content Test

From February to June of 2016, the U.S. Census Bureau conducted the 2016 American Community Survey (ACS) Content Test. While the optimal design for collecting data on race and ethnicity was determined by the 2015 NCT, the 2016 ACS Content Test served as an operational test of the concepts that were investigated in the 2015 NCT. The 2016 ACS Content Test provided an opportunity to test additional data collection modes and to examine contextual data from the ACS characteristic variables. Specifically, the 2016 ACS Content Test evaluated interviewer-administered collection modes, assessed the race and ethnicity questions against demographic and socioeconomic data, and separately compared the race and ethnicity results to data from the ancestry question.

The 2016 ACS Content Test results for race and ethnicity confirmed the results from the 2010 AQE and the 2015 NCT in that a combined question format and use of the MENA category result in higher data quality for race and ethnicity. Additionally, the 2016 ACS Content Test indicated that quality race and ethnicity data can be collected in the ACS environment using a combined question format and MENA category.

2017 Census Test

The 2017 Census Test was a nationwide self-response test that allowed the U.S. Census Bureau to assess the feasibility of collecting information on tribal enrollment, which is distinct from American Indian and Alaska Native racial identification. The 2017 Census Test consisted of two parts: an initial self-response survey and a follow-up reinterview component. The reinterview component further assessed the consistency of the self-response tribal enrollment questions.

Although the original intent of this study was to compare self-reported tribal enrollment responses with tribal enrollment records, this type of analysis was not possible. As such, conclusions could not be made about the validity of self-reported tribal enrollment data. Rather, the findings from the 2017 Census Test assessed the feasibility of collecting tribal enrollment data in a census environment. Concerns from the Census National Advisory Committee, from the National Congress of American Indians, and from tribal leaders regarding collecting tribal enrollment in a census environment ultimately led to the decision not to include any tribal enrollment questions in the 2020 Census.

2018 Census Test

The Census Bureau’s research leading up to the 2020 Census identified that a combined race and ethnicity question with multiple detailed checkboxes and a dedicated Middle Eastern or North African category is the optimal design for improving race and ethnicity data, in comparison with designs that use two separate questions. This approach was strongly supported by myriad stakeholders and organizations during OMB’s review of potential revisions to the 1997 Statistical Policy Directive No. 15 (SPD 15). However, the Census Bureau does not make a unilateral decision on the content of the Census. In fact, determining the content for a census is an extensive undertaking with a three-pronged approach involving empirical research; outreach and engagement with stakeholders; and, ultimately, review and approval from the U.S. Office of Management and Budget and the United States Congress.

In accordance with the 1997 OMB Standards, the 2018 End-to-End Census Test and the 2020 Census used two separate questions for collecting data on race and ethnicity. However, improvements to the coding, editing, and processing of race and ethnicity data collected through the separate-question design were implemented in the 2018 End-to-End Census Test and the 2020 Census. These improvements included collecting multiple Hispanic ethnicities such as Mexican and Puerto Rican; adding a write-in area and examples for the White racial category and for the Black or African American racial category; removing the term “Negro”; and adding examples for the American Indian or Alaska Native racial category.

2020 Census Findings

In the 2020 Census, the Census Bureau collected ethnicity and race data in accordance with the 1997 SPD 15 framework, which required two separate questions on ethnicity and race. Our research and public feedback over the past decade revealed strong interest among respondents in being able to self-identify their detailed racial/ethnic background, such as German, Lebanese, Mexican, Jamaican, Nigerian, Chinese, Navajo, or Samoan. The 1997 SPD 15 encouraged this collection of detailed responses. To address this interest, new examples and write-in areas were added to the 2020 Census ethnicity question and race question to give respondents from all backgrounds the opportunity to self-identify their racial/ethnic identities.

For the first time in a decennial census race question, the 2020 design collected detailed responses for the White population and for the Black or African American population. Through these improvements, we collected detailed responses for all major categories (Hispanic, White, Black or African American, Asian, American Indian or Alaska Native, Native Hawaiian or Other Pacific Islander, and Some Other Race). In turn, this provided the ability to produce detailed tabulations for myriad population groups in the United States, such as German, Lebanese, Mexican, Jamaican, Nigerian, Chinese, Navajo, Samoan, and Brazilian. These updates for 2020 yielded a more thorough and accurate portrait of how people self-identify and report their ethnicity and race within the context of a two-question format.

Page Last Revised - December 20, 2024