Later this month, the U.S. Census Bureau plans to release the first results from the 2020 Census on race and ethnicity. These data will provide a snapshot of the racial and ethnic composition and diversity of the U.S. population as of April 1, 2020.
We will release the following measures of diversity to clearly present and analyze the complexity of the 2020 Census results compared to the 2010 Census results: the Diversity Index, prevalence rankings and the diffusion score, and prevalence maps.
In this blog, we provide a preview of these measures and explain what each can tell you about the nation’s racial and ethnic composition and diversity.
The concept of “composition” refers to the racial and ethnic makeup of a population.
The concept of “diversity” refers to the representation and relative size of different racial and ethnic groups within a population, where diversity is maximized when all groups are represented in an area and have equal shares of the population.
First, it’s important to know how we collect and tabulate data on race and ethnicity in the 2020 Census. Like other statistical agencies, we follow standards on race and ethnicity set by the U.S. Office of Management and Budget (OMB) in 1997. These standards guide how the federal government collects and presents data on these topics. Per these standards:
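For race, the OMB standards identify five minimum categories: White; Black or African American; American Indian or Alaska Native; Asian; and Native Hawaiian or Other Pacific Islander.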
We use a sixth category, Some Other Race, for people who do not identify with any of the OMB race categories.
We tabulate statistics on people who report only one race in one of these six “race alone” categories, and we include people who report multiple races in the “Multiracial” population, also referred to as the “Two or More Races” population.
For ethnicity, the OMB standards classify individuals in one of two categories: “Hispanic or Latino” or “Not Hispanic or Latino.” We use the term “Hispanic or Latino” interchangeably with the term “Hispanic,” and also refer to this concept as “ethnicity.”
The OMB standards also emphasize that people of Hispanic origin may be of any race. In data tables, we often cross-tabulate the race and Hispanic origin categories to display Hispanic as a single category and the non-Hispanic race groups as categories summing up to the total population.
These diversity calculations require mutually exclusive (nonoverlapping) racial and ethnic categories. For our analyses, we treat the Hispanic or Latino population of any race as one category; each of the race alone, non-Hispanic groups as an individual category; and the Multiracial, non-Hispanic group as a distinct category.
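The following groups are used in the diversity calculations: Hispanic or Latino; White alone, non-Hispanic; Black or African American alone, non-Hispanic; American Indian and Alaska Native alone, non-Hispanic; Asian alone, non-Hispanic; Native Hawaiian and Other Pacific Islander alone, non-Hispanic; Some Other Race alone, non-Hispanic; and Multiracial, non-Hispanic.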
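To make the nonoverlapping assignment concrete, here is a minimal sketch of how one person's responses could be mapped to a single category. The function name `diversity_category` and the label strings are assumptions made for illustration; this is not the Census Bureau's tabulation code.

```python
from typing import Iterable


def diversity_category(is_hispanic: bool, races: Iterable[str]) -> str:
    """Assign one person to exactly one nonoverlapping category.

    Hypothetical helper: `races` holds the race categories the person
    reported, such as "White" or "Asian".
    """
    if is_hispanic:
        # Hispanic or Latino people of any race form a single category.
        return "Hispanic or Latino"
    reported = sorted(set(races))
    if not reported:
        raise ValueError("at least one race is required")
    if len(reported) > 1:
        # Two or more races reported, not Hispanic.
        return "Multiracial, non-Hispanic"
    return f"{reported[0]} alone, non-Hispanic"


print(diversity_category(True, ["White", "Asian"]))   # Hispanic or Latino
print(diversity_category(False, ["Asian"]))           # Asian alone, non-Hispanic
print(diversity_category(False, ["White", "Asian"]))  # Multiracial, non-Hispanic
```

Checking the Hispanic flag first is what keeps the categories from overlapping: race only determines the category for people who are not Hispanic.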
The recent blog Improvements to the 2020 Census Hispanic Origin and Race Questions and Coding Procedures describes how we code write-in responses into these standard categories.
One of the measures we will use to present the 2020 Census results is the Diversity Index, or DI. This index shows the probability that two people chosen at random will be from different racial and ethnic groups.
The DI is bounded between 0 and 1. A value of 0 indicates that everyone in the population has the same racial and ethnic characteristics, while a value close to 1 indicates that everyone in the population has different characteristics.
We converted the probabilities into percentages to make the results easier to interpret. In this format, the DI tells us the chance that two people chosen at random will be from different racial and ethnic groups.
To illustrate how the DI works, we compare the composition of three hypothetical populations below.
In Figure 1, the population is made up of only two equally sized groups. The DI for this population indicates that there is a 50% chance that two people chosen at random will be from different racial and ethnic groups.
Figure 2, the second hypothetical example, shows a population with four equally sized groups where the DI is 75%. The chance that two people come from different racial and ethnic groups is higher than in Figure 1, even though each group is smaller than in the first example.
Figure 3 shows a hypothetical population with four unequally sized groups and a DI of 70%. Comparing Figure 2 and Figure 3, we see that the relative size of the groups affects the DI: when some groups are larger than others, two people chosen at random are more likely to come from the same group, so the DI is lower.
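The values in these examples are consistent with computing the DI as 1 minus the sum of the squared population shares of each group. The sketch below works through that arithmetic. The function name `diversity_index` is an assumption, and the 40/30/20/10 split is one hypothetical set of unequal shares that happens to give 70%, not necessarily the shares shown in Figure 3.

```python
from typing import Sequence


def diversity_index(group_sizes: Sequence[float]) -> float:
    """Chance that two randomly chosen people are from different groups.

    Computed as 1 minus the sum of squared population shares.
    `group_sizes` may be counts or percentages; they are normalized here.
    """
    total = sum(group_sizes)
    if total <= 0:
        raise ValueError("population must be positive")
    shares = [size / total for size in group_sizes]
    return 1.0 - sum(share * share for share in shares)


print(f"{diversity_index([50, 50]):.0%}")          # 50%
print(f"{diversity_index([25, 25, 25, 25]):.0%}")  # 75%
print(f"{diversity_index([40, 30, 20, 10]):.0%}")  # 70%
```

Under this formula, a population split into k groups reaches its highest possible DI, 1 - 1/k, when all k groups are the same size, which is why four equal groups score higher than four unequal ones.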
The DI values calculated from 2010 Census data for the United States and selected states illustrate how the metric can vary based on the distribution of the population by race and ethnicity. In 2010, there was a 54.9% chance that two people chosen at random from the U.S. population would be from different racial or ethnic groups (Figure 4).
In 2010, the DI varied greatly by state (not shown). Among all states, the DI ranged from a low of 10.8% in Maine to a high of 75.1% in Hawaii.
We can also measure racial and ethnic diversity using prevalence rankings and the diffusion score.
With prevalence rankings, which show the most common group in an area, we look at patterns in the percentage of the population that falls into the largest race or ethnic group, second-largest group, and third-largest group. The prevalence ranking approach uses tables or graphs to show the percentages of the largest groups.
From the rankings on 2010 Census data, we find:
The diffusion score measures the percentage of the population that is not in the first-, second- or third-largest racial and ethnic groups combined. This metric tells us how diverse and unconcentrated the population is relative to the three largest groups.
For example, the diffusion score for the United States was 7.7% in 2010, as 7.7% of the population was not one of the three largest racial or ethnic groups. When we look across the country, we see a lot of variation in the diffusion scores by state in 2010.
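As a rough sketch of how these two measures relate, the code below ranks groups by size and then sums the share of the population outside the three largest groups. The function names and the group shares ("Group A" through "Group E") are hypothetical and are not Census results.

```python
from typing import Dict, List, Tuple


def prevalence_ranking(sizes: Dict[str, float], top: int = 3) -> List[Tuple[str, float]]:
    """Return the `top` largest groups, from largest to smallest."""
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


def diffusion_score(sizes: Dict[str, float], top: int = 3) -> float:
    """Percentage of the population outside the `top` largest groups."""
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("population must be positive")
    ranked = sorted(sizes.values(), reverse=True)
    return 100 * sum(ranked[top:]) / total


# Hypothetical population shares, in percent.
example = {"Group A": 55, "Group B": 20, "Group C": 15, "Group D": 6, "Group E": 4}

print(prevalence_ranking(example))  # [('Group A', 55), ('Group B', 20), ('Group C', 15)]
print(diffusion_score(example))     # 10.0
```

In this hypothetical population, the three largest groups account for 90% of people, so the diffusion score is 10%.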
The final conceptual approach to illustrate the racial and ethnic diversity of the population is to map the most prevalent racial or ethnic group for all counties in the United States.
These prevalence maps show the geographic distribution of the largest or second-largest racial or ethnic group. This approach is similar to the prevalence ranking approach shown above.
Below we provide examples of prevalence maps for counties using data from the 2010 Census. We will use similar prevalence maps to highlight the racial and ethnic composition and diversity in the 2020 Census results.
Figure 5 shows the most prevalent racial or ethnic group for each county in 2010.
For most counties, White alone, non-Hispanic was the most prevalent group.
However, we see some regional variation:
Figure 6 shows the second-most prevalent racial and ethnic group for each county.
More racial and ethnic groups are represented on this map than on the map of the most prevalent group. For example, the Asian alone, non-Hispanic population and the Multiracial, non-Hispanic population are now represented in some counties on the map as the second-most prevalent group. As with Figure 5, we also see regional patterns in the racial and ethnic distribution of the population.
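The value behind each county on a prevalence map can be sketched as a lookup of the largest, or second-largest, group in that county. The function name `prevalent_group`, the county names, and the counts below are all hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Optional


def prevalent_group(counts: Dict[str, int], rank: int = 1) -> Optional[str]:
    """Return the group at the given prevalence rank (1 = largest).

    Groups with no population are ignored. Returns None when the county
    has fewer groups than `rank`. Ties keep their input order.
    """
    present = [(group, n) for group, n in counts.items() if n > 0]
    ranked = sorted(present, key=lambda item: item[1], reverse=True)
    if 1 <= rank <= len(ranked):
        return ranked[rank - 1][0]
    return None


# Hypothetical county populations by group.
counties = {
    "County X": {
        "White alone, non-Hispanic": 700,
        "Hispanic or Latino": 200,
        "Asian alone, non-Hispanic": 100,
    },
    "County Y": {
        "Hispanic or Latino": 500,
        "White alone, non-Hispanic": 450,
        "Multiracial, non-Hispanic": 50,
    },
}

for name, counts in counties.items():
    print(name, "|", prevalent_group(counts, 1), "|", prevalent_group(counts, 2))
```

A real map would also need a rule for ties between groups of equal size; this sketch simply keeps the first group encountered.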
We chose this new set of diversity measures — the DI, prevalence maps, prevalence ranking, and diffusion scores — because they have clear conceptual definitions and interpretations. They also overcome some of the limitations of the diversity measures we have used in the past.
In the past, the Census Bureau has sometimes used the concepts of “majority” and “minority” to measure diversity, but this approach has several conceptual and practical challenges that limit its ability to illustrate the complex racial and ethnic diversity of the U.S. population.
For example, while some people classify individuals who identify with multiple population groups (such as Hispanic and White; White and Black or African American; and White and Asian) as part of the majority population, others classify them as part of the minority population. The dual identities of these groups highlight the social, political and economic complexities of race and ethnicity in 21st century U.S. society.
The inclusion of certain groups as part of the “majority” or “minority” has also become more complex and contested in recent decades, especially as many people may not identify with certain population groups even if that is how they are classified and tabulated per federal standards. The majority-minority approach is ambiguous, and it is further complicated by complex demographic and social realities.
To overcome these limitations, we focused on these alternative measures of racial and ethnic diversity to illustrate the racial and ethnic composition shown in the 2020 Census results. We plan to explore other diversity measures as part of our future research with 2020 Census data.
The analyses scheduled for release later this month will provide a complementary perspective to the statistics in the 2020 Census Redistricting Data (Public Law 94-171) Summary File and will help the public to understand the racial and ethnic makeup of the U.S. population and the myriad diversity of identities that people share.
In 2019, the Census Bureau formed the Disseminating Diversity Working Group to develop a strategy for producing statistics on racial and ethnic diversity in the 2020 Census data products and beyond. The working group comprises subject-matter experts in race and ethnicity, demography and data visualization. The diversity measures included in this blog were developed through research and collaborative discussions among the authors, as well as consultation with external experts and advisors. The blog was written by members of the working group and reflects our efforts to clearly communicate statistics about the racial and ethnic diversity of the U.S. population.