National Statistical Institutes often need to merge administrative files from a variety of sources that lack unique identifiers to facilitate matching. Agencies such as Eurostat need to connect data sources from different countries and to verify the confidentiality of microdata. To merge administrative lists, agencies need fast software for cleaning and standardizing the lists and for merging them. The U.S. Bureau of the Census has software for name standardization, address standardization, and matching that is considered state-of-the-art. The standardization software breaks names and addresses into components that are easily compared. The matching software accounts for typographical error, automatically estimates matching parameters, and optimizes sets of assignments over large groups of pairs of records.
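
As a rough illustration of the ideas described above, the following minimal Python sketch breaks names into components, scores pairs with a typo-tolerant comparison, and picks a one-to-one assignment between two tiny lists. It is not the Census Bureau software: the function names, the `TITLES` table, the use of `difflib.SequenceMatcher` as a stand-in string comparator, and the exhaustive search over assignments are all assumptions made for illustration, and automatic estimation of matching parameters is not shown.

```python
from difflib import SequenceMatcher
from itertools import permutations

# Hypothetical title table; a real standardizer would use far larger reference lists.
TITLES = {"mr", "mrs", "ms", "dr"}

def standardize_name(raw):
    """Split a free-form name into lowercase components (title, first, last)."""
    tokens = [t.strip(".,").lower() for t in raw.split()]
    tokens = [t for t in tokens if t]
    title = tokens.pop(0) if tokens and tokens[0] in TITLES else ""
    first = tokens[0] if tokens else ""
    last = tokens[-1] if len(tokens) > 1 else ""
    return {"title": title, "first": first, "last": last}

def similarity(a, b):
    """Typo-tolerant score in [0, 1]; SequenceMatcher is a stand-in comparator."""
    return SequenceMatcher(None, a, b).ratio()

def pair_score(rec_a, rec_b):
    """Sum the component similarities for first and last name."""
    return sum(similarity(rec_a[k], rec_b[k]) for k in ("first", "last"))

def best_assignment(list_a, list_b):
    """Exhaustive one-to-one assignment maximizing total score (tiny lists only)."""
    assert len(list_a) == len(list_b)
    best = max(
        permutations(range(len(list_b))),
        key=lambda perm: sum(
            pair_score(list_a[i], list_b[j]) for i, j in enumerate(perm)
        ),
    )
    return list(enumerate(best))

file_a = [standardize_name(n) for n in ["Dr. John Smith", "Mary Jones"]]
file_b = [standardize_name(n) for n in ["Marie Jones", "Jon Smith"]]
matches = best_assignment(file_a, file_b)  # (index in file_a, index in file_b) pairs
```

The exhaustive search is only workable for a handful of records; on large groups of pairs a dedicated assignment algorithm would be needed, which this sketch does not attempt.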
Source: U.S. Census Bureau, Statistical Research Division
Created: July 23, 2001