U.S. Department of Commerce

Research Reports

You are here: Census.govSubjects A to ZResearch Reports Sorted by Year › Abstract of RRS2007/05
Skip top of page navigation

Automatically Estimating Record Linkage False Match Rates

William E. Winkler

KEY WORDS: EM algorithm, unsupervised and semi-supervised learning

ABSTRACT

This paper provides a mechanism for automatically estimating record linkage false match rates in situations where the subset of the true matches is reasonably well separated from other pairs. The method provides an alternative to the method of Belin and Rubin (JASA 1995) and is applicable in more situations. We provide examples demonstrating why the general problem of error rate estimation (both false match and false nonmatch rates) is likely impossible in situations without training data and exceptionally difficult even in the extremely rare situations when training data are available.

CITATION:

Source: U.S. Census Bureau, Statistical Research Division

Created: June 13, 2007
Last revised: June 13, 2007


[PDF] or PDF denotes a file in Adobe’s Portable Document Format. To view the file, you will need the Adobe® Reader® Off Site available free from Adobe.

This symbol Off Site indicates a link to a non-government web site. Our linking to these sites does not constitute an endorsement of any products, services or the information found on them. Once you link to another site you are subject to the policies of the new site.

Source: U.S. Census Bureau | Statistical Research Division | (301) 763-3215 (or chad.eric.russell@census.gov) |   Last Revised: October 08, 2010