Research Reports

Abstract of RR97/02

Approximate String Comparison and its Effect on an Advanced Record Linkage System

Edward H. Porter and William E. Winkler, Bureau of the Census

KEY WORDS: string comparator, bigram, assignment algorithm, EM algorithm, latent class.

ABSTRACT

This paper examines various methods of string comparison for dealing with typographical error, models their relationship to the main likelihood ratio used in the Fellegi-Sunter decision rule, and shows how they improve matching performance.

Source: U.S. Census Bureau | Statistical Research Division | (301) 763-3215 (or chad.eric.russell@census.gov) | Last Revised: October 08, 2010