Evaluating String Comparator Performance for Record Linkage
William E. Yancey
We compare variations of string comparators based on the Jaro-Winkler comparator and the edit distance comparator. We apply the comparators to Census data to see which are better classifiers for matches and non-matches, first by comparing their classification abilities using a ROC-curve-based analysis, then by directly comparing two candidate comparators on record linkage results.
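As a rough illustration of the kind of evaluation described above, the following minimal sketch scores string pairs with a length-normalized edit distance and summarizes how well the scores separate matches from non-matches using the area under the ROC curve. Everything here is an assumption made for illustration: the function names `edit_distance`, `edit_similarity`, and `auc` are hypothetical, the normalization by the longer string's length is one common choice rather than necessarily the one used in the paper, and the name pairs are made-up toy examples, not Census data. The Jaro-Winkler comparator is not implemented here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical strings.

    Assumption: normalize by the length of the longer string.
    """
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def auc(match_scores, nonmatch_scores) -> float:
    """Area under the ROC curve, computed as the fraction of
    (match, non-match) pairs in which the match scores higher;
    ties count as one half."""
    wins = 0.0
    for m in match_scores:
        for n in nonmatch_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(match_scores) * len(nonmatch_scores))


# Made-up toy pairs (not Census data), labeled by hand for illustration.
toy_matches = [("SMITH", "SMYTH"), ("JOHNSON", "JOHNSTON"), ("MARTHA", "MARHTA")]
toy_nonmatches = [("SMITH", "JONES"), ("BROWN", "GREEN"), ("MARTHA", "DAVID")]

match_scores = [edit_similarity(a, b) for a, b in toy_matches]
nonmatch_scores = [edit_similarity(a, b) for a, b in toy_nonmatches]
print("AUC on toy pairs:", auc(match_scores, nonmatch_scores))
```

An AUC near 1 means the comparator ranks almost every true match above almost every non-match, while 0.5 is no better than chance. Two comparators can be compared by computing this quantity for each on the same labeled pairs.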
Source: U.S. Census Bureau, Statistical Research Division
Created: June 13, 2005
Last revised: June 13, 2005