Statistical Quality Standard C4: Linking Data Records


Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate linking of data records.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers. All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to both automated and clerical record linkage used for statistical purposes. It covers linking that uses characteristics of an entity to determine whether multiple records refer to the same entity.


    Exclusions:
    In addition to the global exclusions listed in the Preface, this standard does not apply to:

    • Statistical attribute matching.
    • Linkages performed using only a unique identifier (e.g., Protected Information Key or serial number).
    • Linkages performed for quality assurance purposes.

Key Terms: Automated record linkage, blocking, clerical record linkage, field follow-up, record linkage, scoring weights, and statistical attribute matching.

Requirement C4-1: Throughout all processes associated with linking, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C4-2: A plan must be developed that addresses:

  1. Objectives for linking the files.
  2. Data sets and files to be linked.
  3. Verification and testing of the linking systems and processes.
  4. Training for staff involved in the clerical record linkage operations.
  5. Evaluation of the results of the linkage (e.g., link rates and clerical error rates).

Notes:

  1. Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.
  2. The Data Stewardship Policy DS021, Record Linkage, states the principles that record linkage activities must meet and provides a checklist that must be completed before record linkage activities begin.
  3. The Census Bureau Guideline, Record Linkage, provides guidance on procedures for automated and clerical record linkage.

Requirement C4-3: Record linkage processes must be developed and implemented to link data records accurately.


Sub-Requirement C4-3.1: Specifications and procedures for the record linkage systems must be developed and implemented.

    Examples of issues that specifications and procedures for automated record linkage systems might address include:

    • Criteria for determining a valid link.
    • Linking parameters (e.g., scoring weights and the associated cut-offs).
    • Blocking and linking variables.
    • Standardization of the variables used in linking (e.g., state codes and geographic entity names are in the same format on the files being linked).
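As an illustration only, the following sketch shows how the items listed above (standardization, blocking, scoring weights, and cut-offs) might fit together in a simple automated linkage. The standard does not prescribe a method; the log-ratio weighting used here is one common approach, and every name and value in the sketch (`STATE_CODES`, `M_U`, `UPPER_CUTOFF`, `LOWER_CUTOFF`, the field names) is a hypothetical assumption, not taken from any Census Bureau specification.

```python
import math
from collections import defaultdict

# Hypothetical lookup used to standardize state values to two-letter codes.
STATE_CODES = {"maryland": "MD", "md": "MD", "virginia": "VA", "va": "VA"}

# Hypothetical probabilities for each linking variable:
# m = P(values agree | same entity), u = P(values agree | different entities).
M_U = {"last_name": (0.95, 0.01), "birth_year": (0.90, 0.05)}

# Hypothetical cut-offs applied to the total score.
UPPER_CUTOFF = 5.0  # at or above: treat the pair as a link
LOWER_CUTOFF = 0.0  # below: treat the pair as a non-link


def standardize(record):
    """Return a copy with the state and last name in a common format."""
    out = dict(record)
    raw_state = record["state"].strip()
    out["state"] = STATE_CODES.get(raw_state.lower(), raw_state.upper())
    out["last_name"] = record["last_name"].strip().upper()
    return out


def score(a, b):
    """Sum log2 agreement or disagreement weights over the linking variables."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(total):
    """Apply the cut-offs; scores between them go to clerical review."""
    if total >= UPPER_CUTOFF:
        return "link"
    if total < LOWER_CUTOFF:
        return "non-link"
    return "clerical review"


def link(file_a, file_b, block_key="state"):
    """Compare only the record pairs that share the blocking variable."""
    blocks = defaultdict(list)
    for rec in map(standardize, file_b):
        blocks[rec[block_key]].append(rec)
    results = []
    for rec_a in map(standardize, file_a):
        for rec_b in blocks.get(rec_a[block_key], []):
            total = score(rec_a, rec_b)
            results.append((rec_a["id"], rec_b["id"], round(total, 2), classify(total)))
    return results
```

Blocking on the standardized state code means that records from different states are never compared, which is why the state values must be in the same format on both files before linking.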

    Examples of issues that specifications and procedures for clerical record linkage systems might address include:

    • Criteria for determining that two records represent the same entity.
    • Criteria for assigning records to a specific geographic entity or entities (i.e., geocoding).
    • Linking variables.
    • Guidelines for situations requiring referrals.
    • Criteria for sending cases to field follow-up.

Sub-Requirement C4-3.2: Record linkage systems must be verified and tested to ensure that all components function as intended.

    Examples of verification and testing activities for automated record linkage systems include:

    • Verifying that the specifications reflect system requirements.
    • Verifying that the systems and software implement the specifications accurately.
    • Performing a test linkage to ensure systems work as specified.
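One way a test linkage might be checked is to run the system on a small test file whose true links are already known and compare the two sets of links. The sketch below assumes each link is represented as a pair of record identifiers; the function name `evaluate_test_linkage` and the result keys are hypothetical.

```python
def evaluate_test_linkage(produced_links, expected_links):
    """Compare links produced by the system with the known links in a test file."""
    produced = set(produced_links)
    expected = set(expected_links)
    return {
        "correct": len(produced & expected),
        "false_links": sorted(produced - expected),   # linked, but should not be
        "missed_links": sorted(expected - produced),  # should be linked, but were not
    }
```

Any false or missed link in the test run points to a place where the software may not implement the specifications as intended.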

    Examples of verification and testing activities for clerical record linkage systems include:

    • Verifying that the specifications reflect system requirements.
    • Verifying that the instructions will accomplish what is expected.
    • Testing computer systems that support clerical linking operations.

Sub-Requirement C4-3.3: Training for the staff involved in clerical record linkage (as identified during planning) must be developed and provided.

    Examples of training activities include:

    • Instructing clerks on how to implement the specifications.
    • Providing a training database to give clerks a chance to practice their skills.
    • Assessing error rates of clerks and providing feedback.

Sub-Requirement C4-3.4: Systems and procedures must be developed and implemented to monitor and evaluate the accuracy of the record linkage operations and to take corrective actions if problems are identified.

    Examples of monitoring and evaluation activities for automated record linkage operations include:

    • Evaluating the accuracy of automated linkages by a manual review.
    • Monitoring link rates, investigating deviations from historical results, and taking corrective action if necessary.
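A minimal sketch of link-rate monitoring follows, assuming that link rates from earlier cycles are available and that a deviation is flagged when the current rate falls more than a chosen number of standard deviations from the historical mean. The rule, the function names, and the default `tolerance_sd` value are illustrative assumptions; the standard does not specify how a deviation should be defined.

```python
from statistics import mean, stdev


def link_rate(n_linked, n_records):
    """Share of records that were linked."""
    if n_records <= 0:
        raise ValueError("n_records must be positive")
    return n_linked / n_records


def deviates_from_history(current_rate, historical_rates, tolerance_sd=2.0):
    """Flag a rate more than tolerance_sd sample standard deviations from the
    historical mean. Requires at least two historical rates."""
    center = mean(historical_rates)
    spread = stdev(historical_rates)
    return abs(current_rate - center) > tolerance_sd * spread
```

A flagged rate is a prompt for investigation rather than proof of an error, since a change in the input files can also shift the link rate.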

    Examples of monitoring and evaluation activities for clerical record linkage operations include:

    • Establishing an acceptable error rate.
    • Establishing quality control sampling rates.
    • Monitoring clerks’ error rates and referrals, and taking corrective action if necessary (e.g., feedback or retraining).
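The clerical monitoring activities above might be sketched as follows: draw a quality control sample of a clerk's cases, have a verifier re-link them, and compare the clerk's error rate with the acceptable rate. The values of `ACCEPTABLE_ERROR_RATE` and `SAMPLING_RATE` and all function names are hypothetical placeholders; actual rates would come from the program's own plan.

```python
import random

ACCEPTABLE_ERROR_RATE = 0.05  # hypothetical threshold
SAMPLING_RATE = 0.10          # hypothetical quality control sampling rate


def draw_qc_sample(case_ids, rate=SAMPLING_RATE, seed=0):
    """Select a simple random sample of cases for verification."""
    if not case_ids:
        return []
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    k = max(1, round(len(case_ids) * rate))
    return rng.sample(case_ids, k)


def error_rate(clerk_decisions, verified_decisions):
    """Share of verified cases where the clerk's decision differs from the verifier's."""
    errors = sum(
        1 for case_id, decision in verified_decisions.items()
        if clerk_decisions[case_id] != decision
    )
    return errors / len(verified_decisions)


def needs_corrective_action(rate, acceptable=ACCEPTABLE_ERROR_RATE):
    """True when the error rate exceeds the acceptable rate."""
    return rate > acceptable
```

For example, a clerk whose sampled error rate is 0.25 would exceed the hypothetical 0.05 threshold, which under this sketch would trigger feedback or retraining.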

Requirement C4-4: Documentation needed to replicate and evaluate the linking operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

    Examples of documentation include:

    • Plans, requirements, specifications, and procedures for the record linkage systems.
    • Programs and parameters used for linking.
    • Problems encountered and solutions implemented during the linking operations.
    • Evaluation results (e.g., link rates and clerical error rates).

    Notes:

    1. The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release. (See Data Stewardship Policy DS007, Information Security Management Program.)
    2. Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.



Source: U.S. Census Bureau | Methodology and Standards Council | Last Revised: August 04, 2011