U.S. Department of Commerce

Information Quality

Statistical Quality Standard C3: Coding Data


Purpose: The purpose of this standard is to ensure that methods are established and implemented to promote the accurate assignment of codes, including geographic entity codes, to enable analysis and tabulation of data.

Scope: The Census Bureau’s statistical quality standards apply to all information products released by the Census Bureau and the activities that generate those products, including products released to the public, sponsors, joint partners, or other customers.  All Census Bureau employees and Special Sworn Status individuals must comply with these standards; this includes contractors and other individuals who receive Census Bureau funding to develop and release Census Bureau information products.

In particular, this standard applies to the development and implementation of post-collection coding operations, including the assignment of:

  • Codes to convert text and numerical data into categories.
  • Geographic entity codes (geocodes) and geographic attribute codes to distinguish and describe geographic entities and their characteristics within digital databases.


Exclusions: In addition to the global exclusions listed in the Preface, this standard does not apply to:

  • Behavior coding activities associated with cognitive interviewing.

Key Terms: American National Standards Institute codes (ANSI codes), coding, geocoding, geographic entity code (geocode), Master Address File (MAF), North American Industry Classification System (NAICS), Standard Occupational Classification System (SOC), and Topologically Integrated Geographic Encoding and Referencing (TIGER).

Requirement C3-1: Throughout all processes associated with coding, unauthorized release of protected information or administratively restricted information must be prevented by following federal laws (e.g., Title 13, Title 15, and Title 26), Census Bureau policies (e.g., Data Stewardship Policies), and additional provisions governing the use of the data (e.g., as may be specified in a memorandum of understanding or data-use agreement). (See Statistical Quality Standard S1, Protecting Confidentiality.)

Requirement C3-2: A plan must be developed that addresses:

  1. Required accuracy levels for the coding operations, including definitions of errors.
  2. Requirements for the coding systems, including requirements for input and output files.
  3. Verification and testing of the coding systems.
  4. Training for staff involved in the clerical coding operations.
  5. Monitoring and evaluation of the quality of the coding operations.

Notes:

  1. Statistical Quality Standard A1, Planning a Data Program, addresses overall planning requirements, including estimates of schedule and costs.
  2. The Census Bureau Guideline, Coding Verification, provides guidance on coding procedures.

Requirement C3-3: Processes must be developed and implemented to accurately assign (a) codes that convert text and numerical data into categories and (b) geocodes that identify and distinguish geographic entities and their attributes within a digital database.

Sub-Requirement C3-3.1: Specifications and procedures for the coding systems and operations must be developed and implemented.

    Examples of issues that coding specifications and procedures might address include:

    • A list and description of the admissible codes or values for each item on the questionnaire.
    • A list of acceptable reference sources, printed and electronic, that may be used by the coding staff (e.g., Employer Name List).
    • Procedures to add to the list of admissible codes or to add text responses to match existing codes.
    • Consistency of codes across data collection periods.
    • Procedures to assign and associate geocodes with other information within geographic files (e.g., the Master Address File/Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) database).
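
    As an illustration of the first example above, a list of admissible codes can be expressed as a lookup table and used to flag inadmissible values. This is a minimal sketch, not a Census Bureau procedure: the item name, the codes, and the function name are all hypothetical.

```python
# Hypothetical admissible-code list for one questionnaire item.
# Codes and descriptions are invented for illustration only.
ADMISSIBLE_CODES = {
    "employment_status": {
        "1": "Employed",
        "2": "Unemployed",
        "3": "Not in labor force",
    }
}


def find_inadmissible(records, item, admissible=ADMISSIBLE_CODES):
    """Return the records whose code for `item` is not on the admissible list."""
    allowed = admissible[item]
    return [r for r in records if r.get(item) not in allowed]


# Hypothetical coded records; the second carries a code that is not admissible.
records = [
    {"id": 101, "employment_status": "1"},
    {"id": 102, "employment_status": "9"},
    {"id": 103, "employment_status": "3"},
]

flagged = find_inadmissible(records, "employment_status")
print([r["id"] for r in flagged])
```

    Keeping the admissible list in one table, rather than scattered through program logic, also makes it easier to document additions to the list and to compare lists across data collection periods.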

Sub-Requirement C3-3.2: Standardized codes, when appropriate, must be used to convert text data.

    Examples of current coding standards include:

    • American National Standards Institute (ANSI) Codes.
    • North American Industry Classification System (NAICS).
    • Standard Occupational Classification System (SOC).

Sub-Requirement C3-3.3: Coding systems must be verified and tested to ensure that all components function as intended.

    Examples of verification and testing activities include:

    • Verifying that coding specifications and procedures satisfy the coding requirements.
    • Validating coding instructions or programming statements against specifications.
    • Verifying that coding rules are implemented consistently.
    • Using a test file to ensure that the codes are assigned correctly.
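
    The last example above, using a test file, might be sketched as follows. The lookup table, the referral marker, the test file layout, and the function names are assumptions made for illustration; an actual coding system would define its own.

```python
import csv
import io

# Hypothetical lookup from normalized response text to a code.
CODE_LOOKUP = {"registered nurse": "N01", "truck driver": "T01"}
REFERRAL = "REFER"  # hypothetical marker for responses sent to clerical coding


def assign_code(text):
    """Normalize case and whitespace, then look up the code or refer the case."""
    key = " ".join(text.lower().split())
    return CODE_LOOKUP.get(key, REFERRAL)


# Hypothetical test file: each row pairs a response with its expected code.
TEST_FILE = """response,expected_code
Registered Nurse,N01
  truck   driver ,T01
astronaut,REFER
"""


def run_test_file(handle):
    """Return (response, expected, actual) for every row where the codes differ."""
    failures = []
    for row in csv.DictReader(handle):
        actual = assign_code(row["response"])
        if actual != row["expected_code"]:
            failures.append((row["response"], row["expected_code"], actual))
    return failures


failures = run_test_file(io.StringIO(TEST_FILE))
print("failures:", failures)
```

    An empty list of failures indicates that the coder reproduced every expected code in the test file. It does not show that the test file covers all of the coding rules.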

Sub-Requirement C3-3.4: Training for staff involved in clerical coding operations (as identified during planning) must be developed and provided.

Sub-Requirement C3-3.5: Systems and procedures must be developed and implemented to monitor and evaluate the quality of the coding operations and to take corrective actions if problems are identified.

    Examples of monitoring and evaluation activities include:

    • Establishing a quality control (QC) system to check coding outcomes and providing feedback to coders or taking other corrective action.
    • Monitoring QC results (e.g., referral rates and error rates), determining the causes of systematic errors, and taking corrective action (e.g., providing feedback to or retraining coders, or updating coder reference materials).
    • Incorporating a geocode verification within automated instruments and correcting geocodes when errors are detected.
    • Evaluating the accuracy of geocoding and determining the causes of incorrect geocodes.
    • Reviewing and updating coding guidelines.
    • Reviewing software and procedures to reflect any changes in the coding guidelines.
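
    The QC results named in the examples above (referral rates and error rates) could be computed along the following lines. The record layout and the rate definitions here are assumptions for this sketch; Requirement C3-2 leaves the definitions of errors to the coding plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QCRecord:
    coder_id: str
    assigned_code: Optional[str]  # None means the coder referred the case
    verified_code: Optional[str]  # None means the case was not verified


def qc_rates(records):
    """Compute referral and error rates under the assumed definitions.

    referral_rate = referred cases / all cases
    error_rate    = verified cases whose codes disagree / verified coded cases
    """
    total = len(records)
    referred = sum(1 for r in records if r.assigned_code is None)
    checked = [
        r for r in records
        if r.assigned_code is not None and r.verified_code is not None
    ]
    errors = sum(1 for r in checked if r.assigned_code != r.verified_code)
    return {
        "referral_rate": referred / total if total else 0.0,
        "error_rate": errors / len(checked) if checked else 0.0,
    }


# Hypothetical QC sample: one referral and one disagreement among four verified cases.
sample = [
    QCRecord("A", "N01", "N01"),
    QCRecord("A", "T01", "T01"),
    QCRecord("B", "N01", "T01"),
    QCRecord("B", "T01", "T01"),
    QCRecord("B", None, None),
]

rates = qc_rates(sample)
print(rates)
```

    Grouping the same calculation by `coder_id` would show whether errors concentrate with particular coders, which is one way to decide between retraining and updating reference materials.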

Requirement C3-4: Documentation needed to replicate and evaluate the coding operations must be produced. The documentation must be retained, consistent with applicable policies and data-use agreements, and must be made available to Census Bureau employees who need it to carry out their work. (See Statistical Quality Standard S2, Managing Data and Documents.)

    Examples of documentation include:

    • Plans, requirements, specifications, and procedures for the coding systems.
    • Problems encountered and solutions implemented during the coding operations.
    • Quality measures from monitoring and evaluating the coding operations (e.g., error rates and referral rates). (See Statistical Quality Standard D3, Producing Measures and Indicators of Nonsampling Error.)

    Notes:

    1. The documentation must be released on request to external users, unless the information is subject to legal protections or administrative restrictions that would preclude its release.  (See Data Stewardship Policy DS007, Information Security Management Program.)
    2. Statistical Quality Standard F2, Providing Documentation to Support Transparency in Information Products, contains specific requirements about documentation that must be readily accessible to the public to ensure transparency of information products released by the Census Bureau.



Source: U.S. Census Bureau | Methodology and Standards Council |  Last Revised: August 04, 2011