To prepare economic census data for release to the public, the data are processed in three primary ways:
Data captured in an economic census must be edited to identify and correct reporting errors. The data also must be adjusted to account for missing items and for businesses that do not respond. Data edits detect errors and validate responses by considering factors such as the proper classification of a given record, the record's historical reporting, and industry or geographic ratios and averages.
The first step of the data editing process is classification. To assign a valid kind-of-business or industry classification code to the establishment, a series of data edit programs evaluates the respondent’s answers to pre-specified items. The specific items used for classification depend on the census report forms and include:
If critical information is missing, the record is flagged and fixed by analysts before further processing occurs.
If all critical information is available, the classification code is assigned automatically. After classification codes are assigned, a "verification" operation is performed to validate the industry, geography and ZIP Codes.
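The routing described above (flag records with missing critical items, classify the rest automatically) could be sketched roughly as follows. This is only an illustration: the item names `item_a` and `item_b`, the `route_record` function, and the `assign_code` callback are hypothetical placeholders, not the names or logic used in the actual edit programs.

```python
# Hypothetical placeholder names for the critical classification items;
# the real items depend on the census report form.
CRITICAL_ITEMS = ("item_a", "item_b")


def route_record(record, assign_code):
    """Flag a record for analyst review or classify it automatically.

    record: dict of reported items.
    assign_code: callable that returns a classification code for a
                 complete record (an assumed stand-in for the edit programs).
    """
    # An item counts as missing if it is absent or empty.
    missing = [item for item in CRITICAL_ITEMS if not record.get(item)]
    if missing:
        # Critical information is missing: hold the record for an analyst.
        return {"status": "analyst_review", "missing": missing, "code": None}
    # All critical information is present: assign the code automatically.
    return {"status": "classified", "missing": [], "code": assign_code(record)}
```

A verification step on industry, geography and ZIP Code would then run on the records that come back with the status `classified`.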
After an establishment has been assigned a valid industry code, the data edits further evaluate the response data for consistency and validity, for example by ensuring that employment data are consistent with payroll or sales/receipts data. Response data are always evaluated by industry; in some cases, type of operation or tax-exempt status is also taken into account. Additional checks compare current-year data to data reported in previous censuses or obtained from administrative sources.
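As a minimal sketch of one such consistency edit, the check below compares payroll per employee with bounds that vary by industry. The `Record` fields, the industry codes used as keys, and the bounds themselves are assumptions made for illustration; the actual edit parameters are not given in this text.

```python
from dataclasses import dataclass


@dataclass
class Record:
    industry: str          # classification code assigned earlier
    employment: int        # number of employees
    annual_payroll: float  # annual payroll in dollars


# Hypothetical acceptable payroll-per-employee ranges, keyed by industry code.
PAYROLL_PER_EMPLOYEE_BOUNDS = {
    "722511": (8_000.0, 90_000.0),
    "541511": (30_000.0, 400_000.0),
}


def payroll_edit(record):
    """Return a list of edit-failure messages; empty if the record passes."""
    bounds = PAYROLL_PER_EMPLOYEE_BOUNDS.get(record.industry)
    if bounds is None:
        return ["no bounds defined for industry"]
    if record.employment <= 0:
        # Payroll without employees is inconsistent on its face.
        if record.annual_payroll > 0:
            return ["payroll reported with zero employment"]
        return []
    low, high = bounds
    ratio = record.annual_payroll / record.employment
    if not (low <= ratio <= high):
        return [f"payroll per employee {ratio:.0f} outside [{low:.0f}, {high:.0f}]"]
    return []
```

A record that fails would typically be corrected or referred for review rather than dropped; the sketch only reports the failure.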
Nonresponse is handled by estimating or imputing missing data. Imputation is defined as the replacement of a missing or incorrectly reported item with another value derived from logical edits or statistical procedures.
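One simple statistical procedure of this kind is ratio imputation, sketched below: a missing payroll value is replaced with the establishment's employment multiplied by the average payroll per employee among respondents in the same industry. This is an assumed, illustrative method; the text does not say which procedures the economic census actually uses, and the field names are hypothetical.

```python
def impute_payroll(records):
    """Fill missing payroll values using an industry-level ratio.

    records: list of dicts with keys 'industry', 'employment' and
             'payroll' (None when the item was not reported).
    Returns a new list of dicts with an added boolean 'imputed' flag.
    """
    # Accumulate reported payroll and employment by industry.
    totals = {}
    for r in records:
        if r["payroll"] is not None and r["employment"] > 0:
            pay, emp = totals.get(r["industry"], (0.0, 0))
            totals[r["industry"]] = (pay + r["payroll"], emp + r["employment"])

    result = []
    for r in records:
        filled = dict(r)
        filled["imputed"] = False
        if r["payroll"] is None and r["industry"] in totals:
            pay, emp = totals[r["industry"]]
            # Employment times the industry's payroll-per-employee ratio.
            filled["payroll"] = r["employment"] * pay / emp
            filled["imputed"] = True
        result.append(filled)
    return result
```

Keeping the `imputed` flag on each record makes it possible to measure later how much of a published cell comes from imputed values.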
There are two types of nonresponse:
Title 13 of the United States Code states that respondents are required to answer all questions to the best of their ability. Incomplete forms, unclear or erroneous data, or nonresponse can affect data analyses and the quality of the published data.
Problems that arise from missing data include:
Although economic census nonresponse accounts for less than five percent of published figures, it is a significant source of nonsampling error.
Note: If a data cell contains too much imputation, the value will be suppressed with an ‘S’ flag.
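The suppression rule in the note could be sketched as follows. The 50 percent threshold and the function name are assumptions chosen only for illustration; the text does not state how much imputation counts as "too much".

```python
def cell_value_or_flag(contributions, max_imputed_share=0.5):
    """Return the cell total, or 'S' if too much of it is imputed.

    contributions: list of (value, is_imputed) pairs for one data cell.
    max_imputed_share: hypothetical threshold, not an official figure.
    """
    total = sum(value for value, _ in contributions)
    if total == 0:
        return total
    imputed = sum(value for value, is_imputed in contributions if is_imputed)
    if imputed / total > max_imputed_share:
        return "S"  # suppressed: imputed share exceeds the threshold
    return total
```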
Individual establishment records are tabulated in different ways based on data product and analytical needs. Tabulations include data summed by industry, specified geographic areas, establishment-size, products produced, materials used, fuels used and product lines sold.
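A tabulation of this kind amounts to grouping establishment records by one or more dimensions and summing a measure. The sketch below assumes hypothetical field names (`industry`, `state`, `receipts`) and is not the production tabulation system.

```python
from collections import defaultdict


def tabulate(records, keys, measure):
    """Sum `measure` over records grouped by the fields named in `keys`.

    records: list of dicts, one per establishment.
    keys: tuple of field names that define each cell, e.g. ("industry", "state").
    measure: name of the numeric field to sum.
    """
    totals = defaultdict(float)
    for r in records:
        cell = tuple(r[k] for k in keys)
        totals[cell] += r[measure]
    return dict(totals)


# Example with made-up records.
establishments = [
    {"industry": "722511", "state": "OH", "receipts": 100.0},
    {"industry": "722511", "state": "OH", "receipts": 200.0},
    {"industry": "722511", "state": "PA", "receipts": 50.0},
]
by_industry_state = tabulate(establishments, ("industry", "state"), "receipts")
```

The same function covers the other groupings listed above by changing `keys`, for example to an establishment-size class or a product line.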
The tabulations are subject to disclosure analysis prior to macro-analysis.