U.S. Department of Commerce

Appendix D3-B: Requirements for Calculating and Reporting Response Rates: Economic Surveys and Censuses


1. Terms and Variables

For many economic programs, there is a need to distinguish between the survey (sampling) unit, the reporting unit, and the tabulation unit:

A survey unit is an entity selected from the underlying statistical population of similarly-constructed units. Examples of survey units for different economic programs include establishments, Employer Identification Numbers (EIN), firms, state and local government entities, and building permit-issuing offices. Some programs use different survey units for different segments of the total population. Examples include the Annual Retail Trade Survey (ARTS) and the Survey of Construction (SOC). The ARTS samples EINs and firms (which can be comprised of one or more establishments), and the SOC samples residential housing permits and newly constructed housing units in areas where no permit is required. For cross-sectional or longitudinal surveys, the survey unit may change in composition over time (perhaps due to mergers, acquisitions, or divestitures).

A reporting unit is an entity from which data are collected. Reporting units are the vehicle for obtaining data and may or may not correspond to a survey unit for several reasons. First, the composition of the originally-sampled entity can change over the sample’s life cycle, as noted above. Second, for some surveys, an entity may request (or the Census Bureau may ask the entity) to report data in several separate pieces corresponding to different parts of the business or other entity type. For example, a large, diverse company in a company-based collection may request a separate form for each region or line of business in which it operates, or may ask to report separately for each of its establishments to align with its recordkeeping practices. Similarly, many government programs have a central collection agency that provides the data for several governments, but issue additional mail-outs to obtain supplemental items that the central collection agency does not provide.

A tabulation unit houses the data used for estimation (or tabulation, in the case of a census). As with reporting units, the tabulation units may not correspond to a survey unit. Some programs consolidate establishment or plant-level data to a company level or parent government level to create tabulation units, so that the tabulation unit is often equivalent to the survey unit. Other programs create artificial units that split a reporting unit’s data among the different categories in which the reporting unit operates; for example, creating separate tabulation units by industry. In this case, the tabulation unit represents a portion of a survey unit.

For each program, the "statistical period" describes the reference period for the data collection. For example, an annual program might collect data on the prior year’s business; the statistical period refers to the prior year, but the data are obtained in the calendar year. During a given statistical period, all three types of units can be active, inactive, or ineligible. An active unit is in business and is in-scope for the program during the statistical period. An inactive unit is not operating or is not in-scope during the statistical period but is believed to have been active in the past and can potentially become active and in-scope in the future; examples include seasonal businesses for monthly or quarterly programs (temporarily idle) or businesses that operate in more than one industry, with the primary activity for a given statistical period being conducted in an "out-of-scope" industry. Finally, a survey unit may become ineligible and permanently excluded from subsequent computations due to a merger or acquisition, a permanent classification category change, or a death. All units are considered active until verified evidence otherwise is provided.

Economic programs compute two different types of response rates: a unit response rate and weighted item response rates. The Unit Response Rate (URR) is defined as the ratio of the total unweighted number of "responding units" to the total number of units eligible for collection. URRs are indicators of the performance of data collection for obtaining usable responses. Consequently, the majority of business programs base URRs on their reporting units, whereas the majority of ongoing government programs base URRs on the survey units¹ that correspond to the tabulation units. Other exceptions are addressed on a case-by-case basis. The formulae for the URR provided in Section 2.1 and the detailed unit nonresponse rate breakdowns presented in Section 2.2.1 use the term "reporting unit" for simplicity. A program can produce at most one URR per statistical period and per release phase².

Quantity and Total Quantity Response Rates (QRR and TQRR) are item-level indicators of the "quality" of each estimate. In contrast to the URR, these weighted response rates are computed for individual data items, so that a program may produce several QRRs and TQRRs per statistical period and release. Both are weighted measures that take the size of the tabulation unit into account as well as the associated sampling parameters. These rates measure the proportion of each estimate obtained directly or indirectly from the survey unit and are consequently based on the tabulation units. The QRR measures the weighted proportion of an estimate obtained directly from the respondent for the survey/census; the TQRR expands the rate to include data from equivalent quality sources.

To compute the weighted item response rates, it is necessary to determine the source of the final tabulated value of the associated data item for each tabulation unit i. This value could be directly obtained from respondent data, indirectly obtained from other equivalent quality data sources, or imputed. The classification process is straightforward for items that are directly obtained from the survey questionnaire (i.e., form items), less so for items that are obtained as functions of collected items (i.e., derived items). The formulae provided in Sections 2.1 and 2.2.2 can be applied to either form or derived items, but require that the item value classification process be performed immediately prior and that the classification process or rules be documented.


1 The central collection unit may provide the responses for the majority of the program data (e.g., providing responses from all associated sample units for most of the program items). Supplemental mailings are used to obtain the rest of the items.

2 Leading indicator surveys often have more than one official release of the same estimate. For example, a program might release a preliminary estimate for the current statistical period along with a revised estimate from the prior period. Response rates should be computed at each release phase, and it is expected that the response rates (unit or item) will generally increase for the same estimate with each release.

 

1.1 Eligibility Status

The total number of active reporting units in a statistical period is defined as NRU. These reporting units can be classified by their eligibility status: eligible for data collection (E), ineligible (IA), unknown eligibility (U), or data obtained from qualified administrative sources (A). Reporting units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases. Note that the U cases are assumed to be active and in-scope in the absence of evidence otherwise. Reporting units may be considered eligible in one survey or census but ineligible for another, depending upon the target population. For example, a reporting unit that was in business after October 2004 is eligible for the 2004 Annual Retail Trade Survey, but is ineligible for the October 2004 Monthly Retail Trade Survey.


Term E (Total Eligible)
Definition The count of reporting units that were eligible for data collection in the statistical period.
Variable ei – An indicator variable for whether a reporting unit is eligible for data collection in the statistical period. Eligible units include chronic refusals (eligible reporting units that have notified the Census Bureau that they will not participate in a given program). If a reporting unit is eligible, ei = 1, else ei = 0.
Computation

The sum of the indicator variable for eligibility (ei) over all the reporting units in the statistical period.

$E = \sum_{i=1}^{N_{RU}} e_i$

 

 

Term IA (Total Ineligible/Inactive)
Definition The count of reporting units that were ineligible for data collection in the current statistical period.
Variable iai – An indicator variable for whether a reporting unit in the statistical period has been confirmed as not a member of the target population at the time of data collection. An attempt was made to collect data, and it was confirmed that the reporting unit was not a member of the target population at that time. These reporting units are not included in the URR calculations for the periods in which they are ineligible. Information confirming ineligibility may come from observation, from a respondent, or from another source. Some examples of ineligible reporting units include firms that went out of business prior to the survey reference period, firms in an industry that is out-of-scope for the survey in question, and governments that reported data from outside of the reference period. If a reporting unit is ineligible, iai = 1, else iai = 0.
Computation

The sum of the indicator variable for ineligibility (iai) over all the reporting units in the statistical period.

$IA = \sum_{i=1}^{N_{RU}} ia_i$

 

 

Term U (Total Unknown Eligibility)
Definition The count of reporting units in the statistical period for which eligibility could not be determined.
Variable ui – An indicator variable for whether the eligibility of a reporting unit in the statistical period could not be determined. If a reporting unit is of unknown eligibility, ui = 1, else ui = 0. For example, units whose returns are marked as "undeliverable as addressed" have unknown eligibility (ui = 1), as do unreturned mailed forms.
Computation

The sum of the indicator variable for unknown eligibility (ui) over all the reporting units in the statistical period.

$U = \sum_{i=1}^{N_{RU}} u_i$

 

 

Term A (Administrative data used as source)
Definition The count of reporting units in the statistical period that belong to the target population and were pre-selected to use administrative data rather than collect survey data.
Variable ai – An indicator variable for whether administrative data of equivalent quality to reported data, rather than survey data, were obtained for an eligible reporting unit in the statistical period. The decision not to collect survey data must have been made for survey efficiency or to reduce respondent burden, not because the reporting unit had been a refusal in the past. These reporting units are excluded from the URR calculations because they were not sent questionnaires and thus could not respond, although their data are included in the calculation of the TQRRs. If a reporting unit is pre-selected to use administrative data, ai = 1, else ai = 0.
Computation

The sum of the indicator variable for units pre-selected to use administrative data (ai) over all the reporting units in the statistical period.

$A = \sum_{i=1}^{N_{RU}} a_i$


The relationship among the counts of reporting units in the statistical period in the four eligibility categories is given by NRU = E + IA + U + A. For the ith reporting unit, ei + iai + ui + ai = 1. Note that the value of NRU may change by statistical period.
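As an illustration of the identity above, the following minimal Python sketch assigns each reporting unit exactly one eligibility status and tallies the four counts. The `Eligibility` enumeration, the `eligibility_counts` helper, and the sample data are hypothetical names and values chosen for this example; they are not part of the standard.

```python
from collections import Counter
from enum import Enum


class Eligibility(Enum):
    """Eligibility categories from Section 1.1; each unit holds exactly one."""
    E = "eligible for data collection"
    IA = "ineligible"
    U = "unknown eligibility"
    A = "administrative data used as source"


def eligibility_counts(statuses):
    """Return the counts E, IA, U, and A for a list of per-unit statuses."""
    tally = Counter(statuses)
    return {category.name: tally[category] for category in Eligibility}


# Hypothetical statistical period with ten active reporting units.
statuses = (
    [Eligibility.E] * 6
    + [Eligibility.IA]
    + [Eligibility.U] * 2
    + [Eligibility.A]
)
counts = eligibility_counts(statuses)
n_ru = len(statuses)

# N_RU = E + IA + U + A holds because every unit has exactly one status.
assert n_ru == counts["E"] + counts["IA"] + counts["U"] + counts["A"]
print(counts)
```

Storing a single status per unit, rather than four separate 0/1 flags, makes the per-unit constraint ei + iai + ui + ai = 1 hold by construction.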

 

1.2 Response Status

Response status is determined only for the eligible active reporting units in the statistical period.


Term R (Response)
Definition The count of reporting units in the statistical period that were eligible for data collection in the statistical period and classified as a response.
Variable rui – An indicator variable for whether an eligible reporting unit in the statistical period responded to the survey. To be classified as a response, the respondent for the reporting unit must have provided sufficient data, and the data must satisfy all the critical edits. The definition of sufficient data will vary across surveys. Programs must designate required data items before the data collection begins. If a reporting unit responded, rui = 1, else rui = 0 (note rui = 0 for reporting units which were eligible but did not respond and for reporting units classified as IA, U, or A).
Computation

The sum of the indicator variable for eligible reporting units that responded (rui) over all the reporting units in the statistical period.

$R = \sum_{i=1}^{N_{RU}} ru_i$

 

1.3 Reasons for Nonresponse

To improve interpretation of the response rate and better manage resources, it is recommended that, whenever possible, reasons for (or types of) nonresponse be measured on a flow basis. The terms defined below describe "unit nonresponse" and are presented in unweighted tabulations. Five specific terms describing nonresponse reasons are defined. The first three terms (REF, CREF, and INSF) define nonresponse reasons for eligible reporting units. The final two terms (UAA and OTH) define the reasons for reporting units with unknown eligibility.


Term REF (Refusal)
Definition The count of eligible reporting units in the statistical period that were classified as "refusal."
Variable refi – An indicator variable for whether an eligible reporting unit in the statistical period refused to respond to the survey. If a reporting unit refuses to respond, refi = 1, else refi = 0.
Computation

The sum of the indicator variable for "refusal" (refi) over all the reporting units in the statistical period.

$REF = \sum_{i=1}^{N_{RU}} ref_i$

 

 

Term CREF (Chronic refusal)
Definition The count of eligible reporting units in the statistical period that were classified as "chronic refusals."
Variable crefi – An indicator variable for whether an eligible reporting unit in the statistical period was a "chronic refusal." A chronic refusal is a reporting unit that informed the Census Bureau that it would not participate in a given program. The Census Bureau does not send questionnaires to chronic refusals, but they are considered to be eligible reporting units. Chronic refusals comprise a subset of refusals. If a reporting unit is a chronic refusal, crefi = 1, else crefi = 0.
Computation The sum of the indicator variable for "chronic refusal" (crefi) over all the reporting units in the statistical period.

$CREF = \sum_{i=1}^{N_{RU}} cref_i$

 

 

Term INSF (Insufficient data)
Definition The count of eligible reporting units in the statistical period that were classified as providing insufficient data.
Variable insfi – An indicator variable for whether an eligible reporting unit in the statistical period returned a questionnaire, but did not provide sufficient data to qualify as a response. If a reporting unit returned a questionnaire but failed to provide sufficient data to qualify as a response, insfi = 1, else insfi = 0.
Computation The sum of the indicator variable for "insufficient data" (insfi) over all the reporting units in the statistical period.

$INSF = \sum_{i=1}^{N_{RU}} insf_i$

 

 

Term UAA (Undeliverable as addressed)
Definition The count of reporting units in the statistical period that were classified as "undeliverable as addressed."
Variable uaai – An indicator variable for whether a reporting unit in the statistical period had a questionnaire returned as "undeliverable as addressed." These reporting units are of unknown eligibility. If a questionnaire is returned as "undeliverable as addressed," uaai = 1, else uaai = 0.
Computation

The sum of the indicator variable for "undeliverable as addressed" (uaai) over all the reporting units in the statistical period.

$UAA = \sum_{i=1}^{N_{RU}} uaa_i$

 

 

Term OTH (Other nonresponse)
Definition The count of reporting units in the statistical period that were classified as "other nonresponse."
Variable othi – An indicator variable for whether a reporting unit in the statistical period was a nonresponse for a reason other than refusal, insufficient data, or undeliverable as addressed. These reporting units are of unknown eligibility. If a reporting unit does not respond for reasons other than refusal, insufficient data, or undeliverable as addressed, othi = 1, else othi = 0.
Computation The sum of the indicator variable for "other nonresponse" (othi) over all the reporting units in the statistical period.

$OTH = \sum_{i=1}^{N_{RU}} oth_i$
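The response and nonresponse counts defined in Sections 1.2 and 1.3 can be tallied from a per-unit disposition code, as in the sketch below. The code strings, the `tally_dispositions` helper, and the sample data are hypothetical; a real program would define its own disposition codes. The one rule taken from the text is that chronic refusals are a subset of refusals, so each chronic refusal is counted in both REF and CREF.

```python
from collections import Counter

# Hypothetical disposition codes, one per reporting unit.
VALID_CODES = ("R", "REF", "CREF", "INSF", "UAA", "OTH")


def tally_dispositions(dispositions):
    """Return the counts R, REF, CREF, INSF, UAA, and OTH."""
    unexpected = set(dispositions) - set(VALID_CODES)
    if unexpected:
        raise ValueError(f"unrecognized disposition codes: {sorted(unexpected)}")
    raw = Counter(dispositions)
    return {
        "R": raw["R"],
        # Chronic refusals are a subset of refusals, so REF includes them.
        "REF": raw["REF"] + raw["CREF"],
        "CREF": raw["CREF"],
        "INSF": raw["INSF"],
        "UAA": raw["UAA"],
        "OTH": raw["OTH"],
    }


# Hypothetical dispositions for twelve reporting units.
dispositions = (
    ["R"] * 5 + ["REF"] * 2 + ["CREF"] + ["INSF"] + ["UAA"] + ["OTH"] * 2
)
tallies = tally_dispositions(dispositions)
print(tallies)
```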

 

1.4 Quantity Response Rate Terms

The total number of active tabulation units in the statistical period is defined as NTU. Recall that the number of tabulation units NTU may differ from the number of reporting units NRU, depending on the economic program. After a program creates tabulation units and performs any necessary data allocation procedures (from reporting unit(s) to tabulation unit(s)), the individual data items are classified according to the final source of data obtained for the units: data reported by the respondent, equivalent-quality-to-reported data obtained from program-approved outside sources (such as company annual reports, Securities and Exchange Commission (SEC) filings, trade association statistics), or imputed data. Tabulation units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases.


Variable vti (Tabulated value of data item t for tabulation unit i in the statistical period)
Definition The quantity stored in the variable for item t for the ith tabulation unit in the statistical period. This quantity may be reported, equivalent-quality-to-reported, or imputed.
   
Term Rt (Reported data tabulation units for item t)
Definition The count of eligible tabulation units that provided reported data during the studied statistical period for item t that satisfied all critical edits. This count will vary by item and by statistical period.
Variable rti – An indicator variable for whether tabulation unit i in the statistical period provided reported data for item t that satisfied all edits. If the tabulated item t value for unit i (ti) contains reported data, then rti = 1, else rti = 0.
Computation

The sum of the indicator variable for reported data (rti) over all the tabulation units (NTU) in the statistical period.

$R_t = \sum_{i=1}^{N_{TU}} r_{ti}$

 

 
Term Qt (Equivalent–quality–data tabulation units for item t)
Definition The count of eligible tabulation units that use equivalent–quality–to–reported data for item t. Note that these data are indirectly obtained for the tabulation unit. This count will vary by item and by statistical period.
Variable qti – An indicator variable for whether tabulation unit i in the statistical period contains equivalent-quality-to-reported data for item t. Such data can come from three sources: data directly substituted from another census or survey s (for the same reporting unit, data item concept, and time period), administrative data d, or data obtained from some other equivalent source c validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, Securities and Exchange Commission (SEC) filings, trade association statistics). If the tabulated item t value for unit i (ti) contains equivalent-quality-to-reported data, then qti = 1, else qti = 0.
Computation

The sum of the indicator variable for equivalent-quality-to-reported data (qti) over all tabulation units (NTU) in the statistical period.

$Q_t = \sum_{i=1}^{N_{TU}} q_{ti}$

   
Term St (Substituted data tabulation units for item t)
Definition The count of eligible tabulation units containing directly substituted data for item t. This count will vary by item and by statistical period.
Variable sti – An indicator variable for whether a tabulation unit in the statistical period contains directly substituted data from another census or survey for item t. The same reporting unit must provide the item value (in the other program), and the item concept and time period for the substituted values must agree between the two programs. If the tabulated item t value for unit i (ti) contains directly substituted data from another survey, sti = 1, else sti = 0.
Computation

The sum of the indicator variable for directly substituted data (sti) over all tabulation units (NTU) in the statistical period.

$S_t = \sum_{i=1}^{N_{TU}} s_{ti}$

   
Term Dt (Administrative data tabulation units for item t)
Definition The count of eligible tabulation units containing administrative data for item t. This count will vary by item and by statistical period.
Variable dti – An indicator variable for whether a tabulation unit in the statistical period contains administrative data for item t. If the tabulated item t value for unit i (ti) contains administrative data, dti = 1, else dti = 0.
Computation

The sum of the indicator variable for administrative data (dti) over all tabulation units (NTU) in the statistical period.

$D_t = \sum_{i=1}^{N_{TU}} d_{ti}$

   
Term Ct (Equivalent source data tabulation units for item t)
Definition The count of eligible tabulation units containing equivalent-source data that is neither administrative data nor data substituted directly from another economic program for item t. This count will vary by item and by statistical period.
Variable cti – An indicator variable for whether a tabulation unit in the statistical period contains equivalent-source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual report, SEC filings, trade association statistics) for item t. If the tabulated item t value for unit i (ti) contains equivalent–source data, then cti = 1, else cti = 0.
Computation

The sum of the indicator variable for equivalent-source data (cti) over all tabulation units (NTU) in the statistical period.

$C_t = \sum_{i=1}^{N_{TU}} c_{ti}$

   
Term Mt (Imputed data tabulation units for item t)
Definition The count of eligible tabulation units containing imputed data for item t. This count will vary by item and by statistical period.
Variable

mti – An indicator variable for whether a tabulation unit in the statistical period contains imputed data for item t. If the tabulated item t value for unit i (ti) contains imputed data, mti = 1, else mti = 0.

Computation

The sum of the indicator variable for imputed data (mti) over all tabulation units (NTU) in the statistical period.

$M_t = \sum_{i=1}^{N_{TU}} m_{ti}$

 

 

The relationship among Qt, St, Dt, and Ct for item t in a statistical period is given by Qt = St + Dt + Ct. The relationship among the counts of tabulation units for item t in the statistical period is given by NTU = Rt + Qt + Mt.
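The two identities above can be checked mechanically once each tabulation unit's value for item t has been classified by its final source. The sketch below is illustrative only; the `Source` enumeration, the `source_counts` helper, and the sample classification are hypothetical.

```python
from collections import Counter
from enum import Enum


class Source(Enum):
    """Final source of the tabulated value of item t for one tabulation unit."""
    REPORTED = "r"
    SUBSTITUTED = "s"       # same unit, concept, and period in another program
    ADMINISTRATIVE = "d"
    OTHER_EQUIVALENT = "c"  # validated equivalent-source data
    IMPUTED = "m"


def source_counts(sources):
    """Return the counts R_t, Q_t, S_t, D_t, C_t, and M_t for item t."""
    tally = Counter(sources)
    s_t = tally[Source.SUBSTITUTED]
    d_t = tally[Source.ADMINISTRATIVE]
    c_t = tally[Source.OTHER_EQUIVALENT]
    return {
        "R": tally[Source.REPORTED],
        "Q": s_t + d_t + c_t,  # Q_t = S_t + D_t + C_t
        "S": s_t,
        "D": d_t,
        "C": c_t,
        "M": tally[Source.IMPUTED],
    }


# Hypothetical classification of item t for twelve tabulation units.
sources = (
    [Source.REPORTED] * 5
    + [Source.SUBSTITUTED]
    + [Source.ADMINISTRATIVE] * 2
    + [Source.OTHER_EQUIVALENT]
    + [Source.IMPUTED] * 3
)
item_counts = source_counts(sources)
n_tu = len(sources)

# N_TU = R_t + Q_t + M_t holds because every unit has exactly one source.
assert n_tu == item_counts["R"] + item_counts["Q"] + item_counts["M"]
print(item_counts)
```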

Variable fi (Nonresponse weight adjustment factor)
Definition A tabulation unit nonresponse weight adjustment factor for the ith tabulation unit in the statistical period. The variable fi is set equal to 1 for surveys that use imputation to account for unit nonresponse.
   
Variable wi (Sample weight)
Definition The design weight for the ith tabulation unit in the statistical period. The design weight includes subsampling factors and outlier adjustments, but excludes post-sampling adjustments for nonresponse and for coverage. This variable represents the inverse unbiased probability of selection for the tabulation unit.
   
Variable ti (Design-weighted value of item t for tabulation unit i)
Definition The design-weighted tabulated quantity of the variable for item t for the ith tabulation unit in the statistical period (i.e., ti = wi × vti). Note that this value has not been adjusted for unit nonresponse.
   
Term T (Total value for item t)
Definition The estimated (weighted) total of data item t for the entire population represented by the tabulation units in the statistical period. T is based on the value of the data provided by the respondent, equivalent-quality-to-reported data, or imputed data. The calculation of T incorporates subsampling factors, weighting adjustment factors for unit nonresponse (adjustment-to-sample procedures only), and outlier-adjustment factors, but does not include post-stratification or other benchmarking adjustments.
Computation

The product of the design-weighted tabulated value of item t for the ith tabulation unit in the statistical period (ti) and the nonresponse weight adjustment factor (fi), summed over all tabulation units (NTU) in the statistical period.

$T = \sum_{i=1}^{N_{TU}} f_i \, t_i$
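The computation of T can be sketched as follows. The `TabulationUnit` class, its field names, and the sample values are hypothetical and chosen only for illustration; the arithmetic follows the definitions above (ti = wi × vti, and T is the sum of fi × ti over all tabulation units).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TabulationUnit:
    value: float              # v_ti, tabulated value of item t
    design_weight: float      # w_i, design weight
    adjustment: float = 1.0   # f_i, equal to 1 when imputation is used

    @property
    def weighted_value(self):
        """t_i = w_i * v_ti, not adjusted for unit nonresponse."""
        return self.design_weight * self.value


def estimated_total(units):
    """T = sum of f_i * t_i over all tabulation units."""
    return sum(unit.adjustment * unit.weighted_value for unit in units)


# Hypothetical tabulation units for one item and one statistical period.
units = [
    TabulationUnit(value=100.0, design_weight=1.0),
    TabulationUnit(value=20.0, design_weight=10.0, adjustment=1.25),
    TabulationUnit(value=8.0, design_weight=5.0),
]
total = estimated_total(units)
print(total)  # 100 + 250 + 40 = 390.0
```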

 

2. Response and Nonresponse Rates

The rates defined below serve as quality indicators in the process control sense for non–negatively valued items such as total employees or total payroll. For items that can take on positive and negative values, such as income or earnings on investments, the program should plan to develop two sets of weighted item response rates (QRRs and TQRRs) – one from negatively valued data and one from non-negatively valued data – or propose alternative quality indicators that provide adequate transparency into data quality and assist in taking corrective actions.

2.1 Primary Response Rates


Rate URR (Unit Response Rate)
Definition The unweighted proportion of reporting units in the statistical period that were eligible or of unknown eligibility and that responded to the survey (expressed as a percentage).
Computation URR = [R/(E+U)] * 100
   
Rate QRR (Quantity Response Rate for data item t)
Definition The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period (expressed as a percentage).
Computation $QRR = \left[ \frac{\sum_{i=1}^{N_{TU}} r_{ti} \, t_i}{T} \right] \times 100$
   
Rate TQRR (Total Quantity Response Rate for data item t)
Definition The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period or from sources determined to be equivalent-quality-to-reported data (expressed as a percentage).
Computation $TQRR = \left[ \frac{\sum_{i=1}^{N_{TU}} (r_{ti} + q_{ti}) \, t_i}{T} \right] \times 100$
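The three primary rates can be sketched in Python as follows. The URR follows the formula stated above. For the QRR and TQRR, this sketch assumes the numerator is the sum of the design-weighted values ti over the units whose item value is reported (QRR) or reported or equivalent-quality (TQRR), divided by T; that reading comes from the definitions and should be checked against the program's documented formulas. The function names, source labels, and sample data are hypothetical.

```python
def unit_response_rate(r, e, u):
    """URR = [R / (E + U)] * 100, using unweighted counts."""
    if e + u <= 0:
        raise ValueError("E + U must be positive")
    return 100.0 * r / (e + u)


def quantity_response_rates(units):
    """Return (QRR, TQRR) for one item.

    units: iterable of (source, t_i, f_i) tuples, where source is one of
    "reported", "equivalent", or "imputed" (hypothetical labels), t_i is the
    design-weighted value, and f_i is the nonresponse adjustment factor.
    """
    units = list(units)
    total = sum(f * t for _, t, f in units)  # T
    if total <= 0:
        raise ValueError("T must be positive for these rates")
    reported = sum(t for source, t, _ in units if source == "reported")
    equivalent = sum(t for source, t, _ in units if source == "equivalent")
    qrr = 100.0 * reported / total
    tqrr = 100.0 * (reported + equivalent) / total
    return qrr, tqrr


# Hypothetical data: R = 6 responses among E = 8 eligible and U = 2 unknown.
urr = unit_response_rate(r=6, e=8, u=2)

# Hypothetical item values with f_i = 1 (imputation used for nonresponse).
item_units = [
    ("reported", 100.0, 1.0),
    ("reported", 200.0, 1.0),
    ("equivalent", 100.0, 1.0),
    ("imputed", 100.0, 1.0),
]
qrr, tqrr = quantity_response_rates(item_units)
print(urr, qrr, tqrr)
```

The guard on T reflects the note at the start of Section 2 that these rates are intended for non-negatively valued items.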

 

2.2 Detailed Response and Nonresponse Rates

2.2.1 Unit Nonresponse Rate Breakdowns

The following breakdowns provide unweighted unit nonresponse rates.


Rate REF rate (Refusal Rate)
Definition The ratio of reporting units in the statistical period that were classified as "refusal" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation REF rate = [REF/(E+U)] * 100
   
Rate CREF rate (Chronic Refusal Rate)
Definition The ratio of reporting units in the statistical period that were classified as "chronic refusals" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation CREF rate = [CREF/(E+U)] * 100
   
Rate INSF rate (Insufficient Data Rate)
Definition The ratio of reporting units in the statistical period that were classified as "insufficient data" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation INSF rate = [INSF/(E+U)] * 100
   
Rate UAA rate (Undeliverable as Addressed Rate)
Definition The ratio of reporting units in the statistical period that were classified as "undeliverable as addressed" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation UAA rate = [UAA/(E+U)] * 100
   
Rate OTH rate (Other Reason for Nonresponse Rate)
Definition The ratio of reporting units in the statistical period that were classified as "other reason for nonresponse" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation OTH rate = [OTH/(E+U)] * 100
   
Rate U rate (Unknown Eligibility Rate)
Definition The ratio of reporting units in the statistical period that were classified as "unknown eligibility" to the sum of eligible units and units of unknown eligibility (expressed as a percentage).
Computation U rate = [U/(E+U)] * 100
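All six breakdown rates share the denominator E + U, so they can be computed in one pass, as in this sketch. The `nonresponse_breakdown` helper and the sample counts are hypothetical.

```python
def nonresponse_breakdown(counts, e, u):
    """Return the unweighted nonresponse rates, each with denominator E + U.

    counts: mapping with keys "REF", "CREF", "INSF", "UAA", and "OTH".
    """
    base = e + u
    if base <= 0:
        raise ValueError("E + U must be positive")
    rates = {
        f"{name} rate": 100.0 * counts[name] / base
        for name in ("REF", "CREF", "INSF", "UAA", "OTH")
    }
    rates["U rate"] = 100.0 * u / base
    return rates


# Hypothetical counts for a period with E = 16 and U = 4.
counts = {"REF": 3, "CREF": 1, "INSF": 1, "UAA": 2, "OTH": 2}
rates = nonresponse_breakdown(counts, e=16, u=4)
for name, value in rates.items():
    print(f"{name}: {value:.1f}%")
```

Because chronic refusals are a subset of refusals, the CREF rate is a component of the REF rate and the two should not be added together.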
   

 

2.2.2 Total Quantity Response Rate Breakdowns

The following breakdowns provide weighted response rates.

Rate Q rate (Equivalent-Quality-to-Reported Data Rate)
Definition The proportion of the total estimate for item t derived from equivalent-quality-to-reported data for tabulation units in the statistical period (expressed as a percentage).
Computation $Q\ rate = \left[ \frac{\sum_{i=1}^{N_{TU}} q_{ti} \, t_i}{T} \right] \times 100$
   
Rate S rate (Survey Substitution Rate)
Definition The proportion of the total estimate for item t derived from substituted other survey or census data for tabulation units in the statistical period (expressed as a percentage). To be tabulated in this rate, substituted data items must be obtained from the same reporting unit in the same time period as the target program, and the item concept between the two programs must agree.
Computation $S\ rate = \left[ \frac{\sum_{i=1}^{N_{TU}} s_{ti} \, t_i}{T} \right] \times 100$
   
Rate D rate (Administrative Data Rate)
Definition The proportion of the total estimate of item t derived from administrative data for tabulation units in the statistical period (expressed as a percentage).
Computation $D\ rate = \left[ \frac{\sum_{i=1}^{N_{TU}} d_{ti} \, t_i}{T} \right] \times 100$
   
Rate C rate (Other Source Rate)
Definition The proportion of the total estimate of item t derived from other source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (such as company annual reports, SEC filings, trade association statistics) for tabulation units in the statistical period (expressed as a percentage).
Computation $C\ rate = \left[ \frac{\sum_{i=1}^{N_{TU}} c_{ti} \, t_i}{T} \right] \times 100$
   
Rate M rate (Imputation Rate)
Definition The proportion of the total estimate of item t derived from imputed data for tabulation units in the statistical period (expressed as a percentage).
Computation $M\ rate = \left[ \frac{\sum_{i=1}^{N_{TU}} m_{ti} \, t_i}{T} \right] \times 100$
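The weighted breakdowns can be sketched in the same way as the primary rates. This example assumes each rate's numerator is the sum of the design-weighted values ti over the tabulation units whose item value comes from the named source, divided by T; that reading follows the definitions above and is an assumption of this sketch. The one-letter source codes, the `source_rates` helper, and the sample data are hypothetical.

```python
VALID_SOURCES = set("rsdcm")  # reported, substituted, admin, other, imputed


def source_rates(units):
    """Return the Q, S, D, C, and M rates for one item.

    units: iterable of (source_code, t_i, f_i) tuples, where t_i is the
    design-weighted value and f_i is the nonresponse adjustment factor.
    """
    units = list(units)
    unexpected = {code for code, _, _ in units} - VALID_SOURCES
    if unexpected:
        raise ValueError(f"unrecognized source codes: {sorted(unexpected)}")
    total = sum(f * t for _, t, f in units)  # T
    if total <= 0:
        raise ValueError("T must be positive for these rates")
    amounts = {code: 0.0 for code in "sdcm"}
    for code, t, _ in units:
        if code in amounts:  # reported values ("r") enter only through T
            amounts[code] += t
    rates = {
        f"{code.upper()} rate": 100.0 * amount / total
        for code, amount in amounts.items()
    }
    # Q_t = S_t + D_t + C_t, so the Q rate is the sum of its three components.
    rates["Q rate"] = rates["S rate"] + rates["D rate"] + rates["C rate"]
    return rates


# Hypothetical item values with f_i = 1 for every tabulation unit.
breakdown_units = [
    ("r", 300.0, 1.0),
    ("s", 50.0, 1.0),
    ("d", 50.0, 1.0),
    ("c", 25.0, 1.0),
    ("m", 75.0, 1.0),
]
breakdown = source_rates(breakdown_units)
print(breakdown)
```

With fi = 1 for every unit, as in this example, the reported share (60 percent here), the Q rate, and the M rate sum to 100 percent.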
   


Source: U.S. Census Bureau | Methodology and Standards Council |  Last Revised: July 06, 2012