Service Annual Survey Technical Documentation

Introduction

The U.S. Census Bureau conducts the Service Annual Survey to provide national estimates of annual revenues and expenses of establishments classified in select service sectors. See the Coverage section below for more information on the industries included in the 2012 Service Annual Survey.

The estimates are developed using data from a probability sample of firms located in the United States that have paid employees (i.e., employer firms). Consequently, published estimates only include data for employer firms. The sample is regularly updated to reflect the universe of employer service businesses and covers both taxable firms and firms exempt from Federal income taxes. For more information about the design and selection of the sample, see the Sample Design and Estimation section below.

For some industries, firms without paid employees (i.e., nonemployers) may comprise a relatively large part of an industry. Because of the potential contribution to the industry totals from nonemployer firms, a separate table provides total revenue estimates for employers plus nonemployers. The nonemployer data included in this table are obtained from administrative data provided by other Federal agencies and through imputation. The Census Bureau's Nonemployer Statistics program tabulates the administrative data to provide annual statistics on the universe of nonemployer firms. For more information, see the Nonemployers section below and the Nonemployer Statistics program website.

Coverage

The estimates are summarized by industry classification based on the 2007 North American Industry Classification System (NAICS). NAICS groups establishments into industries based on the activities in which they are primarily engaged. This system, developed jointly by the statistical agencies of Canada, Mexico, and the United States, allows for comparisons of business activity across North America.

Estimates are presented for select industries in the following NAICS sectors:

NAICS       Title

22          Utilities
48-49       Transportation and Warehousing
51          Information
52          Finance and Insurance
53          Real Estate and Rental and Leasing
54          Professional, Scientific, and Technical Services
56          Administrative and Support and Waste Management and Remediation Services
61          Educational Services (except Elementary and Secondary Schools (NAICS 6111); Junior Colleges (NAICS 6112); and Colleges, Universities, and Professional Schools (NAICS 6113))
62          Health Care and Social Assistance
71          Arts, Entertainment, and Recreation
81          Other Services (except Public Administration)

Detailed information about NAICS can be found on the Census Bureau website at:

http://www.census.gov/epcd/www/naics.html

Changes from the 2011 Publication

  • New detail NAICS levels have been added for the following items. The new levels will be released from 2012 forward.

    • Estimated revenues (Table 2) and expenses (Table 3).
      • NAICS level 5413X (includes 54134 (Drafting Services), 54135 (Building Inspection Services), 54136 (Geophysical Surveying and Mapping Services), and 54137 (Surveying and Mapping (except Geophysical) Services)) will be released at separate 54134, 54135, and 5413Z (includes 54136 and 54137) NAICS levels.
      • NAICS level 5414Y (includes 54142 (Industrial Design Services) and 54149 (Other Specialized Design Services)) will be released at separate 54142 and 54149 NAICS levels.
      • NAICS level 5418Y (includes 54187 (Advertising Material Distribution Services) and 54189 (Other Services Related to Advertising)) will be released at separate 54187 and 54189 NAICS levels.
    • Estimated selected expenses (Table 5).
      • Added NAICS levels for 5411 (Legal Services), 5412 (Accounting, Tax Preparation, Bookkeeping, and Payroll Services), 5413 (Architectural, Engineering, and Related Services), 5414 (Specialized Design Services), 5415 (Computer Systems Design and Related Services), 5416 (Management, Scientific, and Technical Consulting Services), 5417 (Scientific Research and Development Services), 5418 (Advertising, Public Relations, and Related Services), and 5419 (Other Professional, Scientific, and Technical Services). In prior years, estimated selected expenses were only available at the sector level for 54 (Professional, Scientific, and Technical Services).
    • Estimated export revenue (Table 6).
      • Added NAICS level 5413Z (54136 and 54137).
    • To allow for revisions, we will continue to publish the aggregate NAICS levels related to the above estimates until we no longer plan to revise those estimates.
  • Estimated revenue (Table 2) and expense (Table 3) data will no longer be provided at the 6-digit NAICS levels for 485111 (Mixed Mode Transit Systems), 485112 (Commuter Rail Systems), 485113 (Bus and Other Motor Vehicle Transit Systems), and 485119 (Other Urban Transit Systems) due to data quality concerns. SAS will continue to provide estimates at the 4851 (Urban Transit Systems) level.
  • The 2012 SAS adds e-commerce estimates (Table 9). In previous years, SAS e-commerce estimates were only released as part of the E-Stats Report. See http://www.census.gov/econ/estats for more details on the E-Stats Report.

Dollar Values

All dollar values presented in this report are expressed in current dollars; that is, the estimates are not adjusted to a constant dollar series. Consequently, when comparing estimates to prior years, users also should consider price level changes.

Confidentiality

Title 13 of the United States Code authorizes the Census Bureau to conduct censuses and surveys. Section 9 of the same Title requires that any information collected from the public under the authority of Title 13 be maintained as confidential. Section 214 of Title 13 and Sections 3559 and 3571 of Title 18 of the United States Code provide for the imposition of penalties of up to five years in prison and up to $250,000 in fines for wrongful disclosure of confidential census information. In accordance with Title 13, no estimates are published that would disclose the operations of an individual firm.

The Census Bureau's internal Disclosure Review Board sets the confidentiality rules for all data releases. A checklist approach is used to ensure that all potential risks to the confidentiality of the data are considered and addressed.

Disclosure Limitation

A disclosure of data occurs when an individual can use published statistical information to identify either an individual or firm that has provided information under a pledge of confidentiality. Disclosure limitation is the process used to protect the confidentiality of the survey data provided by an individual or firm. Using disclosure limitation procedures, the Census Bureau modifies or removes the characteristics that put confidential information at risk for disclosure. Although it may appear that a table shows information about a specific individual or business, the Census Bureau has taken steps to disguise or suppress the original data while making sure the results are still useful. The techniques used by the Census Bureau to protect confidentiality in tabulations vary, depending on the type of data.

Unpublished Estimates

Some unpublished estimates can be derived directly from this report by subtracting published estimates from their respective totals. However, the figures obtained by such subtraction are subject to poor response rates, high sampling variability, or other factors that result in their failure to meet Census Bureau standards for publication.

Individuals who use Service Annual Survey estimates to create new estimates should cite the Census Bureau as the source of only the original estimates.

SAMPLE DESIGN AND ESTIMATION PROCEDURES

A new sample was introduced with the 2011 Service Annual Survey (SAS). The new sample was designed to produce estimates based on the 2007 North American Industry Classification System (NAICS). This section describes the design, selection, and estimation procedures for the new sample. For descriptions of prior samples, see the Service Annual Survey publications at http://www.census.gov/services/sas/historic_data.html.

Sampling Frame

The sampling frame used for the Service Annual Survey (SAS) has two types of sampling units represented: EINs and large, multiple-establishment firms. Both sampling units represent clusters of one or more establishments owned or controlled by the same firm. The information used to create these sampling units was extracted from data collected as part of the 2007 Economic Census and from establishment records contained on the Census Bureau's Business Register as updated to December 2010. The next few paragraphs give details about the Business Register; the distinction between firms, EINs, and establishments; and the construction of the sampling units. Though important, they are not essential to understanding the basic sample design and readers may continue to the Stratification, Sampling Rates, and Allocation section.

The Business Register is a multi-relational database that contains a record for each known establishment that is located in the United States or one of its territories and has paid employees. An establishment is a single physical location where business transactions take place and for which payroll and employment records are kept. Groups of one or more establishments under common ownership or control are firms. A single-unit firm owns or operates only one establishment. A multiunit firm owns or operates two or more establishments. The treatment of establishments on the Business Register differs according to whether the establishment is part of a single-unit or multiunit firm. In particular, the structure of an establishment's primary identifier on the Business Register differs depending on whether it is owned by a single-unit firm or by a multiunit firm.

A single-unit firm's primary identifier is its EIN. The Internal Revenue Service (IRS) issues the EIN, and the firm uses it as an identifier to report social security payments for its employees under the Federal Insurance Contributions Act (FICA). The same act requires all employer firms to use EINs. Each employer firm is associated with at least one EIN and only one firm can use a given EIN. Because a single-unit firm has only one establishment, there is a one-to-one relationship between the firm and the EIN. Thus the firm, the EIN, and the establishment all reference the same physical location and all three terms can be used interchangeably and unambiguously when referring to a single-unit firm.

For multiunit firms however, a different structure connects the firm with its establishments via the EIN. Essentially a multiunit firm is associated with a cluster of one or more EINs and EINs are associated with one or more establishments. A multiunit firm consists of at least two establishments. Each firm is associated with at least one EIN and only one firm can use a given EIN. However, one multiunit firm may have several EINs. Similarly, there is a one-to-many relationship between EINs and establishments. Each EIN can be associated with many establishments but each establishment is associated with only one EIN. Because of the possibility of one-to-many relationships, we must distinguish between the firm, its EINs, and its establishments. The multiunit firm that owns or controls a particular establishment is identified on the Business Register by way of the establishment's primary identifier.

The primary identifier of an establishment owned by a multiunit firm consists of a unique combination of an alpha number and a plant number. The alpha number identifies the multiunit firm, and the plant number identifies a particular establishment within that firm. All establishments owned or controlled by the same multiunit firm have the same alpha number. Different multiunit firms have different alpha numbers, and different establishments within the same multiunit firm have different plant numbers. The Census Bureau assigns both the alpha number to the multiunit firm and plant numbers to the corresponding establishments based on the results of the quinquennial economic census and the annual Company Organization Survey.

To create the sampling frame, we extract the records for all establishments located in the United States and classified in select service sectors as defined by the 2007 NAICS. For these establishments, we extract revenue, payroll, employment, name and address information, as well as primary identifiers and, for establishments owned by multiunit firms, associated EINs.

To create the sampling units for multiunit firms, we aggregate the economic data of the establishments owned by these firms to an EIN level by tabulating the establishment data for all service establishments associated with the same EIN. Similarly we aggregate the data to a multiunit firm level by tabulating the establishment data for all service establishments associated with the same alpha number. No aggregation is necessary to put single-unit establishment information on an EIN basis or a firm basis. Thus, the sampling units created for single-unit firms simultaneously represent establishment, EIN, and firm information. The sampling frame is a complex amalgam of establishments, EINs, and firms.
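The aggregation described above can be sketched in a few lines of Python. This is only an illustration: the record layout and the field names (alpha, plant, ein, revenue) are assumptions made for the example and are not actual Business Register fields.

```python
from collections import defaultdict

# Hypothetical establishment records. The first three belong to one multiunit
# firm (alpha number A001) that uses two EINs; the last is a single-unit firm,
# which has no alpha number and is identified by its EIN alone.
establishments = [
    {"alpha": "A001", "plant": "0001", "ein": "11-1111111", "revenue": 500.0},
    {"alpha": "A001", "plant": "0002", "ein": "11-1111111", "revenue": 300.0},
    {"alpha": "A001", "plant": "0003", "ein": "22-2222222", "revenue": 200.0},
    {"alpha": None, "plant": None, "ein": "33-3333333", "revenue": 150.0},
]


def aggregate(records, key):
    """Sum establishment revenue over all records sharing the same identifier."""
    totals = defaultdict(float)
    for rec in records:
        if rec[key] is not None:
            totals[rec[key]] += rec["revenue"]
    return dict(totals)


# EIN-level units: one entry per EIN, multiunit or single-unit.
ein_units = aggregate(establishments, "ein")
# Firm-level units for multiunit firms: one entry per alpha number.
firm_units = aggregate(establishments, "alpha")

print(ein_units)
print(firm_units)
```

In this sketch the single-unit record passes through the EIN aggregation unchanged, which mirrors the statement that no aggregation is needed for single-unit firms.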

Stratification, Sampling Rates, and Allocation

The primary stratification of the sampling frame is by industry group based on the detail required for publication. We further stratify the sampling units within industry group by a measure of size (substratify) related to their annual revenue. Sampling units expected to have a large effect on the precision of the estimates are selected "with certainty." This means they are sure to be selected and will represent only themselves (i.e., have a selection probability of 1 and a sampling weight of 1). Within each industry stratum, we determine a substratum boundary (or cutoff) that divides the certainty units from the noncertainty units. We base these cutoffs on a statistical analysis of data from the 2007 Economic Census. Accordingly, these values are on a 2007 revenue basis. We also used this analysis to determine the number of size substrata and substratum bounds for each industry stratum and to set preliminary sampling rates needed to achieve specified sampling variability constraints on revenue estimates for different industry groups. The size substrata, substratum bounds, and sampling rates are later updated through analysis of the sampling frame.

Sample Selection

First, if a firm's annual revenue is greater than the corresponding certainty cutoff, that firm is selected into the SAS sample with certainty.

Next, all firms not selected with certainty are subjected to sampling on an EIN basis. If a firm has more than one EIN, we treat each of its EINs as a separate sampling unit. To be eligible for the initial sampling, an EIN has to have nonzero payroll in 2009. The EINs are stratified according to their major industry and their estimated sales (on a 2007 basis). Within each noncertainty stratum, a simple random sample of EINs is selected without replacement.
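As a rough illustration of the two selection steps, the following Python sketch works on a single industry stratum. It simplifies the procedure: it treats every record as one sampling unit (it does not split firms into EINs), and the cutoff, substratum bounds, sampling rates, and function name are hypothetical values chosen for the example.

```python
import bisect
import random


def select_sample(units, certainty_cutoff, bounds, rates, seed=12345):
    """Select certainty units, then a simple random sample without
    replacement within each noncertainty size substratum.

    bounds: ascending upper revenue bounds of the size substrata; the last
            substratum runs from the final bound up to the certainty cutoff.
    rates:  one sampling rate (0 < rate <= 1) per substratum.
    """
    rng = random.Random(seed)
    sample = []
    substrata = {}
    for unit in units:
        if unit["revenue"] > certainty_cutoff:
            # Certainty units represent only themselves.
            sample.append({**unit, "weight": 1.0})
        else:
            h = bisect.bisect_left(bounds, unit["revenue"])
            substrata.setdefault(h, []).append(unit)
    for h, members in substrata.items():
        n_h = max(1, round(rates[h] * len(members)))
        # random.Random.sample draws without replacement.
        for unit in rng.sample(members, n_h):
            # Weight is the reciprocal of the selection probability n_h / N_h.
            sample.append({**unit, "weight": len(members) / n_h})
    return sample


# Hypothetical frame: 20 small units plus 2 large units above the cutoff.
population = [{"id": i, "revenue": 50.0 * i} for i in range(1, 21)]
population += [{"id": 21, "revenue": 5000.0}, {"id": 22, "revenue": 8000.0}]

sample = select_sample(
    population, certainty_cutoff=2000.0, bounds=[250.0, 600.0], rates=[0.2, 0.5, 1.0]
)
print(len(sample), sum(u["weight"] for u in sample))
```

Because each weight is the number of frame units in the substratum divided by the number sampled, the weights in this sketch sum to the number of units on the frame.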

Method of Assigning Tax Status

For kind-of-business classifications where there are substantial numbers of taxable and tax-exempt establishments, establishments are classified based on the Federal income tax filing requirement for the establishment or organization. This classification is based primarily on the response to an inquiry on the 2007 Economic Census questionnaire. Establishments that indicated that all or part of their income is exempt from Federal income tax under provisions of section 501 of the Internal Revenue Service (IRS) code are classified as tax-exempt; establishments indicating no such exemption are classified as taxable. All government-operated hospitals are classified as tax-exempt. For establishments without a report form, the tax status classification is based upon administrative data from other Federal agencies.

For selected kind-of-business classifications that are comprised primarily of tax-exempt establishments, all establishments in those classifications are defined as tax-exempt. All establishments in the remaining kind-of-business classifications (comprised primarily of taxable establishments) are defined as taxable.

Sample Maintenance

We update the sample to represent EINs issued since the initial sample selection. These new EINs, called births, are EINs that have an active payroll filing requirement on the IRS Business Master File (BMF). An active payroll filing requirement indicates that the EIN is required to file payroll for the next quarterly period. The Social Security Administration attempts to assign an industry classification to each new EIN.

EINs with an active payroll filing requirement on the IRS Business Master File are said by the Bureau to be “BMF active” and EINs with an inactive payroll filing requirement are said to be “BMF inactive.”

We sample EIN births on a quarterly basis using a two-phase selection procedure. To be eligible for selection, a birth must either have no industry classification or be classified in an industry within the scope of the Service Annual Survey, the Annual Wholesale Trade Survey, or the Annual Retail Trade Survey, and it must meet certain criteria regarding its quarterly payroll. In the first phase, we stratify births by broad industry groups and a measure of size based on quarterly payroll. A relatively large sample is drawn and canvassed to obtain a more reliable measure of size, consisting of revenue in two recent months and a new or more detailed industry classification code.

Using this more reliable information, in the second phase we subject the selected births from the first phase to probability proportional-to-size sampling with overall probabilities equivalent to those used in drawing the initial Service Annual Survey sample from the December 2010 Business Register. Because of the time it takes for a new employer firm to acquire an EIN from the IRS and because of the time needed to accomplish the two-phase birth-selection procedure, we add births to the sample approximately nine months after they begin operation.

We include births that are selected in the quarterly birth-selection procedure in November of the reference year in the initial mailing of the Service Annual Survey questionnaires in January of the following year. To better represent all EIN births in the reference year, and specifically to account for the time it takes to identify and select new EINs, we add births to the Service Annual Survey sample that are selected in February, May, and August of the year following the reference year. We mail survey forms to these births in June and August to supplement the initial survey mailing.

To be eligible for the sample canvass and tabulation, an EIN selected in the noncertainty sampling operations must meet both of the following requirements:

  • It must have an active payroll filing requirement on the IRS Business Master File.

  • It must have been selected from the Business Register in either the initial sampling or during the quarterly birth-selection procedure.

If a firm was selected with certainty and had more than one establishment at the time of sampling, any new establishments that the firm acquires, even if under new or different EINs, are included in the sample with certainty.

Each quarter, we check against the current Business Register to determine if any EINs on the Service Annual Survey have become BMF inactive. Typically, we do not canvass BMF inactive EINs during the reference year. Likewise, if any EIN on SAS that was BMF inactive in a previous reference year is now BMF active on the current Business Register, we again include these EINs in the canvass. In both cases, we only tabulate data for that portion of the reference year that these EINs reported payroll to the IRS.

Single-unit EINs selected into the sample with certainty are not dropped from canvass and tabulation if they are no longer BMF active. Rather, the firm that used the EIN is contacted, and if a successor EIN is found, it is added to the survey. For both inactive EINs and any previously inactive EINs that are now active, data are tabulated for only the portion of the reference year that these EINs reported payroll to the IRS.

Estimation and Sampling Variance

Total estimates are computed using the Horvitz-Thompson estimator (i.e., as the sum of weighted data (reported or imputed) for all selected sampling units that meet the sample canvass and tabulation criteria (see Sample Maintenance section)). The weight for a given sampling unit is the reciprocal of its probability of selection into the Service Annual Survey sample. These estimates are input to a benchmarking procedure, as described below. Variances are estimated using the method of random groups and are used to determine if measured changes are statistically significant.
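A minimal sketch of the Horvitz-Thompson total follows. The sample values and selection probabilities are invented for illustration, and the function name is hypothetical.

```python
def horvitz_thompson_total(sample):
    """Sum of weighted values, where each weight is 1 / selection probability."""
    return sum(unit["value"] / unit["prob"] for unit in sample)


# Hypothetical tabulated units (reported or imputed revenue, in millions).
sample = [
    {"value": 900.0, "prob": 1.0},   # certainty unit, weight 1
    {"value": 120.0, "prob": 0.25},  # weight 4
    {"value": 80.0, "prob": 0.25},   # weight 4
    {"value": 10.0, "prob": 0.05},   # weight 20
]

total = horvitz_thompson_total(sample)
print(round(total, 1))
```

With these numbers the weighted contributions are 900, 480, 320, and 200, so the estimated total is about 1,900.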

Benchmarking

For industries collected in the prior sample in 2007, we benchmarked to the total revenue level from the 2007 Economic Census. Sample linking to the prior sample for these industries implicitly benchmarks to the 2007 Economic Census.

For industries newly collected in the 2009 Service Annual Survey (expansion industries), we can only link samples; we cannot benchmark to the 2012 Economic Census until those data are available. Once available, all estimates will be benchmarked to the 2012 Census results. Expansion industries include the following:

  • NAICS 22 (Utilities);
  • NAICS 481 (Air Transportation);
  • NAICS 483 (Water Transportation);
  • NAICS 485 (Transit and Ground Passenger Transportation);
  • NAICS 486 (Pipeline Transportation);
  • NAICS 487 (Scenic and Sightseeing Transportation);
  • NAICS 488 (Support Activities for Transportation);
  • NAICS 521 (Monetary Authorities-Central Banks);
  • NAICS 522 (Credit Intermediation and Related Activities);
  • NAICS 5232 (Securities and Commodity Exchanges);
  • NAICS 52391 (Miscellaneous Intermediation);
  • NAICS 52399 (All Other Financial Investment Activities);
  • NAICS 524 (Insurance Carriers and Related Activities);
  • NAICS 531 (Real Estate). Real Estate Investment Trusts (REITs), a new part of NAICS 5311
    (Lessors of Real Estate) in the current sample, were not collected in the prior sample,
    so this piece cannot even be linked;
  • NAICS 533 (Lessors of Nonfinancial Intangible Assets, except Copyrighted Works); and,
  • NAICS 61 (Educational Services (except Elementary and Secondary Schools (NAICS 6111);
    Junior Colleges (NAICS 6112); and Colleges, Universities, and Professional Schools
    (NAICS 6113))).

Prior to linking, the following two operations are performed:

  • For industries affected by the change from 2002 to 2007 NAICS, published estimates for 1998 through 2010 from the prior sample are restated on a 2007 NAICS basis, using revenue distributions from the 2007 Economic Census that link the two sets of classification codes. This operation is performed only during the first year of the sample. All estimates from the new sample will be on a 2007 NAICS basis. For industries not affected by the change from 2002 to 2007 NAICS, there is no need to restate the published census-adjusted revenue estimates from the prior sample.
  • Historical corrections are made to current sample data back to 2010.

Linking Samples

  • Sectors for which data were collected in 2007 and most of NAICS 52 (with the exception of NAICS 522310 (Mortgage and Nonmortgage Loan Brokers)) are linked to the prior sample by multiplying the Horvitz-Thompson revenue estimate from the current sample by a ratio for years 2010 and forward. Previous years’ estimates remain unchanged.
  • The numerator is the 2010 revenue estimate for the industry on a 2007 NAICS basis from the prior sample.
  • The denominator is the 2010 Horvitz-Thompson estimate of revenue for the industry on a 2007 NAICS basis from the current sample.
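The linking ratio described in the bullets above can be sketched as follows. The estimates are made-up numbers and the function names are hypothetical; only the arithmetic (numerator over denominator, applied to 2010 and later years) comes from the text.

```python
def link_ratio(prior_sample_2010, current_ht_2010):
    """Ratio of the prior-sample 2010 estimate to the current-sample 2010
    Horvitz-Thompson estimate, both on a 2007 NAICS basis."""
    return prior_sample_2010 / current_ht_2010


def linked_estimates(current_ht_by_year, ratio):
    """Apply the ratio to current-sample estimates for 2010 and forward."""
    return {
        year: estimate * ratio
        for year, estimate in current_ht_by_year.items()
        if year >= 2010
    }


# Hypothetical revenue estimates for one industry (millions of dollars).
ratio = link_ratio(prior_sample_2010=1050.0, current_ht_2010=1000.0)
linked = linked_estimates({2010: 1000.0, 2011: 1100.0, 2012: 1200.0}, ratio)
print(ratio, linked)
```

By construction, the linked 2010 estimate equals the prior-sample 2010 estimate, so the series carries over from the prior sample without a break at 2010.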

The resulting revenue estimates (call these “modified” revenue estimates) were benchmarked to the 2007 Economic Census in the prior sample. Expansion industries were not collected in 2007, so they have not been benchmarked to the Economic Census. Expansion industries (except for most of NAICS 52, as noted above) use the Horvitz-Thompson estimate from the current sample for 2010 and subsequent years. For 2009 estimates, the 2009 published estimate is multiplied by the inverse of the revenue ratio above for its linked estimate.

For detail items correlated to revenue (detail revenue items, export revenue, inventory), estimates before 2010 will only be included in the 2011 publication tables if the data collected are on a comparable basis, and these will be unrevised from previously published estimates. For years 2010 and forward, the Horvitz-Thompson estimate for the item is multiplied by the revenue ratio above to get modified estimates.

The following method is used to produce “modified” total expense estimates. First, the revenue ratio described above is multiplied by the Horvitz-Thompson total expense estimates for each detailed industry for 2010 and subsequent years, resulting in modified total expense estimates for these years. Then the published estimates for 2007-2010 from the prior sample are input into a program. This program revises the 2008-2010 estimates in a manner that

  • uses published estimates for 2007 from the prior sample as a constraint resulting in no revision to the published 2007 estimate;
  • uses the modified estimate for 2010 from the current sample as a constraint; and,
  • minimizes the sum of squared differences between the year-to-year change of the input and revised estimates for 2008-2010.

The same method as described above for expenses is used to produce e-commerce estimates except the year held constant is 2004, not 2007.

For detail items correlated to expenses, the estimates through 2007 are fixed. In 2008 and 2009, the previously published 2008 or 2009 estimates are modified by the following ratio:

Numerator=Modified (2008 or 2009) total expenses

Denominator=Published (2008 or 2009) total expenses.

In subsequent years, the Horvitz-Thompson estimate for the detailed expense is multiplied by the revenue ratio above.

Modified estimates for data items that sum to total revenue or total expenses are raked to modified total revenue or total expenses to ensure additivity. Modified estimates at aggregate industry levels are computed by summing the estimates for the appropriate detailed industries comprising the aggregate.
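For a single set of detail items that should sum to one total, raking reduces to a proportional adjustment. The sketch below shows that one-dimensional case only; the item names and values are hypothetical, and the text does not say whether the production procedure involves more than one dimension.

```python
def rake_to_total(details, control_total):
    """Scale detail estimates proportionally so they sum to control_total."""
    factor = control_total / sum(details.values())
    return {item: value * factor for item, value in details.items()}


# Hypothetical modified detail expense items and the modified total expenses.
details = {"payroll": 40.0, "benefits": 35.0, "other": 20.0}
raked = rake_to_total(details, control_total=100.0)
print(raked, sum(raked.values()))
```

Each item keeps its share of the detail sum; only the overall level changes so that the details add to the modified total.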

Nonemployers

Total revenue estimates for employers plus nonemployers are published in a separate table; all other tables contain estimates based only on employer firms. Firms without paid employees (nonemployers) are included in the total revenue estimates through administrative data provided by other Federal agencies and through imputation. Imputed nonemployer revenue totals for reference year 2011 have been replaced by values published by the Nonemployer Statistics program. Nonemployer revenue totals for reference year 2012 are imputed because values from the Nonemployer Statistics program are not yet available.

RELIABILITY OF THE ESTIMATES

The published estimates may differ from the actual, but unknown, population values. For a particular estimate, statisticians define this difference as the total error of the estimate. When describing the accuracy of survey results, it is convenient to discuss total error as the sum of sampling error and nonsampling error. Sampling error is the error arising from the use of a sample, rather than a census, to estimate population values. Nonsampling error encompasses all other factors that contribute to the total error of a sample survey estimate. Further descriptions of sampling error and nonsampling error are provided in the following sections. Data users should take into account the estimates of sampling error and the potential effects of nonsampling error when using the published estimates.

Sampling Error

Because the estimates are based on a sample, exact agreement with results that would be obtained from a complete enumeration of firms on the sampling frame using the same enumeration procedures is not expected. However, because each firm on the sampling frame has a known probability of being selected into the sample, it is possible to estimate the sampling variability of the survey estimates.

The particular sample used in this survey is one of a large number of samples of the same size that could have been selected using the same design. If all possible samples had been surveyed under the same conditions, an estimate of a population parameter of interest could have been obtained from each sample. For the parameter of interest, estimates derived from the different samples would, in general, differ from each other. Common measures of the variability among these estimates are the sampling variance, the standard error, and the coefficient of variation (CV). The sampling variance is defined as the squared difference, averaged over all possible samples of the same size and design, between the estimator and its average value. The standard error is the square root of the sampling variance. The CV expresses the standard error as a percentage of the estimate to which it refers. For example, an estimate of 200 units that has an estimated standard error of 10 units has an estimated CV of 5 percent. The sampling variance, standard error, and CV of an estimate can be estimated from the selected sample because the sample was selected using probability sampling. Note that measures of sampling variability, such as the standard error and CV, are estimated from the sample and are also subject to sampling variability. (Technically, we should refer to the estimated standard error or the estimated CV of an estimator. However, for the sake of brevity we have omitted this detail.) It is important to note that the standard error and CV only measure sampling variability. They do not measure any systematic biases in the estimates.

We estimate variances for published statistics (totals, ratios, and percent changes) using the method of random groups. To implement the random group method of variance estimation, we assign a random group number to each sampling unit at the time of sample selection. Then, for each tabulation level at which estimates are produced, we compute variance estimates using the assigned random group numbers. We use 16 random groups (G=16) to estimate variances for the Service Annual Survey. For more information on the random group method of variance estimation, click here.
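The text does not give the variance formula, so the sketch below uses a common textbook form of the random group estimator for a total and should be read as an assumption rather than the Census Bureau's exact computation. Each group's weighted total is scaled by G to estimate the full total, and the variance of the full-sample estimate is the sum of squared deviations of the group estimates from it, divided by G(G-1). The sample data are invented, and groups are assigned cyclically here only to keep the example deterministic.

```python
import math

G = 16  # number of random groups, as stated in the text


def random_group_variance(sample, n_groups=G):
    """Return (full-sample total, estimated variance) using random groups.

    Assumed textbook form: group estimate = n_groups * weighted group total;
    variance = sum((group estimate - full estimate)^2) / (G * (G - 1)).
    """
    group_totals = [0.0] * n_groups
    for unit in sample:
        group_totals[unit["group"]] += unit["weight"] * unit["value"]
    full = sum(group_totals)
    group_estimates = [n_groups * t for t in group_totals]
    variance = sum((g - full) ** 2 for g in group_estimates) / (
        n_groups * (n_groups - 1)
    )
    return full, variance


# Hypothetical sample of 32 units; in practice the group number is assigned
# at random when the unit is selected.
sample = [{"group": i % G, "weight": 4.0, "value": 10.0 + i} for i in range(32)]

full, variance = random_group_variance(sample)
standard_error = math.sqrt(variance)
cv_percent = 100.0 * standard_error / full
print(full, round(standard_error, 2), round(cv_percent, 2))
```

The 15 degrees of freedom used for confidence intervals correspond to G-1 in this formulation.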

The Census Bureau recommends that individuals using published estimates incorporate this information into their analyses, as sampling error could affect the conclusions drawn from these estimates.

The estimate from a particular sample and its associated standard error can be used to construct a confidence interval. A confidence interval is a range about a given estimator that has a specified probability of containing the average of the estimates for the parameter derived from all possible samples of the same size and design. Associated with each interval is a percentage of confidence, which is interpreted as follows. If, for each possible sample, an estimate of a population parameter and its approximate standard error were obtained, then, using a t-statistic with 15 (=G-1) degrees of freedom:

  • For approximately 90 percent of the possible samples, the interval from 1.753 standard errors below to 1.753 standard errors above the estimate would include the average of the estimates derived from all possible samples of the same size and design.

To illustrate the computation of a confidence interval for an estimate of total revenue, assume that an estimate of total revenue is $10,750 million and the CV for this estimate is 1.8 percent, or 0.018. First obtain the standard error of the estimate by multiplying the total revenue estimate by its CV. For this example, multiply $10,750 million by 0.018. This yields a standard error of $193.5 million. The upper and lower bounds of the 90-percent confidence interval are computed as $10,750 million plus or minus 1.753 times $193.5 million. Consequently, the 90-percent confidence interval is $10,411 million to $11,089 million. If corresponding confidence intervals were constructed for all possible samples of the same size and design, approximately 9 out of 10 (90 percent) of these intervals would contain the average of the estimates derived from all possible samples.
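
The arithmetic in this example can be reproduced with a short script. The function and constant names are illustrative; the multiplier 1.753 is the value given in the text for 15 degrees of freedom.

```python
T_90_PERCENT_15_DF = 1.753  # multiplier from the text (t with 15 degrees of freedom)


def confidence_interval(estimate, cv, t_value=T_90_PERCENT_15_DF):
    """Return (lower, upper) bounds given an estimate and its CV as a proportion."""
    standard_error = estimate * cv
    margin = t_value * standard_error
    return estimate - margin, estimate + margin


# Total revenue of $10,750 million with a CV of 1.8 percent.
lower, upper = confidence_interval(10_750, 0.018)
print(round(lower), round(upper))  # 10411 11089
```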

Nonsampling Errors

Nonsampling error encompasses all factors other than sampling error that contribute to the total error of a sample survey estimate; it may also occur in censuses. It is often helpful to think of nonsampling error as arising from deficiencies or mistakes in the survey process. Nonsampling errors are difficult to measure and can be attributed to many sources: the inclusion of erroneous units in the survey (overcoverage), the exclusion of eligible units from the survey (undercoverage), nonresponse, misreporting, mistakes in recording and coding responses, misinterpretation of questions, and other errors of collection, response, coverage, or processing. Although nonsampling error is not measured directly, the Census Bureau employs quality control procedures throughout the process to minimize this type of error.

A potential source of bias in the estimates is nonresponse. Nonresponse is defined as the inability to obtain all the intended measurements or responses about all selected units. Two types of nonresponse are often distinguished. Unit nonresponse describes the inability to obtain any of the substantive measurements about a sampled unit. In most cases of unit nonresponse, the questionnaire was never returned to the Census Bureau despite several attempts to elicit a response. Item nonresponse occurs either when a question is unanswered or when the response to the question fails computer or analyst edits.

For both unit and item nonresponse, a missing value is replaced by a predicted value obtained from an appropriate model for nonresponse. This procedure is called imputation and uses survey data and administrative data as input.

Further explanation of the quality of data and the estimates can be made available upon request.

Response Rates

Economic surveys at the Census Bureau are required to compute two different types of response rates: a unit response rate and weighted item response rates.

The next few paragraphs provide details about the types and status of units used to collect and tabulate data. Though important, they are not essential to understanding the response rate measures and readers may continue to the description of the two types of response rates.

A survey unit is an entity selected from the underlying statistical population of similarly constructed units. Examples of survey units for different economic programs include establishments, Employer Identification Numbers (EINs), firms, state and local government entities, and building permit-issuing offices. For SAS, the survey unit is either an EIN or a company, either of which can consist of one or more establishments owned or controlled by the same firm.

A reporting unit is an entity about which data are collected. Reporting units are the vehicle for obtaining data and may or may not correspond to a survey unit for several reasons. First, the composition of the originally sampled entity can change over the sample’s life cycle, as noted above. Second, for some surveys, an entity may request (or the Census Bureau may ask the entity) to report data in several separate pieces corresponding to different parts of the business or other entity type. For example, a large, diverse company in a company-based collection may request a separate form for each region or kind of business in which it operates, or may ask to report separately for each of its establishments to align with its record-keeping practices. For SAS, reporting units are usually created to facilitate the collection and tabulation of data by industry.

A tabulation unit houses the data used for estimation (or tabulation, in the case of a census). As with reporting units, the tabulation units may not correspond to a survey unit. Some programs consolidate establishment or plant-level data to a company level to create tabulation units, so that the tabulation unit is often equivalent to the survey unit. Other programs create artificial units that split a reporting unit’s data among the different industries in which the reporting unit operates. In this case, the tabulation unit represents a portion of a survey unit. For SAS, the tabulation unit is the reporting unit.

For each survey, the statistical period describes the reference period for the data collection. For example, an annual program might collect data on the prior year’s business activity; the statistical period refers to the prior year, but the data are collected in the current calendar year.

During a given statistical period, all three types of units can be active, inactive, or ineligible. An active unit is in business and is in-scope for the program during the statistical period. An inactive unit is not operating or is not in-scope during the statistical period but is believed to have been active in the past and can potentially become active and in-scope in the future. Finally, a survey unit may become ineligible and be excluded from subsequent computations because of a change in industry classification or because it has ceased business operations. All units are considered active until verified evidence to the contrary is provided.

For additional information about response rates, see the Census Bureau’s Statistical Quality Standard D.3., Appendix B: Requirements for Calculating and Reporting Response Rates for Economic Surveys.

Two Types of Response Rates

The Unit Response Rate (URR) is defined as the percentage of reporting units, based on unweighted counts, that responded to the survey out of all reporting units in the statistical period that were eligible for data collection or of unknown eligibility. URRs are indicators of the performance of data collection for obtaining usable responses. For a reporting unit to be classified as a response, a respondent must provide either total revenue or total expenses. Responses may be obtained by mail, telephone, facsimile, or Internet.
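
Read literally, this definition amounts to a ratio of unweighted counts. The following is a minimal sketch under that reading. The function names, arguments, and counts are hypothetical, and the Census Bureau's standard may specify further details not shown here.

```python
def is_response(total_revenue=None, total_expenses=None):
    """A reporting unit counts as a response if it provides either item."""
    return total_revenue is not None or total_expenses is not None


def unit_response_rate(responded, eligible, unknown_eligibility):
    """Unweighted URR in percent.

    responded: count of reporting units classified as responses
    eligible: count of units eligible for data collection (includes responders)
    unknown_eligibility: count of units whose eligibility is unknown
    """
    denominator = eligible + unknown_eligibility
    if denominator == 0:
        raise ValueError("no eligible or unknown-eligibility units")
    return 100 * responded / denominator


# Hypothetical counts, for illustration only.
print(unit_response_rate(responded=750, eligible=900, unknown_eligibility=100))
```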

The Total Quantity Response Rate (TQRR) is defined as the percentage of the estimated (weighted) total of a given data item reported by the active tabulation units in the statistical period or from sources determined to be equivalent-quality-to-reported data. The TQRR is an item-level indicator of the “quality” of each estimate. In contrast to the URR, these weighted response rates are computed for individual data items, so that a survey may produce several TQRRs per statistical period and release. The TQRR is a weighted measure that takes the size of the tabulation unit into account as well as the associated sampling parameters. To compute the TQRR for a particular estimate, it is necessary to determine the source of the final tabulated value of the associated data item for each tabulation unit. This value could be directly obtained from respondent data, indirectly obtained from other equivalent quality data sources, or imputed.
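
As an illustration of how the source of each tabulated value enters the calculation, the sketch below computes a TQRR for one data item. The `TabulationUnit` class, its field names, the source labels, and the sample values are all hypothetical. The weight stands in for whatever sampling parameters the survey applies, which this page does not spell out.

```python
from dataclasses import dataclass


@dataclass
class TabulationUnit:
    value: float   # final tabulated value of the data item
    weight: float  # sampling weight (assumed form)
    source: str    # "reported", "equivalent", or "imputed"


def total_quantity_response_rate(units):
    """Weighted share (percent) of the estimated total from reported or
    equivalent-quality data."""
    total = sum(u.weight * u.value for u in units)
    if total == 0:
        raise ValueError("estimated total is zero")
    counted = sum(
        u.weight * u.value for u in units if u.source in ("reported", "equivalent")
    )
    return 100 * counted / total


# Hypothetical tabulation units, for illustration only.
units = [
    TabulationUnit(value=100.0, weight=2.0, source="reported"),
    TabulationUnit(value=50.0, weight=4.0, source="equivalent"),
    TabulationUnit(value=100.0, weight=1.0, source="imputed"),
]
print(total_quantity_response_rate(units))
```

Because the rate is weighted by value, a single large imputed unit can lower the TQRR far more than many small ones, even when the URR is unchanged.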

The URRs and TQRRs for 2012 total revenue and total expense for employer firms at the published sector levels are as follows:

 

NAICS | Sector Title | URR (%) | TQRR Revenue (%) | TQRR Expense (%)
22 | Utilities | 76.6 | 83.8 | 84.0
48-49 | Transportation and Warehousing | 70.9 | 89.3 | 83.8
51 | Information | 72.0 | 89.2 | 82.6
52 | Finance and Insurance (except 525930) | 78.3 | 90.9 | 91.9
53 | Real Estate and Rental and Leasing | 75.4 | 87.4 | 80.9
54 | Professional, Scientific, and Technical Services | 75.8 | 85.4 | 79.9
56 | Administrative and Support and Waste Management and Remediation Services | 72.5 | 80.3 | 74.0
61 | Educational Services (except 6111, 6112, and 6113) | 73.6 | 84.6 | 84.0
62 | Health Care and Social Assistance | 74.8 | 83.9 | 84.0
71 | Arts, Entertainment, and Recreation | 76.8 | 87.1 | 88.0
81 | Other Services (except Public Administration) | 75.7 | 88.2 | 86.2

Estimates Suppressed from Publication

Estimates with a coefficient of variation greater than 30 percent or with a total quantity response rate less than 50 percent have been suppressed from publication. These estimates have been replaced with an "S" in the published table. For more information, see the Census Bureau's standards for Releasing Information Products.
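
The suppression rule stated above can be expressed as a simple check. The function name and arguments are illustrative; the thresholds are the ones given in the text.

```python
CV_LIMIT_PERCENT = 30.0    # suppress when the CV is greater than this
TQRR_FLOOR_PERCENT = 50.0  # suppress when the TQRR is less than this


def published_value(estimate, cv_percent, tqrr_percent):
    """Return the estimate, or "S" if it falls under the suppression rule."""
    if cv_percent > CV_LIMIT_PERCENT or tqrr_percent < TQRR_FLOOR_PERCENT:
        return "S"
    return estimate


# Hypothetical estimates, for illustration only.
print(published_value(10_750, cv_percent=1.8, tqrr_percent=89.3))
print(published_value(420, cv_percent=34.2, tqrr_percent=81.0))
print(published_value(960, cv_percent=12.0, tqrr_percent=47.5))
```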

 

Source: U.S. Census Bureau | Services | (888) 211-5946 |  Last Revised: December 18, 2013