2002 Economic Census:
Appendix C. Business Expenses Survey Methodology
Sample design | Estimation | Sampling error | Nonsampling error | Suppression
The estimates for merchant wholesale, retail trade and service industries are derived from the 2002 Business Expenses Survey (BES). The BES sample is the combination of the samples used for the 2002 Annual Trade Survey, the 2002 Annual Retail Trade Survey, and the 2002 Service Annual Survey. These samples are probability samples of firms engaged in the various industries. A firm is a business organization consisting of one or more establishments under common ownership or control. An establishment is a single physical location where business is conducted or where services are performed.
The initial sample frames for the surveys were constructed from the Census Bureau's Standard Statistical Establishment List (SSEL) as of June 1999. The sample frames contained two types of sampling units: large multiple-establishment firms and Employer Identification Numbers (EINs). Both types of sampling unit can represent one or more establishments owned or controlled by the same firm. Firms were stratified by kind-of-business and then by a measure-of-size related to their annual receipts, revenue, or sales.
The frames included only employers, and questionnaires were mailed only to employers. In the retail and service industries, sales data for nonemployers were obtained from administrative records. Expense estimates for nonemployers were derived from the administrative sales data for the nonemployers together with the sales and expense data for employers.
To reduce the variability of the estimates, the sampling units with the largest measures of size were selected "with certainty." This means they are self-representing (i.e., each has a selection probability of one and a sampling weight of one). Within each kind-of-business, a substratum boundary (or cutoff) that divides the certainty units from the noncertainty units was determined. If a unit was included in the certainty portion, the firm was the sampling unit. All firms not selected with certainty were subjected to sampling on an EIN basis.
Data from the 1997 Economic Census were analyzed to determine the certainty cutoffs, noncertainty stratum boundaries, and the sampling rates needed to achieve specified sampling variability objectives for each kind-of-business group. These sampling rates were applied to the sample frames to determine the total sample size for each group, which was then allocated to the size classes optimally based on the number of sampling units and the standard deviation of the units' measures of size. Within each noncertainty stratum, a simple random sample of EINs was selected. The sampling rates for the EINs varied between one in three and one in 1,000.
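The selection steps described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Census Bureau's production procedure: the function name `select_sample`, the frame layout (pairs of unit id and measure of size), and the cutoff and rate values are all hypothetical. Units at or above the certainty cutoff are taken with a weight of one; within each noncertainty stratum a simple random sample is drawn and each selected unit receives a weight equal to the reciprocal of its stratum's realized sampling rate.

```python
import random


def select_sample(frame, certainty_cutoff, stratum_rates, seed=2002):
    """Select a stratified sample from one kind-of-business group.

    frame: list of (unit_id, measure_of_size) pairs (hypothetical layout).
    certainty_cutoff: units with size >= this value are selected with certainty.
    stratum_rates: list of (lower_bound, sampling_rate) pairs sorted by
        lower_bound; each stratum runs up to the next lower bound, and the
        last stratum runs up to the certainty cutoff.
    Returns a list of (unit_id, sampling_weight) pairs.
    """
    rng = random.Random(seed)

    # Certainty units are self-representing: probability one, weight one.
    sample = [(uid, 1.0) for uid, size in frame if size >= certainty_cutoff]

    for i, (lower, rate) in enumerate(stratum_rates):
        if i + 1 < len(stratum_rates):
            upper = stratum_rates[i + 1][0]
        else:
            upper = certainty_cutoff
        stratum = [uid for uid, size in frame if lower <= size < upper]
        if not stratum:
            continue
        # Simple random sample without replacement; take at least one unit
        # so that a nonempty stratum is always represented.
        n = max(1, round(len(stratum) * rate))
        chosen = rng.sample(stratum, n)
        # Weight is the reciprocal of the realized selection probability n/N.
        weight = len(stratum) / n
        sample.extend((uid, weight) for uid in chosen)

    return sample


# Hypothetical frame: 100 units whose measure of size equals their id.
frame = [(i, float(i)) for i in range(1, 101)]
sample = select_sample(frame, 90.0, [(0.0, 0.1), (50.0, 0.5)])
print(len(sample), sum(w for _, w in sample))
```

Because each noncertainty weight is the stratum count divided by the number selected, the weights in the sample add up to the number of units on the frame, which is a convenient check on any implementation of this kind.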
A two-phase sample selection procedure was used for births (new EINs issued after the initial frames were created). EIN births are new EINs assigned by the Internal Revenue Service (IRS) on its latest available list of FICA (Federal Insurance Contributions Act) taxpayers. No receipts values are available for these EINs, so a large sample was drawn and canvassed to obtain a more reliable measure of size (sales or receipts) and, if needed, a more reliable kind-of-business code. Using this more reliable information, the selected births were subjected to probability-proportional-to-size sampling with overall probabilities equivalent to those used in drawing the initial sample from the 1999 SSEL.
Data on Merchant Wholesale and Retail sales are reproduced on a 2002 NAICS basis from the 2002 Economic Census. Data compiled on Merchant Wholesale and Retail merchandise purchases are the same as presented in reports from the Annual Trade Survey and Annual Retail Trade Survey, respectively. These annual data had previously been adjusted to 2002 NAICS-based sales reported in the 2002 Economic Census. Data on service industries receipts and revenue presented in this report are reproduced on a 1997 NAICS basis from the 2002 Economic Census.
All estimates are computed as the sum of weighted data (reported and imputed) for all sampling units. The weight for a sampling unit is the reciprocal of the probability of selection (or sampling rate). Wholesale, Retail, and Accommodation and Food sales and expenses are adjusted to 2002 NAICS-based Census sales for the industry by multiplying them by the ratio of sales from the 2002 Census to sales from the BES. Service revenue and expenses are adjusted to 1997 NAICS-based Census revenue for the industry by multiplying them by the ratio of revenue from the 2002 Census to revenue from the BES. This adjustment puts revenue and expenses in line with the Census figures.
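As a rough illustration of the estimation step, the following Python sketch computes a weighted total and then applies the ratio adjustment to a Census benchmark. The function names and all of the figures are hypothetical and are chosen only to make the arithmetic easy to follow; the sketch assumes the adjustment is a single ratio applied at the industry level, as the paragraph above describes.

```python
def weighted_total(values, weights):
    """Sum of weighted data (reported and imputed) over all sampling units."""
    return sum(v * w for v, w in zip(values, weights))


def adjust_to_census(bes_estimate, bes_sales, census_sales):
    """Scale a BES estimate by the ratio of Census sales to BES sales."""
    return bes_estimate * census_sales / bes_sales


# Hypothetical industry with three sampled units (values in $ millions).
weights = [1.0, 3.0, 3.0]      # one certainty unit, two sampled at 1 in 3
sales = [100.0, 20.0, 10.0]
expenses = [40.0, 8.0, 5.0]

bes_sales = weighted_total(sales, weights)        # 190.0
bes_expenses = weighted_total(expenses, weights)  # 79.0

# Hypothetical Census sales total for the same industry.
census_sales = 209.0
adjusted_expenses = adjust_to_census(bes_expenses, bes_sales, census_sales)
print(bes_sales, bes_expenses, adjusted_expenses)
```

Applying the same ratio to sales and to expenses leaves the expense-to-sales relationship measured by the survey unchanged while bringing the level into line with the Census figure.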
The sample used in this survey is one of many possible samples that could have been selected using the same sampling methodology. Each of these possible samples would likely yield different results. The Relative Standard Error (RSE), also referred to as the coefficient of variation (CV), is a measure of the variability among the estimates from these possible samples. The RSE accounts for sampling variability but does not account for nonsampling error or systematic biases in the data. Bias is the difference, averaged over all possible samples of the same design and size, between the estimate and the true value being estimated. The sample estimate and an estimate of its relative standard error can be used to estimate the standard error (SE) and then construct interval estimates with a prescribed level of confidence that the interval includes the average results of all samples. To illustrate, if all possible samples were surveyed under essentially the same conditions, and estimates calculated from each sample, then a known proportion of the intervals constructed around those estimates would include the average value of all possible samples; for example, approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above it would do so.
Thus, for a particular sample, one can say with specified confidence that the average of all possible samples is included in the constructed interval.
Example of a confidence interval. Suppose the estimated operating expenses are $4,572 million and the estimated relative standard error is 1.8 percent. Then the estimated standard error is $4,572 million × 0.018 = $82.3 million. An approximate 90-percent confidence interval is $4,572 million ± (1.6 × $82.3 million), or $4,440.3 million to $4,703.7 million.
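The same calculation can be written as a short Python function. This is a minimal sketch; the function name `confidence_interval` is an assumption, and the default multiplier of 1.6 is the value used for the approximate 90-percent interval in the example above.

```python
def confidence_interval(estimate, rse_percent, multiplier=1.6):
    """Return (standard_error, lower, upper) for an estimate and its RSE.

    estimate: the published estimate (any units).
    rse_percent: relative standard error expressed as a percent.
    multiplier: number of standard errors on each side of the estimate;
        1.6 corresponds to the approximate 90-percent interval above.
    """
    standard_error = estimate * rse_percent / 100.0
    half_width = multiplier * standard_error
    return standard_error, estimate - half_width, estimate + half_width


# Figures from the example above, in $ millions.
se, lower, upper = confidence_interval(4572.0, 1.8)
print(f"{se:.1f} {lower:.1f} {upper:.1f}")  # prints "82.3 4440.3 4703.7"
```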
Relative standard errors have not been calculated for the percent estimates shown in this report. An upper bound on the RSE of a percent can be estimated by taking the square root of the sum of the squared RSE of the numerator and the squared RSE of the denominator.
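A minimal Python sketch of this upper bound follows. The function name and the input values are hypothetical; both RSEs are assumed to be expressed in the same units (percent), and the result is in those units as well.

```python
import math


def rse_upper_bound_for_percent(rse_numerator, rse_denominator):
    """Upper bound on the RSE of a percent (numerator / denominator).

    Computed as the square root of the sum of the two squared RSEs.
    """
    return math.sqrt(rse_numerator ** 2 + rse_denominator ** 2)


# Hypothetical RSEs of 3 percent (numerator) and 4 percent (denominator).
print(rse_upper_bound_for_percent(3.0, 4.0))  # prints 5.0
```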
A description of sample design and estimation procedures can be found on the Internet for the Annual Trade Survey, the Annual Retail Trade Survey, and the Service Annual Survey.
Nonsampling errors can be attributed to many sources: inability to obtain information about all companies in the sample; inability or unwillingness on the part of respondents to provide correct information; response errors; definition difficulties; differences in the interpretation of questions; mistakes in recording or coding the data; and other errors of collection, response, coverage, and estimation for nonresponse. Explicit measures of the effects of these nonsampling errors are not available. To minimize the influence of nonsampling error, precautionary steps were taken in all phases of the collection, processing, and tabulation of the data.
A potential source of bias in the estimates is the imputation of data for nonrespondents and for data that failed the edit. Imputation is the process of replacing a missing value with administrative data or a predicted value obtained from an appropriate model for nonresponse. Nonresponse is defined as the inability to obtain all the intended measurements or responses about all selected units. Two types of nonresponse are often distinguished. Unit nonresponse describes the inability to obtain any of the substantive measurements about a sampled unit. In most cases of unit nonresponse, the questionnaire was never returned to the Census Bureau despite several attempts to elicit a response. Item nonresponse occurs when a question is unanswered or when the response to the question fails computer or analyst edits.
Estimates are withheld, or suppressed, when publication standards are not met. Suppression occurs when one or more of the following criteria are met:
Suppressed data are denoted by the publication of the character ‘S’ in the data tables.