Sampling and Estimation Methodologies
The estimates in this report are based on two stratified simple random samples. The ACE-1 sample consists of 47,818 companies with paid employees (determined by the presence of payroll) in 2006. The ACE-2 sample consists of 15,000 businesses without employees. The two sample populations received different survey forms (see Appendix D for an example of each survey form).
The survey included all private, nonfarm, domestic companies. Major exclusions from the sample frame were government-owned operations (including the U.S. Postal Service), foreign-owned operations of domestic companies, establishments located in U.S. territories, establishments engaged in agricultural production (not agricultural services), and private households.
The 2006 Business Register (BR) was used to develop the 2007 ACE-1 sample frame. The BR is the U.S. Census Bureau’s establishment-based database. The database contains records for each physical business entity with payroll located in the United States, including company ownership information and current-year administrative data. In creating the ACE-1 frame, establishment data in the BR file were consolidated to create company-level records. Employment and payroll information was obtained for each six-digit North American Industry Classification System (NAICS)¹ industry in which the company had activity. Next, payroll data for each company-level record were run through an algorithm to assign the company first to an industry sector (e.g., manufacturing or construction), then to a subsector (three-digit NAICS code), then to an industry group (four-digit NAICS code), then to an industry (five-digit NAICS code), and finally to an Annual Capital Expenditures Survey (ACES) industry code based on the industry. The resulting sample frame contained nearly 6.3 million companies.
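The level-by-level assignment described above can be illustrated with a short sketch. The report does not state the rule the algorithm applies at each level, so the sketch ASSUMES that the branch with the largest payroll wins; the function name, the payroll figures, and the use of the first two NAICS digits as the sector are illustrative simplifications (some sectors span several two-digit codes), and the final mapping to an ACES industry code is omitted.

```python
from collections import defaultdict


def assign_industry(payroll_by_naics6):
    """Return a five-digit NAICS code chosen level by level.

    payroll_by_naics6 maps six-digit NAICS codes (strings) to payroll.
    ASSUMPTION: at each level the branch with the largest total payroll
    is selected; the report does not spell out the actual rule.
    """
    prefix = ""
    # 2 digits ~ sector, 3 = subsector, 4 = industry group, 5 = industry
    for digits in (2, 3, 4, 5):
        totals = defaultdict(float)
        for code, payroll in payroll_by_naics6.items():
            if code.startswith(prefix):
                totals[code[:digits]] += payroll
        prefix = max(totals, key=totals.get)
    return prefix


# Hypothetical company with payroll in four six-digit industries.
company = {
    "236115": 400.0,
    "236220": 350.0,
    "238990": 100.0,
    "541330": 600.0,
}
print(assign_industry(company))
```

In this hypothetical case the largest single six-digit industry (541330) is not selected, because sector 23 has the larger combined payroll; that is the practical difference between a hierarchical assignment and a flat one.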
The 2007 ACE-1 sampling frame consists of a certainty portion and a noncertainty portion. The 17,688 companies with 500 or more employees were selected with certainty. The remaining companies with 1 to 499 employees were then grouped into 135 industry categories. Each industry was then further divided into four strata. Since capital expenditures data were not available on the sampling frame, 2006 payroll was used as the stratification variable. The stratification methodology resulted in minimizing the sample size subject to a desired level of reliability for each industry. The expected relative standard errors (RSEs) ranged from 1 to 3 percent.
The ACE-2 sample frame was selected from four categories of small businesses.
• Companies with no payroll and no employees on March 12 in the prior year, but with characteristics indicating possible employment during the survey period.
• Companies that had received an Employer Identification (EI) number within the last 2 years, but for which no payroll, employment, or receipts data had yet been received.
• Nonemployer corporations and partnerships.
• Nonemployer sole proprietorships with sales or receipts of $1,000 or more.
Each of these four categories was treated as a separate stratum. The source of the first two categories of businesses was the 2006 BR; the source of the last two categories was the 2006 Nonemployer Database. Companies within each stratum were selected using a simple random sample. From a universe of about 26.1 million businesses, 15,000 businesses were selected.
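As an illustration of selecting a simple random sample within each stratum, the following sketch draws without replacement from a miniature frame. The stratum labels, frame sizes, sample sizes, and the function name are hypothetical; the report does not give the per-stratum allocation of the 15,000 businesses.

```python
import random


def stratified_srs(frame, sample_sizes, seed=2007):
    """Draw a simple random sample without replacement within each stratum.

    frame:        dict mapping stratum label -> list of business ids
    sample_sizes: dict mapping stratum label -> number of businesses to draw
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        stratum: rng.sample(units, sample_sizes[stratum])
        for stratum, units in frame.items()
    }


# Hypothetical miniature frame with the four ACE-2 categories as strata.
frame = {
    "possible_employment": [f"A{i}" for i in range(40)],
    "new_ei_number": [f"B{i}" for i in range(30)],
    "nonemployer_corp_partnership": [f"C{i}" for i in range(200)],
    "nonemployer_sole_proprietor": [f"D{i}" for i in range(700)],
}
sample_sizes = {
    "possible_employment": 4,
    "new_ei_number": 3,
    "nonemployer_corp_partnership": 10,
    "nonemployer_sole_proprietor": 20,
}
sample = stratified_srs(frame, sample_sizes)
for stratum, picked in sample.items():
    print(stratum, len(picked))
```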
Each company selected for the survey has a sample weight that is the inverse of its probability of selection. All sampled companies within the same stratum and industry grouping have the same weight. Weights were increased to adjust for nonresponse. The coverage rate for all companies was 90.3 percent. The coverage rate is calculated by multiplying 100 by the ratio of the capital expenditures of all reporting companies weighted by the original sample weights, to the capital expenditures of all reporting companies weighted by the adjusted-for-nonresponse sample weights. Weight adjustment and publication estimation are described in the following subsections.
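The weight and coverage-rate definitions above can be expressed as a short sketch. The function names and the three reporting companies are hypothetical; only the two definitions (weight as the inverse of the selection probability, and coverage rate as 100 times the ratio of the two weighted sums) come from the text.

```python
def base_weight(selection_probability):
    """Sample weight is the inverse of the probability of selection."""
    return 1.0 / selection_probability


def coverage_rate(reporters):
    """Coverage rate in percent.

    reporters: list of (capital_expenditures, original_weight, adjusted_weight)
    tuples, one per reporting company.
    """
    original = sum(capex * w for capex, w, _ in reporters)
    adjusted = sum(capex * w_adj for capex, _, w_adj in reporters)
    return 100.0 * original / adjusted


# Hypothetical reporting companies: one certainty case (weight 1) and two
# noncertainty cases whose weights were raised to adjust for nonresponse.
reporters = [
    (500.0, 1.0, 1.0),
    (120.0, 10.0, 12.5),
    (40.0, 50.0, 62.5),
]
print(base_weight(0.1))
print(round(coverage_rate(reporters), 1))
```

A coverage rate below 100 percent indicates how much of the weighted total is carried by the nonresponse adjustment rather than by the original sample weights.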
Weight Adjustment
For estimation purposes, each company was placed into 1 of 4 response-related categories:
1. Respondents
2. Nonrespondents
3. Not in business
4. Known duplicates
A company was considered a respondent or nonrespondent based on whether the company provided sufficient data in items 1 or 2 of the ACE-1 survey form for the ACE-1 segment or item 1 of the ACE-2 survey form for the ACE-2 segment. Companies that went out of business prior to 2007 and duplicates were dropped from the survey. Companies that went out of business during the survey year were kept in the sample, and efforts were made to collect data for the period the company was active.
Note: A statistical procedure was used in reweighting extreme outliers to minimize the mean square error of the estimates. Mean square error accounts for both sampling variability and bias.
ACE-1 segment. The following discussion assumes 675 strata (strata designation h = 1, 2, . . ., 675), which are based on 135 industries, each containing five strata (including the certainty stratum). The original stratum weights (Wh) were adjusted to compensate for nonresponse. The adjusted weight is computed as follows:
\[ W_h(\mathrm{adj}) = W_h \times \frac{P_{hr} + P_{hn}}{P_{hr}} \]
where,
Wh(adj) is the adjusted stratum weight of the hth stratum
Wh = Nh / nh is the original stratum weight of the hth stratum
Nh is the population size of the hth stratum
nh is the sample size of the hth stratum
Phr is the sum of total company payroll for respondent companies in stratum h
Phn is the sum of total company payroll for nonrespondent companies in stratum h
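A minimal sketch of a payroll-based nonresponse adjustment built from the quantities defined above follows. It ASSUMES that the original weight Nh / nh is scaled by the ratio of total payroll (respondents plus nonrespondents) to respondent payroll; the function name and the stratum figures are hypothetical.

```python
def ace1_adjusted_weight(pop_size, sample_size, payroll_resp, payroll_nonresp):
    """Payroll-ratio nonresponse adjustment for one stratum.

    ASSUMPTION: adjusted weight = (N_h / n_h) * (P_hr + P_hn) / P_hr.
    """
    if payroll_resp <= 0:
        raise ValueError("respondent payroll must be positive")
    original_weight = pop_size / sample_size
    return original_weight * (payroll_resp + payroll_nonresp) / payroll_resp


# Hypothetical stratum: 1,000 companies on the frame, 50 sampled, with
# respondents holding 80 percent of the sampled payroll.
print(ace1_adjusted_weight(1000, 50, 8_000_000, 2_000_000))
```

Under this assumption, a stratum with no nonrespondent payroll keeps its original weight, and the adjustment grows as nonrespondents account for more of the sampled payroll.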
ACE-2 segment. The ACE-2 segment initially was stratified into four strata based on the four small business categories mentioned above. The stratum consisting of “companies with no payroll and no employees on March 12 in the prior year, but with characteristics indicating possible employment during the survey period” was poststratified into three strata. The stratum “companies that had received an Employer Identification (EI) number within the last 2 years, but for which no payroll, employment, or receipts data had yet been received” was poststratified into three strata. In both instances, the poststratification was based on updated administrative-record data that were not available at the time the sample frames were created. This method resulted in eight strata (strata designation h = 1, 2, . . ., 8). The stratum population sizes, sample sizes, response counts, and stratum weights for the six strata resulting from the poststratification were modified accordingly. For these six strata, the following formulas use these modified sizes and weights; for the remaining two strata, the formulas use the original stratum sizes and weights.
The stratum weights (Wh) were adjusted to compensate for nonresponse. The adjusted weight is computed as follows:
\[ W_h(\mathrm{adj}) = W_h \times \frac{n_h}{r_h} \]
where,
Wh(adj) is the adjusted stratum weight of the hth stratum
Wh = Nh / nh is the stratum weight of the hth stratum
Nh is the population size of the hth stratum
nh is the sample size of the hth stratum
rh is the number of respondents in the hth stratum
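A count-based nonresponse adjustment using the quantities defined above can be sketched as follows. The sketch ASSUMES that the stratum weight Nh / nh is scaled by the ratio of sampled units to respondents, nh / rh; the function name and the stratum figures are hypothetical.

```python
def ace2_adjusted_weight(pop_size, sample_size, respondents):
    """Count-ratio nonresponse adjustment for one stratum.

    ASSUMPTION: adjusted weight = (N_h / n_h) * (n_h / r_h).
    """
    if respondents <= 0:
        raise ValueError("stratum needs at least one respondent")
    stratum_weight = pop_size / sample_size
    return stratum_weight * (sample_size / respondents)


# Hypothetical stratum: 1,200 businesses on the frame, 60 sampled, 48 responded.
print(ace2_adjusted_weight(1200, 60, 48))
```

Under this assumption the adjusted weight simplifies to Nh / rh, so each respondent represents an equal share of the stratum population.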
Publication Estimation
Publication cell estimates were computed by obtaining a weighted sum of reported values for companies treated as respondents. For those strata undergoing nonresponse adjustment, the estimates for Xj are biased if nonresponse is not a purely random event, since this method assumes that it is. No attempt was made to estimate the magnitude of this bias.
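ACE-1 segment. The ACE-1 estimates were derived as follows. Each estimated cell total, \(\hat{X}_j\), is of the form
\[ \hat{X}_j = \sum_{h=1}^{675} W_h(\mathrm{adj}) \sum_{i} X_{(j),i,h} \]
where,
Wh(adj) is the adjusted weight of the hth stratum
X(j),i,h is the value attributed to the ith company of stratum h, where j is the publication cell of interest, and the inner sum runs over the companies in stratum h treated as respondents.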
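The weighted-sum estimator described above can be sketched as follows. The record layout, function name, cell labels, and figures are hypothetical; the sketch only shows that each respondent's reported value for a cell is multiplied by the adjusted weight of the stratum in which the company was sampled.

```python
def estimate_cell_total(records, adjusted_weights, cell):
    """Weighted sum of reported values for one publication cell.

    records:          list of dicts with keys 'stratum', 'respondent' (bool),
                      and 'values' (dict mapping cell label -> reported amount)
    adjusted_weights: dict mapping stratum -> adjusted stratum weight
    """
    total = 0.0
    for rec in records:
        if not rec["respondent"]:
            continue  # only companies treated as respondents contribute
        weight = adjusted_weights[rec["stratum"]]
        total += weight * rec["values"].get(cell, 0.0)
    return total


# Hypothetical data: the first company reports in two publication cells and
# both amounts are inflated by the weight of its sample stratum.
records = [
    {"stratum": 1, "respondent": True, "values": {"A": 100.0, "B": 20.0}},
    {"stratum": 1, "respondent": False, "values": {}},
    {"stratum": 2, "respondent": True, "values": {"A": 10.0}},
]
adjusted_weights = {1: 1.0, 2: 25.0}
print(estimate_cell_total(records, adjusted_weights, "A"))
print(estimate_cell_total(records, adjusted_weights, "B"))
```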
Note: Although a company was assigned to and sampled in one ACES industry, it could report expenditures in multiple ACES industries. When this occurred, the reported data for all industries were inflated by the weight in the sample industry.
ACE-2 segment. The ACE-2 estimates were derived as follows:
\[ \hat{X}_j = \sum_{h=1}^{8} W_h(\mathrm{adj}) \sum_{i} X_{(j),i,h} \]
where,
Wh(adj) is the adjusted weight of the hth stratum
X(j),i,h is the value attributed to the ith company in stratum h, where j is the publication cell of interest.
The data shown in this report are estimated from a sample and will differ from the data which would have been obtained from a complete census. Two types of possible errors are associated with estimates based on data from sample surveys: sampling errors and nonsampling errors. The accuracy of a survey result depends not only on the sampling errors and nonsampling errors measured but also on the nonsampling errors not explicitly measured. For particular estimates, the total error may considerably exceed the measured errors.
Sampling Variability
The sample used in this survey is one of many possible samples that could have been selected using the sampling methodology described earlier. Each of these possible samples would likely yield different results. The RSE is a measure of the variability among the estimates from these possible samples. The RSEs were calculated using a delete-a-group jackknife replicate variance estimator. The RSE accounts for sampling variability but does not account for nonsampling error or systematic biases in the data. Bias is the difference, averaged over all possible samples of the same design and size, between the estimate and the true value being estimated. The RSEs presented in the tables can be used to derive the Standard Error (SE) of the estimate. The SE can be used to derive interval estimates with prescribed levels of confidence that the interval includes the average results of all samples:
a. intervals defined by one SE above and below the sample estimate will contain the true value about 68 percent of the time.
b. intervals defined by 1.6 SE above and below the sample estimate will contain the true value about 90 percent of the time.
c. intervals defined by two SEs above and below the sample estimate will contain the true value about 95 percent of the time.
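The report names a delete-a-group jackknife replicate variance estimator but gives no details of its implementation. The following is a simplified sketch that ignores stratification: sampled units are assigned to groups, each replicate drops one group and reweights the remaining units by G / (G - 1), and the variance is (G - 1) / G times the sum of squared deviations of the replicate estimates from the full-sample estimate. The function name, the number of groups, and the data are hypothetical.

```python
import math


def dagjk_rse(values, weights, groups, n_groups):
    """Relative standard error (percent) of a weighted total.

    Simplified delete-a-group jackknife that ignores stratification.
    groups[i] is the group (0 .. n_groups - 1) assigned to unit i.
    """
    full = sum(w * x for w, x in zip(weights, values))
    factor = n_groups / (n_groups - 1)  # reweights the units that are kept
    squared_dev = 0.0
    for g in range(n_groups):
        replicate = sum(
            factor * w * x
            for w, x, grp in zip(weights, values, groups)
            if grp != g
        )
        squared_dev += (replicate - full) ** 2
    variance = (n_groups - 1) / n_groups * squared_dev
    return 100.0 * math.sqrt(variance) / full


# Hypothetical sample of four units split into two groups.
values = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 1.0, 1.0]
groups = [0, 1, 0, 1]
print(dagjk_rse(values, weights, groups, 2))
```

In practice units would be assigned to groups at random, and a production estimator would apply the reweighting within strata; both refinements are left out here to keep the mechanics visible.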
The SE of the estimate can be calculated by multiplying the RSE presented in the tables by the corresponding estimate. Note that the RSE is the measure of variability presented for all estimates in this publication except for the estimates of percent change presented in Table 2a, for which we provide the SE as the measure of variability (refer to Table 2b). Also note that RSEs in this publication are in percentage form. They must be divided by 100 before being multiplied by the corresponding estimate.
Examples of Calculating a Confidence Interval:
a. For a data value: using data from Tables 4a and 4c, the SE for nondurable manufacturing total capital expenditures would be calculated as follows:
\[ \mathrm{SE} = \frac{\mathrm{RSE}}{100} \times \text{estimate} = \$1{,}068 \text{ million} \]
The 90-percent confidence interval can be constructed by multiplying 1.6 by the SE, adding this value to the estimate to create the upper bound, and subtracting it from the estimate to create the lower bound.
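The two steps described above (converting an RSE in percentage form to an SE, then building a 90-percent interval with the 1.6 multiplier) can be sketched as follows. The function names are illustrative, and the estimate of 50,000 with an RSE of 2.0 is a hypothetical input; only the SE of $1,068 million comes from the example above.

```python
def standard_error(estimate, rse_percent):
    """SE = (RSE / 100) * estimate, since published RSEs are percentages."""
    return estimate * rse_percent / 100.0


def confidence_interval(estimate, se, multiplier=1.6):
    """Interval of `multiplier` SEs on either side of the estimate.

    The default of 1.6 is the multiplier the report uses for 90 percent.
    """
    half_width = multiplier * se
    return estimate - half_width, estimate + half_width


# Hypothetical estimate (in millions of dollars) and RSE (in percent).
se = standard_error(50_000.0, 2.0)
lower, upper = confidence_interval(50_000.0, se)
print(se, lower, upper)

# Half-width of the 90-percent interval for the published SE of $1,068 million.
half_width_nondurable = 1.6 * 1068.0
print(round(half_width_nondurable, 1))
```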
Examples of Calculating Absolute Differences and Percent Changes
Data for the current year along with revised data for the prior year are presented in this publication. Two numbers of interest for many data users may be the absolute difference between the prior year and the current year, and the percent change from the prior year to the current year.
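The absolute difference is calculated as the current-year estimate minus the prior-year estimate:
\[ \text{difference} = \text{current-year estimate} - \text{prior-year estimate} \]
The percent change is calculated as 100 multiplied by the ratio of the difference to the prior-year estimate:
\[ \text{percent change} = 100 \times \frac{\text{difference}}{\text{prior-year estimate}} \]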
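As an example, for finance and insurance, the total expenditures estimate for 2007 from Table 4a is $172,481 million, with an RSE of 1.0 from Table 4c, and the revised estimate for 2006 from Table 4b is $163,069 million, with an RSE of 1.7 from Table 4d. The above calculations would be:
\[ 172{,}481 - 163{,}069 = 9{,}412 \]
\[ \mathrm{SE}_{\text{difference}} = \sqrt{(0.010 \times 172{,}481)^2 + (0.017 \times 163{,}069)^2} \approx 3{,}265 \]
\[ 1.6 \times 3{,}265 \approx 5{,}224 \]
so the 90-percent confidence interval is $9,412 +/- $5,224 million, or $4,188 million to $14,636 million.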
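The finance and insurance figures can be checked with a short sketch. It ASSUMES that the SE of the difference is the square root of the sum of the two squared SEs, which treats the two yearly estimates as independent; this assumption reproduces the published half-width of $5,224 million. The function name is illustrative.

```python
import math


def difference_interval(current, rse_current, prior, rse_prior, multiplier=1.6):
    """Absolute difference, interval half-width, and interval bounds.

    ASSUMPTION: SE(difference) = sqrt(SE_current**2 + SE_prior**2),
    i.e. the two estimates are treated as independent.
    """
    se_current = current * rse_current / 100.0
    se_prior = prior * rse_prior / 100.0
    difference = current - prior
    se_difference = math.sqrt(se_current ** 2 + se_prior ** 2)
    half_width = multiplier * se_difference
    return difference, half_width, (difference - half_width, difference + half_width)


# Finance and insurance, in millions of dollars (estimates and RSEs from the text).
difference, half_width, (lower, upper) = difference_interval(
    172_481, 1.0, 163_069, 1.7
)
print(difference, round(half_width), round(lower), round(upper))
```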
Continuing with the durable goods manufacturing and finance and insurance examples from above, the percent change is 100 multiplied by the ratio of the difference to the prior-year estimate,
and a 90-percent confidence interval on this percent change is estimated as:

so the 90-percent confidence interval for durable goods manufacturing is 1.07 percent +/- 3.32 percent, or -2.25 percent to 4.39 percent. Since this interval contains zero (0), we do not have sufficient evidence to conclude that the percent change is statistically different from 0; i.e., the estimated percent change is not statistically significant at the 90-percent confidence level.
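The percent-change calculation and the zero-in-interval check can be sketched as follows. The function names are illustrative. The finance and insurance estimates and the durable goods figures (1.07 percent +/- 3.32 percent) are taken from the text; how the half-width for a percent change is derived is not reproduced here, so it is passed in as given.

```python
def percent_change(current, prior):
    """100 multiplied by the ratio of the difference to the prior estimate."""
    return 100.0 * (current - prior) / prior


def interval_contains_zero(point_estimate, half_width):
    """True when the interval point_estimate +/- half_width includes zero."""
    return point_estimate - half_width <= 0.0 <= point_estimate + half_width


# Finance and insurance: 2007 estimate versus revised 2006 estimate.
print(round(percent_change(172_481, 163_069), 2))

# Durable goods manufacturing: published percent change and half-width.
print(interval_contains_zero(1.07, 3.32))
```

When the check returns True, the estimated change cannot be distinguished from zero at the stated confidence level.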
Nonsampling Error
All surveys and censuses are subject to nonsampling errors. Nonsampling errors can be attributed to many sources, including: inability to obtain information about all companies in the sample; inability or unwillingness on the part of respondents to provide correct information; response errors; definition difficulties; differences in the interpretation of questions; mistakes in recording or coding the data; and other errors of collection, response, coverage, and estimation for nonresponse.
Explicit measures of the effects of these nonsampling errors are not available. However, to minimize nonsampling error, all reports were reviewed for reasonableness and consistency, and every effort was made to achieve accurate responses from all survey participants. Coverage errors may have a significant effect on the accuracy of estimates for this survey. The BR, which forms the basis of our survey universe frame, may not contain all businesses. Also, businesses that are contained in the BR may have their payroll misreported.
¹ North American Industry Classification System (NAICS) – United States, 2002. For sale by National Technical Information Service (NTIS), Springfield, VA 22161. Call NTIS at 1-800-553-6847 or go to www.census.gov/epcd/www/naics.html.