US Census Bureau

Annual Capital Expenditures Survey


How the Data are Collected


Sampling and Estimation Methodologies

The estimates in this report are based on two stratified simple random samples. The ACE-1 sample consists of 47,818 companies with paid employees (determined by the presence of payroll) in 2006. The ACE-2 sample consists of 15,000 businesses without employees. The two sample populations received different survey forms (see Appendix D for an example of each survey form).

The survey included all private, nonfarm, domestic companies. Major exclusions from the sample frame were government-owned operations (including the U.S. Postal Service), foreign-owned operations of domestic companies, establishments located in U.S. territories, establishments engaged in agricultural production (not agricultural services), and private households.

The 2006 Business Register (BR) was used to develop the 2007 ACE-1 sample frame. The BR is the U.S. Census Bureau’s establishment-based database. The database contains records for each physical business entity with payroll located in the United States, including company ownership information and current-year administrative data. In creating the ACE-1 frame, establishment data in the BR file were consolidated to create company-level records. Employment and payroll information was obtained for each six-digit North American Industry Classification System¹ (NAICS) industry in which the company had activity. Next, payroll data for each company-level record were run through an algorithm to assign the company first to an industry sector (e.g., manufacturing or construction), then to a subsector (three-digit NAICS code), then to an industry group (four-digit NAICS code), then to an industry (five-digit NAICS code), and finally to an Annual Capital Expenditures Survey (ACES) industry code based on the industry. The resulting sample frame contained nearly 6.3 million companies.
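The top-down assignment described above can be sketched as follows. This is only an illustration under a loudly stated assumption: the text does not say which rule the algorithm applies at each level, so the sketch assumes "keep the code with the largest payroll" at every step. The function name `assign_industry` and the toy payroll figures are hypothetical, and treating the first two NAICS digits as the sector is a simplification (some sectors span several two-digit codes).

```python
from collections import defaultdict

def assign_industry(payroll_by_naics6, levels=(2, 3, 4, 5)):
    """Assign a company to a NAICS code level by level.

    payroll_by_naics6: dict mapping six-digit NAICS code (str) -> payroll.
    ASSUMPTION: at each level the prefix with the largest payroll wins;
    the source does not state the actual rule.
    """
    prefix = ""
    for digits in levels:
        totals = defaultdict(float)
        for code, payroll in payroll_by_naics6.items():
            # Only codes consistent with the choices made so far compete.
            if code.startswith(prefix):
                totals[code[:digits]] += payroll
        prefix = max(totals, key=totals.get)
    return prefix

# Hypothetical company active in three six-digit industries.
toy_company = {"311111": 40.0, "311211": 30.0, "423110": 60.0}
toy_assignment = assign_industry(toy_company)
```

In this toy case the largest single six-digit activity is in 423110, yet the company is assigned within sector 31 because that sector has more payroll in total, which is the practical effect of assigning sector first and detail later.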

The 2007 ACE-1 sampling frame consists of a certainty portion and a noncertainty portion. The 17,688 companies with 500 or more employees were selected with certainty. The remaining companies with 1 to 499 employees were then grouped into 135 industry categories. Each industry was then further divided into four strata. Since capital expenditures data were not available on the sampling frame, 2006 payroll was used as the stratification variable. The stratification methodology resulted in minimizing the sample size subject to a desired level of reliability for each industry. The expected relative standard errors (RSEs) ranged from 1 to 3 percent.

The ACE-2 sample frame was selected from four categories of small businesses.

• Companies with no payroll and no employees on March 12 in the prior year, but with characteristics indicating possible employment during the survey period.

• Companies that had received an Employer Identification (EI) number within the last 2 years, but for which no payroll, employment, or receipts data had yet been received.

• Nonemployer corporations and partnerships.

• Nonemployer sole proprietorships with sales or receipts of $1,000 or more.

Each of these four categories was treated as a separate stratum. The source of the first two categories of businesses was the 2006 BR; the source of the last two categories was the 2006 Nonemployer Database. Within each stratum, companies were selected by simple random sampling. From a universe of about 26.1 million businesses, 15,000 businesses were selected.
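Selection within strata can be sketched as follows. This is a minimal illustration, not the Census Bureau's production procedure: the function name `select_stratified_sample`, the toy frame, the stratum labels, and the fixed seed are all assumptions. The design weight N_h / n_h used here is the usual inverse selection probability for a simple random sample within a stratum.

```python
import random

def select_stratified_sample(frame, sample_sizes, seed=0):
    """Draw a simple random sample without replacement in each stratum.

    frame: dict mapping stratum label -> list of unit identifiers (hypothetical layout)
    sample_sizes: dict mapping stratum label -> number of units to draw
    Returns a dict: stratum -> (sampled units, design weight N_h / n_h).
    """
    rng = random.Random(seed)
    result = {}
    for stratum, units in frame.items():
        n_h = sample_sizes[stratum]
        sampled = rng.sample(units, n_h)  # sampling without replacement
        result[stratum] = (sampled, len(units) / n_h)
    return result

# Toy frame with two hypothetical strata.
toy_frame = {
    "new_ein": [f"A{i}" for i in range(200)],
    "sole_prop": [f"B{i}" for i in range(1000)],
}
toy_sample = select_stratified_sample(toy_frame, {"new_ein": 20, "sole_prop": 50})
```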


Estimation


Each company selected for the survey has a sample weight that is the inverse of its probability of selection. All sampled companies within the same stratum and industry grouping have the same weight. Weights were increased to adjust for nonresponse. The coverage rate for all companies was 90.3 percent. The coverage rate is calculated by multiplying 100 by the ratio of the capital expenditures of all reporting companies weighted by the original sample weights, to the capital expenditures of all reporting companies weighted by the adjusted-for-nonresponse sample weights. Weight adjustment and publication estimation are described in the following subsections.
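The coverage rate defined above is a ratio of two weighted sums over the same reporting companies. A minimal sketch, with a hypothetical function name and made-up toy values:

```python
def coverage_rate(reported, original_weights, adjusted_weights):
    """Coverage rate in percent.

    reported: capital expenditures of each reporting company
    original_weights: original sample weight of each reporting company
    adjusted_weights: nonresponse-adjusted weight of each reporting company
    """
    numerator = sum(w * x for w, x in zip(original_weights, reported))
    denominator = sum(w * x for w, x in zip(adjusted_weights, reported))
    return 100.0 * numerator / denominator

# Toy data: two companies in an adjusted stratum, one in a certainty stratum.
toy_rate = coverage_rate(
    reported=[100.0, 50.0, 200.0],
    original_weights=[10.0, 10.0, 4.0],
    adjusted_weights=[12.5, 12.5, 4.0],
)
```

Because adjusted weights are never smaller than the original weights, the rate cannot exceed 100 percent; it equals 100 percent only when no weight needed adjustment.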

Weight Adjustment

For estimation purposes, each company was placed into 1 of 4 response-related categories:

1. Respondents

2. Nonrespondents

3. Not in business

4. Known duplicates

A company was considered a respondent or nonrespondent based on whether the company provided sufficient data in items 1 or 2 of the ACE-1 survey form for the ACE-1 segment or item 1 of the ACE-2 survey form for the ACE-2 segment. Companies that went out of business prior to 2007 and duplicates were dropped from the survey. Companies that went out of business during the survey year were kept in the sample, and efforts were made to collect data for the period the company was active.

Note: A statistical procedure was used in reweighting extreme outliers to minimize the mean square error of the estimates. Mean square error accounts for both sampling variability and bias.

ACE-1 segment. The following discussion assumes 675 strata (strata designation h = 1, 2, . . ., 675) which are based on 135 industries, each containing five strata (including the certainty stratum). The original stratum weights (Wh) were adjusted to compensate for nonresponse. The adjusted weight is computed as follows:

\[ W_{h(\mathrm{adj})} = W_h \times \frac{P_{hr} + P_{hn}}{P_{hr}}, \qquad W_h = \frac{N_h}{n_h} \]


where,

Wh(adj) is the adjusted stratum weight of the hth stratum

Wh = Nh / nh is the original stratum weight of the hth stratum

Nh is the population size of the hth stratum

nh is the sample size of the hth stratum

Phr is the sum of total company payroll for respondent companies in stratum h

Phn is the sum of total company payroll for nonrespondent companies in stratum h
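A small numerical sketch of the ACE-1 adjustment follows. It assumes the adjustment inflates the original weight Wh = Nh / nh by the ratio of total sampled payroll (respondents plus nonrespondents) to respondent payroll, which is the reading consistent with the symbols defined above. The function name and the toy stratum are hypothetical.

```python
def ace1_adjusted_weight(N_h, n_h, payroll_respondents, payroll_nonrespondents):
    """Payroll-based nonresponse adjustment for one stratum.

    ASSUMPTION: W_h(adj) = W_h * (P_hr + P_hn) / P_hr with W_h = N_h / n_h.
    """
    W_h = N_h / n_h
    P_hr = sum(payroll_respondents)
    P_hn = sum(payroll_nonrespondents)
    if P_hr <= 0:
        raise ValueError("stratum has no respondent payroll to adjust with")
    return W_h * (P_hr + P_hn) / P_hr

# Toy stratum: 1,000 companies on the frame, 50 sampled,
# respondents hold 800 of the 1,000 units of sampled payroll.
toy_ace1_weight = ace1_adjusted_weight(1000, 50, [300.0, 500.0], [200.0])
```

Under this reading a stratum in which every sampled company responds keeps its original weight, and a certainty stratum with full response keeps a weight of 1.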

ACE-2 segment. The ACE-2 segment initially was stratified into four strata based on the four small business categories mentioned above. The stratum consisting of ‘‘companies with no payroll and no employees on March 12 in the prior year, but with characteristics indicating possible employment during the survey period’’ was poststratified into three strata. The stratum ‘‘companies which had received an Employer Identification (EI) number within the last 2 years, but for which no payroll, employment, or receipts data had yet been received’’ was poststratified into three strata. In both instances, the poststratification was based on updated administrative-record data that were not available at the time the sample frames were created. This method resulted in eight strata (strata designation h = 1, 2, . . ., 8). The stratum population sizes, sample sizes, response counts, and stratum weights for the six strata resulting from the poststratification were modified accordingly. For these six strata, the following formulas use these modified sizes and weights; for the remaining two strata, the formulas use the original stratum sizes and weights.

The stratum weights (Wh) were adjusted to compensate for nonresponse. The adjusted weight is computed as follows:

\[ W_{h(\mathrm{adj})} = W_h \times \frac{n_h}{r_h}, \qquad W_h = \frac{N_h}{n_h} \]


where,

Wh(adj) is the adjusted stratum weight of the hth stratum

Wh = Nh / nh is the stratum weight of the hth stratum

Nh is the population size of the hth stratum

nh is the sample size of the hth stratum

rh is the number of respondents in the hth stratum
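For ACE-2 no payroll is available, so the symbols above suggest a count-based adjustment. The sketch below assumes the stratum weight is scaled by nh / rh, the ratio of sampled to responding companies; the function name and the toy counts are hypothetical.

```python
def ace2_adjusted_weight(N_h, n_h, r_h):
    """Count-based nonresponse adjustment for one ACE-2 stratum.

    ASSUMPTION: W_h(adj) = W_h * n_h / r_h with W_h = N_h / n_h,
    which simplifies to N_h / r_h.
    """
    if r_h <= 0:
        raise ValueError("stratum has no respondents")
    W_h = N_h / n_h
    return W_h * n_h / r_h

# Toy stratum: 26,000 businesses on the frame, 100 sampled, 80 responded.
toy_ace2_weight = ace2_adjusted_weight(26000, 100, 80)
```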


Publication Estimation

Publication cell estimates were computed by obtaining a weighted sum of reported values for companies treated as respondents. For those strata undergoing nonresponse adjustment, the estimates for Xj are biased to the extent that nonresponse is not a purely random event, since the adjustment treats respondents as representative of nonrespondents. No attempt was made to estimate the magnitude of this bias.

ACE-1 segment. The ACE-1 estimates were derived as follows. Each estimated cell total is of the form

\[ \hat{X}_j = \sum_{h=1}^{675} W_{h(\mathrm{adj})} \sum_{i} X_{(j),i,h} \]

where,

Wh(adj) is the adjusted weight of the hth stratum

X(j),i,h is the value attributed to the ith company of stratum h, where j is the publication cell of interest

and the inner sum runs over the companies in stratum h that were treated as respondents.

Note: Although a company was assigned to and sampled in one ACES industry, it could report expenditures in multiple ACES industries. When this occurred, the reported data for all industries were inflated by the weight in the sample industry.
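The weighted-sum estimator, including the multi-industry case in the note above, can be sketched as follows. The record layout, the function name `publication_estimates`, and the toy values are assumptions made for illustration only.

```python
from collections import defaultdict

def publication_estimates(respondents):
    """Weighted sum of reported values per publication cell.

    respondents: list of dicts with keys
      "adjusted_weight": nonresponse-adjusted weight of the company's sample stratum
      "reported": dict mapping publication cell -> reported value
    Every value a company reports is inflated by the weight of the stratum
    in which it was sampled, whichever cell the value falls into.
    """
    totals = defaultdict(float)
    for company in respondents:
        for cell, value in company["reported"].items():
            totals[cell] += company["adjusted_weight"] * value
    return dict(totals)

toy_respondents = [
    {"adjusted_weight": 25.0, "reported": {"construction": 10.0}},
    # Sampled in one industry but reporting in two.
    {"adjusted_weight": 25.0, "reported": {"construction": 4.0, "retail": 2.0}},
    # Certainty company with weight 1.
    {"adjusted_weight": 1.0, "reported": {"retail": 300.0}},
]
toy_totals = publication_estimates(toy_respondents)
```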

ACE-2 segment. The ACE-2 estimates were derived as follows:

\[ \hat{X}_j = \sum_{h=1}^{8} W_{h(\mathrm{adj})} \sum_{i} X_{(j),i,h} \]

where,

Wh(adj) is the adjusted weight of the hth stratum

X(j),i,h is the value attributed to the ith company in stratum h, where j is the publication cell of interest (Note: since no industry-level estimates are derived for ACE-2 companies, this j will always represent a national-level cell estimate.)

and the inner sum runs over the companies in stratum h that were treated as respondents.

Reliability of the Estimates


The data shown in this report are estimated from a sample and will differ from the data which would have been obtained from a complete census. Two types of possible errors are associated with estimates based on data from sample surveys: sampling errors and nonsampling errors. The accuracy of a survey result depends not only on the sampling errors and nonsampling errors measured but also on the nonsampling errors not explicitly measured. For particular estimates, the total error may considerably exceed the measured errors.

Sampling Variability

The sample used in this survey is one of many possible samples that could have been selected using the sampling methodology described earlier. Each of these possible samples would likely yield different results. The RSE is a measure of the variability among the estimates from these possible samples. The RSEs were calculated using a delete-a-group jackknife replicate variance estimator. The RSE accounts for sampling variability but does not account for nonsampling error or systematic biases in the data. Bias is the difference, averaged over all possible samples of the same design and size, between the estimate and the true value being estimated. The RSEs presented in the tables can be used to derive the Standard Error (SE) of the estimate. The SE can be used to derive interval estimates with prescribed levels of confidence that the interval includes the average results of all samples:

a. intervals defined by one SE above and below the sample estimate will contain the true value about 68 percent of the time.

b. intervals defined by 1.6 SE above and below the sample estimate will contain the true value about 90 percent of the time.

c. intervals defined by two SEs above and below the sample estimate will contain the true value about 95 percent of the time.

The SE of the estimate can be calculated by multiplying the RSE presented in the tables by the corresponding estimate. Note that the RSE is the measure of variability presented for all estimates in this publication except for the estimates of percent change presented in Table 2a, for which we provide the SE as the measure of variability (refer to Table 2b). Also note that RSEs in this publication are in percentage form; they must be divided by 100 before being multiplied by the corresponding estimate.

Examples of Calculating a Confidence Interval:


a. For a data value: using data from Tables 4a and 4c, the SE for nondurable manufacturing total capital expenditures would be calculated as follows:

SE = (1.2 / 100) × $88,964 million = $1,068 million

The 90-percent confidence interval can be constructed by multiplying 1.6 by the SE, adding this value to the estimate to create the upper bound, and subtracting it from the estimate to create the lower bound.

\[ \hat{X} \pm 1.6 \times SE(\hat{X}) \]


Using data from Table 4a, for nondurable manufacturing total capital expenditures, a 90-percent confidence interval would be calculated as:

$88,964 million ± [1.6 × ($1,068 million)] = $88,964 million ± $1,709 million

This implies 90 percent confidence that the interval $87,255 million to $90,673 million contains the actual total for nondurable manufacturing capital expenditures, subject to further nonsampling errors.

b. For percent change: using data from Tables 2a and 2b, the 90-percent confidence interval can be constructed by multiplying 1.6 by the SE of the percent change, adding this value to the estimated percent change to create the upper bound, and subtracting it from the estimate to create the lower bound. For example, for the Health care and social assistance sector, the estimated percent change from 2006 to 2007 is 11.2 percent (from Table 2a), and the standard error of this estimate is 2.7 percent (from Table 2b), so the interval is:

11.2 percent ± [1.6 × (2.7 percent)] = 11.2 percent ± 4.3 percent


This implies 90 percent confidence that the interval 6.9 percent to 15.5 percent contains the actual percent change for Health care and social assistance.
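Both worked examples follow the same recipe, which can be checked with a few lines of code. The function names are hypothetical; the multiplier of 1.6 and the input figures are the ones quoted in the examples above.

```python
def standard_error(estimate, rse_percent):
    """Convert a published RSE (in percent) into a standard error."""
    return (rse_percent / 100.0) * estimate

def confidence_interval_90(estimate, se, multiplier=1.6):
    """Return (lower, upper) bounds: estimate -/+ multiplier * SE."""
    margin = multiplier * se
    return estimate - margin, estimate + margin

# a. Total: nondurable manufacturing, in millions of dollars.
total_low, total_high = confidence_interval_90(88964, 1068)

# b. Percent change: health care and social assistance, in percent.
pct_low, pct_high = confidence_interval_90(11.2, 2.7)
```

Rounded to the precision used in the text, the bounds agree with the intervals stated in examples a and b.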

Examples of Calculating Absolute Differences and Percent Changes


Data for the current year along with revised data for the prior year are presented in this publication. Two numbers of interest for many data users may be the absolute difference between the prior year and the current year, and the percent change from the prior year to the current year.

The absolute difference is calculated as:

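\[ \hat{D} = \hat{X}_{\mathrm{current}} - \hat{X}_{\mathrm{prior}} \]

and a 90-percent confidence interval on this difference is estimated as:

\[ \hat{D} \pm 1.6 \times \sqrt{SE(\hat{X}_{\mathrm{current}})^2 + SE(\hat{X}_{\mathrm{prior}})^2} \]

where \hat{X} denotes an estimated total and SE(\hat{X}) its standard error.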


As an example, for durable goods manufacturing, from Table 4a the total expenditures estimate for 2007 is $107,989 million with the RSE found in Table 4c as 1.4, and for 2006 the revised estimate from Table 4b is $106,843 million with the RSE found in Table 4d as 1.5. The above calculations would be:

$107,989 million − $106,843 million = $1,146 million

And the margin for the 90-percent confidence interval is estimated as:

1.6 × sqrt[(0.014 × $107,989)² + (0.015 × $106,843)²]
= 1.6 × $2,203.21
= $3,525.14 = $3,525 million

so the 90-percent confidence interval is $1,146 million ± $3,525 million, or -$2,379 million to $4,671 million. Since this interval contains zero (0), we do not have sufficient evidence to conclude that the difference is statistically different from 0; that is, the estimated difference is not statistically significant at the 90-percent confidence level.

As a second example, for finance and insurance, from Table 4a the total expenditures estimate for 2007 is $172,481 million with the RSE found in Table 4c as 1.0, and for 2006 the revised estimate from Table 4b is $163,069 million with the RSE found in Table 4d as 1.7. The absolute difference would be:

$172,481 million − $163,069 million = $9,412 million


And the 90-percent confidence interval is estimated as:

1.6 × sqrt[(0.010 × $172,481)² + (0.017 × $163,069)²]
= 1.6 × $3,264.95
= $5,223.92 = $5,224 million

so the 90-percent confidence interval is $9,412 +/- $5,224 million, or $4,188 million to $14,636 million.
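The two difference examples can be reproduced with the short sketch below. It combines the two standard errors as the square root of the sum of their squares, which treats the two years' estimates as independent and matches the intermediate values $2,203.21 and $3,264.95 quoted above. The function name is hypothetical.

```python
import math

def difference_interval(current, rse_current, prior, rse_prior, multiplier=1.6):
    """Return (difference, margin) for a year-to-year change in a total.

    RSEs are given in percent, as published; estimates in millions of dollars.
    """
    se_current = rse_current / 100.0 * current
    se_prior = rse_prior / 100.0 * prior
    difference = current - prior
    # Standard error of a difference of two independent estimates.
    margin = multiplier * math.sqrt(se_current ** 2 + se_prior ** 2)
    return difference, margin

durable_diff, durable_margin = difference_interval(107989, 1.4, 106843, 1.5)
finance_diff, finance_margin = difference_interval(172481, 1.0, 163069, 1.7)

# An interval that excludes zero indicates a significant change at this level.
durable_significant = abs(durable_diff) > durable_margin
finance_significant = abs(finance_diff) > finance_margin
```

With these inputs the durable goods interval straddles zero while the finance and insurance interval lies entirely above zero.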

The percent change is calculated as 100 multiplied by the ratio of the difference divided by the prior estimate. So continuing with both durable goods manufacturing and finance and insurance examples from above,

100 × ($1,146 / $106,843) = 1.07 percent

and the margin for a 90-percent confidence interval on this percent change is estimated as:

1.6 × ($107,989 / $106,843) × sqrt[(1.4)² + (1.5)²] = 3.32 percent

so the 90-percent confidence interval for durable goods manufacturing is 1.07 percent +/- 3.32 percent or -2.25 percent to 4.39 percent. Since this interval contains zero (0), we do not have sufficient evidence to conclude that the percent change is statistically different from 0; that is, the estimated percent change is not statistically significant at the 90-percent confidence level.

100 × ($9,412 / $163,069) = 5.77 percent

and the margin for a 90-percent confidence interval on this percent change is estimated as:

1.6 × ($172,481 / $163,069) × sqrt[(1.0)² + (1.7)²] = 3.34 percent

so the 90-percent confidence interval for finance and insurance is 5.77 percent +/- 3.34 percent or 2.43 percent to 9.11 percent.
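The percent-change margins can be reproduced with the sketch below. It assumes a first-order approximation in which the ratio of the two totals is multiplied by the square root of the sum of the squared RSEs, with the two years treated as independent. This form is an assumption on our part, chosen because it matches the 3.32 and 3.34 percent margins stated above. The function name is hypothetical.

```python
import math

def percent_change_interval(current, rse_current, prior, rse_prior, multiplier=1.6):
    """Return (percent change, margin), both in percent.

    ASSUMPTION: SE(percent change) ~ (current / prior) * sqrt(RSE_cur^2 + RSE_prior^2),
    with RSEs already expressed in percent.
    """
    pct_change = 100.0 * (current - prior) / prior
    margin = multiplier * (current / prior) * math.sqrt(rse_current ** 2 + rse_prior ** 2)
    return pct_change, margin

durable_pct, durable_pct_margin = percent_change_interval(107989, 1.4, 106843, 1.5)
finance_pct, finance_pct_margin = percent_change_interval(172481, 1.0, 163069, 1.7)
```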

Nonsampling Error

All surveys and censuses are subject to nonsampling errors. Nonsampling errors can be attributed to many sources, including: inability to obtain information about all companies in the sample; inability or unwillingness on the part of respondents to provide correct information; response errors; definition difficulties; differences in the interpretation of questions; mistakes in recording or coding the data; and other errors of collection, response, coverage, and estimation for nonresponse.

Explicit measures of the effects of these nonsampling errors are not available. However, to minimize nonsampling error, all reports were reviewed for reasonableness and consistency, and every effort was made to achieve accurate responses from all survey participants. Coverage errors may have a significant effect on the accuracy of estimates for this survey. The BR, which forms the basis of our survey universe frame, may not contain all businesses. Also, businesses that are contained in the BR may have their payroll misreported.

¹ North American Industry Classification System (NAICS) – United States, 2002. For sale by National Technical Information Service (NTIS), Springfield, VA 22161. Call NTIS at 1-800-553-6847 or go to www.census.gov/epcd/www/naics.html.


Source: U.S. Census Bureau | Annual Capital Expenditures | (301) 763-3324 |  Last Revised: June 29, 2009