
**Appendix C. Sampling and Estimation Methodologies**

The estimates are based on two distinct stratified simple random samples. The first sample, which received the ACE-1 form, consists of 46,427 companies with paid employees, as determined by nonzero payroll in the previous year, 2007. The second sample, which received the ACE-2 form, consists of 14,981 businesses without paid employees. Appendix D has examples of each type of survey form.

The survey's scope includes all private, nonfarm, domestic companies. Major exclusions from the frame are government-owned operations (including the U.S. Postal Service), foreign-owned operations of domestic companies, establishments located in U.S. territories, establishments engaged in agricultural production (agricultural services remain in scope), and private households.

The 2007 final version of the Census Bureau's establishment-based database, the Business Register (BR), was used to develop the 2008 sampling frame. This database contains records for each physical business entity (the establishment) with payroll located in the United States. Records include company ownership information and current-year administrative data, such as payroll.

In creating the ACE-1 frame, establishment data are consolidated to create company-level records for companies that have more than one establishment. This created a frame of slightly more than 6 million companies. To create business activity classifications, the employment and payroll data for each establishment in a company are grouped by the establishment's assigned 2002 six-digit North American Industry Classification System^{1} (NAICS) industry. The company is then assigned to the industry sector that has the most payroll (e.g., manufacturing or construction), then to the subsector with the most payroll within that sector, the industry group within that subsector, and finally the industry within that industry group. The 2002 NAICS industry assigned to the company is then recoded to an Annual Capital Expenditures Survey (ACES) code.
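The payroll-based drill-down described above can be sketched in Python. This is a minimal illustration, not the Census Bureau's production procedure: the function name `assign_industry`, the use of NAICS code prefixes of length 2, 3, 4, and 6 to stand in for sector, subsector, industry group, and industry, and the payroll figures are all assumptions made for the example.

```python
from collections import defaultdict

def assign_industry(establishments):
    """Pick a company's six-digit NAICS code by drilling down on payroll.

    `establishments` is a list of (naics_code, payroll) pairs, one per
    establishment. Prefix lengths 2, 3, 4, 6 approximate sector, subsector,
    industry group, and industry (a simplification: some real sectors,
    such as manufacturing, span several two-digit codes).
    """
    candidates = establishments
    for prefix_len in (2, 3, 4, 6):
        payroll_by_prefix = defaultdict(float)
        for code, payroll in candidates:
            payroll_by_prefix[code[:prefix_len]] += payroll
        # Keep only the establishments under the prefix with the most payroll.
        best = max(payroll_by_prefix, key=payroll_by_prefix.get)
        candidates = [(c, p) for c, p in candidates if c.startswith(best)]
    return candidates[0][0]

# Hypothetical company with three establishments (payroll in arbitrary units).
company = [("311111", 30.0), ("311812", 35.0), ("236220", 40.0)]
print(assign_industry(company))  # 311812
```

In this example, 236220 has the largest payroll of any single industry, but the company is classified in 311812 because the comparison starts at the sector level, where the two food manufacturing establishments together outweigh construction.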

The 2008 ACE-1 sampling frame is partitioned into two portions: the certainty and noncertainty parts. The certainty portion is a group of 17,916 companies that had 500 or more employees in the frame year. These are all placed in the sample. The nearly 6 million remaining companies have between 1 and 499 employees and are stratified into 1 of 135 ACES industry categories. Each ACES industry is subdivided into four substrata based on diminishing 2007 value of payroll. The methodology used to create the substrata minimizes the sample size subject to a desired level of relative reliability. Samples are chosen from each of the 4 substrata within each of these 135 ACES strata. Collectively, 28,511 companies were chosen from this part of the ACE-1 frame.

The ACE-2 sampling frame is a composite of four categories of small businesses, each treated as an independent stratum. The 2007 BR is the source of the first two groups: companies without payroll in the prior year or employment on March 12 of the prior year that had paid employees in the past and some IRS activity in the last 5 years, and companies that applied for an employer identification number (EIN) in the last 2 years but still have no payroll or employment. A special 2007 nonemployer database is the source of the other two groups: nonemployer corporations and partnerships, and nonemployer sole proprietorships with receipts of $1,000 or more. Collectively, there were about 29.0 million nonemployers. A simple random sample was taken from each group, with sample sizes differing by group, resulting in a total sample of 14,981 companies.

^{1}Information about NAICS can be found at http://www.census.gov/eos/www/naics/

The quality measure called the sampling unit response rate is the percentage of all mailed eligible companies that responded, which is 75.8%. Not all companies are equally important to the estimate, however. Each sampled company has a sampling weight that reflects the unselected companies it represents in the population. Sampled companies in the same substratum have identical weights, which range from one (the company represents only itself) to several thousand. Respondents' weights are further increased to account for companies that would have been represented by nonrespondents. Final estimates use these increased weights. The coverage rate is a quality measure defined as the percentage of the estimate of total capital expenditures that comes from respondents using only their original sampling weights. The coverage rate for ACES was 90.3%. The two quality measures differ because, while many companies did not report to the survey, they are not as influential in creating the estimate as many that did report.
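To illustrate why the two quality measures can differ, the sketch below computes both on made-up data. It rests on one reading of the definition above: the coverage rate is taken as the respondents' total weighted by original sampling weights, divided by the same total weighted by nonresponse-adjusted weights. The function names and all figures are hypothetical.

```python
def unit_response_rate(n_respondents, n_eligible):
    """Percentage of mailed eligible companies that responded."""
    return 100.0 * n_respondents / n_eligible

def coverage_rate(respondents):
    """Share of the estimate carried by original sampling weights.

    `respondents` is a list of (capex, sampling_weight, adjusted_weight).
    Assumed reading: numerator uses the original sampling weights and the
    denominator uses the nonresponse-adjusted weights.
    """
    covered = sum(x * w for x, w, _ in respondents)
    total = sum(x * w_adj for x, _, w_adj in respondents)
    return 100.0 * covered / total

# Hypothetical sample: one large certainty company (weight 1) and one small
# responding company whose weight grows from 100 to 150 to cover a
# small nonrespondent.
sample = [(9000.0, 1.0, 1.0), (10.0, 100.0, 150.0)]
print(round(unit_response_rate(2, 3), 1))
print(round(coverage_rate(sample), 1))
```

With these figures, one of three companies did not respond, yet most of the estimate still comes from directly represented capital expenditures because the large company reported.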

Sampling Weights and Weight Adjustment for Nonresponse

After each sampled company is given an initial sampling weight, the weight may be further adjusted based on activity and response status. The goal is to have the in-scope responding sample reflect the frame. Each sampled company becomes either a respondent, a nonrespondent, or out of scope; a company is out of scope if it is found to have been out of business prior to the survey year or to be a duplicate of another record. Companies that went out of business during the survey year are still in scope, and efforts are made to collect data for the period the company was active.

A company is a respondent if it returns a report and reports nonzero amounts for item 111 (Capital Expenditures) or item 2 (more detailed Capital Expenditures) on the ACE-1 form, or item 1 (Capital Expenditures) of the ACE-2 form. Respondents have their sampling weights adjusted upward to account for the nonrespondents, so that the respondents still represent the entire in-scope population. The adjustment for ACE-1 respondents is based on the payroll that nonrespondents account for in each ACES industry substratum, while for ACE-2 respondents it is based solely on the percentage of companies not reporting, regardless of size. In addition, companies that are deemed 'extreme outliers' may have their weights reduced to minimize the mean squared error of the estimates.

ACE-1 segment. The following discussion assumes 675 substrata (substratum designation h = 1, 2, . . ., 675), which are based on the 135 ACES industries, each containing five strata (four noncertainty strata and the certainty stratum). The sampling weights (W_{h}) are adjusted for nonresponse based on payroll:

W_{h(adj)} = W_{h} * (P_{hr} + P_{hn}) / P_{hr}, with W_{h} = N_{h} / n_{h}

where,

W_{h(adj)}: adjusted substratum weight of the h^{th} substratum

W_{h}: substratum sampling weight of the h^{th} substratum

N_{h}: population size of the h^{th} substratum

n_{h}: sample size of the h^{th} substratum

P_{hr}: sum of total company payroll for respondents in substratum h

P_{hn}: sum of total company payroll for nonrespondents in substratum h
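As a small worked illustration of the payroll-based adjustment, the Python sketch below assumes the adjusted weight is the sampling weight N_{h} / n_{h} multiplied by the ratio of total payroll (respondents plus nonrespondents) to respondent payroll, which is the form implied by the symbols defined above. The function name and the input figures are hypothetical.

```python
def ace1_adjusted_weight(N_h, n_h, P_hr, P_hn):
    """Payroll-based nonresponse adjustment for one ACE-1 substratum.

    Assumed form: W_h(adj) = (N_h / n_h) * (P_hr + P_hn) / P_hr.
    """
    if P_hr <= 0:
        raise ValueError("respondent payroll must be positive")
    W_h = N_h / n_h  # substratum sampling weight
    return W_h * (P_hr + P_hn) / P_hr

# Hypothetical substratum: 1,000 companies on the frame, 50 sampled,
# respondents hold 800 of the 1,000 units of sampled payroll.
print(ace1_adjusted_weight(N_h=1000, n_h=50, P_hr=800.0, P_hn=200.0))  # 25.0
```

The sampling weight of 20 rises to 25 because respondents account for only 80 percent of the sampled payroll in the substratum.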

ACE-2 segment. The ACE-2 segment initially was stratified into four strata based on the four small business categories mentioned above. The stratum consisting of "companies with no payroll in the prior year and no employees on March 12 in the prior year, but with payroll in previous years" was poststratified into two strata. The stratum "companies which had received an Employer Identification Number (EIN) within the last 2 years, but for which no payroll, employment, or receipts data have yet been received" was poststratified into two strata. In both instances, the poststratification was based on updated administrative record data. This method resulted in six strata (strata designation h = 1, 2, . . ., 6). The stratum population sizes, sample sizes, response counts, and stratum weights for the four new strata resulting from the poststratification were modified accordingly, while the other two strata retained the original weights.

The ACE-2 stratum weights (W_{h}) were also adjusted to compensate for nonresponse, based on the number of respondents:

W_{h(adj)} = W_{h} * (n_{h} / r_{h}), with W_{h} = N_{h} / n_{h}

where,

W_{h(adj)}: adjusted stratum weight of the h^{th} stratum

W_{h}: stratum weight of the h^{th} stratum

N_{h}: population size of the h^{th} stratum

n_{h}: sample size of the h^{th} stratum

r_{h}: number of respondents in the h^{th} stratum
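The count-based adjustment can be sketched the same way. The Python snippet below assumes the adjusted weight is the stratum weight N_{h} / n_{h} multiplied by the ratio of sample size to respondent count, the form suggested by the symbols defined above. The function name and the numbers are hypothetical.

```python
def ace2_adjusted_weight(N_h, n_h, r_h):
    """Count-based nonresponse adjustment for one ACE-2 stratum.

    Assumed form: W_h(adj) = (N_h / n_h) * (n_h / r_h).
    """
    if r_h <= 0:
        raise ValueError("stratum needs at least one respondent")
    W_h = N_h / n_h  # stratum sampling weight
    return W_h * n_h / r_h

# Hypothetical stratum: 12,000 businesses, 3,000 sampled, 2,400 responded.
print(ace2_adjusted_weight(N_h=12000, n_h=3000, r_h=2400))  # 5.0
```

Under this form the adjusted weight reduces to N_{h} / r_{h}, so each respondent stands in for an equal share of the stratum regardless of its size.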

Publication Estimation

Publication cell estimates were computed as a weighted sum of reported values for respondents. These estimates may be biased by the nonresponse adjustment, since it is assumed that nonresponse is a purely random event, which it may not be. No attempt is made to measure this bias.

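ACE-1 Estimation: The ACE-1 estimates, X̂_{(j)}, are (assuming 675 substrata):

X̂_{(j)} = Σ_{h=1}^{675} W_{h(adj)} * Σ_{i} X_{(j),i,h}

where the inner sum runs over the responding companies i in substratum h, and

W_{h(adj)}: adjusted weight of the h^{th} substratum

X_{(j),i,h}: value attributed to the i^{th} company of substratum h, where j is the publication cell of interest.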

Note: Although a company is assigned to and sampled from a single ACES industry, it can report capital expenditures in several ACES industries. Reported data for all industries are inflated by the weight in the sample industry of the respondent.
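The weighted-sum estimator, including the point in the note that one company can contribute to several publication cells at the weight of its sampled industry, can be sketched as follows. The function name, the cell labels, and the reported values are hypothetical.

```python
from collections import defaultdict

def publication_estimates(reports):
    """Weighted sum of reported values per publication cell.

    `reports` is an iterable of (adjusted_weight, {cell: reported_value}),
    one entry per responding company. Every cell a company reports in is
    inflated by that company's single adjusted weight.
    """
    totals = defaultdict(float)
    for w_adj, values in reports:
        for cell, value in values.items():
            totals[cell] += w_adj * value
    return dict(totals)

# Hypothetical respondents: the first reports in two industries but carries
# one weight (25.0) from the industry it was sampled in.
reports = [
    (25.0, {"A": 10.0, "B": 2.0}),
    (1.0, {"A": 300.0}),
    (40.0, {"B": 5.0}),
]
print(publication_estimates(reports))  # {'A': 550.0, 'B': 250.0}
```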

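ACE-2 segment. The ACE-2 estimates, X̂_{(j)}, are (with k = 6 in 2008):

X̂_{(j)} = Σ_{h=1}^{k} W_{h(adj)} * Σ_{i} X_{(j),i,h}

where the inner sum runs over the responding companies i in stratum h, and

W_{h(adj)}: adjusted weight of the h^{th} stratum

X_{(j),i,h}: value attributed to the i^{th} company of stratum h, where j is the publication cell of interest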

Note that there are no industry-level estimates from the ACE-2 companies. Therefore, j always represents a national-level estimate.

The estimates are derived from sample data, and will differ from results derived from data from other samples or a complete census of the population. A sample and a census will both experience errors classified as nonsampling errors, which often introduce systematic bias into the results. Bias is the difference, averaged over all possible samples of the same design and size, between the estimate and the true value being estimated. These types of errors are not explicitly measured. Only samples have sampling errors, the error from only observing a subset of the population. With a probability sample, this type of error can be explicitly measured. For any particular estimate though, the total error from sampling and nonsampling error may considerably exceed the measured error.

Sampling Variability

The sample selected is only one of the many possible samples that could have been selected, with each possible sample producing possibly different results. The relative standard error (RSE) measures the variability among the possible estimates from these possible samples, relative to the estimates. These are calculated using a delete-a-group jackknife replicate variance estimator. The RSEs in the tables can be used to derive the standard error (SE), which can then be used to create interval estimates with prescribed levels of confidence.
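For readers unfamiliar with the delete-a-group jackknife, the following Python sketch shows the general idea on a weighted total. It is a simplified textbook form, not the survey's production estimator: it ignores stratification and certainty units, assigns unit i to group i mod G instead of randomizing, and the function name, group count, and data are assumptions for the example.

```python
import math

def jackknife_rse(units, n_groups):
    """Delete-a-group jackknife for a weighted total.

    `units` is a list of (weight, value). Returns (estimate, RSE in percent).
    """
    G = n_groups
    full = sum(w * x for w, x in units)
    # Weighted total contributed by each group; unit i goes to group i % G.
    group_totals = [0.0] * G
    for i, (w, x) in enumerate(units):
        group_totals[i % G] += w * x
    # Replicate g drops group g and reweights the remainder by G / (G - 1).
    replicates = [(full - t) * G / (G - 1) for t in group_totals]
    variance = (G - 1) / G * sum((rep - full) ** 2 for rep in replicates)
    return full, 100.0 * math.sqrt(variance) / full

# Hypothetical data: three equally weighted units, one unit per group.
estimate, rse = jackknife_rse([(1.0, 10.0), (1.0, 20.0), (1.0, 30.0)], n_groups=3)
print(estimate, round(rse, 1))
```

Each replicate answers the question of what the estimate would have been had one group of sampled units been absent, and the spread of the replicates around the full-sample estimate drives the RSE.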

The SE of the estimate is calculated by multiplying the RSE by its corresponding estimate. Note that the RSE is the measure of variability presented for all estimates in this publication except for the estimates of percent change. RSEs are given as percentages and need to be divided by 100 before being used to calculate the SE.

In general, those intervals defined by 1.6 standard errors above and below the sample estimate will contain the true population value about 90 percent of the time, while those intervals defined by 2 standard errors above and below the sample estimate will contain the true population value about 95 percent of the time. These intervals are called confidence intervals. Note that the SE is in the same units as the estimate, while the RSE is unitless.

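**Examples of Calculating a Confidence Interval:**

a. Calculating a confidence interval for an estimate: using the estimate from table 4a and the RSE from table 4c, first convert the RSE to an SE. For example, for nondurable manufacturing total capital expenditures, the 2008 estimate is $108,215 million and the RSE is 8.7, so the SE is:

SE = (8.7 / 100) * $108,215 million = $9,415 million.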

The 90-percent confidence interval can be constructed by multiplying 1.6 by the SE to create the margin of error (MOE), and adding and subtracting the MOE to the estimate. The 90% confidence interval for the estimate of nondurable manufacturing total capital expenditures is then:

$108,215 million ± [1.6 * $9,415 million] = $108,215 million ± $15,064 million

This implies that using the sampling method described, we are 90% confident that the true value of total capital expenditures for this subsector is between ($108,215 - $15,064) $93,151 million and ($108,215 + $15,064) $123,279 million. Since this confidence interval does not contain zero (0), the estimate is statistically different from 0 at the 90-percent confidence level. Because this is an interval for a level, not for a change, it does not by itself indicate an increase in capital expenditures. This does not consider any additional issues due to nonsampling errors.
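The arithmetic in example a can be checked with a few lines of Python. The function name is hypothetical; the inputs are the nondurable manufacturing estimate and RSE quoted above, and the 1.6 multiplier is the document's 90-percent factor.

```python
def confidence_interval(estimate, rse_percent, multiplier=1.6):
    """Return (SE, MOE, lower bound, upper bound) from an estimate and its RSE."""
    se = rse_percent / 100.0 * estimate  # RSE is a percentage
    moe = multiplier * se
    return se, moe, estimate - moe, estimate + moe

# Nondurable manufacturing, 2008 (values in millions of dollars).
se, moe, low, high = confidence_interval(108_215, 8.7)
print(round(se), round(moe), round(low), round(high))  # 9415 15064 93151 123279
```

Passing `multiplier=2.0` instead gives the approximate 95-percent interval described earlier.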

b. Calculating a confidence interval for a percent change of an estimate between two survey years: using estimates from table 2a and SEs from table 2b, the 90-percent confidence interval can be constructed by multiplying 1.6 by the SE of the percent change to create the MOE, and adding and subtracting the MOE to the estimate. For example, for nondurable manufacturing total capital expenditures, the estimated percent change from 2007 to 2008 is 20.7% (from table 2a), and the standard error of this estimate is 10.6 percentage points (from table 2b):

20.7% ± [1.6 * 10.6%] = 20.7% ± 17.0%

This implies that using the sampling method described, we are 90% confident that the true value of the percentage change in this sector is between (20.7%-17.0%) 3.7% and (20.7% + 17.0% ) 37.7%. Since this confidence interval does not contain zero (0), we also have sufficient evidence to conclude that the estimated percent change was statistically larger than 0, i.e., this sector showed an increase in the amount of capital expenditures. This does not consider any additional issues due to nonsampling errors.

**Examples of Calculating Differences and Percent Changes**

Data for the current year along with revised data for the prior year are presented in this publication. Two numbers of interest for many data users may be the difference between the prior year and the current year, and the percent change from the prior year to the current year.

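The difference is calculated as:

Difference = (current-year estimate) - (prior-year estimate)

and the MOE for a 90-percent confidence interval on this difference is estimated as:

MOE = 1.6 * √ [ (SE of current-year estimate)^{2} + (SE of prior-year estimate)^{2} ]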

As an example, for nondurable goods manufacturing, the total expenditures estimate for 2008 from table 4a is $108,215 million, with an RSE of 8.7 from table 4c. The revised 2007 estimate from table 4b is $89,633 million, with an RSE of 1.2 from table 4d. The difference would be:

[$108,215 million - $89,633 million] = $18,582 million

And the MOE for the 90-percent confidence interval is estimated as follows, including translating the RSEs into SEs:

= 1.6 * √ [ ((8.7/100) * $108,215 million )^{2} + ((1.2/100) * $89,633 million )^{2} ]

= 1.6 * √ [ (0.087 * $108,215 million )^{2} + (0.012 * $89,633 million )^{2} ]

= 1.6 * √ [ 88,636,670 + 1,156,907] million^{2}

= 1.6 * √ [89,793,577] million^{2}

= 1.6 * $9,476 million

= $15,162 million

The 90-percent confidence interval for the difference between the two years is $18,582 million ± $15,162 million, or the interval of $3,420 million to $33,744 million. At the 90-percent confidence level, the change is significant. In this instance, however, a 95-percent confidence interval, with its larger margin of error, would have a lower bound below 0, and the change would be interpreted as not significant at that confidence level.
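The same calculation can be reproduced in Python as a check. The function name is hypothetical, and the formula assumes, as the worked steps above do, that the two yearly estimates are treated as independent so that their squared SEs add.

```python
import math

def difference_with_moe(cur, cur_rse, prior, prior_rse, multiplier=1.6):
    """Return (difference, MOE) for two estimates with RSEs given in percent."""
    se_cur = cur_rse / 100.0 * cur
    se_prior = prior_rse / 100.0 * prior
    moe = multiplier * math.sqrt(se_cur ** 2 + se_prior ** 2)
    return cur - prior, moe

# Nondurable goods manufacturing, 2008 vs. revised 2007 (millions of dollars).
diff, moe = difference_with_moe(108_215, 8.7, 89_633, 1.2)
print(diff, round(moe))  # 18582 15162

# With a multiplier of 2 (about 95 percent) the lower bound falls below zero.
_, moe95 = difference_with_moe(108_215, 8.7, 89_633, 1.2, multiplier=2.0)
print(diff - moe95 < 0)  # True
```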

The percent change is calculated as 100 multiplied by the ratio of the difference to the prior-year estimate.

So continuing with the example from above,

= 100 * ($18,582/ $89,633)

= 20.7%

This is the number we used above in part b, which we took from table 2a. The MOE for a 90-percent confidence interval on this is estimated as:

= 1.6 * 100 * ($108,215 / 89,633 ) * √ [(8.7/100)^{2} + (1.2/100)^{2} ]

= 1.6 * 100 *(1.21) * √ [(0.087)^{2} + (0.012)^{2} ]

= 160 * 1.21 * √ [ .0077 ]

= 193 * 0.088

= 17.0 %

so the 90-percent confidence interval for the percent change is 20.7% ± 17.0%, or 3.7% to 37.7%. Since this interval does not contain zero (0), we can conclude that the positive percentage change from 2007 to 2008 is statistically significant at the 90-percent confidence level.
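A short Python check of the percent-change example follows. The function name is hypothetical, and the MOE expression simply mirrors the worked steps above: the ratio of the two estimates multiplied by the square root of the sum of the squared RSEs, scaled to percentage points.

```python
import math

def percent_change_with_moe(cur, cur_rse, prior, prior_rse, multiplier=1.6):
    """Return (percent change, MOE in percentage points); RSEs are in percent."""
    pct = 100.0 * (cur - prior) / prior
    rel = math.sqrt((cur_rse / 100.0) ** 2 + (prior_rse / 100.0) ** 2)
    moe = multiplier * 100.0 * (cur / prior) * rel
    return pct, moe

# Nondurable goods manufacturing, 2007 to 2008 (millions of dollars).
pct, pct_moe = percent_change_with_moe(108_215, 8.7, 89_633, 1.2)
print(round(pct, 1), round(pct_moe, 1))  # 20.7 17.0
```

Dividing the MOE by 1.6 recovers the standard error of about 10.6 percentage points quoted in part b.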

Nonsampling Error

All surveys and censuses are subject to nonsampling errors. Nonsampling errors can be attributed to many sources, including: inability to obtain information about all companies in the sample; inability or unwillingness on the part of respondents to provide correct information; difficulties in defining concepts; differences in the interpretation of questions; mistakes in recording or coding the data; and other errors of collection, response, coverage, and estimation for nonresponse.

Explicit measures of the effects of these nonsampling errors are not available. However, to minimize total nonsampling error, all reports were reviewed for reasonableness and consistency, and every effort was made to achieve accurate response from all survey participants. Coverage errors, which arise from not including companies that are in scope of the survey or from mistakenly including out-of-scope companies as eligible, may have a significant effect on the accuracy of estimates for this survey. The Business Register, which forms the basis of our survey universe frame, may not contain all in-scope businesses, or may contain incorrect payroll values that affect how companies are sampled and, through their sampling weights, the impact of their responses.

A more detailed profile on the quality of the Annual Capital Expenditures Survey is available on request. Please contact the Business Investment Branch of the Company Statistics Division at 301-763-3324.


Source: U.S. Census Bureau | Annual Capital Expenditures | (301) 763-3324 | Last Revised: June 20, 2011