An official website of the United States government

Guidance for Small Business Data Users

Comparing Data

Most users are interested not only in what the statistics for a topic look like right now, but also in how they compare with data from the past. Many factors can affect the comparability of data over time, including:

  • Changes in the survey/sample size or universe
  • Changes in the forms or mediums used to collect data
  • New decisions on how to tabulate data
  • New questions or a new order to old questions
  • Changes to industry classifications
  • Changes to geographic boundaries (towns expand and contract, etc.)

For example, when comparing data from one Economic Census to the next, there are a couple of differences that stand out:

  • The 1997 Economic Census implemented the North American Industry Classification System (NAICS). Although nearly half of the Standard Industrial Classification (SIC) system codes in use in 1992 could be derived from the 1997 NAICS codes, a substantial number of industries could only be approximated under NAICS.
  • The 2017 Economic Census implemented the North American Product Classification System (NAPCS). Only about 80% of the 2017 NAPCS-based products map directly to a single 2012 legacy product.

Statistical Accuracy - Confidence and Error

In order to understand random sampling, you need to become familiar with a couple of basic statistical concepts.

  • Error - This is the "plus or minus X%" figure often reported alongside survey results. It means that, at your chosen level of confidence, you expect your results to differ from the true population value by no more than X%.
  • Confidence - This is how confident you are in your error level. Expressed as a percentage, it describes how often you would expect to get similar results if you were to conduct the survey multiple times.

These two concepts work together to determine how accurate your survey results are. For example, if you have 90% confidence with an error of 4%, you are saying that if you were to conduct the same survey 100 times and calculate a 90-percent confidence interval (the estimate plus or minus 4%) each time, approximately 90 of the 100 intervals would contain the true value of the population parameter.
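The repeated-survey interpretation can be checked with a small simulation. The sketch below is illustrative only and is not a Census Bureau method. It assumes a simple random sample, a yes/no question, and the common normal-approximation interval for a proportion. The function names, the true population share of 50%, and the sample size of 423 (chosen because it yields an error of roughly 4% at 90% confidence under these assumptions) are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Normal-approximation margin of error for a sample proportion.

    Assumes a simple random sample of size n (an assumption of this sketch).
    """
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def coverage(true_p, n, confidence, surveys, seed=0):
    """Share of simulated surveys whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(surveys):
        # Draw one survey: n independent yes/no answers.
        yes = sum(rng.random() < true_p for _ in range(n))
        p_hat = yes / n
        moe = margin_of_error(p_hat, n, confidence)
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / surveys


if __name__ == "__main__":
    # Hypothetical inputs: true share 50%, 423 respondents, 90% confidence.
    print("error:", round(margin_of_error(0.5, 423, 0.90), 4))
    print("share of intervals containing the true value:",
          coverage(0.5, 423, 0.90, surveys=1000))
```

Under these assumptions, the share of intervals that contain the true value should land close to 0.90, although any single run will vary somewhat.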

If you are not sure what sort of error you can tolerate and what level of confidence you need, a good rule of thumb is to aim for 95% confidence with a 5% error level.
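To see what such a target implies for a survey, the following sketch estimates how many responses would be needed for a given confidence and error. It is a rough approximation, not official guidance: it assumes a simple random sample from a large population, a yes/no question, the normal approximation, and the most conservative case in which the true share is 50%. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence, error, p=0.5):
    """Approximate responses needed to estimate a proportion.

    Uses n = z^2 * p * (1 - p) / error^2, assuming simple random sampling
    from a large population. p=0.5 is the worst case (largest n).
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)


if __name__ == "__main__":
    print(required_sample_size(0.95, 0.05))  # 385
    print(required_sample_size(0.90, 0.04))  # 423
```

Tightening either target raises the required sample size: a smaller error or a higher confidence both call for more responses.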

"Error" is also referred to as the "margin of error," and "Confidence" is also known as the "confidence level." To avoid confusion, this guidance refers to these concepts simply as "Error" and "Confidence."

Page Last Revised - March 10, 2022