I hear questions related to statistical significance on a daily basis. It is usually some variation of “How much sample do we need to be significant?” which often reflects some confusion about the term.
Statistical significance matters when we compare two or more groups (people, objects, ads, etc.) and want to know whether the differences we observe are real rather than due to chance.
The Role of Sample Size
As sample size increases, the margin of error around a percent or a mean gets smaller. Consequently, we get not only more precise estimates but also a higher sensitivity to detect differences that are not due to chance.
In a large sample, a difference of 1 or 2 percentage points may be significant. In a smaller sample, where estimates vary more from one sample to the next, we may need to see a difference of more than 10 percentage points before we can call it significant.
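To make the link between sample size and margin of error concrete, here is a minimal sketch. It assumes a simple random sample, a proportion near 50% (the most conservative case), and the usual normal approximation at 95% confidence (z = 1.96). The function name `margin_of_error` is illustrative, not taken from any particular tool.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated from n respondents.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The margin of error shrinks as the sample grows, but with diminishing returns:
# quadrupling the sample only halves the margin of error.
for n in (100, 400, 1000, 2500):
    moe = margin_of_error(0.5, n)
    print(f"n = {n:>5}: +/- {moe * 100:.1f} percentage points")
# n =   100: +/- 9.8 percentage points
# n =   400: +/- 4.9 percentage points
# n =  1000: +/- 3.1 percentage points
# n =  2500: +/- 2.0 percentage points
```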
In survey research, we often talk as if the results are exact point estimates when in fact we should be talking in ranges, since there is always a margin of error around any estimate. So if the margin of error is +/-3 percentage points and we get a value of 50% for a variable, it means the true value of the variable is likely to be between 47% and 53%.
Now, imagine we measure the same variable in another group, with a sample size that gives a margin of error of +/-5 percentage points, and we get a value of 57%. This means that the true value is expected to fall between 52% and 62%.
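Despite the 7 percentage point difference, which seems large, we can't conclude from these ranges alone that it is statistically significant, because the margin of error ranges of the two groups overlap (47% to 53% and 52% to 62%). The true value of the variable in the second group could be 52% or 53%, which are values that also fall in the first group's range. Keep in mind that the overlap check is a conservative rule of thumb: a formal test of the difference itself can still find significance when the ranges overlap only slightly, so borderline cases like this one deserve that test.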
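The overlap check described above is a quick and conservative screen. The sketch below compares it with looking at the margin of error of the difference itself. It assumes the two samples are independent and that both margins of error use the same confidence level; the function names are illustrative.

```python
import math

def intervals_overlap(est_a, moe_a, est_b, moe_b):
    """True if the two margin of error ranges share at least one value."""
    return (est_a - moe_a) <= (est_b + moe_b) and (est_b - moe_b) <= (est_a + moe_a)

def difference_margin(moe_a, moe_b):
    """Margin of error of the difference between two independent estimates.

    Assumes independent samples and the same confidence level for both.
    """
    return math.sqrt(moe_a ** 2 + moe_b ** 2)

# Values from the example: 50% +/- 3 points versus 57% +/- 5 points.
a, moe_a = 50.0, 3.0
b, moe_b = 57.0, 5.0

print(intervals_overlap(a, moe_a, b, moe_b))  # True: 47-53 and 52-62 overlap

diff = b - a
moe_diff = difference_margin(moe_a, moe_b)
print(f"difference = {diff:.1f} points, margin of error of the difference = {moe_diff:.1f} points")
# difference = 7.0 points, margin of error of the difference = 5.8 points
```

In this example the ranges overlap, yet the 7-point difference is larger than its own margin of error of about 5.8 points. Under the stated assumptions, a direct test would flag the difference, which is why it is worth running a formal test before ruling a difference out.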
Confidence Interval & Level
How confident are we about the method we are using? We often say 95% confident, which means that if we repeated the study 100 times, we could expect the range we calculate to contain the true value about 95 times and to miss it about 5 times. This is called the confidence level.
The margin of error range is called the confidence interval. In short, we want a method whose ranges capture the true value in the vast majority of the times we repeat the study.
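The "repeat the study many times" idea can be illustrated with a small simulation. This is a sketch under stated assumptions: the true value (50%), the sample size (500), and the number of simulated studies (1,000) are made-up inputs, and the interval uses the normal approximation at 95% confidence.

```python
import math
import random

def coverage_rate(true_p=0.5, n=500, studies=1000, z=1.96, seed=42):
    """Simulate repeated surveys and return the share of intervals
    that contain the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(studies):
        # One simulated study: n respondents, each answering "yes" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / studies

print(f"Share of intervals containing the true value: {coverage_rate():.1%}")
```

The printed share should land close to 95%, though not exactly on it, since the simulation itself is subject to chance.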
Unfortunately, for a given sample size, statistical confidence has an inverse relation with estimate precision. If you want to be 99% certain instead of 95%, you have to allow for a wider confidence interval to be that sure it includes the true value.
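Here is a short sketch of that trade-off, holding the sample fixed at 1,000 respondents and a proportion of 50% (both illustrative choices) and using the normal approximation. `NormalDist` comes from Python's standard `statistics` module; the helper name `z_for_confidence` is an assumption for this example.

```python
import math
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

n, p = 1000, 0.5  # illustrative sample size and proportion
for level in (0.90, 0.95, 0.99):
    z = z_for_confidence(level)
    moe = z * math.sqrt(p * (1 - p) / n)
    print(f"{level:.0%} confidence: z = {z:.3f}, +/- {moe * 100:.1f} points")
# 90% confidence: z = 1.645, +/- 2.6 points
# 95% confidence: z = 1.960, +/- 3.1 points
# 99% confidence: z = 2.576, +/- 4.1 points
```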
If there is no comparative analysis involved, it doesn’t make sense to talk in terms of statistical significance. However, we are still concerned about estimating the precision of results in total. We want our margin of error to be as small as our budget and tolerance for risk allow.
To get greater precision, we need a larger sample, which in turn costs more money. To be more certain with the same sample, we sacrifice some precision. There is always a trade-off to make when it comes to sample size.
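The trade-off can be turned around to ask how many respondents a given precision requires. The sketch below uses the standard formula for a proportion, n = z² · p(1 − p) / e², assuming a simple random sample from a large population and p = 50% as the most conservative case. The function name is illustrative.

```python
import math
from statistics import NormalDist

def required_sample_size(moe, confidence=0.95, p=0.5):
    """Respondents needed for a target margin of error on a proportion.

    moe is a fraction (0.03 means +/- 3 percentage points).
    Assumes a simple random sample from a large population.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

print(required_sample_size(0.05))                   # 385
print(required_sample_size(0.03))                   # 1068
print(required_sample_size(0.03, confidence=0.99))  # 1844
```

Tightening the margin of error from 5 to 3 points nearly triples the sample, and moving from 95% to 99% confidence at the same precision adds several hundred more respondents.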
Questions You Need to Answer
The next time you estimate the sample size for a survey, be ready to answer these questions:
- What is the desired precision (margin of error)?
- How confident do you want to be in your estimation method?
- Can your budget accommodate the required sample size for the desired precision? If not, what are you willing to settle for?
- Are you doing any comparisons between groups? If so, how many?
- Can your budget accommodate the required sample size by groups to make meaningful comparisons?
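The budget questions can be roughed out the same way. This is only a sketch: the budget, the cost per completed interview, and the equal split across groups are hypothetical inputs, and the margin of error uses the normal approximation for a proportion of 50% at 95% confidence.

```python
import math

def affordable_margin(budget, cost_per_complete, groups=1, p=0.5, z=1.96):
    """Return (total sample, sample per group, margin of error per group)
    that a budget can buy, splitting respondents equally across groups."""
    n_total = int(budget // cost_per_complete)
    n_per_group = n_total // groups
    if n_per_group == 0:
        raise ValueError("Budget is too small for even one respondent per group.")
    moe = z * math.sqrt(p * (1 - p) / n_per_group)
    return n_total, n_per_group, moe

# Hypothetical figures: a 10,000 budget at 10 per completed interview.
for groups in (1, 2, 4):
    n_total, n_group, moe = affordable_margin(10_000, 10, groups=groups)
    print(f"{groups} group(s): {n_group} per group, +/- {moe * 100:.1f} points each")
# 1 group(s): 1000 per group, +/- 3.1 points each
# 2 group(s): 500 per group, +/- 4.4 points each
# 4 group(s): 250 per group, +/- 6.2 points each
```

The same budget that gives a total-level margin of error of about 3 points leaves each of four groups with roughly 6 points, which is worth knowing before promising group comparisons.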
Unfortunately, the difference between the sample you want and the one you can afford is often significant (pun intended). In other words, budget is almost always part of the decision.
For more help on calculating sample size and margin of error, use our Sample Size and Margin of Error Calculators.
(An earlier version of this article was originally published on November 11, 2011. The article was last updated and revised on August 21, 2019.)