I hear questions related to statistical significance on a daily basis. It is usually some variation of “How much sample do we need to be significant?” which often reflects some confusion about the term.
Statistical significance is a concern when we are interested in detecting differences not due to chance between two or more groups (people, objects, ads etc.) being compared.
As sample size increases, the margin of error around a percentage or a mean gets smaller, and we get not only more precise estimates but also more sensitivity to detect differences that are not due to chance. In a large sample, a difference of 1 or 2 percentage points may be significant, while in a smaller sample, where there is more variation, we may need a difference of more than 10 percentage points before it is significant.
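To see how the margin of error shrinks as the sample grows, here is a quick sketch using the standard normal-approximation formula for a proportion (the function name and the sample sizes are illustrative, not from any particular study):

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a ~95% confidence interval for a proportion p
    estimated from a simple random sample of size n (z=1.96)."""
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case (widest margin of error)
for n in (100, 400, 1000, 10000):
    print(f"n = {n:>6}: margin of error = +/-{margin_of_error(0.5, n):.1%}")
```

Quadrupling the sample only halves the margin of error, which is why precision gets expensive quickly.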
In survey research, we often talk as if the results were exact point estimates when in fact we should be talking in ranges, since there is always a margin of error around any estimate. So if the margin of error is +/-3% and we get a value of 50% for a variable, the true value is likely to lie between 47% and 53%.
Now, suppose we measure the same variable in another group whose sample size gives a margin of error of +/-5%, and we get a value of 57%. The true value is then expected to lie between 52% and 62%. Despite the 7-percentage-point difference, which seems large, we can't say it is statistically significant because the two margin-of-error ranges overlap (47%-53% and 52%-62%): the true value in the second group could be 52% or 53%, values that also fall within the first group's range.
How confident are we about this? We usually say 95% confident, which means that if we repeated the study 100 times, we would expect the interval to capture the true value about 95 times and miss it 5 times. This percentage is called the confidence level, and the margin-of-error range is called the confidence interval. In short, we want the true value to fall within the stated range nearly every time the study is repeated. Unfortunately, statistical confidence has an inverse relationship with estimate precision: if you want to be 99% certain, you have to allow for a wider confidence interval to include the true value.
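The confidence/precision trade-off is easy to see numerically: a higher confidence level means a larger z-score, which widens the interval for the same sample. A small sketch, using standard two-sided z-scores:

```python
import math

# Standard z-scores for common two-sided confidence levels
z_scores = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

p, n = 0.5, 1000  # illustrative: a 50% result from a sample of 1,000
for level, z in z_scores.items():
    moe = z * math.sqrt(p * (1 - p) / n)
    print(f"{level:.0%} confidence: +/-{moe:.1%}")
```

With the same 1,000 respondents, moving from 95% to 99% confidence widens the margin of error from roughly +/-3.1% to roughly +/-4.1%.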
If there is no comparative analysis involved, it doesn't make sense to talk in terms of statistical significance. However, we are still concerned about the precision of the overall estimates. We want our margin of error to be as small as our budget and tolerance for risk allow. Greater precision requires a larger sample, which in turn costs more money. To be more certain, you sacrifice some precision. There is always a trade-off to make.
Next time you are considering sample size for a survey, get ready to answer these questions:
- What is the desired precision (margin of error)?
- How confident do you want to be?
- Can your budget accommodate the required sample size for the desired precision? If not, what are you willing to settle for?
- Are you doing any comparisons between groups? If so, how many?
- Can your budget accommodate the required sample size by group to make meaningful comparisons?
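The first two questions above pin down the required sample size directly. A minimal sketch, inverting the same normal-approximation formula for a proportion (the function name is illustrative; p = 0.5 is the conservative default when you don't know the result in advance):

```python
import math

def required_sample_size(moe, p=0.5, z=1.96):
    """Sample size needed to estimate a proportion near p within
    +/-moe at ~95% confidence (simple random sample, large population)."""
    return math.ceil(z**2 * p * (1 - p) / moe**2)

for e in (0.05, 0.03, 0.01):
    print(f"+/-{e:.0%} precision needs n = {required_sample_size(e)}")
```

Note that each group you want to compare needs roughly this sample size on its own, which is why comparisons multiply the budget question.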
Unfortunately, the difference between the sample you want and the one you can afford is often significant (pun intended), so budget questions are always in the mix. For more help on calculating sample size and margin of error, use our Sample Size and Margin of Error Calculators.