I’ve been doing survey design and market research for 25 years and I’m still amazed when I’m asked, “How many responses do I need to make my study ‘statistically significant’?”
Hearing those words “statistically significant” in a market research context always confuses me. My usual response is, “What exactly do you mean by that?”
I either get a blank stare or a variety of responses, but what most people mean is, “You know, ‘statistically significant,’ the magic label that makes my results better.”
Why The Confusion Over Statistical Significance?
This term gets thrown around out of context quite often, and it also gets applied to things that cannot themselves be statistically significant (like a survey). Here are the most common sources of confusion:
- A survey can’t be “statistically significant,” nor can a certain number of responses. Only a test statistic (i.e., a quantity calculated from the data) can be statistically significant.
- To a statistician, “statistically significant” has a very particular meaning tied to hypothesis testing, which requires a specific set of assumptions that are rarely, if ever, true in a market research setting.
- Researchers’ main focus should be on whether the survey results are meaningful in the context of their objectives, not on whether they carry the label “statistically significant.”
A Closer Look at Statistical Significance
As an example, suppose someone comes to me for help with a survey to test the market’s preferences for skin care products. Often the first question they ask me is how many responses are needed to get “statistically significant” results.
That is where the confusion starts, because that question only makes sense in the context of a statistical hypothesis test. A survey may involve many hypotheses that we want to test.
A statistical hypothesis test requires both a hypothesis (e.g., women buy more skin care products than men) and a test statistic (e.g., the percentage of women who buy skin care products minus the percentage of men who do).
Now we can ask if the test statistic (the difference between the two percentages) is “statistically significant.” That’s a legitimate question.
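To make that concrete, here is a minimal sketch of what such a test might look like in Python. The respondent counts are purely hypothetical, and the calculation shown is a standard two-proportion z-test, one common way (not the only way) to test a difference like this.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical survey counts (not real data): respondents in each
# group who say they buy skin care products.
women_buyers, women_n = 145, 250   # 58% of women surveyed
men_buyers, men_n = 110, 200       # 55% of men surveyed

p_women = women_buyers / women_n
p_men = men_buyers / men_n
diff = p_women - p_men             # the test statistic is built on this difference

# Two-proportion z-test: pooled proportion and standard error under
# the null hypothesis that women and men buy at the same rate.
p_pool = (women_buyers + men_buyers) / (women_n + men_n)
se = sqrt(p_pool * (1 - p_pool) * (1 / women_n + 1 / men_n))
z = diff / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"difference = {diff:.1%}, z = {z:.2f}, p-value = {p_value:.3f}")
# "Statistically significant" conventionally means p-value < 0.05; with
# these made-up counts the 3% difference is nowhere near that threshold.
```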
But a more meaningful question might be whether the difference is “practically significant.”
Practical Significance Defined
A calculated difference is practically significant if the real difference it estimates is large enough to affect a decision you need to make.
If the reason for running our survey is to find out whether we should focus more marketing dollars on women than on men, then the size of the difference determines whether the results are practically significant.
Statistical significance, on the other hand, depends on the sample size.
A difference of 3% (58% for women minus 55% for men) can be statistically significant if the sample size is big enough, but it may not be practically significant. Three percent hardly seems big enough to warrant focusing on one market over the other.
A difference of 30% (65% for women minus 35% for men) may be practically significant (i.e., warrant a decision to focus more dollars on the women’s market), but if the difference isn’t statistically significant (that depends on sample size) then you can’t be sure the difference you see (30%) is real.
Therefore you either need to get more data or treat the two groups as the same.
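To illustrate how the same percentage gap can flip between “significant” and “not significant” purely because of sample size, here is a small Python sketch. It reuses the two-proportion z-test from the earlier example, assuming for simplicity equal numbers of women and men per group; the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_p_value(p1: float, p2: float, n_per_group: int) -> float:
    """Two-sided p-value for a two-proportion z-test with equal group sizes."""
    pooled = (p1 + p2) / 2                         # pooled proportion when groups are equal
    se = sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A 3% difference (58% vs. 55%): the p-value shrinks as the sample grows,
# so with enough respondents it becomes "statistically significant" even
# though 3% may not matter for the marketing decision.
for n in (200, 2000, 20000):
    print(f"3% difference,  n = {n:>6} per group: p = {two_prop_p_value(0.58, 0.55, n):.4f}")

# A 30% difference (65% vs. 35%): big enough to drive a decision, but with
# only a handful of respondents per group the test can't confirm it's real.
for n in (10, 20, 50):
    print(f"30% difference, n = {n:>6} per group: p = {two_prop_p_value(0.65, 0.35, n):.4f}")
```

In both cases the test only tells you whether the observed difference could plausibly be zero; it says nothing about whether the difference is big enough to act on.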
Final Considerations For Determining Statistical Significance
There are additional issues with statistical significance that I haven’t addressed here, such as all the assumptions needed to perform the tests correctly.
The assumptions that the data are evenly distributed and that we’ll get a 100% response rate are rarely, if ever, met in a market research setting. Both affect the validity of your sample and your responses, but I’ll save those issues for another time.
So, how “statistically significant” is your survey data, and is that even the right question to ask?