One of the questions I get most often about A/B testing is: how many users does a test need? Naturally, as with 99% of questions, the answer is the same: it depends!
Sample size – what does it depend on?
Fundamentally, there are 3 things to consider:
- Your baseline conversion rate (%)
- The minimum relative change you expect from the test (%)
- The statistical significance you expect from the test (~95%)
If you have these, plug them into the Optimizely Sample Size Calculator and you will instantly see the magic number.
For example, a 3% baseline conversion rate and a 20% minimum relative change will get you 95% statistical significance with 10,170 people per version.
That said, if you have 10,000 visitors each week, then a 2-version A/B test needs 2 × 10,170 = 20,340 visitors and will go through in a little over 2 weeks.
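If you want to see roughly where a number like this comes from, here is a minimal sketch of the classic two-proportion sample size formula. Two caveats: the 80% statistical power is my assumption (the three inputs above don't include it), and Optimizely's calculator uses its own method, so don't expect this to reproduce the 10,170 figure. The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_change, alpha=0.05, power=0.80):
    """Classic two-sided, two-proportion sample size estimate (per variant).

    baseline        -- baseline conversion rate, e.g. 0.03 for 3%
    relative_change -- minimum relative change to detect, e.g. 0.20 for 20%
    alpha           -- 1 minus the significance level (0.05 for 95%)
    power           -- assumed statistical power (not given in the article)
    """
    p1 = baseline
    p2 = baseline * (1 + relative_change)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def weeks_needed(n_per_variant, variants, weekly_visitors):
    """How many weeks of traffic it takes to fill every variant."""
    return n_per_variant * variants / weekly_visitors


n = sample_size_per_variant(0.03, 0.20)
print(f"Per variant (classic formula): {n}")
print(f"Weeks at 10,000 visitors/week: {weeks_needed(n, 2, 10_000):.1f}")
print(f"Weeks using 10,170 per variant: {weeks_needed(10_170, 2, 10_000):.1f}")
```

The useful part is less the exact number and more the sensitivity: halve the relative change you want to detect and the required sample roughly quadruples.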
3 other methods to decide whether a result is significant or not:
You should know that Optimizely’s engine is strict about declaring a result significant. This is fine as it is, but to be on the safe side, I use 3 other tools to verify whether the reported results are valid. To be honest, if all 3 of these methods give me a positive result, I often don’t wait for Optimizely’s strict verdict. See below:
1. T-test
The most classic method for verifying an A/B test. There’s a user-friendly, fill-in version available online (e.g. HERE). It’s a dry science: if you get a P-value < 0.05, it means that a difference this large would show up less than 5% of the time if the two versions actually performed the same. That makes it unlikely that the current winner is ahead by pure chance. But this in itself is not enough.
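If you prefer to run this check yourself instead of using an online form, here is a minimal sketch. Note that I'm swapping in a pooled two-proportion z-test, which is the usual large-sample stand-in for a T-test on conversion rates; it is not necessarily what the online calculators do. The visitor and conversion counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test.

    conv_a, conv_b -- number of conversions in each version
    n_a, n_b       -- number of visitors in each version
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that A and B perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 3.0% vs 3.6% conversion on 10,000 visitors each
p = two_proportion_p_value(300, 10_000, 360, 10_000)
print(f"p-value: {p:.4f}")
print("significant at 95%" if p < 0.05 else "not significant at 95%")
```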
2. Trend charts
Optimizely also shows how trends evolve. This is not magic: if the same version has been ahead consistently for two weeks and your T-test looks good too, then you can be fairly certain that you have a winner.
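If you only have raw daily numbers and no chart, the same trend check can be approximated by hand. The sketch below assumes you have one (visitors, conversions) pair per day for each version; the data layout, function names and numbers are hypothetical, not an Optimizely export format.

```python
from itertools import accumulate


def cumulative_rates(daily):
    """Cumulative conversion rate after each day.

    daily -- list of (visitors, conversions) tuples, one per day
    """
    visitors = accumulate(v for v, _ in daily)
    conversions = accumulate(c for _, c in daily)
    return [c / v for c, v in zip(conversions, visitors)]


def days_b_ahead(daily_a, daily_b):
    """For each day, is B's cumulative rate above A's?"""
    rates_a = cumulative_rates(daily_a)
    rates_b = cumulative_rates(daily_b)
    return [b > a for a, b in zip(rates_a, rates_b)]


# Hypothetical daily numbers for illustration only
version_a = [(700, 20), (720, 23), (690, 21), (710, 20)]
version_b = [(705, 19), (715, 27), (700, 26), (695, 25)]

print(days_b_ahead(version_a, version_b))
```

A leader that keeps flipping from day to day is a hint to keep the test running; a long unbroken streak is the "same results for two weeks" pattern described above.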
3. AAB(B) test:
This is an expert trick! 😉
Even before starting the experiment, it’s advisable to set up a second, unedited copy of the original version too. This is how you get two A versions (and you can do the same to get two B versions as well). If there is no difference between the results of the two identical versions, that’s a good sign! Combine that with the trend chart and the T-test method, and you can kick the question of significant versus non-significant in the ass!
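The AAB idea can be sketched in a few lines as well. This is one possible way to do it, under my own assumptions: the two A arms are compared with a pooled two-proportion z-test, and if they agree, they are merged and compared against B. The merging step, the function names and the counts are all hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def p_value(conv_x, n_x, conv_y, n_y):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_x + conv_y) / (n_x + n_y)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_x + 1 / n_y))
    z = (conv_y / n_y - conv_x / n_x) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aab_check(a1, a2, b, alpha=0.05):
    """Each argument is a (conversions, visitors) tuple.

    aa_consistent -- the two identical A arms do NOT differ significantly
    b_differs     -- B differs significantly from the merged A arms
    """
    p_aa = p_value(*a1, *a2)
    merged_a = (a1[0] + a2[0], a1[1] + a2[1])
    p_ab = p_value(*merged_a, *b)
    return {"aa_consistent": p_aa >= alpha, "b_differs": p_ab < alpha}


# Hypothetical counts for two A arms and one B arm
print(aab_check(a1=(150, 5_000), a2=(156, 5_000), b=(190, 5_000)))
```

If `aa_consistent` comes back False, the two identical versions disagree with each other, and that is a reason to distrust the whole experiment rather than to celebrate B.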
I think with these 3 quick and dirty solutions you know everything you need to know when it comes to verifying the results of your AB-test.
Also, when it comes to your first experiment, make sure you keep the 5+1 rules of A/B testing.