A/B Test Calculator
Calculate sample size for statistically significant A/B tests.
Set your parameters and hit Calculate. We'll tell you how many visitors you need.
How to Calculate A/B Test Sample Size
Use the Sample Size Planner to work out how many visitors you need before you start your test. Enter your current conversion rate, the smallest improvement you'd want to detect, and your desired confidence level. The calculator uses a two-proportion z-test formula to estimate the minimum sample size per variation. Run your test until each variation reaches that number; stopping early is the fastest way to get a false positive.
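For readers who want to see the arithmetic, here is a minimal Python sketch of a common two-proportion z-test sample size formula. It is not necessarily the exact formula this calculator runs. The function name, the 80% statistical power default, the two-sided test, and the choice to express the improvement as a relative lift are all assumptions made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_mde,
                              confidence=0.95, power=0.80):
    """Approximate visitors needed per variation (two-sided z-test).

    baseline_rate: current conversion rate, e.g. 0.05 for 5%
    relative_mde:  smallest relative improvement to detect, e.g. 0.20 for +20%
    power:         assumed here; the page does not say what the tool uses
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # rate the variation would reach
    alpha = 1 - confidence

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance critical value
    z_beta = NormalDist().inv_cdf(power)           # power critical value

    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Example: 5% baseline, detect a 20% relative lift (5% -> 6%)
print(sample_size_per_variation(0.05, 0.20))
```

With these assumptions, a 5% baseline and a 20% relative lift call for on the order of eight thousand visitors per variation. Halving the lift you want to detect roughly quadruples the requirement, because the difference between the two rates is squared in the denominator.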
Once your test is complete, switch to the Results Evaluator. Enter your visitor counts and conversions for both variations. You'll get a p-value, confidence level, relative lift, and a clear verdict on whether you have a statistically significant winner.
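The evaluation step can be sketched in Python as a pooled two-proportion z-test. This is an illustration of the general technique, not the tool's own code. The function name, the returned keys, and the two-sided test are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def evaluate_test(visitors_a, conversions_a, visitors_b, conversions_b,
                  alpha=0.05):
    """Two-sided pooled z-test for control (A) versus variation (B).

    Assumes both groups have visitors and that the control has at least
    one conversion, so the relative lift is defined.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pool both groups to estimate the rate under "no real difference"
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))

    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "p_value": p_value,
        "relative_lift": (rate_b - rate_a) / rate_a,
        "significant": p_value < alpha,
    }


# Example: 500 of 10,000 convert on control, 580 of 10,000 on the variation
print(evaluate_test(10_000, 500, 10_000, 580))
```

In the example, the variation converts at 5.8% against 5.0% for the control, a 16% relative lift, and the p-value comes out a little above 0.01, so the sketch reports it as significant at the 0.05 threshold.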
What Is Statistical Significance?
Statistical significance tells you whether the difference between your control and variation is larger than random noise alone would plausibly produce. A p-value below 0.05 means that, if the two versions truly performed the same, you would see a difference this large less than 5% of the time. That's the standard threshold most teams use. It doesn't guarantee the result is correct, but it means you can be reasonably confident you're not chasing phantom improvements.
Understanding P-Values and Confidence Levels
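The p-value is the probability of seeing a result at least as extreme as yours if there were actually no difference between variations. A p-value of 0.03 means that, with no real difference, a result like yours would turn up in about 3% of tests; it is not the probability that your result is a false positive. The confidence level reported here is 1 minus the p-value, expressed as a percentage, so a p-value of 0.03 gives you 97% confidence. Strictly speaking, a confidence level is a threshold you choose before the test (95% corresponds to the 0.05 cutoff), so treat the reported figure as a convenient restatement of the p-value. Neither number tells you how large the effect is; that's what relative lift and the confidence interval are for.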
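To make the confidence interval concrete, here is a small Python sketch that computes a normal-approximation (Wald) interval for the absolute difference in conversion rates. The function name and the choice of an unpooled Wald interval are assumptions for illustration. Other interval methods exist and may behave better with small samples or very low rates.

```python
from math import sqrt
from statistics import NormalDist


def difference_confidence_interval(visitors_a, conversions_a,
                                   visitors_b, conversions_b,
                                   confidence=0.95):
    """Wald interval for (rate_b - rate_a), in absolute percentage points."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Unpooled standard error: each group keeps its own variance
    se = sqrt(rate_a * (1 - rate_a) / visitors_a
              + rate_b * (1 - rate_b) / visitors_b)

    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = rate_b - rate_a
    return diff - z * se, diff + z * se


# Example: 500 of 10,000 on control, 580 of 10,000 on the variation
low, high = difference_confidence_interval(10_000, 500, 10_000, 580)
print(f"{low:.4f} to {high:.4f}")
```

For the example data, the observed difference is 0.8 percentage points, and the interval runs from roughly 0.2 to 1.4 percentage points. The interval excludes zero, which agrees with a significant result, but its width shows that the true improvement could be much smaller or larger than the headline lift.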
Common A/B Testing Mistakes
Stopping tests early when one variant looks like it's winning. Peeking at results repeatedly and calling a winner the first time p dips below 0.05. Running tests on too little traffic and declaring a 50% lift that was really just noise. Not accounting for seasonal traffic patterns. Testing too many variations at once without adjusting for multiple comparisons. And the classic: running a test with no hypothesis about why the change should work, then reverse-engineering a story to fit whatever result you got.
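The first two mistakes can be demonstrated with a small simulation. The Python sketch below runs many A/A tests, where both variations share the same true conversion rate, and compares two decision rules: calling a winner at the first look where p dips below 0.05, versus checking only once at the end. All names and parameter values (500 simulated tests, 5 looks, 200 visitors per look, a 10% conversion rate) are assumptions chosen to keep the run fast.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(n_a, c_a, n_b, c_b):
    """Two-sided pooled z-test p-value for two conversion counts."""
    pooled = (c_a + c_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # no conversions (or all conversions): nothing to compare
        return 1.0
    z = (c_b / n_b - c_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_peeking(num_tests=500, looks=5, visitors_per_look=200,
                     rate=0.10, seed=1):
    """Return (false positive rate with peeking, rate with one final look)."""
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _test in range(num_tests):
        n = c_a = c_b = 0
        significant_at_any_look = False
        p = 1.0
        for _look in range(looks):
            # Both arms draw from the same true rate, so any "win" is noise
            n += visitors_per_look
            c_a += sum(rng.random() < rate for _ in range(visitors_per_look))
            c_b += sum(rng.random() < rate for _ in range(visitors_per_look))
            p = p_value(n, c_a, n, c_b)
            if p < 0.05:
                significant_at_any_look = True
        peeking_hits += significant_at_any_look
        final_hits += p < 0.05  # p now holds the final look's value
    return peeking_hits / num_tests, final_hits / num_tests


peek_rate, final_rate = simulate_peeking()
print(f"peeking: {peek_rate:.1%}, single final look: {final_rate:.1%}")
```

Because the final look is one of the peeks, the peeking rule can never produce fewer false positives than the single-look rule, and in practice it tends to produce noticeably more. The single-look rate should land near the 5% threshold.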