Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

A/B Test Calculator

N-1 Two Proportion test for comparing independent proportions for small and large sample sizes

Jeff Sauro • April 17, 2012

[Calculator input form: Successes, Total, and % for Group 1 and Group 2]


About this calculator

This calculator is based on the N-1 Chi-Square test as originally proposed by Pearson, 1900 and recommended by Campbell, 2007. When expected cell counts fall below 1, the Fisher Exact test is used. Detailed calculations and examples are available in Chapter 5 of Quantifying the User Experience.
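As a rough illustration of the calculation described above, the following Python sketch computes a two-tailed p-value with the N-1 adjustment and falls back to a Fisher exact test when any expected cell count is below 1. The function names, the return shape, and the two-sided Fisher convention (summing all tables no more probable than the observed one) are assumptions made for illustration, not the calculator's actual code.

```python
import math


def fisher_exact_two_tailed(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value for a 2x2 table (assumed convention:
    sum the probabilities of all tables no more likely than the observed one)."""
    total = n1 + n2
    k = x1 + x2  # total successes, fixed by conditioning on the margins

    def prob(a):
        # Hypergeometric probability of 'a' successes in group 1
        return math.comb(n1, a) * math.comb(n2, k - a) / math.comb(total, k)

    lo, hi = max(0, k - n2), min(k, n1)
    p_obs = prob(x1)
    p = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def n1_two_proportion_test(x1, n1, x2, n2):
    """Return (z, two_tailed_p, method) for two independent proportions."""
    if n1 <= 0 or n2 <= 0 or not (0 <= x1 <= n1 and 0 <= x2 <= n2):
        raise ValueError("need 0 <= successes <= total and total > 0 per group")

    total = n1 + n2
    successes = x1 + x2
    failures = total - successes

    # Expected counts for the four cells of the 2x2 table under the null
    expected = [row * col / total
                for row in (n1, n2)
                for col in (successes, failures)]
    if min(expected) < 1:
        return None, fisher_exact_two_tailed(x1, n1, x2, n2), "fisher"

    p1, p2 = x1 / n1, x2 / n2
    pooled = successes / total
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # The N-1 adjustment scales the usual z statistic by sqrt((N-1)/N)
    z = (p1 - p2) * math.sqrt((total - 1) / total) / se
    # Two-tailed p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p, "n-1 chi-square"
```

For example, n1_two_proportion_test(40, 100, 60, 100) gives a z of about -2.82 and a two-tailed p-value of roughly 0.005.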

In general you should use the 2-tailed p-value unless you have a strong a priori reason to suspect one group really will have a higher proportion.
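To make the one-tailed versus two-tailed distinction concrete, here is a small Python sketch. It assumes a z statistic has already been computed (positive when Group 1 has the higher observed proportion) and that it follows a standard normal distribution under the null hypothesis. The function name and the expect_group1_higher parameter are hypothetical.

```python
import math


def p_values_from_z(z, expect_group1_higher=True):
    """Return (two_tailed, one_tailed) p-values for a z statistic.

    The one-tailed value only makes sense if the direction was predicted
    before looking at the data (expect_group1_higher states that prediction).
    """
    # Two-tailed: probability of a difference this large in either direction
    two_tailed = math.erfc(abs(z) / math.sqrt(2))

    # One-tailed: probability of a difference this large in the predicted direction
    signed = z if expect_group1_higher else -z
    one_tailed = 0.5 * math.erfc(signed / math.sqrt(2))
    return two_tailed, one_tailed
```

When the observed difference goes in the predicted direction, the one-tailed p-value is half the two-tailed one. When it goes the other way, the one-tailed p-value is above 0.5, which is one reason the two-tailed value is the safer default.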


About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 20 journal articles and 4 books on statistics and the user experience.



Posted Comments

There are 7 Comments

May 31, 2013 | mo wrote:

No, the p-value only gives you the probability of the test statistic if H0 were true. What you interpret ("There is an X% chance the proportions are different") is IMHO the probability that H1 is true, but that is not something you get from the p-value. p(data|H0) is not p(H0|data).


April 22, 2013 | Jeff Sauro wrote:

Richard,

Thanks for your comment. You are exactly right in your definition of the p-value. Unfortunately, the subtlety of your statement gets lost in the applied world of A/B testing. I understand the distinction, but it usually confuses rather than informs the recipient of the results. See, for example, an earlier comment on this page as proof. We provide this context in Chapter 5 of the book.

Using 1 minus the p-value is a crude shorthand for interpreting the p-value for execs and those interpreting A/B test results. It is often communicated simply as confidence in many A/B testing packages (e.g. 95.7% confidence). So while this is not something I recommend reporting in scientific publications, in the world of picking the better alternative (version A or version B) using this nomenclature likely does more good than harm.
 


April 21, 2013 | Richard. wrote:

95.474% is not the chance the proportions are different. 4.526% is the chance that if the population (i.e. true) proportions were the same you would get a result as or more extreme than what you observe. 


February 11, 2013 | Zeeshan wrote:

very useful 


November 6, 2012 | Chuck wrote:

The Chi-square test assumes normal distribution of the variables. The results are too liberal [too low in this case]. The variables involved have a discrete distribution and a beta distribution is used for the test. Still, this is not too bad and is in line with many published simplifications. The results are not highly off, just a bit.


September 9, 2012 | Zeeshan wrote:

I found this calculator very useful for non-statistical persons


May 16, 2012 | Bryan wrote:

A little help for those of us without stats degrees?

- What the hell are 1-tailed and 2-tailed p-values?
- In plain English, what does it mean that there's an N% chance the proportions are different, and why should I care?
- Ditto higher proportion

 

