The home page provides an easy starting point for the calculators.
Sample Size and Power for Comparing Two Means
To work out the sample size you need, you first need an estimate of how large a difference between the means you want to be able to detect. This calculator has an easy 3-step process that takes the guesswork out of generating sample size and power calculations for your next comparative test.
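As a rough illustration of the kind of calculation involved, here is a minimal sketch using the common normal-approximation formula for two independent groups. The function name, defaults and example numbers are assumptions for illustration, and the calculator itself may use a more exact method.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(diff, sd, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect a difference of `diff`.

    Normal approximation: n = 2 * ((z_alpha + z_beta) * sd / diff) ** 2.
    It slightly underestimates n for small samples because it ignores
    the t-distribution correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


# Hypothetical example: detect a 10-second difference when the
# standard deviation of task times is about 20 seconds.
print(n_per_group_two_means(diff=10, sd=20))
```

Halving the difference you want to detect roughly quadruples the sample size, which is why the estimate of the difference matters so much.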
Sample Size for a Margin of Error (Proportion)
If you’re conducting a usability test, then it will be good to know ahead of time how wide or narrow your confidence intervals will be around your completion rates. Enter the desired margin of error (e.g. 10%) and quickly get the sample size needed plus a quick reference graph showing you the effects of more users on the width of the margin of error.
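To show how the margin of error drives the sample size, here is a minimal sketch using the textbook Wald formula. This is an assumption made for illustration: the calculator may use a different interval (for example an adjusted one), so its numbers can differ, especially for small samples.

```python
import math
from statistics import NormalDist


def n_for_margin_proportion(margin, p=0.5, confidence=0.95):
    """Users needed so the interval around a completion rate is +/- `margin`.

    Textbook formula: n = z^2 * p * (1 - p) / margin^2.
    p = 0.5 is the most conservative guess (widest interval).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical margins: see how quickly the sample size grows
# as the desired margin of error shrinks.
for margin in (0.20, 0.10, 0.05):
    print(f"+/-{margin:.0%}: {n_for_margin_proportion(margin)} users")
```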
Sample Size for a Margin of Error (Continuous Data)
If you’re conducting a usability test, then it will be good to know ahead of time how wide or narrow your confidence intervals will be around your task times or satisfaction scores. Enter the desired margin of error (e.g. 10%) and the estimate of the mean and standard deviation to generate the needed sample size. If you don’t have estimates for these, the calculator provides default values based on data from hundreds of usability tests.
Compare 2+ Means (ANOVA)
The Analysis of Variance (ANOVA) detects whether there is a difference in at least one mean. Enter up to five categories (e.g. competing products) to test differences in task times, clicks or satisfaction scores.
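For readers who want to see what sits behind the test, here is a minimal sketch of the one-way ANOVA F statistic in plain Python. The function name and the sample data are hypothetical. It stops at the F statistic and degrees of freedom; turning those into a p-value needs an F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical satisfaction scores for three competing products.
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(scores))
```

A large F means the group means differ by more than the spread within the groups would explain. It does not say which mean differs.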
Compare 2+ Completion Rates or Multiple Categorical Variables (Chi-Square Test)
The Chi-Square test calculator lets you compare 2 to 10 completion rates (binary responses), or data with multiple categories, up to 10 (e.g. from sub-categories such as experience level).
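The statistic behind this comparison can be sketched in a few lines. The table layout (one row per product, columns for passed and failed) and the counts are assumptions chosen for illustration, and the sketch returns only the statistic and its degrees of freedom, not a p-value.

```python
def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for a table
    of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: [passed, failed] for two products.
print(chi_square_stat([[10, 20], [20, 10]]))
```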
Compare the Same Users on Different Applications
You can get more information with a smaller sample size by having the same users attempt tasks on different applications. A special statistical test (the Paired t-test) is used to analyze the data and is included in this package. Enter the raw task times or satisfaction scores for both applications to quickly look for significant differences. You get graphs, confidence intervals and statistical results. The calculator comes with a “How to Report” section which makes communicating the results to stakeholders easier.
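The reason the paired design needs fewer users is that the test works on each user's difference between the two applications, which removes user-to-user variation. A minimal sketch, with hypothetical data and function name:

```python
import math
from statistics import mean, stdev


def paired_t(times_a, times_b):
    """Paired t statistic and degrees of freedom.

    times_a[i] and times_b[i] must come from the same user.
    """
    diffs = [a - b for a, b in zip(times_a, times_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical task times (seconds) for four users on two applications.
app_a = [10, 12, 14, 16]
app_b = [8, 11, 11, 12]
print(paired_t(app_a, app_b))
```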
Compare Two Task Times or Satisfaction Scores
Enter the raw task time or satisfaction data for two groups (e.g. from competing products or different versions) to quickly look for significant differences. The calculator provides both graphs and numerical results, including how large a difference was observed (effect size). The calculator comes with a “How to Report” section which makes communicating the results to stakeholders easier.
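One common way to express effect size for two independent groups is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below assumes that definition and the equal-variance t statistic; the calculator may report a different measure.

```python
import math
from statistics import mean, variance


def two_sample_summary(a, b):
    """Return (t, cohens_d) for two independent samples,
    assuming roughly equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    mean_diff = mean(a) - mean(b)
    t_stat = mean_diff / math.sqrt(pooled_var * (1 / na + 1 / nb))
    cohens_d = mean_diff / math.sqrt(pooled_var)
    return t_stat, cohens_d


# Hypothetical satisfaction scores from two product versions.
print(two_sample_summary([2, 4, 6], [1, 3, 5]))
```

The t statistic grows with sample size while d does not, so d is the number to quote when stakeholders ask how big the difference is rather than whether it is real.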
Compare Completion Rates
Determine whether two completion rates (e.g. from competing products or different versions) are statistically different. Just enter the total users tested and number successful for each application and get the statistical results.
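The inputs described above (users tested and number successful for each application) are all a two-proportion test needs. As a sketch, here is the classic pooled z-test. Which test the calculator actually runs is not stated here, so treat this as one reasonable option, and note that it is only approximate for small samples.

```python
import math
from statistics import NormalDist


def two_proportion_z(success_1, n_1, success_2, n_2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    p1, p2 = success_1 / n_1, success_2 / n_2
    pooled = (success_1 + success_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: 18 of 20 users succeeded on one design,
# 12 of 20 on the other.
z, p_value = two_proportion_z(18, 20, 12, 20)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```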
Test a Completion Rate against a Criterion & Confidence Interval
Enter the number of users who completed the task and the total attempted, and the calculator provides the confidence interval and a graph of the results. You can also enter a test criterion (e.g. 70%) to see if the observed completion rate is statistically higher than the test criterion.
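Both outputs can be sketched with the standard library. The interval below is the adjusted-Wald interval and the test is an exact one-sided binomial test. Both are assumptions about reasonable methods for small usability samples, not a statement of what the calculator does internally.

```python
import math
from statistics import NormalDist


def adjusted_wald_ci(successes, n, confidence=0.95):
    """Adjusted-Wald interval: add z^2/2 successes and z^2 trials,
    then apply the usual Wald formula."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def p_value_above_criterion(successes, n, criterion):
    """Chance of seeing at least `successes` completions out of n
    if the true completion rate were exactly `criterion`."""
    return sum(
        math.comb(n, k) * criterion ** k * (1 - criterion) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical test: 9 of 10 users completed the task, criterion 70%.
low, high = adjusted_wald_ci(9, 10)
print(f"95% CI: {low:.1%} to {high:.1%}")
print(f"one-sided p = {p_value_above_criterion(9, 10, 0.70):.3f}")
```

With only 10 users, even a 90% observed rate does not clearly beat a 70% criterion, which is exactly the kind of thing the interval makes visible.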
Test a Task Time against a Criterion & Confidence Interval
Just paste the raw task times and the calculator will transform the times to adjust for non-normality. You’ll instantly see a graph and confidence interval for the data. You can also enter a criterion (e.g. 100 seconds) to see if the observed mean time is statistically higher than the test criterion.
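Task times are usually skewed by a few slow users, and a common remedy is to analyze the logarithm of each time. The sketch below assumes a natural-log transform (which transform the calculator applies is not stated here) and reports the geometric mean along with a one-sample t statistic against the criterion on the log scale.

```python
import math
from statistics import mean, stdev


def log_time_summary(times, criterion):
    """Return (geometric_mean, t, df) for task times against a criterion.

    Assumes a natural-log transform; all times must be positive.
    """
    logs = [math.log(t) for t in times]
    n = len(logs)
    geometric_mean = math.exp(mean(logs))  # back on the seconds scale
    standard_error = stdev(logs) / math.sqrt(n)
    t_stat = (mean(logs) - math.log(criterion)) / standard_error
    return geometric_mean, t_stat, n - 1


# Hypothetical task times in seconds, tested against 100 seconds.
print(log_time_summary([40, 55, 60, 75, 210], criterion=100))
```

A positive t means the typical time is above the criterion and a negative t means it is below.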
Sample Size and Power for Comparing Two Proportions
I’ve read through dozens of really boring statistics papers and Cohen’s Power Analysis to build an accurate and easy way to find the sample size and power needed when you compare two completion rates. The calculator walks you through a 3-step process to generate an effect size estimate, power calculations and the number of users needed to detect differences in completion rates.
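The three steps (effect size, power, number of users) can be sketched with Cohen's effect size h for two proportions and a normal approximation. The function name and defaults are assumptions for illustration, and the calculator's own numbers may differ.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Return (h, n_per_group) for comparing two completion rates.

    h is Cohen's effect size: the difference between the
    arcsine-transformed proportions.
    """
    # Step 1: effect size.
    h = abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))
    # Step 2: z values for the chosen alpha (two-sided) and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Step 3: users needed in each group.
    return h, math.ceil(((z_alpha + z_beta) / h) ** 2)


# Hypothetical example: expecting 90% vs 70% completion.
h, n = n_per_group_two_proportions(0.90, 0.70)
print(f"h = {h:.2f}, users per group = {n}")
```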