Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

How many users do people actually test?

Jeff Sauro • November 2, 2010

If you're familiar with usability testing then you're familiar with the magic number 5.

Five users will, on average, find most of the problems that affect at least one-third of your users. If problems are less common, you will need to test more users to find and fix them.

On many high-traffic websites, usability problems affect fewer than 1 in 10 users, so a larger sample size is needed. On business applications, problems are about three times more common, so smaller sample sizes will often suffice to find an equal number of problems.
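The reasoning behind these numbers is usually expressed with the binomial discovery formula 1 - (1 - p)^n, where p is the share of users a problem affects and n is the number of users tested. The article does not spell out the formula, so treat the sketch below as an assumption about the underlying model; the function names and the 80% discovery target are illustrative only.

```python
import math


def discovery_probability(p: float, n: int) -> float:
    """Chance that a problem affecting a fraction p of users
    is seen at least once when n users are tested."""
    return 1.0 - (1.0 - p) ** n


def users_needed(p: float, target: float) -> int:
    """Smallest n for which discovery_probability(p, n) >= target."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    # A problem affecting one-third of users, tested with 5 users.
    print(round(discovery_probability(1 / 3, 5), 3))
    # A problem affecting 1 in 10 users, with an illustrative 80% target.
    print(users_needed(0.10, 0.80))
```

Under this model, five users have roughly an 87% chance of seeing a problem that affects one-third of users, while a problem affecting 1 in 10 users needs 16 users to reach the illustrative 80% target.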

For comparing products or estimating completion rates or task times, the sample size formulas are different and depend on how precise you need to be and on the variability in the sample.
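As a rough illustration of how precision and variability drive the sample size, here is a minimal sketch for a completion rate using the textbook normal-approximation (Wald) formula n = z^2 * p(1 - p) / d^2. This is an assumption for illustration, not necessarily the formula the author has in mind, and it is known to be crude for small samples; the function name and default values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin: float,
                               expected_rate: float = 0.5,
                               confidence: float = 0.95) -> int:
    """Approximate number of users needed to estimate a completion rate
    to within +/- margin, using the normal (Wald) approximation.

    expected_rate = 0.5 is the most conservative (highest-variability) guess.
    """
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be between 0 and 1")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    variance = expected_rate * (1.0 - expected_rate)
    return math.ceil(z ** 2 * variance / margin ** 2)


if __name__ == "__main__":
    for margin in (0.20, 0.10, 0.05):
        print(margin, sample_size_for_proportion(margin))
```

Halving the margin of error roughly quadruples the required sample, which is why precise estimates call for far more users than problem discovery does.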

Despite the formulas and controversy, people ultimately decide on how many users they test.

To find out how many users people test, I asked subscribers to measuringusability.com last year how many users they tested in their most recent Formative and Summative usability tests.   In total, I received 130 responses and here's what people said.

Number of Users tested in Formative tests

There was a lot of variability in the responses. Of the 95 people who reported conducting a Formative test in the last two years, the median number of users tested was 10. Most people (82%) reported testing fewer than 15 users in total.



The mean number tested was 24 with a standard deviation of 54 (min 2 and max 350). These numbers are the total number of users tested, as many respondents reported testing multiple iterations of users. There were five tests with reported sample sizes above 100 that skewed the mean upward. These tests were of either websites or consumer products where a remote unmoderated test was conducted (e.g. to test the information architecture of a website). 81% reported testing more than 5 users.
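To see why a handful of very large tests pulls the mean so far above the median, consider the small sketch below. The sample sizes in it are made up for illustration and are not the survey responses; only the pattern (many small tests plus one large remote study) mirrors what is described above.

```python
from statistics import mean, median

# Hypothetical sample sizes, NOT the survey data: nine small moderated
# tests and one large remote unmoderated test.
hypothetical_sizes = [5, 6, 8, 8, 10, 10, 12, 12, 15, 150]

typical = median(hypothetical_sizes)   # resistant to the one large value
average = mean(hypothetical_sizes)     # pulled upward by the one large value

# Dropping the single large test shows how much it alone moves the mean.
without_outlier = [n for n in hypothetical_sizes if n <= 100]

if __name__ == "__main__":
    print("median:", typical)
    print("mean:", average)
    print("mean without the large test:", round(mean(without_outlier), 1))
```

This is why the median is the more useful summary of a typical test when the distribution of sample sizes is this skewed.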

Average Number of Users in Summative tests

Somewhat surprisingly, the number of users reported for Summative tests wasn't much different from the Formative number. Of the 68 respondents who had conducted a Summative usability test, most (70%) reported testing fewer than 15 users. The median number of users tested was 12.



The mean number tested was 27 with a standard deviation of 46 (min 4 and max 245). The largest differences between the Formative and Summative sample sizes were found between the smallest sample sizes (2-5 users) and largest sample sizes (20+ users).

Summative Tests have 3x more users

There are many variables that affect the number of users people test, such as the product type, budgets and industry. To control somewhat for this variability, I compared the sample sizes for the 33 respondents who reported conducting both Formative and Summative tests.

On average these respondents reported testing almost 3 times as many users on their most recent Summative test compared to their most recent Formative test (95% CI between 1.5 times and 4 times higher). So the graphs above mask this interesting relationship.

Corroborating Data

In generalizing results from a sample to the larger population, representativeness is more important than sample size. Subscribers to my email newsletter are perhaps more quantitatively focused, and that might bias their sample sizes upward. I looked at two other data sources to get an idea of how representative this data was.

In 2009, Jim Lewis and I reviewed 97 Summative datasets. We found the median number of users per test was 10, ranging from 4 to 296. Sixty-four percent of the tests had between 8 and 12 users and 80% had fewer than 20. These numbers are virtually identical to the Summative survey sample presented here.

In 2007, Hornbaek and Law reviewed dozens of datasets that appeared in HCI publications. The average number of users per study was 32 with a standard deviation of 29 (min 6 and max 181). They didn't distinguish between Formative and Summative tests in their analysis, but this figure also isn't far off from the sample data.

Both sources suggest this sample is reasonably representative of the larger population of usability tests.

Key Findings

People Test More than 5 Users:  Evaluators typically test more than 5 users. In this sample 81% tested more than 5 users in Formative tests and 91% tested more than 5 users in Summative tests.

Benchmark Your Sample Sizes: You can use this data as another benchmark when planning your next sample size. For example, if you plan on testing four rounds of four users (16 total) in a Formative usability test, you'd have a sample size larger than 80% of the Formative usability tests in this sample.

Remote-Testing will change the numbers: Cheap unmoderated usability testing services are allowing for much larger sample sizes, even in Formative usability tests. The data analyzed here is a year old, and the use of such tools has continued to grow in the last year. I suspect we'll see an increase in the average sample size, especially for Summative/benchmarking studies.

The difference between Formative and Summative Tests is blurring: There is a common belief that Formative tests (which inform design decisions) are small-sample qualitative studies and Summative tests are large-sample quantitative studies. The many large-sample Formative tests here suggest this distinction is blurring.

Surveys are reliable ways of evaluating sample sizes: It is a lot easier to just ask people how many users they tested than to go through the trouble of actually reviewing dozens of reports. One concern I had was that people might inflate their sample sizes because larger numbers sound better. The data here suggests that surveys can be a reliable method for estimating sample sizes.


Learn More

Jeff Sauro hosted a live webinar on February 28th, 2012 on Best Practices for Remote Usability Testing. The event was overbooked so if you missed it you can now view a recording.



Related Topics

Sample Size, Usability Testing

Posted Comments

There are 2 Comments

November 11, 2010 | Jeff Sauro wrote:

Dimiter,

You're right, it would be a lot more helpful to have that information. It was the first time I surveyed users on sample sizes and metrics. The results still tell you total sample sizes and for a lot of people this is still both relevant and interesting, especially because there's so little out there on what happens in practice. 


November 10, 2010 | Dimiter Simov wrote:

Jeff, the number of users tested per study is not very informative unless you know the number of different user groups covered by the study. In our studies, when we work with one group of users, we test 5-6 people, when we work with 2 groups, we test 8 to 12 people, and if we work with more groups, the number varies even more. 



About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 15 journal articles and 3 books on statistics and the user-experience.