Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

Comparison of Usability Testing Methods

Jeff Sauro • January 17, 2012

There was a time when speaking of usability testing meant expensive labs and one-way mirrors.

Not anymore.

There are three core ways of running usability tests.

Each has its advantages and disadvantages.
  • Lab-Based: This is the classic approach to usability testing: users physically come to a lab, often with a one-way mirror, and are observed by a team of researchers.

  • Remote Moderated: Users log into screen-sharing software like GoToMeeting and attempt tasks while a facilitator observes.

  • Remote Unmoderated: Software like UserZoom, Loop11 or Webnographer walks participants through tasks and records their click paths.

Here are some additional considerations:

 

| Attribute | Lab-Based | Remote Moderated | Remote Unmoderated |
|---|---|---|---|
| Geographic Diversity | Poor: Limited to 1 (or a few) locations. | Good: Users from across the US and the globe can participate. Time zone difference is the main drawback for international studies. | Good: Users from across the US and the globe can participate at times that are convenient to them. |
| Recruiting | More difficult because the geographic pool is limited to the testing location. | Easier because no geographic limitation, but sessions are still longer. | Easiest because no geographic limitation and shorter sessions. |
| Sample Quality | Good-Excellent: Limited to people willing to take time out of their day. Tight control over user activity. | Good-Excellent: Able to recruit specialized users at minor inconvenience and can view most interactions. | Fair-Good: Often attracts people who are in it for the honorarium or people who try to game the system. |
| Qualitative Insights | Excellent: Direct observation of both interface and user reactions. Facilitator can easily probe issues. | Good: Direct observation of interface and limited user reactions. Facilitator can ask follow-up questions and engage in a dialogue. | Fair-Good: If the session is recorded, direct observation of the interface. No recording: insights are gleaned from answers to specific questions. |
| Sample Size | More Restricted due to geographic limitation and time. | Less Restricted: Restricted by time to run studies but more flexible hours of scheduling. | Least Restricted: Easy to run large sample sizes (100+). |
| Costs | Most Expensive: Higher compensation costs for users and facilitator time. | Less Expensive: User compensation is lower and requires less facilitation time and no facility costs. | Least Expensive: Compensation is least expensive, doesn't require facilitation or facility costs. |
| Metric Quality | Excellent: You can collect almost any measure (including eye-tracking) and task time. | Good-Excellent: Some metrics are limited (eye-tracking) but task-time data can still be collected. | Good: Because you don't know what users are doing. |
| Reported Usage by User Researchers* | 52% | 50% | 23% |
| Growth in Method* | Flat | 19% Increase | 28% Increase |

* Data come from the 2011 UPA Survey

No one method is always best. A combination of methods provides a more comprehensive picture of the user experience. For example, I often add a few lab-based or remote moderated participants when I conduct an unmoderated study. It provides the best of both worlds: rich interaction and discussion, plus larger sample sizes and a more diverse and representative group.

Combining is not always an option. In my experience, the two biggest drivers of the method chosen are budget and sample size. If you want to test a lot of users (or test several user groups) but have a limited budget, then remote unmoderated testing is usually the way to go. Conversely, mobile testing is still largely done in the lab in order to capture swipes and screens.

To help decide which method to use, consider which factors are most important in your research:
  • Could the product or website being tested see significant benefits by drawing responses from an international audience? (remote moderated)
  • Does the interface being tested require a more in-depth look at direct, in-person responses? (lab)
  • Is a single function being evaluated, where simple answers will satisfy simple questions? (remote unmoderated)
  • Are the tasks closed-ended and easy for participants to understand and attempt? (remote unmoderated)
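
The checklist above can be read as a simple lookup from study needs to candidate methods. The sketch below is one hypothetical way to write it down in Python. The `StudyNeeds` fields and the `suggest_methods` function are names invented for illustration and do not come from any tool mentioned in this article. It covers only the four questions listed and ignores budget and sample size, which often decide the matter in practice.

```python
from dataclasses import dataclass


@dataclass
class StudyNeeds:
    """Hypothetical yes/no answers to the four checklist questions."""
    international_audience: bool = False       # benefit from international responses?
    needs_in_person_observation: bool = False  # need direct, in-person reactions?
    single_function: bool = False              # one function, simple questions?
    closed_ended_tasks: bool = False           # tasks closed-ended and easy to attempt?


def suggest_methods(needs: StudyNeeds) -> list[str]:
    """Return the methods the checklist points to, in question order.

    This mirrors the parenthetical hints in the checklist; it is a
    starting point for discussion, not a rule.
    """
    suggestions = []
    if needs.international_audience:
        suggestions.append("remote moderated")
    if needs.needs_in_person_observation:
        suggestions.append("lab-based")
    if needs.single_function or needs.closed_ended_tasks:
        suggestions.append("remote unmoderated")
    return suggestions


if __name__ == "__main__":
    study = StudyNeeds(needs_in_person_observation=True, closed_ended_tasks=True)
    print(suggest_methods(study))
```

When more than one method comes back, that is a hint that a combined study (for example, a few lab sessions alongside an unmoderated test) may be worth considering.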



About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 20 journal articles and 4 books on statistics and the user experience.



Posted Comments

There are 9 Comments

July 15, 2013 | Danielle wrote:

This is an excellent table of testing methods with in-depth explanation of pros and cons of the techniques. I recently wrote an article about this topic and loved asking other UX professionals which testing technique they preferred. The results were nice and varied. If you'd like to check the results out to see what might work best for you, feel free. I'd like to get your feedback as well!

http://blog.usabilla.com/the-top-5-user-testing-methods-of-ux-professionals/


January 24, 2012 | Anne-Marie McReynolds wrote:

Thanks for your feedback. How many remote users do you recommend testing (i.e. UserTesting.com) for formative evaluation? Are three users (per their introductory offer of three users for $29) enough?  


January 18, 2012 | Jim wrote:

Another great User Testing service is http://www.usertesting.com 


January 18, 2012 | Jeff Sauro wrote:

Anne-Marie,

I'm impressed with how you tied those blogs together (even if it revealed some inconsistency or change in thinking).

One of the main reasons remote unmoderated testing gets a "Good" instead of an "Excellent" rating for Metric Quality is the issue of task time reliability. If a user goes onto Facebook or answers the phone during an unmoderated session and we don't know it, it certainly affects the quality of our task time metric. The data suggest it doesn't impact completion rates or task- and test-level satisfaction as much.

I also gave it "Good" instead of "Poor" because of the ability to record users for a reasonable cost. The last few unmoderated tests I've run I've had recordings of the users (from Usertesting.com and YouEye.com). This allows you to go back and qualify unusual times. The cost has gone down and technology has improved enough in just 2 years that even task-times (when properly qualified) are becoming more reliable from remote unmoderated tests. 


January 18, 2012 | Anne-Marie McReynolds wrote:

You previously stated (in "97 Things to Know about Usability") that "task time data from remote tests were an unreliable measure of actual user task time," citing a 30% difference between remote-unmoderated usability tests and lab-based tests. What changed your opinion of the metrics?  


January 18, 2012 | Jeff Sauro wrote:

Colin,

You're right, I added the source note to the table to make it clear the data came from the 2011 UPA Survey.  


January 18, 2012 | Colin McCormick wrote:

I like the comparison between the three methods but where do the statistics come from? 52% and 50% seem trivial without any context.  


January 18, 2012 | John Romadka wrote:

One other attribute that would be helpful in your chart of testing methods might be "Environment". Some methods (and tools) are better/worse suited for web sites vs. web apps (or even desktop apps). 


January 18, 2012 | John Romadka wrote:

Great article! I wanted to share it with all my circles on G+, but I couldn't find a "+1" button. It just so happens that I know what the URL is for adding that button: http://www.google.com/webmasters/+1/button/

Perhaps you could add it to your site? Me and the other "four" people on G+ could use it.

P.S. - Also, maybe you should make your spam equations more difficult so you can elevate the quality of comments, so people like me can't ask you to add a G+ button. <elbow, elbow, wink> 

