The four corners of usability measurement
Jeff Sauro • August 16, 2011
There isn't a usability thermometer to tell us how usable an interface is. We observe the effects and indicators of bad interactions then improve the design.
There isn't a single silver bullet technique or tool which will uncover all problems. Instead, practitioners are encouraged to use multiple techniques and triangulate to arrive at a more complete set of problems and solutions.
Triangles of course have three sides, but I like to think of usability measurement as having four corners—more like quadrangulation (like that quad from college). Here are those four corners of usability measurement.
1. User Testing
This is what comes to mind when you think of usability, and for good reason. The root of the word usability is use: you need to watch people actually using the interface. While this may seem obvious, there is still a tendency to think we will know the problems users will have because, after all, we can use the software too.
While many problems can be avoided with better code, designs and expert inspections, the unique combination of tasks and users consistently generates surprising results.
There are basically two types of usability tests:
- Problem Discovery Tests (Formative): Discover problems users have with the interface and fix them.
- Benchmark Tests (Summative): Determine how usable an application is based on how well users can complete tasks.
How it Works: A usability expert or someone trained in observing user behavior generates a set of task-scenarios for a sample of users to attempt in a moderated or unmoderated setting. Metrics are collected (usability problems, task completion rates, task-times, errors and perceived satisfaction metrics). Moderated testing is best used for identifying usability problems and unmoderated testing is best for larger sample sizes (more precise metrics) and well defined tasks.
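To see why larger samples give more precise metrics, here is a minimal sketch that puts a confidence interval around a task completion rate. It assumes the adjusted-Wald (Agresti-Coull) interval, which is one common choice for small samples and is not prescribed above; the function name and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(successes, n, confidence=0.95):
    """Adjusted-Wald (Agresti-Coull) interval for a task completion rate."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 successes and z^2/2 failures before computing the proportion.
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Same observed 80% completion rate at two hypothetical sample sizes.
for successes, n in [(8, 10), (80, 100)]:
    low, high = adjusted_wald_interval(successes, n)
    print(f"{successes}/{n} completed: {low:.0%} to {high:.0%}")
```

The observed rate is identical in both cases, but the interval for 100 users is much narrower than the one for 10, which is the practical argument for unmoderated testing when you need a benchmark.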
2. Inspection Methods
Professionals trained in the art and science of human computer interaction can examine an interface for problems users are likely to encounter. These include problems like links in navigation which disappear when hovered over. The techniques of Heuristic Evaluations and Cognitive Walkthroughs fall under the auspices of Inspection Methods.
It is often just referred to as an expert review (Heuristic Evaluation does sound better) but it is more than just an expert's opinion. Inspection Methods are based on principles derived from observing and analyzing user actions and intentions.
How it Works: An expert in usability methods, and ideally in the domain being evaluated, reviews an interface against a set of criteria. These inspections are often done with an emphasis on learnability and use task-scenarios to help focus the evaluation.
Inspection methods tend to provide more coverage than user testing. There is some evidence they may generate too many false positives—that is, if there's no evidence a user will really have a problem with a design element, is it really a problem? Inspection Methods are best done with multiple evaluators and done in addition to user testing. They tend to find around half the problems found in a user test.
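One way to combine multiple evaluators with a user test is to pool what the evaluators found and compare it with what users actually ran into. The sketch below does that with plain sets. The problem IDs, evaluator names and function name are all invented for illustration and do not come from any real study.

```python
# Hypothetical problem IDs, invented for illustration only.
evaluator_findings = {
    "evaluator_a": {"P1", "P2", "P5", "P8"},
    "evaluator_b": {"P2", "P3", "P8", "P9"},
}
user_test_problems = {"P1", "P2", "P3", "P4", "P6", "P7"}


def compare_inspection_to_test(findings, observed):
    """Compare pooled inspection findings with problems seen in a user test."""
    if not observed:
        raise ValueError("the user test must have found at least one problem")
    # Pool the problems reported by every evaluator.
    inspected = set().union(*findings.values())
    confirmed = inspected & observed
    return {
        # Share of the user-test problems the inspection also found.
        "share_of_test_problems_found": len(confirmed) / len(observed),
        # Reported by experts but never observed: possible false positives.
        "unconfirmed": sorted(inspected - observed),
        # Observed with users but missed by every evaluator.
        "missed": sorted(observed - inspected),
    }


summary = compare_inspection_to_test(evaluator_findings, user_test_problems)
print(summary)
```

The unconfirmed list is where the false-positive question comes up: those items may be real problems the test sample didn't hit, or they may not be problems at all.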
3. Cognitive Modeling
Like Usability Inspection methods, Cognitive Modeling doesn't involve direct measurement of users. It instead relies on data from prior tests and general principles of human computer interaction.
Over the past 25 years, hundreds of users were asked to complete common software tasks and their times have been measured, refined and replicated by multiple researchers. The most commonly used cognitive modeling technique is Keystroke Level Modeling (KLM), which is a subset of GOMS.
How it Works: An evaluator determines the likely path users would take through an interface: what they'd click on, what they'd type and their decision points. The task is broken down into small chunks of work, typically lasting less than a second. For each segment, you use operators to estimate how long it would take a typical skilled user, committing no errors. The most common operators are clicking, pointing, typing and thinking.
Some research has indicated that the process of decomposing a task into small interaction chunks actually reveals many usability problems.
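The arithmetic behind a KLM estimate is just a sum of operator times. Here is a minimal sketch, assuming operator durations commonly cited in the KLM literature (roughly 0.28 s per keystroke for an average typist, 1.1 s to point, 0.2 s to click, 1.35 s to think). Those values, the single-letter codes and the search task are assumptions for illustration, so substitute the values from whichever KLM reference you follow.

```python
# Assumed operator durations in seconds (commonly cited KLM values).
OPERATOR_SECONDS = {
    "K": 0.28,  # type one keystroke (average typist)
    "P": 1.10,  # point to a target with the mouse
    "B": 0.20,  # click (press and release the mouse button)
    "M": 1.35,  # think (mental preparation before an action)
}


def estimate_task_time(sequence):
    """Sum operator times for an error-free path by a skilled user."""
    unknown = set(sequence) - set(OPERATOR_SECONDS)
    if unknown:
        raise ValueError(f"unknown operators: {sorted(unknown)}")
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical task: think, point at the search box, click, type "shoes",
# think, point at the search button, click.
search_task = "MPB" + "K" * len("shoes") + "MPB"
print(f"Estimated time: {estimate_task_time(search_task):.2f} s")
```

Writing out the operator string is the decomposition step itself, and it is often where an unnecessary click or an extra decision point becomes obvious.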
4. Standardized Questionnaires
After a usability test it is common to have users answer questions about the overall usability of the interface using a standardized questionnaire like the SUS for software or SUPR-Q.
You can also administer these questionnaires outside of a usability test to get a current measure of the perceived ease of use of the interface. An advantage to this approach is that because it can be administered like a survey, you can get results quickly and for a fraction of the cost of a moderated usability test.
How it Works: Current users of a website or product are asked to answer standardized questions about the overall application usability. The resulting score provides a baseline measurement which is used as a benchmark against which to compare future design iterations.
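As a concrete example of turning answers into a score, here is a sketch of the standard published SUS scoring: ten items on a 1 to 5 scale, odd-numbered items positively worded and even-numbered items negatively worded, scaled to 0 to 100. The scoring rule is not spelled out above, and the function name is my own.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly ten responses, each from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    # Each item contributes 0-4, so the raw total is 0-40; scale to 0-100.
    return total * 2.5


print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 3]))
```

Average the per-respondent scores to get the baseline for the current design, then repeat after each design iteration.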
Raw scores can also be converted into percentile ranks to get an idea of how an application's perceived usability compares to other websites and products. Questionnaires aren't terribly helpful for identifying usability problems and tend to be affected by task-difficulty when administered in user tests.
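A rough sketch of the percentile conversion, assuming the benchmark scores are approximately normally distributed. The mean of 68 and standard deviation of 12.5 used below are placeholder assumptions, not figures from this article; a real conversion should use the normative data for your questionnaire.

```python
from statistics import NormalDist


def percentile_rank(raw_score, benchmark_mean, benchmark_sd):
    """Percent of benchmark scores expected to fall below raw_score."""
    benchmark = NormalDist(mu=benchmark_mean, sigma=benchmark_sd)
    return 100 * benchmark.cdf(raw_score)


# Placeholder benchmark values, assumed for illustration only.
rank = percentile_rank(raw_score=80.5, benchmark_mean=68, benchmark_sd=12.5)
print(f"A raw score of 80.5 sits at about the {rank:.0f}th percentile")
```

A percentile is usually easier to explain to stakeholders than a raw score, since it answers "better than how many other products?" directly.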
There are many techniques that play important roles in improving the user experience that I didn't include. Some notable omissions include Contextual Inquiries, Log-File Analysis, Surveys and Focus Groups. Each of these can help measure usability but their central role lies more in defining users, their goals, the tasks they attempt and utility and functionality, all of which are topics for another blog.