Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

Do severe problems affect more users than trivial ones?

Jeff Sauro • March 25, 2010

While testing with five users might reveal 85% of the problems that affect 31% or more of users (given a set of tasks and a user type), that doesn't mean you're finding 85% of the critical problems. Are severe usability problems likely to occur more frequently, less frequently, or is problem severity independent of frequency?
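The 85% figure follows from a simple binomial argument: if a problem affects a proportion p of users, the chance of seeing it at least once among n users is 1 - (1 - p)^n. The sketch below is only an illustration; the function name is made up here, and it assumes every participant independently has the same chance of hitting the problem.

```python
def discovery_rate(p: float, n: int) -> float:
    """Chance that a problem affecting a proportion p of users
    is observed at least once in a test with n users.

    Assumes users encounter the problem independently with equal probability.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Compare a common problem (31% of users) with a rare one (5% of users).
for p in (0.31, 0.05):
    for n in (5, 10):
        print(f"p={p:.2f}, n={n}: {discovery_rate(p, n):.1%}")
```

With p = 0.31 and five users the result is roughly 84%, which is where the familiar "85%" comes from. A problem affecting only 5% of users has under a one-in-four chance of appearing in the same test, whatever its severity.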

The data on this is mixed. Bob Virzi's 1992 paper showed an association between frequency and severity (more severe problems were encountered by more users). The follow-up paper by Lewis (1994) found frequency and severity to be independent. There hasn't been much additional data addressing this issue since then.

One complication in this matter is the definition of critical. Rating a problem's severity is typically not an objective task. It usually involves thinking about the potential impact of the problem (losing work, crashing a system or making a trivial and correctable error). Evaluators typically assign a severity code (1-3, 1-4 or 1-7) to the problem. These severity codes often correspond loosely to adjectives such as cosmetic, trivial, moderate, critical/severe and, my personal favorite, catastrophe.

When assigning severity codes to problems, it would be hard not to think about how many users would be impacted by the problem. That is, all things being equal, if two problems have about the same medium negative consequences, but one affected only one user in a usability test and the other affected most users, the latter would be considered more severe. Combining severity and frequency into one rating can be called criticality (I think Jeff Rubin gets credit for that term), and I suspect many practitioners take this approach.
 
Even if the people who rate problem severity are different from those who know the problem frequency, problem statements can often contain hints about total impact based on whether the functionality is obscure or common (think of a problem on a homepage vs. one on the Terms & Conditions page).

Take Away

Assume severity and frequency are not related
It would be nice if more severe problems affected more users. If they did, we'd really get the bang for the buck with small sample sizes.  Until we have such evidence we should assume there is no association between severity and frequency.

Think in terms of classes of problems
In addition to frequency and severity, Bob Virzi suggests we should be thinking about whether or not there are "classes of problems that, if present, tend to affect lots of people, and also tend to be hard to recover from for individuals." An example of a class of problems would be problems related to the Microsoft Office ribbon (something that would affect a lot of users, but not necessarily be hard to recover from).

Quantify problem frequency and severity in formative tests
This is another reason to quantify and categorize the problems you find during a formative usability test. Qualitative feedback is essential for usability improvements, but with a little more effort you should also note:
  • Total number of users that experienced a problem.
  • Which users experienced the problem.
  • Problem severity.

Having this information will allow you and your organization to quantitatively show how usability and design changes reduced the number and severity of problems.
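One possible way to keep this record is a small problem log like the sketch below. The class, the function name, the 1-4 severity scale and the observations are all made up for illustration; any spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    description: str
    severity: int       # hypothetical scale: 1 = cosmetic ... 4 = critical
    users: frozenset    # IDs of the participants who hit the problem


def summarize(problems, n_participants):
    """Return one row per problem with its count, frequency, and severity."""
    if n_participants <= 0:
        raise ValueError("n_participants must be positive")
    rows = [
        {
            "description": p.description,
            "severity": p.severity,
            "count": len(p.users),
            "frequency": len(p.users) / n_participants,
            "users": sorted(p.users),
        }
        for p in problems
    ]
    # Most severe first; ties broken by how many participants were affected.
    return sorted(rows, key=lambda r: (-r["severity"], -r["count"]))


# Made-up observations from a five-participant formative test.
observed = [
    Problem("Could not find the export option", 3, frozenset({"P1", "P2", "P4"})),
    Problem("Label typo on settings page", 1, frozenset({"P3"})),
    Problem("Unsaved work lost on timeout", 4, frozenset({"P5"})),
]

for row in summarize(observed, n_participants=5):
    print(f'{row["description"]}: severity {row["severity"]}, '
          f'{row["count"]}/5 users ({row["frequency"]:.0%}), seen by {row["users"]}')
```

Keeping severity and frequency in separate columns, rather than in one combined rating, makes it possible to compare the same log before and after a design change.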

References

  1. Rubin, Jeff (2008) Handbook of Usability Testing (2nd Edition)
  2. Lewis, James (1994) "Sample Sizes for Usability Studies: Additional Considerations" in Human Factors 36(2) p. 368-378
  3. Virzi, Robert (1992) "Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough?" in Human Factors (34) p. 457-468
 


Posted Comments

There are 4 Comments

March 30, 2010 | Jeff Sauro wrote:

Jim,

Thanks for your FMEA comment. I'm a big fan of tools like the FMEA and QFD as I've used them in past Six Sigma projects. They use simple math (multiplication and addition) but help systematize thinking. I'd love to see your adaptation of it to usability problems. 


March 30, 2010 | Asbjorn Folstad wrote:

Thanks, Jim, for recommending FMEA. Do you have any publicly available papers or descriptions of FMEA for usability to recommend? 


March 26, 2010 | Jim Jarrett wrote:

Have a look at "Failure Mode Effects Analysis" or FMEA. That design analysis method includes a calculation of a Risk Priority Number. Basically, you take three factors - likelihood, impact, and frequency - assign strict definitions for what a value of 1 to 10 for each means, and then multiply the values together for any specific issue. This provides an amplification effect: high likelihood, impact, and frequency numbers drive high RPNs, so you don't have to judge the difference between a 13 and a 15, but rather an 850 vs. a 200. I tailored the method to usability at Rockwell to help drive priority decision making.
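The Risk Priority Number described in this comment could be sketched roughly as follows. This is not the adaptation used at Rockwell, which isn't shown here; the function name and the example ratings are made up, and the three factors are simply the ones named in the comment.

```python
def risk_priority_number(likelihood: int, impact: int, frequency: int) -> int:
    """Multiply three 1-10 ratings into a single priority score (1 to 1000)."""
    ratings = {"likelihood": likelihood, "impact": impact, "frequency": frequency}
    for name, value in ratings.items():
        if not isinstance(value, int) or not 1 <= value <= 10:
            raise ValueError(f"{name} must be an integer from 1 to 10")
    return likelihood * impact * frequency


# Hypothetical ratings for two usability problems.
print(risk_priority_number(9, 10, 9))   # 810
print(risk_priority_number(5, 8, 5))    # 200
```

Because the ratings are multiplied rather than added, two problems whose ratings sum to 28 and 18 end up with scores of 810 and 200, which is the amplification effect the comment describes.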


March 26, 2010 | Asbjorn Folstad wrote:

Great post addressing a relevant issue. I would like to suggest an argument that may complicate the relationship between problem severity and frequency even further: The severity of a usability problem depends on the system context, of which the user is an important part. Thus, the same UI-element may affect different users differently, even though they participate in the same usability test. A typical situation seems to be that some users experience no problem when interacting with a given UI-element, others experience minor problems, and yet others experience major problems (catastrophes :-).

In this situation, it does not seem right to associate the UI-element with a single severity rating independently of a frequency calculation.

Personally, I have tried to get around this by providing ratings that explicitly include both frequency and severity judgments (e.g., a critical usability problem could be defined as "either >20% of the participants experienced major obstructions/delays OR >50% experienced obstructions/delays in total").

I would really appreciate comments or feedback on this issue, and my approach to get around it.
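The combined rule quoted in this comment could be expressed as a small check like the one below. It is only a sketch: the function name is made up, and it assumes each participant is counted once, as having experienced either a major or a minor obstruction, or none.

```python
def is_critical(n_major: int, n_minor: int, n_participants: int) -> bool:
    """Apply the rule: critical if >20% of participants had major
    obstructions/delays OR >50% had obstructions/delays of any kind."""
    if n_participants <= 0:
        raise ValueError("n_participants must be positive")
    if n_major < 0 or n_minor < 0 or n_major + n_minor > n_participants:
        raise ValueError("counts must be non-negative and not exceed n_participants")
    major_share = n_major / n_participants
    any_share = (n_major + n_minor) / n_participants
    return major_share > 0.20 or any_share > 0.50


# Hypothetical ten-participant test.
print(is_critical(3, 0, 10))   # True: 30% had major obstructions
print(is_critical(2, 3, 10))   # False: exactly 20% major and 50% in total
print(is_critical(1, 5, 10))   # True: 60% had some obstruction
```

Note that both thresholds are strict, so a problem sitting exactly at 20% major or 50% in total is not flagged.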

