Jeff Sauro • June 14, 2004

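The benefit of using a z-score in usability metrics was explained in "What's a Z-Score and why use it in Usability Testing?" This article discusses different ways of calculating a z-score. The short answer is: it depends on your data and what you're looking for. If you've encountered the z-score in a statistics book, you usually get a formula like z = (x - μ) ÷ σ, where x is a raw score, μ is the mean, and σ is the standard deviation.

For example, suppose you scored 630 on the verbal section and 700 on the quantitative section of a test, and the sample of other test takers had the following means and standard deviations: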

| Verbal | Quantitative |
Mean | 469 | 591 |
StDev | 119 | 148 |

By plugging in your scores you get the following:

Verbal z = (630 - 469) ÷ 119 = 1.35σ

Quantitative z = (700 - 591) ÷ 148 = .736σ
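The two calculations above can be reproduced with a few lines of Python. This is a minimal sketch of the arithmetic only; the function name `z_score` and its parameter names are illustrative and do not come from the article.

```python
def z_score(raw_score: float, mean: float, st_dev: float) -> float:
    """Return how many standard deviations raw_score lies from the mean."""
    if st_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / st_dev


# Scores, means, and standard deviations taken from the example above.
verbal_z = z_score(630, 469, 119)
quantitative_z = z_score(700, 591, 148)

print(f"Verbal z = {verbal_z:.2f}")              # 1.35
print(f"Quantitative z = {quantitative_z:.3f}")  # 0.736
```

A positive result means the score is above the mean of the sample, and a negative result means it is below.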

To convert these sigma values into percentages, you can look them up in a standard z-table, use the Excel formula =NORMSDIST(1.35), or use the Z-Score to Percentile Calculator (choose 1-sided). The results are 91% for Verbal and 77% for Quantitative. You can see where each score falls within the sample of other test takers, and also that the verbal score was better than the quantitative score. Assuming the sample data were normally distributed, here's how the scores would look graphically:

Sample 1 |
USL: 120 |

To calculate the process sigma, subtract the target (120) from the sample mean (104) and divide by the sample standard deviation (12). For Sample 1 the process sigma is -1.32σ. The visual representation of the data can be seen below:

In the case of task times, a negative process sigma is ideal, because you want more people completing the task below the target time, not above it. You can simply drop the negative sign when communicating the results if it causes confusion. If you made radical improvements to the UI and then sampled another set of ten users, the results might look like this:

Sample 2 | 60 75 99 88 65 72 75 72 87 65 |
USL: 120 | Mean: 75.8 | StDev: 12.14 |

In the redesign, the mean of the new sample is well below the spec limit, and the process sigma is now much larger in magnitude: (75.8 - 120) ÷ 12.14 ≈ -3.64σ. The corresponding defect area is now only about 0.01%, and the quality area is about 99.99%.
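The Sample 2 figures can be checked with Python's standard library. The sketch below assumes the sign convention that makes the result negative when the mean is under the spec limit (mean minus USL, divided by the sample standard deviation) and assumes task times are normally distributed. The names `process_sigma`, `quality_area`, and `defect_area` are illustrative, not the article's.

```python
from statistics import NormalDist, mean, stdev


def process_sigma(task_times, upper_spec_limit):
    """Return (mean - USL) / sample standard deviation.

    Negative values mean the average task time is below the spec limit.
    """
    sample_mean = mean(task_times)
    sample_sd = stdev(task_times)  # sample standard deviation (n - 1 denominator)
    return (sample_mean - upper_spec_limit) / sample_sd


# Sample 2 task times and spec limit from the table above.
sample_2 = [60, 75, 99, 88, 65, 72, 75, 72, 87, 65]
usl = 120

sigma = process_sigma(sample_2, usl)

# Assumption: normally distributed task times. The quality area is the
# proportion of the curve at or below the spec limit; the defect area is
# the remainder above it. NormalDist().cdf plays the role of NORMSDIST.
quality_area = NormalDist().cdf(-sigma)
defect_area = 1 - quality_area

print(f"Mean: {mean(sample_2):.1f}  StDev: {stdev(sample_2):.2f}")
print(f"Process sigma: {sigma:.2f}")
print(f"Defect area: {defect_area:.4%}  Quality area: {quality_area:.4%}")
```

Working from the raw task times rather than the rounded mean and standard deviation avoids small rounding differences in the final sigma value.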

Of course, having users perform that far below the spec limit is not very common, due to the inherent variability in user performance.

If you need more help with z-scores, see the Crash course in Z-scores, a tutorial with plenty of pictures, examples and review questions for you to grasp this concept.
