Jeff Sauro • June 14, 2004

The benefit of using a z-score in usability metrics was explained in "What's a Z-Score and why use it in Usability Testing?" This article discusses different ways of calculating a z-score. The short answer is: it depends on your data and what you're looking for. If you've encountered the z-score in a statistics book, you usually get some formula like:

z = (x - μ) ÷ σ

where x is the raw score, μ is the mean, and σ is the standard deviation.

| | Verbal | Quantitative |
|---|---|---|
| Mean | 469 | 591 |
| StDev | 119 | 148 |

By plugging in your scores you get the following:

Verbal z = (630 - 469) ÷ 119 = 1.35σ

Quantitative z = (700 - 591) ÷ 148 = .736σ
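As a rough sketch, the two calculations above can be reproduced in a few lines of Python. The function names `z_score` and `percentile` are illustrative choices, not part of the original article, and the percentile step assumes the scores are normally distributed (it is the same one-sided lookup a z-table or Excel's NORMSDIST performs).

```python
from statistics import NormalDist


def z_score(score: float, mean: float, stdev: float) -> float:
    """Number of standard deviations a raw score lies from the mean."""
    return (score - mean) / stdev


def percentile(z: float) -> float:
    """One-sided area to the left of z under a standard normal curve, in percent."""
    return NormalDist().cdf(z) * 100


# Scores, means, and standard deviations taken from the example above.
verbal_z = z_score(630, mean=469, stdev=119)
quant_z = z_score(700, mean=591, stdev=148)

print(f"Verbal:       z = {verbal_z:.2f}, percentile = {percentile(verbal_z):.0f}%")
print(f"Quantitative: z = {quant_z:.3f}, percentile = {percentile(quant_z):.0f}%")
```

Because both scores end up on the same scale, they can be compared directly even though the two tests have different means and spreads.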

To convert these sigma values into percentages, you can look them up in a standard z-table, use the Excel formula =NORMSDIST(1.35), or use the Z-Score to Percentile Calculator (choose 1-sided). The resulting percentages are 91% Verbal and 77% Quantitative. You can see where your score falls within the sample of other test takers and also see that, relative to those test takers, the verbal score was better than the quantitative score. Assuming the sample data was normally distributed, here's how the scores would look graphically:

Sample 1 |

USL: 120 |

To calculate the process sigma, you subtract the target (120) from the sample mean (104) and divide by the sample standard deviation (12). For Sample 1 the process sigma is -1.32σ. The visual representation of the data can be seen below:

In the case of task times, a negative process sigma is ideal, since you want more people completing the task below the target time, not above it. You can simply drop the negative sign when communicating the results if it causes confusion. If you were to make radical improvements to the UI and then sample another set of ten users, here are the new results:

Sample 2 |

60 75 99 88 65 72 75 72 87 65 |

USL: 120 Mean: 75.8 StDev: 12.14 |

In the redesign, the average of the new sample is well below the spec limit and the process sigma is now very high. The corresponding defect area is now only 0.01% and the quality area is 99.99%.

Of course, having users perform that far below the spec limit is not very common, due to the inherent variability in user performance.
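The Sample 2 figures can be checked with a short Python sketch. It makes two assumptions that are not spelled out in the article: task times are modeled as normally distributed, and the process sigma is signed so that it is negative when the sample mean sits below the upper spec limit (USL), matching the negative value reported for Sample 1. The variable names are illustrative.

```python
from statistics import NormalDist, mean, stdev

# Task times for Sample 2 and the upper spec limit, as listed above.
sample_2 = [60, 75, 99, 88, 65, 72, 75, 72, 87, 65]
usl = 120

sample_mean = mean(sample_2)
sample_sd = stdev(sample_2)  # sample standard deviation (n - 1 denominator)

# Assumed sign convention: negative when the mean is below the spec limit.
process_sigma = (sample_mean - usl) / sample_sd

# Under a normal model fitted to the sample, the quality area is the share of
# task times expected at or below the USL; the defect area is the remainder.
model = NormalDist(mu=sample_mean, sigma=sample_sd)
quality_area = model.cdf(usl)
defect_area = 1 - quality_area

print(f"Mean: {sample_mean:.1f}  StDev: {sample_sd:.2f}")
print(f"Process sigma: {process_sigma:.2f}")
print(f"Quality area: {quality_area * 100:.2f}%  Defect area: {defect_area * 100:.2f}%")
```

Fitting the normal model to the sample and evaluating it at the spec limit is equivalent to looking up the magnitude of the process sigma in a z-table.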

If you need more help with z-scores, see the Crash course in Z-scores, a tutorial with plenty of pictures, examples and review questions for you to grasp this concept.
