Measuring Usability Homepage
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

Graphs & Calculators
Z-Score to Percentile Calculator December 3, 2007
Look up the area under the normal curve (one-sided or two-sided) from a standard score (Z-score).
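
As a rough illustration of this kind of lookup, the sketch below uses Python's standard-library `statistics.NormalDist`. The function name `z_to_area` is hypothetical, and reading "two-sided" as the central area between -z and +z is an assumption, not a description of the calculator itself.

```python
from statistics import NormalDist


def z_to_area(z, two_sided=False):
    """Return the area under the standard normal curve for a z-score.

    One-sided: area to the left of z (the percentile as a proportion).
    Two-sided (assumed meaning): central area between -|z| and +|z|.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if two_sided:
        return std_normal.cdf(abs(z)) - std_normal.cdf(-abs(z))
    return std_normal.cdf(z)


if __name__ == "__main__":
    print(z_to_area(1.645))                 # one-sided area
    print(z_to_area(1.96, two_sided=True))  # central two-sided area
```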

Percentile to Z-Score Calculator December 4, 2007
Enter the area under the normal curve (a proportion between 0 & 1) and get the Z-critical value, one-sided or two-sided.
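
The reverse lookup can be sketched the same way with `NormalDist.inv_cdf`. The name `area_to_z` is hypothetical, and treating a two-sided area as the central area (with the remainder split evenly between both tails) is an assumption about what the calculator means.

```python
from statistics import NormalDist


def area_to_z(area, two_sided=False):
    """Return the z-critical value for an area (proportion) under the normal curve."""
    if not 0 < area < 1:
        raise ValueError("area must be strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if two_sided:
        # Assumption: `area` is the central area, so half of the
        # remaining probability sits in each tail.
        return std_normal.inv_cdf(1 - (1 - area) / 2)
    return std_normal.inv_cdf(area)


if __name__ == "__main__":
    print(area_to_z(0.95))                  # one-sided critical value
    print(area_to_z(0.95, two_sided=True))  # two-sided critical value
```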

Usability Scorecard June 1, 2007
The UsabilityScorecard web application takes raw usability metrics (completion, time, satisfaction, errors and clicks), calculates confidence intervals, z-scores and quality levels, and graphs the results automatically. You can also merge any of the metrics into a combined score of 2, 3 or 4 measures. Data can be imported from Excel (.csv) and exported to Word (.rtf).

Confidence Interval Calculator for a Completion Rate October 1, 2005
If you've wanted to provide a confidence interval around a small sample completion rate but just didn't have time to do the math, this calculator does the work for you.
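
The blurb does not say which interval method the calculator uses. As one possibility, the sketch below computes an adjusted Wald (Agresti-Coull) interval, a common choice for small-sample binomial data. The function name and the choice of method are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(successes, trials, confidence=0.95):
    """Adjusted Wald (Agresti-Coull) confidence interval for a completion rate."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 successes and z^2 trials before applying the Wald formula.
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    low, high = adjusted_wald_interval(7, 10)  # 7 of 10 users completed the task
    print(f"{low:.2f} to {high:.2f}")
```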

Graph and Calculator for Confidence Intervals for Task Times February 6, 2006
Visualizing your task time data is an essential step in understanding its distribution and computing accurate confidence intervals. This calculator creates a dot-plot of your task times, transforms the raw data to adjust for non-normality and computes the intervals.

SUM: Single Usability Metric (Presented at CHI 2005) April 17, 2005
SUM is a single usability metric that summarizes the majority of variation in four common summative usability metrics. Download the calculator to convert raw metrics to a SUM score or read the CHI paper, which explains the theoretical foundations.

Sample Size Calculator for Discovering Problems in a User Interface October 1, 2006
Use this calculator to determine the number of users you'd need to test given the probability of detecting a problem. If the probability of detecting the problem is unknown, this calculator also allows you to estimate the problem occurrence (p) from sample data.
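
A minimal sketch of the underlying arithmetic, assuming the familiar binomial discovery model in which a problem that affects a proportion p of users is seen at least once in n users with probability 1 - (1 - p)^n. The function names and the default discovery goal are hypothetical.

```python
from math import ceil, log


def discovery_probability(p, n):
    """Chance that a problem with occurrence p is seen at least once in n users."""
    return 1 - (1 - p) ** n


def users_needed(p, discovery_goal=0.85):
    """Smallest n for which discovery_probability(p, n) reaches discovery_goal."""
    if not 0 < p < 1 or not 0 < discovery_goal < 1:
        raise ValueError("p and discovery_goal must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= goal for n and round up to a whole user.
    return ceil(log(1 - discovery_goal) / log(1 - p))


if __name__ == "__main__":
    n = users_needed(0.31, discovery_goal=0.85)
    print(n, discovery_probability(0.31, n))
```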

Deriving a Problem Discovery Sample Size (Sidebar to The Risks of Discounted Qualitative Studies) March 8, 2004
Shows the history and computation behind the sample size needed to discover problems in an interface.


Methods & Techniques
How Do You Calculate a Z-Score/ Sigma Level? June 14, 2004
The basics of z-scores are discussed plus an example of raw usability data converted into z-scores including three of Nielsen's five usability attributes.

Calculating a Sigma Level from Task Success September 17, 2004
Task success is often the most reported measure of usability. How does task success translate into a quality sigma value that can be compared to other reported sigma values?
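
One common way to make this translation, shown here only as a sketch, is to treat each failed task attempt as a defect and convert the completion rate to a z-score with the inverse normal distribution. The function name is hypothetical, and the optional `shift` parameter (for example, the 1.5σ shift used in conventional Six Sigma) is an assumption. The article itself may use a different procedure.

```python
from statistics import NormalDist


def sigma_from_completion(successes, attempts, shift=0.0):
    """Convert a task completion rate into a sigma (z) value."""
    rate = successes / attempts
    if not 0 < rate < 1:
        # inv_cdf is undefined at exactly 0% or 100% completion.
        raise ValueError("completion rate must be strictly between 0 and 1")
    # The z-score whose left-tail area equals the proportion of successes.
    return NormalDist().inv_cdf(rate) + shift


if __name__ == "__main__":
    print(sigma_from_completion(84, 100))             # unshifted sigma
    print(sigma_from_completion(84, 100, shift=1.5))  # with a 1.5 sigma shift
```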

Usable Statistics June 14, 2005
Do you need to feel more confident about using statistics? Dismayed by overly complicated "introduction" courses that focus on theory and not application? Do the "basic" books assume you know where to look for your answer? The first module in this series is on using confidence intervals in usability testing.

What's a Z-Score and Why Use it in Usability Testing? September 17, 2004
This common statistical way of describing data can be used in usability testing to standardize disparate data types, allowing easy comparison between products or versions and providing a universal way of assessing quality.

Calculating Sample Size for Task Completion (Discrete-Binary Method) September 17, 2004

Why 6σ is Not Limited to Manufacturing Processes September 17, 2004
One of the most common concerns about the Six Sigma methodology is that it cannot apply to something as byzantine as the interactions of humans with software. One of the major tenets of both Six Sigma and Human Factors is that the customer or user determines what's considered "quality."

Measuring & Analyzing Task Times September 17, 2004
The complexity and depth of this popular quantitative measurement are often given only cursory thought. There's much more to task times than using a stopwatch.

Calculating Sample Size for Task Times (Continuous Method) September 17, 2004

What's the 1.5σ Shift and Does it Apply to Software Usability? September 17, 2004
Conventional six sigma commonly adds a 1.5 sigma buffer to account for the shifting of a process over time. Does this make sense for software usability?

The Importance of Task Order Randomizing during a Usability Test September 17, 2004
The order in which a task is administered during a usability test can affect the user's performance, especially as measured by task time. Randomizing task order mitigates the effects of this lurking variable.
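
A minimal sketch of per-participant randomization using Python's standard `random` module. The function name, task labels and participant IDs are hypothetical. A counterbalanced design such as a Latin square is another option that this sketch does not cover.

```python
import random


def randomized_task_orders(tasks, participants, seed=None):
    """Give each participant an independently shuffled copy of the task list."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    orders = {}
    for participant in participants:
        order = list(tasks)  # copy so the original list is left untouched
        rng.shuffle(order)
        orders[participant] = order
    return orders


if __name__ == "__main__":
    tasks = ["Task A", "Task B", "Task C", "Task D"]
    for who, order in randomized_task_orders(tasks, ["P1", "P2", "P3"], seed=42).items():
        print(who, order)
```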

What is an Acceptable Level of Quality for Usability? September 17, 2004
Is attaining Six Sigma a reasonable or even attainable goal for usability? A product's usability is the sum of several usability measures. Relative movement in your sigma value is a good predictor of usability improvements.

Task Times in Formative Usability Tests June 6, 2008
Time-on-task can be used as a valuable diagnostic and comparative tool during formative evaluations.


Theory & Publications
Making Sense of Usability Metrics: Usability and Six Sigma Presented at UPA 2005 June 11, 2005
This paper identifies the limitations of traditional usability metrics and presents a process to increase their meaning by adapting Six Sigma methods. We define how common usability metrics can be evaluated in terms of a standardized defective rate or quality level and explore the benefits of this data transformation. Use the Usability Scorecard or the Excel-based SUM calculator to standardize your metrics.

Restoring Confidence in Usability Results October 18, 2004
Adding confidence intervals to completion rates in usability tests will temper both excessive skepticism and overstated usability findings. Confidence intervals make testing more efficient by quickly revealing unusable tasks with very small samples.

Relevant Publications for Measuring Usability September 17, 2004
I've begun to collect a list of articles and publications that relate to the quantitative measures of usability.


Commentary
Premium Usability: Getting the Discount without Paying the Price ACM Subscription Required December 1, 2004
You can use measures such as confidence intervals, sample size calculations—and other statistics normally associated with more premium usability methods—without the high costs. These methods require no money to compute yet provide a wealth of information. Even better, you can still provide these quantitative qualifiers while using most discount methods. Pre-Published PDF Version

The Risks of Discounted Qualitative Studies: Response to Nielsen March 8, 2004
The discerning usability analyst should employ a mix of both qualitative and quantitative methods when discovering usability problems. Relying heavily on a qualitative approach can lead to a severe misdiagnosis, especially when usability problems are difficult to detect.

Current Usability Solutions are Unpredictable September 17, 2004
Many popular usability testing techniques are the right method to gather user data. However, their results alone will only scratch the surface of the true state of usability, and they can often be misleading.
