Quantitative Usability, Statistics & Six Sigma by Jeff Sauro
SUM: Single Usability Metric
(Presented at CHI 2005)
by Jeff Sauro | April 17, 2005
SUM is a standardized, summated, single usability metric. It was developed to represent the majority of the variation in four common usability metrics used in summative usability tests: task completion rates, task time, satisfaction and error counts. The theoretical foundations of SUM come from a paper presented at CHI 2005 by Sauro and Kindlund, "A Method to Standardize Usability Metrics into a Single Score."

UsabilityScorecard (Added June 2007)
The UsabilityScorecard web application takes raw usability metrics (completion, time, satisfaction, errors and clicks), calculates confidence intervals and graphs the results automatically. You can also combine any of the metrics into a 2-, 3- or 4-measure combined score. Data can be imported from Excel (.csv) and exported to Word (.rtf).
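As an illustration of the kind of confidence interval such a tool might report for a completion rate, here is a minimal Python sketch. It assumes an adjusted-Wald (Agresti-Coull) interval; the page does not say which interval method the UsabilityScorecard actually uses, and the function name `completion_rate_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def completion_rate_ci(successes, attempts, confidence=0.95):
    """Adjusted-Wald (Agresti-Coull) interval for a task completion rate.

    Assumption: this is one common choice for small usability samples,
    not necessarily the method the UsabilityScorecard implements.
    """
    if attempts <= 0 or not 0 <= successes <= attempts:
        raise ValueError("need attempts > 0 and 0 <= successes <= attempts")

    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Add z^2/2 successes and z^2/2 failures before computing the Wald interval.
    n_adj = attempts + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)

    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Example: 9 of 10 participants completed the task.
low, high = completion_rate_ci(9, 10)
print(f"completion rate 90%, 95% CI: {low:.2f} to {high:.2f}")
```

With samples as small as those in a typical usability test, the interval is wide, which is why reporting it alongside the point estimate matters.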

SUM Calculator

The SUM calculator takes raw usability metrics and converts them into a SUM score with confidence intervals. The analyst needs to provide the raw metrics on a task-by-task basis and know the number of error opportunities for each task. The calculator will derive the maximum acceptable task time automatically, or the analyst can supply it. The calculator is an Excel-based version of the UsabilityScorecard, except that it combines only the four measures (time, errors, satisfaction and completion) and does not graph the results.
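To make the idea of a standardized, summated score concrete, the following Python sketch puts each of the four measures on a 0 to 1 scale and averages them for one task. It is a loose illustration under stated assumptions, not the calculator's actual procedure: continuous measures (time, satisfaction) are converted to a z-score against a specification limit and then to a proportion via the normal distribution, while completion and errors are treated as simple proportions. The names `proportion_meeting_spec` and `sum_score`, the equal weighting, and the example specification limits are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def proportion_meeting_spec(values, spec_limit, lower_is_better):
    """Estimate the share of users meeting a spec limit via a z-score.

    Assumption: the measure is roughly normally distributed.
    """
    m, s = mean(values), stdev(values)
    if s == 0:
        # No variation: everyone either meets the limit or nobody does.
        meets = m <= spec_limit if lower_is_better else m >= spec_limit
        return 1.0 if meets else 0.0
    z = (spec_limit - m) / s if lower_is_better else (m - spec_limit) / s
    return NormalDist().cdf(z)


def sum_score(completions, times, satisfaction, errors,
              error_opportunities, max_time, min_satisfaction):
    """Average four standardized measures into one 0-1 score for a task."""
    completion = mean(completions)  # 1 = completed, 0 = failed
    # Share of error opportunities that did NOT result in an error.
    error_free = 1 - sum(errors) / (error_opportunities * len(errors))
    time_ok = proportion_meeting_spec(times, max_time, lower_is_better=True)
    sat_ok = proportion_meeting_spec(satisfaction, min_satisfaction,
                                     lower_is_better=False)
    # Assumption: equal weights for the four measures.
    return mean([completion, error_free, time_ok, sat_ok])


# Example with made-up data for five participants on one task.
score = sum_score(
    completions=[1, 1, 1, 0, 1],
    times=[50, 62, 71, 80, 95],            # seconds
    satisfaction=[4.0, 4.5, 3.5, 5.0, 4.0],  # 5-point scale
    errors=[0, 1, 0, 2, 0],
    error_opportunities=4,                 # per participant
    max_time=90,                           # hypothetical spec limit
    min_satisfaction=4.0,                  # hypothetical spec limit
)
print(f"SUM-style score: {score:.1%}")
```

Putting every measure on the same proportion scale is what allows them to be summed or averaged; the raw units (seconds, rating points, error counts) could not be combined directly.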

Download the SUM Calculator (Free Registration Required)



SUM FAQs

  1. Why would you want a single usability metric?
If you agree that to truly know the usability of a product means measuring its usability, then you must necessarily ask: how do you measure usability? For us, that's SUM--a composite of multiple measures that all attempt to measure usability.

The bulk of usability activities are formative, that is, their major intent is to uncover and fix usability problems in a user interface. A single score would not be as beneficial here. It is most beneficial during a benchmarking test or summative assessment when you want to know how usable the application is, as opposed to what the usability problems are. Before you can say how usable something is, you need to be able to measure usability.

Why would you want to measure usability? If you cannot measure the construct of usability, or agree on what accounts for usability in some measurable way, then how will you know whether any usability engineering (UE) activities actually make the product more usable? This is a larger philosophical question that has been raised in many contexts. Here are some examples:

Measurement is at the heart of our scientific method. "Numerical precision is the very soul of science." D'Arcy Wentworth Thompson, On Growth and Form (1917)
"If you can't measure it, you can't manage it." (Old management saying)
"When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science." William Thomson (Lord Kelvin), pioneer in thermodynamics and electricity, 1891.



Comments
Name
Email Address
Not Published

To prevent comment spam, please answer the following question before submitting (tags not permitted) :
What is 1 + 4: (enter the number)