Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

What Statistical Test do I Use?

Jeff Sauro • June 26, 2012

What do you think the most common question in statistics is?

Several times a year I teach a statistics course for UX professionals and get asked this question a lot.

We're offering the class this fall at the LeanUX Denver conference and a portion of it is available for download.

Some attendees have had statistics classes and for others it's their first one. Regardless of background, almost everyone who uses statistics wants to know: What statistical procedure do I use?

It's hard enough to grasp many of the concepts in statistics. Most people in UX aren't math majors and never intended to use statistics as part of their job.

For this reason we have a decision tree to help you know when to use which statistical procedure, both in the Excel calculator and in Chapter 2 of our book, Quantifying the User Experience.

Getting to know the decision map is one of the most popular parts of the course because you can click right to the appropriate calculator after answering a couple of questions, paste your data and get your answer.

I've included a similar clickable version below with links to several of our free online calculators. To use the decision map you just need to know a few things.

What type of Data do you have?

For this decision map, the primary thing you need to know is if your data is binary or not.
  • Binary: pass/fail, yes/no or purchase/didn't purchase data coded as 1 or 0. This includes things like completion rates and conversion rates.

  • Continuous: If your data isn't coded as 1's or 0's then we can usually treat it as what's called continuous or metric data.

Once you've established whether your data is binary or continuous, you start with the appropriate decision map below. The rest is just answering a couple of questions about what you're trying to do. Four examples follow the decision maps to give you some practice.

The blue boxes at the termination points on the two decision maps below link to our free online calculators. Where I don't have one up yet (more coming soon!), the box links to the Excel calculator instead.

You can also download a free high-resolution printable version [3.5MB pdf] of both maps. Pin it to your wall for help or throw darts at it if statistics just frustrates you too much.

Continuous Data Decision Map

                                                        

Binary Data Decision Map

                   

Example 1

Let's say you had a rating scale question in a survey that went from strongly disagree to strongly agree and was coded from 1 to 5 for each level of agreement. The average score was 3.9 (sd = 1.2) from 36 people. What test would you use to find out how much that sample mean would fluctuate?
  1. It's not a binary measure so we'd use the continuous decision map.
  2. We aren't comparing data, we only want to know how precise our estimate is so we pick "N" in the first branch.
  3. We don't have a benchmark we're testing against (such as testing whether the average exceeds 3.5) so we again pick "N".
  4. The data isn't task time so we pick "N" and end up at the t confidence interval link. Entering the mean, standard deviation and sample size as summary data gets us a 95% confidence interval from 3.494 to 4.306.
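
If you want to check the arithmetic behind that interval, here is a minimal Python sketch of a t confidence interval computed from summary data. It is not the calculator's code: the function names are made up for illustration, and the t critical value is found by plain numerical integration and bisection so that nothing outside the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(mean, sd, n, confidence=0.95):
    """Interval is mean +/- t * sd / sqrt(n), with n - 1 degrees of freedom."""
    margin = t_critical(confidence, n - 1) * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary data from Example 1: mean 3.9, sd 1.2, 36 respondents.
low, high = t_confidence_interval(3.9, 1.2, 36)
print(f"{low:.3f} to {high:.3f}")
```

With the Example 1 numbers the sketch reproduces the 3.494 to 4.306 interval. In everyday work a statistics library would replace the hand-rolled critical value.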

Example 2

Let's say you had two different groups of users who attempted to locate the nearest rental car location. In one group 12 out of 12 users found the correct location on Budget.com. In another group, 17 out of 24 found the correct location on Enterprise.com. Is there a statistically significant difference between rental car websites?
  1. Completion rates are a binary measure (pass/fail) so we'd use the binary decision map.
  2. We ARE comparing groups—we want to know if users on different websites will have different completion rates when finding a location so we pick the "Y" in the first branch.
  3. We have different users in each group (called Between Subjects) so we pick "Y" again.
  4. We have only 2 groups so we pick "N" at the 3 or more groups box.
  5. We end at the N-1 Two Proportion Test with a link to the A/B Test calculator. Entering 12 out of 12 and 17 out of 24 gets us a p-value of .03985, which means the difference is statistically significant.
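
For the curious, the calculation can be sketched in a few lines of Python. This assumes the usual form of the N-1 test: the standard pooled two-proportion z statistic multiplied by sqrt((N-1)/N), where N is the combined sample size, with a two-sided p-value from the normal distribution. The function name is hypothetical and this is not the calculator's own code.

```python
import math

def n1_two_proportion_test(x1, n1, x2, n2):
    """N-1 two-proportion test (assumed form: pooled z scaled by sqrt((N-1)/N))."""
    p1, p2 = x1 / n1, x2 / n2
    total = n1 + n2
    pooled = (x1 + x2) / total
    # Standard error of the difference under the pooled proportion.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se * math.sqrt((total - 1) / total)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Example 2: 12 of 12 on one site versus 17 of 24 on the other.
z, p_value = n1_two_proportion_test(12, 12, 17, 24)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

Under those assumptions the p-value comes out at about .0399, in line with the .03985 reported by the calculator.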

Example 3

Let's say you watched as 20 out of 25 people found a Brother Sewing Machine in a tree test from Optimal Workshop or Userzoom. Is there evidence that at least 75% of all users would find the Sewing Machine?
  1. Completion rates are a binary measure (pass/fail) so we'd use the binary decision map.
  2. We are not comparing groups so we pick the "N" in the first branch.
  3. We ARE testing against a benchmark of 75% so we pick "Y" again.
  4. We end at the 1 Sample Binomial Test with a link to the One Proportion Calculator. Entering 20 out of 25, "Is Greater Than" and a Test Proportion of .75 tells us there's about a 70% chance at least 75% of all users would be able to find the Sewing Machine, which is not terribly compelling evidence.
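
Here is a rough Python sketch of a one-sided binomial test against the 75% benchmark. One assumption to flag loudly: the "about 70%" figure appears to correspond to one minus a mid-p value (the exact tail probability minus half the probability of the observed count). That is inferred from the numbers, not confirmed from the calculator, so the sketch prints the exact version alongside it. Function names are made up for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def one_sided_binomial(successes, n, benchmark):
    """Upper-tail test: P(X >= successes) if the true rate equals the benchmark."""
    exact = sum(binom_pmf(k, n, benchmark) for k in range(successes, n + 1))
    # Mid-p (assumed to be what the calculator reports): count only half
    # of the probability of the observed outcome itself.
    mid_p = exact - 0.5 * binom_pmf(successes, n, benchmark)
    return exact, mid_p

# Example 3: 20 of 25 participants succeeded; benchmark is 75%.
exact, mid_p = one_sided_binomial(20, 25, 0.75)
print(f"exact p = {exact:.3f}, mid-p = {mid_p:.3f}, 1 - mid-p = {1 - mid_p:.3f}")
```

The mid-p value is roughly .30, so one minus it is roughly .70, matching the "about 70%" above. The exact tail probability is closer to .38, which tells the same story: the evidence is weak either way.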

Example 4

Let's say you administered the System Usability Scale (SUS) after users spent time using an accounting application. You tested 15 users on an old version and 17 on a new version and wanted to know if there was evidence that users thought the new version was easier to use. The SUS means and standard deviations were 68 (sd = 14) and 76 (sd = 12) on the respective versions.
  1. SUS scores are not binary so we'd use the continuous decision map.
  2. We ARE comparing groups so we pick the "Y" in the first branch.
  3. We have different users in each group (called Between Subjects) so we pick "Y" again.
  4. We have only 2 groups so we pick "N" at the 3 or more groups box.
  5. We end at the 2 Sample t test with a link to the online calculator. We enter the summarized values and get a p-value of .0962, which means there's good, although not overwhelming, evidence that users found the new version easier to use. At the conventional .05 threshold, though, the difference would not be called statistically significant.
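
A two-sample t test from summary values can be sketched in Python as below. This assumes the unequal-variance (Welch) form with fractional degrees of freedom. Whether the online calculator does exactly that, or rounds the degrees of freedom, isn't confirmed here, so treat the last digit of the p-value as approximate. The function names are hypothetical and the t distribution is integrated numerically to keep the snippet standard-library only.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def welch_t_test(m1, sd1, n1, m2, sd2, n2):
    """Two-sided Welch t test from means, standard deviations and sample sizes."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t = (m2 - m1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    p_value = 2 * (1 - t_cdf(abs(t), df))
    return t, df, p_value

# Example 4: old version 68 (sd 14, n 15), new version 76 (sd 12, n 17).
t, df, p = welch_t_test(68, 14, 15, 76, 12, 17)
print(f"t = {t:.3f}, df = {df:.1f}, p = {p:.4f}")
```

Under these assumptions t is about 1.72 on roughly 27.8 degrees of freedom and the p-value lands near .096, close to the .0962 from the calculator.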

About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 20 journal articles and 4 books on statistics and the user experience.

Posted Comments


January 29, 2014 | lovely wrote:

helpful.thanks 


December 15, 2013 | Ido Badash wrote:

This was a great tool! I'm very thankful I found your website. Please post more practice problems like this! 


November 22, 2013 | RASHEED ADEKUNLE FASASI wrote:

WHAT STATISTICAL TOOL DO I USE FOR DECIDING THE BEST PREDICTOR OF AN OUTCOME OUT OF THREE VARIABLES THAT CAN EACH PREDICT IT? 


March 19, 2013 | Katie wrote:

For the Binary Decision Map tree, what is considered a "Large Sample"?

Thanks!


December 16, 2012 | rahayu hamzah wrote:

I would like to know, what test should I use, because I feel confused. I am doing a research on adequacy of fire protection system in the laboratory. My objective are
1) to identify the adequacy of fire protection in the laboratory
2) to identify the knowledge level of fire protection system among the laboratory staff.

Could you help me on this..
 


September 14, 2012 | Adel wrote:

I have obtained scores from analysisng a dataset raw of banking system , I would divide these scors in two growps acoording ownesrship (sta, private , and foreign) and also tre groups acoording size (small, medium, and size)
I would test the hypothesis according ownership ( state banks more efficient than private or foreign 0 also hypotheses of large banks are more efficnt than small or medium banks
what is the best test can I do and what is the best statistical tool? 


July 11, 2012 | Jeff Sauro wrote:

David,


Yes indeed it does. For each procedure we explain why we think it works best, which is usually because it generates the best long-term results for applied researchers (not too conservative but still accurate), and we reference appropriate studies. For some of the procedures we dedicate a considerable amount of text to discussing alternatives (e.g. the N-1 Two Proportion Test) for those who might be familiar with other approaches.


July 10, 2012 | David Dunkle wrote:

Wonderful resource! Does your book provide explanations/references for the stat choices? I'm sure if I ran a McNemar Exact Test, the first question my customer would ask is, "Why?"

