Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

5 Second Usability Tests

Jeff Sauro • November 9, 2010

In a few seconds what can you tell about people… or websites?

Some famous research has shown that student evaluations given after only a few seconds of video are indistinguishable from evaluations from students who actually had the professor for an entire semester!

There has been some relevant research on the importance of immediate website actions and impressions:

Visual Appeal: Impressions of a homepage's visual appeal and aesthetics form within milliseconds.

5-Second Tests: Give users five seconds to look at an image or page design and you get instant feedback on salient elements or problems in a design. If users can't find their way or orient themselves to your design immediately, this can be an early indication that the design needs improving.

First-Click Test: The first click users make on a webpage is an excellent predictor of whether they will successfully complete the task.

But what about task-level usability? While our visceral reactions to static images or teaching styles might be reliable, would they hold up in a typical usability test?

Traditional Usability Tests Last Hours not Seconds

A lot of time and effort goes into planning a usability test. In a typical lab-based test, users spend several minutes (sometimes hours) on a website attempting tasks. In a remote unmoderated test, users have more distractions and typically less motivation to focus on a website for long periods of time. Even so, they still typically spend 10 to 30 minutes working through tasks.

Attempting tasks familiarizes users with a website's architecture and usability. The process primes users to answer post-test usability questionnaires such as the System Usability Scale (SUS), which provides an overall numeric picture of the usability of a website.
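For readers who haven't scored a SUS before, here is a minimal sketch of the standard scoring rule: ten items answered on a 1 to 5 scale, with odd-numbered items worded positively and even-numbered items worded negatively, rescaled to 0 to 100. The function name and the example responses are hypothetical and are not data from this study.

```python
def sus_score(responses):
    """Convert ten 1-5 SUS item responses (in questionnaire order) to a 0-100 score."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: higher agreement is worse.
            total += 5 - response
    # Each item contributes 0-4, so the raw total is 0-40; scale to 0-100.
    return total * 2.5


# Hypothetical respondent, for illustration only.
example_responses = [4, 2, 4, 1, 5, 2, 4, 2, 4, 1]
print(sus_score(example_responses))  # 82.5
```

A per-condition SUS score is then just the mean of these individual scores across the users in that condition.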

If we consider the SUS a reliable measure of website usability, is five seconds enough to answer questions such as "I found the website unnecessarily complex" or "I thought the website was easy to use"?

7 Websites and 256 Users Later...

To find out I set up unmoderated usability tests across seven websites.
  • 4 airline websites: AA.com, UAL.com, Southwest.com, JetBlue.com
  • 3 retail websites: CrateandBarrel.com, ContainerStore.com, Pier1.com
Users were recruited on the internet and asked to complete two core tasks, then answer the SUS at the end of the test. The same tasks were used across all four airline websites (finding the price of round-trip tickets), and the same tasks were used across the three retail websites (locating products and finding store hours).

I created three testing conditions to understand the effects of limited testing time on SUS scores. Users were randomly assigned to one of three conditions:
  • 5 seconds
  • 60 seconds
  • No time limit
In total I tested 256 users approximately balanced between the 7 websites and 3 conditions. There were between 79 and 91 users in each test condition and between 36 and 42 users on each website.
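As a rough illustration of random assignment to the three conditions, here is a minimal sketch that shuffles participants and deals them out in turn so the groups end up nearly equal in size. This is an assumption for illustration: the article does not say how the assignment was implemented, and the function name, participant IDs, and seed are hypothetical.

```python
import random
from collections import Counter

CONDITIONS = ["5 seconds", "60 seconds", "no time limit"]


def assign_conditions(participant_ids, seed=None):
    """Randomly assign each participant to one condition, keeping groups balanced."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)  # random order removes any systematic pattern
    # Deal the shuffled participants out to the conditions in turn.
    return {
        pid: CONDITIONS[i % len(CONDITIONS)]
        for i, pid in enumerate(shuffled)
    }


# Hypothetical participant IDs 0..255, matching the study's total of 256 users.
assignments = assign_conditions(range(256), seed=1)
group_sizes = Counter(assignments.values())
print(group_sizes)
```

With 256 participants this yields groups of 85, 85 and 86. Real unmoderated studies rarely end up this even, since some users abandon the test, which would be consistent with the 79 to 91 users per condition reported above.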

Results

Interestingly enough, the perception of website usability from the 5 second condition was statistically indistinguishable from the no time limit condition. The observed difference in average SUS scores was less than 3 points (4%). 



Despite frequent comments from users like "I wish I had more time on the site" their SUS scores were similar to users who spent a lot more time on the site.
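To give a feel for why a 3-point gap is not statistically distinguishable at these sample sizes, here is a minimal sketch that compares two group means from summary statistics using a large-sample normal approximation. The standard deviations (19.9 and 20.5) are the ones reported in the comments below for the 5-second and no-limit groups. The group means and the sample size of 85 per group are hypothetical placeholders, since the article gives only the size of the difference and a range of 79 to 91 users per condition. This is not necessarily the test used in the original analysis.

```python
from math import sqrt
from statistics import NormalDist


def compare_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, confidence=0.95):
    """Compare two independent means from summary statistics.

    Uses a normal (z) approximation, which is reasonable for groups
    of roughly 80 or more; a t-distribution would be slightly wider.
    """
    diff = mean_a - mean_b
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    margin = NormalDist().inv_cdf(0.5 + confidence / 2) * se
    return diff, se, p_value, (diff - margin, diff + margin)


# Means and n are hypothetical; SDs are from the author's comment.
diff, se, p_value, ci = compare_means(
    mean_a=72.0, sd_a=19.9, n_a=85,   # 5-second condition
    mean_b=75.0, sd_b=20.5, n_b=85,   # no-time-limit condition
)
print(f"difference = {diff:.1f}, SE = {se:.2f}, p = {p_value:.2f}")
print(f"95% CI for the difference: {ci[0]:.1f} to {ci[1]:.1f}")
```

Under these assumptions the standard error of the difference is about 3.1 points, so a 3-point gap is roughly one standard error, the p-value is about 0.33, and the confidence interval comfortably includes zero.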

The 60 Second Phenomenon

Somewhat surprisingly, the SUS scores from the 60 second group were 8 to 12% higher than those from the 5 second and unlimited time groups (p < .01).

I'm somewhat puzzled by this result and am unsure why users who are interrupted after one minute tend to rate websites as more usable than users who have only five seconds or an unlimited amount of time. One hypothesis is that these interrupted users have an inflated sense of accomplishment and assume they would complete the task and thus rate the website higher on the SUS.

Prior website exposure doesn't affect the patterns

Perhaps users already have preconceived notions of usability from prior visits to the websites. I used large public-facing websites, so it would make sense that a good proportion of users had some exposure. Fortunately, I asked users how many times they'd visited each website. Across the websites, 70% of users reported having never visited the site before.

When I compare the SUS scores for just these first-time users, the same pattern holds. The difference between the five second users and the no time limit users was larger (8%) but still not statistically significant (p > .11).



The 60-second group again rated the websites highest—10 to 19% higher than the no limit and 5-second groups respectively. Again this confirms the bizarre result.

Limitations

There could be a few reasons for these results. For example, the tasks I've used could be too homogeneous in their difficulty. Perhaps harder tasks would affect the ratings more. Also, a good portion of users might have been able to complete the tasks in under 60 seconds, making the website seem easier to them than to users who had to complete two tasks. More research is needed to understand the effects of task difficulty, the number of tasks and task length on SUS scores.

The results of this analysis do, however, suggest that under typical conditions, perceptions of usability are formed within seconds of viewing a website.

Giving users only five seconds to complete tasks generates SUS scores that are very similar to SUS scores from users taking 5 to 15 minutes to complete tasks. Usability, as measured by standardized questionnaires, also appears to be strongly affected by initial impressions.

About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 20 journal articles and 4 books on statistics and the user-experience.

Posted Comments

There are 8 Comments

November 16, 2010 | Jeff Sauro wrote:

Elizabeth,
Great point, well put. Benchmarks of perceived usability tell us how usable a site is and not what is unusable.
Nevertheless, I really wasn't expecting to see such stable results after only five seconds. I was always under the impression users needed a solid exposure to the site--much as students need solid exposure to a teacher before providing reliable ratings of performance.
 


November 16, 2010 | Jeff Sauro wrote:

Michael,

Thanks for your comment and critical thinking on this one. I was always under the impression that usability is largely context dependent, so change the tasks and the assessment of usability changes. I did not expect to have such similar results. I would lend more credence to your random sample hypothesis except for the 60 second phenomenon. Here too is a random sample, just a longer one, why does this group tend to rate higher?

Good question on the variability. While I didn't provide the raw standard deviations in the article, you can see from the 95% confidence interval bars that the widths between groups are all about the same--and the sample sizes are all about the same--meaning the variances are all about the same. The raw standard deviations per group are 19.9, 18.1, 20.5 for the 5, 60 and full time groups, which are both nominally and statistically virtually the same.

You of course get a lot more data from actually having users complete the tasks (both task level data and usability problem identification) so I'm certainly not saying that 5 second tests can replace traditional testing. It just suggests that ideas about usability probably form very quickly.  


November 15, 2010 | Michael Zuschlag wrote:

I guess I wouldn’t necessarily expect _average_ usability to differ after five seconds versus many minutes of use. That’s the result you’d get if you assume each user effectively samples a random aspect of the site in the first seconds. For there to be an average difference, the distribution of usability problems needs to be systematically biased towards either the earlier or later stages of the task. While I can think of reasons why that can happen, it also seems reasonable that usability problems are usually uniformly distributed.

How did the variances compare across the conditions? Do users with a five second exposure show significantly more variability in their SUS than those with longer time, as would be predicted if the first seconds represent a sample? That would imply that for a given number of users you get a more precise estimate of the site’s performance with longer tasks than shorter tasks. Pragmatically, depending on the overhead cost of acquiring each user and their willingness to complete longer tasks, it may be better to have longer usability tasks.


November 15, 2010 | Elizabeth Buie wrote:

Jeff, I think we have to say that the SUS may be a reliable measure of user satisfaction with a site — which is only one component of usability. What the longer usability tests are giving us is far more important (if we do them right): the opportunity to discover specifically what works and what doesn't work in a practical sense. 


November 11, 2010 | Jeff Sauro wrote:

Nik
I have been collecting task level difficulty and I think you're totally on to something. It's worth investigating. 


November 10, 2010 | Nik Sargent wrote:

Interesting!
One factor that might be worth considering here is how complex users expect the task to be (which could be dependent on levels of conditioning and priming) which could affect how accepting they are of difficulties and weaknesses. I specialise in the Voice UI space which has some different characteristics (e.g. everything is realtime and linear) but at the same time shares some psychology. Frustration levels, for example, could be related to how complex users expect a task to be.
Just my 2p worth. 


November 10, 2010 | Alex Debkalyuk wrote:

Great study! Thanks!

I always blame folks who torture users for an hour+ to perform the simplest tasks. 


November 10, 2010 | Jessica Kerr wrote:

Another very interesting study, Jeff.

I wonder if what you're seeing is:
- 5 second viewers being negatively influenced by the time limit and thus thinking the site is less usable;
- 60 second viewers getting started on the task but not necessarily getting to the point of failing, and thus feeling confident about the site's usability; and
- no time limit viewers having that initial flush of success but then also striking challenges, and so rating the site as less usable.

Very interesting results nonetheless! 

