by Jeff Sauro | June 6, 2008
It is common to think of time-on-task as data gathered only during summative evaluations because, during a formative evaluation, the focus is on finding and fixing problems, or at least finding the problems and delivering a report. For a variety of reasons, time-on-task measures often get left out of the mix. In this article, I show that time-on-task can be a valuable diagnostic and comparative tool during formative evaluations.
The three most common reasons I've heard for not using time-on-task in formative studies are:
Below I discuss why these reasons should NOT prevent you from collecting time-on-task in your next formative evaluation.
Getting an accurate and stable measure of actual user time-on-task is more problematic than comparing designs. One would expect task times to increase when users are asked to think aloud while completing tasks. The published data, however, are mixed: some studies actually show faster performance while thinking aloud, possibly because it invokes cognitive processes that improve rather than degrade performance (Berry and Broadbent, 1990). For a good summary of the evidence, see Lewis 2006, p. 1282. More research is needed before drawing a conclusion on this point. Regardless, I recommend focusing on relative task-time improvements between designs because this avoids the issue altogether.
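One way to express a relative improvement between designs is as a ratio of average task times. The sketch below is only an illustration: the function name `relative_task_time`, the use of the geometric mean as the average, and the task times themselves are assumptions made for the example, not data or methods taken from any study.

```python
import statistics

def relative_task_time(times_old, times_new):
    """Ratio of geometric mean task times (new design / old design).

    A value below 1.0 means the new design was faster on average.
    """
    gm_old = statistics.geometric_mean(times_old)
    gm_new = statistics.geometric_mean(times_new)
    return gm_new / gm_old

# Hypothetical think-aloud task times in seconds, for illustration only.
design_a = [62, 48, 95, 71, 55]
design_b = [41, 39, 60, 52, 44]

ratio = relative_task_time(design_a, design_b)
print(f"Design B takes about {ratio:.0%} of the time of design A")
```

If thinking aloud changed every user's time by roughly the same proportion in both designs, that factor would cancel out of the ratio. That is an assumption rather than an established result, and it may not hold for every participant.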

Figure 1: Time to cancel a reservation on a hotel website (in log-transformed seconds). One user took over 4 times the mean time to complete the task. The solid red line is the geometric mean and the dashed green lines are the upper and lower bounds of the 95% confidence interval.
Graphing the data from the report, we quickly see that one user took over 4 times longer than the mean time to cancel the reservation (I graphed the data using the Graph and Calculator for Confidence Intervals for Task Times). This simple graph of the task times allows the investigator and the reader of a report to zero in on potential causes of such a long task time (relative to the other users). While it's unclear from the report what was occurring during this task, an analysis of this user's profile shows that she had never visited a hotel website or made a reservation on one prior to the test. Her comments also reinforce that she was a "novice" Internet user: "I feel that my inexperience with the web had a lot to do with difficulties." Whether it was just the user's inexperience or some specific interface problems, perhaps particularly damaging to a novice, it is clear this user had trouble during the task. A few pixels tell the story.
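The quantities shown in the figure (a geometric mean and a 95% confidence interval computed on log-transformed times) can also be worked out by hand. The following is a minimal sketch, not the calculator mentioned above: the function names, the task times, and the choice to flag users relative to the geometric mean are all assumptions made for illustration. The critical values are standard two-sided 95% values of Student's t.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
             11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131}

def log_time_summary(times_sec):
    """Geometric mean and 95% CI, computed on log-transformed task times."""
    n = len(times_sec)
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch only covers 2 to 16 users")
    logs = [math.log(t) for t in times_sec]
    mean_log = statistics.mean(logs)
    # Margin of error on the log scale: t * s / sqrt(n)
    margin = T_CRIT_95[n - 1] * statistics.stdev(logs) / math.sqrt(n)
    # Back-transform so the results are in seconds again.
    return {
        "geometric_mean": math.exp(mean_log),
        "ci_low": math.exp(mean_log - margin),
        "ci_high": math.exp(mean_log + margin),
    }

def slow_users(times_sec, factor=4.0):
    """Indices of users whose time exceeds `factor` times the geometric mean."""
    gm = log_time_summary(times_sec)["geometric_mean"]
    return [i for i, t in enumerate(times_sec) if t > factor * gm]

# Hypothetical task times in seconds (NOT the data from the report).
times = [40, 55, 38, 62, 47, 51, 44, 58, 49, 420]

summary = log_time_summary(times)
flagged = slow_users(times)
print(summary)
print("Users worth a closer look:", flagged)
```

With these made-up times, only the last user is flagged. That is the same kind of signal the graph gives at a glance, and the follow-up is the same: go back to the session notes to find out why.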
Time-on-task is an under-utilized tool for formative evaluations. It costs nothing (just start and stop the timer), is useful with any number of users, and can be a valuable tool for diagnosing problems as well as making objective comparisons between iterations. I encourage you to collect time-on-task during your next formative evaluation.
February 11, 2010 | Dana Chisnell wrote:
Jeff, I like your position on time-on-task during formative testing being an indicator of something gone wrong. I would argue that it's just one way of telling that something went wrong and what the issue was. In fact, I'd say that in the problem you describe -- the person spending a longer time canceling a hotel reservation -- there were many other bits of evidence indicating that this person was having a problem. For example, with the think aloud, she probably was pretty verbal about her issues and questions. There *is* a cost to the moderator (or the team). Usually, there are a lot of things going on in a formative test. Tracking time on task adds to that overhead, and adds to the data analysis time, as well. This also assumes that the amount of time that the think aloud slows people down is constant from participant to participant. This seems very unlikely. Anyway, neat idea, but for most of the teams I work with who are developing new designs, time is not the paramount indicator of success, failure, or progress.
Copyright © 2004-2010 Measuring Usability LLC
