What five users can tell you that 5000 cannot
Web-Analytics and User-testing
Jeff Sauro • June 16, 2010
With usability testing, we used to have to make our best guess as to how users actually interacted with software outside a contrived lab setting. We didn't have all the information we needed. Knowing what users did was, in a sense, a puzzle with a lot of missing pieces. Web analytics provides us with a wealth of data about actual usage that we never had before. With real-time access to click-paths, time on pages and navigation paths, we now know much more about what users have done. Where we once didn't have enough information, we now have a new problem: too much information. Web analytics is transforming user behavior from a puzzle to a mystery. Mysteries require judgment and the assessment of uncertainty.
To solve the mystery of why users are doing what they're doing, we still need to observe users and ask them about their intentions and expectations. A small lab-based study with a handful of users can tell us things that analytic data from 5000 cannot.
Why were users downloading the wrong version?
Recently I was assisting a team working on a consumer software product. There were problems with a trial version available for download from their company website. Users were calling tech support because the 64-bit version they had downloaded was incompatible with their operating system. As you can imagine, getting users to install a trial and then convert to paying customers is an important business strategy, so any impediment to installation hits the bottom line.
The analytic data provided a partial answer to the mystery. There was a lot of data showing users downloading different versions of the software (some 32-bit but most 64-bit). But you can't tell what the users intended to download. Were users who mistakenly downloaded the 64-bit version misled by what they saw on the page? Did they understand the difference between the two versions? Deeper mining of the analytic data showed, from the users' operating systems, that many more should have been downloading the 32-bit version.
An observational study was conducted to see what users might be doing. Eleven users were observed as they browsed the website, picked their products and went for the download. Three of these eleven users downloaded the 64-bit version of the product. A few minutes into the installation these users got the operating system error.
They needed to download the 32-bit version but instead downloaded and attempted to install the wrong one. Why?
The answer was obvious from watching the users' mouse movements and asking them what had confused them. It turned out a design element on the download page was luring some people to the 64-bit download. The mystery generated from the web-logs was easily solved by watching and asking a few users.
Ah, but only 3 out of 11 users had the problem. You can hear the Analytics Team and Marketing department dismissing this result as not being "statistically significant." And yet it is. If we see 3 out of 11 users have a problem, we can be 95% confident that between 9% and 52% of all users will have that problem downloading the correct version.
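As a rough illustration of how an interval like this can be computed from 3 problems in 11 users, here is a minimal sketch. It assumes the adjusted-Wald (Agresti-Coull) method, a common choice for small-sample proportions; the article does not say which method produced its figures, and the function name is made up for this example. Under that assumption the lower bound lands near the 9% quoted above. The upper bound is sensitive to the method and confidence level chosen, so it need not match the quoted figure exactly.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(problems, users, confidence=0.95):
    """Adjusted-Wald (Agresti-Coull) interval for a proportion.

    Assumption: this is one standard small-sample method, not
    necessarily the one used for the numbers in the article.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add z^2/2 problems and z^2 users before computing the proportion.
    users_adj = users + z ** 2
    p_adj = (problems + z ** 2 / 2) / users_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / users_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


low, high = adjusted_wald_interval(3, 11)
print(f"Plausible share of all users affected: {low:.0%} to {high:.0%}")
```

The point to take from the lower bound: even in the most optimistic reading, a meaningful share of visitors is picking the wrong download.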
It is easy to prove something is NOT usable with small sample sizes
The problem with small sample sizes is that we're only able to reliably detect major issues (issues that affect a lot of users). The good news about small sample sizes is that we're detecting issues that matter! So when you see a problem occur repeatedly with a small sample test, it means a problem is probably affecting a lot of your users. Small sample sizes don't do a good job of finding problems that only affect a small portion of the users. As Jim Lewis likes to say: "It is easy to prove something is NOT usable with small sample sizes. It is hard to show that something IS usable with small sample size."
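The reasoning above can be made concrete with the standard binomial model: if a problem affects a fraction p of users, the chance of seeing it at least once among n users is 1 - (1 - p)^n. The sketch below is an illustration rather than anything from the study. It assumes each participant independently hits the problem with the same probability, and the function name is hypothetical.

```python
def chance_of_seeing_problem(problem_rate, users_tested):
    """Probability that at least one tested user hits the problem.

    Assumes every user independently encounters the problem with
    probability `problem_rate` (a simplifying assumption).
    """
    return 1 - (1 - problem_rate) ** users_tested


# Compare rare and common problems for an 11-user study.
for rate in (0.01, 0.05, 0.25, 0.50):
    chance = chance_of_seeing_problem(rate, 11)
    print(f"Problem affecting {rate:.0%} of users: "
          f"{chance:.0%} chance of seeing it at least once")
```

Problems that affect a quarter or more of users are very likely to show up in a study this size, while a problem affecting 1% of users will usually go unseen.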
Within hours a new design element was mocked up, approved and uploaded to the web. Within 24 hours the live A/B testing results showed a 4 percentage point increase in the download rate of the 32-bit trial version. This was definitely an improvement, but even the most conservative estimate from the observational study suggested that at least 9% of users were clicking the wrong trial. Apparently the new version solved only part of the problem: with a 4-point increase, there's still a lot more to fix. And more is being done: eliminating the choice altogether. A new version of the trial page will detect the correct operating system based on the user's web-signature and suggest the correct version for download (after all, not everyone knows whether their system is 32- or 64-bit).
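One way such detection might work is to look at the browser's User-Agent string. The sketch below is an assumption about the approach, not the team's actual implementation: the token list is a set of markers that commonly appear on 64-bit systems and is not exhaustive, and the function name is hypothetical.

```python
import re

# Markers that commonly appear in User-Agent strings on 64-bit systems.
# Assumed list for illustration; real traffic will contain other variants.
_64BIT_TOKENS = re.compile(r"\b(?:WOW64|Win64|x64|x86_64|amd64)\b", re.IGNORECASE)


def suggest_trial_build(user_agent):
    """Suggest '64-bit' only when the User-Agent hints at a 64-bit OS.

    Falls back to '32-bit', since a 32-bit build generally also
    installs on a 64-bit system while the reverse fails.
    """
    return "64-bit" if _64BIT_TOKENS.search(user_agent) else "32-bit"


print(suggest_trial_build(
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0) Gecko/20100101 Firefox/4.0"))
print(suggest_trial_build(
    "Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"))
```

Defaulting to the 32-bit build when no marker is found is a deliberately conservative choice: it avoids exactly the installation error that sent users to tech support.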
Analytic data is not a replacement for user testing--but it's a good place to start
Analytic data is an easy first place to start understanding user behavior, but it is not a replacement for user testing. While having more information may reduce the puzzle problem, it doesn't address the mystery of why. Just like the classic whodunit mystery, with murder weapons, motives and suspects, we need to solve the mystery of why: why users do what they do. To answer that we need the classic tools of user research: the small-sample observational studies that tell us so much about why users do what they do. There will be a continued demand for user researchers who can quantify observational data and make the most of analytic data. The Quantitative Starter package can help you get started.