by Jeff Sauro | March 8, 2010 ::
| April 27, 2010 | Rob Todd wrote: |
| Excellent explanation! |
| April 21, 2010 | John Sorflaten wrote: |
| Jeff, hi I still wonder how Faulkner's analysis fits this article. Doesn't finding "85% of the problems" mean ON AVERAGE...over lots of testing events with 5 participants? Your web site and blogs are great. You must be getting a good following. I'm working at www.saic.com as of Feb, this year. I'm with a large gov't client for a couple year project in Maryland, near Baltimore. best John |
| April 14, 2010 | Jennifer Romano wrote: |
| Wow. Tough crowd. Keep up the good work, Jeff. I sure appreciate it. |
| April 9, 2010 | Jim Hall wrote: |
| This was exactly what I needed to help explain why my team only tests with an average of 10 users. |
| April 2, 2010 | John Haugeland wrote: |
| Unfortunately, despite the fact that these lessons are clearly wrong, the author has chosen to leave them up anyway. Note that the author has hundreds of users on his blog, yet one user found more than a dozen bugs on the blog alone. (Indeed, several of the located bugs, including one of the ones in that video, remain unresolved several weeks later.) What this author is measuring is the probability of finding one bug, not all bugs, with five users. If you're willing to accept finding one bug, by all means, follow this extremely poor and non-measured advice. |
| March 23, 2010 | jetm wrote: |
| Nice Articles |
| March 22, 2010 | Thanassis wrote: |
| Jeff, if you don't use/know git, the link to the code is at github dot com slash ttsiodras slash binomialProbabilities |
| March 22, 2010 | Thanassis wrote: |
| The code proving the error in "tossing a die" was messed up in the comments. I placed a copy in: git@github.com:ttsiodras/binomialProbabilities.git The code has both the "theory" and the "experiment" (using Python's random module) to prove it. |
| March 22, 2010 | Thanassis wrote: |
| Tossing a coin: "3 or fewer" => "3 or more". Tossing a die: "10 or fewer" => "11 or more". The "tossing a die" is wrong, it needs 11, not 10: # Python code p = 1./6. for n in xrange(2, 15): print "The theory says...", sum([choose(n,i)*(p**i)*((1-p)**(n-i)) for i in xrange(1,n+1)]) def choose(n,k): return fact(n)/(fact(k)*fact(n-k)) def fact(n): return 1 if n<=1 else n*fact(n-1) |
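The snippet in the comment above was flattened onto one line and targets Python 2. A minimal Python 3 sketch of the same binomial sum might look like the following; the function name `prob_at_least_one` is a hypothetical label, and `math.comb` stands in for the hand-written `choose` and `fact` helpers.

```python
from math import comb


def prob_at_least_one(p: float, n: int) -> float:
    """Binomial probability that an event with per-trial chance p
    occurs at least once in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(1, n + 1))


if __name__ == "__main__":
    p = 1 / 6  # chance of one particular face on a fair die
    for n in range(2, 15):
        print(n, round(prob_at_least_one(p, n), 4))
```

Under these assumptions the sum is just below 0.85 at 10 tosses and just above it at 11, which is the point the comment makes.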
| March 19, 2010 | David Travis wrote: |
| This is a great article to give people an intuitive understanding of why you can test with 5 users. Thanks for writing it! But this got me thinking… In most usability tests, participants carry out more than one task. And with multiple tasks, people are more likely to run into the problem (for example, let's say it's a problem with search, and every task involves reviewing the search results). Although the participant might not experience the problem on the first task, they may do if they carry out 6 tasks. Doesn't the maths you outline above assume a one-shot scenario, rather than the multiple task scenario? |
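One way to explore the multiple-task question is to assume that each task gives a participant an independent chance `q` of hitting the problem. That independence is an assumption made here for illustration, not something the article states, and both function names below are hypothetical.

```python
def per_user_probability(q: float, tasks: int) -> float:
    """Chance that one participant hits the problem at least once
    across `tasks` tasks, assuming an independent chance q per task."""
    return 1 - (1 - q) ** tasks


def detection_probability(p: float, users: int) -> float:
    """Chance that at least one of `users` participants hits a problem
    that each participant encounters with probability p."""
    return 1 - (1 - p) ** users


if __name__ == "__main__":
    p = per_user_probability(0.06, 6)  # hypothetical 6% chance per task, 6 tasks
    print(round(p, 3), round(detection_probability(p, 5), 3))
```

With these made-up inputs, a problem that shows up on only about 6% of individual tasks still reaches roughly 31% per participant over six tasks. Seen this way, the one-shot formula can still apply if the probability is read as per session rather than per task.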
| March 18, 2010 | Jeff Sauro wrote: |
| Jason, thanks for your comment. I'm not familiar with the Pascal Model. The derivations from the binomial, geometric and Poisson all generate the same results, so I suspect this might be another way to arrive at the same answer. |
| March 18, 2010 | Jeff Sauro wrote: |
| To the anonymous post below, perhaps you could elaborate a bit on what you disagree with. |
| March 18, 2010 | Take a probability course, take a stats course wrote: |
| You could always try to learn stats and probability before you tried to use concrete examples to convince those in your situation otherwise. P-values are not used that way either. |
| March 18, 2010 | Jason wrote: |
| As a stats minor you should be using the Pascal Model for the first two examples, not the Binomial... |
| March 17, 2010 | John Haugeland wrote: |
| Hm, looks like you're fixing bugs on the fly, without scrubbing the bad data out of the database. Two of my old comments have quoting bugs presenting, though new ones don't, and you've still left an unknown number of fake ratings of 10 in the database. |
| March 17, 2010 | John Haugeland wrote: |
| Awesome, if I post a comment with a URL in it, the comment disappears without notice. Third attempt at posting: ---------- Ok, so here are the two videos. The first shows the quoting bug, and the second shows the posting bug hitting twice in a row. The encoder got seriously slow for 20 seconds at the beginning of the second video; that clears up. Sorry about that; I'm on a laptop. http colon slash slash sc.tri-bit.com/outgoing/BrokenForm.wmv http colon slash slash sc.tri-bit.com/outgoing/BrokenForm2.wmv |
| March 17, 2010 | John Haugeland wrote: |
| Awesome, twice in a row this time. (Three really, but one wasn't on tape.) Note in the video that the name is correct, the email address is valid, and the math is right. |
| March 17, 2010 | John Haugeland wrote: |
| Well, maybe I'll figure it out if I see source. Otherwise, meh, I'll just look stupid. |
| March 17, 2010 | John Haugeland wrote: |
| Maybe it's about having a colon in place? Trying: I'll stop after. |
| March 17, 2010 | John Haugeland wrote: |
| One last try, then I'll just look dumb. |
| March 17, 2010 | John Haugeland wrote: |
| It only breaks sometimes. ' " ' " |
| March 17, 2010 | John Haugeland wrote: |
| Testing the broken form on video, so that Jeff can see the bug. Please note that the name and email address will be the same each time, the number will be correct, and that apostrophes ( ' ) will come back incorrectly quoted. I haven't tried double quotes ( " ) yet. |
| March 17, 2010 | John Haugeland wrote: |
| Arthur: good catch. Bet there's a lot of other stuff like that too. |
| March 17, 2010 | Jeff Sauro wrote: |
| Arthur, yes, you're right, it's a temporary fix. Thanks for finding it though! |
| March 17, 2010 | Arthur wrote: |
| Jeff Sauro wrote: > I've since fixed the page width issue Okay, Jeff, you didn't fix *that*, either. |
| March 17, 2010 | Arthur wrote: |
| Okay, now I just have to test the "really wide comment" thing for myself. :) |
| March 17, 2010 | John Haugeland wrote: |
| Also, it looks like if you make a rating then post a comment, the blog is trying to re-make the rating also; it continues to protest that my rating is already in place when I post a comment. If he's found 85% of defects with his however many more than five users, and in less than 10 minutes of site usage I've found nine defects, that suggests that if he's worked on this codebase for just a few weeks, there are tens of thousands of defects already solved. The mystery math just doesn't hold up. |
| March 17, 2010 | John Haugeland wrote: |
| Unfortunately, it also seems that Jeff hasn't cleared the vote-10s out of his database, meaning that the score is wildly distorted; I know because I just tried to vote 0 again, and it still thinks I already voted 10. |
| March 17, 2010 | John Haugeland wrote: |
| Earl Franklin: it's frustrating when someone just recites things they've heard from false sources without actually checking the work. I found four more bugs in this site already, one a security defect, and I'm requesting access to the code so that I can show Jeff how well this five user principle actually works. The fact of the matter is simple: Jeff has a lot more than five users, and there are a whole *bunch* of bugs about to be discovered, if he plays ball. Also, the captcha adder appears to fail one time in three, give or take, in current Firefox. The security defect is the most frustrating part of all of this. It makes clear how appropriate it is for this guy to be giving this kind of advice. |
| March 17, 2010 | Jeff Sauro wrote: |
| Thank you to those who pointed out the coding bugs on the site. I've since fixed the page width issue and the 0 rating problem, so rest assured your 0 votes for this article are being recorded. |
| March 17, 2010 | Logi Ragnarsson wrote: |
| (You want to delete or edit the post by AnonymousCoward below so it stops expanding the width of the page. Then you want to get some better code, which won't allow that to happen. I still don't know or sufficiently care what you wrote above.) |
| March 17, 2010 | Logi Ragnarsson wrote: |
| After not reading this post, since it's completely unusable, I noticed that it had an average rating of 8.48. I rated it at 0 since I couldn't read it without my eyes bleeding. However, I happened to glance at a comment saying that 0-votes weren't counted, so I re-voted at 1 instead, and I got this gem: "You already rated this page a 10" I really, really, really, hope that your content is better than your packaging. But I don't suppose I'll ever know. |
| March 17, 2010 | Logi Ragnarsson wrote: |
| Usability test this. I have no idea what you wrote beyond the title, since the lines are about a meter wide and I just don't care enough. |
| March 17, 2010 | Ken Zutter wrote: |
| If usability is the topic, then why is this webpage about 50 million pixels wide? The content does not flow. I have to scroll to the right. I cannot even see the submit button for this form. LOL FF 3.6 maximized on 1152 wide screen |
| March 17, 2010 | AnonymousCoward wrote: |
| I ran the last test 1% for a while. It took 200 samples to reach 85%. At 100 samples I had about 79% IIRC. 445, 73, 17, 107, 16, 514, 176, 54, 9, 164, 92, 447, 126, 127, 331, 11, 62, 190, 123, 229, 16, 88, 109, 58, 15, 76, 62, 116, 279, 26, 36, 120, 127, 128, 270, 110, 3, 51, 47, 46, 73, 47, 223, 136, 155, 90, 326, 135, 142, 95, 361, 18, 34, 237, 61, 31, 40, 14, 18, 181, 175, 12, 322, 245, 36, 45, 325, 12, 111, 229, 6, 3, 33, 52, 151, 11, 49, 121, 237, 199, 301, 153, 43, 35, 325, 2, 31, 42, 4, 6, 27, 158, 58, 179, 55, 119, 15, 138, 44, 261, 21, 26, 62, 49, 31, 13, 21, 54, 33, 17, 75, 115, 187, 181, 328, 26, 19, 32, 225, 390, 1, 117, 23, 216, 36, 66, 5, 138, 75, 59, 13, 29, 54, 298, 41, 32, 122, 4, 98, 26, 240, 78, 15, 26, 66, 54, 95, 77, 201, 33, 28, 78, 34, 168, 26, 64, 346, 84, 11, 10, 147, 76, 71, 434, 99, 47, 50, 120, 137, 47, 135, 39, 98, 91, 180, 280, 152, 148, 83, 82, 43, 93, 5, 55, 52, 57, 46, 8, 25, 60, 131, 78, 77, 32, 12, 31, 20, 7, 85, 50 |
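The numbers listed above read like per-run counts of how many simulated users it took before a 1% problem first appeared; that reading is an assumption. A rough sketch of such a simulation follows, with hypothetical function names and a limit of 189 users chosen because that is where the theoretical detection rate first reaches 85% for a 1% problem.

```python
import random


def users_until_found(p: float, rng: random.Random) -> int:
    """Simulate users one at a time and return how many were tested
    before one of them first hit a problem with per-user chance p."""
    count = 1
    while rng.random() >= p:
        count += 1
    return count


def share_found_within(p: float, limit: int, runs: int, seed: int = 0) -> float:
    """Fraction of simulated runs in which the problem was found
    within `limit` users."""
    rng = random.Random(seed)
    hits = sum(users_until_found(p, rng) <= limit for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    for runs in (100, 200, 20000):
        print(runs, share_found_within(0.01, 189, runs))
```

With only 100 or 200 runs the observed share can drift several points either side of the theoretical value, which would fit the 79% and 85% readings reported in the comment.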
| March 17, 2010 | Jeff Sauro wrote: |
| John, regarding your point about the 31%: that probability comes from the article by Nielsen and some of his papers (his article is linked to a couple of times). Whether a user will encounter a problem that frequently depends on a number of things, and you really don't know until you run the users. You would expect a problem frequency this high early in a design and not typically on released software. But the nice thing is, you don't need to know the probability ahead of time. When you test with only 5 users, you've most likely seen the problems that affect this many users. It's both a caution and a reassurance. Many people have been using the magic number 5 as a guide and think they are discovering 85% of all problems when they run five users. Instead, my hope was to show that you're just going to see 85% of the more obvious problems, the ones that impact 31%-100% of users. And on a well-tested application, one should expect that number to be below 10% or 5%, meaning you're going to need more users. |
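The relationship described in the reply above can be inverted to ask how many users are needed for a given problem frequency. The sketch below assumes the usual 1 - (1 - p)^n detection formula with independent users; `users_needed` is a hypothetical helper name, not something from the article.

```python
import math


def users_needed(p: float, target: float = 0.85) -> int:
    """Smallest number of users n such that 1 - (1 - p)**n >= target,
    for a problem each user encounters with probability p."""
    if not (0 < p < 1) or not (0 < target < 1):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    for p in (0.5, 0.31, 1 / 6, 0.01):
        print(round(p, 3), users_needed(p))
```

Under these assumptions, five users at p = 0.31 give a detection chance of roughly 84%, which is why the figure is quoted as approximately 85%; clearing a strict 85% threshold takes a sixth user, and a 1% problem takes 189.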
| March 17, 2010 | Tom wrote: |
| That comment below should have said "pointy-hairs", as in the pointy-haired boss from Dilbert, not "point hairs". |
| March 17, 2010 | Tom wrote: |
| If it's detectable 31% of the time, it's going to be found long before user testing. .31% (0.0031) might be a more reasonable number for code that has undergone any testing before user testing. And, of course, the probability is a function of users/unit of time. The sad thing is some point hairs are going to read the article, miss the qualifications, and this comment, and think they now know how much testing they need (5 users, for any period of time). |
| March 17, 2010 | Earl Franklin wrote: |
| Probability John Haugeland woke up on the wrong side of the bed today: 100% |
| March 17, 2010 | John Haugeland wrote: |
| Just to see how the blog would react, I tried to vote your article a 1 a second time. "You already rated this page a 10." And of course, it's also stripping whitespace out of comments. Clearly, you're to whom to go for software quality advice. Did you even bother to test your platform? |
| March 17, 2010 | John Haugeland wrote: |
| Also, your blog is discarding 0 ratings on articles; I had to re-vote as a 1 before the vote count would go up, or the rating change (which says a lot about your readiness to talk about finding defects, and the validity of your current article ratings). |
| March 17, 2010 | John Haugeland wrote: |
| > The five user number comes from the number of users you would need to detect approximately 85% of the problems in an interface, given that the probability a user would encounter a problem is about 31%. Well, if you're satisfied with 85 percent, or if you actually take this unsourced 31 percent number seriously, then this is probably enough to get you to believe. |
| March 17, 2010 | Lou Rawls wrote: |
| Oh wow, that's incredible. Lou |
| March 17, 2010 | Richard Metzler wrote: |
| How do you usability test your website? Is there any script in the background that measures where my cursor goes and where I try to click? I'm just wondering, because I tried to click on your heading to jump to your home page, but that did not work until I realized you only linked your logo and not your heading to the main page. Adding this would increase usability. Thanks for the article; I found it really useful. |
| March 17, 2010 | polat alemdar wrote: |
| testing a user is like what? rolling dice? rolling 20 number dice? toss a coin? that is not mentioned. The axiomatic condition is not explained. |
| March 16, 2010 | Bill Wun wrote: |
| @ton bil: Results: 88% of 300 samples found a tails in less than 3 tosses |
| March 15, 2010 | Ton Bil wrote: |
| At the first test I'm pretty stable in the range 89 - 91 % after 50 samples (I went up to 250). |
| March 9, 2010 | Doug Baker wrote: |
| This is a response to murph. Based on Jeff's recommended strategy above, I'd say that you have to figure out for your projects what threshold constitutes an outlier. Is it below 20%? 15%? 10%? If you want to discover all issues that are not outliers, then you use the binomial formula as Jeff describes and test 9, 11, or however many participants to discover those issues. If you decide that an outlier is anything affecting less than 10% of the population, and then run a test with 5 participants, then Jeff says that the problems you find will affect at least a third (roughly) of your user population. That means that the issues you discover are not outliers. Now, if you are concerned about a participant being an outlier, there is always a risk of that. The way that risk is mitigated is through a careful recruitment/screening process. This is important no matter how many participants one chooses to test, but is even more important for a small sample size. |
| March 9, 2010 | murph wrote: |
| Discovery of an issue is one thing - establishing that it is not the result of an outlier is another thing entirely. A client who must spend substantial development hours to correct an issue is going to want to know how likely this issue will occur across the user population. It's certainly easier to rationalize away an issue if there is no solid basis for determining the return on investment. |
Why you only need to test with five users (explained)
Copyright © 2004-2010 Measuring Usability LLC
