Calculating Sample Size for Task Times (Continuous Method)
by Jeff Sauro | September 17, 2004
We already saw how a manageable sample of users can provide
meaningful results for
discrete-binary data like
task completion. With continuous data like task times, the sample size
can be even smaller.
The continuous calculation is a bit more complicated and
involves somewhat of a Catch-22. Most people want to determine the sample size
ahead of time, then perform the testing based on the results
of the sample size calculation, as in the
binary
sample calculation. In this case, we need to have some data already
or at least a strong hypothesis about our user population.
As with the binary calculation for task completion, we
know that when testing experienced users (those who complete the task at
least weekly), they should overwhelmingly complete the task successfully.
With task times we should also have a rough estimate of the mean and
standard deviation (there's the Catch-22). If you're performing a benchmarking
study and already have some data, then you can use that data. If you
have time, run a pretest with a few users, say four, to get a sense of the
range in times. Of course, when all else fails you can have some internal
folks complete the tasks--perhaps some sales or service employees or
whoever comes close to matching the speed and accuracy of your target
users. You'll need to have an idea of the standard deviation (in seconds)
for each task you're testing.
For example, let's use the sample task, "Looking up
a balance on an account number" (a very common task in accounting
software). You write up a scenario, try the task yourself and have
three of your colleagues complete it. Chances are you're completing
the task faster than your users; nevertheless, it will still provide
you with a range of times. Here are the times in seconds:
| | Time (in seconds) |
|---|---|
| You | 101 |
| Colleague 1 | 132 |
| Colleague 2 | 125 |
| Colleague 3 | 145 |
| Mean | 125.75 |
| St Deviation | 18.46 |
| Range | 44 |
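As a quick check, the summary rows of the pre-test table can be reproduced with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original article.

```python
import statistics

# Pre-test task times in seconds, copied from the table above.
pretest_times = [101, 132, 125, 145]

mean_time = statistics.mean(pretest_times)      # arithmetic mean
st_dev = statistics.stdev(pretest_times)        # sample standard deviation (n - 1 denominator)
time_range = max(pretest_times) - min(pretest_times)

print(f"Mean: {mean_time:.2f}  St Deviation: {st_dev:.2f}  Range: {time_range}")
```

Note that `statistics.stdev` is the sample standard deviation, which is the appropriate choice when the four times are treated as a sample from a larger population of users.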
From this pre-test sample you want to be able to derive
as close an estimate as possible to the range in times of your actual
users. To operationalize this, you would say "I want to be 95%
confident of the mean time within ten seconds." So instead of simply
asking, "How many users do I need to test?", you ask "How
many users do I need to test to be 95% sure I know their mean task time
within ten seconds?" Here's where the real statistics start.
That ten-second range will become the confidence interval.
The confidence interval is that + or - fudge factor seen with the polls
on TV. With this confidence interval we can work backwards to arrive
at our sample size. Because we don't know the standard deviation of
the whole population of users (again the Catch-22), we need to estimate
it from the small sample we have. For small samples (fewer than 30) where
the parent standard deviation (σ) is not known, you use what's
called the
Student
t distribution. The Student t distribution uses values from
a t table instead of the more familiar z table of normal values.
The confidence interval is calculated by multiplying this
t-statistic (t*) by the Standard Error (SE). The Standard Error is just
the sample standard deviation (s) divided by the square root of the sample
size (n). So the confidence interval formula usually looks something like
this:

$$\text{CI} = \pm\, t^{*} \times SE = \pm\, t^{*} \frac{s}{\sqrt{n}}$$
To arrive at the elusive "significant"
sample size, you need to try a few reasonable sample sizes and see which
ones fall within the limits of the confidence interval. The values (n)
you choose will affect the critical value for t and the Standard
Error, since both use n in their equations. We'll use 25, 20, 15, 10 and
5, and whichever value has a confidence interval at about 10 seconds
we'll use as the ideal sample. (Again, all this assumes that our internal
sample did a good job of determining the standard deviation of the larger
population.)
| Sample (n) | 95% CI (+/- seconds) | SE | SQRT N | Stdev | t* (.95) |
|---|---|---|---|---|---|
| 25 | 7.61 | 3.692 | 5 | 18.46 | 2.063 |
| 20 | 8.63 | 4.12 | 4.47 | 18.46 | 2.093 |
| 15 | 10.22 | 4.76 | 3.87 | 18.46 | 2.144 |
| 10 | 13.20 | 5.83 | 3.16 | 18.46 | 2.262 |
| 5 | 22.92 | 8.25 | 2.23 | 18.46 | 2.776 |
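The trial-and-error search in the table can be sketched in Python as follows. This is a minimal illustration, not the author's own tooling: the critical t values are copied from the table rather than computed, and the helper names and the half-second tolerance used to interpret "about 10 seconds" are assumptions made for this example. Results may differ from the table in the last decimal place because of rounding.

```python
import math

# Standard deviation estimated from the four-person pre-test, in seconds.
PRETEST_ST_DEV = 18.46

# Two-sided 95% critical t values for n - 1 degrees of freedom, copied from
# the table above. With SciPy installed, scipy.stats.t.ppf(0.975, n - 1)
# gives essentially the same values.
T_CRITICAL_95 = {25: 2.063, 20: 2.093, 15: 2.144, 10: 2.262, 5: 2.776}


def ci_half_width(n, st_dev, t_star):
    """Return (standard error, +/- confidence interval) for a sample of size n."""
    standard_error = st_dev / math.sqrt(n)
    return standard_error, t_star * standard_error


def smallest_candidate_within(target_seconds, tolerance=0.5):
    """Smallest candidate n whose interval is within `tolerance` of the target.

    The tolerance is a hypothetical stand-in for "about 10 seconds".
    """
    acceptable = [
        n for n, t_star in T_CRITICAL_95.items()
        if ci_half_width(n, PRETEST_ST_DEV, t_star)[1] <= target_seconds + tolerance
    ]
    return min(acceptable)


for n, t_star in T_CRITICAL_95.items():
    se, ci = ci_half_width(n, PRETEST_ST_DEV, t_star)
    print(f"n={n:>2}  SE={se:.2f}  95% CI=+/-{ci:.2f} s")

print("Smallest candidate near +/-10 s:", smallest_candidate_within(10))
```

Hard-coding the t values keeps the sketch dependency-free; computing them from a statistics library would let you try sample sizes that are not in the table.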
At about 15 users, the confidence interval
narrows close enough to ten seconds that it will probably be sufficient.
I'd use this 15 as the approximate number of users you'd need to sample
and know that to get more precise, you'd need to sample more than 15
users. This result is much better than thinking you need to test 100
or 1000 in order to get "statistically significant" results. If
+/- 10 seconds isn't precise enough you can:
- Decrease your confidence level to 90% or 85%.
- Sample more users.
- Decrease your confidence interval and increase your sample.
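To illustrate the first option, here is a small sketch comparing the interval at 95% and 90% confidence for 15 users. The 95% critical value is the one used in this article; the 90% value (1.761 for 14 degrees of freedom) is taken from a standard t table and is an assumption added for this example, as are the variable names.

```python
import math

PRETEST_ST_DEV = 18.46   # seconds, from the pre-test sample
N = 15                   # planned number of users

# Two-sided critical t values for 14 degrees of freedom. The 90% value comes
# from a standard t table and is not part of the original article.
T_STAR = {0.95: 2.144, 0.90: 1.761}

standard_error = PRETEST_ST_DEV / math.sqrt(N)
half_widths = {level: t_star * standard_error for level, t_star in T_STAR.items()}

for level, width in half_widths.items():
    print(f"{level:.0%} confidence: +/-{width:.2f} s")
```

Lowering the confidence level shrinks the interval without adding users, at the cost of being less sure that the interval contains the true mean.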
Sample Sizes in the Real World of Usability Testing
If you've run enough usability tests, you know that your
sample size is usually determined ahead of time--that is, you know your
budget and time frame and therefore approximately how many users you'll
be sampling--usually somewhere between 10 and 30. I then approach sampling
as getting as many users as I can within that range and then computing
the statistics later.
For example, let's say we followed our initial indication
and sampled 15 users (assuming our budget and time fit nicely with this
figure). We had them complete the same task of looking up an account
balance as our small internal employee sample. Here are the results
next to our initial internal sample:
Real-Users Sample

| User | Time (in seconds) |
|---|---|
| 1 | 101 |
| 2 | 140 |
| 3 | 112 |
| 4 | 144 |
| 5 | 132 |
| 6 | 99 |
| 7 | 118 |
| 8 | 125 |
| 9 | 154 |
| 10 | 115 |
| 11 | 118 |
| 12 | 125 |
| 13 | 141 |
| 14 | 145 |
| 15 | 130 |
| Mean | 126.6 |
| St Deviation | 16.33 |
| Range | 55 |

Internal Sample

| | Time (in seconds) |
|---|---|
| You | 101 |
| Colleague 1 | 132 |
| Colleague 2 | 125 |
| Colleague 3 | 145 |
| Mean | 125.75 |
| St Deviation | 18.46 |
| Range | 44 |
With this sample we can now estimate the true mean time
of our population. Using the formula for the Student t distribution:

$$\mu = \bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$$

| Symbol | Meaning |
|---|---|
| x̄ | mean time of your sample (126.6) |
| μ | true mean time of the entire population of users |
| n | number of users in the sample (15) |
| s | the standard deviation of the sample (16.33) |
| t* | t statistic (2.144789), or use the Excel function =TINV(.05,14) [significance level (.05, which is 1 minus the 95% confidence level) and degrees of freedom n-1 (14)] |
Plugging in the numbers, for the estimated mean of the total population
of users on this task we get:
μ = 126.6 + or - 9.05
So when reporting the mean time for this task we would
say, "We are 95% confident the mean time is between 117.6 seconds
and 135.6 seconds." In this example, our original sample turned
out to be a good estimate of the mean time and standard deviation,
but don't expect that to usually work out so well.
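The same calculation can be run directly on the 15 real-user times listed earlier. This is a hedged sketch using only the Python standard library; the critical t value is the one quoted in this article rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Task times in seconds for the 15 users in the Real-Users Sample.
user_times = [101, 140, 112, 144, 132, 99, 118, 125,
              154, 115, 118, 125, 141, 145, 130]

# Critical t value for 95% confidence and 14 degrees of freedom, as quoted in
# the article (Excel: =TINV(.05,14)).
T_STAR = 2.144789

n = len(user_times)
mean_time = statistics.mean(user_times)
st_dev = statistics.stdev(user_times)          # sample standard deviation
margin = T_STAR * st_dev / math.sqrt(n)        # t* times the standard error
lower, upper = mean_time - margin, mean_time + margin

print(f"n={n}  mean={mean_time:.1f}  s={st_dev:.2f}")
print(f"95% CI: {mean_time:.1f} +/- {margin:.2f}  ({lower:.1f} to {upper:.1f} seconds)")
```

Working from the raw times rather than the rounded summary values avoids small rounding differences in the final interval.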