Measuring Usability
Quantitative Usability, Statistics & Six Sigma by Jeff Sauro

That's the worst website ever!: Effects of extreme survey items

Understanding the effect of using extremely worded items on the System Usability Scale

Jeff Sauro • September 21, 2010

Items in questionnaires are typically worded neutrally so as not to state concepts in the extreme. They are like an even-tempered friend—they have opinions but aren't overly optimistic or chronically pessimistic about things.

What happens when items in a questionnaire or survey are worded in the extreme?

Two years ago we tried a little experiment at the annual UPA conference to find out.

We wanted to know what would happen if we rephrased the moderately worded items of the popular System Usability Scale questionnaire (SUS). Specifically, we wanted to see the effects of using extremely worded items instead of the original neutral items.

For example, item one of the SUS questionnaire is: I think that I would like to use this system frequently.

What would happen if we made it an extreme positive statement?

I think that this is one of my all-time favorite web sites.

Or an extreme negative statement?

I think I never want to use the web site again.

Would respondents notice the difference? If so, how would it affect their scores? A group of us (Keith Karn, Alex Little, Greg Nelson, Jeff Sauro, Jurek Kirakowski, William Albert and Kent Norman) created two new versions of the SUS: an extreme positive version and an extreme negative version (shown below).

The extreme positive SUS.
  1. I think that this is one of my all-time favorite web sites.
  2. I found the web site was really straightforward.
  3. I thought the web site was amazingly easy to use.
  4. I think that technical support services are just not required for the web site.
  5. I found the various pages on the web site worked together very smoothly.
  6. I thought the web site was consistent throughout.
  7. I would imagine anybody could use the web site like a pro from day one.
  8. I found the web site was a delight to use.
  9. I felt completely confident using the web site.
  10. Everything I needed to know about using the website was there for me.

The extreme negative SUS.
  1. I think I never want to use the web site again.
  2. I found the web site to be horribly complex for no good reason.
  3. I thought the web site was very difficult to use.
  4. I think that I would need a permanent hot-line to the help desk to be able to use the web site.*
  5. I found all the pages on the web site to be an ugly mess.
  6. I thought the inconsistency in the web site would kill it.
  7. I found the web site to be completely impossible to use.
  8. I found that this web site was extremely awkward to use.
  9. I felt utterly confused by the web site.
  10. Absolutely nothing about the web site worked.
*Indicates my personal favorite.

We sought out volunteers and asked them to review the UPA website. After the review, each participant was randomly assigned one of five SUS questionnaires: the all-positive extreme version, the all-negative extreme version, one of two extreme mixed versions (half positive and half negative extreme items), or, as a baseline, the standard SUS questionnaire. Around 60 people participated in total, giving us between 10 and 14 responses per condition.

What happened?

In short, extreme wording makes a difference, and a big one. The average perception of usability as measured with the original SUS items was 60. The average score on the extremely worded negative questionnaire was 77, or around 25% higher.

The average score on the extremely worded positive questionnaire was 41 or around 30% lower than the original SUS score. 
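As a quick check on the arithmetic, here is a minimal sketch that computes the relative differences from the three rounded means quoted above. The function name `relative_change` is my own, not part of any SUS tooling. With the rounded means the differences work out to roughly 28% and 32%, in the same neighborhood as the "around 25%" and "around 30%" figures; the unrounded data may well give slightly different values.

```python
def relative_change(score, baseline):
    """Return the percentage change of `score` relative to `baseline`."""
    return (score - baseline) / baseline * 100


# Rounded mean SUS scores reported in the post.
baseline = 60   # standard SUS
negative = 77   # extreme negative version
positive = 41   # extreme positive version

neg_change = relative_change(negative, baseline)
pos_change = relative_change(positive, baseline)

print(f"Extreme negative vs. standard: {neg_change:+.1f}%")
print(f"Extreme positive vs. standard: {pos_change:+.1f}%")
```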

Both differences were statistically significant despite the relatively small sample sizes (p < .01) and are shown along with 95% confidence intervals in the graph below.


Figure 1: Mean and 95% Confidence Intervals for SUS scores by type of SUS questionnaire.



People basically reacted to these extreme items by disagreeing with them more.
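To see why more disagreement pushes the two extreme versions in opposite directions, it helps to look at how SUS responses are scored. The sketch below assumes the standard SUS convention (positively worded items contribute the response minus 1, negatively worded items contribute 5 minus the response, and the sum is multiplied by 2.5) and assumes that convention was simply extended to the all-positive and all-negative versions. The function and constant names are illustrative, and the response pattern is made up.

```python
def sus_score(responses, positive_items):
    """Score ten 1-5 responses on the 0-100 SUS scale.

    responses: list of ten integers, 1 (strongly disagree) to 5 (strongly agree).
    positive_items: set of 1-based item numbers that are positively worded;
                    every other item is treated as negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        if number in positive_items:
            total += response - 1      # agreement raises the score
        else:
            total += 5 - response      # disagreement raises the score
    return total * 2.5


STANDARD = {1, 3, 5, 7, 9}         # odd items positive, even items negative
ALL_POSITIVE = set(range(1, 11))   # assumed scoring for the extreme positive version
ALL_NEGATIVE = set()               # assumed scoring for the extreme negative version

# A hypothetical respondent who mildly disagrees with every statement.
mostly_disagree = [2] * 10

print(sus_score(mostly_disagree, ALL_POSITIVE))  # 25.0
print(sus_score(mostly_disagree, ALL_NEGATIVE))  # 75.0
print(sus_score(mostly_disagree, STANDARD))      # 50.0
```

The same pattern of disagreement yields a low score when every item is positive and a high score when every item is negative, which matches the direction of the results reported here.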

Interestingly enough, users who received half extreme positive and half extreme negative items showed no significant differences from the standard SUS. The higher scores from the extreme negative items basically canceled out the lower scores from the extreme positive items. Item intensity and direction were confounded, so separating the effects of reversing items from the effects of making them extreme is difficult at this sample size (although I'll cover this in a future blog).

Why do people disagree with extreme items?

There are probably several reasons why users tend to disagree more with extremely worded items, but one good explanation comes from some of the earliest research on rating scales (Thurstone 1928).

It has been noted that people tend to agree only with statements that are close to their attitude and disagree with all other statements. By rephrasing items to their extreme, only respondents who had passionately favorable attitudes about the usability of the UPA website tended to agree with the extremely phrased positive statements, resulting in a significantly lower average score. Likewise, only respondents who passionately disfavored the usability agreed with the extremely negative questions, resulting in a significantly higher average score.
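The "agree only with nearby statements" idea can be illustrated with a toy model. This is only a sketch under invented assumptions: attitudes and item wordings are placed on a hypothetical 0-100 favorability scale, a respondent agrees when the item sits within a fixed tolerance of their attitude, and none of the numbers come from the study.

```python
def share_agreeing(attitudes, item_position, tolerance=20):
    """Fraction of respondents whose attitude lies within `tolerance`
    points of where the item sits on the favorability scale."""
    hits = sum(abs(a - item_position) <= tolerance for a in attitudes)
    return hits / len(attitudes)


# Hypothetical attitudes toward a web site (0 = hate it, 100 = love it).
attitudes = [10, 30, 40, 50, 55, 60, 65, 70, 80, 95]

moderate_item = 65   # e.g. "I would like to use this system frequently"
extreme_item = 100   # e.g. "one of my all-time favorite web sites"

print(share_agreeing(attitudes, moderate_item))  # 0.6
print(share_agreeing(attitudes, extreme_item))   # 0.2
```

In this toy sample most respondents sit near the moderate item, so moving the wording to the extreme leaves only the most enthusiastic few within reach of it.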

While I don't recommend that anyone use the above questions in their next usability evaluation, it should be clear from these data that extremely worded items make a major difference in scores. In fact, compared to other changes you can make, such as the number of scale points or the alignment of the response options, these effects are huge: about 3 times as large.
 
When you're creating your next survey or questionnaire, keep in mind that questions and items interpreted to be extreme will likely result in fewer people agreeing with them. In most cases you'll probably want items that have a more neutral wording. Of course, what makes an item "extreme" can be highly contextual and controversial. 

In hindsight it seems obvious that reactions to extremely worded questions would be like our reactions to extremely opinionated people: we tend to disagree with them more. The good news is that despite the many ways you can mess up a questionnaire, as long as you use the same questionnaire when you make comparisons between designs, you'll probably still have meaningful results.
 


About Jeff Sauro

Jeff Sauro is the founding principal of Measuring Usability LLC, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 15 journal articles and 3 books on statistics and the user-experience.