Which Rating Scales Should I Use?

Summary: Different survey rating scales are bound to yield different results because we are mainly dealing with human perception. We need to be careful about how we use them.

By Michaela Mora on August 19, 2011
Topics: Analysis Techniques, Market Research, Survey Design

Clients often ask me to review surveys or analyze data collected via surveys they developed themselves. More often than not, I find rating scales (aka Likert scales) of different sizes and directions within the same survey. When I ask why, I get answers such as “This is the one we have always used.”

It seems this question type is often chosen based on preference or habit (e.g. legacy surveys). This is not surprising since there is no consensus on which scales work best. They all yield different results, which is disheartening in a way.

What The Research Says

A lot of research has been dedicated to this subject. Unfortunately, there is no simple answer to the question of which rating scales we should use.

Research on Rating Scales
Source: International Journal of Social Research Methodology, Vol. 13, No. 1, Feb. 2010, 17-27 (Hartley and Betts)

How to Avoid or Handle Rating Scales

This extensive body of research shows that different rating scales are bound to yield different results as we are mainly dealing with human perception. They mean different things to different people and the values, words, and order in which we present them have an impact on how they are interpreted. What to do?

  • Whenever possible, favor question formats other than rating scales. For example, MaxDiff has been shown to discriminate better in preference and importance measurements.
  • If you still have to use rating scales, strive for consistency and use them with full knowledge of the bias they introduce in the data, particularly if you want to compare data from different rating scales or from different surveys. This is especially relevant in tracking studies. A change in rating scale from one wave to another may show statistically significant differences that are artifacts of the measurement error introduced by the change in scale.
  • Above all, triangulate the results with other data sources to understand how different scale points correlate with actual behavior, and ask why the person gives a particular rating. If possible, use a text analytics tool to get at the heart of what the scale really means for a respondent. The example below says it all.
Product Rating
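
The triangulation advice above can be illustrated with a minimal Python sketch. It assumes you can match each respondent's rating to a later observed behavior, here whether they purchased. The data, the 1-5 scale, and the function name purchase_rate_by_rating are all hypothetical and serve only to show the shape of the check, not a recommended method.

```python
from collections import defaultdict

# Hypothetical data: (rating on a 1-5 scale, whether the respondent later purchased).
responses = [
    (5, True), (5, True), (5, False),
    (4, True), (4, False), (4, False),
    (3, False), (3, True),
    (2, False), (1, False),
]

def purchase_rate_by_rating(rows):
    """Return {rating: share of respondents at that rating who purchased}."""
    counts = defaultdict(lambda: [0, 0])  # rating -> [buyers, respondents]
    for rating, purchased in rows:
        counts[rating][0] += int(purchased)
        counts[rating][1] += 1
    return {r: buyers / total for r, (buyers, total) in sorted(counts.items())}

rates = purchase_rate_by_rating(responses)
for rating, rate in rates.items():
    print(f"rating {rating}: {rate:.0%} purchased")
```

If the purchase rate barely changes between two adjacent scale points, respondents may not be distinguishing those points in a way that matters for behavior, which is a cue to look at their open-ended comments.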