Making the case for MaxDiff (Maximum Difference Scaling) is getting easier. In research on preferences and importance, MaxDiff offers clear advantages over rating, ranking, and allocation questions.
First, let’s look at the question types that have traditionally been used to study preferences and what’s important to people.
Rating Questions

It is rare these days to find surveys without rating questions, despite the many problems associated with them. Rating questions are susceptible to:
- Satisficing: They make it easy for respondents to engage in minimum-effort behaviors that merely satisfy the question’s requirement. We often see this in acquiescence-biased responses (the tendency to agree with everything).
- Scale Meaning Bias: Scale points can mean different things to different respondents in the same culture and across cultures. Some respondents may use only certain parts of the scale, which leads to extreme responding.
- Social Desirability Bias: Respondents are likely to select the most desirable side of the scale for questions about sensitive topics.
- Lack of Discrimination: Respondents often rate everything as preferred or important because nothing forces them to prioritize.
- Ordinal Data: Although rating questions are often treated as producing ratio data, at best they provide ordinal data, which does not allow us to assess the magnitude of differences between items.
Ranking Questions

Ranking questions are also common in surveys. Unfortunately, they too have many limitations, including:
- Order bias: We get different results depending on whether the respondent ranks the items from highest to lowest or vice versa.
- Respondent Burden: Ranking is a difficult task as respondents have to evaluate all items at the same time to determine their ranking.
- Number of items: Only a limited number of items can be tested without increasing the level of effort required by the respondent.
- Ordinal Data: A ranking tells us the order of the items but not the distance between adjacent ranks, so once again we cannot estimate the magnitude of the differences between items.
- Ties: Rankings don’t allow for ties, which can occur in reality.
- Limited Reporting: It is difficult to report ranking questions beyond counts for the top items, leading to information loss.
Constant Sum Questions
Although less common, constant sum questions are often used in an attempt to find greater discrimination. However, these questions have their own set of weaknesses, including:
- Respondent Burden: Allocating points to different items is an even more difficult task. Respondents have to evaluate all items at the same time to determine the number of points that they need to allocate to each item.
- Number of items: Like ranking questions, we can only test a limited number of items without increasing the level of effort required by the respondent.
- Response Strategies: Given the difficulty of the task, respondents often adopt strategies to make it easier (e.g., allocating an equal number of points to each item, or giving all points to one item).
The problems with each of these question types, particularly rating questions, have led to increased interest in Maximum Difference Scaling, or MaxDiff as it is commonly called.

MaxDiff is a trade-off technique that lets us run many pairwise comparisons efficiently: we repeatedly ask respondents to select the most and least preferred (or most and least important) items from small subsets of the full list. MaxDiff offers:
- Strong discrimination power
- It is a simple task for the respondent
- Allows a larger number of items to be tested
- Eliminates scaling bias
- Works across diverse populations, which is necessary for international studies
- Provides ratio data and a measure of difference magnitudes
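To see why the best–worst task discriminates so efficiently, note that a single choice of best and worst from a screen of k items reveals 2k − 3 of the k(k − 1)/2 possible pairwise comparisons among those items. A minimal sketch (item names are illustrative):

```python
# Each MaxDiff screen shows a small subset of items; the respondent picks
# the best and the worst. That single answer implies many pairwise wins.

def implied_pairs(shown, best, worst):
    """Return the (winner, loser) pairwise comparisons implied by one
    best/worst choice from the items shown on a single MaxDiff screen."""
    pairs = set()
    for item in shown:
        if item != best:
            pairs.add((best, item))   # the best item beats every other item
        if item != worst:
            pairs.add((item, worst))  # every other item beats the worst item
    return pairs

# Hypothetical screen of 4 items:
shown = ["price", "battery", "camera", "weight"]
pairs = implied_pairs(shown, best="battery", worst="weight")
print(len(pairs))  # 5 of the 6 possible pairs; only price vs. camera is unknown
```

With four items per screen, one answer resolves five of the six pairwise comparisons, which is why relatively few screens yield strong discrimination.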
In order to implement a MaxDiff study, we need to:
- Identify the number of items to test.
- Create an experimental design that provides:
  - Frequency balance (each item appears the same number of times),
  - Orthogonality (each item is paired with every other item the same number of times),
  - Position balance (each item appears the same number of times in each position).
- Estimate utilities for each item using Hierarchical Bayes (HB) analysis or multinomial logit (MNL), and rescale them for interpretation.
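Before fitting an HB or MNL model, a simple counts-based score (best picks minus worst picks, divided by times shown) is a common first approximation of the utilities. A minimal sketch, assuming responses are stored as (items shown, best pick, worst pick) tuples; the data and function names are illustrative:

```python
from collections import Counter

def bw_scores(tasks):
    """Counts-based MaxDiff scores: (best picks - worst picks) / times shown.
    `tasks` is a list of (shown_items, best, worst) tuples pooled across
    respondents. Scores range from -1 (always worst) to +1 (always best)."""
    best, worst, shown = Counter(), Counter(), Counter()
    for items, b, w in tasks:
        shown.update(items)   # count every appearance of each item
        best[b] += 1          # count best picks
        worst[w] += 1         # count worst picks
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Hypothetical responses from two MaxDiff screens:
tasks = [
    (["price", "battery", "camera", "weight"], "battery", "weight"),
    (["price", "battery", "camera", "weight"], "battery", "price"),
]
scores = bw_scores(tasks)
print(scores["battery"])  # 1.0: picked as best every time it was shown
```

These counts are only a descriptive summary; HB or MNL estimation is still preferred for individual-level utilities and for the rescaled scores reported to stakeholders.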
The standard output of MaxDiff analysis is usually a ranking of the items tested based on rescaled utilities.
Since MaxDiff yields ratio-scaled data, we can also conduct further multivariate analyses, including TURF and segmentation analysis.
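As an illustration of the TURF step, once MaxDiff utilities tell us which respondents each item would appeal to, TURF (Total Unduplicated Reach and Frequency) finds the subset of items that reaches the most respondents. A minimal exhaustive-search sketch; the items, respondent sets, and threshold logic are hypothetical:

```python
from itertools import combinations

def turf_reach(reached_by, portfolio):
    """Reach of a portfolio: number of respondents covered by at least one
    item. `reached_by[item]` is the set of respondent ids the item appeals to
    (e.g., those whose MaxDiff utility for it exceeds a chosen threshold)."""
    covered = set()
    for item in portfolio:
        covered |= reached_by[item]
    return len(covered)

def best_portfolio(reached_by, size):
    """Exhaustive TURF: the portfolio of `size` items with maximum reach."""
    return max(combinations(reached_by, size),
               key=lambda p: turf_reach(reached_by, p))

# Hypothetical appeal sets derived from MaxDiff utilities:
reached_by = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {5, 6},
}
print(best_portfolio(reached_by, 2))  # ('A', 'C'): reaches 5 of 6 respondents
```

Exhaustive search is fine for a handful of items; with many items, a greedy or heuristic search is typically used instead.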
We have used MaxDiff to study preferences for and importance of a number of things including, among others:
- Features of products and services
- Benefits of products and services
- Names for new products and services
- Positioning Statements
- Advertising Banners
- Design Concepts
- Loyalty Program Offers
- Advertising Messages
- Satisfaction Drivers
- Brand Perceptions
- User Behaviors
- User Frustrations and Motivations
- Topics of interest
MaxDiff is better than the alternatives discussed above, but it is not perfect.
- Survey Length: It will extend the survey by a few minutes, particularly if you are testing many items.
- Result Interpretation: Results are relative to the list of items tested. Depending on your research goals, a relative measure may not be what you want: MaxDiff estimates the preference for items relative to each other, but it doesn’t tell us whether the set of items is good or bad in absolute terms.
Although there are ways to calibrate the MaxDiff results to “absolute” levels of preference or importance, we always recommend doing preliminary research, ideally qualitative research, to make sure to include relevant items in the MaxDiff list.
Next time you need to measure preferences or importance, consider using MaxDiff instead of traditional approaches such as rating, ranking, or constant sum questions.
You will gain data quality, greater discrimination, and the ability to provide better insights to support business decisions.
An earlier version of this article was published on September 28, 2010. The article was last updated and revised on October 7, 2019.