Validity and Reliability in Surveys

Summary: Validity is about the accuracy of a measurement. Reliability is about its consistency, including internal consistency. To achieve both, good survey design is a must.

By Michaela Mora on February 21, 2011
Topics: Analysis Techniques, Market Research, Sample Size, Survey Design

We expect validity and reliability in surveys, but a lot of work is required to achieve both.

We need to consider many things in order to write surveys that gather high-quality data. These include, among others:

  • Data collection method
  • Respondent effort
  • Questions’ wording, order, format, structure, and visual layout
  • Measured behaviors
  • Accuracy of the elicited information

Although all these issues are important, at the end of the day, what we want is to create surveys that yield results that are valid and reliable.

Discussions about validity and reliability are common in the field of psychometrics, but not so much in market research. Nonetheless, we assume they are present. 

Validity

Validity is concerned with the accuracy of our measurement. Although often discussed in the context of sample representativeness, we know that survey design also affects validity. In other words, it depends on asking questions that measure what we want to measure.

Most surveys have what is called face validity, which is a matter of appearances. The questions seem like a reasonable way to obtain the information we are looking for, but do they really obtain it?

There are other types of validity survey writers should strive for.

Content Validity

This is related to our ability to create questions that reflect the issue we are researching and make sure that key related subjects are not excluded.

For example, we may want to learn how consumers use hair styling products and only ask how they used them in the past week. In this case, we are likely to miss information about product usage under different weather conditions (given that humidity can give you a bad hair day in the blink of an eye). Consequently, we may end up with an incomplete picture of consumers’ behavior in this category.

Internal Validity

This asks whether the questions we pose can really explain the outcome we want to research. In our hair styling product example, we need to ask questions that help us identify factors that influence the selection of hair styling products.

Here we are looking for a relationship between independent variables (e.g., hair type, hairstyle, etc.) and the dependent variable (e.g., likelihood to buy the hair styling products).
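As a first look at such a relationship, we can compare the dependent variable across levels of one independent variable. The sketch below uses made-up responses (hair type and a 1-5 purchase likelihood rating); the data and the helper name `mean_by_group` are assumptions for illustration only. A difference in group averages suggests an association worth testing further; it does not establish internal validity on its own.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical responses: (hair type, purchase likelihood on a 1-5 scale).
responses = [
    ("curly", 5), ("curly", 4), ("curly", 5),
    ("straight", 2), ("straight", 3), ("straight", 2),
    ("wavy", 4), ("wavy", 3),
]

def mean_by_group(rows):
    """Average the dependent variable within each level of the independent variable."""
    groups = defaultdict(list)
    for group, score in rows:
        groups[group].append(score)
    return {group: mean(scores) for group, scores in groups.items()}

likelihood_by_hair_type = mean_by_group(responses)
for hair_type, avg in sorted(likelihood_by_hair_type.items()):
    print(f"{hair_type}: {avg:.2f}")
```

In a real study we would look at several independent variables at once and check whether the differences hold up statistically.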

External Validity

This refers to the extent to which the results can be generalized to the target population the survey sample is representing. As we all know, the way we ask questions will determine the answer we get.

In other words, the questions should represent how the target population talks and thinks about the issue under research, which often calls for exploratory qualitative research.

In our example, assume we want to estimate the share of preference of our product in the hair styling product category. To achieve this, we need to include other brands that represent this category; otherwise, we can’t extrapolate the results to the category as a whole.

Reliability

Reliability, on the other hand, is concerned with the consistency of our measurement. This is the degree to which the questions elicit the same type of information each time we use them, under the same conditions.

This is particularly important in satisfaction and brand tracking studies because changes in question wording and structure are likely to elicit different responses.

Reliability is also related to internal consistency, which refers to how different questions or statements measure the same characteristic.

Market segmentation studies provide a practical application of this concept. Many of these studies try to capture psychographics and construct behavioral or satisfaction segments. We do it often by asking respondents to rate a list of statements using different rating scales (e.g., agreement/disagreement; likes/dislikes; excellent/poor, etc.).

In our example, if we want to identify “lovers of styling products,” we should use statements to describe such consumers in a consistent way. We can test it with the help of correlation analysis, split-sample comparisons, or methods such as Cronbach’s Alpha.
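To make the last method concrete, here is a minimal sketch of Cronbach's Alpha using the standard formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of statements. The ratings are invented 1-5 agreement scores for three hypothetical "styling lover" statements, and the function name `cronbach_alpha` is ours, not from any particular library.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's Alpha for a list of respondent rows (one rating per statement)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's Alpha needs at least two statements")
    items = list(zip(*responses))  # one tuple of ratings per statement
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)

# Hypothetical data: 5 respondents rating 3 statements on a 1-5 agreement scale.
ratings = [
    [5, 4, 5],
    [4, 4, 4],
    [2, 1, 2],
    [3, 3, 2],
    [1, 2, 1],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's Alpha: {alpha:.2f}")
```

Values closer to 1 indicate that the statements move together. A common rule of thumb treats roughly 0.7 and above as acceptable, although the right threshold depends on the study.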

In Short

Validity and reliability are not always aligned. Reliability is needed, but not sufficient to establish validity.

We can get high reliability and low validity. This would happen when we ask the wrong questions over and over again, consistently yielding bad information. 

Conversely, results that show large variability may be valid on average, but they are not reliable.
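A small simulation can illustrate these two cases. It is only a sketch under simple assumptions: each respondent has a "true" score, a question adds a constant bias (a stand-in for asking the wrong question) plus random noise, and we field the same question twice. Reliability is approximated by the correlation between the two waves and validity by the average error against the true score. All names and numbers here are hypothetical.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def simulate(bias, noise_sd, n=2000, seed=0):
    """Return (wave-to-wave correlation, average error versus the true score)."""
    rng = random.Random(seed)
    truth = [rng.gauss(5.0, 1.0) for _ in range(n)]
    wave1 = [t + bias + rng.gauss(0.0, noise_sd) for t in truth]
    wave2 = [t + bias + rng.gauss(0.0, noise_sd) for t in truth]
    retest_r = pearson(wave1, wave2)
    avg_error = mean(w - t for w, t in zip(wave1, truth))
    return retest_r, avg_error

# Consistent but systematically off: high reliability, low validity.
consistent_but_wrong = simulate(bias=2.0, noise_sd=0.2)
# Right on average but very noisy: little bias, low reliability.
unbiased_but_noisy = simulate(bias=0.0, noise_sd=3.0)

print("consistent but wrong:", consistent_but_wrong)
print("unbiased but noisy:  ", unbiased_but_noisy)
```

With these settings, the first question should produce nearly identical answers across waves while missing the true score by about two points, and the second should land close to the truth on average while its answers barely agree from one wave to the next.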

In short, don’t assume reliability and validity, unless you design surveys that really measure what you want and do it consistently.